-
Solr - Multi Term highlighting issue
Ramanathapuram, Rajesh 2011-04-20, 13:43
Hello,

I am dealing with a highlighting issue in Solr and will try to explain it.

When I search for a single term in Solr, it wraps a tag around the words I want to highlight, and all works well. But when I search for multiple terms, highlighting works well for the most part, and then for some of the terms the highlight returns multiple terms in a single tag:

... srchtrm1) <br><b><p>.... srchtrm2

I expect Solr to return highlighted terms like this:

... srchtrm1) <br><b><p>... srchtrm2

When I search for 'US mec chile', here is how my result appears:

... Corboba. (MEC)</b></p><p></p><p><b>CHILE/FOREST FIRES: We had ... with US and Chile ..., (MEC)</b></p><p></p><p><b>US ....

This is what I was expecting it to be:

... Corboba. (MEC)</b></p><p></p><p><b>CHILE/FOREST FIRES: We had ... with US and Chile ..., (MEC)</b></p><p></p><p><b>US ....

Here are my query params:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">26</int>
    <lst name="params">
      <str name="hl.fragsize">100000</str>
      <str name="explainOther" />
      <str name="indent">on</str>
      <str name="hl.fl">story, slug</str>
      <str name="wt">standard</str>
      <str name="hl">on</str>
      <str name="rows">10</str>
      <str name="version">2.2</str>
      <str name="hl.highlightMultiTerm">true</str>
      <str name="fl">*</str>
      <str name="start">0</str>
      <str name="q">mec us chile</str>
      <str name="qt">standard</str>
      <str name="hl.usePhraseHighlighter">true</str>
      <str name="fq">storyid="XXXX XXXX XXXXX"</str>
    </lst>
  </lst>

Here is another link I found in the forum, but it reached no real conclusion:
http://www.lucidimagination.com/search/document/ac64e4f0abb6e4fc/solr_highlighting_question#78163c42a67cb533

I am going to try this patch, which also had no conclusive results:
https://issues.apache.org/jira/browse/SOLR-1394

Has anyone come across this issue? Any suggestions on how to fix it are much appreciated.

thanks & regards,
Rajesh Ramana
-
Re: Solr - Multi Term highlighting issue
Erick Erickson 2011-04-20, 15:59
Does your configuration have "hl.mergeContiguous" set to true by any chance? And what happens if you explicitly set this to "false" on your query?

Best
Erick
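Setting the parameter explicitly just means adding one more key to the select request. A minimal sketch in Python, assuming the base URL below (host, port, and core path are setup-specific and must be adjusted) and a hypothetical helper name `build_select_url`; the highlighting parameters are the ones echoed in the original post:

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Assumed base URL; adjust host, port, and core path to your own setup.
BASE_URL = "http://localhost:8983/searchsolr/mainCore/select"


def build_select_url(query, merge_contiguous=False):
    """Build a select URL with highlighting on and hl.mergeContiguous set explicitly."""
    params = [
        ("q", query),
        ("hl", "on"),
        ("hl.fl", "story,slug"),
        ("hl.fragsize", "100000"),
        ("hl.usePhraseHighlighter", "true"),
        ("hl.highlightMultiTerm", "true"),
        # The parameter under test: always sent, never left to the configured default.
        ("hl.mergeContiguous", "true" if merge_contiguous else "false"),
    ]
    return BASE_URL + "?" + urlencode(params)


url = build_select_url("mec us chile")
print(url)
```

Sending the value on every request, rather than relying on whatever solrconfig.xml defaults to, makes it easy to compare the two settings side by side.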
-
RE: Solr - Multi Term highlighting issue
Ramanathapuram, Rajesh 2011-04-20, 18:50
Thanks Erick.

I tried your suggestion, but the issue still exists.

http://localhost:8983/searchsolr/mainCore/select?indent=on&version=2.2&q=mec+us+chile&fq=storyid%3DXXXXXXX%22&start=0&rows=10&fl=*&qt=standard&wt=standard&explainOther=&hl=on&hl.fl=story%2C+slug&hl.fragsize=100000&hl.highlightMultiTerm=true&hl.usePhraseHighlighter=true&hl.mergeContiguous=false

<lst name="params">
  <str name="hl.fragsize">100000</str>
  <str name="explainOther" />
  <str name="indent">on</str>
  <str name="hl.mergeContiguous">false</str>
  ....

... Corboba. (MEC)</b></p><p></p><p><b>CHILE/FOREST FIRES ...

thanks & regards,
Rajesh Ramana
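One way to confirm that Solr actually received a parameter is to read it back from the echoed params in the response header. A rough sketch, assuming the XML response layout shown in this thread (`responseHeader` containing a `params` list of single-valued entries); the function name `echoed_params` and the sample document are made up for illustration:

```python
import xml.etree.ElementTree as ET


def echoed_params(response_xml):
    """Return the parameters echoed under responseHeader/params as a dict."""
    root = ET.fromstring(response_xml)
    params = root.find("./lst[@name='responseHeader']/lst[@name='params']")
    if params is None:
        return {}
    # Empty elements such as <str name="explainOther" /> map to an empty string.
    return {el.get("name"): (el.text or "") for el in params}


# Hand-written sample, trimmed to a few entries.
SAMPLE = """<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <lst name="params">
      <str name="hl.fragsize">100000</str>
      <str name="explainOther" />
      <str name="hl.mergeContiguous">false</str>
    </lst>
  </lst>
</response>"""

params = echoed_params(SAMPLE)
print(params.get("hl.mergeContiguous"))
```

This only shows what was sent, not what the highlighter did with it, but it rules out a mistyped or dropped parameter.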
-
RE: Solr - Multi Term highlighting issue
Ramanathapuram, Rajesh 2011-04-22, 16:23
Does anybody have other suggestions?

thanks & regards,
Rajesh Ramana
Enterprise Applications, Turner Broadcasting System, Inc.
404.878.7474
-
Re: Solr - Multi Term highlighting issue
Koji Sekiguchi 2011-04-23, 00:37
How are your hl.fl fields defined in schema.xml?

Koji
--
http://www.rondhuit.com/en/
-
RE: Solr - Multi Term highlighting issue
Ramanathapuram, Rajesh 2011-04-24, 01:18
I don't have hl.fl defined in my schema.xml; I am passing it in as a query parameter:

<str name="hl.fl">story, slug</str>

The expanded parameters are sent like this:

'hl' => 'on',
'hl.fragsize' => $fragsize,
'hl.maxAnalyzedChars' => $fragsize,
'hl.fl' => 'slug,story',
'hl.simple.pre' => '<span class="' .$className . '">',
'hl.simple.post' => '</span>',

Here are my query params in the response:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">26</int>
    <lst name="params">
      <str name="hl.fragsize">100000</str>
      <str name="explainOther" />
      <str name="indent">on</str>
      <str name="hl.fl">story, slug</str>
      <str name="wt">standard</str>
      <str name="hl">on</str>
      <str name="rows">10</str>
      <str name="version">2.2</str>
      <str name="hl.highlightMultiTerm">true</str>
      <str name="fl">*</str>
      <str name="start">0</str>
      <str name="q">mec us chile</str>
      <str name="qt">standard</str>
      <str name="hl.usePhraseHighlighter">true</str>
      <str name="fq">storyid="XXXX XXXX XXXXX"</str>
    </lst>
  </lst>

Please let me know.

thanks & regards,
Rajesh Ramana
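With custom `hl.simple.pre` and `hl.simple.post` markers like these, the reported symptom (one highlight span wrapping several terms plus the markup between them) can be detected mechanically. A small sketch, assuming the markers never occur in the stored text itself; the helper name `suspicious_spans`, the class name `hl`, and both sample strings are invented for illustration and are not actual Solr output:

```python
import re


def suspicious_spans(snippet, terms, pre, post):
    """Return highlighted spans that contain more than one query term or any markup."""
    pattern = re.compile(re.escape(pre) + r"(.*?)" + re.escape(post), re.DOTALL)
    lowered = [t.lower() for t in terms]
    flagged = []
    for match in pattern.finditer(snippet):
        inner = match.group(1)
        words = re.findall(r"\w+", inner.lower())
        hits = sum(1 for w in words if w in lowered)
        # A well-formed highlight holds exactly one term and no tags.
        if hits > 1 or "<" in inner:
            flagged.append(inner)
    return flagged


PRE, POST = '<span class="hl">', "</span>"  # "hl" is a made-up class name
TERMS = ["us", "mec", "chile"]

# Constructed examples: one clean snippet, one mimicking the reported problem.
good = f"with {PRE}US{POST} and {PRE}Chile{POST}"
bad = f"Corboba. ({PRE}MEC)</b></p><p></p><p><b>CHILE{POST}/FOREST FIRES"

print(suspicious_spans(good, TERMS, PRE, POST))
print(suspicious_spans(bad, TERMS, PRE, POST))
```

Running a check like this over the returned snippets gives a quick count of how often the problem occurs for a given query.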
-
RE: Solr - Multi Term highlighting issue
Ramanathapuram, Rajesh 2011-04-24, 01:34
Also, I found this in solrconfig.xml ...

<requestHandler name="dismax" class="solr.SearchHandler" >
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="echoParams">explicit</str>
    <float name="tie">0.01</float>
    <str name="qf">
      text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
    </str>
    <str name="pf">
      text^0.2 features^1.1 name^1.5 manu^1.4 manu_exact^1.9
    </str>
    <str name="bf">
      popularity^0.5 recip(price,1,1000,1000)^0.3
    </str>
    <str name="fl">
      id,name,price,score
    </str>
    <str name="mm">
      2<-1 5<-2 6<90%
    </str>
    <int name="ps">100</int>
    <str name="q.alt">*:*</str>
    <!-- example highlighter config, enable per-query with hl=true -->
    <str name="hl.fl">text features name</str>
    <!-- for this field, we want no fragmenting, just highlighting -->
    <str name="f.name.hl.fragsize">0</str>
    <!-- instructs Solr to return the field itself if no query terms are found -->
    <str name="f.name.hl.alternateField">name</str>
    <str name="f.text.hl.fragmenter">regex</str> <!-- defined below -->
  </lst>
</requestHandler>

And also this ....

<highlighting>
  <!-- Configure the standard fragmenter -->
  <!-- This could most likely be commented out in the "default" case -->
  <fragmenter name="gap" class="org.apache.solr.highlight.GapFragmenter" default="true">
    <lst name="defaults">
      <int name="hl.fragsize">100</int>
    </lst>
  </fragmenter>
  <!-- A regular-expression-based fragmenter (f.i., for sentence extraction) -->
  <fragmenter name="regex" class="org.apache.solr.highlight.RegexFragmenter">
    <lst name="defaults">
      <!-- slightly smaller fragsizes work better because of slop -->
      <int name="hl.fragsize">70</int>
      <!-- allow 50% slop on fragment sizes -->
      <float name="hl.regex.slop">0.5</float>
      <!-- a basic sentence pattern -->
      <str name="hl.regex.pattern">[-\w ,/\n\"']{20,200}</str>
    </lst>
  </fragmenter>
  <!-- Configure the standard formatter -->
  <formatter name="html" class="org.apache.solr.highlight.HtmlFormatter" default="true">
    <lst name="defaults">
      <str name="hl.simple.pre"><![CDATA[]]></str>
      <str name="hl.simple.post"><![CDATA[]]></str>
    </lst>
  </formatter>
</highlighting>

Hope this sheds some light on identifying this issue.

thanks & regards,
Rajesh Ramana
-
Re: Solr - Multi Term highlighting issue
Koji Sekiguchi 2011-04-24, 01:50
Hi Rajesh,

My question was how the story and slug fields are defined in schema.xml. In other words, please show us your <fieldType/> and <field/> for those fields.

Koji
--
http://www.rondhuit.com/en/
-
RE: Solr - Multi Term highlighting issue
Ramanathapuram, Rajesh 2011-04-24, 02:50
Hi Koji,

My apologies for misunderstanding the question ...

Here are the fields ...

<fields>
  <field name="storyid" type="string" indexed="true" stored="true" required="true" />
  <field name="slug" type="text" indexed="false" stored="true" />
  <field name="author" type="string" indexed="true" stored="true" />
  <field name="status" type="string" indexed="false" stored="true" />
  <field name="docdate" type="tdate" indexed="true" stored="true" />
  <field name="createdate" type="tdate" indexed="false" stored="true" />
  <field name="modifyby" type="string" indexed="true" stored="true" />
  <field name="story" type="text" indexed="false" stored="true" />
  <field name="queue" type="lowercase" indexed="true" stored="true" />
  <field name="modifydate" type="tdate" indexed="false" stored="true" />
  <field name="endorser" type="string" indexed="false" stored="true" />
  <field name="slug_sort" type="lowercase" indexed="true" stored="false" />
  <field name="url" type="string" indexed="false" stored="true" />
  <field name="showtitle" type="string" indexed="true" stored="true" />
  <field name="date_sort" type="pdate" indexed="true" stored="false" sortMissingFirst="true" />
  <field name="site" type="string" stored="true" indexed="true"/>
  <field name="segment" type="string" stored="true" indexed="false"/>
  <field name="digest" type="string" stored="true" indexed="false"/>
  <field name="boost" type="float" stored="true" indexed="false"/>
  <field name="host" type="url" stored="false" indexed="true"/>
  <field name="tstamp" type="long" stored="true" indexed="false" />
  <field name="anchor" type="string" stored="true" indexed="true" multiValued="true"/>
  <field name="headline" type="string" indexed="true" stored="true" />
  <field name="highlight" type="string" indexed="true" stored="true" />
  <field name="guests" type="string" indexed="true" stored="true" />
  <field name="transcriptnum" type="string" indexed="false" stored="true" />
  <field name="additionalinewsfields" type="text" indexed="false" stored="true" />
  <field name="all_text" type="text" indexed="true" stored="false" multiValued="true"/>
  <field name="timestamp" type="date" indexed="true" stored="true" default="NOW" multiValued="false"/>
  <dynamicField name="*_kstem" type="text_kstem" indexed="true" stored="true" multiValued="true"/>
</fields>
<uniqueKey>storyid</uniqueKey>

And here are the types ...

<types>
  <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
  <fieldType name="boolean" class="solr.BoolField" sortMissingLast="true" omitNorms="true"/>
  <fieldtype name="binary" class="solr.BinaryField"/>
  <fieldType name="int" class="solr.TrieIntField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
  <fieldType name="float" class="solr.TrieFloatField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
  <fieldType name="long" class="solr.TrieLongField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
  <fieldType name="double" class="solr.TrieDoubleField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
  <fieldType name="tint" class="solr.TrieIntField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
  <fieldType name="tfloat" class="solr.TrieFloatField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
  <fieldType name="tlong" class="solr.TrieLongField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
  <fieldType name="tdouble" class="solr.TrieDoubleField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
  <fieldType name="date" class="solr.TrieDateField" omitNorms="true" precisionStep="0" positionIncrementGap="0"/>
  <fieldType name="tdate" class="solr.TrieDateField" omitNorms="true" precisionStep="6" positionIncrementGap="0"/>
  <fieldType name="pint" class="solr.IntField" omitNorms="true"/>
  <fieldType name="plong" class="solr.LongField" omitNorms="true"/>
  <fieldType name="pfloat" class="solr.FloatField" omitNorms="true"/>
  <fieldType name="pdouble" class="solr.DoubleField" omitNorms="true"/>
  <fieldType name="pdate" class="solr.DateField" sortMissingLast="true" omitNorms="true"/>
  <fieldType name="sint" class="solr.SortableIntField" sortMissingLast="true" omitNorms="true"/>
  <fieldType name="slong" class="solr.SortableLongField" sortMissingLast="true" omitNorms="true"/>
  <fieldType name="sfloat" class="solr.SortableFloatField" sortMissingLast="true" omitNorms="true"/>
  <fieldType name="sdouble" class="solr.SortableDoubleField" sortMissingLast="true" omitNorms="true"/>
  <fieldType name="random" class="solr.RandomSortField" indexed="true" />
  <!-- A text field that only splits on whitespace for exact matching of words -->
  <fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    </analyzer>
  </fieldType>
  <fieldType name="text_kstem" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="false" />
      <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="com.lucidimagination.solrworks.analysis.LucidKStemFilterFactory" protected="protwords.txt"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
      <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers=
-
Re: Solr - Multi Term highlighting issue
Koji Sekiguchi 2011-04-24, 03:06
Thank you for sending the definitions. I thought you had defined an n-gram based field for story and slug, but your definitions look fine to me. I don't understand why you got such strange snippets.

I think you can open a JIRA issue for this problem (sorry, I cannot take it at this moment). A test case that reproduces the problem would be very helpful.

Koji
http://www.rondhuit.com/en/
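A reproducing test case could start from the request itself. The sketch below only rebuilds the select URL from the highlighting parameters reported at the start of this thread, so that the single-term and multi-term queries can be compared side by side. The endpoint (`http://localhost:8983/solr/select`) and the helper name `build_repro_url` are assumptions for illustration, not something given in the thread.

```python
from urllib.parse import urlencode

# Assumed endpoint: host, port and path are placeholders, not from the thread.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_repro_url(query, base_url=SOLR_SELECT_URL):
    """Return a select URL using the highlighting parameters listed in the thread."""
    params = [
        ("q", query),
        ("hl", "on"),
        ("hl.fl", "story,slug"),  # the two highlighted fields named in the thread
        ("hl.fragsize", "100000"),
        ("hl.usePhraseHighlighter", "true"),
        ("hl.highlightMultiTerm", "true"),
        ("rows", "10"),
        ("indent", "on"),
    ]
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Single terms reportedly highlight fine; the combined query does not.
    for q in ("mec", "chile", "mec us chile"):
        print(build_repro_url(q))
```

Running the three queries against the same document and comparing the highlighting sections of the responses would give the kind of minimal case a JIRA report needs.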
-
RE: Solr - Multi Term highlighting issue
Ramanathapuram, Rajesh 2011-04-24, 03:36
Hi Koji,
Thanks for taking the time to look into this issue; I really appreciate your efforts.

I am wondering whether the problem might be a document format issue (just my guess). What is really weird is that if I search for srchterm1 and srchterm2 separately, the results come up fine. If I search for multiple terms, this issue seems to happen when the terms are separated by HTML tags and special characters like ') / \' etc.

I am fairly new to Solr and still trying to understand how things work. My guess is that a whitespace is missed somewhere and the regex-based highlight fragmenter is messing things up. Here it is (for review) from solrconfig.xml, in case you can think of anything obvious:

<highlighting>
  <!-- Configure the standard fragmenter -->
  <!-- This could most likely be commented out in the "default" case -->
  <fragmenter name="gap" class="org.apache.solr.highlight.GapFragmenter" default="true">
    <lst name="defaults">
      <int name="hl.fragsize">100</int>
    </lst>
  </fragmenter>

  <!-- A regular-expression-based fragmenter (f.i., for sentence extraction) -->
  <fragmenter name="regex" class="org.apache.solr.highlight.RegexFragmenter">
    <lst name="defaults">
      <!-- slightly smaller fragsizes work better because of slop -->
      <int name="hl.fragsize">70</int>
      <!-- allow 50% slop on fragment sizes -->
      <float name="hl.regex.slop">0.5</float>
      <!-- a basic sentence pattern -->
      <str name="hl.regex.pattern">[-\w ,/\n\"']{20,200}</str>
    </lst>
  </fragmenter>

  <!-- Configure the standard formatter -->
  <formatter name="html" class="org.apache.solr.highlight.HtmlFormatter" default="true">
    <lst name="defaults">
      <str name="hl.simple.pre"><![CDATA[]]></str>
      <str name="hl.simple.post"><![CDATA[]]></str>
    </lst>
  </formatter>
</highlighting>

I will try to open a JIRA issue in the next couple of weeks, when my schedule slows down. Once again, thanks much for your help.

thanks & regards,
Rajesh Ramana
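The observation that the problem appears when terms are separated only by HTML tags and punctuation can be checked against the sample text from the first post. The sketch below is a rough stand-in, not Solr's analysis chain: it assumes the `text` type (whose definition is not shown in full here) tokenizes on whitespace and then splits on non-alphanumeric characters, as the `text_kstem` type does with `WhitespaceTokenizerFactory` and `WordDelimiterFilterFactory`. The function names are made up for illustration.

```python
import re


def whitespace_tokens(text):
    """Yield (start, end, token) for each whitespace-delimited run."""
    for m in re.finditer(r"\S+", text):
        yield m.start(), m.end(), m.group()


def split_subwords(token):
    """Rough stand-in for word-delimiter splitting: keep alphanumeric runs only."""
    return re.findall(r"[A-Za-z0-9]+", token)


# Sample text taken from the highlighted result in the first post.
sample = "Corboba. (MEC)</b></p><p></p><p><b>CHILE/FOREST FIRES: We had"

for start, end, tok in whitespace_tokens(sample):
    print(start, end, repr(tok), split_subwords(tok))
```

Under these assumptions, `(MEC)</b></p><p></p><p><b>CHILE/FOREST` is a single whitespace token, so MEC and CHILE end up as sub-words of the same original token rather than as two independent tokens.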
-
Re: Solr - Multi Term highlighting issue
Robert Muir 2011-04-24, 04:52
On Sat, Apr 23, 2011 at 11:36 PM, Ramanathapuram, Rajesh <[EMAIL PROTECTED]> wrote:
> What is really weird is if I search for srchterm1 and srchterm2
> separately, the results come up fine. If I search for multiple terms,
> this issue seems to happen when the terms are separated by html tags and
> special characters like ') / \' etc...

What version of Solr are you using?

Because you are saying the issue only happens when terms involve special characters, it's possible it could be this bug: https://issues.apache.org/jira/browse/LUCENE-2874, with the overlapping terms being created by the WordDelimiterFilter.

This is fixed in 3.1.
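To make the "overlapping terms" idea concrete, here is a toy highlighter that groups tokens whose character offsets overlap and wraps each matching group in a single tag. It is a simplified illustration only, not Lucene's highlighter code; the token offsets, the spanning "catenated" token, and the `<em>` tags are all assumptions chosen for the example.

```python
def group_overlapping(tokens):
    """Group (start, end, term) tokens whose character offsets overlap."""
    groups = []
    for start, end, term in sorted(tokens):
        if groups and start < groups[-1]["end"]:
            # Overlaps the previous group: extend it and remember the term.
            group = groups[-1]
            group["end"] = max(group["end"], end)
            group["terms"].add(term)
        else:
            groups.append({"start": start, "end": end, "terms": {term}})
    return groups


def highlight(text, tokens, query_terms, pre="<em>", post="</em>"):
    """Wrap every token group that contains a query term in one pre/post pair."""
    out, pos = [], 0
    for group in group_overlapping(tokens):
        out.append(text[pos:group["start"]])
        chunk = text[group["start"]:group["end"]]
        out.append(pre + chunk + post if group["terms"] & query_terms else chunk)
        pos = group["end"]
    out.append(text[pos:])
    return "".join(out)


text = "(MEC)</b><b>CHILE fires"
query = {"mec", "chile"}

# Independent tokens: each match gets its own tag.
separate = [(1, 4, "mec"), (12, 17, "chile"), (18, 23, "fires")]
print(highlight(text, separate, query))

# Hypothetical extra token spanning both sub-words: the matches merge.
overlapping = separate + [(1, 17, "catenated")]
print(highlight(text, overlapping, query))
```

With independent tokens the two matches are tagged separately; once a token spans both sub-words, everything between MEC and CHILE, markup included, lands inside one tag, which resembles the output reported in the first post.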
-
Re: Solr - Multi Term highlighting issue
Ramanathapuram, Rajesh 2011-04-24, 05:57
I think I am using version 1.4. I'll try to review the link you provided later today.

Rajesh Ramana
-
RE: Solr - Multi Term highlighting issue
Ramanathapuram, Rajesh 2011-04-25, 18:07
Hi Robert,
Thanks for your help. This looks much closer to my issue (though maybe not). Unfortunately, I can't switch to Solr version 3.1 yet. I hope to revisit and update this post when I do.

thanks & regards,
Rajesh Ramana
Enterprise Applications, Turner Broadcasting System, Inc.
404.878.7474