Jan Høydahl 2012-04-11, 23:40
Testing pf2 and pf3. I thought that when using pf2=myfield, and q=foo bar, you would get a phrase query "foo bar", but you don't, unless there are at least 3 terms in the query. Is this intentional? I think of "pf2" as boosting any two words in the query, even if there are only two words. The offending code is:
if (null == fields || fields.isEmpty() || null == clauses || clauses.size() <= shingleSize ) return;
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com
---------------------------------------------------------------------
Re: eDismax pf2 and pf3
Yonik Seeley 2012-04-12, 00:10
On Wed, Apr 11, 2012 at 7:40 PM, Jan Høydahl <[EMAIL PROTECTED]> wrote:
> Testing pf2 and pf3. I thought that when using pf2=myfield, and q=foo bar, you would get a phrase query "foo bar", but you don't, unless there are at least 3 terms in the query. Is this intentional?
Nope.
> I think of "pf2" as boosting any two words in the query, even if there are only two words.
Correct.
> The offending code is:
>
> if (null == fields || fields.isEmpty() ||
>     null == clauses || clauses.size() <= shingleSize )
>   return;
Correct. Looks like a bug probably introduced during a refactor (since I don't recall using the "shingle" terminology).
-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10
---------------------------------------------------------------------
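The behavior discussed above comes down to a single comparison operator. The following is a minimal sketch, not the Solr source: the class and method names are hypothetical, and the strict `<` guard is only an assumption about what a fix could look like, since the thread does not show the change that was eventually committed. It contrasts the quoted `<=` guard with a strict guard for q=foo bar and pf2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; none of these names come from Solr itself.
public class ShingleGuardSketch {

    // Mirrors the comparison quoted in the thread: bail out when
    // the number of clauses is <= the shingle size.
    public static boolean skipsWithQuotedGuard(int clauseCount, int shingleSize) {
        return clauseCount <= shingleSize;
    }

    // Assumed alternative: bail out only when there are too few
    // clauses to form even one shingle.
    public static boolean skipsWithStrictGuard(int clauseCount, int shingleSize) {
        return clauseCount < shingleSize;
    }

    // Builds every run of shingleSize adjacent terms as one phrase string.
    public static List<String> shingles(List<String> terms, int shingleSize) {
        List<String> out = new ArrayList<>();
        if (terms == null || shingleSize < 1 || terms.size() < shingleSize) {
            return out;
        }
        for (int i = 0; i <= terms.size() - shingleSize; i++) {
            out.add(String.join(" ", terms.subList(i, i + shingleSize)));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("foo", "bar");
        System.out.println(skipsWithQuotedGuard(terms.size(), 2)); // true: no phrase boost
        System.out.println(skipsWithStrictGuard(terms.size(), 2)); // false: boost is built
        System.out.println(shingles(terms, 2));                    // [foo bar]
    }
}
```

With two terms and a shingle size of 2, the quoted guard returns early, which matches the report that pf2 only takes effect from three terms upward.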
Re: eDismax pf2 and pf3
Jan Høydahl 2012-04-12, 00:28
SOLR-3352
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com
Re: eDismax pf2 and pf3
Chris Hostetter 2012-04-12, 02:24
: > Testing pf2 and pf3. I thought that when using pf2=myfield, and q=foo
: bar, you would get a phrase query "foo bar", but you don't, unless there
: are at least 3 terms in the query. Is this intentional?
:
: Nope.
:
: > I think of "pf2" as boosting any two words in the query, even if there
: are only two words.
:
: Correct.
-0 ... getting "double the boosting" on the original query if you use both pf and pf2 smells weird to me in a way i can't fully describe, but i can certainly understand how the consistency would at least be easier to understand.
: Correct. Looks like a bug probably introduced during a refactor
: (since I don't recall using the "shingle" terminology).
FWIW: i did the refactoring of that method and introduced those variables, but the same logic is in the original SOLR-1553 patch...
+    Map<String,Float> pf = phraseFields;
+    if (normalClauses.size() >= 2 && pf.size() > 0) {
+      StringBuilder sb = new StringBuilder();
+      for (int i=0; i<normalClauses.size()-1; i++) {
...
+    pf = phraseFields3;
+    if (normalClauses.size() >= 3 && pf.size() > 0) {
+      StringBuilder sb = new StringBuilder();
+      for (int i=0; i<normalClauses.size()-2; i++) {
...so it wasn't a bug introduced later, it was written out that way explicitly in the beginning for some reason.
-Hoss
---------------------------------------------------------------------
Re: eDismax pf2 and pf3
Yonik Seeley 2012-04-12, 04:29
On Wed, Apr 11, 2012 at 10:24 PM, Chris Hostetter <[EMAIL PROTECTED]> wrote:
>
> : > Testing pf2 and pf3. I thought that when using pf2=myfield, and q=foo
> : bar, you would get a phrase query "foo bar", but you don't, unless there
> : are at least 3 terms in the query. Is this intentional?
> :
> : Nope.
> :
> : > I think of "pf2" as boosting any two words in the query, even if there
> : are only two words.
> :
> : Correct.
>
> -0 ... getting "double the boosting" on the original query if you use
> both pf and pf2 smells weird to me in a way i can't fully describe, but i
> can certainly understand how the consistency would at least be easier to
> understand.

And if "pf2" is the only pf parameter?
I don't know what the right behavior is if multiple "pf" parameters are used, but it certainly seems like you should always get phrase boosting if possible if you are using only one parameter, and that should be the common case.

> : Correct. Looks like a bug probably introduced during a refactor
> : (since I don't recall using the "shingle" terminology).
>
> FWIW: i did the refactoring of that method and introduced those variables,
> but the same logic is in the original SOLR-1553 patch...
>
> +    Map<String,Float> pf = phraseFields;
> +    if (normalClauses.size() >= 2 && pf.size() > 0) {
> +      StringBuilder sb = new StringBuilder();
> +      for (int i=0; i<normalClauses.size()-1; i++) {
> ...
> +    pf = phraseFields3;
> +    if (normalClauses.size() >= 3 && pf.size() > 0) {
> +      StringBuilder sb = new StringBuilder();
> +      for (int i=0; i<normalClauses.size()-2; i++) {
>
> ...so it wasn't a bug introduced later, it was written out that way
> explicitly in the beginning for some reason.

Just glancing at it quickly... but it seems like the original code quoted above would add phrases if there were 2 terms (keeping in mind that "pf" in the original patch was eventually changed to "pf2".)
-Yonik
---------------------------------------------------------------------
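The reading of the SOLR-1553 excerpt above can be checked by tracing its loop bounds. The sketch below is an illustration only: the class name, the `phrases` helper, and the generalisation of the two quoted branches into one `size` parameter are all assumptions. It keeps the quoted shape, a guard of `size() >= size` and a loop running while `i < size() - (size - 1)`, and joins each window of adjacent clauses into a phrase.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the loop bounds quoted from the SOLR-1553 patch.
public class OriginalLoopSketch {

    // size = 2 corresponds to the first quoted branch (>= 2, size()-1),
    // size = 3 to the second (>= 3, size()-2).
    public static List<String> phrases(List<String> clauses, int size) {
        List<String> result = new ArrayList<>();
        if (clauses.size() >= size) {
            for (int i = 0; i < clauses.size() - (size - 1); i++) {
                StringBuilder sb = new StringBuilder();
                for (int j = i; j < i + size; j++) {
                    if (j > i) {
                        sb.append(' ');
                    }
                    sb.append(clauses.get(j));
                }
                result.add(sb.toString());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> two = Arrays.asList("foo", "bar");
        System.out.println(phrases(two, 2)); // [foo bar]
        System.out.println(phrases(two, 3)); // []
    }
}
```

With these bounds a two-term query yields exactly one two-word phrase, so the `>=` form of the guard does not exclude the two-term case.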
Re: eDismax pf2 and pf3
Chris Hostetter 2012-04-12, 18:47
: Just glancing at it quickly... but it seems like the original code
: quoted above would add phrases if there were 2 terms (keeping in mind
: that "pf" in the original patch was eventually changed to "pf2".)
BAH!!!! ... you are absolutely correct ...
apparently i made the same mistake *twice* ... once when refactoring it, and once yesterday when reading it to see if i screwed up the refactoring.
-Hoss
---------------------------------------------------------------------