Solr, mail # user - custom field default qf of requestHandler


Re: custom field default qf of requestHandler
Chris Hostetter 2012-04-05, 17:11
:       <analyzer type="query">
:         <tokenizer class="solr.StandardTokenizerFactory"/>
:         <filter class="solr.StandardFilterFactory"/>
:         <filter class="solr.LowerCaseFilterFactory"/>
:         <filter class="solr.ShingleFilterFactory" outputUnigrams="false" maxShingleSize="2"/>
:       </analyzer>
:      </fieldType>

i'm pretty sure what you are seeing here is a variation on the "stopwords"
confusion people tend to have about dismax (and edismax)

just like the lucene qparser, "whitespace" in the query string is
significant, and is used to denote the individual clauses of the input,
which are then *individually* passed to the analysers for each field in
the qf -- if one of your qf fields produces no tokens for an individual
clause (in this case: because it is configured not to output unigrams, and
unigrams is all that it can produce based on only getting one clause at a
time) then it gets dropped out...
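
to make the per-clause behavior concrete, here is a rough sketch in Python. it is a toy model only, not Solr's actual parser or analysis chain: the helper names (shingles, analyze, per_clause) and the sample query are made up for illustration, and the analysis is reduced to lowercasing, whitespace splitting and 2-word shingles with unigrams turned off.

```python
def shingles(tokens, size=2, output_unigrams=False):
    """Toy stand-in for a shingle filter: emit word n-grams of length `size`."""
    out = list(tokens) if output_unigrams else []
    out += [" ".join(tokens[i:i + size]) for i in range(len(tokens) - size + 1)]
    return out


def analyze(text):
    # Simplified analysis chain: lowercase, split on whitespace, then shingle.
    return shingles(text.lower().split())


def per_clause(query):
    # Unquoted input: every whitespace-separated clause is analyzed on its own,
    # so the shingle step only ever sees a single token at a time.
    return {clause: analyze(clause) for clause in query.split()}


result = per_clause("Apple iPod nano")
whole = analyze("Apple iPod nano")
print(result)  # every clause maps to an empty token list
print(whole)   # shingles appear only when the whole string is analyzed at once
```

in this model each single-word clause yields no tokens at all, which is the situation where the field drops out of the query.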

http://www.lucidimagination.com/blog/2010/05/23/whats-a-dismax/

(note in particular the latter half starting with "Where people tend to
get tripped up, is in thinking about how Solr's per-field analysis
configuration...")

if you quoted some portion of the input, then the entire quoted portion
would be treated as a single clause and passed to your analyser.
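
a small sketch of the quoted case, again as a toy model in Python rather than anything Solr actually runs: split_clauses, shingle_pairs and analyze_clauses are hypothetical helpers, the regex is only an approximation of how quoted phrases are kept together, and the sample query is invented.

```python
import re


def shingle_pairs(tokens):
    # 2-word shingles only, no unigrams (mirrors outputUnigrams="false").
    return [" ".join(tokens[i:i + 2]) for i in range(len(tokens) - 1)]


def split_clauses(query):
    # Approximation: a double-quoted phrase stays together as one clause,
    # everything else is split on whitespace.
    return re.findall(r'"[^"]*"|\S+', query)


def analyze_clauses(query):
    # Analyze each clause independently: strip quotes, lowercase, shingle.
    return {
        clause: shingle_pairs(clause.strip('"').lower().split())
        for clause in split_clauses(query)
    }


quoted = analyze_clauses('"Apple iPod" nano')
print(quoted)
```

here the quoted clause reaches the shingle step with two tokens and produces one shingle, while the bare word still produces nothing.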

alternately: if you used that field in the "pf" (where the entire input is
treated as one phrase) you would also start to see some shingles i believe
-Hoss