Solr, mail # user - wild card search and lower-casing


wild card search and lower-casing
Dmitry Kan 2011-11-18, 11:23
Hello,

Here is a puzzle I haven't been able to solve yet.

For the wildcard query:

*ocvd

SOLR 3.4 returns hits. But for

*OCVD

it doesn't.

On the indexing side, the following tokenizer and filters are defined:

<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StandardFilterFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.ReversedWildcardFilterFactory" withOriginal="true"
maxPosAsterisk="3" maxPosQuestion="2" maxFractionAsterisk="0.33"/>

On the query side:
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StandardFilterFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>

The SOLR analysis tool shows that OCVD gets lower-cased to ocvd. Does SOLR
skip the lower-casing step when doing the actual wildcard search?

BTW, the same issue occurs with a trailing wildcard:

mocv*

produces hits, while

MOCV*

doesn't. Appreciate any help or pointers.
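
To make the symptom concrete, here is a minimal Python sketch of what I suspect is happening. It is not SOLR code: it assumes the wildcard term reaches the index without passing through the query-side LowerCaseFilterFactory, it ignores the reversed-wildcard filter entirely, and the document text "MOCVD reactor" and both function names are made up for illustration.

```python
from fnmatch import fnmatchcase


def index_terms(text):
    """Toy stand-in for the index-side chain: split on whitespace, then lower-case."""
    return [token.lower() for token in text.split()]


def wildcard_hits(pattern, terms, analyze_pattern=False):
    """Match a wildcard pattern against indexed terms, case-sensitively.

    analyze_pattern=False models the suspected behavior: the raw pattern is
    compared with the lower-cased index terms without being lower-cased itself.
    """
    if analyze_pattern:
        pattern = pattern.lower()
    return [term for term in terms if fnmatchcase(term, pattern)]


terms = index_terms("MOCVD reactor")  # hypothetical document text

print(wildcard_hits("*ocvd", terms))  # lower-case pattern finds the term
print(wildcard_hits("*OCVD", terms))  # upper-case pattern finds nothing
print(wildcard_hits("mocv*", terms))  # same picture for a trailing wildcard
print(wildcard_hits("MOCV*", terms))
print(wildcard_hits("*OCVD", terms, analyze_pattern=True))  # lower-casing first restores the hit
```

If this model is right, lower-casing the wildcard term on the client before sending the query should make both forms return the same hits.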
--
Regards,

Dmitry Kan