Re: Spell checking: Is there a way to exclude words known to be wrong?
Erik Hatcher 2009-07-14, 13:07
Use the stopwords feature with a custom mispeled_words.txt and a
StopFilterFactory on the spell check field ;)

Erik

On Jul 13, 2009, at 8:27 PM, Jay Hill wrote:

> We're building a spell index from a field in our main index with the
> following configuration:
>
> <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
>   <str name="queryAnalyzerFieldType">textSpell</str>
>   <lst name="spellchecker">
>     <str name="name">default</str>
>     <str name="field">spell</str>
>     <str name="spellcheckIndexDir">./spellchecker</str>
>     <str name="buildOnCommit">true</str>
>   </lst>
> </searchComponent>
>
> This works great and re-builds the spelling index on commits as
> expected.
> However, we know there are misspellings in the "spell" field of our
> main index. We could remove these from the spelling index using Luke,
> however they will be added again on commits. What we need is something
> similar to how the protwords.txt file is used. So that when we notice
> misspelled words such as "beginnning" being pulled from our main index
> we could add them to an exclusion file so they are not added to the
> spelling index again.
>
> Any tricks to make this possible?
>
> -Jay
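
A minimal sketch of what the StopFilterFactory suggestion might look like in schema.xml. It assumes the "spell" field is declared with the textSpell field type (the thread only names textSpell as the query analyzer type), and that mispeled_words.txt lives in the core's conf directory with one excluded word per line. The tokenizer, the lowercase filter, and the field attributes are illustrative assumptions, not taken from the thread.

```xml
<!-- schema.xml fragment (sketch; analyzer chain is an assumption) -->
<fieldType name="textSpell" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Drop known misspellings so they never reach the indexed terms
         that the spelling index is built from -->
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="mispeled_words.txt"/>
  </analyzer>
</fieldType>

<!-- Assumed declaration of the field the spellchecker reads from -->
<field name="spell" type="textSpell" indexed="true" stored="false" multiValued="true"/>
```

With this in place, mispeled_words.txt would hold entries such as "beginnning", one per line. Because the filter acts at analysis time, documents indexed before a word was added would probably need to be reindexed before that word stops appearing in the rebuilt spelling index.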