Solr, mail # user - Error with distributed search and Suggester component (Solr 3.4)


Re: Error with distributed search and Suggester component (Solr 3.4)
Robert Muir 2012-05-02, 19:34
On Wed, May 2, 2012 at 12:16 PM, Ken Krugler
<[EMAIL PROTECTED]> wrote:

> What confuses me is that Suggester says it's based on SpellChecker, which supposedly does work with shards.
>

It is based on the spellchecker APIs, but spellchecker's ranking is based
on simple comparators like string similarity, whereas suggesters use
weights.

When spellchecker merges results from shards, it just pools all of their
top-N lists into one set and recomputes the same distance measure over again.
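
A minimal sketch of that kind of merge, for illustration only. It is not
Solr's code: the function names are hypothetical, and plain Levenshtein
distance stands in for whatever string comparator is configured. The point
is that the ranking depends only on the strings, so it can be recomputed
after pooling.

```python
def edit_distance(a, b):
    """Plain Levenshtein distance, keeping one row of the DP table at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def merge_spellcheck(query, shard_results, n):
    """Pool each shard's top-N candidates and re-rank them from scratch.

    shard_results is a list of candidate lists, one per shard (hypothetical
    shape, chosen for this sketch).
    """
    pooled = set()
    for candidates in shard_results:
        pooled.update(candidates)  # duplicates across shards collapse
    # Distance is a function of (query, candidate) only, so nothing
    # shard-specific is needed to recompute the ranking.
    return sorted(pooled, key=lambda c: (edit_distance(query, c), c))[:n]
```

For example, merge_spellcheck("lucene", [["lucent", "lucerne"], ["lucene", "lucid"]], 3)
re-ranks the pooled candidates by distance to "lucene", with ties broken
alphabetically.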

So suggester can't possibly work correctly this way (forget about any
technical details): how can it make assumptions about the weights you
provided? If they were, e.g., log() weights from your query logs, then it
would need to do log-summation across the shards for the final combined
weight to be correct. This is specific to how you originally computed the
weights you gave it. It certainly cannot recompute anything the way
spellchecker does :)
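
To make the log-summation case concrete, here is a rough sketch. It
assumes each shard reports (term, weight) pairs where weight is the
natural log of that shard's query count for the term; that convention, the
function names, and the data shapes are all assumptions for illustration,
not anything Suggester defines. Under that assumption the combined weight
is log(exp(w1) + exp(w2) + ...), not the max or the sum of the per-shard
weights.

```python
import math
from collections import defaultdict


def log_sum(weights):
    """Return log(sum(exp(w))) computed stably by factoring out the max."""
    m = max(weights)
    return m + math.log(sum(math.exp(w - m) for w in weights))


def merge_suggestions(shard_results, n):
    """Combine per-shard log(count) weights and return the top n terms.

    shard_results is a list with one entry per shard, each a list of
    (term, log_weight) pairs (hypothetical shape for this sketch).
    """
    per_term = defaultdict(list)
    for suggestions in shard_results:
        for term, weight in suggestions:
            per_term[term].append(weight)
    combined = {term: log_sum(ws) for term, ws in per_term.items()}
    # Highest combined weight first; ties broken alphabetically.
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))[:n]
```

With one shard reporting solr at log(30) and solar at log(50), and another
reporting solr at log(40), this ranks solr first with weight log(70). Taking
the per-shard maximum instead would put solar first. A different weighting
scheme would need a different combining function, and a term cut from a
shard's top-N still contributes nothing, so even this merge is approximate.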

Anyway, if you really want to do it, maybe
https://issues.apache.org/jira/browse/SOLR-2848 is helpful. The
background is that in 3.x there is really only one spellchecker impl
(AbstractLucene or something like that). I don't think distributed
spellcheck works with any other SpellChecker subclasses in 3.x; I
think it's "wired" to only work with the Abstract-Lucene ones.

When we added another subclass to 4.0, DirectSpellChecker, James saw that
distributed merging was broken for it and cleaned up the APIs so that
spellcheckers can override this merge() operation. Unfortunately I forgot
to commit those refactorings James did (which let any spellchecker
override merge()) to the 3.x branch, but the ideas might be useful.

--
lucidimagination.com