Solr, mail # user - SOLR Performance Tuning: Fuzzy Searches, Distance, BK-Tree


RE: SOLR Performance Tuning: Fuzzy Searches, Distance, BK-Tree
Fuad Efendi 2010-01-23, 05:11
http://issues.apache.org/jira/browse/LUCENE-2230
Enjoy!
> -----Original Message-----
> From: Fuad Efendi [mailto:[EMAIL PROTECTED]]
> Sent: January-19-10 11:32 PM
> To: [EMAIL PROTECTED]
> Subject: SOLR Performance Tuning: Fuzzy Searches, Distance, BK-Tree
>
> Hi,
>
>
> I am wondering: will SOLR or Lucene use caches for fuzzy searches? I mean
> per-term caching or something similar, internal to Lucene, or maybe to SOLR
> (SOLR may use its own query parser)...
>
> Anyway, I implemented a BK-Tree and am playing with it right now; I altered
> the FuzzyTermEnum class of Lucene...
> http://en.wikipedia.org/wiki/BK-tree
>
> - It seems the performance of fuzzy searches improved at least a hundred
> times, but I need to do more tests... Repeated similar (slightly different)
> queries run faster, probably because of OS-level file caching... but it
> could also be due to the BK-Tree distance lookup! (Although I need to use a
> classic int distance instead of Lucene's float distance, Levenshtein etc.)
>
> Thanks,
> Fuad Efendi
> +1 416-993-2060
> http://www.tokenizer.ca/
> Data Mining, Vertical Search
>
>
>
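
For anyone following the thread who has not met the structure before, here is a minimal BK-tree sketch in Java. It is only an illustration of the idea, not the patch attached to LUCENE-2230, and it does not touch FuzzyTermEnum. The class name BkTreeSketch and its methods (add, search, distance) are hypothetical. It assumes a plain integer Levenshtein distance, which is what lets each child be stored under its exact distance to the parent and lets the search skip subtrees using the triangle inequality.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative BK-tree over strings; names here are hypothetical, not Lucene API. */
class BkTreeSketch {
    private static final class Node {
        final String term;
        // Children keyed by their integer edit distance to this node's term.
        final Map<Integer, Node> children = new HashMap<>();
        Node(String term) { this.term = term; }
    }

    private Node root;

    /** Integer Levenshtein distance, two-row dynamic programming. */
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp;
        }
        return prev[b.length()];
    }

    /** Inserts a term by walking down the edge labelled with its distance. */
    void add(String term) {
        if (root == null) { root = new Node(term); return; }
        Node node = root;
        while (true) {
            int d = distance(term, node.term);
            if (d == 0) return; // term already present
            Node child = node.children.get(d);
            if (child == null) { node.children.put(d, new Node(term)); return; }
            node = child;
        }
    }

    /** Returns all stored terms within maxDistance edits of the query. */
    List<String> search(String query, int maxDistance) {
        List<String> hits = new ArrayList<>();
        if (root != null) collect(root, query, maxDistance, hits);
        return hits;
    }

    private void collect(Node node, String query, int max, List<String> hits) {
        int d = distance(query, node.term);
        if (d <= max) hits.add(node.term);
        // Triangle inequality: matches can only sit under edges in [d - max, d + max].
        for (int k = Math.max(1, d - max); k <= d + max; k++) {
            Node child = node.children.get(k);
            if (child != null) collect(child, query, max, hits);
        }
    }
}
```

With a small maxDistance, most edges fall outside the [d - max, d + max] window, so only a fraction of the stored terms ever have their distance computed. Whether that accounts for the speedup reported above, as opposed to OS-level caching, would still need the extra tests mentioned in the original message.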