Solr, mail # user - SOLR Performance Tuning: Fuzzy Searches, Distance, BK-Tree


Fuad Efendi 2010-01-20, 04:32
Hi,
I am wondering: do SOLR or Lucene use caches for fuzzy searches? I mean
per-term caching or something similar, internal to Lucene or maybe to SOLR
(SOLR may use its own query parser)...

Anyway, I implemented a BK-Tree and am playing with it right now; I altered
the FuzzyTermEnum class of Lucene...
http://en.wikipedia.org/wiki/BK-tree

- It seems the performance of fuzzy searches improved at least a hundredfold,
but I need to do more tests... Repeated similar (slightly different) queries
run faster, probably because of OS-level file caching... but it could also be
due to the BK-Tree distance lookup! (Although I need to use a classic int
distance instead of the float distance used by Lucene/Levenshtein etc.)
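
For readers unfamiliar with the structure, here is a minimal sketch of a
BK-Tree over strings, keyed by the classic integer Levenshtein distance. It
is only an illustration of the idea, not the altered FuzzyTermEnum code; the
class name BkTree and its methods are hypothetical, and nothing here uses
the Lucene API. The tree needs an integer metric because each child is
stored under its exact distance to the parent term.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical BK-tree sketch; standalone, does not touch Lucene classes. */
class BkTree {
    private static final class Node {
        final String term;
        // Children keyed by their integer distance to this node's term.
        final Map<Integer, Node> children = new HashMap<>();
        Node(String term) { this.term = term; }
    }

    private Node root;

    /** Classic integer Levenshtein distance (insert, delete, substitute cost 1). */
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp;
        }
        return prev[b.length()];
    }

    /** Inserts a term by walking down the edge labelled with its distance. */
    void add(String term) {
        if (root == null) { root = new Node(term); return; }
        Node node = root;
        while (true) {
            int d = distance(term, node.term);
            if (d == 0) return; // term already present
            Node child = node.children.get(d);
            if (child == null) { node.children.put(d, new Node(term)); return; }
            node = child;
        }
    }

    /** Returns all stored terms within maxDistance edits of the query. */
    List<String> search(String query, int maxDistance) {
        List<String> matches = new ArrayList<>();
        if (root != null) collect(root, query, maxDistance, matches);
        return matches;
    }

    private void collect(Node node, String query, int maxDistance, List<String> out) {
        int d = distance(query, node.term);
        if (d <= maxDistance) out.add(node.term);
        // Triangle inequality: only subtrees whose edge label lies in
        // [d - maxDistance, d + maxDistance] can contain a match.
        for (int k = Math.max(1, d - maxDistance); k <= d + maxDistance; k++) {
            Node child = node.children.get(k);
            if (child != null) collect(child, query, maxDistance, out);
        }
    }
}
```

With such a tree, a lookup like search("bool", 1) computes the distance only
for nodes on the surviving branches instead of for every stored term, which
is where any speed-up over a full term scan would come from.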

Thanks,
Fuad Efendi
+1 416-993-2060
http://www.tokenizer.ca/
Data Mining, Vertical Search