-
Term ordering for IndexReader.termDocs()
Ype Kingma 2002-01-25, 15:02
Hello,
I'm creating a filter from a set of terms that are read from a file, and I find that IndexReader.termDocs(Term(fieldName, valueFromFile)) does this quite well (around 0.1 seconds elapsed time in Jython code).
Would it be advantageous to sort the values from the file before using them in this way? This could help to reduce the number of disk seeks, but I have no idea how the segments are organized on disk.
I have not profiled this yet, because I have only tried it with fewer than 100 terms on a relatively small index. I wonder whether performance is still as good at, say, 20000 terms.
Thanks in advance, Ype Kingma
-
RE: Term ordering for IndexReader.termDocs()
Doug Cutting 2002-01-25, 16:49
> From: Ype Kingma [mailto:[EMAIL PROTECTED]]
>
> I'm creating a filter from a set of terms that are read from
> a file, and I find that IndexReader.termDocs(Term(fieldName,
> valueFromFile))
> does this quite well (around 0.1 secs elapsed time in jython code.)
>
> Would it be advantageous to sort the values from the file before
> using them in this way?
Yes, that would be faster. The term dictionary is sorted and this would reduce both i/o and computation.
Doug
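A minimal sketch of the suggestion above, in Java: de-duplicate and sort the values read from the file, then issue the per-term lookups in that order. The `Lookup` interface and the class and method names are hypothetical stand-ins for a call such as IndexReader.termDocs(new Term(fieldName, value)); no Lucene classes are used. The sketch also assumes that the natural ordering of java.lang.String matches the order of the term dictionary, which this thread does not confirm.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

class SortedTermLookup {
    /** Hypothetical stand-in for a per-term index lookup (not a Lucene type). */
    interface Lookup {
        void visit(String value);
    }

    /** Returns the values without duplicates, in String.compareTo() order. */
    static List<String> sortedUnique(List<String> values) {
        // TreeSet uses the natural ordering of String, i.e. compareTo().
        return new ArrayList<String>(new TreeSet<String>(values));
    }

    /** Calls the lookup once per distinct value, in ascending order. */
    static void lookupInOrder(List<String> values, Lookup lookup) {
        for (String value : sortedUnique(values)) {
            lookup.visit(value);
        }
    }

    public static void main(String[] args) {
        // Assumed sample data standing in for the values read from the file.
        List<String> fromFile = Arrays.asList("pear", "apple", "fig", "apple");
        lookupInOrder(fromFile, value -> System.out.println(value));
    }
}
```

With a sorted input, consecutive lookups move forward through a sorted dictionary instead of jumping back and forth. Whether the gain is noticeable at 20000 terms would still need profiling.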
-
RE: Term ordering for IndexReader.termDocs()
Ype Kingma 2002-01-25, 18:42
Doug,
>> From: Ype Kingma [mailto:[EMAIL PROTECTED]]
>>
>> I'm creating a filter from a set of terms that are read from
>> a file, and I find that IndexReader.termDocs(Term(fieldName,
>> valueFromFile))
>> does this quite well (around 0.1 secs elapsed time in jython code.)
>>
>> Would it be advantageous to sort the values from the file before
>> using them in this way?
>
> Yes, that would be faster. The term dictionary is sorted and this would
> reduce both i/o and computation.
Thanks. I suppose it would be correct to assume that the sorting order is that of java.lang.String.compareTo()?
Regards, Ype
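For reference, a small sketch of what java.lang.String.compareTo() ordering looks like in practice. It compares UTF-16 code units, so digits sort before uppercase letters, uppercase before lowercase, and numeric strings are not ordered by value. The class name and the sample values are illustrative assumptions. Whether the term dictionary uses this same order is the open question above and is not confirmed in this thread.

```java
import java.util.Arrays;

class CompareToOrder {
    /** Returns a sorted copy using the natural ordering, i.e. String.compareTo(). */
    static String[] sorted(String... values) {
        String[] copy = values.clone();
        Arrays.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        // Prints [10, 9, Zebra, apple]: '1' < '9' < 'Z' < 'a' as code units.
        System.out.println(Arrays.toString(sorted("apple", "Zebra", "10", "9")));
    }
}
```

A locale-aware or case-insensitive sort of the file values would therefore produce a different order than compareTo().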