On Mon, Nov 13, 2017 at 8:14 PM, Chris Hostetter
<[EMAIL PROTECTED]> wrote:

It may be the case: the problem we found there is that the previous
BM25 implementation did not obey the monotonicity requirements needed
for score-based optimizations such as LUCENE-4100 and LUCENE-7993.
These algorithms can greatly speed up our slowest queries
(disjunctions and phrase queries), but they need the similarity to be
well-behaved in this way in order to be correct.

In the BM25 case, scores would decrease in some situations with very
high TF values because of floating-point issues, so that, e.g.,
score(freq=100,000) would be unexpectedly less than
score(freq=99,999), all other things being equal. There may be other
ways to rearrange the code to avoid this problem; feel free to open
an issue if you can optimize the code better while still behaving
properly!
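
To make the monotonicity requirement concrete, here is a minimal
standalone sketch (not the actual Lucene code) that counts how often a
float-precision TF formula decreases as freq increases. The class name,
the WEIGHT constant, and both formula methods are illustrative
assumptions: naive() is the textbook saturation form, and rearranged()
is one algebraically equivalent rewrite in which every float operation
is individually monotonic in freq.

```java
public class Bm25Monotonicity {
    // Hypothetical stand-in for the query-dependent part of the score (idf * boost).
    static final float WEIGHT = 2.2f;

    /** A TF scoring formula; norm stands for k1 * (1 - b + b * dl / avgdl). */
    public interface TfFormula {
        float score(int freq, float norm);
    }

    /** Textbook form: numerator and denominator both grow with freq. */
    public static float naive(int freq, float norm) {
        return WEIGHT * freq / (freq + norm);
    }

    /**
     * Equivalent rewrite: freq / (freq + norm) == 1 - 1 / (1 + freq / norm).
     * Each rounded float operation here is monotonic in freq, so the
     * composition cannot decrease when freq increases.
     */
    public static float rearranged(int freq, float norm) {
        float normInverse = 1f / norm;
        return WEIGHT - WEIGHT / (1f + freq * normInverse);
    }

    /** Counts the freq values in [2, maxFreq] whose score is below the previous one. */
    public static int countViolations(TfFormula formula, float norm, int maxFreq) {
        int violations = 0;
        float previous = formula.score(1, norm);
        for (int freq = 2; freq <= maxFreq; freq++) {
            float current = formula.score(freq, norm);
            if (current < previous) {
                violations++;
            }
            previous = current;
        }
        return violations;
    }

    public static void main(String[] args) {
        float norm = 1.7f;      // arbitrary example value
        int maxFreq = 200_000;  // covers the freq=100,000 range mentioned above
        System.out.println("naive violations:      "
                + countViolations(Bm25Monotonicity::naive, norm, maxFreq));
        System.out.println("rearranged violations: "
                + countViolations(Bm25Monotonicity::rearranged, norm, maxFreq));
    }
}
```

Whether the naive form actually shows violations depends on the norm
value and the freq range you try, so treat this as a harness for testing
a candidate rearrangement rather than as a reproduction of the original
bug.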
