Solr, mail # user - Omitting tf but not positions


Re: Omitting tf but not positions
Jan Høydahl 2011-02-25, 18:57
I also have a case (yellow pages) where IDF comes in and destroys the ranking.
A company listing containing a word that occurs in few other listings is not necessarily better than the others just because of that. At the extreme value of IDF=1, we get an artificially high IDF boost.

The effect is not removed by omitNorms, nor by omitTermFrequencyAndPositions. Is there any per-field way to get rid of the IDF effect?
Or should I override idf() in Similarity?
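
To make the idf() option concrete, here is a minimal sketch of what such an override would do. It does not depend on Lucene: SketchDefaultSimilarity is a hypothetical stand-in that only mirrors the classic idf formula, log(numDocs / (docFreq + 1)) + 1, and FlatIdfSimilarity and FlatIdfDemo are names made up for illustration. In a real setup the subclass would presumably extend Lucene's own DefaultSimilarity and be registered as the global similarity.

```java
// Hypothetical stand-in for the idf part of Lucene's DefaultSimilarity.
// Assumption: it only mirrors the classic formula; it is NOT the Lucene class.
class SketchDefaultSimilarity {
    // Classic idf: log(numDocs / (docFreq + 1)) + 1
    float idf(int docFreq, int numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }
}

// Flattens idf so that rare terms get no extra boost.
class FlatIdfSimilarity extends SketchDefaultSimilarity {
    @Override
    float idf(int docFreq, int numDocs) {
        return 1.0f; // every term contributes the same idf factor
    }
}

class FlatIdfDemo {
    public static void main(String[] args) {
        SketchDefaultSimilarity classic = new SketchDefaultSimilarity();
        SketchDefaultSimilarity flat = new FlatIdfSimilarity();
        int numDocs = 100000; // illustrative index size
        for (int docFreq : new int[] {1, 100, 10000}) {
            System.out.printf("docFreq=%d classic=%.3f flat=%.3f%n",
                    docFreq,
                    classic.idf(docFreq, numDocs),
                    flat.idf(docFreq, numDocs));
        }
    }
}
```

With the flat variant, a word that appears in a single listing scores the same idf factor as a common one. The catch is that such an override would apply to every field, not only the one where IDF hurts.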

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 15 Dec 2010, at 13:27, Robert Muir wrote:

> On Wed, Dec 15, 2010 at 3:09 AM, Jan Høydahl / Cominvent
> <[EMAIL PROTECTED]> wrote:
>> Any way to disable TF/IDF normalization without also disabling positions?
>>
>
> see Similarity.tf(float) and Similarity.tf(int)
>
> if you want to change this for both terms and phrases just override
> Similarity.tf(float), since by default Similarity.tf(int) delegates to
> that.
> otherwise, override both.
>
> of course the big limitation being you can't customize Similarity per-field yet.
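
The tf() suggestion quoted above could look roughly like the sketch below. It is standalone Java, not Lucene code: SketchSimilarity is a hypothetical stand-in that mirrors only the two tf methods mentioned (tf(int) delegating to tf(float), with the default square-root curve), and FlatTfSimilarity and FlatTfDemo are illustrative names. The choice of returning 1 for any match is an assumption about what "disabling TF" should mean.

```java
// Hypothetical stand-in for the tf part of a Lucene-style Similarity.
// Assumption: it mirrors only the delegation described in the thread.
class SketchSimilarity {
    // Used for phrase-like matches, where the frequency can be fractional.
    float tf(float freq) {
        return (float) Math.sqrt(freq); // default square-root damping
    }

    // Used for plain terms; delegates to tf(float) by default.
    float tf(int freq) {
        return tf((float) freq);
    }
}

// Ignores how often a term occurs: a match counts once, a non-match not at all.
// Overriding tf(float) alone is enough because tf(int) delegates to it.
class FlatTfSimilarity extends SketchSimilarity {
    @Override
    float tf(float freq) {
        return freq > 0 ? 1.0f : 0.0f;
    }
}

class FlatTfDemo {
    public static void main(String[] args) {
        SketchSimilarity classic = new SketchSimilarity();
        SketchSimilarity flat = new FlatTfSimilarity();
        for (int freq : new int[] {0, 1, 4, 25}) {
            System.out.printf("freq=%d classic=%.2f flat=%.2f%n",
                    freq, classic.tf(freq), flat.tf(freq));
        }
    }
}
```

Because positions stay in the index, phrase queries would keep working; only the frequency contribution to the score is flattened.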