Re: filter by term frequency
Jack Krupansky 2012-06-16, 21:26
If you were a *Solr* user, I could say "try the 'termfreq' function query":
termfreq(field,term) returns the number of times the term appears in the field for that document.

Example Syntax: termfreq(text,'memory')

See: http://wiki.apache.org/solr/FunctionQuery#tf

Lucene does have "FunctionQuery", "ValueSource", and "TermFreqValueSource".

See: http://lucene.apache.org/solr/api/org/apache/solr/search/function/FunctionQuery.html

-- Jack Krupansky

-----Original Message-----
From: Mike Sokolov
Sent: Saturday, June 16, 2012 2:33 PM
To: [EMAIL PROTECTED]
Subject: filter by term frequency

I imagine this is a question that comes up from time to time, but I haven't been able to find a definitive answer anywhere, so...

I'm wondering whether there is some type of Lucene query that filters by term frequency. For example, suppose I want to find all documents that have exactly 2 occurrences of some word. I know that the frequency is stored and used in scoring, but I don't think it is exposed in a simple way at the query level.

It looks to me as if CustomScoreQuery might be a convenient way to monkey with scores? But it doesn't seem to use that for filtering, just sorting. Perhaps a Collector could then impose a score threshold later?

Any suggestions here?

-Mike
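To make the requirement in the question concrete ("all documents that have exactly 2 occurrences of some word"), here is a minimal plain-Java sketch of the filtering logic itself. It deliberately does not use the Lucene or Solr APIs: the class name TermFreqFilter, the methods termFreq and docsWithExactFreq, and the naive lowercase/split tokenizer are all hypothetical and chosen only for illustration. A real index would read the stored term frequency rather than rescan the text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only: not part of Lucene or Solr.
public class TermFreqFilter {

    // Count how often `term` occurs in `text`, using a naive tokenizer
    // (lowercase, split on anything that is not a letter or digit).
    static int termFreq(String text, String term) {
        String wanted = term.toLowerCase(Locale.ROOT);
        int count = 0;
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (token.equals(wanted)) {
                count++;
            }
        }
        return count;
    }

    // Return the positions (0-based) of documents whose frequency for
    // `term` is exactly `freq`. This is a filter, not a scoring step.
    static List<Integer> docsWithExactFreq(List<String> docs, String term, int freq) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            if (termFreq(docs.get(i), term) == freq) {
                hits.add(i);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
            "memory leak in memory pool",  // 2 occurrences
            "out of memory",               // 1 occurrence
            "no match here");              // 0 occurrences
        System.out.println(docsWithExactFreq(docs, "memory", 2)); // prints [0]
    }
}
```

The point of the sketch is only that the frequency test is a yes/no predicate per document, which is why a filtering step (rather than a change to the score) is the natural place for it.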