Re: What is the "docs" number in Solr explain query results for fieldnorm?
Andrzej Bialecki 2012-05-25, 18:35
On 25/05/2012 20:13, Tom Burton-West wrote:
> Hello all,
>
> I am trying to understand the output of Solr explain for a one word query.
> I am querying on the "ocr" field with no stemming/synonyms or stopwords.
> And no query or index time boosting.
>
> The query is "ocr:the"
>
> The document (result below) which contains two words "The Aeroplane" gets
> more hits than documents with 50 or more occurrences of the word "the"
> Since the idf is the same I am assuming this is a result of length norms.
>
> The explain (debugQuery) shows the following for fieldnorm:
> 0.625 = fieldNorm(field=ocr, doc=16624)
> What does the "doc=16624" mean? It certainly can not represent either the
> length of the field (as an integer) since there are only two terms in the
> field.
> It can't represent the number of docs with the query term (the idf output
> shows the word "the" occurs in 16,219 docs.

Hi Tom,

This is the internal document number within the Lucene index. It is of no
use at the level of the Solr APIs, because you can't actually do anything
with it there. At the Lucene level (e.g. in Luke) you could navigate to this
number and, for example, retrieve the stored fields of this document. As
shown in the Explanation output, it can only be used to correlate the parts
of the query that matched the same document number.

--
Best regards,
Andrzej Bialecki <><
 ___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com
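
A side note on the 0.625 in the quoted explain line: it is consistent with the length-norm guess for a two-term field. The sketch below is only an illustration, not something taken from the thread. It assumes the Lucene 3.x DefaultSimilarity behaviour (length norm = 1/sqrt(number of terms), no boosts) and a Python port of Lucene's one-byte SmallFloat "315" norm encoding as I understand it. The function names are made up for this example.

```python
import math
import struct

# Assumption: mirrors Lucene's SmallFloat.floatToByte315 / byte315ToFloat.
ZERO_POINT = (63 - 15) << 3  # offset subtracted when packing into one byte


def float_to_byte315(f: float) -> int:
    """Pack a float32 into the single byte used to store a norm."""
    bits = struct.unpack(">i", struct.pack(">f", f))[0]  # raw float32 bits
    small = bits >> (24 - 3)
    if small <= ZERO_POINT:
        return 0 if bits <= 0 else 1   # underflow: zero or smallest value
    if small >= ZERO_POINT + 0x100:
        return 0xFF                    # overflow: largest value
    return small - ZERO_POINT


def byte315_to_float(b: int) -> float:
    """Unpack the stored byte back into a float32 value."""
    if b == 0:
        return 0.0
    bits = ((b & 0xFF) << (24 - 3)) + ((63 - 15) << 24)
    return struct.unpack(">f", struct.pack(">i", bits))[0]


def field_norm(num_terms: int) -> float:
    """Length norm after the lossy one-byte round trip (no boosts assumed)."""
    return byte315_to_float(float_to_byte315(1.0 / math.sqrt(num_terms)))


if __name__ == "__main__":
    for n in (1, 2, 4):
        print(n, field_norm(n))
    # field_norm(2) -> 0.625: 1/sqrt(2) = 0.7071 loses precision in the byte
```

So under these assumptions a field with two terms ("The Aeroplane") ends up with fieldNorm 0.625, while the doc=16624 part is just the internal document number described above.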