Lucene, mail # user - Getting the frequencies by corresponding order of documents were indexed


Re: Getting the frequencies by corresponding order of documents were indexed
Ian Lea 2012-05-11, 11:22
Can't spot anything obviously wrong in your code and what you are
trying to do should work.  Are you positive that what you think is the
second doc is really being added second?  You only show one doc being
added.  Are there already 7 docs in the index before you start?
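
One way to check this is to store each document's name next to its
content and print it beside the docID when reading back. Below is a
minimal sketch against the 3.5 API. The stored "docname" field, the
in-memory RAMDirectory, and the TermVectorOrderCheck class and method
names are assumptions for illustration, not taken from your code.

```java
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.TermFreqVector;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

// Hypothetical helper: indexes every document with a single writer, then
// reports which stored name sits at which docID.
public class TermVectorOrderCheck {

    /** Returns name -> number of distinct terms, iterated in docID order. */
    public static Map<String, Integer> indexAndReadBack(String[] docNames, String[] contents)
            throws IOException {
        Directory dir = new RAMDirectory(); // assumption: fresh in-memory index, so no old docs

        IndexWriter writer = new IndexWriter(dir,
                new IndexWriterConfig(Version.LUCENE_35, new StandardAnalyzer(Version.LUCENE_35)));
        try {
            for (int i = 0; i < contents.length; i++) {
                Document doc = new Document();
                // Stored, un-analyzed name so the document can be identified later.
                doc.add(new Field("docname", docNames[i],
                        Field.Store.YES, Field.Index.NOT_ANALYZED));
                // Analyzed content with a term vector, as in the original post.
                doc.add(new Field("doccontent", contents[i],
                        Field.Store.NO, Field.Index.ANALYZED, Field.TermVector.YES));
                writer.addDocument(doc);
            }
        } finally {
            writer.close(); // one writer for the whole batch, closed once
        }

        Map<String, Integer> termCountsByName = new LinkedHashMap<String, Integer>();
        IndexReader reader = IndexReader.open(dir, true); // read-only reader
        try {
            for (int docId = 0; docId < reader.maxDoc(); docId++) {
                if (reader.isDeleted(docId)) {
                    continue; // skip slots left behind by deleted documents
                }
                String name = reader.document(docId).get("docname");
                TermFreqVector tfv = reader.getTermFreqVector(docId, "doccontent");
                int distinctTerms = (tfv == null) ? 0 : tfv.getTerms().length;
                System.out.println(docId + " -> " + name + " (" + distinctTerms + " distinct terms)");
                termCountsByName.put(name, distinctTerms);
            }
        } finally {
            reader.close();
        }
        return termCountsByName;
    }
}
```

If the names print in an unexpected order, keying your results by the
stored name rather than by docID avoids the problem: document numbers
are internal to the index and can change as segments are merged.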
--
Ian.
On Fri, May 11, 2012 at 8:58 AM, Kasun Perera <[EMAIL PROTECTED]> wrote:
> I have a collection of documents (say 10 documents) and I'm indexing them
> this way, storing the term vector:
>
> StringReader strRdElt = new StringReader(content);
>
>
>    Document doc = new Document();
>
>    String docname=docNames[docNo];
>
>    doc.add(new Field("doccontent", strRdElt, Field.TermVector.YES));
>
>    IndexWriter iW;
>    try {
>
>        NIOFSDirectory dir = new NIOFSDirectory(new File(pathToIndex)) ;
>
>        iW = new IndexWriter(dir, new IndexWriterConfig(Version.LUCENE_35,
>
>                new StandardAnalyzer(Version.LUCENE_35)));
>
>        iW.addDocument(doc);
>        iW.close();
>
>    }
>
> After indexing all the documents, I'm getting the term frequencies of each
> document this way:
>
>
> IndexReader re = IndexReader.open(FSDirectory.open(new File(pathToIndex)), true);
> // allocate the array before filling it; one slot per document
> TermFreqVector[] termsFreq = new TermFreqVector[noOfDocs];
> for (int i = 0; i < noOfDocs; i++) {
>     termsFreq[i] = re.getTermFreqVector(i, "doccontent");
> }
>
> My problem is that the term frequency vectors do not come back in the
> order I indexed the documents. For example, the terms and term
> frequencies of the 2nd document I indexed show up at "termsFreq[9]".
>
> What is the reason for that? How can I get the term frequencies in the
> order in which I indexed the documents?
>
>
> --
> Regards
>
> Kasun Perera
