|
Karthik N S
2004-05-27, 05:00
Ype Kingma
2004-05-27, 07:15
Karthik N S
2004-05-27, 07:37
Erik Hatcher
2004-05-27, 09:20
Otis Gospodnetic
2004-05-27, 12:58
Ype Kingma
2004-05-27, 17:32
Karthik N S
2004-05-28, 03:54
Ype Kingma
2004-05-28, 07:43
Karthik N S
2004-05-28, 08:54
Erik Hatcher
2004-05-28, 10:07
Karthik N S
2004-05-28, 11:14
Ype Kingma
2004-05-28, 18:40
Karthik N S
2004-05-31, 04:12
Ype Kingma
2004-05-31, 08:17
Karthik N S
2004-05-31, 09:00
Karthik N S
2004-05-31, 09:09
Ype Kingma
2004-05-31, 09:22
Karthik N S
2004-05-31, 11:47
Erik Hatcher
2004-05-31, 13:20
Ype Kingma
2004-05-31, 15:10
Karthik N S
2004-06-01, 12:10
Erik Hatcher
2004-06-01, 13:07
Karthik N S
2004-06-02, 10:20
Erik Hatcher
2004-06-02, 12:46
Ype Kingma
2004-06-02, 18:55
Karthik N S
2004-06-03, 05:10
Ype Kingma
2004-06-03, 06:53
|
-
Range Query Sombody HELP please
Karthik N S 2004-05-27, 05:00
Hi Lucene developers,

Is it possible to search the indexed documents and retrieve only the hits that fall within a specific range, similar to this SQL query?

select * from BOOKSHELF where book1 between 100 and 200

For example:

"search_word", Book between 100 AND 200

[Note: Book is a unique field that is already indexed.]

Somebody please help me :(

With regards,
Karthik
-
Re: Range Query Sombody HELP please
Ype Kingma 2004-05-27, 07:15
On Thursday 27 May 2004 07:00, Karthik N S wrote:
...
> Query in SQL = select * from BOOKSHELF where book1 between 100 and
> 200
>
> ex:-
>
> "search_word" , Book between 100 AND 200
>
> [ Note:- where Book uniquefield hit info which is already Indexed ]

The query parser can construct this query for you (assuming search_word is in the query default field):

+search_word +(book:[100 TO 200])

See also:
http://jakarta.apache.org/lucene/docs/queryparsersyntax.html

One problem you might run into is that Lucene does not support numbers directly; only strings are indexed. You can index these numbers with enough leading zeros and add the same leading zeros in the query.
Erik Hatcher wrote an article on how to make the query:
http://today.java.net/pub/a/today/2003/11/07/QueryParserRules.html
You'll need to override the getRangeQuery() method.

Have fun,
Ype
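A minimal sketch of the zero-padding idea mentioned above, in plain Java with no Lucene dependency. The class name ZeroPad and the width of 6 are assumptions made for illustration only; the point is that padded strings sort in the same order as the numbers they represent, which is what a string-based range needs.

```java
// Illustrative helper, not part of Lucene: pads non-negative integers with
// leading zeros so that string order matches numeric order.
final class ZeroPad {
    // Hypothetical fixed width; it must be at least as wide as the largest value.
    static final int WIDTH = 6;

    public static String pad(int value) {
        if (value < 0) {
            throw new IllegalArgumentException("negative values need a different encoding");
        }
        String digits = Integer.toString(value);
        if (digits.length() > WIDTH) {
            throw new IllegalArgumentException("value does not fit in " + WIDTH + " digits");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = digits.length(); i < WIDTH; i++) {
            sb.append('0');
        }
        return sb.append(digits).toString();
    }

    public static void main(String[] args) {
        // Unpadded, "1000" sorts before "200" because '1' < '2'.
        System.out.println("1000".compareTo("200") < 0);
        // Padded, 1000 sorts after 200, as it does numerically.
        System.out.println(pad(1000).compareTo(pad(200)) > 0);
        // The same padding would be applied to both ends of the range.
        System.out.println("book:[" + pad(100) + " TO " + pad(200) + "]");
    }
}
```

The same width has to be used when indexing and when querying, otherwise the padded query bounds will not line up with the indexed terms.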
-
RE: Range Query Sombody HELP please
Karthik N S 2004-05-27, 07:37
Hi Lucene developer,

My main intention is to search for a word hit in a unique field within a range, say between the indexed numbers book100 and book200.
It is something like creating a SUBSEARCH within the SEARCHINDEX.

This is similar to the SQL

select * from BOOKSHELF

or

select * from BOOKSHELF where book1 between 100 and 200

With regards,
Karthik
-
Re: Range Query Sombody HELP please
Erik Hatcher 2004-05-27, 09:20
On May 27, 2004, at 3:37 AM, Karthik N S wrote:
> Hi
> Lucene -Developer My main intention was
>
> Search for an word hit in a Unique Field between ranges say
> book100 - book 200 indexed numbers
> It's something like creating a SUBSEARCH with in the SEARCHINDEX.
>
> This is similar to a SQL
>
> select * from BOOKSHELF.
> or
> select * from BOOKSHELF where book1 between 100 and 200.

Karthik - unfortunately, I'm having a hard time understanding your questions. Ype replied with a suggested solution: override getRangeQuery in a custom QueryParser subclass. You need to ensure you are indexing numbers in a padded fashion:
http://wiki.apache.org/jakarta-lucene/SearchNumericalFields

Erik
-
Re: Range Query Sombody HELP please
Otis Gospodnetic 2004-05-27, 12:58
Karthik, namaste!

I seem to be getting multiple copies of your email. I received 4 copies of this email. Could you please limit things to 1 message per subject? I get hundreds of messages every day as is. :(

Thank you,
Otis
-
Re: Range Query Sombody HELP please
Ype Kingma 2004-05-27, 17:32
On Thursday 27 May 2004 09:37, Karthik N S wrote:
> Hi
> Lucene -Developer My main intention was
>
> Search for an word hit in a Unique Field between ranges say
> book100 - book 200 indexed numbers
> It's something like creating a SUBSEARCH with in the SEARCHINDEX.

You don't need to shout (uppercase); I've been teaching SQL.
Could you explain what you mean by subsearch?
I suppose you might want to have a look at the various filter classes in the org.apache.lucene.search package.

Regards,
Ype
-
RE: Range Query Sombody HELP please
Karthik N S 2004-05-28, 03:54
Hey Ype,

Apologies for the misconduct.

When we search in SQL using '*', we all know that the result is the total number of records in the table. When we want to limit the records, we apply a range between two specific rows [which we call a subsearch].

I would like to apply the same technique to indexed records.

In fact, I was looking at the URLs you sent me in the last mail on range queries,

http://jakarta.apache.org/lucene/docs/queryparsersyntax.html

and

http://today.java.net/pub/a/today/2003/11/07/QueryParserRules.html

and have been working on this for the last 12 hours, but without results.

If you could spare a few minutes, please explain or provide a simple [full] example of overriding the getRange() method.

With regards,
Karthik
-
Re: Range Query Sombody HELP please
Ype Kingma 2004-05-28, 07:43
Karthik,

On Friday 28 May 2004 05:54, Karthik N S wrote:
...
> Weh we do a search in SQL using '*' we all know that the result would be
> total no of records in the table,but when we want to get limit our record
> we apply range between 2 specific row records [Which we call it as
> subsearch]
>
>
> Similarly on a indexed record I would like perform the same tecnique
> as above.

In case you need to reuse the limitation, a filter is the way to go in Lucene.
However, it seems better to get the range query working first.

> In fact I was looking at the url u sent me in the last mail on using
> getRange Queries
> and was working on the same
>
> http://jakarta.apache.org/lucene/docs/queryparsersyntax.html

The query I gave uses two +'s prefixed to the query parts:

+search_word +(book:[100 TO 200])

Both query parts are required because of the +'s, i.e. it works like the AND operator in SQL. The TO operator queries the range in the book field.

> and
>
> http://today.java.net/pub/a/today/2003/11/07/QueryParserRules.html
>
> but witou results for the last 12 hrs.

You have probably seen a lot of different things that will be useful later.

> If u could spare a few minuts and please expalin or provide a simple [
> full ] example using and
> over riding the getRange() method .

The problem you'll probably run into is that Lucene does not support numbers directly; you'll have to index them as strings, e.g. by prefixing zeros, as Erik indicated:
http://wiki.apache.org/jakarta-lucene/SearchNumericalFields
You may have to reindex your data for this. In case you have a lot of data, consider setting up a test first.

Then, in the getRangeQuery() method of your parser, you'll need to prefix the queried numbers in the same way. The example in the article is about date fields, but adapting it to numbers shouldn't be a problem.

When you override this in your query parser:

getRangeQuery(String field, Analyzer analyzer, String start, String end, boolean inclusive)

it will be called for the example query with start = "100" and end = "200". (See http://today.java.net/pub/a/today/2003/11/07/QueryParserRules.html under Customizing query parser.)
In the overriding method you can then call the super method with the start and end prefixed with zeros, as described in the page on searching numerical fields referred to above.

Have fun, you'll get it working,
Ype
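The override-and-delegate pattern described above can be sketched without Lucene on the classpath. Everything here is an assumption made for illustration: SimpleRangeParser is a stand-in base class, not the Lucene QueryParser; its getRangeQuery omits the Analyzer parameter and returns the range as text instead of a Query object; and the padding width of 6 is hypothetical. In a real parser subclass the same padding step would go in front of the call to the parent's getRangeQuery().

```java
// Subclass that pads numeric range bounds before delegating to its parent.
class PaddedRangeParser extends SimpleRangeParser {
    // Hypothetical width; it must match the width used at indexing time.
    private static final int WIDTH = 6;

    @Override
    public String getRangeQuery(String field, String start, String end, boolean inclusive) {
        // Pad both bounds the same way the indexed values were padded.
        return super.getRangeQuery(field, pad(start), pad(end), inclusive);
    }

    // Left-pads an all-digit string with zeros; anything else is returned unchanged.
    public static String pad(String s) {
        if (s.isEmpty() || s.length() >= WIDTH) {
            return s;
        }
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') {
                return s;
            }
        }
        StringBuilder sb = new StringBuilder();
        for (int i = s.length(); i < WIDTH; i++) {
            sb.append('0');
        }
        return sb.append(s).toString();
    }

    public static void main(String[] args) {
        SimpleRangeParser parser = new PaddedRangeParser();
        // Called as it would be for the example query, with start = "100" and end = "200".
        System.out.println(parser.getRangeQuery("book", "100", "200", true));
    }
}

// Stand-in for a query parser base class. This is NOT the Lucene QueryParser;
// it only renders the range as text so the example runs on its own.
class SimpleRangeParser {
    public String getRangeQuery(String field, String start, String end, boolean inclusive) {
        String open = inclusive ? "[" : "{";
        String close = inclusive ? "]" : "}";
        return field + ":" + open + start + " TO " + end + close;
    }
}
```

Leaving non-numeric bounds untouched is a design choice for this sketch; it keeps the subclass usable for fields that hold ordinary strings.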
-
Range Query Sombody HELP please
Karthik N S 2004-05-28, 08:54
Hey Ype,

Thanks for the advice, but I still need to get my exact situation working.

1) I have a unique field [called filename] which is indexed as type Text. It takes the name of the HTML file as its value. There is another field called "Contents" which stores the full contents of that uniquely named HTML file.

2) The indexer successfully indexes about 5000 HTML files.

3) When I search for a word, it returns 400 hits on various HTML files.

Now, in this situation, if I want to limit the hits to the first 200 to 400 HTML page names only, what exactly should I do using the getRange() method?

Please advise on how to proceed ...

With regards,
Karthik
-
Re: Range Query Sombody HELP please
Erik Hatcher 2004-05-28, 10:07
On May 28, 2004, at 4:54 AM, Karthik N S wrote:
> 1) I have a unique Field [ called filename ] which is indexed of type
> Text.

You probably do not want to use Field.Text for a filename. Use Field.Keyword instead.

> 2) The indexer complete indexes for about 5000 html files sucessfully
> .

Now use Luke (Google for _luke lucene_) to browse your index, and check that you are getting what you think. You can do ad-hoc queries there also.

> Now in this situation if I want to limit the hits between First 200
> to
> 400 html Page Names only
> what exactly should I do to using getRange() method.

If you want the first 200 - 400, start walking your Hits at index 200 and proceed through 400. Is there some field you want to key off to do the range? Or do you just want the 200th - 400th hits from the search, which is an entirely different question from one about ranges?

> Please advise on how to proceed ...

Please send (succinct) code examples in the future to really keep this discussion concrete and clear.

Erik
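Walking a window of the ranked results, as opposed to running a range query, might look like the sketch below. It is an illustration under stated assumptions: a plain java.util.List stands in for the search results, the page names are made up, and HitWindow is a hypothetical helper, not a Lucene class.

```java
import java.util.ArrayList;
import java.util.List;

final class HitWindow {
    // Returns the results at positions [from, to), clamped to what is available.
    public static <T> List<T> window(List<T> rankedHits, int from, int to) {
        int start = Math.max(0, Math.min(from, rankedHits.size()));
        int end = Math.max(start, Math.min(to, rankedHits.size()));
        return new ArrayList<>(rankedHits.subList(start, end));
    }

    public static void main(String[] args) {
        // Made-up stand-in for 400 hits, already in the order the search returned them.
        List<String> rankedHits = new ArrayList<>();
        for (int i = 0; i < 400; i++) {
            rankedHits.add("page-" + i);
        }
        List<String> slice = window(rankedHits, 200, 400);
        System.out.println(slice.size() + " hits, first = " + slice.get(0));
    }
}
```

This selects by position in the result order, so which documents land in the window depends entirely on how the hits are sorted; a range query selects by the indexed value of a field instead.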
-
Range Query Sombody HELP please
Karthik N S 2004-05-28, 11:14
Hey Erik,

Apologies again.

[ You probably do not want to use Field.Text for a filename. Use Field.Keyword instead. ]

1) When I change the field type from Text to Keyword, I do not get any hits at all [most of the parameters available to this field are of String type ... file[i].getName()].

2) After successful indexing, the search returns 400 hits on various HTML files where the search word is present in the content field.

3) I have to limit the hits to those between the file names (file[100].getName() and file[200].getName()) on the field "filename" for the searched word. I did it the way Ype advised in his last mail, but there is still no improvement. I need the hits that fall between the 2 files [100 files in between], not the maximum number of hits.

Please advise me how to proceed ...

4) I installed Luke [via Java Web Start] from http://www.getopt.org/luke/webstart.html, but my index files are built with a custom-made analyzer [not one of the standard analyzers available from the drop-down box]. Will it be able to search the index?

With regards,
Karthik
-
Re: Range Query Sombody HELP please
Ype Kingma 2004-05-28, 18:40
On Friday 28 May 2004 10:54, Karthik N S wrote:
...
> 3) When I do a search for word ,it returns a hit of 400 on various html
> files
>
> Now in this situation if I want to limit the hits between First 200 to
> 400 html Page Names only
> what exactly should I do to using getRange() method.

A range query will provide a range of indexed values, and I thought you needed to add the record number as an indexed field in each record.
However, you seem to use the 200 and 400 here as the order numbers of the records in the result of the query on the Contents field. Is that correct?
If so, in which order do you expect the results of your query?

Kind regards,
Ype
-
Range Query Sombody HELP please
Karthik N S 2004-05-31, 04:12
Hey Ype,

Apologies. Please have a look at the search hits in this sample output from my indexed files:

================== Start Searching =========================
Search Keyword : king~
Source path [ E:/po/aaaa ] : e:/indexer/b10181
Query: ['king~'] in Folder e:/indexer/b10181/b10181_indx_
Not a Found document(s) that matched query Field 'filename':
Not a Found document(s) that matched query Field 'bookid':
Not a Found document(s) that matched query Field 'creation':
Not a Found document(s) that matched query Field 'chapNme':
Not a Found document(s) that matched query Field 'itmName':
Found document(s) that matched : 'king~' no of hits :'67' in query Field :'contents'

File Name : B10181_P703
File Path : E:\po\catalog\B10181\B10181_P703
Modified Date : 1080036442000
Bookid : B10181
Chapter Name :
Item Name :

File Name : B10181_P702
File Path : E:\po\catalog\B10181\B10181_P702
Modified Date : 1080036442000
Bookid : B10181
Chapter Name :
Item Name :

File Name : B10181_P512
File Path : E:\po\catalog\B10181\B10181_P512
Modified Date : 1080036438000
Bookid : B10181
Chapter Name :
Item Name :

File Name : B10181_P40
File Path : E:\po\catalog\B10181\B10181_P40
Modified Date : 1080036444000
Bookid : B10181
Chapter Name :
Item Name :

File Name : B10181_P355
File Path : E:\po\catalog\B10181\B10181_P355
Modified Date : 1080036436000
Bookid : B10181
Chapter Name :
Item Name :

File Name : B10181_P379
File Path : E:\po\catalog\B10181\B10181_P379
Modified Date : 1080036436000
Bookid : B10181
Chapter Name :
Item Name :
. . . . . .
328 Total milliseconds
================== End Searching ===========

The output shows 67 hits in total [I have snipped most of them for readability]. The search word is present in the field "Contents", where the content of the HTML file is indexed.

As you can see, the field "File Name" is unique and is indexed and displayed as it appears in Windows Explorer.

My question now is: if I want to use a range query to get only the search hits between fileName "B10181_P702" and "B10181_P355" instead of all 67 hits, how do I do it? [Please give a clear example or send me an attachment. I overrode the getRange() query method as per your last mail, but I am still not able to achieve the results.]

With regards,
Karthik
-
Re: Range Query Sombody HELP please
Ype Kingma 2004-05-31, 08:17
Karthik,

On Monday 31 May 2004 06:12, Karthik N S wrote:
> Hey Ype
...
>
> My Question now is, If I want to Use Range Query to get search hits
> between
> fileName "B10181_P702" and "B10181_P355" only Instead of all the 67 hits
> ,
>

In this case there is no need to override the range query; just use

+fileName:[B10181_P702 TO B10181_P355]

as part of the query.

Kind regards,
Ype
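Since range bounds on a string field are compared as strings, it may be worth checking the order of the two file names before relying on the range. The sketch below uses plain Java string comparison as a stand-in for term ordering, which is an assumption about how closely the two agree; RangeBoundsCheck and inRange are hypothetical names, not Lucene APIs.

```java
final class RangeBoundsCheck {
    // Inclusive range test using plain string order.
    public static boolean inRange(String term, String lower, String upper) {
        return term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0;
    }

    public static void main(String[] args) {
        String p702 = "B10181_P702";
        String p355 = "B10181_P355";

        // "B10181_P702" sorts after "B10181_P355" because '7' > '3'.
        System.out.println(p702.compareTo(p355) > 0);

        // With P702 as the lower bound, nothing can fall inside the range.
        System.out.println(inRange("B10181_P512", p702, p355));

        // With the bounds swapped, P512 is inside the range.
        System.out.println(inRange("B10181_P512", p355, p702));

        // P40 is inside too, because the page numbers are not zero-padded.
        System.out.println(inRange("B10181_P40", p355, p702));
    }
}
```

The last case is the same string-versus-number ordering issue that zero-padding addresses: a page name such as B10181_P040 would sort where page 40 belongs.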
-
Range Query Sombody HELP please
Karthik N S 2004-05-31, 09:00
Hey Ype,

Apologies again. I did as per your mail, but see the error:

Search Keyword : +king+filename:[b10181_p702 TO b01081_p355]
Source path [ E:/po/aaaa ] : e:/indexer3/b10181
The Exception Raised file = SearchFiles.searchIndx0
java.lang.NegativeArraySizeException
    at org.apache.lucene.index.TermInfosReader.readIndex(TermInfosReader.java:106)
    at org.apache.lucene.index.TermInfosReader.<init>(TermInfosReader.java:82)
    at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:141)
    at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:120)
    at org.apache.lucene.index.IndexReader$1.doBody(IndexReader.java:118)
    at org.apache.lucene.store.Lock$With.run(Lock.java:148)
    at org.apache.lucene.index.IndexReader.open(IndexReader.java:111)
    at org.apache.lucene.search.IndexSearcher.<init>(IndexSearcher.java:80)
    at com.controlnet.indexing.search.SearchFiles.searchIndex0(SearchFiles.java:68)
    at com.controlnet.indexing.search.SearchFiles.main(SearchFiles.java:240)

[Note: the field filename is in lower case, not fileName; sorry about that.]

Am I doing something wrong here?

With regards,
Karthik
-
Range Query Sombody HELP please
Karthik N S 2004-05-31, 09:09
Hey Ype,

Sorry once again, and apologies for my last mail. I reindexed my folder 10181 [it seems to have been corrupted].

Now I am getting these results:

D:\JAVA\lucene\src\demo>java org.lucene.src.indexer.search.SearchFiles
Search Keyword : +button+filename:[B10181_P702 TO B01081_P355]
Source path [ E:/po/aaaa ] : e:/indexer3/b10181
Query: ['+button+filename:[B10181_P702 TO B01081_P355]'] in Folder e:/indexer3/b10181/b10181_indx_
Not a Found document(s) that matched query Field 'filename':
Not a Found document(s) that matched query Field 'bookid':
Not a Found document(s) that matched query Field 'creation':
Not a Found document(s) that matched query Field 'contents':
Not a Found document(s) that matched query Field 'chapNme':
Not a Found document(s) that matched query Field 'itmName':
204 Total milliseconds

D:\JAVA\lucene\src\demo>java java org.lucene.src.indexer.search.SearchFiles
Search Keyword : button+filename:[B10181_P702 TO B01081_P355]
Source path [ E:/po/aaaa ] : e:/indexer3/b10181
Query: ['button+filename:[B10181_P702 TO B01081_P355]'] in Folder e:/indexer3/b10181/b10181_indx_
Not a Found document(s) that matched query Field 'filename':
Not a Found document(s) that matched query Field 'bookid':
Not a Found document(s) that matched query Field 'creation':
Not a Found document(s) that matched query Field 'contents':
Not a Found document(s) that matched query Field 'chapNme':
Not a Found document(s) that matched query Field 'itmName':

Is this correct, or is something still wrong as far as the query parse string is concerned?

With regards,
Karthik
-
Re: Range Query Sombody HELP please
Ype Kingma 2004-05-31, 09:22
On Monday 31 May 2004 11:09, Karthik N S wrote:
...
> I re indexed my folder 10181 [Seem's to be corrupted]

Was the index writer closed?

> Now I am getting the hits as....
>
>
> D:\JAVA\lucene\src\demo>java org.lucene.src.indexer.search.SearchFiles
> Search Keyword : +button+filename:[B10181_P702 TO B01081_P355]

The query needs to have a space before the 2nd +:

+button +filename:[B10181_P702 TO B01081_P355]

> Source path [ E:/po/aaaa ] : e:/indexer3/b10181
> Query: ['+button+filename:[B10181_P702 TO B01081_P355]'] in Folder
> e:/indexer3/b10181/b10181_indx_
> Not a Found document(s) that matched query Field 'filename':
> Not a Found document(s) that matched query Field 'bookid':
> Not a Found document(s) that matched query Field 'creation':
> Not a Found document(s) that matched query Field 'contents':
> Not a Found document(s) that matched query Field 'chapNme':
> Not a Found document(s) that matched query Field 'itmName':

You seem to use a search mechanism that searches all these fields. I'd recommend switching this off until a query with explicit fields works, e.g.:

+contents:button +filename:[B10181_P702 TO B01081_P355]

Btw, you'll need to make sure that a term like B10181_P702 is not split at the underscore _ by a tokenizer at indexing time. If your filename is not a keyword field, you might consider changing it into a keyword field.

You seem to index book pages as Lucene documents, which is ok. However, you may also need to index larger parts of the books in order to retrieve books with multiple subjects on different pages. Is this what your original question is about?

Have fun,
Ype
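To illustrate the concern about a term being split at the underscore, here is a small simulation in plain Java. The tokenizer below is hypothetical: it lowercases and splits on anything that is not an ASCII letter or digit, which is only one way an analyzer might behave and is not a claim about the custom analyzer used in this thread.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

final class TokenSplitDemo {
    // Hypothetical tokenizer: lowercases, then splits on non-alphanumeric characters.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    // Keyword-style handling: the whole value is kept as a single, unchanged term.
    public static List<String> keyword(String text) {
        return Collections.singletonList(text);
    }

    public static void main(String[] args) {
        // Tokenized: two separate terms, so no single term holds the full file name.
        System.out.println(tokenize("B10181_P702"));
        // Keyword: one term that a range over file names can compare against.
        System.out.println(keyword("B10181_P702"));
    }
}
```

If the indexed terms look like the first line, a range over whole file names has nothing to match; an index browser such as Luke can show which terms were actually stored in the field.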
-
RE: Range Query Sombody HELP pleaseKarthik N S 2004-05-31, 11:47
Hey Ype...
1) I switched Off the Multi search Senerio. 2) Changing the Field type from Text to Keyword will fail When I search for the the Field type "filename" so,I still maintained it to be Text D:\JAVA\lucene\src\demo>java org.lucene.src.indexer.search.SearchFiles Search Keyword : b10181_p388 Source path [ E:/po/aaaa ] : e:/indexer3/b10181 Query: ['b10181_p388'] in Folder e:/indexer3/b10181/b10181_indx_ Found document(s) that matched : 'b10181_p388' no of hits :'1' in query Field :'filename' File Name : B10181_P388 3)On Search for range between 2 file names B10181_P702 to B01081_P355 still returns me 0 hits [Included space before the 2nd '+' ] D:\JAVA\lucene\src\demo>java org.lucene.src.indexer.search.SearchFiles Search Keyword : +button +filename:[b10181_p702 TO b10181_p355] Source path [ E:/po/aaaa ] : e:/indexer3/b10181 Query: ['+button +filename:[b10181_p702 TO b10181_p355]'] in Folder e:/indexer3/b10181/b10181_indx_ Not a Found document(s) that matched query Field 'filename': or D:\JAVA\lucene\src\demo>java com.controlnet.indexing.search.SearchFiles Search Keyword : +contents:button +filename:[b10181_p702 TO b10181_p355] Source path [ E:/po/aaaa ] : e:/indexer3/b10181 Query: ['+contents:button +filename:[b10181_p702 TO b10181_p355]'] in Folder e:/indexer3/b10181/b10181_indx_ Not a Found document(s) that matched query Field 'filename': Also the does the search varies on the Field Type if so My Indexed Field types as below.... doc.add(Field.Text("path", fhtml.getPath())); doc.add(Field.Keyword("modified",fhtml.lastModified()+"")); doc.add(Field.Text("filename",fhtml.getName())); doc.add(Field.Keyword("creation",CREATION_)); doc.add(Field.Keyword("bookid",BOOKID_)); doc.add(Field.Text("chapNme",CHAPNAME_)); doc.add(Field.Text("itmName",ITEMNAME_)); please do advise me. 
Karthik

[ James Goslink says Microsoft has More Money to burn then GOD has ...on his visit to India,In an interview to MSNBC TV Last night ]
---------------------------------------------------------------------
-
Re: Range Query Sombody HELP pleaseErik Hatcher 2004-05-31, 13:20
Try my AnalysisDemo code on some filename field samples:
http://wiki.apache.org/jakarta-lucene/AnalysisParalysis

You mentioned earlier, I think, that you are using a custom analyzer. Give us the output of AnalysisDemo on some samples so we can see what is coming out.

If you can put together a 10-line Java program that uses RAMDirectory and has some sample hard-coded text that I can easily run standalone I would look into your situation further. As it is, you are providing far more complexity than I have time to delve into. Narrow it down to a very very simple example that we can all see in one screen.

Erik
-
Re: Range Query Sombody HELP pleaseYpe Kingma 2004-05-31, 15:10
Karthik,
On Monday 31 May 2004 13:47, Karthik N S wrote:
> Hey Ype...
>
> 1) I switched Off the Multi search Senerio.
>
> 2) Changing the Field type from Text to Keyword
> will fail When I search for the the Field type "filename"
> so,I still maintained it to be Text

Just make sure the file name is indexed as you show it, ie. the underscore should be in the indexed term. The best way to do that is to index the filename as keyword. Check the output of the analyzer, or use luke to see what is in the index for the filename field.

> D:\JAVA\lucene\src\demo>java org.lucene.src.indexer.search.SearchFiles
> Search Keyword : b10181_p388
> Source path [ E:/po/aaaa ] : e:/indexer3/b10181
> Query: ['b10181_p388'] in Folder e:/indexer3/b10181/b10181_indx_
>
> Found document(s) that matched : 'b10181_p388' no of hits :'1' in query
> Field :'filename'
> File Name : B10181_P388
>
> 3)On Search for range between 2 file names B10181_P702 to B01081_P355
> still returns me 0 hits [Included space before the 2nd '+' ]
>
> D:\JAVA\lucene\src\demo>java org.lucene.src.indexer.search.SearchFiles
> Search Keyword : +button +filename:[b10181_p702 TO b10181_p355]

Could you try this:

+button +filename:[b10181_p355 TO b10181_p702]

?

If this does not work, please narrow your problem down to a java test program of 10-20 lines, and post the code.

Regards,
Ype
---------------------------------------------------------------------
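Why the order of the two bounds matters can be shown with plain string comparison. The sketch below does not use Lucene at all: it assumes the range is an inclusive "lower <= term <= upper" test in ordinary String order, and the term list is made up for illustration. With the bounds reversed, no term can satisfy both halves of the test.

```java
import java.util.Arrays;
import java.util.SortedSet;
import java.util.TreeSet;

// Plain-Java illustration of an inclusive range over sorted terms; not Lucene code.
class RangeOrderSketch {

    // Returns every term t with lower <= t <= upper, compared as plain Strings.
    static SortedSet<String> termsInRange(SortedSet<String> terms, String lower, String upper) {
        SortedSet<String> hits = new TreeSet<>();
        for (String term : terms) {
            if (term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0) {
                hits.add(term);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical file-name terms, invented for this example.
        SortedSet<String> terms = new TreeSet<>(Arrays.asList(
                "b10181_p100", "b10181_p355", "b10181_p388", "b10181_p702", "b10181_p900"));

        // Bounds as first tried: lower is greater than upper, so nothing matches.
        System.out.println(termsInRange(terms, "b10181_p702", "b10181_p355")); // []

        // Bounds swapped as suggested above.
        System.out.println(termsInRange(terms, "b10181_p355", "b10181_p702"));
        // [b10181_p355, b10181_p388, b10181_p702]
    }
}
```

The comparison is character by character, so it only lines up with page order here because every page number has the same number of digits.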
-
Range Query Sombody HELP pleaseKarthik N S 2004-06-01, 12:10
Hey Ype/Erick
Apologies, please.

I sent you guys some code by mail, as requested. Did you receive it, or shall I resend it?

With regards,
Karthik
---------------------------------------------------------------------
-
Re: Range Query Sombody HELP pleaseErik Hatcher 2004-06-01, 13:07
On Jun 1, 2004, at 8:10 AM, Karthik N S wrote:
> Hey Ype/Erick
>
> Apologies please
>
> I sent u guys some code as per mail
> did u recieve it or shall i re send them.

I did not send it. Please just copy/paste it into an e-mail to the list.

Erik
---------------------------------------------------------------------
-
Range Query Sombody HELP pleaseKarthik N S 2004-06-02, 10:20
Hey Ype/Erick

Thanks for helping me with the range queries. Finally I was able to trace the wrong process within my code and close it.

I still have 3 small questions.

1) While creating the range query, is it possible for Lucene to do something similar to:

+(button AND shirt) +filename:[b10181_p100 TO b10181_p200]

[Do you think this will work?] It returns no hits, but a query with only one of the terms, "Shirt" or "button", does return hits.

2) When the indexer starts indexing, does it proceed in alphabetical order, or in some other way?

3) The field type "Keyword" is not accepting file names as it indexes [try indexing file names and then searching on them; the search will definitely return 0 hits, lucene1.3-final version]

doc.add(Field.Text("filename",file.getName()))     <---- will return hits
doc.add(Field.Keyword("filename",file.getName()))  <---- will not return hits

Why???

With regards,
Karthik
---------------------------------------------------------------------
-
Re: Range Query Sombody HELP pleaseErik Hatcher 2004-06-02, 12:46
On Jun 2, 2004, at 6:20 AM, Karthik N S wrote:
> Hey Ype/Erick

If you're gonna ask for help, the least ya could do is spell my name correctly :)

> I still have 3 small Questions.
>
> 1)While creating the Range Query Is it possible for Lucene to do
> somthing similar..
>
> +(button AND shirt) +filename:[b10181_p100 TO b10181_p200]
>
> [Do you think this will work] It's not on returning hits , but it does
> return hits with either one of them "Shirt" or "button" Only.

My guess is that none of your documents in that range have button AND shirt in them.

> 2)When the indexer start indexing does it do according to alphabetic
> order or is it some other way...

I don't understand the question, sorry. Terms in the index are ordered lexicographically, if that is what you mean.

> 3)The Field Type "Keyword" is not accepting name of Files as it indexes
> [ Try indexing filenames and then do a search on them ,the hits will
> return u 0 defnitly, lucene1.3-final version ]
>
> doc.add(Field.Text("filename",file.getName()))
> < -------------------- Will return Hits
>
> doc.add(Field.Keyword("filename",file.getName()))
> <-------------------- Will Not return Hits
>
> why???

Because of your analyzer. Try indexing as a Keyword and search using a TermQuery. Don't use QueryParser at first - it gets in the way of understanding what is really going on. For fun, look at the .toString of the Query generated by QueryParser if you like.

Look at the AnalysisParalysis page on the wiki for more details. Read my java.net articles to get a better understanding. The short answer is that it is analysis that is bogging you down here. You need to decide how to index file names based on how you plan on querying for them. We cannot answer this for you.

Erik
---------------------------------------------------------------------
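One way an analyzer could produce exactly this Text-works, Keyword-fails pattern is a case mismatch. The sketch below is plain Java, not Lucene: `analyze` is a hypothetical stand-in for an analyzer that lower-cases, and `exactMatch` only mimics the spirit of an unanalyzed term lookup. Whether lower-casing is what actually happens in the poster's setup is an assumption; the point is only that a term stored as-is must be queried as-is.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Toy model of analyzed versus unanalyzed terms; these are NOT Lucene classes.
class KeywordMismatchSketch {

    // Hypothetical analyzer: lower-cases the text and keeps it as one term.
    static String analyze(String text) {
        return text.toLowerCase(Locale.ROOT);
    }

    // Exact term lookup with no analysis of the search term.
    static boolean exactMatch(Set<String> indexedTerms, String term) {
        return indexedTerms.contains(term);
    }

    public static void main(String[] args) {
        String fileName = "B10181_P388";

        Set<String> textField = new HashSet<>();
        textField.add(analyze(fileName));   // analyzed at index time
        Set<String> keywordField = new HashSet<>();
        keywordField.add(fileName);         // stored exactly as given

        // What a parser that runs the analyzer over the query text would look up.
        String parsedTerm = analyze("b10181_p388");

        System.out.println(exactMatch(textField, parsedTerm));    // true
        System.out.println(exactMatch(keywordField, parsedTerm)); // false
        System.out.println(exactMatch(keywordField, fileName));   // true
    }
}
```

The last line is the toy equivalent of the advice above: look the keyword up by its exact indexed form instead of sending it through the parser.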
-
Re: Range Query Sombody HELP pleaseYpe Kingma 2004-06-02, 18:55
On Wednesday 02 June 2004 14:46, Erik Hatcher wrote:
> On Jun 2, 2004, at 6:20 AM, Karthik N S wrote:

...

> > I still have 3 small Questions.
> >
> > 1)While creating the Range Query Is it possible for Lucene to do
> > somthing similar..
> >
> > +(button AND shirt) +filename:[b10181_p100 TO b10181_p200]
> >
> > [Do you think this will work] It's not on returning hits , but it does
> > return hits with either one of them "Shirt" or "button" Only.
>
> My guess is you have documents none of your documents in that range
> have button AND shirt in them.

You can also try this:

+button +shirt +filename:[b10181_p100 TO b10181_p200]

I never got to completely understand the way the query parser deals with AND and OR, so I prefer to avoid them.

Regards,
Ype
---------------------------------------------------------------------
-
Range Query Sombody HELP pleaseKarthik N S 2004-06-03, 05:10
Hey
Ype, the range query

+button +shirt +filename:[b10181_p100 TO b10181_p200]

did not work for me, but the other way around,

+(button OR shirt) +filename:[b10181_p100 TO b10181_p200]

gave me 2 hits, each page containing one of the terms "button" / "shirt" but not both.

I found from the HTML files that both words are present in more than 2 files.

Are there any other possibilities for getting both words?

With regards,
Karthik
---------------------------------------------------------------------
-
Re: Range Query Sombody HELP pleaseYpe Kingma 2004-06-03, 06:53
On Thursday 03 June 2004 07:10, Karthik N S wrote:
> Hey
>
> Ype the Query of range
>
> +button +shirt +filename:[b10181_p100 TO b10181_p200]
>
> did not work for me but on other way around
>
> +(button OR shirt) +filename:[b10181_p100 TO b10181_p200]
>
> resulted to me in 2 hits with either one term "button / shirt " in each
> page,but not both of them
>
> I found from the Html file that both words are present in more then 2
> files,
>
> Are there any other possibilities for getting both words.

Your index contains book pages as Lucene documents. In this case you need to index larger parts of the books as Lucene documents in order to retrieve books with multiple subjects on different pages.

Kind regards,
Ype
---------------------------------------------------------------------
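The granularity point can be made concrete with a toy model in plain Java. Nothing here is Lucene: each "document" is just a set of words, `hasAll` plays the role of a query in which every term is required (like +button +shirt), and the page contents are invented for illustration. When the two words sit on different pages, no single page document satisfies both requirements, but a larger document built from those pages does.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Toy model of page-sized versus book-sized documents; not Lucene code.
class PageVersusBookSketch {

    // True when the document contains every required word (all terms mandatory).
    static boolean hasAll(Set<String> words, String... required) {
        return words.containsAll(Arrays.asList(required));
    }

    // Merges the words of all pages into one larger document.
    static Set<String> mergePages(Map<String, Set<String>> pages) {
        Set<String> merged = new HashSet<>();
        for (Set<String> words : pages.values()) {
            merged.addAll(words);
        }
        return merged;
    }

    public static void main(String[] args) {
        // Hypothetical page contents.
        Map<String, Set<String>> pages = new LinkedHashMap<>();
        pages.put("b10181_p100", new HashSet<>(Arrays.asList("button", "sewing")));
        pages.put("b10181_p150", new HashSet<>(Arrays.asList("shirt", "collar")));

        // Page by page: neither page has both words.
        for (Map.Entry<String, Set<String>> page : pages.entrySet()) {
            System.out.println(page.getKey() + " " + hasAll(page.getValue(), "button", "shirt"));
        }

        // As one larger document: both words are present.
        System.out.println(hasAll(mergePages(pages), "button", "shirt")); // true
    }
}
```

How large the merged unit should be (chapter, whole book) is a design choice that depends on what a hit is supposed to mean to the person searching.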