|
Hiren Shah
2012-07-04, 06:11
Ian Lea
2012-07-04, 08:51
Ian Lea
2012-07-04, 09:00
Hiren Shah
2012-07-04, 10:49
Jack Krupansky
2012-07-04, 15:49
Hiren Shah
2012-07-04, 20:20
Jack Krupansky
2012-07-04, 21:52
Jack Krupansky
2012-07-04, 22:04
|
-
Starts with Query - Return like search (Hiren Shah, 2012-07-04, 06:11)
I have used StandardAnalyzer to store the data as ANALYZED in the index.

The data is as follows:

1. foo bag test
2. foo bar test
3. bar india foo

The user types: foo ba
I get all three results when I search with: (+foo* +ba*)

1. I tried "foo ba" (with double quotes), but no results come back, as it searches for the exact word.
2. I tried "foo ba*" (with double quotes), but no results come back, as it searches for the exact word.
3. I tried "foo bar" (with double quotes); then the 2nd result comes back, as both words are complete.

What should be done to get documents 1 and 2 in the results when the user types foo ba*? I don't want the 3rd result, only the first two.
Please help.

Thanks
Hiren
-
Re: Starts with Query - Return like search (Ian Lea, 2012-07-04, 08:51)
Where exactly are you using these double-quoted strings? QueryParser? It would help if you showed a code snippet.

Assuming your real data is more complex and the strings you are searching for aren't necessarily at the start of the text, you'll need some mix of wildcard and proximity searching. I don't think that "foo ba*"~n will work, but I'm sure you'll be able to do it with a SpanQuery or six. SpanNearQuery lets you specify slop and whether you care if matches are in order or not.

See http://www.lucidimagination.com/blog/2009/07/18/the-spanquery/ for info on spans.

See also http://wiki.apache.org/lucene-java/LuceneFAQ#Why_am_I_getting_no_hits_.2BAC8_incorrect_hits.3F for good tips on figuring out why things aren't doing what you want.

Good luck.

--
Ian.
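To make the "slop" and "in order" ideas concrete, here is a minimal plain-Java sketch of ordered proximity matching with a prefix on the second term. It does not use Lucene at all: the class name, the method names, and the whitespace tokenizer are assumptions made for illustration, and the real SpanNearQuery works on indexed positions rather than by rescanning text.

```java
import java.util.Arrays;
import java.util.List;

class PrefixProximitySketch {

    // Lowercasing plus whitespace splitting stands in for real analysis (an assumption).
    static List<String> tokenize(String text) {
        return Arrays.asList(text.trim().toLowerCase().split("\\s+"));
    }

    // True if the exact term is followed, in order, by a token that starts
    // with the prefix, with at most `slop` other tokens in between.
    static boolean matchesInOrder(String text, String term, String prefix, int slop) {
        List<String> tokens = tokenize(text);
        for (int i = 0; i < tokens.size(); i++) {
            if (!tokens.get(i).equals(term)) {
                continue;
            }
            int last = Math.min(tokens.size() - 1, i + 1 + slop);
            for (int j = i + 1; j <= last; j++) {
                if (tokens.get(j).startsWith(prefix)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] docs = {"foo bag test", "foo bar test", "bar india foo"};
        for (String doc : docs) {
            // Slop 0 means the prefix token must directly follow "foo".
            System.out.println(doc + " -> " + matchesInOrder(doc, "foo", "ba", 0));
        }
    }
}
```

With slop 0 the first two sample documents match and "bar india foo" does not, because there "foo" is the last token and nothing follows it. That is the difference from (+foo* +ba*), which ignores positions entirely.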
-
Re: Starts with Query - Return like search (Ian Lea, 2012-07-04, 09:00)
In fact there is an FAQ entry, "Can I combine wildcard and phrase search, e.g. "foo ba*"?", at http://wiki.apache.org/lucene-java/LuceneFAQ#Can_I_combine_wildcard_and_phrase_search.2C_e.g._.22foo_ba.2A.22.3F which suggests you extend the QueryParser to build a MultiPhraseQuery.

There's also ComplexPhraseQueryParser, which looks interesting.

--
Ian.
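The core step behind the MultiPhraseQuery suggestion, as far as it can be read from the FAQ entry, is to expand the trailing prefix into the concrete terms that exist in the index and then allow any of them at the last phrase position. The sketch below shows only that expansion step in plain Java; the sorted `TreeSet` standing in for the index's term dictionary and all names in it are assumptions for illustration, not Lucene API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

class PrefixExpansionSketch {

    // Collects every term that starts with the prefix. Because the set is
    // sorted, all matches are contiguous: start at the first term >= prefix
    // and stop at the first term that no longer has the prefix.
    static List<String> expand(TreeSet<String> termDictionary, String prefix) {
        List<String> matches = new ArrayList<>();
        for (String term : termDictionary.tailSet(prefix, true)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            matches.add(term);
        }
        return matches;
    }

    public static void main(String[] args) {
        // Hypothetical term dictionary built from the three sample documents.
        TreeSet<String> terms = new TreeSet<>();
        for (String doc : new String[] {"foo bag test", "foo bar test", "bar india foo"}) {
            for (String token : doc.split("\\s+")) {
                terms.add(token);
            }
        }
        // Phrase position 0 is the exact term "foo";
        // position 1 may be any expansion of the prefix "ba".
        System.out.println("foo followed by one of " + expand(terms, "ba"));
    }
}
```

In real Lucene code the expanded terms would be handed to the query as the set of alternatives for the second position; the phrase matching itself is then left to the query.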
-
Re: Starts with Query - Return like search (Hiren Shah, 2012-07-04, 10:49)
Please find the code here:

    package org.lucenesample;

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.Field.Index;
    import org.apache.lucene.document.Field.Store;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriter.MaxFieldLength;
    import org.apache.lucene.queryParser.QueryParser;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.PhraseQuery;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.ScoreDoc;
    import org.apache.lucene.search.TopDocs;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.RAMDirectory;
    import org.apache.lucene.util.Version;

    public class ExactPhrasesearchUsingStandardAnalyser {

        public static void main(String[] args) throws Exception {
            // Build a small in-memory index with StandardAnalyzer.
            Directory directory = new RAMDirectory();
            StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_35);
            MaxFieldLength mlf = MaxFieldLength.UNLIMITED;
            IndexWriter writer = new IndexWriter(directory, analyzer, true, mlf);
            writer.addDocument(createDocument1("1", "foo bar baz blue"));
            writer.addDocument(createDocument1("2", "red green blue"));
            writer.addDocument(createDocument1("3", "test panda foo & bar testt"));
            writer.addDocument(createDocument1("4", " bar test test foo in panda red blue "));
            writer.addDocument(createDocument1("5", "test"));
            writer.close();

            IndexSearcher searcher = new IndexSearcher(directory);
            PhraseQuery query = new PhraseQuery();
            QueryParser qp2 = new QueryParser(Version.LUCENE_35, "contents", analyzer);
            // qp.setDefaultOperator(QueryParser.Operator.AND);
            Query queryx2 = qp2.parse("test foo in panda re*"); // contains query
            // Exact phrase match: make the last word followed by a star.
            Query queryx23 = qp2.parse("+red +green +blu*");
            Query queryx234 = qp2.parse("(+red +green +blu*)& (\"red* green\") ");

            /*
            String term = "new york";
            // id and location are the fields in which I want to search the "term"
            MultiFieldQueryParser queryParser = new MultiFieldQueryParser(
                    Version.LUCENE_35, new String[] {"contents"}, new KeywordAnalyzer());
            Query query = queryParser.parse(term);
            System.out.println(query.toString());
            */

            QueryParser qp = new QueryParser(Version.LUCENE_35, "contents", analyzer);
            // qp.setDefaultOperator(QueryParser.Operator.AND);
            Query queryx = qp.parse("\"air quality\"~10");

            System.out.println("******************Searching Code starts******************");
            // Only queryx2 is actually executed; the other queries are experiments.
            TopDocs topDocs = searcher.search(queryx2, 10);
            for (ScoreDoc scoreDoc : topDocs.scoreDocs) {
                Document doc = searcher.doc(scoreDoc.doc);
                System.out.println(doc);
            }
            searcher.close();
        }

        // Creates a document with an untokenized id and an analyzed contents field.
        private static Document createDocument1(String id, String content) {
            Document doc = new Document();
            doc.add(new Field("id", id, Store.YES, Index.NOT_ANALYZED));
            doc.add(new Field("contents", content, Store.YES, Index.ANALYZED,
                    Field.TermVector.WITH_POSITIONS_OFFSETS));
            System.out.println(content);
            return doc;
        }
    }

Also, please refer to the post below:
http://stackoverflow.com/questions/10828825/incremental-search-using-lucene
-
Re: Starts with Query - Return like search (Jack Krupansky, 2012-07-04, 15:49)
You might also consider using the EdgeNGram filter for your documents, since it would index "bar" as both "ba" and "bar" at the same position, eliminating the need for wildcards. It makes the index bigger, but eliminates the performance degradation of wildcards. It isn't great for all situations, but maybe it would work well for your case.

-- Jack Krupansky
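To illustrate what "index "bar" as both "ba" and "bar"" means, here is a small plain-Java sketch that produces the front edge n-grams of a token. It only mimics the idea; the class and method names are made up for this example, and the minimum and maximum gram sizes are parameters chosen by the caller.

```java
import java.util.ArrayList;
import java.util.List;

class EdgeNGramSketch {

    // Front edge n-grams of a token, from minGram up to maxGram characters.
    // In this sketch a token shorter than minGram yields no grams at all.
    static List<String> frontGrams(String token, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        int longest = Math.min(maxGram, token.length());
        for (int len = minGram; len <= longest; len++) {
            grams.add(token.substring(0, len));
        }
        return grams;
    }

    public static void main(String[] args) {
        System.out.println(frontGrams("bar", 2, 15));   // the grams "ba" and "bar"
        System.out.println(frontGrams("india", 2, 15)); // "in", "ind", "indi", "india"
    }
}
```

Every gram is an ordinary term in the index, so a query for "ba" becomes an exact term lookup instead of a wildcard expansion. The price is the larger index mentioned above: a token of n characters contributes up to n - minGram + 1 terms.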
-
Re: Starts with Query - Return like search (Hiren Shah, 2012-07-04, 20:20)
Hi Jack

Does this need to be taken care of while indexing? Where can I get the code for edge n-gram indexing and then searching?

-Hiren
-
Re: Starts with Query - Return like search (Jack Krupansky, 2012-07-04, 21:52)
Here's a Solr field type that supports edge n-grams:

    <fieldType name="text_general_edge_ngram" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.LowerCaseTokenizerFactory"/>
        <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="front"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.LowerCaseTokenizerFactory"/>
      </analyzer>
    </fieldType>

In Lucene, you would use the EdgeNGramFilter.

This is for Lucene/Solr 3.6.

-- Jack Krupansky
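The field type above applies edge n-grams only at index time and leaves query terms whole. The following plain-Java sketch models why that asymmetry makes a quoted "foo ba" behave like a starts-with phrase search. It is a toy positional index, not Lucene or Solr code: the class, its methods, and the whitespace tokenization are assumptions made for illustration, with the gram sizes 2 and 15 taken from the configuration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class EdgeNGramPhraseSketch {

    // term -> (document number -> positions at which the term occurs)
    private final Map<String, Map<Integer, Set<Integer>>> postings = new HashMap<>();
    private int docCount = 0;

    // Index-time analysis: lowercase, split on whitespace, and store every
    // front gram of each token at that token's own position.
    void add(String text, int minGram, int maxGram) {
        int doc = docCount++;
        String[] tokens = text.trim().toLowerCase().split("\\s+");
        for (int pos = 0; pos < tokens.length; pos++) {
            int longest = Math.min(maxGram, tokens[pos].length());
            for (int len = minGram; len <= longest; len++) {
                postings.computeIfAbsent(tokens[pos].substring(0, len), k -> new HashMap<>())
                        .computeIfAbsent(doc, k -> new HashSet<>())
                        .add(pos);
            }
        }
    }

    // Query-time analysis: lowercase and split only, no grams. A document
    // matches when the query tokens occur at consecutive positions.
    List<Integer> phraseSearch(String query) {
        String[] q = query.trim().toLowerCase().split("\\s+");
        List<Integer> hits = new ArrayList<>();
        for (int doc = 0; doc < docCount; doc++) {
            for (int start : positions(q[0], doc)) {
                boolean all = true;
                for (int i = 1; i < q.length && all; i++) {
                    all = positions(q[i], doc).contains(start + i);
                }
                if (all) {
                    hits.add(doc);
                    break;
                }
            }
        }
        return hits;
    }

    private Set<Integer> positions(String term, int doc) {
        Map<Integer, Set<Integer>> byDoc = postings.get(term);
        if (byDoc == null) {
            return Collections.emptySet();
        }
        Set<Integer> found = byDoc.get(doc);
        return found == null ? Collections.<Integer>emptySet() : found;
    }

    public static void main(String[] args) {
        EdgeNGramPhraseSketch index = new EdgeNGramPhraseSketch();
        index.add("foo bag test", 2, 15);  // document 0
        index.add("foo bar test", 2, 15);  // document 1
        index.add("bar india foo", 2, 15); // document 2
        System.out.println(index.phraseSearch("foo ba"));
    }
}
```

Because "ba" is stored at the same position as "bag" and "bar", the two-term phrase lines up in documents 0 and 1, while in document 2 nothing follows "foo". If the query side were also n-grammed, "foo" would additionally be searched as "fo", which is presumably why the query analyzer in the configuration has no gram filter.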
-
Re: Starts with Query - Return like search (Jack Krupansky, 2012-07-04, 22:04)
Oops... that's EdgeNGramTokenFilter in Lucene.

-- Jack Krupansky
|