Re: Synonyms and hyphens
Jack Krupansky 2012-07-04, 17:37
There is one other detail that should clarify the situation. At query time, the query parser itself breaks your query into space-delimited terms and calls the analyzer only for each of those terms, each of which is treated as if it were a quoted phrase. So it doesn't matter whether it is the standard analyzer, the word delimiter filter, or some other filter that breaks up the compound term. And the default "query operator" applies only to the "terms" as the query parser parsed them, not to the sub-terms of a compound term like CD-ROM or gb-mb.

-- Jack Krupansky

-----Original Message-----
From: Alireza Salimi
Sent: Wednesday, July 04, 2012 12:05 PM
To: [EMAIL PROTECTED]
Subject: Re: Synonyms and hyphens

Wow, I didn't know that. Is there a way to disable this feature? I mean, is it something coming from the Analyzer?

On Wed, Jul 4, 2012 at 12:26 PM, Jack Krupansky <[EMAIL PROTECTED]> wrote:

> Terms with embedded special characters are treated as phrases with spaces
> in place of the special characters. So, "gb-mb" is treated as if you had
> enclosed the term in quotes.
>
> -- Jack Krupansky
>
> -----Original Message-----
> From: Alireza Salimi
> Sent: Wednesday, July 04, 2012 6:50 AM
> To: [EMAIL PROTECTED]
> Subject: Re: Synonyms and hyphens
>
> Hi,
>
> Does anybody know why hyphen '-' and q.op=AND cause such a big difference
> between the two queries? I thought hyphens are removed by StandardTokenizer,
> which means theoretically the two queries should be the same!
>
> Thanks
>
> On Tue, Jul 3, 2012 at 4:05 PM, Alireza Salimi <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>>
>> I'm not sure if anybody has experienced this behavior before or not.
>> I noticed that 'hyphen' plays a very important role here.
>> I used Solr's default example directory.
>>
>> http://localhost:8983/solr/select/?q=name:(gb-mb)&version=2.2&start=0&rows=10&indent=on&debugQuery=on&indent=on&wt=json&q.op=AND
>> results in "parsedquery":"+name:gb +name:gib +name:gigabyte
>> +name:gigabytes +name:mb +name:mib +name:megabyte +name:megabytes",
>>
>> While searching
>> http://localhost:8984/solr/select/?q=name:(gbmb)&version=2.2&start=0&rows=10&indent=on&debugQuery=on&indent=on&wt=json&q.op=AND
>> results in "parsedquery":"+(name:gb name:gib name:gigabyte
>> name:gigabytes) +(name:mb name:mib name:megabyte name:megabytes)",
>>
>> If you look at the first query - with hyphens - you can see that the
>> result of parsing is totally different. I know that hyphens are special
>> characters in Solr, but there's no way that the first query returns any
>> entry because it's asking for ALL synonyms.
>>
>> Am I missing something here?
>>
>> Thanks
>>
>> --
>> Alireza Salimi
>> Java EE Developer
>
> --
> Alireza Salimi
> Java EE Developer

--
Alireza Salimi
Java EE Developer
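
To make the two "parsedquery" results in this thread concrete, here is a toy Python model of the behavior being discussed. It is only a sketch and not Lucene's or Solr's actual code: SYNONYMS, analyze, and parse are hypothetical names, the synonym lists are copied from the debug output quoted above, and it assumes the second query reached the parser as two whitespace-separated terms ("gb mb"), which is what its parsed form suggests.

```python
import re

# Hypothetical synonym table, copied from the expansions visible in the
# thread's debugQuery output. A real setup would keep this in synonyms.txt.
SYNONYMS = {
    "gb": ["gb", "gib", "gigabyte", "gigabytes"],
    "mb": ["mb", "mib", "megabyte", "megabytes"],
}


def analyze(term):
    """Toy analyzer: split one term on non-alphanumerics, then expand synonyms.

    Returns one list of tokens per token position.
    """
    parts = [p for p in re.split(r"[^0-9a-z]+", term.lower()) if p]
    return [SYNONYMS.get(p, [p]) for p in parts]


def parse(query, field="name", default_and=True):
    """Toy query parser: split on whitespace first, analyze each term alone."""
    prefix = "+" if default_and else ""
    clauses = []
    for term in query.split():
        positions = analyze(term)
        if not positions:
            continue
        if len(positions) == 1:
            # One position: the synonyms are alternatives inside one clause,
            # and the default operator applies to the clause as a whole.
            tokens = positions[0]
            if len(tokens) == 1:
                body = f"{field}:{tokens[0]}"
            else:
                body = "(" + " ".join(f"{field}:{t}" for t in tokens) + ")"
            clauses.append(prefix + body)
        else:
            # One term that the analyzer broke into several positions
            # (e.g. gb-mb): this model flattens every token into a single
            # list, so with AND each synonym ends up required.
            clauses.append(
                " ".join(f"{prefix}{field}:{t}" for pos in positions for t in pos)
            )
    return " ".join(clauses)


print(parse("gb-mb"))
print(parse("gb mb"))
```

Under those assumptions, the first call prints the flat form in which every synonym is required, and the second prints the grouped form with one required group per term, matching the two outputs quoted above.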