Search criteria: tokenizer. Results 1 to 10 of 326 (0.785s).
Re: facet prefix with tokenized fields - Solr - [mail # user]
...In short, no. The problem is that faceting is working by counting documents with distinct tokens in the field. So in your example I'd expect you to see facets for "toys", "for", "children...
...". All it has to work with are the tokens, the fact that the original input was three words is completely lost at this point.  You could index these with keywordTokenizer and facet on _that...
   Author: Erick Erickson, 2012-10-29, 19:49
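
To illustrate the point about faceting counting distinct tokens, here is a minimal Python sketch. It is not Solr code: `facet_counts`, `whitespace` and `keyword` are hypothetical helpers that only mimic the idea of a whitespace-style tokenizer versus a keyword-style (single token) tokenizer, and the two sample documents are made up.

```python
from collections import Counter

def facet_counts(docs, tokenize):
    """Count, per distinct token, how many documents contain it."""
    counts = Counter()
    for text in docs:
        # A document contributes at most once to each token's count.
        counts.update(set(tokenize(text)))
    return counts

# Hypothetical stand-ins for two analysis styles (not Solr's own classes).
whitespace = lambda text: text.lower().split()   # one token per word
keyword = lambda text: [text]                    # whole value as one token

docs = ["toys for children", "toys for dogs"]

tokenized_facets = facet_counts(docs, whitespace)
keyword_facets = facet_counts(docs, keyword)

print(tokenized_facets)  # facets are individual words
print(keyword_facets)    # facets are the original full values
```

With the word-level tokenizer the original three-word value never appears as a facet; with the single-token variant it does, which is why faceting on a separately indexed keyword-style field is suggested.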
Re: How to retrieve tokens? - Solr - [mail # user]
...Essentially, you're talking about reconstructing the field from the tokens, and that's pretty difficult in general and lossy. For instance, if you use stemming and "running" gets stemmed...
   Author: Erick Erickson, 2012-02-23, 16:59
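
Why reconstruction from tokens is lossy can be sketched with a toy example. The `toy_stem` function below is a deliberately crude, hypothetical suffix stripper written for illustration only; it is not the stemmer Solr or Lucene uses, and real stemmers behave differently.

```python
def toy_stem(word):
    """Crude, hypothetical stemmer: strip a few common suffixes."""
    for suffix in ("ning", "ing", "s"):
        # Only strip when enough of the word remains afterwards.
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

originals = ["running", "runs", "run"]
stems = [toy_stem(w) for w in originals]

# Several different inputs collapse to one stored token, so the
# original surface form cannot be recovered from the token alone.
print(dict(zip(originals, stems)))
```

Once distinct inputs map to the same token, nothing in the token says which input produced it.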
Re: Search match all tokens in Query Text - Solr - [mail # user]
...Take a look at &debug=all output, because one problem here is that text:a b (or even +text:a +b) parses as +text:a +defaultfield:b  And you haven't said whether you're using edismax or ...
   Author: Erick Erickson, 2013-02-03, 12:55
Re: Split token - Solr - [mail # user]
...What you've shown would be handled with WhitespaceTokenizer, but you'd have to prevent filters from stripping the parens. If you have to handle things like blah ( stuff ) WhitespaceTokenizer...
... wouldn't work.  PatternTokenizerFactory might work for you, see: http://lucene.apache.org/solr/api/org/apache/solr/analysis/PatternTokenizerFactory.html  Best Erick  On Tue, Apr 12, 2011 at 6...
   Author: Erick Erickson, 2011-04-15, 17:50
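
The difference between whitespace splitting and pattern-based splitting can be sketched in plain Python. This does not use Solr's WhitespaceTokenizer or PatternTokenizerFactory; the regular expression and the assumption that a parenthesized group should stay together as one token are illustrative guesses, since the original question is not shown here.

```python
import re

def whitespace_tokens(text):
    """Split on runs of whitespace, as a whitespace-style tokenizer would."""
    return text.split()

def pattern_tokens(text):
    """Hypothetical pattern: keep a '( ... )' group whole, else split on whitespace."""
    return re.findall(r"\([^)]*\)|\S+", text)

compact = "blah (stuff)"
spaced = "blah ( stuff )"

print(whitespace_tokens(compact))  # parens stay attached to the word
print(whitespace_tokens(spaced))   # parens become separate tokens
print(pattern_tokens(spaced))      # the group is kept as one token
```

Whitespace splitting is enough while the parentheses touch the text; once spaces appear inside them, a pattern is needed to keep the group together.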
Re: Which tokenizer or analizer should use and field type - Solr - [mail # user]
...try executing these with &debug=all and examine the resulting parsed query, that'll show you exactly how the query is parsed.  Also, the query language is not strictly boolean, see: htt...
   Author: Erick Erickson, 2013-04-15, 11:14
Re: Synonym/Tokenizer for Hyphanated Words - Solr - [mail # user]
...what does "having a problem" mean? Index-time? Query time?  But your problem is most likely the tokenizer as you suspect. Try something like WhitespaceTokenizer and build up from there...
....  Three friends: 1> admin/analysis page 2> admin/schema-browser 3> &debugQuery=on The first will show you what happened to tokens _after_ they get through the tokenization. Be aware...
   Author: Erick Erickson, 2012-11-17, 14:20
Re: PatternTokenizer failure - Solr - [mail # user]
...Hmmm, I tried this in straight Java, no Solr/Lucene involved and the behavior I'm seeing is that no example works if it has more than one whitespace character after the hyphen, including you...
   Author: Erick Erickson, 2011-11-29, 14:20
Re: Retrieving Tokens - Solr - [mail # user]
...I think that what Yonik wants is a higher-level response. *Why* do you want to process the tokens later? What is the use case you're trying to satisfy?  Best Erick  On Dec 20, 2007 1:37 AM...
   Author: Erick Erickson, 2007-12-20, 14:45
Re: [Free Text] Field Tokenizing - Solr - [mail # user]
...The KeywordTokenizer doesn't do anything to break up the input stream, it just treats the whole input to the field as a single token. So I don't think you'll be able to "extract" anything...
... starting with that tokenizer.  Look at the admin/analysis page to see a step-by-step breakdown of what your analyzer chain does. Be sure to check the "verbose" checkbox....  Best Erick  On Thu, Jun...
   Author: Erick Erickson, 2011-06-09, 16:50
Re: [Free Text] Field Tokenizing - Solr - [mail # user]
...The problem here is that none of the built-in filters or tokenizers have a prayer of recognizing what #you# think are phrases, since it'll be unique to your situation.  If you have a list...
... of phrases you care about, you could substitute a single token for the phrases you care about...  But the overriding question is what determines a phrase you're interested in? Is it a list...
   Author: Erick Erickson, 2011-06-09, 16:26
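
The idea of substituting a single token for each phrase you care about can be sketched as a preprocessing step. This is a minimal Python illustration, not a Solr filter: `merge_phrases`, the underscore-joined token form and the sample phrase list are all assumptions made for the example.

```python
import re

def merge_phrases(text, phrases):
    """Replace each known phrase with one underscore-joined token, then split."""
    text = text.lower()
    # Longest phrases first, so a longer phrase wins over a shorter one it contains.
    for phrase in sorted(phrases, key=len, reverse=True):
        phrase = phrase.lower()
        pattern = r"\b" + re.escape(phrase) + r"\b"
        text = re.sub(pattern, phrase.replace(" ", "_"), text)
    return text.split()

tokens = merge_phrases("Flights to New York today", ["new york"])
print(tokens)
```

The approach only works when the phrases come from a known list, which is the open question raised in the reply.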