Search criteria: tokenizer.
Results 1 to 10 of 326 (0.785s).
|
Re: facet prefix with tokenized fields - Solr - [mail # user]
|
|
...In short, no. The problem is that faceting is working by counting documents with distinct tokens in the field. So in your example I'd expect you to see facets for "toys", "for", "children...
|
|
...". All it has to work with are the tokens; the fact that the original input was three words is completely lost at this point. You could index these with KeywordTokenizer and facet on _that...
|
|
|
Author: Erick Erickson,
2012-10-29, 19:49
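A minimal sketch of the point made in this reply, in plain Python rather than Solr. It counts facets over whitespace tokens and then over the whole field value, which is roughly what a keyword-style tokenizer would leave in the index. Only the "toys for children" value comes from the thread; the second document and all function names are assumptions made for illustration.

```python
from collections import Counter

def facet_counts(docs, analyze):
    """Count how many documents contain each distinct indexed term."""
    counts = Counter()
    for text in docs:
        # A document counts once per term, however often the term repeats.
        counts.update(set(analyze(text)))
    return counts

def whitespace_tokens(text):
    # Crude stand-in for a tokenized text field: split on whitespace.
    return text.lower().split()

def keyword_token(text):
    # Crude stand-in for a keyword-style field: the whole value is one token.
    return [text.lower()]

# Hypothetical field values; the second one is invented for contrast.
docs = ["toys for children", "toys for dogs"]

tokenized = facet_counts(docs, whitespace_tokens)
whole = facet_counts(docs, keyword_token)

print(tokenized)
print(whole)
```

With the tokenized field, the facet values are individual words such as "toys" and "for". With the whole-value field, each original phrase survives as a single facet value.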
|
|
|
Re: How to retrieve tokens? - Solr - [mail # user]
|
|
...Essentially, you're talking about reconstructing the field from the tokens, and that's pretty difficult in general and lossy. For instance, if you use stemming and "running" gets stemmed...
|
|
|
Author: Erick Erickson,
2012-02-23, 16:59
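To make the lossiness concrete, here is a toy sketch in plain Python. `toy_stem` is a hypothetical suffix stripper, not the stemmer Solr uses. It only illustrates that several surface forms can collapse to one indexed token, after which the original word cannot be recovered from the token alone.

```python
from collections import defaultdict

def toy_stem(token):
    # Hypothetical, deliberately crude stemmer: strip one known suffix.
    for suffix in ("ning", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

words = ["running", "runs", "run"]
stems = [toy_stem(w) for w in words]

# Group the original words by the token that would end up in the index.
originals_by_stem = defaultdict(set)
for word, stem in zip(words, stems):
    originals_by_stem[stem].add(word)

print(stems)
print(dict(originals_by_stem))
```

Given only the indexed token, there is no way to tell which of the grouped originals was in the field.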
|
|
|
Re: Search match all tokens in Query Text - Solr - [mail # user]
|
|
...Take a look at the &debug=all output, because one problem here is that text:a b (or even +text:a +b) parses as +text:a +defaultfield:b. And you haven't said whether you're using edismax or ...
|
|
|
Author: Erick Erickson,
2013-02-03, 12:55
|
|
|
Re: Split token - Solr - [mail # user]
|
|
...What you've shown would be handled with WhitespaceTokenizer, but you'd have to prevent filters from stripping the parens. If you have to handle things like blah ( stuff ) WhitespaceTokenizer...
|
|
... wouldn't work. PatternTokenizerFactory might work for you, see: http://lucene.apache.org/solr/api/org/apache/solr/analysis/PatternTokenizerFactory.html Best Erick On Tue, Apr 12, 2011 at 6...
|
|
|
Author: Erick Erickson,
2011-04-15, 17:50
|
|
|
Re: Which tokenizer or analizer should use and field type - Solr - [mail # user]
|
|
...try executing these with &debug=all and examine the resulting parsed query, that'll show you exactly how the query is parsed. Also, the query language is not strictly boolean, see: htt...
|
|
|
Author: Erick Erickson,
2013-04-15, 11:14
|
|
|
Re: Synonym/Tokenizer for Hyphanated Words - Solr - [mail # user]
|
|
...what does "having a problem" mean? Index-time? Query time? But your problem is most likely the tokenizer as you suspect. Try something like WhitespaceTokenizer and build up from there...
|
|
.... Three friends: 1> admin/analysis page 2> admin/schema-browser 3> &debugQuery=on The first will show you what happened to tokens _after_ they get through the tokenization. Be aware...
|
... that this probably isn't entirely helpful when your problem is in the tokenization step. The second shows you what terms are actually in your index. The third shows you what your parsed query looks like...
|
.... Couple of other things: 1> there's no need to put in all the capitalization forms _if_ you put LowerCaseFilter in front of your synonyms filter. 2> WhitespaceTokenizer is pretty simple...
|
.... For instance, punctuation will be part of the tokens (e.g. periods at the end of sentences). So it's a place to _start_ but you'll have to think about what you really want from your tokenization...
|
|
Author: Erick Erickson,
2012-11-17, 14:20
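The two remarks in this reply (lowercase before applying synonyms, and whitespace splitting keeps punctuation) can be sketched in plain Python. This is not Solr's analysis chain: the functions below are simplified stand-ins, and the synonym map and sample sentence are hypothetical.

```python
def whitespace_tokenize(text):
    # Split on whitespace only; punctuation stays attached to tokens.
    return text.split()

def lowercase_filter(tokens):
    return [t.lower() for t in tokens]

def synonym_filter(tokens, synonyms):
    # Replace a token with its mapped form when one is listed.
    return [synonyms.get(t, t) for t in tokens]

# Hypothetical synonym map: only the lowercase form is listed, because
# lowercasing runs before the synonym step.
SYNONYMS = {"e-mail": "email"}

tokens = whitespace_tokenize("Send E-Mail today.")
analyzed = synonym_filter(lowercase_filter(tokens), SYNONYMS)

print(tokens)
print(analyzed)
```

The trailing period stays attached to the last token, which is the kind of side effect the reply warns about when starting from whitespace splitting.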
|
|
|
Re: PatternTokenizer failure - Solr - [mail # user]
|
|
...Hmmm, I tried this in straight Java, no Solr/Lucene involved and the behavior I'm seeing is that no example works if it has more than one whitespace character after the hyphen, including you...
|
|
|
Author: Erick Erickson,
2011-11-29, 14:20
|
|
|
Re: Retrieving Tokens - Solr - [mail # user]
|
|
...I think that what Yonik wants is a higher-level response. *Why* do you want to process the tokens later? What is the use case you're trying to satisfy? Best Erick On Dec 20, 2007 1:37 AM...
|
|
|
Author: Erick Erickson,
2007-12-20, 14:45
|
|
|
Re: [Free Text] Field Tokenizing - Solr - [mail # user]
|
|
...The KeywordTokenizer doesn't do anything to break up the input stream, it just treats the whole input to the field as a single token. So I don't think you'll be able to "extract" anything...
|
|
... starting with that tokenizer. Look at the admin/analysis page to see a step-by-step breakdown of what your analyzer chain does. Be sure to check the "verbose" checkbox.... Best Erick On Thu, Jun...
|
|
|
Author: Erick Erickson,
2011-06-09, 16:50
|
|
|
Re: [Free Text] Field Tokenizing - Solr - [mail # user]
|
|
...The problem here is that none of the built-in filters or tokenizers have a prayer of recognizing what *you* think are phrases, since it'll be unique to your situation. If you have a list...
|
|
... of phrases you care about, you could substitute a single token for the phrases you care about... But the overriding question is what determines a phrase you're interested in? Is it a list...
|
|
|
Author: Erick Erickson,
2011-06-09, 16:26
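One way to read the suggestion of substituting a single token for each known phrase, sketched in plain Python as a preprocessing step before tokenization. The phrase list, the underscore-joining convention, and the function name are all assumptions made for illustration; the thread does not specify how the substitution would be done.

```python
import re

def merge_phrases(text, phrases):
    """Rewrite each listed phrase as one underscore-joined token, then split."""
    text = text.lower()
    # Handle longer phrases first so they are not broken up by shorter ones.
    for phrase in sorted((p.lower() for p in phrases), key=len, reverse=True):
        single = phrase.replace(" ", "_")
        pattern = r"\b" + re.escape(phrase) + r"\b"
        text = re.sub(pattern, lambda m, s=single: s, text)
    return text.split()

# Hypothetical phrase list and input.
PHRASES = ["new york", "new york city"]

result = merge_phrases("I love New York City pizza", PHRASES)
print(result)
```

This only works for phrases known in advance, which is why the reply asks what determines a phrase of interest.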
|
|
|
|