Solr, mail # user - Using synonyms in combination with facets


kirchheimer 2010-11-25, 18:02
Re: Using synonyms in combination with facets
Chris Hostetter 2010-12-11, 00:05

: I have a field that I use for facetting.  I do not tokenize this field. It
: has entries like:
:
: AWB artikel 2, lid 1
: AWB artikel 8:75
: Algemene Wet Bestuursrecht artikel 8:75

I assume those are names of laws, followed by page/paragraph numbers in
various formats? (and evidently "lid" is Dutch for "section" ?)

: a facet for each law, instead of for each paragraph of the law. I tried to do
: this with a SynonymFilterFactory using rules like
...:
: But that doesn't work. And even if it would work, it would not be a good
: solution, since I will never be able to come up with a complete list, as
: long as I cannot use wildcards.

i don't know enough about your source data to know all the possible
permutations you have to deal with, but i would tackle this with something
like...

 * KeywordTokenizerFactory
 * PatternReplaceFilterFactory
   - regex to strip off any \d+:\d+ at the end of tokens
 * PatternReplaceFilterFactory
   - regex to strip off any \d+,\s+lid\s+\d+ at the end of tokens
 * PatternReplaceFilterFactory
   - regex to strip off "\s+artikel" from the end of tokens
 * TrimFilterFactory
 * SynonymFilterFactory
   - mapping things like "Algemene Wet Bestuursrecht" to "AWB"
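
To sanity check the regexes and their order before wiring them into a
schema, the chain above can be mimicked outside Solr. This is only a rough
Python sketch, not Solr configuration: the function name normalize_law and
the one-entry synonym table are made up for illustration, and the "artikel"
pattern is assumed to tolerate trailing whitespace, since stripping the
numbers leaves a space behind.

```python
import re

# Hypothetical synonym table (assumption); in Solr this would be a synonyms file.
SYNONYMS = {"Algemene Wet Bestuursrecht": "AWB"}

# One pattern per PatternReplaceFilterFactory step in the list above.
STRIP_PATTERNS = [
    re.compile(r"\d+:\d+$"),           # trailing "8:75"
    re.compile(r"\d+,\s+lid\s+\d+$"),  # trailing "2, lid 1"
    # Assumption: allow trailing whitespace, because the earlier steps
    # leave "AWB artikel " with a space at the end.
    re.compile(r"\s+artikel\s*$"),
]


def normalize_law(value: str) -> str:
    """Reduce a facet value to the name of the law it refers to."""
    token = value  # like KeywordTokenizer: the whole value is one token
    for pattern in STRIP_PATTERNS:
        token = pattern.sub("", token)
    token = token.strip()  # like TrimFilter
    return SYNONYMS.get(token, token)  # like a whole-token synonym mapping


if __name__ == "__main__":
    for raw in (
        "AWB artikel 2, lid 1",
        "AWB artikel 8:75",
        "Algemene Wet Bestuursrecht artikel 8:75",
    ):
        print(repr(raw), "->", repr(normalize_law(raw)))
```

With the three sample values from the original question, each one should
collapse to the same facet value. Other citation formats in the real data
would need their own patterns.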

-Hoss
kirchheimer 2010-12-12, 20:11