Search criteria: tokenizer. Results 1 to 3 of 3 (0.457s).
ImproveIndexingSpeed - Lucene - Lucene - [wiki]
... is to always add the same fields in the same order to each document you index. Re-use a single Token instance in your analyzer: Analyzers often create a new Token for each term in sequence that needs...
... to be indexed from a Field. You can save substantial GC cost by re-using a single Token instance instead. Use the char[] API in Token instead of the String API to represent token text: As of Lucene...
... 2.3, a Token can represent its text as a slice into a char array, which saves the GC cost of new'ing and then reclaiming String instances. By re-using a single Token instance and using...
... the char[] API you can avoid new'ing any objects for each term. See Token for details. Use autoCommit=false when you open your IndexWriter: In Lucene 2.3 there are substantial optimizations...
http://wiki.apache.org/lucene-java/ImproveIndexingSpeed
Author: RobertMuir, 2011-11-03, 23:17
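
The token re-use advice in the snippet above can be illustrated with a minimal, self-contained sketch. It does not use the Lucene Token class: ReusableToken, TokenConsumer, tokenize and collectTerms are hypothetical names invented for this illustration, and the whitespace splitting is an assumption. The point shown is only that one mutable token backed by a char[] buffer can carry every term, so the tokenizing loop allocates nothing per term.

```java
import java.util.ArrayList;
import java.util.List;

public class TokenReuseSketch {

    /** Hypothetical mutable token (not Lucene's Token): text lives in a reusable char[] buffer. */
    static final class ReusableToken {
        char[] buffer = new char[16];
        int length;

        /** Copies a slice of src into the buffer, growing the buffer only when it is too small. */
        void setText(char[] src, int offset, int len) {
            if (len > buffer.length) {
                buffer = new char[Math.max(len, buffer.length * 2)];
            }
            System.arraycopy(src, offset, buffer, 0, len);
            length = len;
        }
    }

    /** Hypothetical callback that sees the shared token once per term. */
    interface TokenConsumer {
        void accept(ReusableToken token);
    }

    /** Splits input on whitespace and reports each term through the same token instance. */
    static int tokenize(char[] input, ReusableToken token, TokenConsumer consumer) {
        int count = 0;
        int start = -1; // start index of the current term, or -1 when between terms
        for (int i = 0; i <= input.length; i++) {
            boolean boundary = i == input.length || Character.isWhitespace(input[i]);
            if (!boundary && start < 0) {
                start = i;
            } else if (boundary && start >= 0) {
                token.setText(input, start, i - start); // no new object per term
                consumer.accept(token);
                count++;
                start = -1;
            }
        }
        return count;
    }

    /** Demo helper: copies each term into a String so the result can be inspected. */
    static List<String> collectTerms(String text) {
        List<String> terms = new ArrayList<>();
        ReusableToken token = new ReusableToken();
        tokenize(text.toCharArray(), token, t -> terms.add(new String(t.buffer, 0, t.length)));
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(collectTerms("re-use a single Token instance"));
    }
}
```

The collectTerms helper creates Strings only so the terms can be printed; a consumer that reads the char[] slice directly would keep the loop allocation-free. A consumer must also copy whatever it needs before returning, because the next term overwrites the buffer.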
UnicodeCollation - Solr - [wiki]
... search purposes. Unicode Collation in Solr is fast, all the work is done at index time. The way it works is that instead of just using a KeywordTokenizerFactory to create a sort field, you use...
... KeywordTokenizerFactory followed by CollationKeyFilterFactory. At index time this indexes special "sort keys" into the sort field, so that at search you just sort on the sort field, and it comes...
...="collatedGERMAN" class="solr.TextField"> <analyzer> <tokenizer class="solr.KeywordTokenizerFactory"/> <filter class="solr.CollationKeyFilterFactory" language...
... than the standard Solr sort. <fieldType name="collatedROOT" class="solr.TextField"> <analyzer> <tokenizer class="solr.KeywordTokenizerFactory"/> <filter class...
..."); This file of rules can now be used for custom collation in Solr. <fieldType name="collatedCUSTOM" class="solr.TextField"> <analyzer> <tokenizer class="solr.KeywordTokenizer...
http://wiki.apache.org/solr/UnicodeCollation
Author: RobertMuir, 2011-03-03, 03:23
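
The "sort keys at index time" idea in the snippet above can be sketched with the JDK's java.text.Collator. This is a stand-in for illustration, not Solr's CollationKeyFilterFactory: the class CollationKeySketch, its IndexedValue holder, and the in-memory list playing the role of the sort field are all assumptions. Each value is converted once to a binary collation key, and sorting afterwards needs only a plain byte comparison with no locale logic.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class CollationKeySketch {

    /** Hypothetical "indexed" entry: the original value plus its precomputed sort key. */
    static final class IndexedValue {
        final String original;
        final byte[] sortKey;

        IndexedValue(String original, byte[] sortKey) {
            this.original = original;
            this.sortKey = sortKey;
        }
    }

    /** Lexicographic comparison of byte arrays, treating each byte as unsigned. */
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    /** "Index time": compute one collation key per value. "Search time": sort by raw key bytes. */
    static List<String> sortedByKey(List<String> values, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        List<IndexedValue> indexed = new ArrayList<>();
        for (String v : values) {
            indexed.add(new IndexedValue(v, collator.getCollationKey(v).toByteArray()));
        }
        indexed.sort((x, y) -> compareUnsigned(x.sortKey, y.sortKey));
        List<String> out = new ArrayList<>();
        for (IndexedValue iv : indexed) {
            out.add(iv.original);
        }
        return out;
    }

    public static void main(String[] args) {
        // "\u00C4pfel" is "Apfel" with an A-umlaut as its first letter.
        List<String> words = Arrays.asList("Zebra", "\u00C4pfel", "Banane", "Apfel");
        System.out.println(sortedByKey(words, Locale.GERMAN));
    }
}
```

The locale-aware work happens only where the key is computed, which mirrors why the snippet describes sorting on the collated field as cheap at search time.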
GdataServer/HowTo - Lucene - Lucene - [wiki]
...<analyzer> org.apache.lucene.analysis.standard.StandardAnalyzer </analyzer> <store>YES</store> <index>TOKENIZED</index> </field> To define a specific...
....apache.lucene.analysis.Analyzer the server will fail at startup. The elements store and index take values of constants defined in Store and Index. These elements are not required; the default value is Index.TOKENIZED and Store...
...) gdatadate - RFC 822 date format format keyword - treats the content as a keyword (Index.UN_TOKENIZED) category - entry category If no predefined strategy matches, custom implementation can...
...</analyzer> <store>YES</store> <index>TOKENIZED</index> </field> <mixed name="content" boost="1.0"> <...
...</analyzer> <store>YES</store> <index>TOKENIZED</index> </mixed> </service> Server Components: Functional parts of the GData - Server are divided...
http://wiki.apache.org/lucene-java/GdataServer/HowTo
Author: RobertMuir, 2010-04-13, 13:52