Solr, mail # user - Best approach to multiple languages


Re: Best approach to multiple languages
Ed Summers 2009-07-22, 16:31
On Wed, Jul 22, 2009 at 11:35 AM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
>> My initial thoughts are to index each description as a separate field and
>> append the language identifier to the field name, for example, three fields
>> with description_en, description_de, description_fr.  Is this the best
>> approach or is there a better way?

FWIW, this approach is essentially what we did at the Library of
Congress to support multi-lingual fulltext search in the World Digital
Library [1] webapp. It seems to have paid off pretty well, since we
were able to configure analysis on a per-language basis.
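
To make the field-per-language idea concrete, here is a minimal Python sketch of the client side only. It is an illustration under stated assumptions, not the code used at the Library of Congress: the helper names (description_field, build_document, build_query_params) and the language list are hypothetical, and the only Solr-specific detail relied on is the standard `q` and `df` (default field) query parameters. The per-language analysis itself would still be configured in schema.xml.

```python
# Assumed set of languages, taken from the field names in the question.
SUPPORTED_LANGUAGES = ("en", "de", "fr")


def description_field(lang):
    """Return the per-language field name, e.g. 'description_en'."""
    if lang not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {lang}")
    return f"description_{lang}"


def build_document(doc_id, descriptions):
    """Flatten {'en': '...', 'de': '...'} into one field per language.

    The resulting dict is the shape of a document you would send to Solr;
    actually posting it is left out of this sketch.
    """
    doc = {"id": doc_id}
    for lang, text in descriptions.items():
        doc[description_field(lang)] = text
    return doc


def build_query_params(user_query, lang):
    """Search only the field analyzed for the user's language.

    'df' sets the default search field, so the raw query text does not
    need a field prefix. Escaping of user input is not handled here.
    """
    return {"q": user_query, "df": description_field(lang)}
```

For example, build_document("item-1", {"en": "Old map", "de": "Alte Karte"}) yields a document with description_en and description_de fields, and build_query_params("Karte", "de") directs the query at description_de so that German analysis applies at both index and query time.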

In case you are curious I've attached a copy of our schema.xml to give
you an idea of what we did.

//Ed

[1] http://www.wdl.org/