Tika, mail # dev - Pluggable language detection


Julien Nioche 2012-03-21, 15:51
Ken Krugler 2012-03-21, 16:55
Re: Pluggable language detection
Michael McCandless 2012-03-21, 17:00
On Wed, Mar 21, 2012 at 12:55 PM, Ken Krugler
<[EMAIL PROTECTED]> wrote:
>
> On Mar 21, 2012, at 8:51am, Julien Nioche wrote:
>
>> Hi guys,
>>
>> Just wondering about the best way to make the language detection pluggable
>> instead of having it hard-wired as it is now. We know that the resources
>> that are currently in Tika are both slow and inaccurate [1] and there are
>> other libraries that we could leverage. Why not have the option to select
>> a different implementation just like we do for parsers? Obviously we'd need
>> a common interface for the parsers etc...
>>
>> What do you think?
>
> I'd be more in favor of using that time to integrate a better language detector into Tika, so that everybody wins from the work :)

+1

Mike McCandless

http://blog.mikemccandless.com
Julien Nioche 2012-03-22, 10:22
Jan Høydahl 2012-04-08, 23:16
Mattmann, Chris A 2012-04-09, 01:19
Chris A Mattmann 2012-03-21, 19:46
Maxim Valyanskiy 2012-03-22, 07:14