Solr, mail # user - Email classification with solr


Re: Email classification with solr
Jack Krupansky 2012-05-01, 16:48
There are a number of different routes you can go. One is to use SolrCell
(Tika) to parse the mbox files and then add your own update processor that
does whatever mail classification analysis you desire and generates
additional field values for the classification.

A simpler approach is to do the analysis yourself outside of Solr and then
feed the mbox data for each message into SolrCell along with the specific
literal field values derived from your classification analysis. SolrCell
(Tika) would then parse the mail message and add your literal field values.
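
As a rough illustration of this second route, here is a minimal Python sketch.
It assumes a toy keyword-count classifier (a stand-in for whatever analysis you
actually use), a hypothetical schema field named "category", and a SolrCell
handler mapped at /update/extract, which depends on your solrconfig.xml. The
category names and keyword lists are made up for the example. It only builds
the request URL carrying the literal.* parameters; posting the message bytes to
that URL is left to your HTTP client.

```python
from urllib.parse import urlencode

# Hypothetical categories and keywords; replace with your own analysis.
CATEGORY_KEYWORDS = {
    "billing": ("invoice", "payment", "refund"),
    "support": ("error", "crash", "help"),
    "sales": ("quote", "pricing", "demo"),
}
DEFAULT_CATEGORY = "other"  # fourth category, used when nothing matches


def classify(text):
    """Return the category whose keywords occur most often in the text."""
    lowered = text.lower()
    scores = {
        category: sum(lowered.count(word) for word in words)
        for category, words in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else DEFAULT_CATEGORY


def extract_url(solr_base, doc_id, category):
    """Build a SolrCell URL that passes the classification as literal fields.

    The field names "id" and "category" are assumptions about your schema.
    """
    params = urlencode({
        "literal.id": doc_id,
        "literal.category": category,
        "commit": "true",
    })
    return f"{solr_base}/update/extract?{params}"


if __name__ == "__main__":
    label = classify("Please send the invoice and refund details")
    print(extract_url("http://localhost:8983/solr", "msg-1", label))
```

The raw message would then be sent as the body of a POST to that URL, so that
SolrCell (Tika) does the parsing and the literal values ride along unchanged.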

Or, you may want to consider fully parsing the mail messages outside of Solr
so that you have full control over what gets parsed and which schema fields
are used or not used, in addition to your content analysis field values.
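
A sketch of that third route, again in Python and again only under stated
assumptions: the message is parsed with the standard library's email package,
and the field names (id, from_s, subject_s, body_t, category_s) are
hypothetical and would need to match your own schema. The category is passed
in from whatever classification analysis you run. The output is a JSON array
of documents, which is the general shape Solr's JSON update handler accepts.

```python
import json
from email import message_from_string, policy


def message_to_doc(raw_message, category):
    """Parse one raw mail message into a flat dict of Solr field values.

    All field names here are hypothetical; adjust them to your schema.
    """
    msg = message_from_string(raw_message, policy=policy.default)
    # Prefer the plain-text part; fall back to an empty body if none exists.
    body_part = msg.get_body(preferencelist=("plain",))
    body = body_part.get_content() if body_part is not None else ""
    return {
        "id": str(msg["Message-ID"] or ""),
        "from_s": str(msg["From"] or ""),
        "subject_s": str(msg["Subject"] or ""),
        "body_t": body,
        "category_s": category,
    }


def to_update_payload(docs):
    """Serialize the documents as a JSON array for posting to Solr."""
    return json.dumps(docs)


if __name__ == "__main__":
    raw = (
        "From: a@example.com\n"
        "Subject: Test\n"
        "Message-ID: <1@example.com>\n"
        "\n"
        "Body text\n"
    )
    print(to_update_payload([message_to_doc(raw, "support")]))
```

With this approach nothing is left to Tika's defaults: only the fields you
emit are indexed.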

-- Jack Krupansky

-----Original Message-----
From: Ramo Karahasan
Sent: Tuesday, May 01, 2012 12:17 PM
To: [EMAIL PROTECTED]
Subject: Email classification with solr

Hello,

just a short question:

Is it possible to use Solr/Lucene as an e-mail classifier? I mean, analyzing
an e-mail to add it automatically to a category (four are available)?

Thanks,

Ramo