Tika, mail # user - Re: Upgrading Solr to Tika 0.8


Re: Upgrading Solr to Tika 0.8
ceesjm1 2011-01-10, 14:58
Jukka Zitting <jzitting@...> writes:

>
> Hi,
>
> From: Grant Ingersoll [mailto:gsingers <at> apache.org]
> > Hmm, it does look like I'm still getting the Keywords, but this
> > AAPL:Keywords is an additional one.  Looks like it is coming from
> > PDFBox.  I will update my tests.
>
> 0.8 exposes quite a bit more document metadata, and in some cases these
> additional fields duplicate previously exposed information. For backwards
> compatibility we didn't remove the old metadata fields even in cases where
> the new field is more accurately named or formatted.
>
> In Tika 1.0 we probably should review all such cases and drop the old
> metadata fields to avoid confusion later on, so you may want to prepare
> for some extra upgrade work with 1.0.
>
> BR,
>
> Jukka Zitting
>
Hi there,

Does this mean that upgrading Solr to Tika 0.8 is fine, with the only
difference being that Tika will expose additional metadata?

Just about to attempt the upgrade...

Cheers,

Scot