Tika, mail # user - HTML not listed as supported type in the AutoDetectParser


HTML not listed as supported type in the AutoDetectParser
William Hays 2012-04-12, 16:06
Using the API, I extracted the supported media types for the
AutoDetectParser in Tika 1.1. I'm not seeing the HTML or XHTML MIME types
in that list of 92 items, though it parses such files fine.

Why would this be the case? Or am I missing something?

Thanks,
Bill

--
------------
William Hays
Software Development & Analysis
MIT Libraries E25-131
617.324.5682 (phone)
[EMAIL PROTECTED]