Re: Convert file before Tika processes it?
Nick Burch 2012-06-21, 17:07
On Wed, 20 Jun 2012, 122jxgcn wrote:
> Hi, I'm currently working on Tika to properly process a custom file type
> (*.hwp file). I have a binary executable which converts an hwp file
> into an xml file. I'm not sure how I can include this binary so that
> when Tika encounters an hwp file, it can automatically convert it to an xml
> file using the binary, and pass the document to XMLParser. Any
> suggestions?

I'd suggest you write a custom parser for your file format, which first calls
out to your custom program, then feeds the result directly to Tika's
XMLParser. The website has a good guide on writing your own custom parsers:
http://tika.apache.org/1.1/parser_guide.html

Nick
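
A rough sketch of the "call out to your custom program" step, using only the Java standard library. The class name HwpToXml, the executable name, and the command-line shape "<converter> <input.hwp> <output.xml>" are all assumptions for illustration; the thread does not say how the real converter is invoked, so adjust command() to match it.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

// Hypothetical helper: runs an external hwp-to-xml converter on a stream.
final class HwpToXml {

    // Assumed invocation: "<converter> <input.hwp> <output.xml>".
    static List<String> command(String converter, Path in, Path out) {
        return List.of(converter, in.toString(), out.toString());
    }

    // Spools the stream to a temp file, runs the converter, and returns the
    // path of the generated XML. The caller is responsible for deleting it.
    static Path convert(String converter, InputStream hwp)
            throws IOException, InterruptedException {
        Path in = Files.createTempFile("tika-hwp-", ".hwp");
        Path out = null;
        boolean ok = false;
        try {
            out = Files.createTempFile("tika-hwp-", ".xml");
            Files.copy(hwp, in, StandardCopyOption.REPLACE_EXISTING);
            Process p = new ProcessBuilder(command(converter, in, out))
                    .redirectErrorStream(true)
                    // Discard console output so the child cannot block on a full pipe.
                    .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                    .start();
            int exit = p.waitFor();
            if (exit != 0) {
                throw new IOException("converter exited with code " + exit);
            }
            ok = true;
            return out;
        } finally {
            Files.deleteIfExists(in);
            if (!ok && out != null) {
                Files.deleteIfExists(out);
            }
        }
    }
}
```

Inside the custom parser's parse(stream, handler, metadata, context) method, the file returned by convert() could then be opened and handed on with something like new XMLParser().parse(xmlStream, handler, metadata, context), deleting the temporary XML file afterwards. Going through temp files is just one option; if the converter can read stdin and write stdout, piping would avoid them.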