122jxgcn 2012-06-21, 02:35
Jukka Zitting 2012-06-21, 12:08
Mattmann, Chris A 2012-06-21, 13:17
Re: Convert file before Tika processes it?
Nick Burch 2012-06-21, 17:07
On Wed, 20 Jun 2012, 122jxgcn wrote:
> Hi, I'm currently working on Tika to properly process custom file type
> (*.hwp file) I have a binary executable file which converts hwp file
> into xml file. I'm not sure how can I include this binary file so that
> when Tika encounters hwp file, it can automatically convert in to xml
> file using the binary, and pass the document to XMLParser. Any
> suggestions?

I'd suggest you do a custom parser for your file format, which first calls out to your custom program, then feeds the result directly to Tika's XMLParser.

The website has a good guide on writing your own custom parsers:
http://tika.apache.org/1.1/parser_guide.html

Nick
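
For illustration, here is a rough sketch of the convert-then-parse step described above, written against the plain JDK so it runs without Tika on the classpath. Everything named here is hypothetical: the class `ConvertThenParse`, the `TextCollector` handler, and above all the assumption that the converter is invoked as `<command> <input.hwp> <output.xml>`. In a real custom parser this logic would presumably live inside the parser's `parse(InputStream, ContentHandler, Metadata, ParseContext)` method, with the last step handing the converted stream to Tika's `XMLParser` instead of a raw SAX parser.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: not part of Tika.
final class ConvertThenParse {

    /** Collects character data; stands in for the ContentHandler a parser receives. */
    static final class TextCollector extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    /**
     * Spools the input to a temp file, runs the external converter on it,
     * then feeds the resulting XML to the given SAX handler.
     * Assumption: the converter accepts "<input> <output>" as its last two arguments.
     */
    static void convertAndParse(InputStream hwp, List<String> converterCommand,
                                DefaultHandler handler)
            throws IOException, InterruptedException, SAXException,
                   ParserConfigurationException {
        Path in = Files.createTempFile("input", ".hwp");
        Path out = Files.createTempFile("output", ".xml");
        try {
            // The external program needs a real file, so copy the stream to disk.
            Files.copy(hwp, in, StandardCopyOption.REPLACE_EXISTING);

            List<String> cmd = new ArrayList<>(converterCommand);
            cmd.add(in.toString());
            cmd.add(out.toString());

            // inheritIO avoids blocking if the converter writes to stdout/stderr.
            Process process = new ProcessBuilder(cmd).inheritIO().start();
            int exit = process.waitFor();
            if (exit != 0) {
                throw new IOException("converter exited with status " + exit);
            }

            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            try (InputStream xml = Files.newInputStream(out)) {
                factory.newSAXParser().parse(xml, handler);
            }
        } finally {
            // Always clean up the temp files, even when conversion fails.
            Files.deleteIfExists(in);
            Files.deleteIfExists(out);
        }
    }
}
```

A call might look like `ConvertThenParse.convertAndParse(stream, Arrays.asList("/path/to/hwp2xml"), handler)`, where `/path/to/hwp2xml` is a placeholder for the actual converter binary. Registering the parser for the `*.hwp` type is covered in the parser guide linked above.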