-
Apple iWork document parsing
Arthur Meneau 2011-12-05, 22:43
I am having trouble parsing iWork documents with Tika 1.0. These documents were saved with the application versions that Tika's API documentation lists as supported (Keynote 5.1.1, Numbers 2.1, Pages 4.1). I have copied and pasted the error I am receiving below. How can I get iWork documents to parse correctly?
Thanks,
-Arthur Meneau
Stack Trace:
java.lang.NullPointerException
java.lang.NullPointerException
    at org.apache.tika.parser.iwork.IWorkPackageParser$IWORKDocumentType.detectType(IWorkPackageParser.java:125)
    at org.apache.tika.parser.iwork.IWorkPackageParser$IWORKDocumentType.detectType(IWorkPackageParser.java:106)
    at org.apache.tika.parser.pkg.ZipContainerDetector.detectIWork(ZipContainerDetector.java:163)
    at org.apache.tika.parser.pkg.ZipContainerDetector.detect(ZipContainerDetector.java:76)
    at org.apache.tika.detect.CompositeDetector.detect(CompositeDetector.java:60)
    at org.apache.tika.Tika.detect(Tika.java:133)
    at org.apache.tika.Tika.detect(Tika.java:267)
    at org.apache.tika.Tika.detect(Tika.java:248)
    at xetus.util.io.FileAnalyzer.getMetadata(FileAnalyzer.java:156)
    at xetus.util.io.FileAnalyzer.getMetadata(FileAnalyzer.java:72)
    at xetus.util.io.BulkFileAnalyzerTest.testBulkFileTypeDetection(BulkFileAnalyzerTest.java:137)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at junit.framework.TestCase.runTest(TestCase.java:154)
    at junit.framework.TestCase.runBare(TestCase.java:127)
    at junit.framework.TestResult$1.protect(TestResult.java:106)
    at junit.framework.TestResult.runProtected(TestResult.java:124)
    at junit.framework.TestResult.run(TestResult.java:109)
    at junit.framework.TestCase.run(TestCase.java:118)
    at junit.framework.TestSuite.runTest(TestSuite.java:208)
    at junit.framework.TestSuite.run(TestSuite.java:203)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:518)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1052)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:906)
-
Re: Apple iWork document parsing
Nick Burch 2011-12-06, 01:02
On Mon, 5 Dec 2011, Arthur Meneau wrote:
> I am having trouble parsing iWork documents with Tika 1.0. These
> documents are being saved with the appropriate versions specified by
> Tika's API (Keynote 5.1.1, Numbers 2.1, Pages 4.1). I have copy and
> pasted the error I am receiving below. How can I get iWork documents to
> correctly parse?
Any chance that you could create a new issue in JIRA, and upload a small sample file that causes the error? (Ideally the smallest file you can create that gives the problem)
Cheers
Nick
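The stack trace in the original message shows the failure inside the zip-based iWork detection step (ZipContainerDetector.detectIWork calling IWorkPackageParser's detectType), before any parsing starts. When preparing a sample file for a report like this, it can help to check what the file actually contains. The following is only a rough sketch using the JDK's own zip classes, not Tika's code. It assumes that an iWork '09 document saved as a single file is a zip archive carrying its main XML under an entry named index.apxl (Keynote) or index.xml (Pages, Numbers); the class name IWorkZipCheck and the method findIndexEntry are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.zip.ZipFile;

class IWorkZipCheck {
    // Assumption: entry names under which an iWork '09 zip stores its main XML.
    static final List<String> INDEX_ENTRIES = Arrays.asList("index.apxl", "index.xml");

    /**
     * Returns the name of the first expected index entry present in the zip,
     * or null if none of them is present. Throws IOException if the file
     * cannot be opened as a zip archive at all.
     */
    static String findIndexEntry(Path file) throws IOException {
        try (ZipFile zip = new ZipFile(file.toFile())) {
            for (String name : INDEX_ENTRIES) {
                if (zip.getEntry(name) != null) {
                    return name;
                }
            }
            return null;
        }
    }

    public static void main(String[] args) throws IOException {
        // Each argument is treated as the path of a document to inspect.
        for (String arg : args) {
            String entry = findIndexEntry(Paths.get(arg));
            System.out.println(arg + ": "
                    + (entry == null ? "no index entry found" : "found " + entry));
        }
    }
}
```

If the check reports no index entry, or fails to open the file as a zip, that is worth mentioning in the JIRA issue alongside the sample. It does not by itself establish the cause of the NullPointerException.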
-
Re: Apple iWork document parsing
Arthur Meneau 2011-12-06, 01:18
Nick,
This is done. The files I used originally were very small test files; I attached all three so you can test Keynote, Pages, and Numbers.
Thanks for the quick response,
-Arthur

On Dec 5, 2011, at 5:02 PM, Nick Burch wrote:
> On Mon, 5 Dec 2011, Arthur Meneau wrote:
>> I am having trouble parsing iWork documents with Tika 1.0. These documents are being saved with the appropriate versions specified by Tika's API (Keynote 5.1.1, Numbers 2.1, Pages 4.1). I have copy and pasted the error I am receiving below. How can I get iWork documents to correctly parse?
>
> Any chance that you could create a new issue in JIRA, and upload a small sample file that causes the error? (Ideally the smallest file you can create that gives the problem)
>
> Cheers
> Nick