Results 101 to 110 of 410 (0.172s).
Re: Tika API and field postprocessing - Tika - [mail # user]
|
|
...On Sun, 27 May 2012, Raphaël wrote: I believe you'll need to ask on the SOLR list about this, as it's likely to be specific to ExtractingRequestHandler which is maintained by S...
|
|
|
Author: Nick Burch,
2012-05-27, 21:28
|
|
|
RE: A plan to improve the metadata property definitions - Tika - [mail # dev]
|
|
...On Tue, 22 May 2012, Joerg Ehrlich wrote: The only thing the current setup won't support is Structured Properties. (That hasn't changed). That will need more work, but hopefully ...
|
|
|
Author: Nick Burch,
2012-05-23, 15:23
|
|
|
Re: Unable to read default mimetypes error message - Tika - [mail # user]
|
|
...On Fri, 18 May 2012, Karthik Deivasigamani wrote: Are you sure you haven't got any other Tika jars on your classpath? And have you done something bizarre with the XML parser that...
|
|
|
Author: Nick Burch,
2012-05-22, 00:04
|
|
|
Re: Tika fails to extract text from very large files - Tika - [mail # user]
|
|
...On Thu, 17 May 2012, Alec Swan wrote: In that kind of situation, you should be looking at using something like the fork parser or the Tika server. That looks like a PDFBox bug, y...
|
|
|
Author: Nick Burch,
2012-05-17, 16:12
|
|
|
Re: A plan to improve the metadata property definitions - Tika - [mail # dev]
|
|
...On Thu, 17 May 2012, Mattmann, Chris A (388J) wrote: We've tried to keep all the issues and commits nice and small, so they're easy to review, but we did end up on an epic 10 hou...
|
|
|
Author: Nick Burch,
2012-05-17, 02:57
|
|
|
Re: Tika fails to extract text from very large files - Tika - [mail # user]
|
|
...On Wed, 16 May 2012, Alec Swan wrote: Not all file formats support stream-based parsing; many can only be sensibly parsed in a DOM-like way. For those, the whole file needs to be ...
|
|
|
Author: Nick Burch,
2012-05-16, 23:07
|
|
|
Re: Tika fails to extract text from very large files - Tika - [mail # user]
|
|
...On Wed, 16 May 2012, Alec Swan wrote: There is absolutely no way that you're going to be able to parse a PDF, DOC/DOCX or PPT/PPTX of more than about 20mb in size on a 128mb heap...
|
|
|
Author: Nick Burch,
2012-05-16, 22:08
|
|
|
Re: Tika fails to extract text from very large files - Tika - [mail # user]
|
|
...On Wed, 16 May 2012, Alec Swan wrote: Are you running out of memory? PPT/PPTX, DOC/DOCX and PDF are all formats which can only be parsed by building a DOM-like structure in memor...
|
|
|
Author: Nick Burch,
2012-05-16, 21:45
|
|
|
A plan to improve the metadata property definitions - Tika - [mail # dev]
|
|
...Hi all, I've just been brainstorming with Ray Gauss, and we think we've come up with a way to move towards cleaner and clearer metadata property definitions (prefixes, prope...
|
|
|
Author: Nick Burch,
2012-05-16, 15:50
|
|
|
[TIKA-917] Parser for executables (metadata) - Tika - [issue]
|
|
...Based on the investigations for TIKA-913, it should be fairly easy to implement a parser to extract metadata from executables (PE and ELF). This could give us a similar level of information ...
|
|
|
http://issues.apache.org/jira/browse/TIKA-917
Author: Nick Burch,
2012-05-13, 19:48
|
|
|
|