Search criteria: .
Results 91 to 100 of 6048 (4.804s).
|
|
|
Re: how to add more metadata to tika extraction? - Tika - [mail # dev]
|
|
...On Wed, 27 Feb 2013, eShard wrote: Looks like the metadata you want isn't being pulled out as metadata by Tika. Metadata != content. I'd suspect that if you look at th...
|
|
|
Author: Nick Burch,
2013-03-05, 21:33
|
|
|
Re: How to hide some Excel content - Tika - [mail # user]
|
|
...OK. I was just wondering if there was a built-in way to specify a customer handler that could do something like this to avoid compiling a custom version of the project. I see. G...
|
|
|
Author: CL,
2013-03-05, 17:34
|
|
|
Re: How to hide some Excel content - Tika - [mail # user]
|
|
...On Tue, 5 Mar 2013, CL wrote: There are several examples in Apache POI, and the code behind Tika is open source. Skipping certain slides should be fairly easy, other things will ...
|
|
|
Author: Nick Burch,
2013-03-05, 17:27
|
|
|
Re: How to hide some Excel content - Tika - [mail # user]
|
|
...Thanks for your feedback. I may go that route if I have to, but I'm not finding any good converters. I was hoping to avoid writing my own, which is why I'm trying Tika. Do you know if there'...
|
|
|
Author: CL,
2013-03-05, 17:22
|
|
|
Re: How to hide some Excel content - Tika - [mail # user]
|
|
...On Tue, 5 Mar 2013, CL wrote: If you have quite specific requirements (which it sounds like you do), and only need to work with one file format, you're probably better off callin...
|
|
|
Author: Nick Burch,
2013-03-05, 15:32
|
|
|
How to hide some Excel content - Tika - [mail # user]
|
|
...Hi, I just started using Tika (1.3) for converting Excel (OOXML) content to HTML. Looking good. Two things I'm wondering... 1) Is there a way to convert only a specific worksheet of a ...
|
|
|
Author: CL,
2013-03-05, 15:26
|
|
|
Re: Improvement in Metadata Class - Tika - [mail # user]
|
|
...Hey Lewis, RE: #3 — it would be great to get Nutch using Tika's metadata container — I don't think we have anything special in Nutch that prevents it. RE: #2 — I committed your Tika do...
|
|
|
Author: Mattmann, Chris A,
2013-03-04, 05:41
|
|
|
Re: IdentityHtmlMapper not used by Boilerpipe? - Tika - [mail # user]
|
|
...unsubscribe On Fri, Mar 1, 2013 at 7:35 AM, Markus Jelsma wrote: Dan Klueter...
|
|
|
Author: Dan Klueter,
2013-03-02, 00:03
|
|
|
[TIKA-1085] PDF header and mime detection - Tika - [issue]
|
|
...I've found some PDF files Tika recognizes as application/octet-stream. These files differ from regularly identified PDFs by having a different header: the %PDF-N.n string isn't at the beginning ...
|
|
|
http://issues.apache.org/jira/browse/TIKA-1085
Author: Marco Quaranta,
2013-03-01, 13:53
|
|
|
IdentityHtmlMapper not used by Boilerpipe? - Tika - [mail # user]
|
|
...Hi, We need div elements returned when we pass the stream through Boilerpipe from Nutch. We enable includeMarkup to get markup returned in the first place, but divs are not returned. I...
|
|
|
Author: Markus Jelsma,
2013-03-01, 12:35
|
|
|