Results 1 to 10 of 6048 (0.138s).
encrypted PDF created with PDFMaker failed to parse - Tika - [mail # user]
...Hi,  I have a bunch of PDF files - encrypted to prohibit changes and annotations   (this matters because documents are forms) - created by Acrobat PDFMaker Tika (1.3/trunk) fails t...
   Author: Sebastian Nagel, 2013-05-23, 10:55
BodyContentHandler and a docx embedded within a PDF - Tika - [mail # user]
...I have a PDF document with a docx attachment.  I wasn't having luck getting the contents of the docx with tika.parseToString(file).  I dug around a bit in the PDFExtractor and foun...
   Author: Allison, Timothy B., 2013-05-22, 18:23
Re: Problem parsing large (15MB) text files on Ubuntu 10.10 - Tika - [mail # user]
...Thanks Ben. I have raised a JIRA ticket[1] so we can track work on this issue.  Seems like it works fine on my Mac but can replicate your issues on various versions of Ubuntu (10.04, 10...
   Author: Dave Meikle, 2013-05-19, 23:05
Tika Outlook MSG File: Ignore Attachments, Body Text Only. - Tika - [mail # user]
...I wish to do two things when processing OUTLOOK MSG files to get all the  metadata.  They seem simple enough. Goal 1. IGNORE all attachments.  I was able to ignore the attachm...
   Author: Paul Hill, 2013-05-16, 00:42
Wanting to contribute to Tika (was Re: [jira] [Commented] (TIKA-992) OpenGraph meta tags to allow multiple values) - Tika - [mail # dev]
...Thanks Pankaj. You may want to start a new thread with specific topics that you'd like to discuss. This is a thread related to JIRA and TIKA-992 specific to OpenGraph.  I suggest you: &...
   Author: Mattmann, Chris A, 2013-05-13, 20:33
Re: [jira] [Commented] (TIKA-992) OpenGraph meta tags to allow multiple values - Tika - [mail # dev]
...Hello All,  I am new learner of Apache Tika and am very much interested to do some projects using it. So, it would be very kind of you, if you could suggest me some project ideas.  ...
   Author: Pankaj Kumar, 2013-05-13, 20:04
Re: Problem parsing large (15MB) text files on Ubuntu 10.10 - Tika - [mail # user]
...I've created another text file (1.2MB) that fails to scan, as per my previous post - a copy of it is available here:  https://www.dropbox.com/s/96iw12mrufovmql/gibberish.txt  Regar...
   Author: Ben Turner, 2013-05-02, 07:05
Problem parsing large (15MB) text files on Ubuntu 10.10 - Tika - [mail # user]
...We have been using Tika to process a large variety of files, one at a time, running it in server mode as follows on an Ubuntu 10.10 machine, with Java 1.7.0_b21 :  java -jar ~/software/...
   Author: Ben Turner, 2013-05-02, 06:54
Re: Build failed in Jenkins: Tika-trunk #994 - Tika - [mail # dev]
...Yay, thanks!   On May 1, 2013, at 5:24 PM, Michael McCandless  wrote:  ...
   Author: Ray Gauss II, 2013-05-02, 01:56
Re: Build failed in Jenkins: Tika-trunk #994 - Tika - [mail # dev]
...I just kicked off another build ... (it's queued).  Mike McCandless  http://blog.mikemccandless.com   On Wed, May 1, 2013 at 5:12 PM, Ray Gauss II  wrote:...
   Author: Michael McCandless, 2013-05-01, 21:24