Search criteria: tika 0.8.
Results 1 to 10 of 1075 (4.147s).
|
|
|
RE: Solrj/Tika question about content types - Lucene - [mail # dev]
|
|
...) that demonstrates what you are seeing with "application/xml; charset=UTF-8" getting sent over the wire even though you explicitly provide a diff content-type in the ContentStream? -Hoss ...
|
|
|
Author: Chris Hostetter,
2013-02-13, 19:53
|
|
|
Re: Solr 1.4.1 and Tika 0.9 - some tests not passing - Solr - [mail # user]
|
|
... i'm not really up to speed on what might have changed in Tika 0.9 to cause this, but the best thing to do would probably be to look at what *does* work compared to what doesn't work...
|
|
...... ...in the context of what you said tika 0.9 gives you for that doc on the command line... ... ...if that basic little bit of info can't be extracted, then i'm guessing nothing is being...
|
| ... extracted. I would suggest you run the example (with the 0.9 tika jars) and manually attempt to index one document, and then use the schema browser to see exactly what gets indexed. you may need... |
|
|
Author: Chris Hostetter,
2011-04-01, 02:19
|
|
|
Re: memory leak in pdfbox--SolrCel needs to call COSName.clearResources? - Solr - [mail # user]
|
|
...). 3) heck: with the new ScriptUpdateProcessor in Solr 4.0, you could write some javascript in your solrconfig.xml that would call this method as part of the chains processCommit() method...
|
|
|
Author: Chris Hostetter,
2012-09-25, 00:40
|
|
|
Re: Lucene Solr 3.1 RC1 - Lucene - [mail # dev]
|
|
... or directory) encountered while [javadoc] performing copy. * CHANGES.txt says we are using Tika 0.8-SNAPSHOT and UIMA 2.3.1-SNAPSHOT, but when i look at the actual jars there is no indication...
|
|
... build.xml that is modifying these files when it shouldn't be... contrib/analysis-extras/lucene-libs/lucene-icu-3.1.0.jar contrib/analysis-extras/lucene-libs/lucene-smartcn-3.1.0.jar contrib...
|
| .../analysis-extras/lucene-libs/lucene-stempel-3.1.0.jar example/exampledocs/post.jar ...using "jarc" to poke around in post.jar the specific change seems to have been the "Created-By" variable in the manifest... < Created... |
| ...-By: 19.1-b02 (Sun Microsystems Inc.) --- > Created-By: 1.5.0_22-b03 (Sun Microsystems Inc.) I'm not entirely sure why the other three jars are included in the src release at all... |
| ... the non-javadocs (ie: tutorial) * while in the solr directory "ant javadoc" produced this warning... [javadoc] /home/hossman/tmp/lucene3.1rc/3.1.rc1/s-src-tgz/apache-solr-3.1.0/solr... |
|
|
Author: Chris Hostetter,
2011-03-17, 19:53
|
|
|
Re: Solr crashing while extracting from very simple text file - Solr - [mail # user]
|
|
... except that when i run "tika-app-0.6.jar" on a text file like the one Ross describes, i don't get the error he describes, which means it may be something off in how Solr is using Tika...
|
|
... 78 0a 78 0a 58 58 42 4c 45 0a |x.x.XXBLE.| 0000000a hossman@brunner:~/tmp$ curl "http://localhost:8983/solr/update/extract?literal.id=1&commit=true" -F "myfile...
|
|
|
Author: Chris Hostetter,
2010-04-01, 17:07
|
|
|
Re: com.ctc.wstx.exc.WstxLazyException exception while passing the text content of a word doc to SOLR - Solr - [mail # user]
|
|
... the error string is fairly self explanatory: on line 40, column 18 you have a character that isn't legal in XML (0x7) (not all UTF-8 characters are legal in XML) If you search the solr...
|
|
... side... http://wiki.apache.org/solr/ExtractingRequestHandler -Hoss ...
|
|
|
Author: Chris Hostetter,
2009-03-18, 03:19
|
|
|
Re: source tree for lucene - Solr - [mail # user]
|
|
.... If you note the Solr 1.4 CHANGES.txt it says... Versions of Major Components ---------------------------- Apache Lucene 2.9.1 (r832363 on 2.9 branch) Apache Tika 0.4 Carrot2 3.1.0 ...so...
|
|
|
Author: Chris Hostetter,
2010-02-11, 02:44
|
|
|
Re: rough outline of where Solr's going - Solr - [mail # dev]
|
|
...'t think that necessarily means that X.Y needs to equal N.M. I was never suggesting that any version numbers should go backwards and reset to 1.0 ... if the only way to get lucene...
|
|
... *either* lucene-analyzers-3.6.tgz or lucene-analyzers-4.0.tgz -- depending on how significant the changes in API/functionality are for the users. This can be an independent decision from...
|
| ... whether the next version lucene-java (possibly/probably released on the same day) is lucene-java-3.6.tgz or lucene-java-4.0.tgz I think solr-3.1 only makes sense if Solr is included in one big... |
|
|
Author: Chris Hostetter,
2010-03-18, 18:01
|
|
|
|
|
|
Search results for 0.8:
|
|
|
Re: Doubts about indexing the localhost ROOT using Dutch 0.8.1 - Lucene - [mail # user]
|
|
|
|
Author: Chris Hostetter,
2008-01-03, 20:19
|
|
|
Re: [JENKINS] Lucene-Solr-4.x-Linux (64bit/jdk1.8.0-ea-b84) - Build # 5180 - Failure! - Lucene - [mail # dev]
|
|
|
|
Author: Chris Hostetter,
2013-04-19, 02:40
|
|
|
RE: [JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.8.0-ea-b83) - Build # 5019 - Failure! - Lucene - [mail # dev]
|
|
|
|
Author: Chris Hostetter,
2013-04-08, 16:45
|
|
|
Re: [JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.8.0-ea-b79) - Build # 4583 - Still Failing! - Lucene - [mail # dev]
|
|
|
|
Author: Chris Hostetter,
2013-03-08, 00:19
|
|
|
Re: [JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.8.0-ea-b65) - Build # 3301 - Failure! - Lucene - [mail # dev]
|
|
|
|
Author: Chris Hostetter,
2012-12-20, 18:32
|
|
|
Re: Solr v3.5.0 - numFound changes when paging through results on 8-shard cluster - Solr - [mail # user]
|
|
|
|
Author: Chris Hostetter,
2012-06-19, 21:40
|
|
|
Re: lucene and UTF-8 - Lucene - [mail # user]
|
|
|
|
Author: Chris Hostetter,
2005-09-29, 21:18
|
|
|
Re: UTF-8 support during indexing content - Solr - [mail # user]
|
|
|
|
Author: Chris Hostetter,
2012-02-02, 01:12
|
|
|
Re: solr utf8 for words like compagnieën? - Solr - [mail # user]
|
|
|
|
Author: Chris Hostetter,
2012-01-31, 19:06
|
|
|
Re: resin and UTF-8 in URLs - Solr - [mail # dev]
|
|
|
|
Author: Chris Hostetter,
2007-02-02, 02:00
|
|
|
|