Parsing large xlsx file takes much longer (and usually crashes) with tika than directly with POI
nutch.buddy@... 2012-04-11, 11:36
Hi
I'm trying to use tika-parsers to parse a 100 MB xlsx file. I find myself waiting a long time (maybe an hour or two), and the file is rarely parsed. Usually I get a "GC overhead limit exceeded" exception. When I parse the same file with a few lines of code using the POI library, the file is parsed successfully and relatively fast. Any input on this? I use tika-core-0.10 and tika-parsers-0.10 when I use Tika, and poi-3.8-beta3 when I use POI.
--
View this message in context: http://lucene.472066.n3.nabble.com/Parsing-large-xlsx-file-takes-much-longer-and-usually-crashes-with-tika-than-directly-with-POI-tp3902267p3902267.html
Sent from the Apache Tika - Development mailing list archive at Nabble.com.