Results 11 to 20 of 156 (0.208s).

Re: Next Release Cycle - Nutch - [mail # dev]
...Hi Lewis, +1 it's time: May for 2.2 and beginning of June for 1.7 to adhere to the 6-month release cycle. After sorting major/critical issues for 1.7 with patches available...
Author: Sebastian Nagel, 2013-04-15, 20:53

Re: Permgen size keeps increasing - Nutch - [mail # user]
...Hi, What does this precisely mean? (a) Are you running one crawl process with many cycles (depth) by launching "bin/nutch crawl" (org.apache.nutch.crawl.Crawl) (b) or in ...
Author: Sebastian Nagel, 2013-04-09, 19:16

Re: crawl time for depth param 50 and topN not passed - Nutch - [mail # user]
...Hi David, afaik, the default of topN is Long.MAX_VALUE which is very large. So, the size of the crawl is mainly limited by the number of links you get. Anyway, a depth of 50 is a high ...
Author: Sebastian Nagel, 2013-04-05, 19:24

[NUTCH-1501] Harmonize behavior of parsechecker and indexchecker - Nutch - [issue]
...Behaviour of ParserChecker and IndexingFiltersChecker has diverged between trunk and 2.x; missing in 2.x: NUTCH-1320, NUTCH-1207; open issue to be also applied to 2.x: NUTCH-1419, NUTCH-1389...
http://issues.apache.org/jira/browse/NUTCH-1501
Author: Sebastian Nagel, 2013-03-27, 22:47

[NUTCH-1389] parsechecker and indexchecker to report truncated content - Nutch - [issue]
...ParserChecker and IndexingFiltersChecker should report when a document is truncated due to {http,file,ftp}.content.limit. Truncated content may cause text and metadata extraction to fail for ...
http://issues.apache.org/jira/browse/NUTCH-1389
Author: Sebastian Nagel, 2013-03-27, 21:35

[NUTCH-1419] parsechecker and indexchecker to report protocol status - Nutch - [issue]
...Parsechecker and indexchecker should report the protocol status when the fetch was not successful (status other than 200/ok). In case of a redirect, the protocol status contains the URL a red...
http://issues.apache.org/jira/browse/NUTCH-1419
Author: Sebastian Nagel, 2013-03-27, 03:06

Re: parsechecker and redirection - Nutch - [mail # user]
...Hi Lewis, let's address NUTCH-1038, NUTCH-1389, NUTCH-1419, and NUTCH-1501! On 03/25/2013 11:22 PM, Lewis John Mcgibbney wrote:...
Author: Sebastian Nagel, 2013-03-25, 22:51

Re: parsechecker and redirection - Nutch - [mail # user]
...Hi Canan, hi Lewis, parsechecker cannot follow redirects, also in trunk / 1.x. It would be nice, at least, if parsechecker would report clearly that there is a redirect. Currentl...
Author: Sebastian Nagel, 2013-03-25, 22:04

[NUTCH-1541] Indexer plugin to write CSV - Nutch - [issue]
...With the new pluggable indexer a simple plugin would be handy to write configurable fields into a CSV file - for further analysis or just for export....
http://issues.apache.org/jira/browse/NUTCH-1541
Author: Sebastian Nagel, 2013-03-13, 10:07

[NUTCH-1454] parsing chm failed - Nutch - [issue]
...(reported by Jan Riewe, see http://lucene.472066.n3.nabble.com/CHM-Files-and-Tika-td3999735.html) Nutch fails to parse chm files with ERROR tika.TikaParser - Can't retrieve Tika parser f...
http://issues.apache.org/jira/browse/NUTCH-1454
Author: Sebastian Nagel, 2013-03-06, 06:12