Results 11 to 20 of 156 (0.208s).
Re: Next Release Cycle - Nutch - [mail # dev]
...Hi Lewis,  +1  it's time: May for 2.2 and beginning of June for 1.7 to adhere to the 6-month release cycle.  After sorting major/critical issues for 1.7 with patches available...
   Author: Sebastian Nagel, 2013-04-15, 20:53
Re: Permgen size keeps increasing - Nutch - [mail # user]
...Hi,  What does this precisely mean? (a) Are you running one crawl process with many cycles (depth)     by launching "bin/nutch crawl" (org.apache.nutch.crawl.Crawl) (b) or in ...
   Author: Sebastian Nagel, 2013-04-09, 19:16
Re: crawl time for depth param 50 and topN not passed - Nutch - [mail # user]
...Hi David,  afaik, the default of topN is Long.MAX_VALUE which is very large. So, the size of the crawl is mainly limited by the number of links you get. Anyway, a depth of 50 is a high ...
   Author: Sebastian Nagel, 2013-04-05, 19:24
[NUTCH-1501] Harmonize behavior of parsechecker and indexchecker - Nutch - [issue]
...Behaviour of ParserChecker and IndexingFiltersChecker has diverged between trunk and 2.x. Missing in 2.x: NUTCH-1320, NUTCH-1207. Open issue to be also applied to 2.x: NUTCH-1419, NUTCH-1389...
http://issues.apache.org/jira/browse/NUTCH-1501    Author: Sebastian Nagel, 2013-03-27, 22:47
[NUTCH-1389] parsechecker and indexchecker to report truncated content - Nutch - [issue]
...ParserChecker and IndexingFiltersChecker should report when a document is truncated due to {http,file,ftp}.content.limit. Truncated content may cause text and metadata extraction to fail for ...
http://issues.apache.org/jira/browse/NUTCH-1389    Author: Sebastian Nagel, 2013-03-27, 21:35
[NUTCH-1419] parsechecker and indexchecker to report protocol status - Nutch - [issue]
...Parsechecker and indexchecker should report the protocol status when the fetch was not successful (status other than 200/ok). In case of a redirect, the protocol status contains the URL a red...
http://issues.apache.org/jira/browse/NUTCH-1419    Author: Sebastian Nagel, 2013-03-27, 03:06
Re: parsechecker and redirection - Nutch - [mail # user]
...Hi Lewis,  let's address NUTCH-1038, NUTCH-1389, NUTCH-1419, and NUTCH-1501!  On 03/25/2013 11:22 PM, Lewis John Mcgibbney wrote:...
   Author: Sebastian Nagel, 2013-03-25, 22:51
Re: parsechecker and redirection - Nutch - [mail # user]
...Hi Canan, hi Lewis,  parsechecker cannot follow redirects, also in trunk / 1.x.  It would be nice, at least, if parsechecker would report clearly that there is a redirect. Currentl...
   Author: Sebastian Nagel, 2013-03-25, 22:04
[NUTCH-1541] Indexer plugin to write CSV - Nutch - [issue]
...With the new pluggable indexer a simple plugin would be handy to write configurable fields into a CSV file - for further analysis or just for export....
http://issues.apache.org/jira/browse/NUTCH-1541    Author: Sebastian Nagel, 2013-03-13, 10:07
[NUTCH-1454] parsing chm failed - Nutch - [issue]
...(reported by Jan Riewe, see http://lucene.472066.n3.nabble.com/CHM-Files-and-Tika-td3999735.html) Nutch fails to parse chm files with ERROR tika.TikaParser - Can't retrieve Tika parser f...
http://issues.apache.org/jira/browse/NUTCH-1454    Author: Sebastian Nagel, 2013-03-06, 06:12