Re: Aborting with 10 hung threads -ver.2
Sebastian Nagel 2012-01-31, 19:24
Hi Remi,
if this error only occurs "for some sites", it may be that these sites
host large documents and serve them slowly. If you do not limit the
document size with http.content.limit, have a look at

  https://issues.apache.org/jira/browse/NUTCH-1182

and at the properties mapred.task.timeout and
fetcher.threads.timeout.divisor, which can be adjusted to give the
fetcher more time to complete.

Sebastian

On 01/31/2012 01:58 PM, remi tassing wrote:
> Hi,
>
> I'm using Nutch-1.2 and having "Aborting with 10 hung threads" for some
> sites.
>
> I checked this thread
> http://www.mail-archive.com/[EMAIL PROTECTED]/msg15889.html and
> the JIRA issue https://issues.apache.org/jira/browse/NUTCH-719
>
> In Fetcher.java, I did the following change:
>
> - public void addFetchItem(FetchItem it) {
> + public synchronized void addFetchItem(FetchItem it) {
>
> I can crawl for a little bit longer, but the error still kicks back.
>
> Any idea how to solve this?
>
> Remi
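
For reference, the three properties named in the reply could be set in conf/nutch-site.xml roughly as sketched below. The values are illustrative assumptions, not recommendations. As far as I understand it, fetcher.threads.timeout.divisor is only read by Nutch versions that include the NUTCH-1182 change, so it may have no effect on Nutch-1.2, and the hung-thread timeout is derived from mapred.task.timeout divided by that divisor.

```xml
<!-- conf/nutch-site.xml: illustrative values only, adjust for your crawl -->
<configuration>
  <property>
    <name>http.content.limit</name>
    <!-- Assumed value: cap each HTTP download at 1 MB (in bytes);
         longer content is truncated -->
    <value>1048576</value>
  </property>
  <property>
    <name>mapred.task.timeout</name>
    <!-- Assumed value: 30 minutes, in milliseconds -->
    <value>1800000</value>
  </property>
  <property>
    <name>fetcher.threads.timeout.divisor</name>
    <!-- Assumed value: a smaller divisor leaves more time before
         threads are treated as hung -->
    <value>1</value>
  </property>
</configuration>
```

Limiting the content size addresses the likely cause (large, slowly served documents), while the two timeout properties only give the fetcher longer to wait, so it is probably worth trying http.content.limit first.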