Nutch, mail # user - Aborting with 10 hung threads -ver.2


Re: Aborting with 10 hung threads -ver.2
Sebastian Nagel 2012-01-31, 19:24
Hi Remi,

If this error occurs only "for some sites", those sites may be
hosting large documents and serving them slowly.
If you have not limited the document size with http.content.limit,
have a look at:
   https://issues.apache.org/jira/browse/NUTCH-1182
and at the properties
   mapred.task.timeout
   fetcher.threads.timeout.divisor
which can be adjusted to give the fetcher more time to complete.
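
For a rough idea of how those two properties interact, here is a minimal Java sketch. It assumes the fetcher treats its threads as hung once no request has started for mapred.task.timeout divided by fetcher.threads.timeout.divisor milliseconds, which is how the NUTCH-1182 change appears to work. The class and method names are made up for illustration and are not part of Nutch, and the values 600000 and 2 are assumed defaults that should be checked against your own configuration.

```java
// Hypothetical helper, not Nutch code: illustrates the assumed relation
//   hung-thread timeout = mapred.task.timeout / fetcher.threads.timeout.divisor
class HungThreadTimeout {

    // Returns the assumed hung-thread timeout in milliseconds.
    static long hungThreadTimeoutMs(long taskTimeoutMs, int divisor) {
        if (divisor <= 0) {
            throw new IllegalArgumentException("divisor must be positive");
        }
        return taskTimeoutMs / divisor;
    }

    public static void main(String[] args) {
        long assumedDefaultTaskTimeoutMs = 600_000L; // assumed default of mapred.task.timeout
        int assumedDefaultDivisor = 2;               // assumed default of the divisor

        // Assumed defaults.
        System.out.println(hungThreadTimeoutMs(assumedDefaultTaskTimeoutMs, assumedDefaultDivisor));
        // Larger task timeout, same divisor.
        System.out.println(hungThreadTimeoutMs(1_800_000L, assumedDefaultDivisor));
        // Same task timeout, smaller divisor.
        System.out.println(hungThreadTimeoutMs(assumedDefaultTaskTimeoutMs, 1));
    }
}
```

Under that assumption, raising mapred.task.timeout or lowering the divisor both lengthen the window before the fetcher aborts.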

Sebastian

On 01/31/2012 01:58 PM, remi tassing wrote:
> Hi,
>
> I'm using Nutch-1.2 and getting "Aborting with 10 hung threads" for some
> sites.
>
> I checked this thread
> http://www.mail-archive.com/[EMAIL PROTECTED]/msg15889.html and
> the JIRA issue https://issues.apache.org/jira/browse/NUTCH-719
>
> In Fetcher.java, I made the following change:
>
> -    public void addFetchItem(FetchItem it) {
> +    public synchronized void addFetchItem(FetchItem it) {
>
> I can crawl for a little longer, but the error still comes back.
>
> Any idea how to solve this?
>
> Remi
>