Search results 1 to 10 of 39 (0.128s).
Re: Lower case URLs - correct regex? - Nutch - [mail # user]
...We've dug a bit deeper...  We're actually upgrading from Nutch 1.0 to 1.4. It seems the regex stuff  has moved away from the Perl5Substitution implementation, which  supported...
   Author: Dean Pullen, 2012-05-08, 14:30
Lower case URLs - correct regex? - Nutch - [mail # user]
...Hi all,   I'm trying to lower case all URLs via Nutch's regex-normalize.xml  The regex looks like:   (.*) \L$1\E   This appears to be correct, yet we're seeing this when ...
   Author: Dean Pullen, 2012-05-08, 12:37
Hadoop not doing anything - Nutch - [mail # user]
...Hi all, If this is definitely a Hadoop issue, as opposed to it being an issue caused by Nutch, I'll happily go ask on the Hadoop mailing list... Anyway, I'm kicking off a ...
   Author: Dean Pullen, 2012-05-01, 15:26
Re: Nutch 1.4 with Hadoop - how does Nutch know where Hadoop is running - Nutch - [mail # user]
...Thanks for your reply.  I understand what you've said, but how does Nutch know where the Hadoop  jobtracker is running?  Regards,  Dean.  On 20/03/2012 11:03, Markus...
   Author: Dean Pullen, 2012-03-20, 10:59
Nutch 1.4 with Hadoop - how does Nutch know where Hadoop is running - Nutch - [mail # user]
...Hi all,  An odd question, but I can't work out how Nutch 1.4 actually knows where  Hadoop is running.  Usually I copy Hadoop over the top of Nutch, but if we want to put  ...
   Author: Dean Pullen, 2012-03-20, 10:51
Re: Failed fetching - Nutch - [mail # user]
...Thanks for the reply - I'm using 1.4. The problem was, as previously described, that the nutch-site.xml didn't have the protocol-http in the plugins include - I had presumed this was ...
   Author: Dean Pullen, 2012-02-03, 11:06
Re: Failed fetching - Nutch - [mail # user]
...What I see in logs/userlogs/myfetchjobxx/syslog is:  2012-02-02 17:15:25,045 INFO org.apache.nutch.fetcher.Fetcher: fetch of  http://nutch.apache.org/ failed with:  org.apache...
   Author: Dean Pullen, 2012-02-02, 17:22
Re: Failed fetching - Nutch - [mail # user]
...I've added: http.verbose = true ("If true, HTTP will log more verbosely.") and fetcher.verbose = true ("If true, fetcher will log more verbosely.") to the nutch-site.xml in an att...
   Author: Dean Pullen, 2012-02-02, 17:11
Failed fetching - Nutch - [mail # user]
...Hi all,  I'm trying to fetch from http://nutch.apache.org  But after fetching, parsing, and updating the DB I examine the DB for  'http://nutch.apache.org/' (oddly I must incl...
   Author: Dean Pullen, 2012-02-02, 16:44
Re: Null Pointer During Crawl on Hadoop EC2 - Nutch - [mail # user]
...Looks like this to me:  https://issues.apache.org/jira/browse/NUTCH-1084  D.  On 13/01/2012 15:41, Matthew Slade wrote:...
   Author: Dean Pullen, 2012-01-13, 15:43
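
A note on the "Lower case URLs" threads listed above: the `\L$1\E` replacement is Perl-style case conversion, and Java's `java.util.regex` replacement strings do not interpret it (a backslash there only escapes the next character). The following is a minimal sketch of doing the lower-casing in Java code instead, assuming the normalizer in use is backed by `java.util.regex`. The class name `UrlLowerCaser` and method `lowerCase` are hypothetical and are not part of Nutch.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not a Nutch class.
final class UrlLowerCaser {
    // Same idea as the "(.*)" pattern in the post, anchored and non-empty
    // so that it matches a single-line URL exactly once.
    private static final Pattern WHOLE_URL = Pattern.compile("^(.+)$");

    static String lowerCase(String url) {
        Matcher m = WHOLE_URL.matcher(url);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Case conversion happens in Java, not in the replacement string.
            String lowered = m.group(1).toLowerCase(Locale.ROOT);
            // quoteReplacement keeps any '$' or '\' in the URL literal.
            m.appendReplacement(out, Matcher.quoteReplacement(lowered));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // prints http://nutch.apache.org/about.html
        System.out.println(lowerCase("http://Nutch.Apache.ORG/About.html"));
    }
}
```

Lower-casing the whole URL also changes the path and query, which some servers treat as case-sensitive, so whether this is appropriate depends on the sites being crawled.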