Results 1 to 7 of 7 (0.295s).
Crawl of local file system that puts results on HDFS - Nutch - [mail # user]
...hey everyone,  I'm using Nutch 1.5. I'm trying to crawl a local directory and index the files into HDFS, and then into Solr. I can successfully run a local crawl that then creates a loc...
   Author: Casey McTaggart, 2013-02-02, 00:06
crawl SMB server using Nutch and Hadoop? - Nutch - [mail # user]
...hi,  has anyone been able to successfully crawl a SMB server using the deployed version of Nutch? my urls/seed.txt looks like this:     smb:///servername//  my regex-urlf...
   Author: Casey McTaggart, 2012-09-25, 21:25
Re: problem running Nutch 1.5.1 in distributed mode- simple crawl - Nutch - [mail # user]
...including /plugins/classes in plugin.folders made it work. thank you!!!  On Tue, Sep 18, 2012 at 10:58 AM, Walter Tietze  wrote:  ...
   Author: Casey McTaggart, 2012-09-19, 17:37
Re: problem running Nutch 1.5.1 in distributed mode- simple crawl - Nutch - [mail # user]
...thanks Walter, I still am unable to get anything to run- I think it's because Hadoop is for some reason not finding the tika jar. I tried running Hadoop with -libjars and including both the ...
   Author: Casey McTaggart, 2012-09-18, 16:46
Re: problem running Nutch 1.5.1 in distributed mode- simple crawl - Nutch - [mail # user]
...I would also like to add that I can run the same crawl locally and it's successful. So, it's just the distributed mode that's not working. can anyone offer any advice? Do you think it might ...
   Author: Casey McTaggart, 2012-09-17, 16:31
Re: problem running Nutch 1.5.1 in distributed mode- simple crawl - Nutch - [mail # user]
...Hi Lewis,  I get the exact same results when I run the bin/nutch script from runtime/deploy... any other help? sorry, thanks!  I run it like this sudo -u hdfs bin/nutch crawl urls/...
   Author: Casey McTaggart, 2012-09-16, 15:58
problem running Nutch 1.5.1 in distributed mode- simple crawl - Nutch - [mail # user]
...Hi everyone,  I'm using Hadoop as installed by Cloudera (CDH4)... I think it's version 1.0.1. I can run a local filesystem crawl with Nutch, and it returns what I'd expect. However, I n...
   Author: Casey McTaggart, 2012-09-15, 23:22