Nutch, mail # user - crawl and update one url already in crawldb


Re: crawl and update one url already in crawldb
Markus Jelsma 2012-03-22, 14:50
Use Hadoop or set the hadoop.tmp.dir per job. If you don't, things will break.
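For the second option, a minimal sketch of a wrapper that gives every bin/nutch invocation its own hadoop.tmp.dir. The helper name nutch_command and the directory prefix are hypothetical. It also assumes the tool is started through Hadoop's ToolRunner, so that a generic "-D property=value" option placed before the tool's own arguments is picked up; check that this holds for the commands you run.

```python
import os
import tempfile


def nutch_command(tool, tool_args, tmp_root=None, nutch_bin="bin/nutch"):
    """Build a bin/nutch command line that uses a private hadoop.tmp.dir.

    nutch_command is a hypothetical helper, not part of Nutch.
    """
    # mkdtemp creates a new, uniquely named directory on every call,
    # so two jobs started from the same Nutch directory never share one.
    job_tmp = tempfile.mkdtemp(prefix="nutch-" + tool + "-", dir=tmp_root)
    # Assumption: the tool accepts Hadoop generic options, which must
    # come before the tool's own arguments.
    return [nutch_bin, tool, "-D", "hadoop.tmp.dir=" + job_tmp] + list(tool_args)


if __name__ == "__main__":
    # Paths are placeholders for your own crawldb and segments directories.
    cmd = nutch_command("generate", ["crawl/crawldb", "crawl/segments"])
    print(" ".join(cmd))
```

The returned list can be passed to subprocess.run(cmd, check=True) from the Nutch directory. The temporary directories are not cleaned up here; remove them once each job has finished.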

On Thursday 22 March 2012 15:29:50 webdev1977 wrote:
> I just tried it out and so far so good.. Not a near-instant solution, but
> it works ;-)  One last question..
>
> If I am running a bunch of bin/nutch commands from the same directory I
> seem to be having an issue.  I am assuming it is with the mapred system
> and various tmp files (running in local mode).  Is it possible to run
> multiple commands using the same nutch directory without causing
> conflicts?
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/crawl-and-update-one-url-already-in-crawldb-tp3848358p3848665.html
> Sent from the Nutch - User mailing list archive at Nabble.com.

--
Markus Jelsma - CTO - Openindex