Droids, mail # dev - Droids suitability for a 100M+ page crawl


Droids suitability for a 100M+ page crawl
Otis Gospodnetic 2011-03-25, 20:08
Hi,

Somebody (Paul?) mentioned using Droids for a 50M page crawl.  Is anyone else
using Droids for crawls of that size?

I'm asking because I need to do a "semi-vertical" crawl of up to 10K
domains, and I'm considering Droids vs. Nutch.  That may translate to several
times as many distinct servers, say 100K, and in turn to a few hundred million
web pages.  That's too big for Droids without a persistent link queue,
right?

Thanks,
Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
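
The sizing in the message above can be sketched as a back-of-envelope calculation. This is only an illustration: the domain and server counts come from the message, while the pages-per-server figure and the bytes-per-URL figure are assumptions picked to land at "a few 100M" pages, and `estimate_queue_bytes` is a hypothetical helper, not part of Droids or Nutch.

```python
# Back-of-envelope sizing for the crawl frontier (link queue).
# Every constant marked "assumed" is a guess, not a measurement.
DOMAINS = 10_000             # from the message: up to 10K domains
SERVERS_PER_DOMAIN = 10      # from the message: "say 100K" servers in total
PAGES_PER_SERVER = 3_000     # assumed, so the total is a few hundred million pages
BYTES_PER_QUEUED_URL = 100   # assumed: average URL length plus bookkeeping


def estimate_queue_bytes(domains, servers_per_domain, pages_per_server, bytes_per_url):
    """Return (servers, pages, bytes needed to hold one entry per page URL)."""
    servers = domains * servers_per_domain
    pages = servers * pages_per_server
    return servers, pages, pages * bytes_per_url


servers, pages, queue_bytes = estimate_queue_bytes(
    DOMAINS, SERVERS_PER_DOMAIN, PAGES_PER_SERVER, BYTES_PER_QUEUED_URL
)
print(f"{servers:,} servers, {pages:,} pages, ~{queue_bytes / 1e9:.0f} GB of URL state")
```

Under these assumed numbers the URL state alone runs to tens of gigabytes, which is the kind of figure that motivates asking about a persistent (disk-backed) link queue rather than an in-memory one.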
Replies:
paul.vc 2011-03-25, 21:09
Thorsten Scherler 2011-03-28, 09:06
Otis Gospodnetic 2011-04-03, 13:54
Chapuis Bertil 2011-03-25, 21:56