Nutch, mail # user - Going Beyond the Prototype


webdev1977 2011-05-10, 14:37
J. Delgado 2011-05-10, 16:05
webdev1977 2011-05-10, 16:56
webdev1977 2011-05-12, 17:58
Dietrich 2011-05-12, 18:19

Re: Going Beyond the Prototype
webdev1977 2011-05-12, 18:24
I said that based on what the previous poster stated, and also because I
have read quite a few posts saying that the problem with crawling in a
vertical environment comes from the way fetcher2 was built.  Fetches are
grouped by domain name, so if you have a lot of URLs from the same domain,
you cannot run quick MapReduce jobs.

I hope this is wrong though ;-)
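
To make the concern above concrete, here is a rough sketch of why grouping fetches by host can slow down a crawl that stays on one domain. It is not Nutch code: the function names, the 5-second politeness delay, and the example URLs are all assumptions for illustration, and it ignores download time and thread limits.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_host(urls):
    """Group URLs into one queue per host (hypothetical helper, not a Nutch API)."""
    queues = defaultdict(list)
    for url in urls:
        queues[urlsplit(url).hostname].append(url)
    return dict(queues)


def estimated_fetch_seconds(urls, delay_per_host=5.0):
    """Lower bound on fetch time, assuming one request at a time per host
    with a fixed delay between requests and all hosts fetched in parallel.
    The 5.0 second default is an assumed value for illustration."""
    queues = group_by_host(urls)
    if not queues:
        return 0.0
    # The longest per-host queue dominates, since hosts are fetched in parallel.
    longest = max(len(queue) for queue in queues.values())
    return (longest - 1) * delay_per_host


# A "vertical" crawl: 100 URLs that all live on the same host.
vertical = [f"http://example.com/page{i}" for i in range(100)]
# A broad crawl: 100 URLs spread over 100 different hosts.
broad = [f"http://site{i}.example.org/" for i in range(100)]

print(estimated_fetch_seconds(vertical))  # 495.0
print(estimated_fetch_seconds(broad))     # 0.0
```

Under these assumptions the single-host list is bounded by the per-host delay no matter how many threads are available, while the broad list is not. Whether a real fetcher behaves this way depends on its politeness settings.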

--
View this message in context: http://lucene.472066.n3.nabble.com/Going-Beyond-the-Prototype-tp2923289p2932969.html
Sent from the Nutch - User mailing list archive at Nabble.com.
Dietrich 2011-05-12, 18:30
Julien Nioche 2011-05-12, 20:12
webdev1977 2011-05-16, 10:41
webdev1977 2011-05-25, 11:52
Julien Nioche 2011-05-25, 13:34
webdev1977 2011-05-25, 15:55
Julien Nioche 2011-05-25, 20:09
Dietrich 2011-05-12, 18:22