....mail-archive.com/[EMAIL PROTECTED]/msg08665.html Discussion Grub has some interesting ideas about building a search engine using distributed
. And how is that
to nutch? CategoryHomepage FAQ...
... fetched. Then later send a CONT signal to the process. Do not turn off your
between! How many concurrent threads should I use? This is dependent on your particular set-up; unless...
... bugs, patches, or feature requests to the mailing lists. Refer instead to Commiter's_Rules and HowToContribute areas of the Nutch
. Are there any mailing lists available? There...
... (see above). There are instructions on how to get Nutch working with Eclipse on [http://
.apache.org/nutch/RunNutchInEclipse] but the easiest way of doing this is to use Ant for compiling...
... fetch pages that require Authentication? See the HttpAuthenticationSchemes
page. Speed of Fetching seems to decrease between crawl iterations... what's wrong? A possible reason...
, 2013-02-07, 04:47
...-analysis to get a single global
score for each url. Building a webgraph assumes that all links are stored in the current segments to be processed. Links are not held over from one processing...
... links to D which links back to A. This program is
expensive and usually, due to time and space requirements, can't be run to more than a three- or four-level depth. While it does...
... and link cycles and then allow those links to be removed. Problem is the class is very expensive
. You can set the depth you want it to run but it is worse than exponential so I...
... scores. Some things to consider: PageRank is just one of over 200 signals that Google uses (if they still use it) to determine
. Even if Google still uses it, it most likely has...
... changed. Link analysis scores are good global
scores, but a link score does not a search engine make today. Oh how I wish it was that simple. LinkRank is a good starting point, that...
, 2011-08-07, 12:55
... of the tutorial though I will point you to
resources if you want to know more about the architecture of Nutch and Hadoop. The tutorial comes in two phases. Firstly we get Hadoop running...
... not be compatible with future releases of either Nutch or Hadoop. Five: For this tutorial we set up Nutch across 6 different
. If you are using a different number of machines you should still...
... First let me lay out the
that we used in our setup. To set up Nutch and Hadoop we had 7 commodity
ranging from 750 MHz to 1.0 GHz. Each
had at least 128 Megs of RAM...
... and at least a 10 Gigabyte hard drive. One
had dual 750 MHz CPUs and another had dual 30 Gigabyte hard drives. All of these
were purchased for under $500.00 at a liquidation sale...
.... I am telling you this to let you know that you don't have to have big hardware to get up and running with Nutch and Hadoop. Our
were named like this: devcluster01 devcluster02...
, 2011-09-02, 19:58