Results 81 to 90 of 224.
Re: Nutch 2 solrindex - Nutch - [mail # user]
...Thanks. I will see if I can reproduce and patch this. (In case you do not create a Jira).  On Thu, Aug 2, 2012 at 7:54 PM,  wrote:  ...
   Author: Ferdy Galema, 2012-08-03, 08:35
Re: Different batch id - Nutch - [mail # user]
...Hi,  It depends on the expectation ;)  I agree that it may be confusing, but currently the -all option in the various Nutch tools only processes "all with a mark". There is a separat...
   Author: Ferdy Galema, 2012-08-03, 08:30
Re: Nutch 2.0, MySQL and UTF-8 - Nutch - [mail # user]
...Thanks for the update. We need to take this into consideration when we reimplement the SqlStore. (Still a pending issue).  On Thu, Aug 2, 2012 at 1:28 PM,  wrote:  ...
   Author: Ferdy Galema, 2012-08-02, 11:52
Re: Nutch 2 solrindex - Nutch - [mail # user]
...Hi,  Do you want to open a Jira and attach the patch over there? Or just explain what causes the problem. I'm curious what this might be.  Thanks, Ferdy.  On Wed, Aug 1,...
   Author: Ferdy Galema, 2012-08-02, 07:16
Re: updatedb fails to put UPDATEDB_MARK in nutch-2.0 - Nutch - [mail # user]
...Hi,  The markers (controlling the batchId) still have some unpredictable behaviour sometimes. I'm not sure what causes this problem. Does it occur on a few rows or on all rows? Do you r...
   Author: Ferdy Galema, 2012-08-01, 07:42
Re: NegativeArraySizeException and "problem advancing port rec#" during fetching - Nutch - [mail # user]
...Hi,  Have you tried the mapreduce mailing list? This really looks like a Hadoop-specific error. (Note that crawl_generate really is just a sequence file.) What about using a diff...
   Author: Ferdy Galema, 2012-07-31, 12:01
Re: Programatically determining crawlIds in Nutch 2.x - Nutch - [mail # user]
...Hi,  Just to avoid confusion: There are 2 concepts, namely batchId and crawlId. The batchId is a subset within the same table. The table is determined by crawlId. Not all stores adhere ...
   Author: Ferdy Galema, 2012-07-30, 15:46
Re: Seed List URLs To Index Question - Nutch - [mail # user]
...Hi,  What version of Nutch are you running? Please note that urls not ending up in the index can have many reasons. But most likely because of the fact that not everything is crawled. (...
   Author: Ferdy Galema, 2012-07-27, 09:09
Re: updatedb in nutch-2.0 with mysql - Nutch - [mail # user]
...I've just run a crawl with Nutch 2.0 tag using the SqlStore. Please try to reproduce from a clean checkout/download.  nano conf/nutch-site.xml #set http.agent.name and http.robots.agent...
   Author: Ferdy Galema, 2012-07-27, 09:02
[NUTCH-1438] ParserJob support for option -reparse - Nutch - [issue]
http://issues.apache.org/jira/browse/NUTCH-1438    Author: Ferdy Galema, 2012-07-26, 13:08