Results 71 to 80 of 155 (0.359s).
Re: Parse HTML Page with link generated by javascript - Nutch - [mail # user]
...Hi Alexandre,   Nutch does not interpret JavaScript but it has a link extractor for JavaScript based on regular expressions, see plugin parse-js. It does its job but - produces some n...
   Author: Sebastian Nagel, 2012-10-03, 20:13
Re: [VOTE] Apache Nutch 2.1 Release Candidate Available - Nutch - [mail # dev]
...Forgot to say: I've run the test crawl with HBase 0.90.5  On 10/01/2012 04:34 PM, Julien Nioche wrote:...
   Author: Sebastian Nagel, 2012-10-01, 18:37
Re: [VOTE] Apache Nutch 2.1 Release Candidate Available - Nutch - [mail # dev]
...+1  * package looks good * sample crawl runs like a charm  On 09/21/2012 05:07 PM, Lewis John Mcgibbney wrote:...
   Author: Sebastian Nagel, 2012-09-27, 21:26
Re: Nutch not crawling jabong - Nutch - [mail # user]
...Hi,  there are plenty of reasons why a document is missing. See http://wiki.apache.org/nutch/DebugTool for a list of possible reasons (sorry, explanations are missing).  About the ...
   Author: Sebastian Nagel, 2012-09-24, 19:27
Re: tmp folder problem - Nutch - [mail # user]
...Hi Matteo,  have a look at the property hadoop.tmp.dir which allows you to direct the temp folder to another volume with more space on it. For "local" crawls:  - do not share this ...
   Author: Sebastian Nagel, 2012-09-20, 19:27
[NUTCH-1415] release packages to contain top level folder apache-nutch-x.x - Nutch - [issue]
...The release packages should contain a top level folder named apache-nutch-x.x (x replaced by major and minor version) as in previous releases. Unpacking the packages from the command line vi...
http://issues.apache.org/jira/browse/NUTCH-1415    Author: Sebastian Nagel, 2012-09-18, 22:24
Re: svn commit: r1387356 - in /nutch/branches/2.x: CHANGES.txt build.xml - Nutch - [mail # dev]
...Great.  On 09/18/2012 10:57 PM, Lewis John Mcgibbney wrote:...
   Author: Sebastian Nagel, 2012-09-18, 21:12
Re: breakpoints in eclipse and nutch 1.5 - Nutch - [mail # user]
...Yes, "very much appreciated". Line numbers change frequently between versions.  Btw, I switched to use bin/nutch in combination with the Eclipse remote debugger. bin/nutch is very flexi...
   Author: Sebastian Nagel, 2012-09-11, 20:38
Re: Escaping URL during redirection - Nutch - [mail # user]
...Redirects are filtered and normalized. It works for 1.4/1.5 and should for trunk. One subtlety: there is an extra scope for normalization of redirects ("fetcher"). If scoped normalization ru...
   Author: Sebastian Nagel, 2012-09-09, 07:39
Re: CHM Files and Tika - Nutch - [mail # user]
...Hi Jan,  opened a Jira issue: https://issues.apache.org/jira/browse/NUTCH-1454 Thanks!  Beyond the "can't retrieve parser" error: I've tried a couple of chm files (among them the t...
   Author: Sebastian Nagel, 2012-08-14, 20:28