Results 31 to 40 of 133 (0.304s).
Re: nutch crawling file system SOLVED - Nutch - [mail # user]
...You're probably looking for the "Highlighting" feature  http://wiki.apache.org/solr/HighlightingParameters  Remi  On Sun, Mar 11, 2012 at 6:10 PM, alessio crisantemi  wrot...
   Author: remi tassing, 2012-03-11, 17:37
Re: Crawling with Certs - Nutch - [mail # user]
...There are many debugging tips on the bottom of that page, did you try them?  E.g. ParserChecker, debug-level log info, ...  BTW, which authentication scheme is required by your sit...
   Author: remi tassing, 2012-03-07, 21:14
Re: Crawling with Certs - Nutch - [mail # user]
...Try googling for Nutch+httpclient  Remi  On Wednesday, March 7, 2012, Christopher Gross  wrote:...
   Author: remi tassing, 2012-03-07, 20:40
Re: java.net.UnknownHostException during fetching - Nutch - [mail # user]
...So you can actually ping those servers or use wget or curl to download them?  On Sun, Mar 4, 2012 at 7:49 PM, hadi  wrote:  ...
   Author: remi tassing, 2012-03-04, 19:05
Re: nutch craling file system - Nutch - [mail # user]
...Why don't you try and let us know?  On Sun, Mar 4, 2012 at 6:05 PM, alessio crisantemi  wrote:  ...
   Author: remi tassing, 2012-03-04, 17:09
Re: java.net.UnknownHostException during fetching - Nutch - [mail # user]
...I had that same error for "dead" URLs or those that needed proxies to get access to  Remi  On Sun, Mar 4, 2012 at 1:19 PM, hadi  wrote:  ...
   Author: remi tassing, 2012-03-04, 16:43
Re: nutch craling file system - Nutch - [mail # user]
...Plz try GOOGLing that first!  If you don't find anything then try these: [1]http://wiki.apache.org/nutch/FAQ#How_do_I_index_my_local_file_system.3F [2]http://www.folge2.de/tp/search/1/c...
   Author: remi tassing, 2012-03-04, 16:06
Re: Only fetching initial seedlist - Nutch - [mail # user]
...This question comes up a lot, try searching the mailing list archive  On Friday, March 2, 2012, James Ford  wrote: seedlist some http://lucene.472066.n3.nabble.com/Only-fetching-initia...
   Author: remi tassing, 2012-03-02, 03:59
Re: multiple small crawlers on single machine conflict at /tmp/hadoop-username/mapred - Nutch - [mail # user]
...How did you define that property so it's different so each job?  Remi  On Friday, March 2, 2012, Jeremy Villalobos  wrote: are...
   Author: remi tassing, 2012-03-02, 03:57
Re: IOExeption when crawling with nutch in Fetching process - Nutch - [mail # user]
...Another possibility might be the "tmp" memory[1]:  "The answer we find addressed the situation is that you're most likely out of disk space in /tmp. Consider using another location, or ...
   Author: remi tassing, 2012-02-29, 21:46