Re: Failed fetching - Nutch - [mail # user]
...I'm slowly migrating from Nutch-1.2 to 1.4 and it works with cygwin. I use protocol-httpclient but could try protocol-http if you want Remi On Friday, February 10, 201...
Author: remi tassing, 2012-02-14, 18:03

Re: Invalid uri? - Nutch - [mail # user]
...I did have the same issue before and I modified Httpclient to replace those characters. It's good to know there is a simpler solution, I'll try URLNormalizer later on Remi ...
Author: remi tassing, 2012-02-14, 03:34

Re: Understanding NutchConfigration properly - Nutch - [mail # user]
...if they are really useless why keep them? Remi On Sunday, February 12, 2012, Julien Nioche wrote: [EMAIL PROTECTED]> initially done.....
Author: remi tassing, 2012-02-12, 17:54

Re: how are CSV/TXT files handled - Nutch - [mail # user]
...You're right about Parsechecker and Nutch-1.2. Well I'm trying Nutch-1.4 right now but still having same problem. Here is my parsechecker output: $ bin/nutch parsechecker http://...
Author: remi tassing, 2012-02-08, 14:04

Re: how are CSV/TXT files handled - Nutch - [mail # user]
...Ok I just did (It's great but I've been reluctant because recompiling always gives me errors). However, I'm still having a similar error: $ bin/nutch parsechecker http://URL fetching: ...
Author: remi tassing, 2012-02-08, 09:22

how are CSV/TXT files handled - Nutch - [mail # user]
...Hey guys, I checked the mailing-list archive but couldn't get an answer on this. I think CSV and TXT don't need any kind of parsing, but how are they handled by default? Remi...
Author: remi tassing, 2012-02-07, 14:37

Re: how are CSV/TXT files handled - Nutch - [mail # user]
...With the "nutch parsechecker" command I get the following error message: "Error: Could not find or load main class parsechecker", this doesn't sound good! On Tue, Feb 7, 2012 at ...
Author: remi tassing, 2012-02-07, 08:08

Re: how are CSV/TXT files handled - Nutch - [mail # user]
...The point that made me start thinking is because I got this error message: "failed(2,0): Can't retrieve Tika parser for mime-type application/ms-excel" I'm using Nutch-1.2 and my...
Author: remi tassing, 2012-02-07, 07:58

Re: invalid uri with "three dots" - Nutch - [mail # user]
...Problem solved! I replaced all whitespaces with "%20" in the url before getting the "content" in HttpResponse.java (Httpclient plugin). Dirty solution? Yes, but it works for me no...
Author: remi tassing, 2012-02-01, 18:18

Re: why nutch dosen't crawl Arabic sites well? - Nutch - [mail # user]
...Try the following command. It'll export all the urls that were crawled. [1] http://wiki.apache.org/nutch/bin/nutch_readdb Remi On Wednesday, February 1, 2012, mina wrote: ...
Author: remi tassing, 2012-02-01, 17:55