Results 11 to 20 of 229 (0.206s).
[NUTCH-1387] All parsers should respond to cancellation / interrupts. - Nutch - [issue]
|
|
...During parsing a TimeoutException can occur. This is caused whenever the FutureTask.get() cannot be completed within the specified timeout. The tricky part is that single urls might be perfe...
|
|
|
http://issues.apache.org/jira/browse/NUTCH-1387
Author: Ferdy Galema,
2013-01-12, 19:15
|
|
|
[NUTCH-1286] Refactoring/reimplementing crawling API (NutchApp) - Nutch - [issue]
|
|
...This issue is to track changes we (Mathijs and I) have planned for the API and webapp in Nutchgora. We have a pretty good idea of how we want to be using the crawl API. It may involve some m...
|
|
|
http://issues.apache.org/jira/browse/NUTCH-1286
Author: Ferdy Galema,
2013-01-12, 18:55
|
|
|
[NUTCH-1452] hadoop.job.history.user.location in nutch-default making job history useless - Nutch - [issue]
|
|
...There is still a property in nutch-default 'hadoop.job.history.user.location' that redirects the creation of history files from job output locations to a custom location. I noticed that the ...
|
|
|
http://issues.apache.org/jira/browse/NUTCH-1452
Author: Ferdy Galema,
2013-01-12, 18:47
|
|
|
[NUTCH-1457] Nutch2 Refactor the update process so that fetched items are only processed once - Nutch - [issue]
|
|
|
|
http://issues.apache.org/jira/browse/NUTCH-1457
Author: Ferdy Galema,
2013-01-12, 18:46
|
|
|
Re: code changes not reflecting when deployed on hadoop - Nutch - [mail # user]
|
|
...For the record: This no longer seems to be the case for trunk. (At least when you properly ant clean prior to building). On Fri, Dec 28, 2012 at 12:25 PM, Sourajit Basak wrote: ...
|
|
|
Author: Ferdy Galema,
2013-01-07, 10:52
|
|
|
Re: What's the different between marker and metadata? - Nutch - [mail # user]
|
|
...Late reply, so this one is for the record: Markers are used for controlling the set of urls for processing: Generator --> Sets fetch markers for the Fetcher. Fetcher --> Re...
|
|
|
Author: Ferdy Galema,
2013-01-07, 10:22
|
|
|
Re: Input path does not exist - Nutch - [mail # user]
|
|
...Hi, Seems like you try to use "-dir" as an input path. Take a look at the arguments you provide. What command are you running (local or distributed?) On Wed, Dec 12, 2012 at 7:2...
|
|
|
Author: Ferdy Galema,
2012-12-12, 09:25
|
|
|
[NUTCH-1446] Port NUTCH-1444 to trunk (Indexing should not create temporary files) - Nutch - [issue]
|
|
|
|
http://issues.apache.org/jira/browse/NUTCH-1446
Author: Ferdy Galema,
2012-12-06, 14:53
|
|
|
Re: How to find ids of pages that have been newly crawled or modified after a given date with Nutch 2.1 - Nutch - [mail # user]
|
|
...Hi, There might be something wrong with the field modifiedTime. I'm not sure how well you can rely on this field (with the default or the adaptive scheduler). If you want to get ...
|
|
|
Author: Ferdy Galema,
2012-11-13, 09:30
|
|
|
Re: org.apache.solr.common.SolrException: Document contains multiple values for uniqueKey field: id? - Nutch - [mail # user]
|
|
...I'm not a regular Solr user, but here are some pointers: Somehow, you have added multiple values for the 'id' field. What did you change from the default indexing behaviour? Perhaps some cust...
|
|
|
Author: Ferdy Galema,
2012-11-13, 08:57