Results 241 to 250 of 805 (0.515s).

Re: NullPointerException with ArcSegmentCreator - Nutch - [mail # user]
...Hi Justin, I tried with the latest trunk in local mode and it works fine on the document you mentioned but by looking at the code for FetcherOutputFormat I can see that you'd get such ...
Author: Julien Nioche, 2012-04-16, 12:42

Re: [VOTE] Apache Nutch 1.5 release rc #1 - Nutch - [mail # dev]
...Thanks Chris, -1 the versions of the deps for hadoop, tika and possibly others are not correct in the pom.xml found in the src archive and on the mvn repository, which will be a proble...
Author: Julien Nioche, 2012-04-16, 09:02

Re: Limiting Nutch crawl - Nutch - [mail # user]
...You are absolutely right. One way to limit per depth is to write a custom ScoringFilter to track the depth from the seed and prevent the outlinks from being added or the url from being gener...
Author: Julien Nioche, 2012-04-11, 15:48

Nutch 1.x trunk release - Nutch - [mail # dev]
...Hi guys, Chris - any idea of if / when you'll have the time to do a RC for trunk? Thanks Julien On 3 April 2012 15:30, Mattmann, Chris A (388J) wrote: ...
Author: Julien Nioche, 2012-04-10, 15:07

[NUTCH-1208] Don't include KEYS file in bin distribution - Nutch - [issue]
...We should get rid of the KEYS file in the bin packaging (zip/tar) in 1.5....
http://issues.apache.org/jira/browse/NUTCH-1208
Author: Julien Nioche, 2012-04-05, 05:29

Re: NutchGora release, and Nutch 1.x trunk release - Nutch - [mail # dev]
...done! thanks no probs * *Open Source Solutions for Text Engineering http://digitalpebble.blogspot.com/ http://www.digitalpebble.com http://twitter.com/...
Author: Julien Nioche, 2012-04-03, 12:42

Re: NutchGora release, and Nutch 1.x trunk release - Nutch - [mail # dev]
...Good idea. On 3 April 2012 11:29, Markus Jelsma wrote: * *Open Source Solutions for Text Engineering http://digitalpebble.blogspot.com/ http://www.digitalpebb...
Author: Julien Nioche, 2012-04-03, 11:22

Re: Order of plugins, regex-urlfilter being ignored - Nutch - [mail # user]
...see nutch-default.xml urlfilter.order The order by which url filters are applied. If empty, all available url filters (as dictated by propertie...
Author: Julien Nioche, 2012-04-03, 10:05

Re: Nutch simple doesnt crawl webpages - Nutch - [mail # user]
...that page contains : which I believe is self explanatory Julien On 3 April 2012 11:01, jepse wrote: * *Open Source Solutions for Text Engineering...
Author: Julien Nioche, 2012-04-03, 10:04

Re: How to get Term Frequency Vector - Nutch - [mail # user]
...One option would be to use Behemoth to convert the Nutch segments, tokenize (e.g. with UIMA) then generate vectors for Mahout see https://github.com/jnioche/behemoth Julien ...
Author: Julien Nioche, 2012-03-29, 18:45