Results 1 to 10 of 805 (0.229s).
Re: Request for Backup Mentor(s) for GSoCq - Nutch - [mail # dev]
...Hi Lewis I am happy to be a backup mentor for this. Cheers Julien On 16 May 2013 18:19, Lewis John Mcgibbney wrote: * *Open Source Solutions for Text E...
Author: Julien Nioche, 2013-05-17, 08:04

Re: Store seed-url in Solr - Nutch - [mail # user]
...Hi Urs, The plugin urlMeta can be used for that. You can add a custom feature to entries in your seed list and configure the parameters used by urlMeta so that the metadata value gets ...
Author: Julien Nioche, 2013-05-02, 12:30

Re: HBase 0.94.6 and Nutch 2.1 - Nutch - [mail # user]
...See https://issues.apache.org/jira/browse/NUTCH-1047 which is in trunk for writing indexing plugins. You will have the same issues with the versions of HBase if you use GORA within your plug...
Author: Julien Nioche, 2013-05-02, 07:57

Re: HBase 0.94.6 and Nutch 2.1 - Nutch - [mail # user]
...Nutch 1.x is definitely more tested and robust than 2.x. Loads of work is done for the latter but the former is probably a safer option in production. You could use the pluggable indexer and...
Author: Julien Nioche, 2013-05-01, 20:25

Re: Proper way to stop a crawl safely - Nutch 1.6 from Hadoop 1.1.1 - Nutch - [mail # user]
...The crawl script (/bin/crawl) can be stopped in its iterations if a .STOP file is created in the same directory. Otherwise 'hadoop job -kill' is the way to go J. On 1 May 2013 0...
Author: Julien Nioche, 2013-05-01, 07:19

Re: Partial Updates in Solr 4.1 - Nutch - [mail # dev]
...Hi Tomas Nice to hear about punkspider and great that you are using Nutch. Can you please open a JIRA issue and attach a patch for this? https://wiki.apache.org/nutch/HowToContri...
Author: Julien Nioche, 2013-04-26, 07:52

Re: rewriting urls that are index - Nutch - [mail # user]
...URLNormalizers can have a scope, see http://nutch.apache.org/apidocs-1.6/org/apache/nutch/net/URLNormalizers.html#SCOPE_INDEXER. Should help to normalise only at indexing time On 22 A...
Author: Julien Nioche, 2013-04-22, 16:19

Re: Nutch/Solr expert - Nutch - [mail # dev]
...Hi Andrea See http://wiki.apache.org/nutch/Support for a list of people who can help. Am based in the UK but feel free to get in touch at [EMAIL PROTECTED] if that's still of interest...
Author: Julien Nioche, 2013-04-13, 18:05

Re: nutch and ElasticSearch - Nutch - [mail # user]
...https://issues.apache.org/jira/browse/NUTCH-1047 has been committed in trunk but there is no index writer implementation for elastic search (yet) but shouldn't be too hard to add one I...
Author: Julien Nioche, 2013-04-04, 15:34

Re: Wiki locked down, spam pages deleted - Nutch - [mail # dev]
...Thanks for taking the time to do it Kiran! On 4 April 2013 01:16, kiran chitturi wrote: * *Open Source Solutions for Text Engineering http://digitalpebble.blo...
Author: Julien Nioche, 2013-04-04, 08:14