Results 11 to 20 of 133 (0.53s).
Re: Getting seed url - Nutch - [mail # user]
...Segments have a field called 'outlinks', could this help?  On Tuesday, June 12, 2012, Sebastian Nagel wrote:  ...
   Author: remi tassing, 2012-06-11, 22:45
Compilation of core classes - Nutch - [mail # user]
...Hello guys,  this is probably a basic Java/Ant question. It's pretty easy to compile plugins. All you do is go to the plugin root directory and run 'ant' (e.g. nutch-1.4/src/plugin/prot...
   Author: remi tassing, 2012-06-10, 09:35
Re: using less resources - Nutch - [mail # user]
...I was wondering how do you know  if the page was changed without actually fetching it  On Wednesday, May 23, 2012, wrote:  ...
   Author: remi tassing, 2012-05-23, 12:58
Re: Crawl sites with hashtags in url - Nutch - [mail # user]
...Hi Roberto,  If you're having an invalid URI error, then this might probably help you: http://lucene.472066.n3.nabble.com/Invalid-uri-td3742047.html  Remi  On Tue, May 1, 2012...
   Author: remi tassing, 2012-05-02, 00:20
Re: solution for scanned pdf parsing - Nutch - [mail # user]
...It could also be due to the filesize  //Remi   On Tuesday, April 24, 2012, nutchsolruser  wrote: with http://lucene.472066.n3.nabble.com/solution-for-scanned-pdf-parsing-tp393...
   Author: remi tassing, 2012-04-24, 10:45
Re: Good workflow for a regular re-indexing job - Nutch - [mail # user]
...Have you read this? http://wiki.apache.org/nutch/NutchTutorial/ You can put all commands in a shell script  Remi  On Monday, April 23, 2012, Ian Piper wrote:  ...
   Author: remi tassing, 2012-04-23, 22:57
Re: exclude some urls from crawling - Nutch - [mail # user]
...To exclude index.php and index.html just use: -index\.html -index\.php  You can do the same for video and live-score.  To ultimately make sure if a URL is blocked or not, try: echo...
   Author: remi tassing, 2012-04-13, 13:46
Re: How to handle failures in nutch? - Nutch - [mail # user]
...I don't think so!  freegen will generate a new segment and you don't need to merge it with the others.  Then you can (fetch and) parse the content from that new segment.  Fina...
   Author: remi tassing, 2012-04-10, 10:15
Re: Returning web page abstract with Solr - Nutch - [mail # user]
...Are you looking for result highlighting? http://wiki.apache.org/solr/HighlightingParameters  Remi  On Wed, Apr 4, 2012 at 3:30 PM, smooth almonds wrote:  ...
   Author: remi tassing, 2012-04-04, 07:33
Re: Normalizer error: "IndexOutOfBoundsException: No group 1" - Nutch - [mail # user]
...True true, thanks!  On Tue, Apr 3, 2012 at 3:08 AM, Sebastian Nagel wrote:  ...
   Author: remi tassing, 2012-04-03, 00:19