is it necessary to merge DBs before solrindex? - Nutch - [mail # user]
...Hi,  The solrindex command requires crawldb and linkdb as parameters. Now, I would like to know if for newly generated segments it's necessary to merge the corresponding crawldb and lin...
   Author: remi tassing, 2012-02-01, 06:23
all possible fields in Nutch Schema.xml - Nutch - [mail # user]
...Hi,  already defined. I was wondering if there was an exhaustive list of possible fields we could include.  Are those from this site all there is?  https://svn.apache.org/repo...
   Author: remi tassing, 2012-01-31, 20:45
Re: why nutch dosen't crawl Arabic sites well? - Nutch - [mail # user]
...Check your log for any error  On Tuesday, January 31, 2012, Markus Jelsma  wrote: http://lucene.472066.n3.nabble.com/why-nutch-dosen-t-crawl-Arabic-sites-we...
   Author: remi tassing, 2012-01-31, 16:55
Aborting with 10 hung threads -ver.2 - Nutch - [mail # user]
...Hi,  I'm using Nutch-1.2 and having "Aborting with 10 hung threads" for some sites.  I checked this thread http://www.mail-archive.com/[EMAIL PROTECTED]/msg15889.html and the JIRA ...
   Author: remi tassing, 2012-01-31, 12:58
From Nutch 1.2 to 1.4 - Nutch - [mail # user]
...Hi,  So I've finally decided to move to Nutch-1.4, it seems a lot faster.  The issue I had with executing versions greater than 1.2 on cygwin is solved by the tip from Luis, thanks...
   Author: remi tassing, 2012-01-31, 09:22
Re: why nutch dosen't crawl all links - Nutch - [mail # user]
...Does it crawl other sites with Arabic characters? Remi  On Tuesday, January 31, 2012, mina  wrote: http://www.irna.ir/News/30786427/سوء-استفاده-از-نام-كمیته-امداد-برای-جمع-آوری-رای...
   Author: remi tassing, 2012-01-31, 05:53
Re: undo "db_gone" - Nutch - [mail # user]
...I'm using Solr-3.4.  I honestly didn't get that message Mark  Remi  On Sunday, January 29, 2012, Markus Jelsma  wrote:...
   Author: remi tassing, 2012-01-29, 15:40
undo "db_gone" - Nutch - [mail # user]
...Hi,  I understand that when a URL is classified as "db_gone", Nutch won't bother fetching it again. I have many URLs in this situation that I would like to recrawl.  Any idea how to fix it...
   Author: remi tassing, 2012-01-29, 07:10
Re: invalid uri with "three dots" - Nutch - [mail # user]
...Hey guys,  any ideas on how to "properly escape non-URI characters?". I'm getting invalid URI for urls that contain "three dots", "space"...  //Remi  [1] https://issues.apache...
   Author: remi tassing, 2012-01-26, 14:16
Re: Dump unfetched ,fetched,gone, URLS - Nutch - [mail # user]
...This command dumps the fetched and unfetched but not gone urls: http://wiki.apache.org/nutch/bin/nutch_readseg  Remi  On Monday, January 23, 2012, Nutch Begineeer  wrote: only...
   Author: remi tassing, 2012-01-23, 20:15