Results 41 to 50 of 61.
PDF not crawled/indexed - Nutch - [mail # user]
...Hi, I am crawling my website with this command: bin/nutch crawl urls -dir crawl-$(date +%FT%H-%M-%S) -solr http://localhost:8983/solr/ -depth 20 -topN 5 Is it a good ...
   Author: Tolga, 2012-05-22, 07:48
Crawl / index files as well - Nutch - [mail # user]
...Okay I'm coming to the end of my questions.  Do I need to read  http://wiki.apache.org/nutch/FAQ#How_do_I_index_my_local_file_system.3F  to index files as well on a web site? ...
   Author: Tolga, 2012-05-21, 11:54
org.apache.solr.common.SolrException: ERROR: [doc=null] missing required field: id - Nutch - [mail # user]
...Hi,  I am getting this error while crawling my website with nutch: [doc=null] missing required field: id request: http://localhost:8983/solr/update?wt=javabin&version=2 at org.apache.so...
   Author: Tolga, 2012-05-21, 11:02
Re: ERROR solr.SolrIndexer - java.io.IOException: Job failed! - Nutch - [mail # user]
...Hi Cameron,  I've been dealing with the same issue, and taking care of it by adding  the field, in your case 'site', to solr schema.xml, and restarting solr.  On 5/18/12 7:58 ...
   Author: Tolga, 2012-05-18, 06:03
Re: HTTP error 400 - Nutch - [mail # user]
...I'm still confused. You mean to use  http://wiki.apache.org/nutch/NutchTutorial#A3.2_Using_Individual_Commands_for_Whole-Web_Crawling  ?  On 5/15/12 2:05 PM, Markus Jelsma wro...
   Author: Tolga, 2012-05-17, 10:07
curl or nutch - Nutch - [mail # user]
...Hi,  I have been trying for a week. I really want to get a start, so what  should I use? curl or nutch? I want to be able to index pdf, xml etc.  and search within them as wel...
   Author: Tolga, 2012-05-16, 07:43
solrindex - Nutch - [mail # user]
...I'm going nuts. I issued the command bin/nutch crawl urls -solr http://localhost:8983/solr/ -depth 3 -topN 5, went on to http://localhost:8983/solr/admin/stats.jsp and veri...
   Author: Tolga, 2012-05-15, 13:27
Re: HTTP error 400 - Nutch - [mail # user]
...bin/nutch solrindex http://localhost:8983/solr/ crawldb -linkdb crawl/linkdb crawl/segments/* SolrIndexer: starting at 2012-05-15 15:34:36 org.apache.hadoop.mapred.InvalidInputEx...
   Author: Tolga, 2012-05-15, 12:49
Re: HTTP error 400 - Nutch - [mail # user]
...Hi, I would like to report that the directory schema given in the command bin/nutch solrindex http://127.0.0.1:8983/solr/ crawldb -linkdb crawldb/linkdb crawldb/segments/* ...
   Author: Tolga, 2012-05-15, 12:01
Re: HTTP error 400 - Nutch - [mail # user]
...I'm a little confused. How can I not use the crawl command and execute  the separate crawl cycle commands at the same time?  Regards,  On 5/11/12 9:40 AM, Markus Jelsma wrote:...
   Author: Tolga, 2012-05-15, 10:40