Results 41 to 50 of 61 (0.371s).
PDF not crawled/indexed - Nutch - [mail # user]
...Hi, I am crawling my website with this command: bin/nutch crawl urls -dir crawl-$(date +%FT%H-%M-%S) -solr http://localhost:8983/solr/ -depth 20 -topN 5 Is it a good ...
Author: Tolga, 2012-05-22, 07:48

Crawl / index files as well - Nutch - [mail # user]
...Okay I'm coming to the end of my questions. Do I need to read http://wiki.apache.org/nutch/FAQ#How_do_I_index_my_local_file_system.3F to index files as well on a web site? ...
Author: Tolga, 2012-05-21, 11:54

org.apache.solr.common.SolrException: ERROR: [doc=null] missing required field: id - Nutch - [mail # user]
...Hi, I am getting this error while crawling my website with nutch: [doc=null] missing required field: id request: http://localhost:8983/solr/update?wt=javabin&version=2 at org.apache.so...
Author: Tolga, 2012-05-21, 11:02

Re: ERROR solr.SolrIndexer - java.io.IOException: Job failed! - Nutch - [mail # user]
...Hi Cameron, I've been dealing with the same issue, and taking care of it by adding the field, in your case 'site', to solr schema.xml, and restarting solr. On 5/18/12 7:58 ...
Author: Tolga, 2012-05-18, 06:03

Re: HTTP error 400 - Nutch - [mail # user]
...I'm still confused. You mean to use http://wiki.apache.org/nutch/NutchTutorial#A3.2_Using_Individual_Commands_for_Whole-Web_Crawling ? On 5/15/12 2:05 PM, Markus Jelsma wro...
Author: Tolga, 2012-05-17, 10:07

curl or nutch - Nutch - [mail # user]
...Hi, I have been trying for a week. I really want to get a start, so what should I use? curl or nutch? I want to be able to index pdf, xml etc. and search within them as wel...
Author: Tolga, 2012-05-16, 07:43

solrindex - Nutch - [mail # user]
...I'm going nuts. I issued the command bin/nutch crawl urls -solr http://localhost:8983/solr/ -depth 3 -topN 5, went on to http://localhost:8983/solr/admin/stats.jsp and veri...
Author: Tolga, 2012-05-15, 13:27

Re: HTTP error 400 - Nutch - [mail # user]
...bin/nutch solrindex http://localhost:8983/solr/ crawldb -linkdb crawl/linkdb crawl/segments/* SolrIndexer: starting at 2012-05-15 15:34:36 org.apache.hadoop.mapred.InvalidInputEx...
Author: Tolga, 2012-05-15, 12:49

Re: HTTP error 400 - Nutch - [mail # user]
...Hi, I would like to report that the directory schema given in the command bin/nutch solrindex http://127.0.0.1:8983/solr/ crawldb -linkdb crawldb/linkdb crawldb/segments/* ...
Author: Tolga, 2012-05-15, 12:01

Re: HTTP error 400 - Nutch - [mail # user]
...I'm a little confused. How can I not use the crawl command and execute the separate crawl cycle commands at the same time? Regards, On 5/11/12 9:40 AM, Markus Jelsma wrote:...
Author: Tolga, 2012-05-15, 10:40