Re: Nutch efficiency and multiple single URL crawls - Nutch - [mail # user]
...Got it, I will try that out; that's an excellent feature. Thank you for the help. On Thu, Nov 29, 2012 at 4:06 AM, Markus Jelsma wrote: ___ Alejandro Caceres Hyper...
   Author: Alejandro Caceres, 2012-11-29, 21:46
Re: Best practice to index a large crawl through Solr? - Nutch - [mail # user]
...No problem. With regard to your first question, Solr would actually be storing this data locally. Solr sharding uses its own mechanism called SolrCloud. I'd recommend checking it out here: ...
   Author: Alejandro Caceres, 2012-10-22, 20:00
Re: Best practice to index a large crawl through Solr? - Nutch - [mail # user]
...It sort of depends on your purpose and the amount of data. I currently have a single Solr instance (~1GB of memory, 2 processors on the server) serving almost 3,700,000 records from Nutch a...
   Author: Alejandro Caceres, 2012-10-22, 19:11
Re: Search in specific website - Nutch - [mail # user]
...Once you've indexed it with Solr this can be done using Solr Query Syntax. Essentially what you're asking boils down to a Solr question. In your example situation you could do something like...
   Author: Alejandro Caceres, 2012-10-12, 21:48
Re: Why won't my crawl ignore these urls? [SOLVED] - Nutch - [mail # user]
...Glad to help and good luck! On Fri, Aug 3, 2012 at 1:43 AM, Ian Piper wrote: ...
   Author: Alejandro Caceres, 2012-08-03, 19:33
Re: Nutch 1.5.1 Solr 3.6.1 Error - Nutch - [mail # user]
...Hey Kevin, Check your "schema.xml" file. This file specifies what fields Solr "knows about" when indexing. I suspect you have not edited it to include what Nutch is trying to index. Do...
   Author: Alejandro Caceres, 2012-07-31, 19:10