Results 31 to 40 of 1769 (0.286s).
Re: Link interface for crawler - Droids - [mail # dev]
...On 01/30/2013 12:31 PM, Tobias Rübner wrote:  I prefer a well defined interface since the ContentEntity is in the end a simple HashMap where we store information. We have a couple of ...
   Author: Thorsten Scherler, 2013-01-30, 12:55
Re: Link interface for crawler - Droids - [mail # dev]
...Hi Thorsten,  I would propose to extend the ContentEntity and add the needed fields there.  The Task should only contain data relevant for executing the task. All other "meta" dat...
   Author: Tobias Rübner, 2013-01-30, 11:31
Link interface for crawler - Droids - [mail # dev]
...Hi all,  Tobias I saw that you dropped the link interface but moved the links to the contentEntity. The problem I see is that an URL needs stuff like getAnchorText if it is useful for t...
   Author: Thorsten Scherler, 2013-01-30, 11:05
Re: svn commit: r1439804 - in /incubator/droids/branches/0.2.x-cleanup/droids-core: ./ src/main/java/org/apache/droids/core/ src/main/java/org/apache/droids/handle/ src/main/java/org/apache/droids/parse/ src/main/java/org/apache/droids/taskmaster/ src/test... - Droids - [mail # dev]
...On 01/30/2013 10:42 AM, Tobias Rübner wrote:  I understand your saying about the parser I worked around that by using the getOutlinks() from the LinkTask to store them and the extract...
   Author: Thorsten Scherler, 2013-01-30, 10:22
cleanup branch - added crawler module - Droids - [mail # dev]
...Hi,  I updated the cleanup branch with a rewritten crawler module. Basically I removed all the protocol stuff and added a simple HttpClient Fetcher for retrieving web pages.  Since...
   Author: Tobias Rübner, 2013-01-30, 10:13
Re: svn commit: r1439804 - in /incubator/droids/branches/0.2.x-cleanup/droids-core: ./ src/main/java/org/apache/droids/core/ src/main/java/org/apache/droids/handle/ src/main/java/org/apache/droids/parse/ src/main/java/org/apache/droids/taskmaster/ src/test... - Droids - [mail # dev]
...Hi Thorsten,  actually while implementing the new HTTPClient Crawler I needed a simple and generic way for the parser to create new tasks. When the parser is used for extracting the lin...
   Author: Tobias Rübner, 2013-01-30, 09:42
Re: svn commit: r1439804 - in /incubator/droids/branches/0.2.x-cleanup/droids-core: ./ src/main/java/org/apache/droids/core/ src/main/java/org/apache/droids/handle/ src/main/java/org/apache/droids/parse/ src/main/java/org/apache/droids/taskmaster/ src/test... - Droids - [mail # dev]
...On 01/29/2013 04:23 PM, Thorsten Scherler wrote:  Actually I just fixed my custom code for the linkTask with      @Override     public Link createTask(URI uri) ...
   Author: Thorsten Scherler, 2013-01-29, 15:31
Re: svn commit: r1439804 - in /incubator/droids/branches/0.2.x-cleanup/droids-core: ./ src/main/java/org/apache/droids/core/ src/main/java/org/apache/droids/handle/ src/main/java/org/apache/droids/parse/ src/main/java/org/apache/droids/taskmaster/ src/test... - Droids - [mail # dev]
...On 01/29/2013 10:50 AM, [EMAIL PROTECTED] wrote:  Why did you add createTask to the interface?  IMO it is not really generic since seeing your implementation and my current use c...
   Author: Thorsten Scherler, 2013-01-29, 15:23
Solr vs. ElasticSearch: Part 6 – User & Dev Communities | Sematext Blog on WordPress.com - all - [Sematext # blog]
...Solr vs. ElasticSearch: Part 6 – User & Dev Communities January 22, 2013 by Rafał Kuć 2 Comments One of t...
http://blog.sematext.com/2013/01/22/solr-vs-elasticsearch-userdev-communiti.../    2013-01-22, 00:00
RE - Droids - [mail # dev]
...http://luselphotography.com/wp-content/themes/Professional_Photography/yahool2.php...
   Author: Ryan McKinley, 2013-01-21, 21:18