Results 21 to 30 of 1760 (3.328s).
Re: Link interface for crawler - Droids - [mail # dev]
...On 01/30/2013 01:55 PM, Thorsten Scherler wrote: The problem as well I see ATM is that we do public abstract class CrawlingDroid extends AbstractDroid { but LinkTask is no interface wh...
Author: Thorsten Scherler, 2013-01-30, 13:30

Re: Link interface for crawler - Droids - [mail # dev]
...On 01/30/2013 12:31 PM, Tobias Rübner wrote: I prefer a well defined interface since the ContentEntity is in the end a simple HashMap where we store information. We have a couple of ...
Author: Thorsten Scherler, 2013-01-30, 12:55

Re: Link interface for crawler - Droids - [mail # dev]
...Hi Thorsten, I would propose to extend the ContentEntity and add the needed fields there. The Task should only contain data relevant for executing the task. All other "meta" dat...
Author: Tobias Rübner, 2013-01-30, 11:31

Link interface for crawler - Droids - [mail # dev]
...Hi all, Tobias I saw that you dropped the link interface but moved the links to the contentEntity. The problem I see is that an URL needs stuff like getAnchorText if it is useful for t...
Author: Thorsten Scherler, 2013-01-30, 11:05

Re: svn commit: r1439804 - in /incubator/droids/branches/0.2.x-cleanup/droids-core: ./ src/main/java/org/apache/droids/core/ src/main/java/org/apache/droids/handle/ src/main/java/org/apache/droids/parse/ src/main/java/org/apache/droids/taskmaster/ src/test... - Droids - [mail # dev]
...On 01/30/2013 10:42 AM, Tobias Rübner wrote: I understand your saying about the parser I worked around that by using the getOutlinks() from the LinkTask to store them and the extract...
Author: Thorsten Scherler, 2013-01-30, 10:22

cleanup branch - added crawler module - Droids - [mail # dev]
...Hi, I updated the cleanup branch with a rewritten crawler module. Basically I removed all the protocol stuff and added a simple HttpClient Fetcher for retrieving web pages. Since...
Author: Tobias Rübner, 2013-01-30, 10:13

Re: svn commit: r1439804 - in /incubator/droids/branches/0.2.x-cleanup/droids-core: ./ src/main/java/org/apache/droids/core/ src/main/java/org/apache/droids/handle/ src/main/java/org/apache/droids/parse/ src/main/java/org/apache/droids/taskmaster/ src/test... - Droids - [mail # dev]
...Hi Thorsten, actually while implementing the new HTTPClient Crawler I needed a simple and generic way for the parser to create new tasks. When the parser is used for extracting the lin...
Author: Tobias Rübner, 2013-01-30, 09:42

Re: svn commit: r1439804 - in /incubator/droids/branches/0.2.x-cleanup/droids-core: ./ src/main/java/org/apache/droids/core/ src/main/java/org/apache/droids/handle/ src/main/java/org/apache/droids/parse/ src/main/java/org/apache/droids/taskmaster/ src/test... - Droids - [mail # dev]
...On 01/29/2013 04:23 PM, Thorsten Scherler wrote: Actually I just fixed my custom code for the linkTask with @Override public Link createTask(URI uri) ...
Author: Thorsten Scherler, 2013-01-29, 15:31

Re: svn commit: r1439804 - in /incubator/droids/branches/0.2.x-cleanup/droids-core: ./ src/main/java/org/apache/droids/core/ src/main/java/org/apache/droids/handle/ src/main/java/org/apache/droids/parse/ src/main/java/org/apache/droids/taskmaster/ src/test... - Droids - [mail # dev]
...On 01/29/2013 10:50 AM, [EMAIL PROTECTED] wrote: Why did you added createTask to the interface? IMO it is not really generic since seeing your implementation and my current use c...
Author: Thorsten Scherler, 2013-01-29, 15:23

Solr vs. ElasticSearch: Part 6 – User & Dev Communities | Sematext Blog on WordPress.com - all - [Sematext # blog]
...Solr vs. ElasticSearch: Part 6 – User & Dev Communities January 22, 2013 by Rafał Kuć 2 Comments One of t...
http://blog.sematext.com/2013/01/22/solr-vs-elasticsearch-userdev-communiti.../
2013-01-22, 00:00