Nutch, mail # user - Setting the Fetch time with a CustomFetchSchedule


RE: Setting the Fetch time with a CustomFetchSchedule
Markus Jelsma 2012-05-21, 12:44
Yes, you can pass ParseMeta keys to the FetchSchedule as part of the CrawlDatum's metadata, as I did in:
https://issues.apache.org/jira/browse/NUTCH-1024
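
To make the idea concrete, here is a minimal, self-contained sketch of the flow: the parse side writes a value into the datum's metadata, and the schedule reads it back when computing the next fetch time. It does not use the real Nutch classes. DatumSketch is a hypothetical stand-in for CrawlDatum, and the key name and helper methods are made up for illustration. In a real setup the parse metadata keys also have to be copied into the CrawlDb; if I recall correctly this is what the db.parsemeta.to.crawldb property is for, but check nutch-default.xml for your version.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for Nutch's CrawlDatum; it holds only what this sketch needs.
class DatumSketch {
    long fetchTime;        // next fetch time, in milliseconds since the epoch
    int fetchIntervalSec;  // fetch interval, in seconds
    final Map<String, String> metaData = new HashMap<>();
}

class CustomFetchScheduleSketch {
    // Hypothetical metadata key shared by the parse filter and the schedule.
    static final String EXTRA_DELAY_KEY = "custom.extra.delay.seconds";

    // Parse side: store the value discovered while parsing as a metadata entry.
    static void recordExtraDelay(DatumSketch datum, long extraDelaySec) {
        datum.metaData.put(EXTRA_DELAY_KEY, Long.toString(extraDelaySec));
    }

    // Schedule side: base fetch time + interval + the delay carried in the metadata.
    // A missing or malformed value falls back to no extra delay.
    static DatumSketch setFetchSchedule(DatumSketch datum, long fetchTime) {
        long extraDelaySec = 0L;
        String raw = datum.metaData.get(EXTRA_DELAY_KEY);
        if (raw != null) {
            try {
                extraDelaySec = Long.parseLong(raw.trim());
            } catch (NumberFormatException e) {
                extraDelaySec = 0L;
            }
        }
        datum.fetchTime = fetchTime
                + (long) datum.fetchIntervalSec * 1000L
                + extraDelaySec * 1000L;
        return datum;
    }
}
```

In an actual FetchSchedule the same lookup would go through the CrawlDatum's metadata rather than a plain map, so treat the above only as an outline of where the value travels.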
 
 
-----Original message-----
> From: Vikas Hazrati <[EMAIL PROTECTED]>
> Sent: Mon 21-May-2012 13:44
> To: [EMAIL PROTECTED]
> Subject: Setting the Fetch time with a CustomFetchSchedule
>
> Hi,
>
> I would like to write a custom implementation of AbstractFetchSchedule
> and change the fetch time on the basis of some parameters that
> I get as part of my parsing.
>
> // something like this
> datum.setFetchTime(fetchTime + (long)datum.getFetchInterval() * 1000 +
> customLogic);
>
> Right now I have a custom URLFilter and a custom parser which implements
> HtmlParseFilter. During custom parsing, I come across some
> parameters which would help me decide how to set the fetch time for
> that URL. I would like to pass these values to my CustomFetchSchedule.
>
> Is there a way to do that? Can I pass them as a part of configuration?
>
> Since I would get the data that I need to make a decision only as part of
> the parse, would it be possible to pass this data to the FetchSchedule?
>
> Thoughts?
>
> Regards | Vikas
>