Nutch, mail # user - Is it possible to crawl yahoo answer?


Re: Is it possible to crawl yahoo answer?
tamanjit.bindra@...) 2011-07-15, 12:04
I don't think that should be a problem, though you would have to try it to
know for sure, because I am not sure whether it will crawl an encrypted URL
(experts, please help here).

Just make sure the following line is commented out in crawl-urlfilter.txt:

# skip URLs containing certain characters as probable queries, etc.
#-[?*!@=]

And add the following line:

+^http://answers.yahoo.com/([a-zA-Z0-9-_/]*)
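
To see what the two changes do together, here is a rough Python sketch (not
Nutch code). It assumes the filter file is read top to bottom, that the first
rule whose pattern is found in the URL decides the outcome, and that a URL
matching no rule is rejected. The catch-all "-." rule, the accepts() helper,
and the sample URLs are made up for illustration.

```python
import re

# Hypothetical rule list mirroring the edited crawl-urlfilter.txt.
# Each entry is (sign, pattern); "+" accepts, "-" rejects.
RULES = [
    # ("-", r"[?*!@=]"),  # the "probable queries" rule, commented out as above
    ("+", r"^http://answers.yahoo.com/([a-zA-Z0-9-_/]*)"),
    ("-", r"."),  # assumed catch-all at the end of the file: reject the rest
]

def accepts(url, rules=RULES):
    """Return True if the first rule whose pattern is found in url is a '+' rule."""
    for sign, pattern in rules:
        if re.search(pattern, url):
            return sign == "+"
    return False  # assumption: a URL that matches no rule is rejected

# Illustrative URL containing '?' and '=' (not taken from the original thread).
question = "http://answers.yahoo.com/question/index?qid=123"

print(accepts(question))                                # True
print(accepts("http://example.com/"))                   # False
# With the skip rule still active and listed first, the same URL is dropped.
print(accepts(question, [("-", r"[?*!@=]")] + RULES))   # False
```

If that first-match assumption holds for your Nutch version, the "+" line also
needs to sit above any catch-all reject rule in the file.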

Hopefully it should work. Good luck.

--
View this message in context: http://lucene.472066.n3.nabble.com/Is-it-possible-to-crawl-yahoo-answer-tp3171559p3171764.html
Sent from the Nutch - User mailing list archive at Nabble.com.