Results 1 to 10 of 17 (0.465s).
Re: Tika's outlink is not as expected - Nutch - [mail # user]
...Thank you. I just found it a minute ago and was going to write the email.  ([;_]?((?i)l|j|bv_)?((?i)sid|phpsessid|sessionid)=.*?)(\?|&|#|$)  Perhaps, I was too tired yesterday ...
   Author: Ake Tangkananond, 2012-08-15, 07:47
Re: Tika's outlink is not as expected - Nutch - [mail # user]
...Hi Ferdy,  Thanks for your advice. I don't have any special filtering/normalizing rules except the standard ones. I even tried disabling all URL normalization plugins, but the result is no d...
   Author: Ake Tangkananond, 2012-08-14, 16:29
Re: Tika's outlink is not as expected - Nutch - [mail # user]
...Thanks for the reply, Ferdy.  The variable 'db.max.outlinks.per.page' was set to 100, and I could parse HTML fine.   Regards, Ake Tangkananond     On 8/14/12 6:43 PM, "Ferdy Galem...
   Author: Ake Tangkananond, 2012-08-14, 11:49
Tika's outlink is not as expected - Nutch - [mail # user]
...Hi,  I'm getting unexpected behavior from the Nutch parsing mechanism. Perhaps I don't really understand Nutch well. Here is what I find weird. Could you please advise?  I crawl ...
   Author: Ake Tangkananond, 2012-08-14, 11:15
Re: Nutch 2 encoding - Nutch - [mail # user]
...Hi,  I'm debugging.  I inserted code to print out the encoding in the getParse function of HtmlParser.java, and it printed utf-8. So I think it might be a data store problem. What...
   Author: Ake Tangkananond, 2012-08-09, 18:05
Re: Nutch 2 encoding - Nutch - [mail # user]
...Hi,  Sorry for the late reply. I was trying to figure it out myself, but no luck.  I'm on HBase with a local deployment, version 0.90.6, r1295128, the working version according to the wiki: http:/...
   Author: Ake Tangkananond, 2012-08-09, 16:06
Nutch 2 encoding - Nutch - [mail # user]
...Hi all,  I'm just wondering whether Nutch 2 works fine with non-English characters in your deployments. Thai used to work fine for me in Nutch 1.5 but not in Nutch 2. Did I miss some...
   Author: Ake Tangkananond, 2012-08-09, 14:05
Nutch plugins/feed - Nutch - [mail # user]
...Hi,  I see there is an RSS parser under src/plugins, but it wasn't included in the deployment profile in src/plugins/build.xml. Is there a substitute for this parser now? Which one should I us...
   Author: Ake Tangkananond, 2012-08-08, 07:54
Re: Filter out document before sending to solr index - Nutch - [mail # user]
...Cool. Thanks!!  Regards, Ake Tangkananond     On 8/7/12 2:58 PM, "Ferdy Galema"  wrote:  ...
   Author: Ake Tangkananond, 2012-08-07, 08:05
Filter out document before sending to solr index - Nutch - [mail # user]
...Hi,  Is it possible to filter out some documents from being indexed in Solr when executing the command "bin/nutch solrindex"?  Note: Those documents are fine living in the Nutch datasto...
   Author: Ake Tangkananond, 2012-08-07, 07:49