Results 1 to 10 of 49365 (0.0s).
[NUTCH-2571] SegmentReader -list fails to read segment - Nutch - [issue]
...The -list command of SegmentReader fails to read data from segments: % bin/nutch readseg -list crawl/segments/20180409100315/ Exception in thread "main" java.io.IOException: wrong value class...
http://issues.apache.org/jira/browse/NUTCH-2571
Author: Sebastian Nagel, 2018-04-23, 12:14
[NUTCH-2375] Upgrade the code base from org.apache.hadoop.mapred to org.apache.hadoop.mapreduce - Nutch - [issue]
...Nutch is still using the deprecated org.apache.hadoop.mapred dependency. It needs to be updated to the org.apache.hadoop.mapreduce dependency....
http://issues.apache.org/jira/browse/NUTCH-2375
Author: Omkar Reddy, 2018-04-23, 11:56
[NUTCH-2572] HostDb: updatehostdb does not set values - Nutch - [issue]
...% bin/nutch readdb crawl/crawldb -stats -sort...status 1 (db_unfetched): 3 nutch.apache.org : 3 status 2 (db_fetched): 2 nutch....
http://issues.apache.org/jira/browse/NUTCH-2572
Author: Sebastian Nagel, 2018-04-23, 11:56
[NUTCH-2570] Deduplication job fails to install deduplicated CrawlDb - Nutch - [issue]
...The DeduplicationJob ("nutch dedup") fails to install the deduplicated CrawlDb and leaves only the "old" crawldb (if "db.preserve.backup" is true): % tree crawldb crawldb ├── current │ └── par...
http://issues.apache.org/jira/browse/NUTCH-2570
Author: Sebastian Nagel, 2018-04-23, 11:26
[NUTCH-2544] Nutch 1.15 no longer compatible with AWS EMR and S3 - Nutch - [issue]
...Nutch 1.14 is working OK with AWS EMR and S3 storage, but NUTCH-2375 appears to have broken this. Generator partitioning fails with Error: java.lang.NullPointerException at org.apache.nutch.c...
http://issues.apache.org/jira/browse/NUTCH-2544
Author: Steven W, 2018-04-23, 11:26
[NUTCH-2526] NPE in scoring-opic when indexing document without CrawlDb datum - Nutch - [issue]
...I was trying to write a parse filter plugin whose job was to parse internal links as a separate document. What I did, basically, is break the page into multiple parseResults, each parseResul...
http://issues.apache.org/jira/browse/NUTCH-2526
Author: Yash Thenuan, 2018-04-23, 09:53
[NUTCH-2456] Allow to index pages/URLs not contained in CrawlDb - Nutch - [issue]
...If http.redirect.max is set to a positive value, the Fetcher will follow redirects, creating a new CrawlDatum. If the redirected URL is fetched and parsed, during indexing for it we have a sp...
http://issues.apache.org/jira/browse/NUTCH-2456
Author: Yossi Tamari, 2018-04-23, 08:29
[NUTCH-2569] ClassNotFoundException when running in (pseudo-)distributed mode - Nutch - [issue]
...The CrawlDb / updatedb job fails in pseudo-distributed mode with a ClassNotFoundException: 18/04/22 19:24:49 INFO mapreduce.Job: Task Id : attempt_1524395182329_0018_m_000000_0, Status : FAIL...
http://issues.apache.org/jira/browse/NUTCH-2569
Author: Sebastian Nagel, 2018-04-22, 19:49
[NUTCH-2517] mergesegs corrupts segment data - Nutch - [issue]
...The problem probably occurs since commit https://github.com/apache/nutch/commit/54510e503f7da7301a59f5f0e5bf4509b37d35b4 How to reproduce: create container from apache/nutch image (latest) op...
http://issues.apache.org/jira/browse/NUTCH-2517
Author: Marco Ebbinghaus, 2018-04-22, 19:18
[NUTCH-1228] Change mapred.task.timeout to mapreduce.task.timeout in fetcher - Nutch - [issue]
http://issues.apache.org/jira/browse/NUTCH-1228
Author: Markus Jelsma, 2018-04-21, 17:14