Results 41 to 50 of 1767 (0.342s).
RE: Nutch support with regards to Deduplication and Document versioning - Nutch - [mail # user]
...If you use 1.x and don't merge segments you still have older versions of documents. There is no active versioning in Nutch 1.x except segment naming and merging, if you use it. ...
   Author: Markus Jelsma, 2013-01-23, 08:28
RE: conditional indexing - Nutch - [mail # user]
...Hi - I've not yet committed a fix for: https://issues.apache.org/jira/browse/NUTCH-1449 This will allow you to stop documents from being indexed from within your indexing filter. Order...
   Author: Markus Jelsma, 2013-01-23, 08:26
[NUTCH-1219] Upgrade all jobs to new MapReduce API - Nutch - [issue]
...We should upgrade to the new Hadoop API for Nutch trunk, as has already been done for the Nutchgora branch. If I'm not mistaken we can already upgrade to the latest 0.20.5 version that still ...
http://issues.apache.org/jira/browse/NUTCH-1219    Author: Markus Jelsma, 2013-01-21, 09:33
RE: Synthetic Tokens - Nutch - [mail # user]
...Hi,  In Nutch a `synthetic token` maps to a field/value pair.  You need an indexing filter to read the key/value pair from the parsed metadata and add it as a field/value pair to t...
   Author: Markus Jelsma, 2013-01-21, 09:23
[NUTCH-1223] Migrate WebGraph to MapReduce API - Nutch - [issue]
http://issues.apache.org/jira/browse/NUTCH-1223    Author: Markus Jelsma, 2013-01-21, 08:40
RE: [CALL FOR TESTING] NUTCH-1047 Pluggable indexing backends - Nutch - [mail # dev]
...Sure, will look into this next week! Thanks for the good work and have a nice weekend! Markus ...
   Author: Markus Jelsma, 2013-01-18, 16:26
[NUTCH-1088] Write Solr XML documents - Nutch - [issue]
...Documents need to be reindexed when index-time analysis is modified. Indexing individual segments from Nutch is tedious, especially for small segments. This issue should add a feature that c...
http://issues.apache.org/jira/browse/NUTCH-1088    Author: Markus Jelsma, 2013-01-18, 14:59
[NUTCH-1449] Optionally delete documents skipped by IndexingFilters - Nutch - [issue]
...Add configuration option to delete documents instead of skipping them if the indexing filters return null. This is useful to delete documents with new business logic in the indexing filter c...
http://issues.apache.org/jira/browse/NUTCH-1449    Author: Markus Jelsma, 2013-01-18, 12:08
[NUTCH-1520] SegmentMerger looses records - Nutch - [issue]
...It seems the SegmentMerger tool loses documents. You're likely to see fewer documents in an index if you index one or more already merged segments than if you index all unmerged segments. Thi...
http://issues.apache.org/jira/browse/NUTCH-1520    Author: Markus Jelsma, 2013-01-17, 12:16
[NUTCH-1480] SolrIndexer to write to multiple servers. - Nutch - [issue]
...SolrUtils should return an array of SolrServers and read the SolrUrl as a comma-delimited list of URLs using Configuration.getString(). SolrWriter should be able to handle this list of Solr...
http://issues.apache.org/jira/browse/NUTCH-1480    Author: Markus Jelsma, 2013-01-17, 12:11