Results 41 to 50 of 1767 (0.342s).
RE: Nutch support with regards to Deduplication and Document versioning - Nutch - [mail # user]
...If you use 1.x and don't merge segments you still have older versions of documents. There is no active versioning in Nutch 1x except segment naming and merging, if you use it. ...
Author: Markus Jelsma, 2013-01-23, 08:28
RE: conditional indexing - Nutch - [mail # user]
...Hi - i've not yet committed a fix for: https://issues.apache.org/jira/browse/NUTCH-1449 This will allow you to stop documents from being indexed from within your indexing filter. Order...
Author: Markus Jelsma, 2013-01-23, 08:26
[NUTCH-1219] Upgrade all jobs to new MapReduce API - Nutch - [issue]
...We should upgrade to the new Hadoop API for Nutch trunk as already has been done for the Nutchgora branch. If i'm not mistaken we can already upgrade to the latest 0.20.5 version that still ...
http://issues.apache.org/jira/browse/NUTCH-1219
Author: Markus Jelsma, 2013-01-21, 09:33
RE: Synthetic Tokens - Nutch - [mail # user]
...Hi, In Nutch a `synthetic token` maps to a field/value pair. You need an indexing filter to read the key/value pair from the parsed metadata and add it as a field/value pair to t...
Author: Markus Jelsma, 2013-01-21, 09:23
[NUTCH-1223] Migrate WebGraph to MapReduce API - Nutch - [issue]
http://issues.apache.org/jira/browse/NUTCH-1223
Author: Markus Jelsma, 2013-01-21, 08:40
RE: [CALL FOR TESTING] NUTCH-1047 Pluggable indexing backends - Nutch - [mail # dev]
...Sure, will look into this next week! Thanks for the good work and have a nice weekend!!!!! MArkus ...
Author: Markus Jelsma, 2013-01-18, 16:26
[NUTCH-1088] Write Solr XML documents - Nutch - [issue]
...Documents need to be reindexed when index-time analysis is modified. Indexing individual segments from Nutch is tedious, especially for small segments. This issue should add a feature that c...
http://issues.apache.org/jira/browse/NUTCH-1088
Author: Markus Jelsma, 2013-01-18, 14:59
[NUTCH-1449] Optionally delete documents skipped by IndexingFilters - Nutch - [issue]
...Add configuration option to delete documents instead of skipping them if the indexing filters return null. This is useful to delete documents with new business logic in the indexing filter c...
http://issues.apache.org/jira/browse/NUTCH-1449
Author: Markus Jelsma, 2013-01-18, 12:08
[NUTCH-1520] SegmentMerger looses records - Nutch - [issue]
...It seems the SegmentMerger tool looses documents. You're likely to see less documents in an index if you index one or more already merged segments than if you index all unmerged segments.Thi...
http://issues.apache.org/jira/browse/NUTCH-1520
Author: Markus Jelsma, 2013-01-17, 12:16
[NUTCH-1480] SolrIndexer to write to multiple servers. - Nutch - [issue]
...SolrUtils should return an array of SolrServers and read the SolrUrl as a comma delimited list of URL's using Configuration.getString(). SolrWriter should be able to handle this list of Solr...
http://issues.apache.org/jira/browse/NUTCH-1480
Author: Markus Jelsma, 2013-01-17, 12:11