Results 61 to 70 of 107 (0.774s).

Re: How to find the k most similar docs - Mahout - [mail # user]
...Pat, MatrixDump expects an input file of . The matrix that gets created from RowIdJob is and you cannot run MatrixDump to see the contents of the matrix. You need to use...
Author: Suneel Marthi, 2012-03-09, 12:26

Re: Minhash review - Mahout - [mail # dev]
...That's correct. ________________________________ From: Frank Scholten To: [EMAIL PROTECTED] Sent: Thursday, March 8, 2012 4:17 AM Subject: Re: Minhash review...
Author: Suneel Marthi, 2012-03-08, 12:44

Re: Minhash review - Mahout - [mail # dev]
...Frank, I modified the present MinHash to hash on the index as opposed to the present tf-idf weights, but the change had no impact on the output and I still get bad clusters. I di...
Author: Suneel Marthi, 2012-03-08, 07:22

Re: How to find the k most similar docs - Mahout - [mail # user]
...Did the RowSimilarityJob execute successfully? Your output should have been one or more part-r-* files (depending on the number of reducers you have configured in your environment)...
Author: Suneel Marthi, 2012-03-07, 02:25

Re: How to find the k most similar docs - Mahout - [mail # user]
...Pat, Your input to RowSimilarity seems to be the tfidf-vectors directory which is . Before executing the RowSimilarity job you need to run the RowIdJob which creates a matrix of . ...
Author: Suneel Marthi, 2012-03-05, 19:48

Re: How to find the k most similar docs - Mahout - [mail # user]
...Pat, You are welcome. FYI... Another option you could consider for determining document similarity would be 'MinHash clustering'. Mahout comes with a minHash c...
Author: Suneel Marthi, 2012-02-20, 20:28

Re: How to find the k most similar docs - Mahout - [mail # user]
...Hi Pat, 1. Please look at the discussion thread at http://mail-archives.apache.org/mod_mbox/mahout-user/201202.mbox/browser for a description of what the RowSimilarityJob does. The R...
Author: Suneel Marthi, 2012-02-20, 05:00

Re: How to find the k most similar docs - Mahout - [mail # user]
...You might want to look at the RowSimilarityJob in Mahout to determine document similarity. Here's what you would do: Assuming that your documents have already been vector...
Author: Suneel Marthi, 2012-02-18, 21:27

Update Mahout Wiki with latest Mahout Versions - Mahout - [mail # user]
...Could someone update the Mahout wiki - http://mahout.apache.org with the correct release and development versions?...
Author: Suneel Marthi, 2012-02-16, 17:17

Re: Mahout 0.5 java.lang.IllegalStateException: No clusters found. Check your -c path. - Mahout - [mail # user]
...Did you specify the -cl option when executing kmeans? Sent from my iPhone On Feb 14, 2012, at 9:18 PM, Qiang Xu wrote: ...
Author: Suneel Marthi, 2012-02-15, 02:50