Mahout, mail # user - Kmeans cluster mapping to actual document IDs


Kmeans cluster mapping to actual document IDs
Hossein Kazemi 2012-04-11, 10:15
Hi,
I have clustered a set of documents using Mahout's k-means
(MapReduce). I used sparse vectors because of the large size of my corpus.
The book says that the folder named ClusteredPoints contains the
mapping between the clustered documents and the document IDs. However,
all I can see is a "1:0", a feature vector, and a cluster ID. Where
can I find the actual document names/IDs?
thx