Mahout, mail # user - can't get <point-id, cluster-id> thru "-p"


Baoqiang Cao 2012-03-14, 17:52
Hi,

Sorry for such a trivial question, but I have run out of luck. I'm trying
to see which points (by point ID) belong to which cluster center.
Here is what I did:

mahout clusterdump -s /mahout/kmeans/clusters-15-final \
  -d /mahout/sparse/dictionary.file-0 -dt sequencefile \
  -p /mahout/points > out

The onscreen output is:

12/03/14 12:39:52 INFO common.AbstractJob: Command line arguments:
{--dictionary=/mahout/sparse/dictionary.file-0,
--dictionaryType=sequencefile,
--distanceMeasure=org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure,
--endPhase=2147483647, --outputFormat=TEXT,
--pointsDir=/mahout/points,
--seqFileDir=/mahout/kmeans/clusters-15-final, --startPhase=0,
--tempDir=temp}
12/03/14 12:39:55 WARN snappy.LoadSnappy: Snappy native library is available
12/03/14 12:39:55 INFO util.NativeCodeLoader: Loaded the native-hadoop library
12/03/14 12:39:55 INFO snappy.LoadSnappy: Snappy native library loaded
12/03/14 12:39:55 INFO compress.CodecPool: Got brand-new decompressor
12/03/14 12:39:55 INFO compress.CodecPool: Got brand-new decompressor
12/03/14 12:39:55 INFO compress.CodecPool: Got brand-new decompressor
12/03/14 12:39:55 INFO compress.CodecPool: Got brand-new decompressor
12/03/14 12:42:07 INFO clustering.ClusterDumper: Wrote 5188 clusters
12/03/14 12:42:07 INFO driver.MahoutDriver: Program took 135276 ms (Minutes: 2.2546)

There is nothing under "/mahout/points". Can anyone explain why, and how
to get the <point-id, cluster-id> mapping?

Thanks in advance.
Baoqiang