Mahout, mail # user - How much memory do I need? : Clustering : Hadoop


How much memory do I need? : Clustering : Hadoop
Paritosh Ranjan 2011-09-24, 09:32
Hi,

I am clustering 5 million vectors (200 dimensions each) on an 8-node
cluster with 2 GB of memory per node, using CanopyDriver. The replication
factor is 3.
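
For a sense of scale, here is a back-of-the-envelope estimate of the input size. It is only a sketch: it assumes dense vectors stored as 8-byte doubles and ignores serialization and JVM overhead, neither of which is stated above. The constant and function names are illustrative only.

```python
# Rough input-size estimate from the figures in this message.
NUM_VECTORS = 5_000_000
DIMENSIONS = 200
BYTES_PER_VALUE = 8      # assumption: dense double-precision storage
REPLICATION = 3          # HDFS replication factor
NODES = 8
RAM_PER_NODE_GB = 2


def raw_size_gb(num_vectors, dims, bytes_per_value=BYTES_PER_VALUE):
    """Size of the raw vector data in gigabytes (10^9 bytes)."""
    return num_vectors * dims * bytes_per_value / 1e9


raw_gb = raw_size_gb(NUM_VECTORS, DIMENSIONS)   # one copy of the data
hdfs_gb = raw_gb * REPLICATION                  # disk footprint across HDFS
total_ram_gb = NODES * RAM_PER_NODE_GB          # combined memory of the cluster

print(f"raw data: {raw_gb:.1f} GB")
print(f"on HDFS with replication: {hdfs_gb:.1f} GB")
print(f"total cluster RAM: {total_ram_gb} GB")
```

Under these assumptions, a single copy of the data is about half the size of the cluster's combined memory, and far larger than the 2 GB available on any one node.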

The reduce phase of buildCluster is taking too long to finish.

How can I improve the performance?

Is it related to memory? If so, what configuration do you suggest? I
cannot reduce the dimensionality of the vectors.

Thanks and Regards,
Paritosh Ranjan
Jeff Eastman 2011-09-26, 16:13