Mahout, mail # user - empty vector out of clusterdump


Re: empty vector out of clusterdump
Jeff Eastman 2012-03-20, 18:16
Empty? Note that Mahout prints only the non-zero elements of a vector,
so an all-zero vector shows up as []. It looks like you may have had
many such zero vectors, and they were clustered into VL-1705919, which
has zero for both center and radius. If your other clusters look
different, then I think this is probably correct.
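
To illustrate why a cluster of all-zero vectors would print as c=[] r=[], here is a minimal sketch in plain Python. It does not use Mahout; `format_sparse`, `centroid`, and `radius` are hypothetical helpers, and treating the radius as a per-dimension standard deviation is an assumption made for the example.

```python
def format_sparse(vector):
    """Render only the non-zero entries as index:value pairs (hypothetical helper)."""
    entries = [f"{i}:{v}" for i, v in enumerate(vector) if v != 0]
    return "[" + ", ".join(entries) + "]"


def centroid(points):
    """Per-dimension mean of the points."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def radius(points, center):
    """Per-dimension standard deviation around the center (assumed definition)."""
    n = len(points)
    return [
        (sum((p[d] - center[d]) ** 2 for p in points) / n) ** 0.5
        for d in range(len(center))
    ]


# Three all-zero members, standing in for the 844992 members in the dump.
zero_points = [[0.0, 0.0, 0.0, 0.0] for _ in range(3)]
c = centroid(zero_points)
r = radius(zero_points, c)

print(format_sparse(c))                      # []
print(format_sparse(r))                      # []
print(format_sparse([0.0, 1.5, 0.0, 2.0]))   # [1:1.5, 3:2.0]
```

Every member, the center, and the radius are all zero, so nothing is left to print inside the brackets, while a vector with some non-zero entries still lists them.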
On 3/20/12 6:10 AM, Baoqiang wrote:
> Yes, I used -cl in the kmeans step. Only the biggest cluster is empty; all the others are not. I don't know why.
>
> Sent from my iPhone
>
> On Mar 20, 2012, at 1:36 AM, Paritosh Ranjan<[EMAIL PROTECTED]>  wrote:
>
>> Did you run kmeans with the -cl <run input vector clustering> option set to "true"?
>>
>>
>> On 19-03-2012 07:38, Baoqiang Cao wrote:
>>> Hi,
>>>
>>> I used mahout kmeans and then clusterdump. Here is the result for the
>>> biggest cluster (844992 members):
>>>
>>> VL-1705919{n=844992 c=[] r=[]}
>>>          Top Terms:
>>>          Weight : [props - optional]:  Point:
>>>          1.0 : [distance=0.0]: []
>>>          1.0 : [distance=0.0]: []
>>>          1.0 : [distance=0.0]: []
>>>          1.0 : [distance=0.0]: []
>>>          1.0 : [distance=0.0]: []
>>>          1.0 : [distance=0.0]: []
>>>
>>> What does this mean? Is this whole cluster made of empty vectors (members)?
>>>
>>> Best,
>>> Baoqiang