Mahout, mail # dev - Helping out with the .7 release


Re: Helping out with the .7 release
Jeff Eastman 2012-02-22, 15:56
Hi Saikat,

I agree with Paritosh that a great place to begin would be to write
some unit tests. This will familiarize you with the code base and help
us a lot with our 0.7 housekeeping release. The new
clustering-as-classification components are going to unify many, but not
all, of the existing clustering algorithms, reducing their complexity by
factoring out duplication and streamlining their integration into
semi-supervised classification engines.

Please feel free to post any questions you may have in reading through
this code. This is a major refactoring effort and we will need all the
help we can get. Thanks for the offer,

Jeff

On 2/21/12 10:46 PM, Saikat Kanjilal wrote:
> Hi Paritosh, Yes, creating the test case would be a great start. However, are there other tasks you guys need help with that I could do before the test creation? I will sync trunk and start reading through the code in the meantime. Regards
>
>> Date: Wed, 22 Feb 2012 10:57:51 +0530
>> From: [EMAIL PROTECTED]
>> To: [EMAIL PROTECTED]
>> Subject: Re: Helping out with the .7 release
>>
>> We are creating clustering as classification components which will help
>> in moving clustering out. Once the component is ready, then the
>> clustering algorithms would need refactoring.
>> The clustering as classification component and the outlier removal
>> component have been created.
>>
>> Most of it is committed, and the rest is available as a patch. See
>> https://issues.apache.org/jira/browse/MAHOUT-929
>> If you apply the latest patch available on MAHOUT-929, you can see
>> all that is available now.
>>
>> If you want, you can help with the test case for
>> ClusterClassificationMapper available in the patch.
>>
>> On 22-02-2012 10:27, Saikat Kanjilal wrote:
>>> Hi guys, I am interested in helping out with the clustering component of Mahout. I looked through the JIRA items below and was wondering if there is a specific one that would be good to start with:
>>>
>>> https://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=true&jqlQuery=project+%3D+MAHOUT+AND+resolution+%3D+Unresolved+AND+component+%3D+Clustering+ORDER+BY+priority+DESC&mode=hide
>>>
>>> I was initially thinking of working on MAHOUT-930 or MAHOUT-931, but could work on others if needed.
>>> Best Regards  
>