Re: [jira] [Commented] (MAHOUT-929) Refactor Clustering (Vector Classification) into a Separate Postprocess with Outlier Pruning
Jeff Eastman 2012-02-23, 12:29
Just +1 <grin>
On 2/22/12 10:35 PM, Paritosh Ranjan (Commented) (JIRA) wrote:
> [ https://issues.apache.org/jira/browse/MAHOUT-929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214329#comment-13214329 ]
>
> Paritosh Ranjan commented on MAHOUT-929:
> ----------------------------------------
>
> Assigned to myself.
>
> I think cluster classification driver is developed now. Would wait for some time for the ClusterClassificationMapper's Test case ( patch ) as we asked on dev.
>
> Else I will write it and commit it. Might need help while committing for the first time.
>
> Considering, ClusterClassificationDriver development is done, we need to refactor the KMeans, FuzzyK, Dirichlet, Canopy Drivers.
> I will create separate child issues for refactoring these algos, so that different people can pick it in parallel, if they want. It will help in avoiding duplicate efforts.
>
> Jeff, any comments/suggestions?
>
>> Refactor Clustering (Vector Classification) into a Separate Postprocess with Outlier Pruning
>> --------------------------------------------------------------------------------------------
>>
>> Key: MAHOUT-929
>> URL: https://issues.apache.org/jira/browse/MAHOUT-929
>> Project: Mahout
>> Issue Type: Improvement
>> Components: Classification, Clustering
>> Affects Versions: 0.6
>> Reporter: Jeff Eastman
>> Assignee: Paritosh Ranjan
>> Fix For: 0.7
>>
>> Attachments: Mahout-929, Mahout-929, Mahout-929, Mahout-929
>>
>>
>> The current clustering drivers have a -cp option to produce clusteredPoints directory containing the input vectors classified by the final clusters produced by the algorithm. These options are redundantly implemented in those drivers.
>> - Factor out & implement an independent post processor to perform the classification step independently of the various clustering implementations.
>> - Implement a pluggable outlier removal capability for this classifier.
>> - Consider building off of the ClusterClassifier & ClusterIterator ideas.