Re: What are the biggest changes in 0.4?
deneche abdelhakim 2010-10-19, 15:08
You can now save a random forest and use it to classify new data.
On Tue, Oct 19, 2010 at 3:40 PM, Sebastian Schelter <[EMAIL PROTECTED]> wrote:
> Here's the stuff I've been working on in 0.4:
> * Map/Reduce job to compute the pairwise similarities of the rows of a
> matrix using a customizable similarity measure (with implementations already
> provided for cooccurrence, Euclidean distance, log-likelihood, Pearson
> correlation, Tanimoto coefficient, cosine)
> * Map/Reduce job to compute the item-item similarities for item-based
> collaborative filtering
> * RecommenderJob has evolved into a fully distributed item-based
> On 19.10.2010 16:30, Jeff Eastman wrote:
>> On 10/19/10 7:00 AM, Sean Owen wrote:
>>> I've even lost track of what the big-ticket changes have been since 0.3.
>>> compiling 7-8 bullet points for the release notes, as I am going through
>>> release process now.
>>> Would anyone please volunteer some bullet points? I don't want to miss
>>> anything and want to describe it correctly. I'll do my best to fill in
>>> seems missing.
>> For clustering, here's a few:
>> * Model refactoring and CLI changes to improve integration and
>> * New ClusterEvaluator and CDbwClusterEvaluator offer new ways to
>> evaluate clustering effectiveness
>> * New Spectral Clustering and MinHash Clustering from GSoC (still
>> * New VectorModelClassifier allows any set of clusters to be used
>> for classification
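
As a rough illustration of the idea in the last clustering bullet (using a set of clusters to classify new vectors), here is a minimal sketch. It is not Mahout's VectorModelClassifier, whose API is not shown in this thread. The class name NearestClusterClassifier and both method names are hypothetical, and the sketch assumes each cluster is reduced to a single centroid and that "closest" means smallest Euclidean distance.

```java
public class NearestClusterClassifier {

    // Hypothetical helper: squared Euclidean distance between two
    // vectors of equal length.
    static double squaredDistance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Assumption: a cluster is represented only by its centroid.
    // Returns the index of the centroid nearest to the point; that
    // index serves as the predicted class label.
    static int classify(double[][] centroids, double[] point) {
        if (centroids.length == 0) {
            throw new IllegalArgumentException("no clusters given");
        }
        int best = 0;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (int i = 0; i < centroids.length; i++) {
            double distance = squaredDistance(centroids[i], point);
            if (distance < bestDistance) {
                bestDistance = distance;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Two made-up centroids, purely for demonstration.
        double[][] centroids = { {0.0, 0.0}, {10.0, 10.0} };
        System.out.println(classify(centroids, new double[] {1.0, 2.0}));
        System.out.println(classify(centroids, new double[] {9.0, 8.0}));
    }
}
```

Comparing squared distances avoids a square root without changing which centroid wins. A model-based classifier could instead score each cluster with a probability density, which this sketch does not attempt.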