Re: Evaluation of recommenders
Sean Owen 2012-04-10, 19:33
You're talking about recommendations now... are we talking about a
clustering, classification or recommender system?
In general I don't know if it makes sense for business users to be
deciding aspects of the internal model. At most someone should input
the tradeoffs -- how important is accuracy vs. speed? Those kinds of
things. Then it's an optimization problem. But, understood, maybe you
need to let people explore these things manually at first.
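
To make the "input the tradeoffs, then optimize" idea concrete, here is a minimal sketch in plain Python (not Mahout code). The `Candidate` class, the `pick_candidate` function, and the two weights are hypothetical names made up for illustration. It assumes each algorithm has already been scored offline, with an error figure and a rebuild time, and that lower is better for both.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One algorithm from the basket, with metrics measured offline (hypothetical)."""
    name: str
    error: float    # e.g. RMSE on held-out data; lower is better
    seconds: float  # time to rebuild the model; lower is better


def pick_candidate(candidates, accuracy_weight, speed_weight):
    """Return the candidate with the lowest weighted, normalized cost."""
    if not candidates:
        raise ValueError("no candidates to choose from")
    # Normalize each metric by its maximum so the weights are comparable.
    max_error = max(c.error for c in candidates) or 1.0
    max_seconds = max(c.seconds for c in candidates) or 1.0

    def cost(c):
        return (accuracy_weight * c.error / max_error
                + speed_weight * c.seconds / max_seconds)

    return min(candidates, key=cost)


basket = [
    Candidate("slow_but_accurate", error=0.9, seconds=10.0),
    Candidate("fast_but_rough", error=1.0, seconds=1.0),
]
accuracy_first = pick_candidate(basket, accuracy_weight=1.0, speed_weight=0.0)
speed_first = pick_candidate(basket, accuracy_weight=0.0, speed_weight=1.0)
```

With this shape, business users only set the two weights; the choice of algorithm and its internals stays on the engineering side.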
On Tue, Apr 10, 2012 at 2:21 PM, Saikat Kanjilal <[EMAIL PROTECTED]> wrote:
> The question really is: what are some proven approaches for measuring the quality of a set of algorithms currently being used for clustering/classification?
> Thinking about this some more, we also need to be able to regenerate models as soon as the business users tweak the weights associated with features inside a feature vector. We need to figure out an efficient way to tie this into our online workflow, which could show updated recommendations every few hours.
> When I say picking an algorithm on the fly, I mean that we need to continuously test our basket of algorithms against a new set of training data and determine offline which of the algorithms to use at that moment to regenerate our recommendations.
>> Date: Tue, 10 Apr 2012 14:08:17 -0500
>> Subject: Re: Evaluation of recommenders
>> From: [EMAIL PROTECTED]
>> To: [EMAIL PROTECTED]
>> Picking an algorithm 'on the fly' is almost surely not realistic --
>> well, I am not sure what eval process you would run in milliseconds.
>> But it's also unnecessary; you usually run evaluations offline on
>> training/test data that reflects real input, and then, the resulting
>> tuning should be fine for that real input that comes the next day.
>> Is that really the question, or are you just asking about how you
>> measure the quality of clustering or a classifier?
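
As a rough illustration of the offline evaluation described above, here is a small sketch in plain Python rather than Mahout. All names (`train_test_split`, `fit_global_mean`, `fit_item_mean`, `rmse`, `evaluate_offline`) and the toy data are hypothetical, and the two "algorithms" are deliberately trivial stand-ins for a real basket. It assumes ratings arrive as (user, item, rating) tuples and uses RMSE on a held-out split as the quality metric.

```python
import math
import random
from collections import defaultdict


def train_test_split(ratings, test_fraction=0.2, seed=42):
    """Shuffle deterministically and hold out a fraction for testing."""
    shuffled = ratings[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def fit_global_mean(train):
    """Baseline: always predict the mean of all training ratings."""
    mean = sum(r for _, _, r in train) / len(train)
    return lambda user, item: mean


def fit_item_mean(train):
    """Predict each item's mean rating, falling back to the global mean."""
    global_mean = sum(r for _, _, r in train) / len(train)
    totals = defaultdict(lambda: [0.0, 0])
    for _, item, r in train:
        totals[item][0] += r
        totals[item][1] += 1
    means = {item: s / n for item, (s, n) in totals.items()}
    return lambda user, item: means.get(item, global_mean)


def rmse(predict, test):
    """Root mean squared error of a predictor on held-out ratings."""
    return math.sqrt(sum((predict(u, i) - r) ** 2 for u, i, r in test) / len(test))


def evaluate_offline(fitters, ratings, test_fraction=0.2, seed=42):
    """Train every candidate on the same split and score it on the same test set."""
    train, test = train_test_split(ratings, test_fraction, seed)
    return {name: rmse(fit(train), test) for name, fit in fitters.items()}


# Toy data: item "x" is always rated 5, item "y" is always rated 1.
ratings = [(u, "x", 5) for u in range(10)] + [(u, "y", 1) for u in range(10)]
scores = evaluate_offline(
    {"global_mean": fit_global_mean, "item_mean": fit_item_mean}, ratings
)
best = min(scores, key=scores.get)
```

The winner of a run like this is what you would deploy until the next offline evaluation; nothing here has to happen at request time.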
>> On Tue, Apr 10, 2012 at 10:41 AM, Saikat Kanjilal <[EMAIL PROTECTED]> wrote:
>> > Hi everyone, We're looking at building out some clustering and classification algorithms using Mahout, and we also want to build performance metrics around each of these algorithms as we go down the path of choosing the best model in an iterative closed feedback loop (i.e. our business users manipulate weights for each attribute of our feature vectors -> we use these changes to asynchronously regenerate a model using the appropriate clustering/classification algorithms -> we replenish our online component with the newly recalculated data for fresh recommendations). Our end goal is to have a basket of algorithms and use a set of performance metrics to pick the right algorithm on the fly. Has anyone done this type of analysis before? If so, which approaches have worked well and which haven't when it comes to measuring the "quality" of each of the recommendation algorithms?
>> > Regards