|
Razon, Oren
2012-07-04, 15:08
Sean Owen
2012-07-04, 15:39
Razon, Oren
2012-07-05, 07:05
Sean Owen
2012-07-05, 08:09
Sebastian Schelter
2012-07-05, 08:12
Razon, Oren
2012-07-05, 08:38
Sebastian Schelter
2012-07-05, 09:45
Razon, Oren
2012-07-05, 12:22
Sean Owen
2012-07-05, 12:58
Dmitriy Lyubimov
2012-07-05, 16:17
Razon, Oren
2012-07-06, 15:07
Razon, Oren
2012-07-06, 20:39
Sean Owen
2012-07-06, 20:52
Dmitriy Lyubimov
2012-07-06, 21:26
Ted Dunning
2012-07-06, 21:32
Dmitriy Lyubimov
2012-07-06, 21:43
Razon, Oren
2012-07-07, 18:35
|
-
A bunch of SVD questions...Razon, Oren 2012-07-04, 15:08
Hi,
I'm exploring Mahout's parallel SVD implementation on Hadoop (ALS), and I would like to clarify a few things:
1. How do you recommend the top K items with this job? Does the job factorize the rating matrix and then compute a predicted rating for each cell, so that a recommendation request only has to retrieve the top K items by predicted value for the user? Or does it only factorize the matrix and require some online logic when a recommendation is requested?
2. As far as I know, applying an SVD technique requires first filling in all empty cells in the rating matrix (with the average rating, for example). Is that done during the ALS job (and if so, how are the cells filled), or should it be done as a preprocessing step?
3. As I understand it, SVD recommenders are used to predict a user's implicit preference, which lets you recommend the top K items (sorted by predicted value in descending order). Could this be applied to a binary (explicit) dataset, where my rating matrix contains only 1\0?
4. From some reading, I found that timeSVD++, developed by Yehuda Koren, is considered the superior SVD approach for recommenders. Is there any parallel implementation of it on top of Hadoop? I found this proposal: https://issues.apache.org/jira/browse/MAHOUT-371
What is its status? Has it been reviewed already? Is it stable? Does anyone have experience with it?

Thanks,
Oren

---------------------------------------------------------------------
Intel Electronics Ltd.
This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies.
-
Re: A bunch of SVD questions...Sean Owen 2012-07-04, 15:39
SVD is not the same thing as ALS, though both factor matrices. There is no distributed SVD-based recommender, though there is a distributed SVD, and you could use it as part of a recommender system. I assume you are talking about ALS.

The version of ALS in Mahout operates on the sparse rating matrix. You make recommendations by multiplying the two factor matrices back together, which gives you a dense, approximate version of the original sparse rating matrix, with the blanks filled in. The k largest new entries in a row are your recs for one user. Of course you never actually compute that complete product -- it's way too big. You just recreate one row to make recs for one user.

You definitely can't fill in average ratings in the rating matrix at scale -- that makes it dense and too big. It is conceptually a good idea, and that's why the literature talks about it, but I do not think it is practical. At best you can subtract the mean of the existing entries, *from the existing entries only*. This makes the fact that empty cells are conceptually 0 make more sense.

(This is also why I like the Yehuda Koren formulation of ALS, where you are not predicting ratings but a positive interaction score. There the fact that empty means 0 is just fine. No bias terms needed.)

SVD/ALS are used to factor matrices and reconstruct an approximation of the original *that is more complete*. The input values can come from whatever you want -- implicit etc. 1/0 data actually makes more sense as input for something like ALS. The paper I mention above has a slightly better generalization of that.

The MAHOUT-371 code wasn't finished and almost surely will not be. It is not so much a different SVD as massaging the input to incorporate things like time info. I personally am not sure that the SVD is the best approach for recommenders, mostly on the grounds that it is hard to scale because it is doing something more complicated.
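To make the "recreate one row" step concrete, here is a minimal sketch in plain Python. It is not Mahout's code: the function name `top_k_for_user`, the rank-2 factor values, and the item ids are all made up for illustration. It scores each unseen item as the dot product of the user's factor vector with the item's factor vector and keeps the k highest.

```python
# Illustrative sketch only: names and numbers are assumptions, not Mahout's implementation.

def top_k_for_user(user_vec, item_factors, seen, k):
    """Rebuild one row of the approximate rating matrix and return the k best unseen item ids."""
    scores = []
    for item_id, item_vec in item_factors.items():
        if item_id in seen:
            continue  # skip items the user has already interacted with
        score = sum(u * v for u, v in zip(user_vec, item_vec))
        scores.append((score, item_id))
    scores.sort(reverse=True)  # highest predicted score first
    return [item_id for _, item_id in scores[:k]]


# Hypothetical rank-2 factors for one user and four items.
user_vec = [1.0, 0.5]
item_factors = {
    "a": [0.9, 0.1],
    "b": [0.2, 0.8],
    "c": [1.0, 1.0],
    "d": [0.0, 0.1],
}
recs = top_k_for_user(user_vec, item_factors, seen={"a"}, k=2)
print(recs)
```

Only one user's row is ever materialized, so the cost per request is one pass over the item factors rather than a full user-by-item product.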
-
RE: A bunch of SVD questions...Razon, Oren 2012-07-05, 07:05
Thanks for the answer Sean!
Some clarifications...

I'm not sure I understand this sentence: "At best you can subtract the mean of the existing entries, *from the existing entries only*. This makes the fact that empty cells are conceptually 0 make more sense." Why does it make empty cells conceptually 0?

If I understand you correctly, you're saying that ALS works better if I express preferences as 1\0 instead of the actual preference values. Is that right?

If I do need to predict a user's preference as a number (a prediction question rather than a ranking question), can ALS do that if I work with the real preference values?

Thanks,
Oren
-
Re: A bunch of SVD questions...Sean Owen 2012-07-05, 08:09
The math is going to implicitly treat non-existent cells in a sparse matrix as 0. It's not something that happens by virtue of your input; it just is so. I'm saying that it's a lot easier if your input makes sense under that assumption.

If your input is ratings on a scale of 1-5, this is bad: it means that any unrated item is completely hated. "Exists" (1) or "not exists" (0) obviously fits better.

Any matrix factorization approach will help you predict ratings, if that's what the original matrix held. Predicting ratings is not the only way to rank recommendations, and it is sometimes not possible (for example, when you don't have ratings). I think it's OK to think of the input in more general terms; it doesn't have to be ratings at all.
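The point about missing cells acting as 0 can be illustrated with a small sketch of mean-centering applied to the observed entries only. The nested-dict layout and the name `center_observed` are assumptions made for this example; nothing here is taken from Mahout. After centering, an absent cell (an implicit 0) reads as "average" rather than "completely hated".

```python
# Illustrative sketch only: the data layout and function name are assumptions.

def center_observed(ratings):
    """Subtract the mean of the observed ratings from the observed ratings only.

    `ratings` maps user -> {item: rating}; missing cells stay missing, so the
    matrix remains sparse and its implicit zeros now mean "average rating".
    """
    values = [r for row in ratings.values() for r in row.values()]
    mean = sum(values) / len(values)
    centered = {
        user: {item: r - mean for item, r in row.items()}
        for user, row in ratings.items()
    }
    return mean, centered


# Hypothetical 1-5 ratings: three observed cells out of a 2x2 matrix.
ratings = {"u1": {"a": 5.0, "b": 3.0}, "u2": {"b": 1.0}}
mean, centered = center_observed(ratings)
print(mean, centered)
```

To turn a prediction from the factored, centered matrix back into a rating on the original scale, the stored mean would be added back.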
-
Re: A bunch of SVD questions...Sebastian Schelter 2012-07-05, 08:12
1. You can use org.apache.mahout.cf.taste.hadoop.als.RecommenderJob to compute top-N recommendations from the factorization in batch. For each user, you have to compute the product of the item feature matrix and the user's feature vector, and then pick the highest-ranking unknown items.
2. The semantics of the empty cells depend on the type of data you have. For explicit feedback (ratings), you cannot fill the empty cells because you simply don't know what rating the user would have given. For implicit feedback, a cell usually holds the count of some observed behavior, such as clicks. Here empty cells are by definition 0 (no clicks observed); however, the factorization has to be modified to give 'lower confidence' to these data points.
3. There are two 'flavors' of the ALS factorization implemented in Mahout, one for implicit feedback data and the other for explicit feedback data. I suggest you look into the papers they are based on:

"Large-scale Parallel Collaborative Filtering for the Netflix Prize"
http://www.hpl.hp.com/personal/Robert_Schreiber/papers/2008%20AAIM%20Netflix/netflix_aaim08(submitted).pdf
"Collaborative Filtering for Implicit Feedback Datasets"
http://research.yahoo.com/pub/2433

I also uploaded the slides from a lecture I gave in a scalable data mining class at our department; they might also be helpful in understanding the topic:

http://www.slideshare.net/sscdotopen/latent-factor-models-for-collaborative-filtering

Best,
Sebastian
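As a rough illustration of the 'lower confidence' idea for implicit feedback, the sketch below follows the formulation in "Collaborative Filtering for Implicit Feedback Datasets": a raw count r becomes a binary preference p (1 if r > 0, else 0) and a confidence c = 1 + alpha * r. The function name and the default alpha = 40 are choices made for this example, and it is not meant to mirror Mahout's solver line for line.

```python
# Illustrative sketch only: the function name and default alpha are assumptions.

def preference_and_confidence(count, alpha=40.0):
    """Map an observed interaction count to (preference, confidence).

    An empty cell (count 0) gets preference 0 with the minimum confidence 1;
    confidence grows linearly with the number of observed interactions.
    """
    preference = 1.0 if count > 0 else 0.0
    confidence = 1.0 + alpha * count
    return preference, confidence


for count in (0, 1, 3):
    print(count, preference_and_confidence(count))
```

In the factorization, each squared error term is weighted by its confidence, so the many zero cells still count but pull on the solution far less than cells with observed behavior.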
-
RE: A bunch of SVD questions...Razon, Oren 2012-07-05, 08:38
Thanks for the answer Sebastian!
You said Mahout has two 'flavors' of the ALS factorization, one for implicit and the other for explicit feedback. Can you point me to which code does what? On the Hadoop side I can see only one ALS implementation...
-
Re: A bunch of SVD questions...Sebastian Schelter 2012-07-05, 09:45
There is only one implementation, because both 'flavors' of ALS have the same computation shape. The default mode is to factorize explicit feedback data; if you specify the option '--implicitFeedback', it switches to the algorithm that works on implicit feedback data. Internally, the different solvers from org.apache.mahout.math.als are used, if you want to take a deeper look.

Best,
Sebastian
-
RE: A bunch of SVD questions...Razon, Oren 2012-07-05, 12:22
Thanks.
I had some other questions in mind, so I will use this thread...

1. The item cold start problem: I can handle the user cold start problem by trying new items for the user, chosen by popularity or at random. But what options do I have for overcoming item cold start when using ALS or a co-occurrence matrix?
2. What about applying a matrix factorization technique (ALS \ SVD) as a preprocessing step? That is, after doing the factorization, use the new lower-dimensional item matrix, for example, to compute similarity between items. Would that be a good idea?
3. I'm looking for a huge dataset to try my recommender on. I'm searching for something even bigger than last.fm \ libimseti. Can anyone recommend such a dataset?

Thanks,
Oren
-
Re: A bunch of SVD questions...Sean Owen 2012-07-05, 12:58
Unless you are recommending users to items too, you don't have a cold start problem for items. If you are, you can apply the same technique. Using fold-in, you can create a reasonable user or item vector from the time you have the very first interaction for the user or item, which solves most of the cold start problem without resorting to simple top-10 lists.

You can certainly compute user-user and item-item similarity on the factored matrices. It's a good approximation and is faster. Cosine measure works fine in this space.

Look at finding someone's bootleg copy of the Netflix data set, or the KDD cup data set. I am using StackOverflow and Wikipedia dumps as good sources of a big data set though you need to massage it to get it into a usable form.

Sean

On Thu, Jul 5, 2012 at 3:22 PM, Razon, Oren <[EMAIL PROTECTED]> wrote:
> Thanks.
> I had some other questions in mind so I will use this post...
>
> 1. Cold start for items problem - With the user cold start problem I can handle by trying new items for the user based on popularity \ randomly.
> But what options do I have when using the ALS \ co-occurrence matrix to overcome cold start for item?
>
> 2. What about applying a matrix factorization technique (ALS \ SVD) as a preprocessing.
> Meaning, after doing the factorization, use the new lower Item matrix for example to compute item similarity between items? Will it be a good idea?
>
> 3. I'm looking for a huge data set to try my recommender on. I'm searching something which is even bigger than last.fm\ libimseti can anyone recommend on such dataset?
>
> Thanks,
> Oren
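As a rough illustration of the fold-in and cosine-similarity ideas in Sean's reply, here is a minimal Python sketch. It assumes that fold-in means solving a small regularized least-squares problem for the new user's vector while the item factors stay fixed, which is one common reading and is not spelled out in the thread. The function names (`solve`, `fold_in`, `cosine`), the `lam` regularization parameter, and the toy factor values are all hypothetical; none of this is Mahout's API.

```python
import math


def solve(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fold_in(item_factors, ratings, lam=0.1):
    """Assumed fold-in: with item factors Y fixed, solve (Y^T Y + lam * I) x = Y^T r
    using only the items the new user has interacted with."""
    k = len(next(iter(item_factors.values())))
    a = [[lam if i == j else 0.0 for j in range(k)] for i in range(k)]
    b = [0.0] * k
    for item, rating in ratings.items():
        y = item_factors[item]
        for i in range(k):
            b[i] += y[i] * rating
            for j in range(k):
                a[i][j] += y[i] * y[j]
    return solve(a, b)


def cosine(u, v):
    """Cosine similarity between two factor vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


# Toy item factors of rank 2 (hypothetical values).
item_factors = {"i1": [1.0, 0.0], "i2": [0.0, 1.0], "i3": [1.0, 1.0]}

# A brand-new user with a single interaction gets a usable vector right away.
new_user = fold_in(item_factors, {"i1": 4.0}, lam=1.0)
print(new_user)

# Item-item similarity computed directly in the factor space.
print(cosine(item_factors["i1"], item_factors["i3"]))
```

The same `fold_in` call would work for a new item by swapping roles, that is, by passing user factors and the ratings the item received.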
-
RE: A bunch of SVD questions...Dmitriy Lyubimov 2012-07-05, 16:17
The cold start problem is usually best attacked if there is also content information about users initially -- demographics, user profile or something. Otherwise, yes, you are pretty much limited to an average user profile to start trials.

There are various ways to combine factorization and content-side techniques into a single model. I have a paper reference somewhere around if you think user content info applies to your case.

On Jul 5, 2012 5:22 AM, "Razon, Oren" <[EMAIL PROTECTED]> wrote:
> Thanks.
> I had some other questions in mind so I will use this post...
>
> 1. Cold start for items problem - With the user cold start problem I can handle by trying new items for the user based on popularity \ randomly.
> But what options do I have when using the ALS \ co-occurrence matrix to overcome cold start for item?
>
> 2. What about applying a matrix factorization technique (ALS \ SVD) as a preprocessing.
> Meaning, after doing the factorization, use the new lower Item matrix for example to compute item similarity between items? Will it be a good idea?
>
> 3. I'm looking for a huge data set to try my recommender on. I'm searching something which is even bigger than last.fm\ libimseti can anyone recommend on such dataset?
>
> Thanks,
> Oren
>
> -----Original Message-----
> From: Sebastian Schelter [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, July 05, 2012 12:46
> To: [EMAIL PROTECTED]
> Subject: Re: A bunch of SVD questions...
>
> There is only one implementation, because both 'flavors' of ALS have the same computation shape. The default mode is to factorize explicit feedback data and if you specify the option '--implicitFeedback', it will switch to the algorithm that works on implicit feedback data.
> Internally the different solvers from org.apache.mahout.math.als are used if you want to have a deeper look.
>
> Best,
> Sebastian
>
> On 05.07.2012 10:38, Razon, Oren wrote:
> > Thanks for the answer Sebastian!
> > You said mahout has two 'flavors' of the ALS factorization, one for implicit and the other for explicit.
> > Can you direct me which code do what?
> > Cause on the Hadoop part I can see only one ALS implementation...
> >
> > -----Original Message-----
> > From: Sebastian Schelter [mailto:[EMAIL PROTECTED]]
> > Sent: Thursday, July 05, 2012 11:12
> > To: [EMAIL PROTECTED]
> > Subject: Re: A bunch of SVD questions...
> >
> > 1. You can use org.apache.mahout.cf.taste.hadoop.als.RecommenderJob to compute top-N recommendations from the factorization in batch. For each user, you have to compute the product of the item feature matrix and his feature vector and pick the highest ranking unknown items after that.
> >
> > 2. The semantics of the empty cells depends on the type of data you have. For explicit feedback (ratings), you cannot fill the empty cells because you simply don't know what rating the user would have given. For implicit feedback a cell usually holds the count of some observed behavior like clicks e.g. Here empty cells are by definition 0 (no clicks observed), however the factorization has to be modified to give 'lower confidence' to these datapoints.
> >
> > 3. There are two 'flavors' of the ALS factorization implemented in Mahout, one for implicit feedback data, the other for explicit feedback data, I suggest you look into the papers they are based on:
> >
> > "Large-scale Parallel Collaborative Filtering for the Netflix Prize"
> > http://www.hpl.hp.com/personal/Robert_Schreiber/papers/2008%20AAIM%20Netflix/netflix_aaim08(submitted).pdf
> > "Collaborative Filtering for Implicit Feedback Datasets"
> > http://research.yahoo.com/pub/2433
> >
> > I also uploaded the slides from a lecture I gave at a scalable data mining class at our department, they might also be helpful in understanding the topic:
> >
> > http://www.slideshare.net/sscdotopen/latent-factor-models-for-collaborative-filtering
> >
> > Best,
> > Sebastian
> >
> > 2012/7/4 Razon, Oren <[EMAIL PROTECTED]>:
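One possible reading of "an average user profile" in a factorization setting is to score items against the mean of the learned user vectors until a new user has interactions of their own. The Python sketch below assumes that reading. It also shows the top-N step described in the quoted text (dot product of each item vector with the user vector, then pick the highest-scoring unseen items). The names `mean_vector` and `top_n` and all factor values are hypothetical and are not part of Mahout.

```python
def mean_vector(vectors):
    """Component-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def top_n(user_vec, item_factors, n, seen=()):
    """Rank unseen items by dot product with the user vector, highest first.
    Ties are broken by item id so the result is deterministic."""
    scores = {
        item: sum(a * b for a, b in zip(user_vec, vec))
        for item, vec in item_factors.items()
        if item not in seen
    }
    return sorted(scores, key=lambda item: (-scores[item], item))[:n]


# Toy factors of rank 2 (hypothetical values).
user_factors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
item_factors = {"a": [1.0, 0.0], "b": [0.0, 2.0], "c": [2.0, 1.0]}

# A user with no history is scored with the average of all known user vectors.
average_user = mean_vector(user_factors)
print(top_n(average_user, item_factors, n=2))
```

Once the user has a few interactions, the average vector can be replaced by one derived from those interactions.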
-
RE: A bunch of SVD questions...Razon, Oren 2012-07-06, 15:07
Hi Dmitriy,
Thank you for the answer. I would be happy to read such a paper.
-
RE: A bunch of SVD questions...Razon, Oren 2012-07-06, 20:39
Thanks Sean,

I accidentally continued this thread under the thread you opened, so I'm moving back to my thread :)

I will rephrase the question I asked there. Let's say that, as part of my held-out test, my model finds that user u2's connection to i1 has a strength of 28.94, to i2 17.9, and to i3 4.5. The hidden ratings I have are on a scale of 1-5 (or even binary 0\1, for example).

Now, how could I evaluate the ranking I produced for u2 if I only predicted the connection strength he has with each item in order to rank the items, while my data is on a different scale? In other words, the problem here is not prediction but ranking, so I guess it should use different measures than prediction measures...

Am I missing something?

I'm familiar with precision \ recall \ ROC \ Lift and so on, but I'm not sure I understand how I should use them here.
-
Re: A bunch of SVD questions...Sean Owen 2012-07-06, 20:52
That's right, in the formulation you are referring to you are not predicting the original input values, so you can't compare them with RMSE or something.

To test precision / recall you hold out some of the top-rated items (these are the "relevant results"), and see how many come back in the recommendations. F1 is based on precision/recall. (For boolean data you pick random input to hold out and the test is sort of flawed by nature.)

nDCG captures more, as it scores higher for putting relevant results higher. It's a somewhat better metric. And so on for ROC -- should be fairly direct to apply once you know what your positive / negative classes are supposed to be.

Mahout has some code for computing this sort of thing which you can directly apply or lift and adapt.
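To make the ranking-based evaluation concrete, here is a small Python sketch of precision and recall at k and a binary-relevance nDCG. It reuses the three scores from the question (28.94, 17.9, 4.5); only their order matters, so their scale never has to match the 1-5 ratings. The held-out relevant set and the function names are hypothetical, and this is not Mahout's evaluator code.

```python
import math


def precision_recall_at_k(recommended, relevant, k):
    """Precision and recall of the top-k recommendations against held-out relevant items."""
    hits = sum(1 for item in recommended[:k] if item in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def ndcg_at_k(recommended, relevant, k):
    """Binary-relevance nDCG: a hit at 0-based position p contributes 1 / log2(p + 2)."""
    dcg = sum(
        1.0 / math.log2(pos + 2)
        for pos, item in enumerate(recommended[:k])
        if item in relevant
    )
    ideal = sum(1.0 / math.log2(pos + 2) for pos in range(min(k, len(relevant))))
    return dcg / ideal if ideal > 0 else 0.0


# Predicted connection strengths for user u2, taken from the question above.
scores = {"i1": 28.94, "i2": 17.9, "i3": 4.5}
ranked = sorted(scores, key=scores.get, reverse=True)

# Hypothetical held-out set: items u2 rated highly that were hidden from training.
held_out = {"i2"}

print(precision_recall_at_k(ranked, held_out, k=2))
print(ndcg_at_k(ranked, held_out, k=2))
```

Because nDCG discounts hits by position, placing the held-out item second instead of first lowers the score even though precision and recall at 2 stay the same.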
-
Re: A bunch of SVD questions...Dmitriy Lyubimov 2012-07-06, 21:26
These guys show one way to combine content info with dyadic data factorization, which is pretty close to what I used. Unfortunately I don't have a free download link for the paper (it is in the ACM library, or Ted knows a cheaper arrangement to pull it off).

Agarwal, Chen: "Regression-based Latent Factor Models"

I am not sure if one can construct something similar in Mahout, but I am sure it can be prototyped very easily in Java. (I previously hacked a lot of the Mahout framework to achieve a similar effect.)
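As a very rough illustration of combining content information with learned factors, the Python sketch below regresses each latent dimension of already-learned user vectors on a single user feature, then uses the fitted lines to guess a factor vector for a user with no interactions. This is a simplified two-step stand-in and should not be read as the algorithm from the paper named above. The single numeric feature, the function names, and all values are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x with one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    return mean_y - slope * mean_x, slope


def fit_factor_regression(features, user_factors):
    """Fit one (intercept, slope) pair per latent dimension."""
    k = len(user_factors[0])
    return [fit_line(features, [u[d] for u in user_factors]) for d in range(k)]


def predict_factors(model, feature):
    """Predict a factor vector for a user known only by the content feature."""
    return [intercept + slope * feature for intercept, slope in model]


# Hypothetical data: one numeric feature per known user, plus learned rank-2 factors.
features = [0.0, 1.0, 2.0]
user_factors = [[1.0, 0.0], [3.0, 1.0], [5.0, 2.0]]

model = fit_factor_regression(features, user_factors)
print(predict_factors(model, 3.0))
```

The predicted vector can then be scored against item factors in the usual way until real interactions are available.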
-
Re: A bunch of SVD questions...Ted Dunning 2012-07-06, 21:32
I think that Dmitriy is referring to this:

http://www.deepdyve.com/lp/association-for-computing-machinery/regression-based-latent-factor-models-1ebJXMCs0K

On Fri, Jul 6, 2012 at 2:26 PM, Dmitriy Lyubimov <[EMAIL PROTECTED]> wrote:
> (it is in ACM library, or Ted knows a cheaper arrangement to pull it off).
-
Re: A bunch of SVD questions...Dmitriy Lyubimov 2012-07-06, 21:43
yes, that's the one. Thank you, Ted.
-
RE: A bunch of SVD questions...Razon, Oren 2012-07-07, 18:35
Thank you both