Pramit Vamsi 2010-08-29, 17:25
Hi,
I am new to mahout and looking for ideas to implement "Customers with Similar Searches Purchased" and "What Do Customers Ultimately Buy After Viewing This Item?" style recommendations on amazon.com. Is it possible with the current Taste implementation? Any pointers will be helpful. Thanks, Pramit
Sean Owen 2010-08-29, 17:39
These are slightly different from conventional collaborative filtering, but I think solutions are available.
"Customers with Similar Searches Purchased"
To apply user-based CF you need a notion of user-user similarity. You could think of this as a sub-problem, where users are users and searches are items, and apply any of the standard UserSimilarity measures to compute user-user similarity.
Then, yes this becomes user-based collaborative filtering, but without ratings. You can just use GenericUserBasedRecommender with your UserSimilarity.
That just gets you started -- I think there's room to optimize and improve on that basic start, such as implementing a custom UserNeighborhood.
"What Do Customers Ultimately Buy After Viewing This Item"
This isn't really CF, but association rule mining. You might look at the "Frequent Pattern Mining" support here instead.
Ted Dunning 2010-08-29, 18:26
These are examples of what I call cross-recommendation where you have user x item1 and user x item2 data and you want item1 => item2 recommendations.
All of the standard techniques apply (user-based, LLR cooccurrence, SVD, latent factor models), but you have to rejigger things here and there.
Sean, can Mahout's recommendation system do this cross recommendation?
Sean Owen 2010-08-29, 18:37
Yes, this is a simpler problem. You just want to find which items are most similar to a given item, for some definition of 'similar'. GenericItemBasedRecommender has a mostSimilarItems() method that saves you the trouble of computing this by hand, and any ItemSimilarity function you like can be used.
Pramit Vamsi 2010-08-30, 14:47
I have some understanding now. So given 2 matrices user * (page view/search term) and user * (purchased item), how do you connect these 2 matrices given that I can define the user or item sim methods?
Also, can the second use case be solved with CF, or is association mining needed?
Pramit
Ted Dunning 2010-08-30, 16:44
Metaphorically speaking, if user x search term is A and user x item is B, then transpose(B) * B is item x item, transpose(A) * A is search term x search term, and transpose(B) * A is item x search term. Depending on what kind of recommendation system you are using, the actual mechanics will be different, but the shape of the computation will still be fairly similar to the matrix multiplication.
In database terms, you are joining on user id, but I find the database view of this less helpful for generating good insight into what is happening.
I gave a talk, the last half of which is on this topic. The talk is recorded here: http://fora.tv/2009/10/14/ACM_Data_Mining_SIG_Ted_Dunning
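To make the shapes concrete, here is a minimal sketch in plain Python (not Mahout code) of the transpose-times product for two matrices that share a user dimension. The function name transpose_times and the toy data are made up for illustration; a product of the form transpose(A) * B comes out as search term x item.

```python
def transpose_times(a, b):
    """Return transpose(a) * b for two matrices given as lists of rows.

    Both matrices must have one row per user, in the same user order.
    """
    if len(a) != len(b):
        raise ValueError("a and b need the same number of rows (users)")
    result = [[0] * len(b[0]) for _ in range(len(a[0]))]
    for row_a, row_b in zip(a, b):      # one user at a time: the "join on user id"
        for i, a_val in enumerate(row_a):
            if a_val == 0:
                continue                # this user never touched column i of a
            for j, b_val in enumerate(row_b):
                result[i][j] += a_val * b_val
    return result


# Hypothetical toy data: 3 users, 2 search terms, 3 items.
A = [  # user x search term
    [1, 0],
    [1, 1],
    [0, 1],
]
B = [  # user x purchased item
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 1],
]

term_by_item = transpose_times(A, B)   # search term x item (2 x 3)
item_by_item = transpose_times(B, B)   # item x item (3 x 3)
term_by_term = transpose_times(A, A)   # search term x search term (2 x 2)
```

Each entry of term_by_item counts the users who both used that search term and bought that item, which is the raw cooccurrence signal a cross-recommender would then score or threshold.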
Jake Mannix 2010-08-30, 17:34
And in fact, a product like transpose(A) * B is one map-reduce job in Mahout: A.times(B) for DistributedRowMatrices A and B performs exactly transpose(A) * B.
-jake
Sean Owen 2010-08-30, 15:52
I would take the user-search "matrix" and create one DataModel from it. Define a LogLikelihoodSimilarity object on top of that. That's your user-user similarity measure.
Then the user-purchase "matrix" forms the basis of another DataModel that is actually plugged into a GenericUserBasedRecommender with the similarity measure above (which actually drives off different data).
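The two-model idea can be sketched outside Mahout in a few lines of plain Python. This is only an illustration under stated assumptions: it uses Jaccard overlap as a stand-in for LogLikelihoodSimilarity, a simple nearest-N neighborhood, and made-up users, search terms, and items. None of the names below are Mahout APIs.

```python
def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user, searches, purchases, n_neighbors=2, how_many=3):
    """Recommend items bought by the users whose searches look most like `user`'s.

    searches:  dict user -> set of search terms    (drives similarity only)
    purchases: dict user -> set of purchased items (drives the recommendations)
    """
    # Rank the other users by search similarity and keep the nearest ones.
    scored = [
        (jaccard(searches[user], terms), other)
        for other, terms in searches.items()
        if other != user
    ]
    nearest = sorted(scored, reverse=True)[:n_neighbors]
    neighbors = [(sim, other) for sim, other in nearest if sim > 0]

    # Score each item the user has not bought by summed neighbor similarity.
    already = purchases.get(user, set())
    scores = {}
    for sim, other in neighbors:
        for item in purchases.get(other, set()) - already:
            scores[item] = scores.get(item, 0.0) + sim
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:how_many]]


# Hypothetical toy data.
searches = {
    "u1": {"headphones", "laptop"},
    "u2": {"headphones", "laptop", "soap"},
    "u3": {"soap"},
    "u4": {"laptop"},
}
purchases = {
    "u1": {"laptop bag"},
    "u2": {"laptop bag", "earbuds"},
    "u3": {"shampoo"},
    "u4": {"mouse", "earbuds"},
}

suggestions = recommend("u1", searches, purchases)
```

The point of the design is that similarity is computed from one data set (searches) while candidates and scores come from another (purchases), mirroring a recommender whose similarity measure drives off different data than its own DataModel.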
Off the top of my head, that ought to work out.
The second situation might just be really simple. If you have view and purchase data, simply count up and find which purchase was most frequent among all purchases that followed from a view of the current item. That's simple, perhaps oversimplified for your context.
I can think of ways to construe this as a CF problem but I think it just adds complication with no value. It's not really CF.
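The counting approach for the second case might look like the following plain-Python sketch. It assumes a time-ordered event log of (user, action, item) tuples and treats "followed from a view" as "the next purchase that user made after the view"; both the log format and that definition are assumptions for illustration, as is the toy data.

```python
from collections import Counter, defaultdict


def bought_after_viewing(events):
    """Map each viewed item to a Counter of what was bought next.

    events: iterable of (user, action, item) tuples in time order,
            where action is "view" or "buy".
    """
    pending = defaultdict(set)      # user -> items viewed since their last purchase
    counts = defaultdict(Counter)   # viewed item -> Counter of purchased items
    for user, action, item in events:
        if action == "view":
            pending[user].add(item)
        elif action == "buy":
            # Credit this purchase to every item the user viewed beforehand.
            for viewed in pending[user]:
                counts[viewed][item] += 1
            pending[user].clear()
    return counts


# Hypothetical toy log.
events = [
    ("u1", "view", "camera A"),
    ("u1", "view", "camera B"),
    ("u1", "buy", "camera B"),
    ("u2", "view", "camera A"),
    ("u2", "buy", "camera B"),
    ("u3", "view", "camera A"),
    ("u3", "buy", "camera A"),
    ("u3", "view", "camera B"),
]

counts = bought_after_viewing(events)
top_for_camera_a = counts["camera A"].most_common(1)
```

In this toy log, most people who viewed camera A went on to buy camera B, so camera B is what would be shown under "ultimately buy after viewing" for camera A.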
Chris Bates 2010-08-30, 15:20
I really need to take a look at the Mahout code (I haven't had a chance yet), so I'm not sure if this type of rec is possible, but what I would do is something like this:

Enumeration of Search Terms:
1 Bath soap
2 Headphones
3 Computer laptop

Enumeration of Users:
userid 1
userid 2
userid 3
userid 4

Joined Matrix
UserID  TotalCountOfSearchItem  SearchItemID  LocalCountSearchItem
1       15                      1             3
1       7                       2             1
2       15                      1             5
2       7                       2             4
3       10                      3             5
3       15                      1             7
4       10                      3             5
4       7                       2             2

I wrote a blog post about Naive Bayes for classification tasks that describes this type of layout here: http://www.thedatascientist.com/2010/05/22/how-i-would-use-the-google-prediction-api/
But this type of data layout is algorithm agnostic, so you can use it for whatever you need to do. It's just a matter of getting the data into a form that Mahout will recognize (my guess).
Chris
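As a rough sketch of how the joined layout above could be produced, the plain-Python snippet below starts from per-user search counts and derives the total column. It assumes TotalCountOfSearchItem means the count of that search item summed over all users, which is what the numbers above suggest (3 + 5 + 7 = 15 for item 1); the function name joined_rows is made up.

```python
from collections import Counter

# (user id, search item id) -> how often that user searched for it,
# copied from the LocalCountSearchItem column of the joined matrix.
local_counts = {
    (1, 1): 3, (1, 2): 1,
    (2, 1): 5, (2, 2): 4,
    (3, 3): 5, (3, 1): 7,
    (4, 3): 5, (4, 2): 2,
}


def joined_rows(local_counts):
    """Return (user, total count of item, item, local count) rows."""
    # Assumed meaning of the total column: sum of local counts over all users.
    totals = Counter()
    for (_, item), n in local_counts.items():
        totals[item] += n
    return [(user, totals[item], item, n) for (user, item), n in local_counts.items()]


rows = joined_rows(local_counts)
```

Each resulting tuple lines up with one row of the joined matrix, so the same rows can be written out in whatever file format the downstream tool expects.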