Approaches for combining multiple types of item data for user-user similarity
Ken Krugler 2012-07-03, 22:20
I'm curious what approaches are recommended for generating user-user similarity, when I've got two (or more) distinct types of item data, both of which are fairly large.
E.g. let's say I had a set of users where I knew both (a) what books they had bought on Amazon, and (b) what YouTube videos they had watched.
For each user, I want to find the 10 most similar other users.
- I could create two separate models (one per data type), find the nearest 30 users for each user in each model, and combine the two result lists (maybe with weighting).
- I could toss all of the data into one model, and use a preference value of < 1.0 for whichever type of data is less important.
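To make the two options concrete, here is a rough sketch in plain Python rather than Mahout. It assumes preferences are held in nested dicts (user -> item -> value) and uses cosine similarity; the function names, the 0.5 weight, and the data layout are all made up for illustration, and a brute-force scan like this would not scale to large data.

```python
from collections import defaultdict
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def top_n(user, prefs, n):
    """Return the n users most similar to `user` as (other, score) pairs."""
    mine = prefs.get(user, {})  # user may be missing from this data set
    scores = [(o, cosine(mine, vec)) for o, vec in prefs.items() if o != user]
    scores.sort(key=lambda p: (-p[1], p[0]))  # best score first, ties by name
    return scores[:n]


def combine_models(user, books, videos, w_books=1.0, w_videos=0.5,
                   candidates=30, n=10):
    """Option 1: two separate models, neighbor scores merged by weighted sum."""
    combined = defaultdict(float)
    for other, score in top_n(user, books, candidates):
        combined[other] += w_books * score
    for other, score in top_n(user, videos, candidates):
        combined[other] += w_videos * score
    ranked = sorted(combined.items(), key=lambda p: (-p[1], p[0]))
    return ranked[:n]


def merged_model(user, books, videos, w_videos=0.5, n=10):
    """Option 2: one model, with video preferences scaled by w_videos (< 1.0)."""
    merged = {}
    for u in set(books) | set(videos):
        # Tag item IDs with their type so book and video IDs cannot collide.
        vec = {("book", i): v for i, v in books.get(u, {}).items()}
        vec.update({("video", i): w_videos * v
                    for i, v in videos.get(u, {}).items()})
        merged[u] = vec
    return top_n(user, merged, n)
```

One difference the sketch makes visible: in option 1 a user who only shows up in one of the two candidate lists still gets a (lower) combined score, while in option 2 the weight changes the geometry of the vectors themselves, so the less important data type also contributes less to each user's vector length.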
Any other suggestions? Input on the above two approaches?
custom big data solutions & training
Hadoop, Cascading, Mahout & Solr