-
Mahout 0.4 seems recommend user's existed items to user.
han henry 2011-01-14, 06:13
Hi all,
I have an issue: Mahout 0.4 seems to recommend items to a user that the user already has.
I remember that Mahout used to skip a user's existing items when generating recommendations.
But I cannot find the logic for skipping existing items in Mahout 0.4.
Can anyone confirm this, or tell me where to find the logic for skipping existing items?
Best Regards,
-
Re: Mahout 0.4 seems recommend user's existed items to user.
Sebastian Schelter 2011-01-14, 08:21
Hi,
Which recommender are you talking about? The distributed recommender does not do this. I checked it and will add a test for that to our unit tests.
--sebastian
-
Re: Mahout 0.4 seems recommend user's existed items to user.
han henry 2011-01-14, 08:43
Hi Sebastian,
I mean this one:
http://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/item/RecommenderJob.java
-
Re: Mahout 0.4 seems recommend user's existed items to user.
Sebastian Schelter 2011-01-14, 09:03
Hi Han,
I extended the unit test in org.apache.mahout.cf.taste.hadoop.item.RecommenderJobTest.testCompleteJob() to explicitly check for that behavior and everything seems fine.
Can you provide some input data where you see this happening?
--sebastian
-
Re: Mahout 0.4 seems recommend user's existed items to user.
han henry 2011-01-14, 09:17
Hi Sebastian,
My data is in production and is very large, so I'm sorry that I cannot give you the input data.
But we can try to review the code.
The initial version of the co-occurrence algorithm had logic to skip a user's existing items.
Best Regards,
-
Re: Mahout 0.4 seems recommend user's existed items to user.
Sean Owen 2011-01-14, 09:23
Look at ItemFilterAsVectorAndPrefsReducer. This does what you are looking for.
-
Re: Mahout 0.4 seems recommend user's existed items to user.
han henry 2011-01-14, 09:36
Thank you Sean and sebastian :)
-
Re: Mahout 0.4 seems recommend user's existed items to user.
han henry 2011-01-14, 10:19
Hi Sean and Sebastian,
We have two types of preferences:
1) Preferences for items that the user does not want to see. We store these in filterFile.
2) All preferences (including those in 1), which are also used to calculate similarity.
There are three kinds of items that we must not recommend to a user:
#1, Invalid or expired items. We store these items in itemSFile.
#2, Items the user is not interested in. We store these user,item pairs in filterFile.
#3, Items the user already has (items already in the user's preferences).
ItemFilterAsVectorAndPrefsReducer seems to make those items get skipped in the last step, so we handle #1 and #2 in the last step (AggregateAndRecommendReducer.java <http://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/item/AggregateAndRecommendReducer.java>), but I have not found the logic to skip #3.
Am I right?
Best Regards,
-
Re: Mahout 0.4 seems recommend user's existed items to user.
Sean Owen 2011-01-14, 10:24
ItemFilterAsVectorAndPrefsReducer does #3.
You can always post-process the recommendations however you like and ignore whatever items you want.
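The post-processing idea mentioned above could look roughly like the following. This is only a minimal sketch, not Mahout code: the class RecommendationPostFilter and the method dropExcluded are hypothetical names, and it assumes the recommendations have already been read into a list of item IDs and that the items to ignore (invalid, unwanted, or already owned) are available as a set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RecommendationPostFilter {

    // Hypothetical helper, not part of Mahout: drops every recommended item ID
    // that appears in the excluded set and keeps the original ranking order.
    static List<Long> dropExcluded(List<Long> recommendedItemIds, Set<Long> excludedItemIds) {
        List<Long> kept = new ArrayList<>();
        for (Long itemId : recommendedItemIds) {
            if (!excludedItemIds.contains(itemId)) {
                kept.add(itemId);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up item IDs, in ranked order, for one user.
        List<Long> recommended = Arrays.asList(101L, 102L, 103L, 104L);
        // Items to ignore for this user, e.g. expired or already owned.
        Set<Long> excluded = new HashSet<>(Arrays.asList(102L, 104L));
        System.out.println(dropExcluded(recommended, excluded)); // prints [101, 103]
    }
}
```

Because the filter runs on the job's output, it does not depend on how the recommender itself treats already preferred items.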
-
Re: Mahout 0.4 seems recommend user's existed items to user.
han henry 2011-01-14, 11:09
Got it. That's an easy and efficient way.
Thanks,
-
Re: Mahout 0.4 seems recommend user's existed items to user.
Sebastian Schelter 2011-01-14, 12:39
Hi Han,
It's hard to see from the sources how the users' already preferred items (#3) are excluded from the final results, but it's definitely done.
I'll walk you through the code:
In SimilarityMatrixRowWrapperMapper.map() we map all similar items for each item as a vector. Notice that the similarity value of each item to itself is set to NaN here.
When AggregateAndRecommendReducer computes the final recommendations, it receives a PrefAndSimilarityColumnWritable for each item preferred by the user. Those similarity vectors and preference values are used to compute the weighted sum that gives the prediction value for each item to recommend.
For each item that has already been preferred by the user, we can be sure that the NaN value from above is added to its sum, which makes the sum NaN too. Finally, all NaN predictions are explicitly filtered in AggregateAndRecommendReducer.writeRecommendedItems().
--sebastian
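The NaN mechanism described in the walkthrough can be illustrated with a small standalone sketch. It is not the Mahout implementation: the class NaNExclusionSketch, its methods, and the data are hypothetical, and the normalized weighted sum used here (sum of preference times similarity, divided by the sum of similarities) is an assumption about the exact formula. The only point is that a NaN self-similarity poisons the sum for every item the user already prefers, so a final NaN check removes those items.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NaNExclusionSketch {

    // Simplified stand-in for the prediction step (assumed formula):
    // prediction(i) = sum_j(pref_j * sim(j, i)) / sum_j(sim(j, i)),
    // where j ranges over the items the user has already preferred.
    static Map<String, Double> predict(Map<String, Double> userPrefs,
                                       Map<String, Map<String, Double>> similarityColumns) {
        Map<String, Double> numerators = new LinkedHashMap<>();
        Map<String, Double> denominators = new LinkedHashMap<>();
        for (Map.Entry<String, Double> pref : userPrefs.entrySet()) {
            Map<String, Double> column = similarityColumns.get(pref.getKey());
            if (column == null) {
                continue; // no similarity data for this preferred item
            }
            for (Map.Entry<String, Double> sim : column.entrySet()) {
                // Any NaN term makes the running sum NaN from then on.
                numerators.merge(sim.getKey(), pref.getValue() * sim.getValue(), Double::sum);
                denominators.merge(sim.getKey(), sim.getValue(), Double::sum);
            }
        }
        Map<String, Double> predictions = new LinkedHashMap<>();
        for (String item : numerators.keySet()) {
            predictions.put(item, numerators.get(item) / denominators.get(item));
        }
        return predictions;
    }

    // Mirrors the final filtering step: keep only items whose prediction is not NaN.
    static List<String> recommend(Map<String, Double> predictions) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Double> e : predictions.entrySet()) {
            if (!Double.isNaN(e.getValue())) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up data: the user already prefers items A and B.
        Map<String, Double> prefs = new LinkedHashMap<>();
        prefs.put("A", 5.0);
        prefs.put("B", 3.0);

        // Each column holds the similarities of one item, with NaN for the item itself.
        Map<String, Double> columnA = new LinkedHashMap<>();
        columnA.put("A", Double.NaN);
        columnA.put("B", 0.9);
        columnA.put("C", 0.1);
        Map<String, Double> columnB = new LinkedHashMap<>();
        columnB.put("B", Double.NaN);
        columnB.put("A", 0.9);
        columnB.put("C", 0.5);
        Map<String, Map<String, Double>> columns = new LinkedHashMap<>();
        columns.put("A", columnA);
        columns.put("B", columnB);

        // A and B get NaN predictions; only C survives the filter.
        System.out.println(recommend(predict(prefs, columns))); // prints [C]
    }
}
```

In this sketch every item the user has preferred contributes its own column, and that column contains the NaN self-similarity, so each preferred item is guaranteed to receive at least one NaN term.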
-
Re: Mahout 0.4 seems recommend user's existed items to user.
han henry 2011-01-17, 15:57
Hi Sebastian,
I reviewed the code today.
Assume that the output of the partialMultiply job is as follows:
context.write(key, vectorAndPrefs);
ItemA --> (([itemB,0.9],[itemC,0.1]), (user1,user2), (10,1))
ItemB --> (([itemA,0.9]), (user1,user2), (5,1))
This means that user1 already has itemA and itemB, yet itemA or itemB may still be recommended to user1.
Am I right?
Best Regards,
--Henry Han
-
Re: Mahout 0.4 seems recommend user's existed items to user.
Sebastian Schelter 2011-01-17, 16:55
It's true that already preferred items might be looked at in AggregateAndRecommendReducer, but the prediction for them will always be NaN, so they will be filtered out.
--sebastian