Re: Frequent itemset mining
戴清灏 2011-12-02, 05:54
For a sequential implementation, fpgrowth.java is probably the place to start.
For a parallel implementation, look at pfpgrowth.java. There are 5 steps in total, and 4 of them are MapReduce steps.

Sent from my mobile phone

On 2011-12-2 at 12:48 PM, "Dave Fry" <[EMAIL PROTECTED]> wrote:

> That would be fantastic, thank you!
>
> In the meantime, can you direct me to where in the source I should start
> looking? (i.e., which class would be the entry point I'm looking for?)
>
> 2011/12/1 戴清灏 <[EMAIL PROTECTED]>
>
> > There is actually a lack of documentation on frequent pattern mining
> > usage. You are not the first one to ask for it.
> > I will be pleased to write some for that usage, since I've read almost
> > all of the source code for it.
> >
> > On Friday, December 2, 2011, Dave Fry wrote:
> >
> > > Hi! I apologize for the newbie question, I'm just getting started with
> > > Mahout.
> > >
> > > On the "Overview" page on Mahout's website:
> > > https://cwiki.apache.org/confluence/display/MAHOUT/Overview
> > >
> > > It mentions these as the four primary targeted use cases for Mahout:
> > > 1) Recommendation mining takes users' behavior and from that tries to
> > > find items users might like.
> > > 2) Clustering takes e.g. text documents and groups them into groups of
> > > topically related documents.
> > > 3) Classification learns from existing categorized documents what
> > > documents of a specific category look like and is able to assign
> > > unlabelled documents to the (hopefully) correct category.
> > > 4) Frequent itemset mining takes a set of item groups (terms in a query
> > > session, shopping cart content) and identifies which individual items
> > > usually appear together.
> > >
> > > But, based on the Mahout documentation that I've read through, I can't
> > > seem to find a clear mapping from that use case description to where in
> > > the Mahout distribution I should be looking. I've found several leads
> > > for use case #1, but #4 seems to be a bit of a mystery (and searches for
> > > "frequent itemset mining" don't seem to lead me to where I need to go.)
> > >
> > > Basically, I'm looking for the answer to the question "Which items
> > > appear most often with item X in browse histories and shopping carts?".
> > > (As opposed to "Based on what I know about your preferences, here are
> > > the items that I predict you would be most likely to browse/add to your
> > > cart".)
> > >
> > > Any help is appreciated!
> > > Thanks,
> > > Dave
> >
> > --
> > Regards,
> > Q
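
To make the question in the thread concrete ("Which items appear most often with item X?"), here is a minimal brute-force sketch in plain Java. It is not Mahout's FP-Growth code and does not use any Mahout API; the class name `CoOccurrence`, the methods `countWith` and `topWith`, and the sample baskets are all made up for illustration. It only shows the kind of answer frequent itemset mining is meant to give, on data small enough to fit in memory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only: a brute-force co-occurrence count,
// not the FP-Growth algorithm and not part of Mahout.
public class CoOccurrence {

    /** For each other item, counts how many baskets contain it together with target. */
    public static Map<String, Integer> countWith(String target, List<List<String>> baskets) {
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> basket : baskets) {
            Set<String> items = new HashSet<>(basket); // ignore duplicates within one basket
            if (!items.contains(target)) {
                continue; // this basket says nothing about the target item
            }
            for (String item : items) {
                if (!item.equals(target)) {
                    counts.merge(item, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    /** Returns up to k co-occurring items, highest count first, ties broken alphabetically. */
    public static List<String> topWith(String target, List<List<String>> baskets, int k) {
        Map<String, Integer> counts = countWith(target, baskets);
        List<String> items = new ArrayList<>(counts.keySet());
        items.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return items.subList(0, Math.min(k, items.size()));
    }

    public static void main(String[] args) {
        // Made-up shopping carts, one list per cart.
        List<List<String>> baskets = Arrays.asList(
            Arrays.asList("bread", "butter", "milk"),
            Arrays.asList("bread", "butter"),
            Arrays.asList("bread", "jam"),
            Arrays.asList("milk", "jam"));
        System.out.println(topWith("bread", baskets, 2)); // prints [butter, jam]
    }
}
```

A single pass like this is fine for one item X on a small dataset. Finding all frequent itemsets across a large number of carts is the case the sequential and parallel implementations mentioned in the reply are aimed at.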