Nishant Chandra
2011-11-27, 09:44
Paritosh Ranjan
2011-11-27, 09:57
Nishant Chandra
2011-11-27, 10:09
Paritosh Ranjan
2011-11-27, 10:14
Nishant Chandra
2011-11-27, 10:17
Paritosh Ranjan
2011-11-27, 10:25
Lee Carroll
2011-11-27, 10:26
Isabel Drost
2011-11-27, 16:54
Ted Dunning
2011-11-28, 00:14
Nishant Chandra
2011-11-28, 13:21
Ted Dunning
2011-11-28, 18:26
-
Sequential Pattern Mining
Nishant Chandra 2011-11-27, 09:44
Hi,

Is there any implementation for Sequential Pattern Mining in Mahout? I see there is an implementation of Sequential Pattern Mining but I am unsure if it can be used for my use case.

Thanks,
Nishant
-
Re: Sequential Pattern Mining
Paritosh Ranjan 2011-11-27, 09:57
Can you tell us something about your use case?

Paritosh
-
Re: Sequential Pattern Mining
Nishant Chandra 2011-11-27, 10:09
Use case is related to purchase transactions.

Sample data set:

Customer ID   Acquisition time     Products
101           30 June 2007         Product 1
101           12 August 2007       Product 3
101           20 December 2008     Product 4
102           10 September 2008    Product 3
102           12 September 2008    Product 5
102           20 January 2009      Product 5
.....

Sample rule:

Rule ID   Consequent   Antecedents                Support %   Confidence %
Rule 1    Product 4    Product 1 then Product 3   57.1        75.0

I want to identify rules such as: after acquiring product 1 and then product 3, customers have an increased likelihood (75%) of purchasing product 4 next.

Thanks,
Nishant
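To make the support and confidence columns concrete, here is a minimal Python sketch that scores one sequential rule against the six sample rows. It is not Mahout code. The function names are made up for illustration, and the definitions are assumptions: support is taken as the share of customers whose history contains antecedents and consequent in order, and confidence as that count divided by the number of customers whose history contains the antecedents in order. "In order" here means an ordered subsequence, not necessarily adjacent purchases.

```python
from collections import defaultdict
from datetime import datetime

# The six sample rows from the message: (customer_id, date, product).
PURCHASES = [
    (101, "30 June 2007", "Product 1"),
    (101, "12 August 2007", "Product 3"),
    (101, "20 December 2008", "Product 4"),
    (102, "10 September 2008", "Product 3"),
    (102, "12 September 2008", "Product 5"),
    (102, "20 January 2009", "Product 5"),
]


def build_histories(purchases):
    """Group purchases per customer and order each history by date."""
    by_customer = defaultdict(list)
    for customer, date_text, product in purchases:
        when = datetime.strptime(date_text, "%d %B %Y")
        by_customer[customer].append((when, product))
    return {c: [p for _, p in sorted(rows)] for c, rows in by_customer.items()}


def contains_in_order(history, pattern):
    """True if pattern is an ordered (not necessarily adjacent) subsequence."""
    remaining = iter(history)
    # "item in remaining" consumes the iterator up to the first match,
    # so later pattern items can only match later purchases.
    return all(item in remaining for item in pattern)


def rule_stats(histories, antecedents, consequent):
    """Return (support, confidence) under the assumed definitions above."""
    n_antecedent = sum(contains_in_order(h, antecedents) for h in histories.values())
    n_full = sum(
        contains_in_order(h, antecedents + [consequent]) for h in histories.values()
    )
    support = n_full / len(histories)
    confidence = n_full / n_antecedent if n_antecedent else 0.0
    return support, confidence


histories = build_histories(PURCHASES)
print(rule_stats(histories, ["Product 1", "Product 3"], "Product 4"))
```

The sample is truncated, so the numbers this produces for two customers are not expected to match the 57.1 and 75.0 shown in the rule table.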
-
Re: Sequential Pattern Mining
Paritosh Ranjan 2011-11-27, 10:14
Have you checked out the recommendation algorithms? I think this can be easily done using them.

Paritosh
-
Re: Sequential Pattern Mining
Nishant Chandra 2011-11-27, 10:17
Are you talking about CF (collaborative filtering)? Can you please explain a bit?

To be clear, for my use case, temporal sequence is important.

Nishant
-
Re: Sequential Pattern Mining
Paritosh Ranjan 2011-11-27, 10:25
I am talking about the implementations available in Mahout where you can find similarity between users by analyzing a data model and then recommend items based on that, if this can solve your problem. I see this implemented in Mahout, and it's very easy to use.
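For readers unfamiliar with the idea, the following is a minimal Python sketch of user-based recommendation: compute a similarity between users from their purchase sets, then score unseen items by the similarity of the users who bought them. It is not Mahout's API. The function names, the choice of Jaccard similarity, and customer 103 are all assumptions added for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of purchased items."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target, purchases, top_n=3):
    """Rank items the target has not bought by summed neighbour similarity.

    purchases maps a user id to the set of items that user bought.
    """
    owned = purchases[target]
    scores = {}
    for user, items in purchases.items():
        if user == target:
            continue
        similarity = jaccard(owned, items)
        for item in items - owned:
            scores[item] = scores.get(item, 0.0) + similarity
    # Highest score first; ties broken by item name for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, score in ranked[:top_n] if score > 0]


purchases = {
    101: {"Product 1", "Product 3", "Product 4"},
    102: {"Product 3", "Product 5"},
    103: {"Product 1", "Product 3"},  # hypothetical customer, not in the thread
}
print(recommend(103, purchases))
```

Nothing in this sketch looks at purchase order or dates, which is the limitation raised earlier in the thread.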
-
Re: Sequential Pattern Mining
Lee Carroll 2011-11-27, 10:26
I don't want to go off topic, but perhaps it's possible to use CF by creating super product IDs for combined purchases:

101 1
101 2

implies

101 3

where 3 is a product which represents the purchase of 1 and 2 in a time frame. Users / items associated with 3 have a pref for item / users ...

However, I know it's not the original focus of your question, so maybe there is a much better way.

lee c
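The super product ID idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the 90-day default window, and the `"1>3"` naming scheme for composite IDs are all invented here, and the composite is made order-specific so that "1 then 3" and "3 then 1" stay distinct.

```python
from datetime import date, timedelta
from itertools import combinations


def add_super_products(history, window_days=90):
    """Return the user's item ids plus order-specific composite ids.

    history is a list of (purchase_date, product_id) for one user.
    A composite "a>b" is emitted when b was bought after a within the window.
    """
    items = {str(product) for _, product in history}
    ordered = sorted(history)  # earliest purchase first
    window = timedelta(days=window_days)
    for (d1, p1), (d2, p2) in combinations(ordered, 2):
        if p1 != p2 and d2 - d1 <= window:
            items.add(f"{p1}>{p2}")
    return items


history_101 = [
    (date(2007, 6, 30), 1),
    (date(2007, 8, 12), 3),
    (date(2008, 12, 20), 4),
]
print(sorted(add_super_products(history_101)))
```

The resulting ids can then be fed to a recommender as if they were ordinary items, which is what lets a method that ignores order pick up some sequence information.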
-
Re: Sequential Pattern Mining
Isabel Drost 2011-11-27, 16:54
On 27.11.2011 Nishant Chandra wrote:
> I want to identify rules such as: after acquiring product 1 and then
> product 3, customers have an increased likelihood
> (75%) of purchasing product 4 next.

What is your goal with discovering these rules? Assuming what you want is implementing a feature that recommends items to customers they are likely to buy: Did you check the fpgrowth implementation already? Though it does not cover the temporal aspect you mention, it might still be of value for you as it is capable of discovering items that are typically purchased together.

If you would rather personalize your offerings to the preferences of each of your customers, you might be better off taking a closer look at the collaborative filtering implementations of Mahout.

Isabel
-
Re: Sequential Pattern Mining
Ted Dunning 2011-11-28, 00:14
There are several good ways to deal with this. The idea of super-products, which are composite features that are derived from history, is a good one. I would recommend that you limit the number of such super features by first finding which products cooccur within a reasonable time window more than you would expect.

The cooccurrence analysis system in Mahout can be misused for this analysis by building one document per user per sliding window period. This is a bit flawed since the sliding windows overlap and thus the appearance of a transaction in multiple documents is not really an indication of independent appearances. Also, the intermediate window documents are much larger than you might like and they won't take ordering into account.

A better approach is to adapt the current code. The basic data you need to collect are:

- the number of times each product appears in a single user's transaction history before another product

- the number of times each product appears in a transaction history after another product

- the number of times product i appears after product j

You can then use the LLR (log-likelihood ratio) code in Mahout to find cases where a product sequence occurs anomalously often. You can then use a Bloom filter or similar data structure to analyze histories so that you emit product and super-products as input to a conventional collaborative filtering analysis.

The second major approach to this problem is to build a separate classifier for each product of interest. I wouldn't recommend that if you have lots of possible products, but this can work very well if you have a reasonably small number of products (say a few hundred or thousand) that you might be about to recommend.
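The three counts and the LLR step described above can be sketched in Python as follows. This is not Mahout's code. The function names are hypothetical, the 2x2 table layout is one reasonable reading of the description, and `llr` is the standard G-squared statistic for a 2x2 contingency table written out directly.

```python
import math
from collections import Counter


def ordered_pair_counts(histories):
    """Collect the three counts from time-ordered purchase histories."""
    before = Counter()  # times a product appears before some other product
    after = Counter()   # times a product appears after some other product
    pair = Counter()    # pair[(j, i)]: times product i appears after product j
    for history in histories:
        for a in range(len(history)):
            for b in range(a + 1, len(history)):
                if history[a] != history[b]:
                    before[history[a]] += 1
                    after[history[b]] += 1
                    pair[(history[a], history[b])] += 1
    return before, after, pair


def llr(k11, k12, k21, k22):
    """G-squared log-likelihood ratio for a 2x2 table of counts."""
    n = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    cells = (
        (k11, rows[0], cols[0]),
        (k12, rows[0], cols[1]),
        (k21, rows[1], cols[0]),
        (k22, rows[1], cols[1]),
    )
    # Empty cells contribute nothing (the limit of k * log k as k -> 0).
    return 2.0 * sum(k * math.log(k * n / (r * c)) for k, r, c in cells if k > 0)


def sequence_scores(histories):
    """Score every observed ordered pair (j, i), read as "i after j"."""
    before, after, pair = ordered_pair_counts(histories)
    n = sum(pair.values())
    scores = {}
    for (j, i), k11 in pair.items():
        k12 = before[j] - k11        # j followed by something other than i
        k21 = after[i] - k11         # i preceded by something other than j
        k22 = n - k11 - k12 - k21    # all remaining ordered pairs
        scores[(j, i)] = llr(k11, k12, k21, k22)
    return scores


print(sequence_scores([["a", "b", "c"], ["a", "b"]]))
```

A high score only says the pair departs from independence. Checking that the observed count is above the expected one, rather than below it, is a separate step before treating the sequence as a candidate super-product.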
-
Re: Sequential Pattern Mining
Nishant Chandra 2011-11-28, 13:21
Hi Ted,

I don't understand the composite features and super-products that you mentioned. Please explain a bit. Are you pointing to a specific data mining method?

Thanks,
Nishant
-
Re: Sequential Pattern Mining
Ted Dunning 2011-11-28, 18:26
OK.

Suppose that people who buy milk and chocolate chip cookies are good prospects for buying life insurance, but buying either product alone is not a strong indicator. You can build a special feature milk_and_chocolate_chip_cookies in addition to the separate features for the individual milk and chocolate_chip_cookies products.

The composite features can be order specific or not.

The potential number of such features is huge. Clearly, a first cut is to limit your consideration to composites that actually appear. You probably should limit the number that you consider even more stringently than that.
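A small Python sketch of the order-insensitive variant follows, including both pruning steps: only composites that actually appear are counted, and a minimum count cuts the list further. The function names and the `min_count` threshold are assumptions for illustration. Because the pair is put in alphabetical order to get one canonical name, the composite comes out as `chocolate_chip_cookies_and_milk` rather than the spelling used in the message.

```python
from collections import Counter
from itertools import combinations


def composite_name(a, b):
    """Canonical name for an unordered pair of products."""
    first, second = sorted((a, b))
    return f"{first}_and_{second}"


def select_composites(histories, min_count):
    """Keep composites seen in at least min_count customer histories."""
    counts = Counter()
    for history in histories:
        for a, b in combinations(sorted(set(history)), 2):
            counts[composite_name(a, b)] += 1
    return {name for name, n in counts.items() if n >= min_count}


def features(history, allowed):
    """Individual products plus any allowed composite present in the history."""
    items = set(history)
    result = set(items)
    for a, b in combinations(sorted(items), 2):
        name = composite_name(a, b)
        if name in allowed:
            result.add(name)
    return result


histories = [
    ["milk", "chocolate_chip_cookies"],
    ["milk", "chocolate_chip_cookies", "bread"],
    ["milk", "bread"],
    ["eggs"],
]
allowed = select_composites(histories, min_count=2)
print(sorted(features(histories[1], allowed)))
```

Raising `min_count`, or replacing the raw count with a statistical test, is one way to limit the composites "even more stringently" than mere appearance.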