On Wed, May 30, 2012 at 3:54 AM, Ted Dunning <[EMAIL PROTECTED]> wrote:
> On Tue, May 29, 2012 at 5:36 AM, Geek Gamer <[EMAIL PROTECTED]> wrote:
>
>> ...
>> Do you have any reference paper or system showing how people have used it
>> to improve recommendation systems, and how people define context vectors
>> using extra information?
>>
>
> I don't have references at hand, but I remember that a google search got
> some good ones last time I looked.
>
>> A quick idea I had was to use LDA to build topic vectors and use them
>> as context vectors. Any thoughts on that?
>>
>
> That can work well.
>
>
>> RI seems to be a good candidate for contribution to mahout.
>>
>
> Could be. We are less than enthusiastic about purely theoretical
> contributions. If you use it seriously, that would count for a lot, but
> just writing it for the sake of writing it isn't as likely to find a
> positive reception.
Right now it is pretty much theoretical; I am doing some groundwork,
but since I am using Mahout heavily I would end up using it in Mahout.
I'll keep the group updated.
>
> On Wed, May 23, 2012 at 12:11 PM, Ted Dunning <[EMAIL PROTECTED]> wrote:
>> > RI, per se, probably won't help that much with the coincidence problem.
>> >
>> > The Mahout math libraries would help a lot with a random indexing
>> > implementation.
>> >
>> > Kitenga has some very nice random indexing support. See
>> >
>> > http://www.kitenga.com/
>> >
>> > They offer commercial software, but you get what you pay for.
>> >
>> > On Wed, May 23, 2012 at 12:18 AM, Mugoma Joseph Okomba <[EMAIL PROTECTED]> wrote:
>> >
>> >>
>> >> Thanks for all the comments. They give us an idea of what direction to
>> >> take.
>> >>
>> >> We have been zeroing in on the idea of Random Indexing, but R.I seems to
>> >> be missing from Mahout currently. Are there future plans for implementing
>> >> R.I in Mahout? Are there any libraries out there that would be useful for R.I?
>> >>
>> >> On Sun, May 20, 2012 9:47 am, Ted Dunning wrote:
>> >> > The basic reasoning here is that any cooccurrence measure without
>> >> > smoothing will have zero overlap whenever all the others have zero
>> >> > overlap. This seems to be the root of your problem. The solution is to
>> >> > increase overlap or increase data.
>> >> >
>> >> > The problem with correlation based approaches is that they overstate
>> >> > coincidental overlaps. Fixing that can't fix the problem of no overlap.
>> >> >
>> >>
>> >>
>> >>
>>
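
For anyone following the thread, here is a rough sketch of what random indexing over user/item interactions could look like. It is plain Python, not Mahout code, and nothing in it comes from the posts above: the names (index_vector, context_vectors, cosine), the dimensionality of 512, and the 8 nonzero entries per index vector are all assumptions picked for illustration. Each user gets a sparse random +1/-1 index vector, and an item's context vector is the sum of the index vectors of the users who interacted with it.

```python
import math
import random
from collections import defaultdict

DIM = 512      # size of the reduced space (assumed value, tune for real data)
NONZERO = 8    # number of +1/-1 entries per index vector (assumed value)


def index_vector(key, dim=DIM, nonzero=NONZERO):
    """Return a sparse ternary index vector as {position: +1 or -1}.

    Seeding the generator with the key makes the vector reproducible,
    so index vectors never need to be stored.
    """
    rng = random.Random(key)
    positions = rng.sample(range(dim), nonzero)
    return {p: rng.choice((-1, 1)) for p in positions}


def context_vectors(interactions):
    """Build one context vector per item from (user, item) pairs.

    An item's context vector is the sum of the index vectors of all
    users who interacted with that item.
    """
    vectors = defaultdict(lambda: defaultdict(float))
    for user, item in interactions:
        for pos, sign in index_vector(user).items():
            vectors[item][pos] += sign
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


if __name__ == "__main__":
    # Hypothetical toy data: items A and B share all users, item C shares none.
    data = [("u1", "A"), ("u2", "A"), ("u3", "A"),
            ("u1", "B"), ("u2", "B"), ("u3", "B"),
            ("u4", "C"), ("u5", "C"), ("u6", "C")]
    vecs = context_vectors(data)
    print("A~B:", cosine(vecs["A"], vecs["B"]))
    print("A~C:", cosine(vecs["A"], vecs["C"]))
```

In this toy setup, items with the same users come out with a cosine close to 1, while items with no users in common come out close to orthogonal. That is consistent with the remark quoted above that RI by itself does not cure the no-overlap case. Extra context information (for example topic vectors) would have to be added into the sums to change that.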