Siddharth Tiwari 2012-08-20, 16:37
What are the steps to use Mahout K-means on plain text? We have a huge amount of database server logs. How can I use Mahout to cluster similar ones? Please help. Thank you.
*------------------------*
Cheers !!!
Siddharth Tiwari
Have a refreshing day !!! "Every duty is holy, and devotion to duty is the highest form of worship of God."
"Maybe other people will try to limit me but I don't limit myself"
Jeff Eastman 2012-08-20, 19:52
Siddharth,
Have you looked at examples/bin/cluster-reuters.sh? It is a good example of clustering normal text.
Siddharth Tiwari 2012-08-20, 21:40
Hi Jeff
I did see it, but it was of no help :( I want to understand how I should prepare my text so that it can be used with it. Can you please guide me a bit, as I am a newbie here?
Paritosh Ranjan 2012-08-21, 04:39
If I look at cluster-reuters.sh, I see that the following mahout commands are executed (in sequence):
seqdirectory: Generate sequence files (of Text) from a directory
seq2sparse: Generate sparse vectors from the Text sequence files
kmeans/fkmeans/dirichlet: Run the respective clustering algorithm
clusterdump: Dump the cluster output to text
I am sure that if you explore these commands, you will at least be able to move ahead.
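Since the open question is how to prepare the logs, here is a minimal sketch of one possible preparation step ahead of seqdirectory. It assumes that each log line should become one document, and that seqdirectory treats each text file in the input directory as one document. The function name, the file naming scheme, and the sample log lines are hypothetical and are not part of Mahout or of cluster-reuters.sh.

```python
import os
import tempfile


def split_log_into_documents(log_path, out_dir):
    """Write each non-empty log line to its own text file in out_dir.

    Assumption: one log line equals one document to be clustered.
    Returns the number of documents written.
    """
    os.makedirs(out_dir, exist_ok=True)
    count = 0
    with open(log_path, encoding="utf-8", errors="replace") as log:
        for line in log:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            # Hypothetical naming scheme: entry-00000000.txt, entry-00000001.txt, ...
            doc_path = os.path.join(out_dir, f"entry-{count:08d}.txt")
            with open(doc_path, "w", encoding="utf-8") as doc:
                doc.write(line + "\n")
            count += 1
    return count


if __name__ == "__main__":
    # Demo on a small made-up log; real input would be your database server log.
    work_dir = tempfile.mkdtemp()
    sample_log = os.path.join(work_dir, "db.log")
    with open(sample_log, "w", encoding="utf-8") as f:
        f.write("ERROR connection timeout\n\nWARN slow query\n")
    docs_dir = os.path.join(work_dir, "docs")
    written = split_log_into_documents(sample_log, docs_dir)
    print(written, "documents written to", docs_dir)
```

The resulting directory would then be the input to the seqdirectory step listed above. For very large logs, one file per line may be impractical, so treat this only as an illustration of the shape of the input.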