-
Partial Implementation of Random Forest
praneet mhatre 2011-03-29, 05:50
Hello all,

I very recently started working on Mahout. To get a feel for things, I was trying to run the sample implementation of Random Forest posted on the Wiki (https://cwiki.apache.org/confluence/display/MAHOUT/Partial+Implementation). However, even when I issue the exact same commands, I get an EOFException as follows:

Exception in thread "main" java.io.EOFException
    at java.io.DataInputStream.readFully(DataInputStream.java:180)
    at java.io.DataInputStream.readFully(DataInputStream.java:152)
    at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1457)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1435)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
    at org.apache.mahout.df.mapreduce.partial.Step0Job.parseOutput(Step0Job.java:145)
    at org.apache.mahout.df.mapreduce.partial.Step0Job.run(Step0Job.java:119)
    at org.apache.mahout.df.mapreduce.partial.PartialBuilder.parseOutput(PartialBuilder.java:115)
    at org.apache.mahout.df.mapreduce.Builder.build(Builder.java:338)
    at org.apache.mahout.df.mapreduce.BuildForest.buildForest(BuildForest.java:195)
    at org.apache.mahout.df.mapreduce.BuildForest.run(BuildForest.java:159)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.mahout.df.mapreduce.BuildForest.main(BuildForest.java:236)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
cloudera@cloudera-demo:~/Downloads/mahout-distribution-0.4$

Can you please tell me what the problem is? Also, can you please suggest the best way to get started? So far, I've been going through these examples and trying to run them on a pseudo cluster.

Thank you,
--
Praneet Mhatre
Graduate Student
Donald Bren School of ICS
University of California, Irvine
-
Partial Implementation of Random Forest
praneet mhatre 2011-03-29, 16:30
I think my previous mail did not get through.
-
Re : Partial Implementation of Random Forest
deneche abdelhakim 2011-03-31, 12:15
Hi Praneet,
I have fixed various bugs since 0.4. Could you try using the trunk and see whether it happens again?
-
Re: Re : Partial Implementation of Random Forest
praneet mhatre 2011-03-31, 21:34
Hi Deneche,

I used the trunk. I still encounter the same error. By the way, I am running Mahout on top of Cloudera's Linux image. I was just wondering if that has anything to do with the error.

Exception in thread "main" java.lang.IllegalStateException: java.io.EOFException
    at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterable.iterator(SequenceFileIterable.java:63)
    at org.apache.mahout.df.mapreduce.partial.Step0Job.parseOutput(Step0Job.java:142)
    at org.apache.mahout.df.mapreduce.partial.Step0Job.run(Step0Job.java:120)
    at org.apache.mahout.df.mapreduce.partial.PartialBuilder.parseOutput(PartialBuilder.java:115)
    at org.apache.mahout.df.mapreduce.Builder.build(Builder.java:324)
    at org.apache.mahout.df.mapreduce.BuildForest.buildForest(BuildForest.java:195)
    at org.apache.mahout.df.mapreduce.BuildForest.run(BuildForest.java:159)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.mahout.df.mapreduce.BuildForest.main(BuildForest.java:239)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
Caused by: java.io.EOFException
    at java.io.DataInputStream.readFully(DataInputStream.java:180)
    at java.io.DataInputStream.readFully(DataInputStream.java:152)
    at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1457)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1435)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
    at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.<init>(SequenceFileIterator.java:59)
    at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterable.iterator(SequenceFileIterable.java:61)
    ... 13 more
cloudera@cloudera-demo:~/Downloads/trunk$

Thanks,
--
Praneet Mhatre
Graduate Student
Donald Bren School of ICS
University of California, Irvine
-
Re: Re : Partial Implementation of Random Forest
deneche abdelhakim 2011-04-06, 04:14
There was a new bug in the code and I fixed it. Please try again after updating the code. I am also using Cloudera's Hadoop and it's running just fine.
-
RE: Re : Partial Implementation of Random Forest
ext-ranjit.chellappa...@... 2011-04-11, 13:58
Hi Deneche,

I used the latest Mahout code from the trunk, and while running BuildForest on the KDD dataset I am getting an EOFException. The exception is below:

Exception in thread "main" java.lang.IllegalStateException: java.io.EOFException
    at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterable.iterator(SequenceFileIterable.java:63)
    at org.apache.mahout.df.mapreduce.partial.Step0Job.parseOutput(Step0Job.java:142)
    at org.apache.mahout.df.mapreduce.partial.Step0Job.run(Step0Job.java:120)
    at org.apache.mahout.df.mapreduce.partial.PartialBuilder.parseOutput(PartialBuilder.java:115)
    at org.apache.mahout.df.mapreduce.Builder.build(Builder.java:324)
    at org.apache.mahout.df.mapreduce.BuildForest.buildForest(BuildForest.java:195)
    at org.apache.mahout.df.mapreduce.BuildForest.run(BuildForest.java:159)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.mahout.df.mapreduce.BuildForest.main(BuildForest.java:239)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
Caused by: java.io.EOFException
    at java.io.DataInputStream.readFully(DataInputStream.java:180)
    at java.io.DataInputStream.readFully(DataInputStream.java:152)
    at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1457)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1435)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
    at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.<init>(SequenceFileIterator.java:59)
    at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterable.iterator(SequenceFileIterable.java:61)
    ... 13 more

Any help in resolving the above error will be greatly appreciated.

Thanks and Regards,
Ranjit.C
-
RE: Re : Partial Implementation of Random Forest
deneche abdelhakim 2011-04-11, 21:06
Hmm, I will take a look and see what is causing this.
-
Re: Re : Partial Implementation of Random Forest
praneet mhatre 2011-04-11, 21:18
Me too. Used the latest code. Still the exact same error as before.
Thanks,
--
Praneet Mhatre
Graduate Student
Donald Bren School of ICS
University of California, Irvine
-
Re: Re : Partial Implementation of Random Forest deneche abdelhakim 2011-04-18, 18:14
OK, I was finally able to reproduce this bug. It appears when using Cloudera's distribution of Hadoop. Apparently this distribution contains some patches from Hadoop 0.21 that create a _SUCCESS file in the output path. The current code doesn't expect such a file, so it fails when it tries to parse it.
I tried the standard Hadoop 0.20 distribution and it's working just fine. So for now I think it's safe to just use the standard distribution.
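The failure mode described above can be sketched without Hadoop at all. The snippet below is only an illustration, not Mahout or Hadoop code: it assumes the marker file is zero-length and that the reader begins by reading a fixed-size header (a SequenceFile starts with the bytes `SEQ` plus a version byte). The class name `EmptyFileHeaderDemo`, the method `canReadHeader`, and the constant `HEADER_LENGTH` are hypothetical names made up for this sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Illustrative only: mimics a reader that insists on a fixed-size file header. */
final class EmptyFileHeaderDemo {

  // Assumption for this sketch: 3 magic bytes ('S','E','Q') plus 1 version byte.
  static final int HEADER_LENGTH = 4;

  /** Returns true if a full header could be read, false if the data ended first. */
  static boolean canReadHeader(byte[] fileContents) {
    DataInputStream in = new DataInputStream(new ByteArrayInputStream(fileContents));
    try {
      // readFully throws EOFException when fewer than HEADER_LENGTH bytes remain.
      in.readFully(new byte[HEADER_LENGTH]);
      return true;
    } catch (EOFException e) {
      return false;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    byte[] emptyMarker = new byte[0];             // stands in for a zero-length _SUCCESS file
    byte[] withHeader = {'S', 'E', 'Q', 6, 0, 0}; // starts with a plausible header
    System.out.println("empty marker readable: " + canReadHeader(emptyMarker));
    System.out.println("file with header readable: " + canReadHeader(withHeader));
  }
}
```

This matches the shape of the stack trace in this thread, where `DataInputStream.readFully` raises the `EOFException` from inside the `SequenceFile$Reader` initialization.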
-
Re: Re : Partial Implementation of Random Forest praneet mhatre 2011-04-18, 21:46
That would be really helpful. It'll save me the trouble of reverting to an older version of Hadoop! Please update us here after it is done.

Thank you,

On Mon, Apr 18, 2011 at 2:36 PM, Sean Owen <[EMAIL PROTECTED]> wrote:
> I can easily add a bit to this method call that will cause it to skip files
> and directories like .crc, _logs, etc. Seems like the right thing to do here
> as it's evidently causing a problem otherwise.
-
Re: Re : Partial Implementation of Random Forest Sean Owen 2011-04-18, 21:47
I committed my change just now -- try it out.
-
Re: Re : Partial Implementation of Random Forest Ted Dunning 2011-04-18, 22:04
Deneche,
My local map-reduce guy says that he doesn't see this in the CDH sources. Can you say which version of CDH you were using?
-
Re: Re : Partial Implementation of Random Forest praneet mhatre 2011-04-18, 22:37
Sean,
I still see the _SUCCESS file in my output path. And I'm sure that I used the latest trunk. I'm in class right now, so I'll repeat the whole process carefully again in some time. Just wanted to let you know.
-
Re: Re : Partial Implementation of Random Forest Ted Dunning 2011-04-18, 22:54
Praneet,
What version of CDH is that?
-
Re: Re : Partial Implementation of Random Forest praneet mhatre 2011-04-18, 22:57
It's CDH3
-
Re: Re : Partial Implementation of Random Forest deneche abdelhakim 2011-04-19, 05:15
Actually, that's the first thing I did, but I got another unexplained error that I didn't get with the standard Hadoop distribution. Until I fix the new error, I think it's safer to use the standard distribution.

--- On Mon 18.4.11, Sean Owen <[EMAIL PROTECTED]> wrote:
From: Sean Owen <[EMAIL PROTECTED]>
Subject: Re: Re : Partial Implementation of Random Forest
To: [EMAIL PROTECTED]
Date: Monday, April 18, 2011, 23:36

I can easily add a bit to this method call that will cause it to skip files and directories like .crc, _logs, etc. Seems like the right thing to do here as it's evidently causing a problem otherwise.
-
Re: Re : Partial Implementation of Random Forest deneche abdelhakim 2011-04-19, 05:21
It's hadoop-0.20.2-cdh3u0. For now I'm running it in standalone mode.
-
Re: Re : Partial Implementation of Random Forest Sean Owen 2011-04-19, 06:48
The _SUCCESS file has nothing to do with Mahout -- it's Cloudera behavior. My change would ignore the file. Deneche, if that doesn't fix it, well, I figure it's good policy anyway to apply the standard filter to ignore _logs, _SUCCESS, .crc, etc.
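The kind of filter described above might look roughly like the following. This is a sketch under assumptions, not the change that was actually committed: the class `OutputFileFilter` and its methods are hypothetical, and it works on plain file names so that it runs without Hadoop on the classpath. In Hadoop the same predicate would normally sit in an `org.apache.hadoop.fs.PathFilter` and be passed to `FileSystem.listStatus(Path, PathFilter)`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: decides which entries of a job output directory hold data. */
final class OutputFileFilter {

  /** True for names that look like job output data, false for bookkeeping files. */
  static boolean isDataFile(String fileName) {
    return !fileName.startsWith("_")    // _SUCCESS, _logs
        && !fileName.startsWith(".")    // hidden files such as .part-r-00000.crc
        && !fileName.endsWith(".crc");  // any other checksum file
  }

  /** Keeps only the names accepted by isDataFile, preserving their order. */
  static List<String> dataFiles(List<String> names) {
    List<String> kept = new ArrayList<>();
    for (String name : names) {
      if (isDataFile(name)) {
        kept.add(name);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    List<String> listing =
        Arrays.asList("_SUCCESS", "_logs", ".part-r-00000.crc", "part-r-00000");
    System.out.println(dataFiles(listing)); // prints [part-r-00000]
  }
}
```

Filtering by name before opening anything means the reader never touches the empty marker file, which is the point of the policy discussed in this message.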
-
Partial Implementation of Random ForestsSara Del Río García 2013-02-28, 21:06
Hello all:
I'm testing the partial implementation of Random Forest on Hadoop 2.0.0-cdh4.1.1.

I'm trying to modify the algorithm; all I do is add more information to the leaves of the tree. Currently a leaf contains only the label, and I want to add one more value:

    @Override
    public void readFields(DataInput in) throws IOException {
      label = in.readDouble();
      leafWeight = in.readDouble();
    }

    @Override
    protected void writeNode(DataOutput out) throws IOException {
      out.writeDouble(label);
      out.writeDouble(leafWeight);
    }

And I get the following error:

13/02/27 06:53:27 INFO mapreduce.BuildForest: Partial Mapred implementation
13/02/27 06:53:27 INFO mapreduce.BuildForest: Building the forest...
13/02/27 06:53:27 INFO mapreduce.BuildForest: Weights Estimation: IR
13/02/27 06:53:37 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/02/27 06:53:39 INFO input.FileInputFormat: Total input paths to process : 1
13/02/27 06:53:39 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
13/02/27 06:53:39 WARN snappy.LoadSnappy: Snappy native library not loaded
13/02/27 06:53:39 INFO mapred.JobClient: Running job: job_201302270205_0013
13/02/27 06:53:40 INFO mapred.JobClient: map 0% reduce 0%
13/02/27 06:54:18 INFO mapred.JobClient: map 20% reduce 0%
13/02/27 06:54:42 INFO mapred.JobClient: map 40% reduce 0%
13/02/27 06:55:03 INFO mapred.JobClient: map 60% reduce 0%
13/02/27 06:55:26 INFO mapred.JobClient: map 70% reduce 0%
13/02/27 06:55:27 INFO mapred.JobClient: map 80% reduce 0%
13/02/27 06:55:49 INFO mapred.JobClient: map 100% reduce 0%
13/02/27 06:56:04 INFO mapred.JobClient: Job complete: job_201302270205_0013
13/02/27 06:56:04 INFO mapred.JobClient: Counters: 24
13/02/27 06:56:04 INFO mapred.JobClient: File System Counters
13/02/27 06:56:04 INFO mapred.JobClient: FILE: Number of bytes read=0
13/02/27 06:56:04 INFO mapred.JobClient: FILE: Number of bytes written=1828230
13/02/27 06:56:04 INFO mapred.JobClient: FILE: Number of read operations=0
13/02/27 06:56:04 INFO mapred.JobClient: FILE: Number of large read operations=0
13/02/27 06:56:04 INFO mapred.JobClient: FILE: Number of write operations=0
13/02/27 06:56:04 INFO mapred.JobClient: HDFS: Number of bytes read=1381649
13/02/27 06:56:04 INFO mapred.JobClient: HDFS: Number of bytes written=1680
13/02/27 06:56:04 INFO mapred.JobClient: HDFS: Number of read operations=30
13/02/27 06:56:04 INFO mapred.JobClient: HDFS: Number of large read operations=0
13/02/27 06:56:04 INFO mapred.JobClient: HDFS: Number of write operations=10
13/02/27 06:56:04 INFO mapred.JobClient: Job Counters
13/02/27 06:56:04 INFO mapred.JobClient: Launched map tasks=10
13/02/27 06:56:04 INFO mapred.JobClient: Data-local map tasks=10
13/02/27 06:56:04 INFO mapred.JobClient: Total time spent by all maps in occupied slots (ms)=254707
13/02/27 06:56:04 INFO mapred.JobClient: Total time spent by all reduces in occupied slots (ms)=0
13/02/27 06:56:04 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
13/02/27 06:56:04 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
13/02/27 06:56:04 INFO mapred.JobClient: Map-Reduce Framework
13/02/27 06:56:04 INFO mapred.JobClient: Map input records=20
13/02/27 06:56:04 INFO mapred.JobClient: Map output records=10
13/02/27 06:56:04 INFO mapred.JobClient: Input split bytes=1540
13/02/27 06:56:04 INFO mapred.JobClient: Spilled Records=0
13/02/27 06:56:04 INFO mapred.JobClient: CPU time spent (ms)=12070
13/02/27 06:56:04 INFO mapred.JobClient: Physical memory (bytes) snapshot=949579776
13/02/27 06:56:04 INFO mapred.JobClient: Virtual memory (bytes) snapshot=8412340224
13/02/27 06:56:04 INFO mapred.JobClient: Total committed heap usage (bytes)=478412800
Exception in thread "main" java.lang.IllegalStateException: java.io.EOFException
  at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.computeNext(SequenceFileIterator.java:104)
  at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.computeNext(SequenceFileIterator.java:38)
  at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
  at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
  at org.apache.mahout.classifier.df.mapreduce.partial.PartialBuilder.processOutput(PartialBuilder.java:129)
  at org.apache.mahout.classifier.df.mapreduce.partial.PartialBuilder.parseOutput(PartialBuilder.java:96)
  at org.apache.mahout.classifier.df.mapreduce.Builder.build(Builder.java:312)
  at org.apache.mahout.classifier.df.mapreduce.BuildForest.buildForest(BuildForest.java:246)
  at org.apache.mahout.classifier.df.mapreduce.BuildForest.run(BuildForest.java:200)
  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
  at org.apache.mahout.classifier.df.mapreduce.BuildForest.main(BuildForest.java:270)
Caused by: java.io.EOFException
  at java.io.DataInputStream.readFully(DataInputStream.java:180)
  at java.io.DataInputStream.readLong(DataInputStream.java:399)
  at java.io.DataInputStream.readDouble(DataInputStream.java:451)
  at org.apache.mahout.classifier.df.node.Leaf.readFields(Leaf.java:136)
  at org.apache.mahout.classifier.df.node.Node.read(Node.java:85)
  at org.apache.mahout.classifier.df.mapreduce.MapredOutput.readFields(MapredOutput.java:64)
  at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:2114)
  at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2242)
  at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.computeNext(SequenceFileIterator.java:95)
  ... 10 more

What's the problem? Is it possible to write more information in the leaves of the tree?

Thank you very much.

Best regards,

Sara
|
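
The thread does not settle the cause of this trace. One possibility consistent with it, offered only as an assumption, is that the bytes being read contain a single double per leaf (for example because the tasks ran an older jar, or another write path still emits only the label) while the new readFields expects two. The sketch below reproduces that mismatch with plain java.io streams; LeafRoundTrip and its methods are hypothetical names and not Mahout code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class; it only mimics the leaf's read/write pair.
final class LeafRoundTrip {
  private LeafRoundTrip() {}

  // "Old" writer: serializes the label only (8 bytes).
  static byte[] writeLabelOnly(double label) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      DataOutputStream out = new DataOutputStream(bytes);
      out.writeDouble(label);
      out.flush();
      return bytes.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // "New" writer: serializes label and leafWeight (16 bytes).
  static byte[] writeLabelAndWeight(double label, double leafWeight) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      DataOutputStream out = new DataOutputStream(bytes);
      out.writeDouble(label);
      out.writeDouble(leafWeight);
      out.flush();
      return bytes.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // "New" reader: always expects two doubles, like the modified readFields.
  static double[] readLabelAndWeight(byte[] data) throws IOException {
    DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
    double label = in.readDouble();
    double leafWeight = in.readDouble(); // throws EOFException if absent
    return new double[] {label, leafWeight};
  }

  // True when the new reader runs off the end of the given bytes.
  static boolean hitsEof(byte[] data) {
    try {
      readLabelAndWeight(data);
      return false;
    } catch (EOFException e) {
      return true;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

If this is what is happening, the things to check are that every code path that serializes a leaf writes both fields in the same order the reader consumes them, and that the rebuilt jar is the one actually shipped to the cluster.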