|
Markus Holtermann
2011-09-23, 01:37
Dan Brickley
2011-09-23, 02:12
Dmitriy Lyubimov
2011-09-23, 02:28
Ted Dunning
2011-09-23, 02:56
Ted Dunning
2011-09-23, 02:57
Danny Bickson
2011-09-23, 05:56
Markus Holtermann
2011-09-23, 22:42
Lance Norskog
2011-09-23, 23:03
Dmitriy Lyubimov
2011-09-23, 23:42
Dmitriy Lyubimov
2011-09-24, 00:29
Dan Brickley
2011-09-24, 01:34
Ted Dunning
2011-09-24, 03:33
Dmitriy Lyubimov
2011-09-24, 03:46
Lance Norskog
2011-09-24, 21:51
Ted Dunning
2011-09-25, 11:39
Markus Holtermann
2011-09-28, 20:32
Dmitriy Lyubimov
2011-09-29, 02:15
Dmitriy Lyubimov
2011-09-29, 03:08
Ted Dunning
2011-09-29, 12:36
|
-
Singular Value Decomposition does not return correct eigenvalues and -vectors
Markus Holtermann 2011-09-23, 01:37
Hello there,

I'm trying to run Mahout's Singular Value Decomposition but realized that the resulting eigenvalues are wrong in most cases. So I took two small 3x3 matrices, calculated their eigenvalues and eigenvectors by hand, and compared the results to Mahout's.

Only in one of eight cases did the results from Mahout and my pen & paper match.

Let's take
A = {{1,2,3},{2,4,5},{3,5,6}}
and
B = {{5,2,4},{-3,6,2},{3,-3,1}}

As you can see, A is symmetric, B is not.

I ran `mahout svd --output out/ --numRows 3 --numCols 3` eight times with different arguments:

1) --input A --rank 3 --symmetric true    result is wrong
2) --input A --rank 4 --symmetric true    result is wrong
3) --input A --rank 3 --symmetric false   result is wrong
4) --input A --rank 4 --symmetric false   result is CORRECT

5) --input B --rank 3 --symmetric true    result is wrong
6) --input B --rank 4 --symmetric true    result is wrong
7) --input B --rank 3 --symmetric false   result is wrong
8) --input B --rank 4 --symmetric false   result is wrong

To verify that my input data is correct, this is the result of `mahout seqdumper`

For A:
Key class: class org.apache.hadoop.io.IntWritable
Value Class: class org.apache.mahout.math.VectorWritable
Key: 0: Value: {0:1.0,1:2.0,2:3.0}
Key: 1: Value: {0:2.0,1:4.0,2:5.0}
Key: 2: Value: {0:3.0,1:5.0,2:6.0}
Count: 3

For B:
Key class: class org.apache.hadoop.io.IntWritable
Value Class: class org.apache.mahout.math.VectorWritable
Key: 0: Value: {0:5.0,1:2.0,2:4.0}
Key: 1: Value: {0:-3.0,1:6.0,2:2.0}
Key: 2: Value: {0:3.0,1:-3.0,2:1.0}
Count: 3

And finally, the correct eigenvalues should be:
For A:
λ1 = 11.3448
λ2 = -0.515729
λ3 = 0.170915

For B:
λ1 = 7
λ2 = 3
λ3 = 2

So, are there any known bugs in Mahout's SVD implementation? Am I doing something wrong? Is this algorithm known to produce wrong results?

Thanks in advance.

Markus
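
The pen-and-paper eigenvalues quoted above can be double-checked independently of Mahout: every eigenvalue λ of a matrix M has to satisfy det(M - λI) = 0. The following is only a minimal sketch in plain Python; it is not part of the original message, and the helper names `det3` and `char_poly` are made up for illustration.

```python
# Minimal sketch (not from the original message): check the quoted
# eigenvalues by evaluating det(M - lam*I), which should be close to 0.
# det3 and char_poly are illustrative helper names.

def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def char_poly(m, lam):
    """Evaluate det(m - lam*I) for a 3x3 matrix m."""
    shifted = [[m[r][c] - (lam if r == c else 0.0) for c in range(3)]
               for r in range(3)]
    return det3(shifted)


A = [[1, 2, 3], [2, 4, 5], [3, 5, 6]]
B = [[5, 2, 4], [-3, 6, 2], [3, -3, 1]]

# Eigenvalues as quoted in the message (those of A are rounded).
eig_A = [11.3448, -0.515729, 0.170915]
eig_B = [7.0, 3.0, 2.0]

residuals_A = [char_poly(A, lam) for lam in eig_A]
residuals_B = [char_poly(B, lam) for lam in eig_B]

print("A residuals:", residuals_A)  # small, limited by the rounding of eig_A
print("B residuals:", residuals_B)
```

Residuals near zero only confirm that the hand-computed values are eigenvalues of A and B; they say nothing yet about what the `svd` job is expected to report.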
-
Re: Singular Value Decomposition does not return correct eigenvalues and -vectors
Dan Brickley 2011-09-23, 02:12
On 22 Sep 2011, at 18:37, Markus Holtermann <[EMAIL PROTECTED]> wrote:
> So, are there any known bugs in Mahout's SVD implementation? Am I doing
> something wrong? Is this algorithm known to produce wrong results?

I have the impression from somewhere that there is a problem with sending tiny matrices to mahout lanczos/svd. Something like - it doesn't then get enough iterations to settle on decent values. Sorry I can't find a ref/link for this; hope I didn't dream it...

Dan
-
Re: Singular Value Decomposition does not return correct eigenvalues and -vectors
Dmitriy Lyubimov 2011-09-23, 02:28
As far as I understand, Mahout's Lanczos is meant to deal with larger inputs.

You can also try mahout ssvd with -k=3, -p=0; I am pretty sure you will get exact results for a 3x3 matrix :)

-d
-
Re: Singular Value Decomposition does not return correct eigenvalues and -vectors
Ted Dunning 2011-09-23, 02:56
Can you say what the results you got actually were? Did you account for the fact that eigenvalues are only unique up to sign?
-
Re: Singular Value Decomposition does not return correct eigenvalues and -vectors
Ted Dunning 2011-09-23, 02:57
Which SVD were you using? The in-memory one? The map-reduce version? If map-reduce, which one?
-
Re: Singular Value Decomposition does not return correct eigenvalues and -vectors
Danny Bickson 2011-09-23, 05:56
Hi Markus!

1) Regarding rank, I observed here:
http://bickson.blogspot.com/2011/02/some-thoughts-about-accuracy-of-mahouts.html
that you need to request rank+1 to get the desired rank. So your runs with --rank 4 are the correct ones.

2) There are two transformations which make comparison of results with Matlab (or pen & paper) harder:

a) The scaleFactor. Defined in:
./math/src/main/java/org/apache/mahout/math/decomposer/lanczos/LanczosState.java
I quote a documentation remark in:
./math/src/main/java/org/apache/mahout/math/decomposer/lanczos/LanczosSolver.java:48

"
 * To avoid floating point overflow problems which arise in power-methods like Lanczos, an initial pass is made
 * through the input matrix to
 * <li>generate a good starting seed vector by summing all the rows of the input matrix, and</li>
 * <li>compute the trace(inputMatrix<sup>t</sup>*matrix)
 * This latter value, being the sum of all of the singular values, is used to rescale the entire matrix, effectively
 * forcing the largest singular value to be strictly less than one, and transforming floating point overflow
 * problems into floating point underflow (ie, very small singular values will become invisible, as they
 * will appear to be zero and the algorithm will terminate).
"

Now - did you take the scale factor into account in your comparison? If not, you will surely get different results.

b) The second transformation is orthogonalization of the resulting vector. This step is optional (IMHO). See:
./math/src/main/java/org/apache/mahout/math/decomposer/lanczos/LanczosSolver.java:118
The function call is:
orthoganalizeAgainstAllButLast(nextVector, state);

Again I quote from documentation:

"
 * <p>This implementation uses {@link org.apache.mahout.math.matrix.linalg.EigenvalueDecomposition} to do the
 * eigenvalue extraction from the small (desiredRank x desiredRank) tridiagonal matrix. Numerical stability is
 * achieved via brute-force: re-orthogonalization against all previous eigenvectors is computed after every pass.
 * This can be made smarter if (when!) this proves to be a major bottleneck. Of course, this step can be parallelized
 * as well.
 * </p>
"

Did you take orthogonalization into account when comparing? Matlab's eig() command does not perform this step as far as I recall.

Let me know if you have further questions.

Best,

Danny Bickson
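
The two transformations described in this message can be illustrated on the 3x3 matrix B from the original post. This is a sketch under stated assumptions, not Mahout's code: the helper names are hypothetical, the vector (1, -1, 1) is an eigenvector of B for the eigenvalue 7 (easy to confirm by multiplying it out), and the scale factor used here is simply trace(BᵀB) taken from the quoted documentation remark. Whatever factor `LanczosState` actually applies is not shown in the thread.

```python
# Minimal sketch, not Mahout's implementation. All names are illustrative.
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def mat_vec(m, v):
    return [dot(row, v) for row in m]


def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]


def orthogonalize_against(vec, basis):
    """Remove from vec its component along each unit vector in basis
    (modified Gram-Schmidt), then rescale the result to unit length.
    Assumes the basis vectors are orthonormal."""
    out = list(vec)
    for b in basis:
        coeff = dot(out, b)
        out = [x - coeff * y for x, y in zip(out, b)]
    return normalize(out)


B = [[5, 2, 4], [-3, 6, 2], [3, -3, 1]]
v = [1.0, -1.0, 1.0]  # B v = 7 v

# (a) Rescaling: dividing the matrix by s divides every eigenvalue by s.
# s = trace(B^T B) = sum of squared entries (assumed factor, see lead-in).
s = float(sum(x * x for row in B for x in row))
scaled_B = [[x / s for x in row] for row in B]
scaled_eigenvalue = mat_vec(scaled_B, v)[0] / v[0]
recovered_eigenvalue = scaled_eigenvalue * s  # multiply back before comparing

# (b) Re-orthogonalization: make a new vector orthogonal to earlier ones.
basis = [normalize(v)]
w = orthogonalize_against([1.0, 2.0, 3.0], basis)

print(recovered_eigenvalue)
print(dot(w, basis[0]))
```

The point of part (a) is only that values computed on a rescaled matrix have to be multiplied by the same factor before they are compared with pen-and-paper results.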
-
Re: Singular Value Decomposition does not return correct eigenvalues and -vectors
Markus Holtermann 2011-09-23, 22:42
Thank you for all your responses.

ref. Dan Brickley:
------------------
hopefully you did dream ;-)

ref. Dmitriy Lyubimov:
----------------------
When I run `mahout ssvd -i A.seq -o A-ssvd/ -k 3 -p 0` I get an IllegalArgumentException. You can find the traceback at http://paste.pocoo.org/show/481168/ .

ref. Ted Dunning:
-----------------
I am running the M/R version of SVD in local mode. I didn't install Hadoop except what is coming via `mvn install`.
If I understand the code correctly, the `--inMemory` argument is only relevant for the "EigenVerificationJob" -- I didn't run that.

Here are the latest results for the calculations as described in my previous mail:

For 1:
Key class: class org.apache.hadoop.io.IntWritable
Value Class: class org.apache.mahout.math.VectorWritable
Key: 0: Value: eigenVector0, eigenvalue = 11.344411508600611:
{0:0.8940505788976013,1:0.05761556873901637,2:-0.44424543735613486}
Key: 1: Value: eigenVector1, eigenvalue = 0.0:
{0:-0.3030457633656634,1:0.8081220356417685,2:-0.5050762722761053}
Key: 2: Value: eigenVector2, eigenvalue = -0.4362482432944815:
{0:0.3299042704770375,1:0.5861904313011974,2:0.7399621277956934}
Count: 3

For 2:
Key class: class org.apache.hadoop.io.IntWritable
Value Class: class org.apache.mahout.math.VectorWritable
Key: 0: Value: eigenVector0, eigenvalue = 11.344814282762082:
{0:0.7369762290995766,1:0.3279852776056837,2:-0.5910090485061045}
Key: 1: Value: eigenVector1, eigenvalue = 0.17091518882717976:
{0:0.9225878132457447,1:0.3812202473600341,2:0.05918487858557608}
Key: 2: Value: eigenVector2, eigenvalue = 0.0:
{0:-0.5910090485061055,1:0.7369762290995774,2:-0.3279852776056802}
Key: 3: Value: eigenVector3, eigenvalue
-0.5157294715892533:{0:-0.32798527760568197,1:-0.5910090485061036,2:-0.7369762290995783}
Count: 4

For 3:
Key class: class org.apache.hadoop.io.IntWritable
Value Class: class org.apache.mahout.math.VectorWritable
Key: 0: Value: eigenVector0, eigenvalue = 11.344814080004587:
{0:0.2870124314018251,1:-0.8054865010309287,2:0.5184740696291035}
Key: 1: Value: eigenVector1, eigenvalue = 0.4852290375835231:
{0:0.9000472484774761,1:0.041469409433508436,2:-0.4338147514658307}
Key: 2: Value: eigenVector2, eigenvalue = 0.0:
{0:0.3279311127797073,1:0.5911613863727806,2:0.7368781449689461}
Count: 3

For 4:
Key class: class org.apache.hadoop.io.IntWritable
Value Class: class org.apache.mahout.math.VectorWritable
Key: 0: Value: eigenVector0, eigenvalue = 11.34481428276208:
{0:0.788451139115581,1:0.5058848349238699,2:0.3498933194866569}
Key: 1: Value: eigenVector1, eigenvalue = 0.5157294715892401:
{0:-0.5910090485061453,1:0.7369762290995597,2:-0.32798527760564816}
Key: 2: Value: eigenVector2, eigenvalue = 0.1709151888272022:
{0:-0.7369762290995447,1:-0.3279852776057236,2:0.5910090485061223}
Key: 3: Value: eigenVector3, eigenvalue = 0.0:
{0:-0.3279852776056819,1:-0.5910090485061036,2:-0.7369762290995783}
Count: 4

For 5:
Key class: class org.apache.hadoop.io.IntWritable
Value Class: class org.apache.mahout.math.VectorWritable
Key: 0: Value: eigenVector0, eigenvalue = 7.7949818262315:
{0:-0.3998289016610171,1:0.3486764982772797,2:0.8476800982361441}
Key: 1: Value: eigenVector1, eigenvalue = 0.0:
{0:0.3244428422615253,1:-0.8111071056538125,2:0.4866642633922878}
Key: 2: Value: eigenVector2, eigenvalue = -2.2686660367578133:
{0:0.8572477421969729,1:0.4696061783100697,2:0.21117846905213422}
Count: 3

For 6:
Key class: class org.apache.hadoop.io.IntWritable
Value Class: class org.apache.mahout.math.VectorWritable
Key: 0: Value: eigenVector0, eigenvalue = 9.903422603237882:
{0:-0.305869782876591,1:-0.012493432384138303,2:0.9519913813004245}
Key: 1: Value: eigenVector1, eigenvalue = 6.002722238353203:
{0:-0.7781330995244824,1:0.06366543541563939,2:0.624864458709054}
Key: 2: Value: eigenVector2, eigenvalue = 0.0:
{0:0.2988138112963618,1:0.9481291552697455,2:0.10845003967736172}
Key: 3: Value: eigenVector3, eigenvalue = -3.906144841591079:
{0:0.9039656974142156,1:-0.3176397630567398,2:0.2862708487144453}
Count: 4

For 7:
Key class: class org.apache.hadoop.io.IntWritable
Value Class: class org.apache.mahout.math.VectorWritable
Key: 0: Value: eigenVector0, eigenvalue = 7.04924152040162:
{0:-0.4082482904638631,1:0.8164965809277261,2:-0.4082482904638631}
Key: 1: Value: eigenVector1, eigenvalue = 3.782617346103868:
{0:0.7808892910047764,1:0.08072916428282848,2:-0.6194309624391194}
Key: 2: Value: eigenVector2, eigenvalue = 0.0:
{0:0.47280571964327067,1:0.5716783495703939,2:0.6705509794975171}
Count: 3

For 8:
Key class: class org.apache.hadoop.io.IntWritable
Value Class: class org.apache.mahout.math.VectorWritable
Key: 0: Value: eigenVector0, eigenvalue = 7.964450219004663:
{0:NaN,1:NaN,2:NaN}
Key: 1: Value: eigenVector1, eigenvalue = 7.000000000000002:
{0:NaN,1:NaN,2:NaN}
Key: 2: Value: eigenVector2, eigenvalue = 0.753347668076679:
{0:NaN,1:NaN,2:NaN}
Key: 3: Value: eigenVector3, eigenvalue = 0.0:
{0:NaN,1:NaN,2:NaN}
Count: 4

ref. Danny Bickson:
Thanks for your confirmation on how to use the rank.

Regarding the scale factor and orthogonalization: Yes, I take it into account. I'm running SVD from trunk without any changes. And even after commenting out those parts of the code, the results are still wrong in the cases 1, 2, 3, 7 and 8

Thank you for your help.

Markus
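
One possible way to read the numbers in these dumps, offered as a sketch and not as a diagnosis of Mahout: a singular value decomposition reports singular values, and s is a singular value of M exactly when s² is an eigenvalue of MᵀM. For a symmetric matrix the singular values are the absolute values of its eigenvalues; for a non-symmetric matrix they generally differ from the eigenvalues. Nobody in the thread states this at this point, and the helper names below are hypothetical. The plain-Python check only tests the values printed for runs 4 and 8 against that definition.

```python
# Minimal sketch: test the "eigenvalue" entries printed for runs 4 and 8
# against the definition of singular values. Helper names are illustrative.

def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def char_poly(m, lam):
    """Evaluate det(m - lam*I) for a 3x3 matrix m."""
    shifted = [[m[r][c] - (lam if r == c else 0.0) for c in range(3)]
               for r in range(3)]
    return det3(shifted)


def gram(m):
    """Return m^T m for a 3x3 matrix m."""
    return [[sum(m[k][i] * m[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


A = [[1, 2, 3], [2, 4, 5], [3, 5, 6]]
B = [[5, 2, 4], [-3, 6, 2], [3, -3, 1]]

# Non-zero values copied from the dumps for run 4 (input A) and run 8 (input B).
run4 = [11.34481428276208, 0.5157294715892401, 0.1709151888272022]
run8 = [7.964450219004663, 7.000000000000002, 0.753347668076679]

# If s is a singular value of M, det(M^T M - s*s*I) should be close to 0.
residuals_run4 = [char_poly(gram(A), s * s) for s in run4]
residuals_run8 = [char_poly(gram(B), s * s) for s in run8]

# Run 4 versus the absolute values of the pen-and-paper eigenvalues of A.
gap_run4 = max(abs(s - abs(lam))
               for s, lam in zip(run4, [11.3448, -0.515729, 0.170915]))

# For comparison: the eigenvalue 3 of B, treated as if it were a singular value.
residual_eigenvalue_3 = char_poly(gram(B), 3.0 ** 2)

print(residuals_run4, residuals_run8)
print(gap_run4, residual_eigenvalue_3)
```

Under this reading, a comparison against hand-computed eigenvalues would be expected to agree only in absolute value for the symmetric matrix A, and not at all for B. The NaN vectors in run 8 are a separate matter that this check does not address.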
-
Re: Singular Value Decomposition does not return correct eigenvalues and -vectors
Lance Norskog 2011-09-23, 23:03
Markus-

Probably the best approach is to crosscheck your results with live data of various sizes with the R statistical system. (You will often get results with opposing signs.)

Lance
Lance Norskog [EMAIL PROTECTED]
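
When crosschecking vectors between tools, a result and its negation describe the same direction: if M v = λ v, then M (-v) = λ (-v) as well. A comparison should therefore ignore the overall sign. The sketch below is an illustration only, with a hypothetical helper name, and uses three unit-length vectors copied from the run 2 and run 4 dumps earlier in the thread.

```python
# Minimal sketch: compare unit vectors while ignoring an overall sign flip.
# same_up_to_sign is an illustrative helper name.

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def same_up_to_sign(u, v, tol=1e-6):
    """True if unit vectors u and v point along the same line, that is,
    u is approximately v or approximately -v. Assumes unit length."""
    return abs(abs(dot(u, v)) - 1.0) < tol


# eigenVector0 from run 2 and eigenVector2 / eigenVector1 from run 4.
run2_v0 = [0.7369762290995766, 0.3279852776056837, -0.5910090485061045]
run4_v2 = [-0.7369762290995447, -0.3279852776057236, 0.5910090485061223]
run4_v1 = [-0.5910090485061453, 0.7369762290995597, -0.32798527760564816]

print(same_up_to_sign(run2_v0, run4_v2))  # opposite signs, same direction
print(same_up_to_sign(run2_v0, run4_v1))  # a different direction
```

A component-by-component comparison would report the first pair as different even though the two runs agree on that direction.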
-
Re: Singular Value Decomposition does not return correct eigenvalues and -vectors
Dmitriy Lyubimov 2011-09-23, 23:42
Oh, OK, apparently you need to use p>0.

But then there's a problem that there's a k+p >=m (input height) requirement, so I guess this is a corner case I did not account for.

You can use k=2 and p=1; the caveat is that even though 3 singular values will be computed, only 2 of them will be saved. This solver always assumes "thin" decomposition requirements, although the distinction is purely technical; it is only a matter of a patch to enable p=0.

It is only the case because your input is so small. In practice, input is much "longer" than k+p rows, so it hasn't come up as an issue. Point is, it will not do full rank decomposition with small matrices; but then, you don't want to use it with small matrices :)

Although I can engineer a patch to allow p=0 and full rank decompositions for short wide matrices if it is that important.

-dmitriy
-
Re: Singular Value Decomposition does not return correct eigenvalues and -vectors
Dmitriy Lyubimov 2011-09-24, 00:29
Markus, ok, use of p=0 is enabled on the trunk, verified and committed.
-
Re: Singular Value Decomposition does not return correct eigenvalues and -vectorsDan Brickley 2011-09-24, 01:34
On 23 September 2011 16:03, Lance Norskog <[EMAIL PROTECTED]> wrote:
> Markus-
>
> Probably the best approach is to crosscheck your results with live data of
> various sizes with the R statistical system. (You will often get results
> with opposing signs.)

So, that's exactly where I was, with Ruby and Matlab(<cheapskate>GNU Octave</cheapskate>) taking the place of R there.

It didn't help me that my grasp of the relevant linear algebra was somewhat journalistic, for sure. But precisely because it was shaky, I thought "right, let's stay sane since I'm not an expert either in the maths, or in hadoop, or in mahout, so ... I'll take a simple tiny testcase example, make sure I can run it in Octave and Ruby, ... and use that to build out my understanding of Mahout's SVD".

That turned out to be a disappointing learning experience, for reasons recently summarised here. I was using a tiny example taken from http://www.igvita.com/2007/01/15/svd-recommendation-system-in-ruby/ because I thought that was a nice way of re-using a helpful writeup as Mahout documentation. Bad idea due to dataset size.

Looking again at https://cwiki.apache.org/MAHOUT/dimensional-reduction.html I see that there is in fact a good sample dataset now; the mailing list stuff. Maybe I'd missed it at the time. It deserves more attention, as a common hub for documentation, user education, and for comparison testing and sanity-checking against non-Mahout environments like R etc. (Perhaps the EC2 aspect is an issue for non-Amazon users?). I'm not sure if "Overall, there are 6,094,444 key-value pairs in 283 files taking around 5.7GB of disk." makes it too big for many non-Mahout environments. But the sooner there's a single dataset people use to get started experimenting with Mahout SVD, the sooner we'll avoid everyone revisiting the "I don't understand what Lanczos has done..." thread.

Should there be a FAQ on the Lanczos page?

Q: Will this work with a test matrix of e.g. 5x8 size?
A: No, ... it needs to be substantially bigger,...

Q: How much bigger?
A: <... somebody write something here ... >

cheers,

Dan
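For a matrix as small as the 3x3 test case in this thread, the cross-check suggested above does not strictly need R or Octave. What follows is only a minimal sketch using the Python standard library, not anything taken from Mahout: it takes a reported eigenvector, computes the eigenvalue that vector implies (the Rayleigh quotient), and measures how well the pair satisfies A v = lambda v. The matrix is A from the first post and the vector is "eigenVector0" as printed in the "For 2" output quoted earlier in the thread; the helper names `mat_vec`, `rayleigh_quotient` and `residual_norm` are made up for this illustration. Because the quotient does not change when the vector is multiplied by -1, the "opposing signs" caveat does not affect the check.

```python
import math

# Matrix A from the first post in this thread.
A = [[1.0, 2.0, 3.0],
     [2.0, 4.0, 5.0],
     [3.0, 5.0, 6.0]]


def mat_vec(M, v):
    """Return the matrix-vector product M v as a list."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def rayleigh_quotient(M, v):
    """Eigenvalue implied by v: (v . M v) / (v . v)."""
    Mv = mat_vec(M, v)
    return sum(a * b for a, b in zip(v, Mv)) / sum(a * a for a in v)


def residual_norm(M, v, lam):
    """Euclidean norm of M v - lam v; close to zero for a true eigenpair."""
    Mv = mat_vec(M, v)
    return math.sqrt(sum((a - lam * b) ** 2 for a, b in zip(Mv, v)))


# "eigenVector0" as printed in the "For 2" output quoted in this thread.
v = [0.7369762290995766, 0.3279852776056837, -0.5910090485061045]
flipped = [-x for x in v]  # same direction, opposite sign

lam = rayleigh_quotient(A, v)
print("eigenvalue implied by the vector:", lam)
print("residual ||A v - lam v||:", residual_norm(A, v, lam))
print("implied eigenvalue after a sign flip:", rayleigh_quotient(A, flipped))
```

The implied eigenvalue can then be compared with the label printed next to the vector in the job output, and the residual shows whether the vector is an eigenvector of A at all.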
-
Re: Singular Value Decomposition does not return correct eigenvalues and -vectorsTed Dunning 2011-09-24, 03:33
Markus,
Try testing on a 20x20 matrix if you want to use p>0. The issue is that this is an approximation algorithm that works for reasonably high dimension. 3 is not reasonably high. 20 is probably marginal.
-
Re: Singular Value Decomposition does not return correct eigenvalues and -vectorsDmitriy Lyubimov 2011-09-24, 03:46
I already fixed full rank (p =0) on the trunk. It was just an invalid
assertion, the algorithm isn't limiting that. So k=3 p=0 should be ok now in the trunk.
-
Re: Singular Value Decomposition does not return correct eigenvalues and -vectorsLance Norskog 2011-09-24, 21:51
As a side note, there are also a few in-memory SVD implementations. There
is a SingularValueDecomposition which uses "pre-Mahout" data structures. There are also a few Factorizer classes which are apparently SVD implementations but supply only the left and right matrices, not the singular values.

What are the minimum sizes expected to "work" in these algorithms? Are they intended to be canonical implementations that are correct from "2x2" to "out of memory" or "numerical instability"?

Lance
-
Re: Singular Value Decomposition does not return correct eigenvalues and -vectorsTed Dunning 2011-09-25, 11:39
Also there are a few random projection single machine implementations about
to be committed. These will allow a middle ground for scalability.

As Dmitriy pointed out with p = 0 and k = full rank, these should work on any size matrix. That isn't very interesting of course since they devolve down to doing an in-memory SVD of full size in that case.

When computing less than a full SVD, the approximation is much better with higher dimensions.
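To make the point about p = 0 and k = full rank concrete, here is a toy single-machine sketch of the general random-projection idea. It is an assumption-laden illustration in plain Python, not Mahout's ssvd code and not the implementations about to be committed: project A onto k + p random directions, orthonormalise the result to get Q, and take the singular values of the small matrix Q^T A. All function names are hypothetical, and the small Jacobi eigenvalue routine is only there to keep the example free of external libraries. When k + p equals the full rank, Q spans the whole range of A, so the result is the ordinary set of singular values up to rounding; for a symmetric matrix such as A from this thread, those are the absolute values of its eigenvalues.

```python
import math
import random


def transpose(X):
    return [list(col) for col in zip(*X)]


def matmul(X, Y):
    Yt = transpose(Y)
    return [[sum(a * b for a, b in zip(row, col)) for col in Yt] for row in X]


def orthonormal_columns(Y, tol=1e-12):
    """Modified Gram-Schmidt over the columns of Y; returns Q with orthonormal columns."""
    basis = []
    for col in transpose(Y):
        v = list(col)
        for q in basis:
            dot = sum(a * b for a, b in zip(q, v))
            v = [a - dot * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > tol:  # skip directions that are already covered
            basis.append([a / norm for a in v])
    return transpose(basis)


def symmetric_eigenvalues(S, sweeps=50):
    """Eigenvalues of a small symmetric matrix via cyclic Jacobi rotations."""
    n = len(S)
    S = [row[:] for row in S]
    for _ in range(sweeps):
        off = sum(S[r][c] ** 2 for r in range(n) for c in range(n) if r != c)
        if off < 1e-30:
            break
        for i in range(n - 1):
            for j in range(i + 1, n):
                if S[i][j] == 0.0:
                    continue
                if S[i][i] == S[j][j]:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2 * S[i][j] / (S[j][j] - S[i][i]))
                c, s = math.cos(theta), math.sin(theta)
                for m in range(n):  # S <- S J
                    a, b = S[m][i], S[m][j]
                    S[m][i], S[m][j] = c * a - s * b, s * a + c * b
                for m in range(n):  # S <- J^T S
                    a, b = S[i][m], S[j][m]
                    S[i][m], S[j][m] = c * a - s * b, s * a + c * b
    return sorted((S[i][i] for i in range(n)), reverse=True)


def randomized_singular_values(A, k, p, seed=0):
    """Top-k singular values of A from a (k + p)-column random projection."""
    rng = random.Random(seed)
    n = len(A[0])
    omega = [[rng.gauss(0.0, 1.0) for _ in range(k + p)] for _ in range(n)]
    # Two Gram-Schmidt passes; the second only tightens orthogonality.
    Q = orthonormal_columns(orthonormal_columns(matmul(A, omega)))
    B = matmul(transpose(Q), A)          # small (k + p) x n matrix
    gram = matmul(B, transpose(B))       # eigenvalues = squared singular values of B
    return [math.sqrt(max(e, 0.0)) for e in symmetric_eigenvalues(gram)][:k]


# Matrix A from the first post in this thread.
A = [[1.0, 2.0, 3.0], [2.0, 4.0, 5.0], [3.0, 5.0, 6.0]]
full = randomized_singular_values(A, k=3, p=0)  # full rank, no oversampling
thin = randomized_singular_values(A, k=2, p=1)  # 3 values computed, 2 kept
print(full)
print(thin)
```

With k + p below the rank, the same code returns an approximation whose quality depends on the matrix and on p, which is the regime the remarks about higher dimensions refer to.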
-
Re: Singular Value Decomposition does not return correct eigenvalues and -vectorsMarkus Holtermann 2011-09-28, 20:32
Hey guys.

Thank you for all the information about Singular Value Decomposition. The Lanczos algorithm seems to be a bad choice for small matrices. But the Stochastic SVD with k = full rank and p = 0 (thanks Dmitriy Lyubimov for implementing that) works fine.

So far, Markus
-
Re: Singular Value Decomposition does not return correct eigenvalues and -vectorsDmitriy Lyubimov 2011-09-29, 02:15
Well... I think any Mapreduce based implementation is a bad choice for small
matrices, frankly... regardless of precision it gives. There are much faster methods out there.
-
Re: Singular Value Decomposition does not return correct eigenvalues and -vectorsDmitriy Lyubimov 2011-09-29, 03:08
Also, like Ted said, using k= full rank is the same as running regular in
core method, both memory and could wise.

This method is for computing thin svd (k + p not to exceed perhaps 1000 for practical purposes) on otherwise really large inputs.

Please beware of misuse.

-Dmitriy
-
Re: Singular Value Decomposition does not return correct eigenvalues and -vectorsTed Dunning 2011-09-29, 12:36
As well, the stochastic projection algorithms in general have little to
recommend them for full rank decompositions. It may help a tiny bit to start with a QR decomposition, but the internals of an in-memory implementation should handle that. If you are doing a partial decomposition, however, the random projection stuff is great for in-memory decompositions as well as out-of-core or map-reduce implementations. On Thu, Sep 29, 2011 at 12:08 PM, Dmitriy Lyubimov <[EMAIL PROTECTED]>wrote: > Also, like Ted said, using k= full rank is the same as running regular in > core method, both memory and could wise. > > This method is for computing thin svd (k + p not to exceed perhaps 1000 for > practical purposes ) on otherwise really large inputs. > > Please beware of misuse. > > -Dmitriy > On Sep 28, 2011 1:32 PM, "Markus Holtermann" <[EMAIL PROTECTED]> > wrote: > > -----BEGIN PGP SIGNED MESSAGE----- > > Hash: SHA1 > > > > Hey guys. > > > > Thank you for all the information about Singular Value Decomposition. > > The Lanczos algorithm seems to be a bad choice for small matrices. But > > the Stochastic SVD with k = full rank and p = 0 (thanks Dmitriy > > Lyubimov for implementing that) works fine. > > > > So far, Markus > > > > On 09/23/2011 08:46 PM, Dmitriy Lyubimov wrote: > >> I already fixed full rank (p =0) on the trunk. It was just an > >> invalid assertion, the algorithm isn't limiting that. So k=3 p=0 > >> should be ok now in the trunk. On Sep 23, 2011 8:34 PM, "Ted > >> Dunning" <[EMAIL PROTECTED]> wrote: > >>> Markus, > >>> > >>> Try testing on a 20x20 matrix if you want to use p>0. The issue > >>> is that this is an approximation algorithm that works for > >>> reasonably high > >> dimension. > >>> 3 is not reasonably high. 20 is probably marginal. > >>> > >>> On Fri, Sep 23, 2011 at 4:42 PM, Dmitriy Lyubimov > >>> <[EMAIL PROTECTED] wrote: > >>> > >>>> oh, ok, apparently you need to use p>0. 
> >>>>
> >>>> but then there's a problem that there's k+p >=m (input height)
> >>>> requirement so I guess this is a corner case i did not account
> >>>> for.
> >>>>
> >>>> you can use k=2 and p=1 and caveat is that even though 3
> >>>> singular values will be computed, only 2 of them will be saved.
> >>>> this solver always assumes "thin" decomposition requirements,
> >>>> although distinction is purely technical, it is only a matter of a
> >>>> patch to enable p=0.
> >>>>
> >>>> It is only a case because your input is so small. In practice,
> >>>> input is much "longer" than k+p rows so it hasn't come up as an
> >>>> issue. Point is, it will not do full rank decomposition with
> >>>> small matrices; but then, you don't want to use it with small
> >>>> matrices :)
> >>>>
> >>>> although i can engineer a patch to allow p=0 and full rank
> >>>> decompositions for short wide matrices if it is that
> >>>> important.
> >>>>
> >>>> -dmitriy
> >>>>
> >>>> On Fri, Sep 23, 2011 at 3:42 PM, Markus Holtermann
> >>>> <[EMAIL PROTECTED]> wrote:
> >>>>> Thank you for all your responses.
> >>>>>
> >>>>> ref. Dan Brickley:
> >>>>> ------------------
> >>>>> hopefully you did dream ;-)
> >>>>>
> >>>>> ref. Dmitriy Lyubimov:
> >>>>> ----------------------
> >>>>> When I run `mahout ssvd -i A.seq -o A-ssvd/ -k 3 -p 0` I get an
> >>>>> IllegalArgumentException. You can find the traceback at
> >>>>> http://paste.pocoo.org/show/481168/ .
> >>>>>
> >>>>> ref. Ted Dunning:
> >>>>> -----------------
> >>>>> I am running the M/R version of SVD in local mode. I didn't
> >>>>> install Hadoop except what is coming via `mvn install`. If I
> >>>>> understand the code correctly, the `--inMemory` argument is only
> >>>>> relevant for the "EigenVerificationJob" -- I didn't run that.
> >>>>>
> >>>>> Here are the latest results for the calculations as described
> >>>>> in my previous mail:
> >>>>>
> >>>>> For 1:
> >>>>> Key class: class org.apache.hadoop.io.IntWritable
> >>>>> Value Class: class org.apache.mahout.math.VectorWritable
> >>>>> Key: |
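For anyone following the k and p discussion above, here is a minimal in-memory sketch of the random projection idea in NumPy. It is not Mahout's ssvd code; the function name `randomized_svd`, the `seed` parameter, and the `k + p <= min(m, n)` check are assumptions made for illustration. It projects the input onto k + p random directions, orthonormalizes the result, decomposes the small projected matrix, and keeps only the first k singular triplets, which mirrors the "3 computed, 2 saved" behaviour described for k=2, p=1.

```python
import numpy as np


def randomized_svd(a, k, p=0, seed=0):
    """Rank-k SVD via random projection (illustrative sketch, not Mahout's code)."""
    m, n = a.shape
    # Assumption for this sketch: the projection cannot be wider than the input.
    if k + p > min(m, n):
        raise ValueError("k + p must not exceed min(m, n)")
    rng = np.random.default_rng(seed)
    omega = rng.standard_normal((n, k + p))  # random test matrix
    y = a @ omega                            # sample the range of a
    q, _ = np.linalg.qr(y)                   # orthonormal basis, m x (k + p)
    b = q.T @ a                              # small (k + p) x n matrix
    u_b, s, vt = np.linalg.svd(b, full_matrices=False)
    u = q @ u_b                              # map left vectors back to m dimensions
    # k + p values are computed, only the first k are kept.
    return u[:, :k], s[:k], vt[:k, :]


# Matrix A from the first message in this thread.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 5.0],
              [3.0, 5.0, 6.0]])

u, s, vt = randomized_svd(A, k=3, p=0)
exact = np.linalg.svd(A, compute_uv=False)
print("randomized:", s)
print("direct:    ", exact)
```

With k + p equal to the full rank of this 3x3 matrix, the projection spans the whole space, so the result should agree with a direct SVD up to rounding, and nothing is gained over the in-core method. Note that singular values are non-negative: for a symmetric matrix like A they are the absolute values of the eigenvalues, so the eigenvalue -0.515729 from the first message shows up as 0.515729.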