-
is there some place to study Singular Value Decomposition algorithms
myn 2011-08-29, 07:15

I want to study Singular Value Decomposition algorithms.
I also have a book called Mahout in Action, but I can't find anything about this algorithm in it.
Is there some place that introduces how to use the method?
Until now, DistributedLanczosSolver is not a MapReduce method:
org.apache.mahout.math.hadoop.decomposer.DistributedLanczosSolver = svd
-
Re: is there some place to study Singular Value Decomposition algorithms
Sebastian Schelter 2011-08-29, 07:29

http://ocw.mit.edu/courses/mathematics/18-06-linear-algebra-spring-2010/video-lectures/
-
Re: is there some place to study Singular Value Decomposition algorithms
Danny Bickson 2011-08-29, 07:29

Command line arguments are found here:
https://cwiki.apache.org/MAHOUT/dimensional-reduction.html

I wrote a quick tutorial on how to prepare sparse matrices as input to Mahout SVD here:
http://bickson.blogspot.com/2011/02/mahout-svd-matrix-factorization.html

Let me know if you have further questions.
-
Re:Re: is there some place to study Singular Value Decomposition algorithms
myn 2011-08-29, 07:37

Thanks.
But could you send me the content of http://bickson.blogspot.com/2011/02/mahout-svd-matrix-factorization.html ? I can't open it in China.
-
Re:Re: is there some place to study Singular Value Decomposition algorithms
myn 2011-08-29, 07:44

Thanks.
But could you send me the content of http://bickson.blogspot.com/2011/02/mahout-svd-matrix-factorization.html ? I can't open it in China.
-
Re: Re: is there some place to study Singular Value Decomposition algorithms
Dan Brickley 2011-08-29, 07:50

2011/8/29 myn <[EMAIL PROTECTED]>:
> thanks
> But could you send the content of http://bickson.blogspot.com/2011/02/mahout-svd-matrix-factorization.html to me ?

(You asked the same thing twice with only 6 minutes between)

Try this:
http://translate.google.com/translate?js=n&prev=_t&hl=en&ie=UTF-8&layout=2&eotf=1&sl=zh-CN&tl=en&u=http%3A%2F%2Fbickson.blogspot.com%2F2011%2F02%2Fmahout-svd-matrix-factorization.html

Or look for similar services that take a URL and then repeat its contents...

For SVD, see also https://cwiki.apache.org/MAHOUT/dimensional-reduction.html and https://cwiki.apache.org/MAHOUT/svd-singular-value-decomposition.html ... the latter has links to some mail threads here too:
http://mail-archives.apache.org/mod_mbox/mahout-user/201102.mbox/%[EMAIL PROTECTED]%3E

Dan

ps. for general learning materials, look beyond Mahout. E.g. this in Ruby might help:
http://www.igvita.com/2007/01/15/svd-recommendation-system-in-ruby/
-
Re: Re: is there some place to study Singular Value Decomposition algorithms
Danny Bickson 2011-08-29, 07:50

Mahout - SVD matrix factorization - formatting input matrix

Converting Input Format into Mahout's SVD Distributed Matrix Factorization Solver

Purpose
The code below converts a matrix from csv format:
<from row>,<to col>,<value>\n
into Mahout's SVD solver format.

For example, the 3x3 matrix:
0 1.0 2.1
3.0 4.0 5.0
-5.0 6.2 0

will be given as input in a csv file as:
1,0,3.0
2,0,-5.0
0,1,1.0
1,1,4.0
2,1,6.2
0,2,2.1
1,2,5.0

NOTE: I ASSUME THE MATRIX IS SORTED BY THE COLUMNS ORDER
This code is based on code by Danny Leshem, ContextIn.

Command line arguments:
args[0] - path to csv input file
args[1] - cardinality of the matrix (number of columns)
args[2] - path to the resulting Mahout's SVD input file

Method:
The code below goes over the csv file and, for each matrix column, creates a SequentialAccessSparseVector which contains all the non-zero row entries for this column. Then it appends the column vector to the file.

Compilation:
Copy the java code below into a java file named Convert2SVD.java.
Add to your IDE project path both Mahout and Hadoop jars. Alternatively, a command line option for compilation is given below.

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.util.StringTokenizer;

    import org.apache.mahout.math.SequentialAccessSparseVector;
    import org.apache.mahout.math.Vector;
    import org.apache.mahout.math.VectorWritable;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.SequenceFile.CompressionType;

    /**
     * Code for converting CSV format to Mahout's SVD format
     * @author Danny Bickson, CMU
     * Note: I ASSUME THE CSV FILE IS SORTED BY THE COLUMN (NAMELY THE SECOND FIELD).
     *
     */

    public class Convert2SVD {

      public static int Cardinality;

      /**
       *
       * @param args[0] - input csv file
       * @param args[1] - cardinality (length of vector)
       * @param args[2] - output file for svd
       */
      public static void main(String[] args){

        try {
          Cardinality = Integer.parseInt(args[1]);
          final Configuration conf = new Configuration();
          final FileSystem fs = FileSystem.get(conf);
          final SequenceFile.Writer writer = SequenceFile.createWriter(fs, conf, new Path(args[2]), IntWritable.class, VectorWritable.class, CompressionType.BLOCK);

          final IntWritable key = new IntWritable();
          final VectorWritable value = new VectorWritable();

          String thisLine;

          BufferedReader br = new BufferedReader(new FileReader(args[0]));
          Vector vector = null;
          int from = -1, to = -1;
          int last_to = -1;
          float val = 0;
          int total = 0;
          int nnz = 0;
          int e = 0;
          int max_to = 0;
          int max_from = 0;

          while ((thisLine = br.readLine()) != null) { // while loop begins here

            StringTokenizer st = new StringTokenizer(thisLine, ",");
            while(st.hasMoreTokens()) {
              from = Integer.parseInt(st.nextToken())-1; //convert from 1 based to zero based
              to = Integer.parseInt(st.nextToken())-1; //convert from 1 based to zero based
              val = Float.parseFloat(st.nextToken());
              if (max_from < from) max_from = from;
              if (max_to < to) max_to = to;
              if (from < 0 || to < 0 || to > Cardinality || val == 0.0)
                throw new NumberFormatException("wrong data" + from + " to: " + to + " val: " + val);
            }

            //we are working on an existing column, set non-zero rows in it
            if (last_to != to && last_to != -1){
              value.set(vector);

              writer.append(key, value); //write the older vector
              e+= vector.getNumNondefaultElements();
            }
            //a new column is observed, open a new vector for it
            if (last_to != to){
              vector = new SequentialAccessSparseVector(Cardinality);
              key.set(to); // open a new vector
              total++;
            }

            vector.set(from, val);
            nnz++;

            if (nnz % 1000000 == 0){
              System.out.println("Col" + total + " nnz: " + nnz);
            }
            last_to = to;

          } // end while

          value.set(vector);
          writer.append(key,value); //write the last column
          e+= vector.getNumNondefaultElements();
          total++;

          writer.close();
          System.out.println("Wrote a total of " + total + " cols " + " nnz: " + nnz);
          if (e != nnz)
            System.err.println("Bug:missing edges! we only got" + e);

          System.out.println("Highest column: " +
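
The grouping step described under "Method" can be tried without Mahout or Hadoop on the classpath. The following is only a minimal sketch under stated assumptions, not the converter itself: the class ColumnGrouper and its method group are hypothetical names, plain java.util maps stand in for SequentialAccessSparseVector, and nothing is written to a SequenceFile. It assumes 0-based indices, as in the example csv; the posted converter subtracts 1 from each index, which suggests it expects 1-based input.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/**
 * Hypothetical helper (not part of Mahout): groups "row,col,value" triplets
 * into one sparse column per distinct column index.
 */
class ColumnGrouper {

    /**
     * @param lines csv triplets "row,col,value"; assumed 0-based and sorted by column
     * @return column index -> (row index -> non-zero value), in input column order
     */
    static Map<Integer, Map<Integer, Double>> group(List<String> lines) {
        Map<Integer, Map<Integer, Double>> columns = new LinkedHashMap<>();
        int lastCol = -1;
        for (String line : lines) {
            String[] parts = line.trim().split(",");
            if (parts.length != 3) {
                throw new IllegalArgumentException("expected row,col,value but got: " + line);
            }
            int row = Integer.parseInt(parts[0].trim());
            int col = Integer.parseInt(parts[1].trim());
            double val = Double.parseDouble(parts[2].trim());
            // Only non-zero entries with non-negative indices belong in a sparse column.
            if (row < 0 || col < 0 || val == 0.0) {
                throw new IllegalArgumentException("bad entry: " + line);
            }
            // The input is assumed to be sorted by column, so reject anything else.
            if (col < lastCol) {
                throw new IllegalArgumentException("input is not sorted by column: " + line);
            }
            // Open a new sparse column the first time its index is seen.
            columns.computeIfAbsent(col, c -> new TreeMap<>()).put(row, val);
            lastCol = col;
        }
        return columns;
    }

    public static void main(String[] args) {
        // The 3x3 example matrix from the post, as csv triplets.
        List<String> csv = Arrays.asList(
            "1,0,3.0", "2,0,-5.0",
            "0,1,1.0", "1,1,4.0", "2,1,6.2",
            "0,2,2.1", "1,2,5.0");
        for (Map.Entry<Integer, Map<Integer, Double>> column : group(csv).entrySet()) {
            System.out.println("column " + column.getKey() + " -> " + column.getValue());
        }
    }
}
```

Each map entry here corresponds to one (key, vector) pair that the converter would append to the output file.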
-
Re: Re: is there some place to study Singular Value Decomposition algorithms
Lance Norskog 2011-08-29, 08:02

'R' also has an svd implementation, directly in the base package.

There are a few answers to your question:
1) What is SVD? The video lecture above will help. Also, searching for 'singular value decomposition' on Baidu finds a lot of basic explanations.
2) Why do you want it? It creates in one pass a few different unique explanations of what is going on inside your dataset.
3) Mahout Distributed Matrix code, DistributedLanczos etc. are implementations specifically for large-scale problems. There are sub-parts of SVD that you may not need for your problem, and these jobs avoid some of the work.

Until you have a solid grasp of what SVD can tell you, there is no point trying the distributed mahout jobs. The SingularValueDecomposition class in Mahout has served me well in my research.

Lance
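
For a feel of what an SVD computes before trying any of the distributed jobs, the largest singular value of a small dense matrix can be estimated in a few lines. This is only a sketch under stated assumptions: it uses plain power iteration on A^T A, which is neither the Lanczos method nor what Mahout's SingularValueDecomposition class does internally, and the class name TopSingularValue and its methods are hypothetical.

```java
/**
 * Hypothetical teaching sketch (not Mahout code): estimates the largest
 * singular value of a small dense matrix by power iteration on A^T A.
 */
class TopSingularValue {

    /** Computes A x for a row-major matrix a. */
    static double[] multiply(double[][] a, double[] x) {
        double[] y = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < x.length; j++) {
                y[i] += a[i][j] * x[j];
            }
        }
        return y;
    }

    /** Computes A^T y without forming the transpose explicitly. */
    static double[] multiplyTransposed(double[][] a, double[] y) {
        double[] x = new double[a[0].length];
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < x.length; j++) {
                x[j] += a[i][j] * y[i];
            }
        }
        return x;
    }

    /** Euclidean norm of a vector. */
    static double norm(double[] x) {
        double sum = 0.0;
        for (double v : x) {
            sum += v * v;
        }
        return Math.sqrt(sum);
    }

    /** Estimates the largest singular value of a after the given number of iterations. */
    static double estimate(double[][] a, int iterations) {
        int cols = a[0].length;
        // Deterministic start vector; assumed not orthogonal to the top singular vector.
        double[] v = new double[cols];
        for (int j = 0; j < cols; j++) {
            v[j] = 1.0 / (j + 1);
        }
        double startNorm = norm(v);
        for (int j = 0; j < cols; j++) {
            v[j] /= startNorm;
        }
        for (int it = 0; it < iterations; it++) {
            // One step of power iteration: v <- (A^T A v) / ||A^T A v||
            double[] z = multiplyTransposed(a, multiply(a, v));
            double zNorm = norm(z);
            if (zNorm == 0.0) {
                return 0.0; // A v vanished, e.g. for the zero matrix
            }
            for (int j = 0; j < cols; j++) {
                v[j] = z[j] / zNorm;
            }
        }
        // For a unit right singular vector v, ||A v|| is the singular value.
        return norm(multiply(a, v));
    }

    public static void main(String[] args) {
        // A^T A = [[25, 20], [20, 25]] has eigenvalues 45 and 5,
        // so the singular values of this matrix are sqrt(45) and sqrt(5).
        double[][] a = {{3, 0}, {4, 5}};
        System.out.println("estimated largest singular value: " + estimate(a, 50));
    }
}
```

Power iteration only recovers the dominant singular value and vector; a full decomposition, or a Lanczos-style solver for large sparse matrices, recovers several at once.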
-
Re:Re: Re: is there some place to study Singular Value Decomposition algorithms
myn 2011-08-29, 11:00

Thanks a lot, that is quite a good example for my study.
-
Re:Re: Re: is there some place to study Singular Value Decomposition algorithms
myn 2011-08-29, 11:03

The best way is to read the source code. @_@
-
Re:Re:Re: Re: is there some place to study Singular Value Decomposition algorithms
myn 2011-08-29, 11:05

Thanks, everybody. Please excuse my Chinese English.
-
Re: Re: Re: is there some place to study Singular Value Decomposition algorithms
Dan Brickley 2011-08-29, 11:13

2011/8/29 myn <[EMAIL PROTECTED]>:
> the best way is to read the source code ;

Talking of "view source", has anyone taken a look at these (public domain) Javascript demos:

http://users.telenet.be/paul.larmuseau/SVD.htm / http://www.stasegem.be/shop2/SVD.htm

or http://metamerist.com/excanvas/example23a.htm

(I am thinking more for educational usage - interactive tutorials - than applications)

Could they maybe be useful repackaged, e.g. integrated with http://sylvester.jcoglan.com/ or http://glmatrix.googlecode.com/ ?

Dan
-
Re: Re: Re: is there some place to study Singular Value Decomposition algorithms
Jeff Hansen 2011-08-29, 12:38

Funny somebody mentioned the MIT Strang lectures from '99. I just spent the weekend watching the first 13. In case anybody doesn't like being restricted to watching them in their browser and doesn't have easy access to download them with iTunesU, you can download them directly from blip (where academic earth hosts their copies).

Also, if anybody else has trouble maintaining interest through the review, I recommend trying different speeds. I've just been using "mplayer -af scaletempo -speed 1.5 lecture.m4v".

I took a linear algebra class a few years ago, but failed to get much out of it because the first few weeks were a bit too easy and somewhere around week 6 or so I looked up from my French homework and realized I had no idea what he was talking about. One of these days the Universities will catch on and video lectures will revolutionize the educational landscape.

http://blip.tv/file/get/Aev264-1806Lec1702.m4v
http://blip.tv/file/get/Aev264-1806Lec2300.m4v
http://blip.tv/file/get/Aev264-1806Lec3845.m4v
http://blip.tv/file/get/Aev264-1806Lec4382.m4v
http://blip.tv/file/get/Aev264-1806Lec5103.m4v
http://blip.tv/file/get/Aev264-1806Lec6314.m4v
http://blip.tv/file/get/Aev264-1806Lec7969.m4v
http://blip.tv/file/get/Aev264-1806Lec8818.m4v
http://blip.tv/file/get/Aev264-1806Lec9210.m4v
http://blip.tv/file/get/Aev264-1806Lec10531.m4v
http://blip.tv/file/get/Aev264-1806Lec11902.m4v
http://blip.tv/file/get/Aev264-1806Lec12269.m4v
http://blip.tv/file/get/Aev264-1806Lec13711.m4v
http://blip.tv/file/get/Aev264-1806Lec14407.m4v
http://blip.tv/file/get/Aev264-1806Lec15761.m4v
http://blip.tv/file/get/Aev264-1806Lec16461.m4v
http://blip.tv/file/get/Aev264-1806Lec17485.m4v
http://blip.tv/file/get/Aev264-1806Lec18383.m4v
http://blip.tv/file/get/Aev264-1806Lec19562.m4v
http://blip.tv/file/get/Aev264-1806Lec20378.m4v
http://blip.tv/file/get/Aev264-1806Lec21301.m4v
http://blip.tv/file/get/Aev264-1806Lec22350.m4v
http://blip.tv/file/get/Aev264-1806Lec23986.m4v
http://blip.tv/file/get/Aev264-1806Lec24910.m4v
http://blip.tv/file/get/Aev264-1806Lec24b142.m4v
http://blip.tv/file/get/Aev264-1806Lec25100.m4v
http://blip.tv/file/get/Aev264-1806Lec26244.m4v
http://blip.tv/file/get/Aev264-1806Lec27954.m4v
http://blip.tv/file/get/Aev264-1806Lec28424.m4v
http://blip.tv/file/get/Aev264-1806Lec29320.m4v
http://blip.tv/file/get/Aev264-1806Lec30579.m4v
http://blip.tv/file/get/Aev264-1806Lec31319.m4v
http://blip.tv/file/get/Aev264-1806Lec32881.m4v
http://blip.tv/file/get/Aev264-1806Lec33862.m4v
http://blip.tv/file/get/Aev264-1806Lec34244.m4v