High Dimensional Datasets for Binary Classification
praneet mhatre 2012-05-09, 03:06
Hi All / Ted,
I searched the mailing list first, since similar questions have been
asked before, but couldn't find what I was looking for.
Quick background - I have been working on higher order learning algorithms
(Feature Sharding to be specific) for some time. While getting this stuff
into Mahout will require some solid progress on the Pig/Mahout integration
front among other things, I have been exploring how vertical sharding
generally affects classifier performance using some simple code I've
written in Weka.
Most of my studies so far have used moderate-dimensional datasets.
Could someone please suggest some high or very high dimensional datasets
that are suitable for binary classification and freely available?
Donald Bren School of ICS
University of California, Irvine