I've found a need for sorting a DRM as well as in-core matrices, e.g. something like DrmLike.sortByColumn(...). I would like to implement this at the engine-neutral math-scala level, with pass-through functions to the underlying back ends.
In-core would be engine neutral by current design (in-core matrices are all Mahout matrices, with the exception of h2o, which causes some concern).
For Spark, we can use RDD.sortBy(...).
For Flink, we can use DataSet.sortPartition(...).setParallelism(1). (There may be a better method; I will look deeper.)
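To make the pass-through idea concrete, here is a rough sketch in plain Scala. Everything in it is an assumption for illustration: Rows, SortableBackend, InCoreBackend and SortOps are hypothetical names, and plain Scala collections stand in for the real Mahout matrix and DRM types. A Spark or Flink back end would supply its own sortRows on top of the calls mentioned above.

```scala
// Hypothetical sketch: none of these names are part of the Mahout API.

// A row-keyed matrix modeled as (key, row) pairs, standing in for a DRM.
final case class Rows(rows: Vector[(Int, Vector[Double])])

// Back-end hook: each engine supplies its own way of sorting rows.
trait SortableBackend {
  def sortRows(m: Rows, col: Int, ascending: Boolean): Rows
}

// In-core reference implementation built on Scala collections.
object InCoreBackend extends SortableBackend {
  def sortRows(m: Rows, col: Int, ascending: Boolean): Rows = {
    // sortWith is stable, so rows with equal values keep their input order.
    val sorted =
      if (ascending) m.rows.sortWith((a, b) => a._2(col) < b._2(col))
      else m.rows.sortWith((a, b) => a._2(col) > b._2(col))
    Rows(sorted)
  }
}

// Engine-neutral entry point: validates the input, then delegates.
object SortOps {
  def sortByColumn(m: Rows, col: Int, backend: SortableBackend,
                   ascending: Boolean = true): Rows = {
    require(col >= 0 && m.rows.forall(_._2.length > col),
      s"column $col is out of range")
    backend.sortRows(m, col, ascending)
  }
}
```

Usage would look like SortOps.sortByColumn(m, 0, InCoreBackend). Keeping validation in the neutral layer means each back end only has to provide the sort itself.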
h2o has an implementation, I'm sure, but this brings me to a more important point: if we want to stub out a method in a back-end module, e.g. h2o, which test suites do we want to make a requirement?
We've not set any specific rules for which test suites must pass for each module. We've had a soft requirement for inheriting and passing all test suites from math-scala.
Setting a rule for this is something that we need to do, IMO.
An easy option that I'm thinking of would be to set the current core math-scala suites as a requirement, and then allow for an optional suite for methods which will be stubbed out.
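As a rough illustration of that required/optional split, here is a plain Scala sketch. All names in it (Backend, FullBackend, StubbedBackend, Suites) are hypothetical, and the trivial checks only stand in for the real math-scala suites. It assumes a back end can declare whether it implements the optional method, so the optional suite is skipped rather than failed when the method is stubbed out.

```scala
// Hypothetical sketch of the proposed policy, not existing Mahout code.

sealed trait Outcome
case object Passed extends Outcome
case object Failed extends Outcome
case object Skipped extends Outcome

trait Backend {
  def nrow(rows: Seq[Seq[Double]]): Int      // required operation
  def implementsSort: Boolean                // declares optional support
  def sortByColumn(rows: Seq[Seq[Double]], col: Int): Seq[Seq[Double]]
}

// A back end that implements everything.
object FullBackend extends Backend {
  def nrow(rows: Seq[Seq[Double]]): Int = rows.length
  val implementsSort: Boolean = true
  def sortByColumn(rows: Seq[Seq[Double]], col: Int): Seq[Seq[Double]] =
    rows.sortWith((a, b) => a(col) < b(col))
}

// A back end that stubs out the optional method.
object StubbedBackend extends Backend {
  def nrow(rows: Seq[Seq[Double]]): Int = rows.length
  val implementsSort: Boolean = false
  def sortByColumn(rows: Seq[Seq[Double]], col: Int): Seq[Seq[Double]] =
    throw new UnsupportedOperationException("sortByColumn is stubbed out")
}

object Suites {
  private val sample = Seq(Seq(3.0, 1.0), Seq(1.0, 2.0), Seq(2.0, 0.0))

  // Stands in for the core suites every back end must pass.
  def requiredSuite(b: Backend): Outcome =
    if (b.nrow(sample) == 3) Passed else Failed

  // Runs only when the back end declares support for the method.
  def optionalSortSuite(b: Backend): Outcome =
    if (!b.implementsSort) Skipped
    else if (b.sortByColumn(sample, 0).map(_.head) == Seq(1.0, 2.0, 3.0)) Passed
    else Failed

  // Acceptable: required suite passes and no optional suite fails.
  def acceptable(b: Backend): Boolean =
    requiredSuite(b) == Passed && optionalSortSuite(b) != Failed
}
```

Under this rule StubbedBackend is still acceptable, because the optional suite reports Skipped instead of Failed, while a back end that claims support but sorts incorrectly would be rejected.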