Ever since we moved Flink to its own profile, I have been thinking we ought
to do the same to H2O, but I haven't been too motivated because it was never
causing anyone any problems.
Maybe it's time to drop H2O "official support" and move Flink Batch / H2O
into a "mahout/community/engines" folder.
I've been doing a lot of Flink Streaming the last couple of weeks and have
already bootlegged a few of the "Algorithms" into Flink. Pretty sure we could
support those easily, and I _think_ we could do the same with the
distributed side (e.g. wrap a DataStream[(Key, MahoutVector)] and implement
the Operators on that).
I'd put FlinkStreaming as another community engine.
If we did that, I'd say that by convention we need a Markdown document in
mahout/community/engines that has a table of what is implemented on what.
That is to say, even if we were only able to implement the "algos" on Flink
Streaming, there would still be a lot of value in that for many
applications (especially considering the state of FlinkML). It also beats
having a half-cooked engine sitting on a feature branch.
Beam does something similar to that for their various engines.
Speaking of Beam, I've heard rumblings here and there of people talking
about making a Beam engine. This might motivate people to get started: no
one person has to feel responsible for "boiling the ocean" and throwing down
an entire engine in one go; instead, people can hack out the portions they
need.
On Tue, Sep 5, 2017 at 4:04 PM, Andrew Palumbo <[EMAIL PROTECTED]> wrote: