Mahout, mail # dev - Towards 1.0 - Defining backwards compatibility guarantees


Re: Towards 1.0 - Defining backwards compatibility guarantees
Isabel Drost 2011-11-01, 08:04
On 31.10.2011 Jeff Eastman wrote:
> I think users would benefit a lot by 1) to 3) and would be dismayed if we
> could not maintain data consistency between releases
> (maybe just point releases?).

Good point, and one I forgot to define in the original mail: the level of
backwards compatibility should depend on which type of release is being built.

> 4) and 5) are related and it is a question which is more important if we
> can't do both.

I think for minor releases we should do both. However, it might be easier to do
4) if we could restrict it to a subset of the code, namely the code that is
intended to be used by external code.

> Since a lot of users are using the CLI I think backwards
> compatibility is pretty important there. This is especially the case for
> the MiA examples. The book is really our user manual and many people will
> be turned off if gratuitous API changes make the book obsolete as a
> learning tool. Of course, the book has plenty of API usage examples which
> need to keep compatibility too.
>
> Our 1.0 release will have a lot of solid implementations of scalable
> machine learning software, but everything is not at the same level of
> maturity. I think it is critical that we adopt a maturity scheme so that
> we can realistically make changes to evolving algorithms while making
> reasonable guarantees about stable code. Moving still-evolving
> implementations to a separate source tree would certainly make their
> status visible, but I wonder about the mechanics: do we need a parallel
> contrib universe (with math, core, integration, examples subtrees?) or
> would the annotations work better? I kind of favor the annotations as the
> former seems like too much dependency plumbing.

Personally, I am currently quite undecided here. Annotations have the
advantage of keeping everything in one source tree and module.
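
To make the annotation option a bit more concrete, here is a minimal sketch of
what such a maturity marker could look like. All names in it (MaturityDemo,
Maturity, Level and the example classes) are hypothetical and only for
illustration; none of them exist in Mahout. Treating unmarked classes as
experimental is also just an assumption of this sketch, not an agreed policy.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical sketch: none of these names are part of Mahout.
public class MaturityDemo {

    // Possible maturity levels; the exact set is an assumption.
    enum Level { STABLE, EVOLVING, EXPERIMENTAL }

    // Marker that shows up in Javadoc and is readable at runtime.
    @Documented
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface Maturity {
        Level value();
    }

    // Example: an implementation covered by compatibility guarantees.
    @Maturity(Level.STABLE)
    static class StableClusterer { }

    // Example: an implementation that may still change between releases.
    @Maturity(Level.EXPERIMENTAL)
    static class ExperimentalClassifier { }

    // Example: a class nobody has classified yet.
    static class Unmarked { }

    // Returns the declared level; unmarked classes get no guarantee
    // (a design choice of this sketch).
    static Level levelOf(Class<?> type) {
        Maturity marker = type.getAnnotation(Maturity.class);
        return marker == null ? Level.EXPERIMENTAL : marker.value();
    }

    public static void main(String[] args) {
        System.out.println(levelOf(StableClusterer.class));
        System.out.println(levelOf(ExperimentalClassifier.class));
        System.out.println(levelOf(Unmarked.class));
    }
}
```

With runtime retention, a build check or a documentation generator could list
the stable classes without moving any code into a separate module.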

Keeping stuff in contrib modules could give us the chance to lower the bar to
committership substantially. The question is, does that really work out well,
or will it just cause overhead and trouble? Is there any experience from the
Lucene world that we can build on here?

> And, of course, defining the content of 1.0 is still something we need to
> do. That is a separate thread TBD.

+1 for taking one step at a time.

Isabel