-
Mahout as TLP - Grant Ingersoll, 2010-02-12, 22:44
As many of you know, Mahout has been growing pretty quickly and has also reached a critical mass. I, along with some others in the Mahout community, feel it would make sense for Mahout to become a TLP. With this in mind, I've submitted a proposal to the Lucene PMC to ask the board to make Mahout an Apache TLP. One piece of feedback from the PMC was a question as to whether this has been discussed in the community and whether the community is for it. I know it's been brought up tangentially in the past (see [1], [2], [3]) and there wasn't any disagreement, but it seems it warrants a more formal discussion.

I see the following pros:
1. We'd like to organize several subprojects we wish to introduce (Core, NLP, Recommenders/Taste, Ports - C++, etc.) that wouldn't really fit as Lucene subprojects.
2. I also think that, longer term, while Machine Learning and Search are often related, neither requires the other, and Mahout would be better aligned with a narrower focus on Machine Learning only.
3. The PMC can be more narrowly focused on Mahout and its needs and will be better informed about Mahout's contributors, etc.

Cons:
1. Lucene has a very strong brand and I have no doubt that Mahout benefits from that association.
2. Changing mailing lists, etc. is a bit of a hassle (mostly for infrastructure), but not that big of a deal. Still, Lucene is well established and well-run, so sometimes inertia is a good thing.

At the end of the day, I'm +1.

[1] http://search.lucidimagination.com/search/document/a6e03af2952ff196/possible_contribution_at_somewhat_of_a_tangent_to_mahout#5a41be454d503779
[2] http://search.lucidimagination.com/search/document/40c4c4ec11ca07b5/mi_clustering#7197ef846b384e4e
[3] http://search.lucidimagination.com/search/document/1817a5e65c83bae3/proposing_a_c_port_for_apache_mahout#8e4e8eabc945264d
-
Re: Mahout as TLP - Dawid Weiss, 2010-02-12, 23:22
> 1. We'd like to organize several subprojects we wish to introduce (Core, NLP, Recommenders/Taste, Ports - C++, etc.) that wouldn't really fit as Lucene subprojects.

And the collections package, vectors, verification and evaluation code, potential test data sets... yes, makes sense to make it a TLP. I don't think Lucene folks will mind -- it's not like Mahout is going to depart from using Lucene/Hadoop, etc.

Not that my voice counts much here, but +1 to the idea.

Dawid
-
Re: Mahout as TLP - Ted Dunning, 2010-02-12, 23:26
I am a bit ambivalent, but net +1 on this. The deciding factor for me is that it makes it easier to express the sub-projects.

-- Ted Dunning, CTO DeepDyve
-
Re: Mahout as TLP - Jake Mannix, 2010-02-13, 00:39
What are your ambivalencies, Ted? I'm a little split myself, but all of my "cons" are very fuzzy and hard to articulate (mainly around timing).

Could you spell out why your +1 is any weaker than it could be?

-jake
-
Re: Mahout as TLP - Ted Dunning, 2010-02-13, 00:57
My ambivalence has to do with uncertainties, mostly. I don't have a clear idea of what will change. It seems like very little, but there is some overhead.

It still seems like a good move regardless of what I don't know.

-- Ted Dunning, CTO DeepDyve
-
Re: Mahout as TLP - Benson Margulies, 2010-02-13, 02:19
TLP-itude means the following:

1) Mahout has its own PMC. That group will vote on committers, releases, and other legal issues.

Funny, it's a short list, isn't it?

There are many things we might want to do that will be easier to organize if it's just 'us chickens' that have to decide, not that the existing Lucene PMC has been obstructionist. Of course, I write 'us chickens' when I'm likely to be just a committer due to my relatively recent arrival, but you know what I mean.

One particularly attractive idea is apparently to go create subprojects, since Apache doesn't do sub-sub projects. In the short term, I'd counsel against a rush to subdivide, but that's just me.
-
Re: Mahout as TLP - Ted Dunning, 2010-02-13, 03:10
Presumably one of the benefits of this will be fewer +0 votes on Mahout issues due to fewer Lucene-centric folks who don't follow our machinations.

On Fri, Feb 12, 2010 at 6:19 PM, Benson Margulies <[EMAIL PROTECTED]> wrote:
> 1) Mahout has its own PMC. That group will vote on committers, releases, and other legal issues.

-- Ted Dunning, CTO DeepDyve
-
Re: Mahout as TLP - Jake Mannix, 2010-02-13, 04:00
So I'm strongly in favor of getting to decide our own destiny, so in that sense I'm very much a +1 for this. Ditto for the option to create sub-projects. Then there's the simple fact that we are not in any real way a project that *belongs* as part of "Lucene" in the long run.

What makes me ambivalent is: how many really active developers do we have to support the administrative tasks around running a TLP? We certainly have momentum in terms of interest and codebase. But should a project only at the 0.2 stage (soon to be 0.3) be a TLP? I'm just wondering if we're just giving ourselves more work.

From a practical standpoint, does doing this now, as opposed to later, make our lives easier or harder? Clearly it must be done at some point, but doing it now has what effect, really?

-jake
-
Re: Mahout as TLP - Kay Kay, 2010-02-13, 08:55
As a lurker in this community and an active user myself, I'm expressing my opinion, for whatever it is worth.

I am happy with the decoupling of ML from Search, with the former warranting separate attention. So, +1 on this happening eventually so that Mahout is more independent, but my reservation has to do with the timing and specifically the versioning: how soon would a 1.0 release be feasible once this becomes a TLP?
-
Re: Mahout as TLP - Grant Ingersoll, 2010-02-13, 12:45
All valid points by the many who have responded. Thanks!

When I woke up this morning, I thought maybe we should postpone until 0.3 is out, so it is good to see this expressed here as well.

As for concerns about overhead, infra@ will take care of most of the heavy lifting (new mailing lists, migrating everyone over to the new ones). We would need to move our website and put up a redirect, but that is trivial. We'd also have to move our SVN, but that is trivial as well.

At the PMC level, the ASF seems to vary quite a bit here, AFAICT. Lucene is pretty low key and very low volume and the subprojects pretty much run themselves. I would suspect that Mahout would be the same given our roots.

Another thought is that we time it w/ a 1.0 release and come in with a big bang including press releases, etc. On the other hand, if we do it sooner (after 0.3), we can do two press releases, one for the move and one for the 1.0 release. This would give more exposure overall.

Finally, it's not clear the ASF likes lots of subprojects, so we'd need to be careful there. Either that or we just have all committers be committers across all the subs. Then again, it probably isn't a huge deal. Lucene and Hadoop are the two primary examples of projects w/ subs and they are both well run, successful projects.

In the end, I still am +1, but think it makes sense to wait until after 0.3. Besides, since the next board meeting is Wednesday, this will give us more time to think about it.

-Grant
-
Re: Mahout as TLP - Ted Dunning, 2010-02-13, 18:36
+1 to waiting.

On Sat, Feb 13, 2010 at 4:45 AM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
> In the end, I still am +1, but think it makes sense to wait until after 0.3. Besides, since the next board meeting is Wednesday, this will give us more time to think about it.

-- Ted Dunning, CTO DeepDyve
-
Re: Mahout as TLP - Drew Farris, 2010-02-13, 19:42
I can't say that I really understand the issues (if there are any) of the Mahout project running under Lucene's PMC vs. a Mahout PMC, but it sounds like that would be a big factor in deciding whether the project should be migrated to its own TLP, e.g., if Mahout discussions took up a significant portion of the Lucene PMC meetings.

It doesn't sound like much of anything else would change from the perspective of a user or new contributor like myself. From my vantage point I don't see much impetus to change at this very moment. If we end up getting something that looks like a solid start on a C++ port, it might be worth revisiting, but even then that could start off as a submodule.

http://mahout.apache.org would be easier to remember :)

At the end of the day, I think I'm +1 for waiting as well.

Drew
-
Re: Mahout as TLP - Benson Margulies, 2010-02-13, 20:20
The ongoing admin is really no big deal. The PMC has to report to the board once a month. As Grant noted, the initial work is mostly a gift from infra.

I don't see any harm in getting 0.3 out first if that makes folks more comfortable.
-
Re: Mahout as TLP - Grant Ingersoll, 2010-02-13, 20:36
On Feb 13, 2010, at 3:20 PM, Benson Margulies wrote:
> The ongoing admin is really no big deal. The PMC has to report to the board once a month.

Once a quarter normally.

> As Grant noted, the initial work is mostly a gift from infra.
>
> I don't see any harm in getting 0.3 out first if that makes folks more comfortable.

Yeah, this feels better to me the more I think about it.
-
Re: Mahout as TLP - Isabel Drost, 2010-02-15, 10:17
On Sat Grant Ingersoll <[EMAIL PROTECTED]> wrote:
> > I don't see any harm in getting 0.3 out first if that makes folks more comfortable.
>
> Yeah, this feels better to me the more I think about it.

+1 from me as well: I really like the idea of Mahout becoming a TLP - even before a 1.0 release is available.

However, I think it makes sense to sort out the 0.3 release first. If I am counting correctly, that would make for three reasons for press releases: A new release, Mahout becoming a TLP and later on a 1.0 release. ;)

Isabel
-
Re: Mahout as TLP - Robin Anil, 2010-02-15, 10:30
+1
-
Re: Mahout as TLP - Jeff Eastman, 2010-02-15, 12:08
+1 on Isabel's comments.