Nutch, mail # dev - Future of Nutch 2.0 [Was: Unresolved dependencies org.apache.gora#gora-hbase;0.1: not found in Nutch trunk]


Re: Future of Nutch 2.0 [Was: Unresolved dependencies org.apache.gora#gora-hbase;0.1: not found in Nutch trunk]
Julien Nioche 2011-09-16, 16:26
Am happy to call for a vote on the future of Nutch 2.0 if you want. Shall we
reduce the various options described before to a single one?

Julien

On 15 September 2011 19:55, Markus Jelsma <[EMAIL PROTECTED]> wrote:

>
> > Hi Guys,
> >
> > I thought I'd chime in on this thread. My comments below:
> > > I understand and share your frustration, however you need to bear in mind
> > > that things are done only if people volunteer and have time - usually
> > > taken from their holiday, weekends, evenings. Chris (who is the de facto
> > > release master for Nutch and Gora) has not had the time and nobody else
> > > has volunteered to do it.
> >
> > Yep I haven't had the time to push a Gora 0.1.1-incubating release that
> > will address the Maven issues. However it is on my roadmap for open source
> > stuff to get done in the next month, so that's a good thing. But yes, that
> > portion of my open source work is all volunteer time, so sometimes other
> > things take priority.
> >
> > >> As it happens, yesterday was the 1 year anniversary of the last
> > >> successful Hudson/Jenkins build...  If that actually worked, we could
> > >> point people towards it as a useful recipe for how to get a build
> > >> working off trunk.  I haven't been following Nutch too closely, but it
> > >> always strikes me as really odd, that there's a nightly build and it
> > >> doesn't bother anybody that it fails all the time (and that there
> > >> isn't a nightly build for the stable branches).
> > >
> > > The real issue behind all this is what we should do with Nutch 2.0. What
> > > follows is only my opinion and I would love to hear what others have to
> > > say on this subject.
> > >
> > > Since we (actually mostly Dogacan) wrote 2.0 and delegated the storage to
> > > Gora, the latter hasn't really taken off since incubation. There have
> > > been some modest contributions to it but it does not seem to be used
> > > much and there is virtually nothing happening on it in terms of
> > > development. More worryingly, the people who initially contributed to it
> > > are not very active on the project (such is life, new jobs, different
> > > projects, etc...) anymore. As for Nutch 2.0, it hasn't made any
> > > progress in the last 12 months: we still have the same bugs, the tests
> > > do not work, the build has to be done manually etc...
> >
> > Yep.
> >
> > > At the same time, there has been a new lease of life into Nutch as a
> > > whole : there is definitely more activity on the mailing lists, new
> > > users, new active committers  etc... and quite a few bugfixes and
> > > improvements - most of them backported from what had been done in the
> > > trunk and people seem fairly happy with what we can do with 1.4
> >
> > Totally agreed. I'm actually not super surprised -- ever since 1.1, I kind
> > of felt that maintaining a stable 1.X branch of Nutch (in parallel to the
> > 2.0 efforts) was really going to pay off since there was renewed interest
> > from users in leveraging (and furthermore accepting) the nuances of 1.X.
> >
> > > So the question is : what shall we do with 2.0? Here are a few
> > > possibilities
> > >
> > >
> > > a) put some effort into it, fix the bugs and make so that it can be used
> > > instead of 1.x
> > > b) shelve it and leave it for enthusiasts to play with + make 1.x the
> > > trunk again
> > > c) do nothing : keep 2.0 and 1.x in parallel (but having to maintain two
> > > branches is quite a pain)
> > > d) abandon the idea of a neutral storage layer with Gora and hardwire it
> > > to e.g. HBase
> > >
> > > Option (a) has not happened in the last 12 months and I am not very
> > > hopeful about it.
> > >
> > > What do you guys think?
> >
> > I'd suggest an option e). Evolve and keep releasing 1.X over the next 6
> > months, and keep 2.0 in the trunk. After 6 months, see how close 1.X is to
> > actually being 2.0 (e.g., did we release a 1.4, a 1.5, a 1.6?) If we get
> > to ~1.6 over the next 6 months and there is still no active development

Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com