-
Maven artifacts for Lucene.*Sami Siren 2007-04-10, 08:20
I have been hoping to put up a mechanism for (easier) deployment of m2 artifacts to maven repositories (both the Apache snapshot repository and the main maven repository at ibiblio).

The most convenient way would be to use maven2 to build the various lucene projects, but as the mailing list conversation about this subject indicates, there is no common interest in changing the (working) ant based build system to a maven based one.

The next best thing IMO would be to use the ant build as normal for the non maven2 releases and use maven2 for building the maven releases (.jar files, optionally also packages for the sources used to build the binary and packages for javadocs) with related checksums and signatures.

To repeat it one more time: what I am proposing here is not meant to replace the current solid way of building the various Lucene projects - I am just trying to provide a convenient way to deploy the release artifacts to maven repositories.

I have put together an initial set of poms (for lucene-java) to do this quite easily. Basically what is required is an installation of the maven2 binaries, the set of pom files and a checkout of the lucene version to build. The various jars are built, packaged, checksummed, signed and optionally deployed with a single mvn command. So IMO it is quite an easy thing to do in addition to the normal release process. I can also, for an undefined time, volunteer to do these builds if it is too much of a burden for RMs.

There are however a couple of things I need your opinion about (or at least attention):

1. There are differences compared to the ant built jars: (due to the release policy of a.o) the built jars will contain LICENSE.txt and NOTICE.txt in /META-INF. Is this a problem?

2. I propose that we add an additional folder level so the groupId for lucene java would be org.apache.lucene.java (it is now org.apache.lucene within the currently released artifacts). The initial list of artifacts (the new proposed structure) is listed below:

groupId:org.apache.lucene
  lucene-parent (pom) (a top level pom defining lucene wide stuff that gets inherited by sub project modules)

groupId:org.apache.lucene.java
  java-parent (pom)
  lucene-core (jar)
  lucene-demos (jar)
  contrib-parent (pom)
  lucene-analyzers (jar)
  lucene-benchmark (jar)
  lucene-highlighter (jar)
  lucene-misc (jar)
  lucene-queries (jar)
  lucene-regex (jar)
  lucene-snowball (jar)
  lucene-spellchecker (jar)
  lucene-surround (jar)
  lucene-swing (jar)
  lucene-wordnet (jar)
  lucene-xml-query-parser (jar)

groupId:org.apache.lucene.nutch (TODO)
  nutch-parent (pom)
  nutch-core (jar)
  nutch-plugins (pom)
  nutch-plugin-x (jar) (as soon as nutch plugins can be of format .jar)
  ...

groupId:org.apache.lucene.hadoop (TODO)
  hadoop-parent (pom)
  hadoop-core (jar)
  hadoop-streaming (jar)
  ...

groupId:org.apache.lucene.solr (TODO)
  solr-parent (pom)
  solr-core (jar)
  ...

3. Where to put the poms? They need to be put somewhere. I think it's not smart at this point to pollute the ant driven folder structure with poms - they are better off in a separate dir structure. What is (in your opinion) the most convenient place for them? I would propose that every sub project have a dir named maven (or something similar) that would contain the poms for that particular sub project. Another possibility would be a lucene level dir for maven stuff, with the poms maintained there.

The text above was my initial thought about this, however there have been concerns that the procedure described here might not be the most optimal one. So far the arguments have been the following:

1. Two build systems to maintain

True. However I don't quite see that as so black and white: you would anyway need to maintain the poms manually (if you care about the quality of the poms) or you would have to build some mechanism to generate them. Of course in a situation where you would not actually build with maven the poms could be a bit simpler.

2. Two build systems producing different jars, would maven2 releases require a separate vote?

Yes, the artifacts (jars) would be different, because you would need to add LICENSE and MANIFEST into them (because of apache policy). I don't know about the vote. How do other projects deal with this kind of situation, anyone here to tell? One solution to the jar mismatch would be changing the ant build to put those files in the produced jars.

3. Additional burden for RM, need to run additional command, install maven

There will be that extra step for doing the maven release and you need to install maven also. But compare that to the current situation, where you would have to extract the jars, put some more files into them, sign them, modify the poms to reflect the correct version numbers, and upload them to the repositories manually.

The other way to do it would be changing the current build system to be more maven friendly. This would probably mean the following things:

-add poms for artifacts into svn repository (where?)
-add LICENSE and NOTICE into jars
-add ant target to
  -sign jars
  -push artifacts into staging dir or to repository (or leave it as additional manual step)
  -optionally build javadoc jars (if currently not done)
  -optionally build source jars (if currently not done)

Are there any technical arguments in favour of or against the proposed solutions, or is there perhaps one that outperforms them both?

Please share your thoughts about all this :)

Sami Siren
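[Editor's sketch] To make point 2 of the proposal concrete, the snippet below shows how a groupId, artifactId and version map onto a path in a Maven 2 layout repository, so the effect of moving from org.apache.lucene to org.apache.lucene.java is visible. The function name and the version number 2.1.0 are illustrative assumptions, not something taken from the thread.

```python
def repository_path(group_id, artifact_id, version,
                    packaging="jar", classifier=""):
    """Relative path of an artifact in a Maven 2 layout repository.

    Dots in the groupId become directory separators, followed by the
    artifactId and version directories and the versioned file name.
    """
    suffix = "-" + classifier if classifier else ""
    directory = "/".join([group_id.replace(".", "/"), artifact_id, version])
    return "{}/{}-{}{}.{}".format(
        directory, artifact_id, version, suffix, packaging)


if __name__ == "__main__":
    # Same artifact under the current and the proposed groupId.
    for group_id in ("org.apache.lucene", "org.apache.lucene.java"):
        print(repository_path(group_id, "lucene-core", "2.1.0"))
```

Because the groupId is part of the path, artifacts published under a new groupId sit next to the existing ones and do not overwrite them.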
-
Re: Maven artifacts for Lucene.*Grant Ingersoll 2007-04-11, 13:33
Initial thoughts and then more inline below, and keep in mind I long ago drank the Maven kool-aid and am a big fan. :-)

I know it is a pain to a few, but realistically speaking there has not been all that much noise about Maven artifacts not being available. We use Maven for everything we do and all I ever do when there is a new release of Lucene is put the new jars in our remote repository and everything works. It takes two or three steps and about 5 minutes of my time, and would be less if I scripted it. I frankly don't get what the big deal is. OK, it does save a few bytes on a server somewhere and we have our own group/artifact names (lucene/lucene), but chances are it is more reliable than trying to get it from one of the mirrors and it will be faster and the names are clear cut and easy to remember. I would venture anyone with Maven installed has their own repository to store their own, private, artifacts, so it isn't like they need to add some new, complex process.

On Apr 10, 2007, at 4:20 AM, Sami Siren wrote:

> I have been hoping to put up mechanism for (easier) deployment of m2
> artifacts to maven repositories (both Apache snapshot repository and the
> main maven repository at ibiblio).
>
> The most convenient way would be to use maven2 to build the various lucene
> projects but as the mailing list conversation about this subject
> indicates there is no common interest for changing the (working) ant based
> build system to a maven based.
>
> The next best thing IMO would be using ant build as normally for the non
> maven2 releases and use maven2 for building the maven releases (.jar
> files, optionally also packages for sources used to build the binary and
> packages for javadocs) with related check sums and signatures.
>
> To repeat it one more time: what I am proposing here is not meant to replace
> the current solid way of building the various Lucene projects -
> I am just trying to provide a convenient way to make the release artifacts
> to be deployed to maven repositories.
>

Couldn't we just add various ANT targets that package the jars per the Maven way, and even copy them to the appropriate places? I wonder how hard it would be to have ANT output the POM and create Maven Jars. I know it is backwards, but, it would be less complicated and could be hooked right into the ANT script and require very little from the RM.

Ideally, I would love to see the release process automated so that it became push button (I know Maven goes a long way toward this).

>
> There are however couple of things I need your opinion about (or at least
> attention):
>
> 1. There are differences when comparing to ant build jars (due to release
> policy of a.o) the built jars will contain LICENSE.txt,
> NOTICE.txt in /META-INF. Is this a problem?

Does this just mean it would be in two places? I don't think that is a big deal.

>
> 2. I propose that we add additional folder level so the groupId for lucene
> java would be org.apache.lucene.java (it is now org.apache.lucene
> within the currently released artifacts). The initial list of artifacts (the
> new proposed structure) is listed below:

If I'm understanding correctly, you want to change the whole package structure by adding one more level? Wouldn't this break every single user of Lucene? We are still on M1 but are in the process of migrating, which is not straightforward, but, alas, the writing is on the wall concerning M1.

....

>
> The text above was my initial thought about this, however there have been
> concerns that the procedure described here might not be most optimal one. So
> far the arguments have been the following:
>
> 1. Two build systems to maintain
>
> True. However I don't quite see that so black and white: You would anyway
> need to maintain the poms manually (if you care about the quality of poms)
> or you would have to build some mechanism to build those. Of course in

I think that would be fine to unify the jars.

Grant Ingersoll
Center for Natural Language Processing
http://www.cnlp.org

Read the Lucene Java FAQ at http://wiki.apache.org/jakarta-lucene/LuceneFAQ
-
Re: Maven artifacts for Lucene.*Sami Siren 2007-04-11, 15:02
Grant Ingersoll wrote:
> Initial thoughts and then more inline below, and keep in mind I long ago
> drank the Maven kool-aid and am a big fan. :-)
>
> I know it is a pain to a few, but realistically speaking there has not
> been all that much noise about Maven artifacts not being available. We
> use Maven for everything we do and all I ever do when there is a new
> release of Lucene is put the new jars in our remote repository and
> everything works. It takes two or three steps and about 5 minutes of my
> time, and would be less if I scripted it. I frankly don't get what the
> big deal is. OK, it does save a few bytes on a server somewhere and we
> have our own group/artifact names (lucene/lucene), but chances are it is
> more reliable than trying to get it from one of the mirrors and it will
> be faster and the names are clear cut and easy to remember. I would
> venture anyone with Maven installed has their own repository to store
> their own, private, artifacts, so it isn't like they need to add some
> new, complex process.

Yes, it is true that many organizations use internal repositories (at least from what I've seen), heck, even every developer has one (.m2/repository by default), but IMO lots of the benefits of maven are lost if that's the way users at large utilize maven.

Just as you are solving the problem for your organization by deploying lucene into your private repository (on behalf of the developers using your local repository), I would like to solve the problem more globally. Eventually you could spend that 5 mins of your time doing some more lucene magic ;)

>> The next best thing IMO would be using ant build as normally for the non
>> maven2 releases and use maven2 for building the maven releases (.jar
>> files, optionally also packages for sources used to build the binary and
>> packages for javadocs) with related check sums and signatures.
>>
>> To repeat it one more time: what I am proposing here is not meant to replace
>> the current solid way of building the various Lucene projects -
>> I am just trying to provide a convenient way to make the release artifacts
>> to be deployed to maven repositories.
>>
>
> Couldn't we just add various ANT targets that package the jars per the
> Maven way, and even copy them to the appropriate places? I wonder how
> hard it would be to have ANT output the POM and create Maven Jars. I know it is
> backwards, but, it would be less complicated and could be hooked right
> into the ANT script and require very little from the RM.

For me it's more important to get where I am going than to fix on some detail of how I get there. So I would be a very happy man if one way were adopted by the lucene community.

>
> If I'm understanding correctly, you want to change the whole package
> structure by adding one more level? Wouldn't this break every single
> user of Lucene? We are still on M1 but are in the process of
> migrating, which is not straightforward, but, alas, the writing is on
> the wall concerning M1.

We wouldn't touch the existing single maven artifact in the repository, we would just deploy the new artifacts under a different gId, so nothing existing is broken on the way. We could of course continue publishing under gId 'org.apache.lucene' if so decided, but I think it's clearer if the subprojects are under different gIds.

--
Sami Siren
---------------------------------------------------------------------
-
Re: Maven artifacts for Lucene.*Grant Ingersoll 2007-04-11, 15:54
On Apr 11, 2007, at 11:02 AM, Sami Siren wrote:

> We wouldn't touch the existing single maven artifact in the repository,
> just would deploy the new artifacts under different gId, nothing
> existing is broken on the way. We could of course continue publishing
> under gId 'org.apache.lucene' if so decided but I think it's more clear
> if the subprojects are under different gId.
>

I was confused on the subject. I thought you were talking about the source for some reason, but you mean the structure for the artifacts on the servers. Like I said, I'm not fully up on M2 yet. And, honestly, the lack of a good migration plan from M1 to M2 leaves a very small, but bitter taste in my mouth, especially when it comes to existing jelly scripts.
---------------------------------------------------------------------
-
Re: Maven artifacts for Lucene.*Chris Hostetter 2007-04-11, 17:50
: Couldn't we just add various ANT targets that package the jars per
: the Maven way, and even copy them to the appropriate places? I
: wonder how hard it would be
: to have ANT output the POM and create Maven Jars. I know it is

This is what i would view as the ideal situation ... a patch to the current ant build.xml that caused the package-all-binary and package-all-src to produce a new maven directory with everything we need to copy to the maven repository would be the best way to get people to get on board -- it's hard to complain about something that requires no effort to adopt.

if that same patch included a new ant target named something like "publish-maven" which required key access to people.apache.org but took care of pushing the maven artifacts into the exact right spot, that would be one less thing people would have to worry about.

: > 1. There are differences when comparing to ant build jars (due to
: > release
: > policy of a.o) the built jars will contain LICENSE.txt,
: > NOTICE.txt in /META-INF. Is this a problem?

we should under no circumstances have two different jars calling themselves "lucene-core-X.Y.0.jar" with different md5 sums ... that's asking for a world of pain. Fortunately there is an easy fix for this: start putting the LICENSE.txt and NOTICE.txt files in the jar ... i think there's already a patch for this floating around in Jira.

: > 2. I propose that we add additional folder level so the groupId for
: > lucene
: > java would be org.apache.lucene.java (it is now org.apache.lucene

I don't really see the advantage of this ... Lucene Java has always had the *java* package org.apache.lucene, and my understanding was that maven groupIds should attempt to match the java package structure of the code. likewise the other java subprojects have their own java packages, shouldn't their groupIds match their package structures?

As a side note: nothing discussed here really has any bearing on the other Lucene sub-projects, the individual project communities need to discuss any changes to their build processes/policies .. but starting with Lucene Java is probably the right way to go ... if a simple solution is found for our build file, it will probably lend itself to similar solutions for the other Lucene projects that use ant.

-Hoss
---------------------------------------------------------------------
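[Editor's sketch] The concern about two jars with the same name but different md5 sums, and the suggested fix of shipping LICENSE.txt and NOTICE.txt inside the jar, can both be checked mechanically. The following is a minimal sketch using only the Python standard library. The function names and the entry paths under META-INF are assumptions based on the wording in the thread, not part of any Lucene build script.

```python
import hashlib
import zipfile

# Entries the thread says the jars should carry (paths assumed).
REQUIRED_ENTRIES = ("META-INF/LICENSE.txt", "META-INF/NOTICE.txt")


def md5_of(path):
    """Hex md5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def missing_entries(jar_path, required=REQUIRED_ENTRIES):
    """Return the required entries that are absent from the jar."""
    with zipfile.ZipFile(jar_path) as jar:
        names = set(jar.namelist())
    return [entry for entry in required if entry not in names]


def same_artifact(jar_a, jar_b):
    """True only when both files are byte-for-byte identical."""
    return md5_of(jar_a) == md5_of(jar_b)
```

Note that a jar rebuilt from identical sources may still get a different md5 sum, because zip entries record timestamps. That is one reason to publish the very jar that the normal release build produced rather than a second build.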
-
Re: Maven artifacts for Lucene.*Sami Siren 2007-04-11, 18:37
Chris Hostetter wrote:
> : Couldn't we just add various ANT targets that package the jars per
> : the Maven way, and even copy them to the appropriate places? I
> : wonder how hard it would be
> : to have ANT output the POM and create Maven Jars. I know it is
>
> This is what i would view as the ideal situation ... a patch to the
> current ant build.xml that caused the package-all-binary and
> package-all-src to produce a new maven directory with everything we need
> to copy to the maven repository would be the best way to get people to get
> on board -- it's hard to complain about something that requires no effort
> to adopt.

Where and how would you store, for example, the dependency information that you would be using to generate the poms? For lucene java it is easy for most modules as there is only a dependency on lucene-core, but for example in solr, nutch and hadoop it starts to go beyond trivial.

>
> if that same patch included a new ant target named
> something like "publish-maven" which required key access to
> people.apache.org but took care of pushing the maven artifacts into the
> exact right spot, that would be one less thing people would have to worry
> about.
>
> : > 1. There are differences when comparing to ant build jars (due to
> : > release
> : > policy of a.o) the built jars will contain LICENSE.txt,
> : > NOTICE.txt in /META-INF. Is this a problem?
>
> we should under no circumstances have two different jars calling
> themselves "lucene-core-X.Y.0.jar" with different md5 sums ... that's
> asking for a world of pain. Fortunately there is an easy fix for this:
> start putting the LICENSE.txt and NOTICE.txt files in the jar ... i think
> there's already a patch for this floating around in Jira.
>
> : > 2. I propose that we add additional folder level so the groupId for
> : > lucene
> : > java would be org.apache.lucene.java (it is now org.apache.lucene
>
> I don't really see the advantage of this ... Lucene Java has always had
> the *java* package org.apache.lucene, and my understanding was that
> maven groupIds should attempt to match the java package structure of the
> code. likewise the other java subprojects have their own java packages,
> shouldn't their groupIds match their package structures?

I believe you are right, and I also think it makes more sense to match the package names.

>.. but starting with Lucene
> Java is probably the right way to go ... if a simple solution is found
> for our build file, it will probably lend itself to similar solutions for
> the other Lucene projects that use ant.

Yes, that is my hope also, and the main motivation for starting the discussion from lucene java (as it is also a dependency of 2 more sub projects). IMO we should however try to look at the big picture too, and not only try to solve the minimal part to get it out of lucene-java's hands, because I am afraid that if only the minimum is done here in lucene-java there might be gaps to fill in other projects, and the way things are done here might not be usable in other sub projects as it is.

--
Sami Siren
---------------------------------------------------------------------
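[Editor's sketch] One possible answer to the question of where the dependency information could live is a small table kept next to the build file, from which the poms are generated. The sketch below renders such a table into a minimal POM with the Python standard library. The table contents, the function name and the version number are hypothetical, and the generated POM omits the XML namespace and everything beyond coordinates and dependencies.

```python
import xml.etree.ElementTree as ET

# Hypothetical dependency table: module name -> modules it depends on.
DEPENDENCIES = {
    "lucene-core": [],
    "lucene-highlighter": ["lucene-core"],
}


def build_pom(group_id, artifact_id, version, dependencies):
    """Render a minimal POM as a string (coordinates and dependencies only)."""
    project = ET.Element("project")
    ET.SubElement(project, "modelVersion").text = "4.0.0"
    ET.SubElement(project, "groupId").text = group_id
    ET.SubElement(project, "artifactId").text = artifact_id
    ET.SubElement(project, "version").text = version
    ET.SubElement(project, "packaging").text = "jar"
    if dependencies:
        deps = ET.SubElement(project, "dependencies")
        for name in dependencies:
            # Assumes sibling modules share the groupId and version.
            dep = ET.SubElement(deps, "dependency")
            ET.SubElement(dep, "groupId").text = group_id
            ET.SubElement(dep, "artifactId").text = name
            ET.SubElement(dep, "version").text = version
    return ET.tostring(project, encoding="unicode")


if __name__ == "__main__":
    for module, needs in DEPENDENCIES.items():
        print(build_pom("org.apache.lucene", module, "2.2.0", needs))
```

For modules that depend only on lucene-core this stays trivial. Third-party dependencies with their own groupIds and versions would need a richer table, which is where the sub projects mentioned above get harder.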
-
Re: Maven artifacts for Lucene.*Joerg Hohwiller 2007-04-11, 21:43
Hi Grant.

> Initial thoughts and then more inline below, and keep in mind I long ago
> drank the Maven kool-aid and am a big fan. :-)
>
> I know it is a pain to a few, but realistically speaking there has not
> been all that much noise about Maven artifacts not being available. We
> use Maven for everything we do and all I ever do when there is a new
> release of Lucene is put the new jars in our remote repository and
> everything works. It takes two or three steps and about 5 minutes of my
> time, and would be less if I scripted it. I frankly don't get what the
> big deal is.

That was what I was thinking, too, when I asked on this list for somebody to deploy lucene-highlighter 2.0.0 to the central maven2 repo on 08.01.2007 21:37. I also supplied the suggested POM for it. Anyhow, nobody has been able to do this for me for over a quarter of a year now. I think it is just a matter of minutes, but I do NOT have permission to do so.

What I did was add it to the repository of my open-source project. I am not an expert on rights and law and hope this is allowed according to the Apache License.

Anyway, it would be easier to do it once and centrally instead of making all maven+lucene users do this, especially because they will create POMs on their own that will all be different. Finally, maven users involved in two different projects that did the same thing may end up in a conflicting state if, say, the first project forgot to declare a dependency in a lucene contrib POM.

In the end this shows that the process must be made so easy that it only takes about one command to call. I do NOT care too much whether this would be "ant ..." or "mvn ...".

Best regards
Jörg
---------------------------------------------------------------------
-
Re: Maven artifacts for Lucene.*Chris Hostetter 2007-04-11, 22:59
: Where and how would you store for example the dependency information
: that you would be using to generate the poms? For lucene java it is easy
: for most modules as there is only dependency to lucene-core but for
: example in solr, nutch and hadoop it starts to go beyond trivial.

Whatever files also need to be included along with the jars in order to make the maven distribution complete that can't be built completely dynamically (ie: the md5 files) can certainly be committed into the repository ... but if making a release requires a lot of manual updating to those files, it's going to be a hindrance to the process ... things like version number and date should ideally be filled in via variables to help keep things automated.

jar dependencies are another matter ... as you say, for java-lucene the issue is trivial since there are no dependencies, but for other projects it could get complicated. Solr (for example) ships with the versions of its dependencies that it expects to use, and in some cases these versions may not be official release versions that you would ever find in a maven repository. I'm not sure how apps that want to publish to maven but depend on apps that do not publish to maven deal with this problem, but whatever solution they use could also be used in this case.

...either way, it's a discussion for the solr-dev list, not java-dev.

: projects). IMO we should however try to look at the big picture also and
: not only try to solve the minimal part to get it out of lucene-java
: hands, because I am afraid that if the minimum is done here in
: lucene-java there might be gaps to fill in other projects and the way
: things are done here is not usable in other sub projects as it is.

each project has its own community ... even if you find a perfect solution to every problem anyone in the world might ever encounter, discussing it on java-dev does nothing to get your solution adopted by the nutch, hadoop, or solr communities.

-Hoss
---------------------------------------------------------------------
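[Editor's sketch] The idea of committing the pom files once and filling in version number and date via variables can be illustrated with simple token substitution. The sketch below uses @token@ markers, the default delimiter style of Ant's filter tokens. The template text, token names and function name are hypothetical and not taken from any patch in this thread.

```python
import re

# Hypothetical committed template; only the tokens change per release.
POM_TEMPLATE = """<project>
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.apache.lucene</groupId>
  <artifactId>@artifactId@</artifactId>
  <version>@version@</version>
  <!-- generated on @date@ -->
</project>
"""


def fill_tokens(template, values):
    """Replace every @name@ token; fail loudly if a value is missing."""
    def lookup(match):
        key = match.group(1)
        if key not in values:
            raise KeyError("no value for token @{}@".format(key))
        return values[key]
    return re.sub(r"@(\w+)@", lookup, template)


if __name__ == "__main__":
    print(fill_tokens(POM_TEMPLATE, {
        "artifactId": "lucene-core",
        "version": "2.2.0",
        "date": "2007-06-02",
    }))
```

Failing on an unknown token rather than leaving it in place keeps a half-filled pom from reaching a repository unnoticed.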
-
Re: Maven artifacts for Lucene.*Sami Siren 2007-04-13, 19:48
Chris Hostetter wrote:
> Whatever files also need to be included along with the jars in order to
> make the maven distribution complete that can't be built completely
> dynamically (ie: the md5 files) can certainly be committed into the
> repository ... but if making a release requires a lot of manual updating to
> those files, it's going to be a hindrance to the process ... things like
> version number and date should ideally be filled in via variables to help
> keep things automated.

This starts to sound like a plan that can work. I'll see if I can hack something up as a patch for review. Do you think the poms should live in a separate dir or should they be spread across the dirs (modules)?

> jar dependencies are another matter ... as you say, for java-lucene the
> issue is trivial since there are no dependencies, but for other projects
> it could get complicated. Solr (for example) ships with the versions of
> its dependencies that it expects to use, and in some cases these versions
> may not be official release versions that you would ever find in a maven
> repository. I'm not sure how apps that want to publish to maven but depend
> on apps that do not publish to maven deal with this problem, but whatever
> solution they use could also be used in this case.

I have seen, for example, a solution where artifacts are published somewhere other than the official repositories, but to be frank I don't know what the best (or at least an acceptable) solution is here.

> : projects). IMO we should however try to look at the big picture also and
> : not only try to solve the minimal part to get it out of lucene-java
> : hands, because I am afraid that if the minimum is done here in
> : lucene-java there might be gaps to fill in other projects and the way
> : things are done here is not usable in other sub projects as it is.
>
> each project has its own community ... even if you find a perfect
> solution to every problem anyone in the world might ever encounter,
> discussing it on java-dev does nothing to get your solution adopted by the
> nutch, hadoop, or solr communities.

I understand that there are separate communities. I am not saying that everybody must accept the solution that will (if any) be adopted by lucene-java. But still I am hoping that we in lucene-java won't deliberately accept a solution that won't work for others (as you said: "if a simple solution is found for our build file, it will probably lend itself to similar solutions for the other Lucene projects that use ant.")

--
Sami Siren
---------------------------------------------------------------------
-
Re: Maven artifacts for Lucene.*Michael Busch 2007-06-02, 00:33
> > Whatever files also need to be included along with the jars in order to
> > make the maven distribution complete that can't be built completely
> > dynamically (ie: the md5 files) can certainly be committed into the
> > repository ... but if making a release requires a lot of manual updating to
> > those files, it's going to be a hindrance to the process ... things like
> > version number and date should ideally be filled in via variables to help
> > keep things automated.
> This starts to sound like a plan that can work. I'll see if I can hack
> something up as a patch for review. Do you think poms should live in a
> separate dir or should they be spread across dirs (modules).

Hi Sami,

as you may have noticed we're aiming for a Lucene 2.2 release in about two weeks, and it looks like I will be acting as RM. I'd like to get the maven upload right this time, but I admit that I don't have much experience with maven in general. I'll try to get familiar with it during the next days, but some help will definitely be useful.

I already committed LUCENE-802 (patch from Steven Parkes) that adds LICENSE.TXT and NOTICE.TXT to the Lucene jars. I think what remains is to have our ant script generate the poms for maven, right? Have you made progress with your patch yet?

Regards,
Michael
---------------------------------------------------------------------