-
being a good citizen is hard when you can't successfully run tests....Erick Erickson 2012-09-16, 14:54
Unit tests are good. We all know that. But I'm becoming increasingly frustrated at trying to run them. I've been working on LUCENE-4326 for a while (ok, intermittently, but...). I've been almost unable to successfully run "ant test" at the top level, I'm back to the message:

HEARTBEAT J1: 2012-09-16T10:19:32, no events in: 183s, approx. at: TestReplicationHandler.test

going on forever, or at least 1,800+ seconds and counting right now. I have no clue what it means to terminate the test run at this point. Are there tests that haven't been run yet that won't get run if I ctrl-C? I don't know....

OK, I can wait for a long time and hope it terminates sometime, which it has in the past. Eventually. Maybe. Which makes trying to actually _use_ the tests frustrating at best and I would guess intimidating as hell for people who do even less coding than I do...

I can terminate the tests and grep for "reproduce with" or "FAILURE" in the output file. I can run any failing tests on an unaltered branch (which may well miss stuff if the tests terminate without completing).... I can do a lot of things that involve checking in code without successfully doing what it says on the "how to contribute" page. I see a build target "jenkins-hourly" that seems promising, is it enough? If so, I'll change the "how to contribute" page....

So what's the story? Given the pace that fixes flow into the system, others aren't having the trouble I'm having or no new code would get checked in. So I've got to assume there's a process that's not documented that people are using in order to make progress. If there is such a process, we need to make it plain on the "How to contribute" page, not have it be something that each of us has to create our own private way of coping. Or fix the system so this doesn't happen all the time (Yeah, I know, I should feel free <G>).

I'm about to adopt the policy that I'll run any failing tests on the code on an unaltered tree and if they fail on the unaltered tree I'll check stuff in anyway. That's poor policy at best, and on the way to "the hell with the testing" as an attitude. Testing is getting in the way of progress in my case, not helping me not break things.

Or my particular system (OS X, Lion) is just screwed up and I've been too lazy to dig enough to understand why...

Erick@FrustratedOnASundayMorning
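The "terminate and grep" workaround described in this message can be sketched in a few lines of Java. This is only an illustration: the class name `FailureScan` and the method `interestingLines` are made up here, and the exact casing of "reproduce with" in real junit4 output is an assumption, so the match on it is case-insensitive.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class FailureScan {
    /** Returns the (trimmed) lines that mention "FAILURE" or "reproduce with". */
    static List<String> interestingLines(List<String> logLines) {
        List<String> hits = new ArrayList<>();
        for (String line : logLines) {
            boolean failure = line.contains("FAILURE");
            // Casing of the reproduce hint is assumed, so compare in lower case.
            boolean repro = line.toLowerCase(Locale.ROOT).contains("reproduce with");
            if (failure || repro) {
                hits.add(line.trim());
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.out.println("usage: java FailureScan <captured-ant-output-file>");
            return;
        }
        // args[0] is whatever file the "ant test" output was redirected to.
        for (String hit : interestingLines(Files.readAllLines(Paths.get(args[0])))) {
            System.out.println(hit);
        }
    }
}
```

As the message itself notes, a scan like this only sees suites that actually ran before the run was interrupted, so it cannot stand in for a complete test run.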
-
Re: being a good citizen is hard when you can't successfully run tests....Dawid Weiss 2012-09-16, 15:17
This message just means that the test is running. Possibly it hung and will not return, possibly it just runs for a long time. This message is practical because if you're running on multiple forked JVMs there is no way to pipe their output to a single console synchronously.

I don't think the test framework is to blame here. Some of the tests are just flaky. I've tried to exclude them from a normal test run a few times but this was received with mixed response. See the archives.

Dawid

Sent from mobile phone.
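For anyone who wants to watch for stalled suites in a captured log, the heartbeat line can be picked apart mechanically. The sketch below assumes the line layout is exactly what the two examples quoted in this thread show; the class `HeartbeatLine` and its methods are invented for illustration and are not part of the test framework.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HeartbeatLine {
    // Layout inferred from the heartbeat lines quoted in this thread (an assumption).
    private static final Pattern HEARTBEAT = Pattern.compile(
        "HEARTBEAT (J\\d+): (\\S+), no events in: (\\d+(?:\\.\\d+)?)s, approx\\. at: (\\S+)");

    /** Seconds without events reported by the line, or -1 if it is not a heartbeat. */
    static double silentSeconds(String line) {
        Matcher m = HEARTBEAT.matcher(line);
        return m.find() ? Double.parseDouble(m.group(3)) : -1;
    }

    /** The "approx. at" location reported by the line, or null if it is not a heartbeat. */
    static String approxAt(String line) {
        Matcher m = HEARTBEAT.matcher(line);
        return m.find() ? m.group(4) : null;
    }

    public static void main(String[] args) {
        // The line quoted at the start of this thread.
        String line = "HEARTBEAT J1: 2012-09-16T10:19:32, no events in: 183s, "
            + "approx. at: TestReplicationHandler.test";
        System.out.println(approxAt(line) + " silent for " + silentSeconds(line) + "s");
    }
}
```

A long silence only says that the forked JVM has reported nothing for that long; as explained above, it does not distinguish a hung test from a slow one.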
-
RE: being a good citizen is hard when you can't successfully run tests....Uwe Schindler 2012-09-16, 15:44
I generally never run Solr tests. When I change something in Lucene, I just run ant validate (not precommit) to see if it compiles and let Jenkins do the rest. I am tired of waiting for Solr tests: sometimes they pass, sometimes not, sometimes they take hours, and sometimes they obviously also drink my beer when I am away from my computer.

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: [EMAIL PROTECTED]
-
RE: being a good citizen is hard when you can't successfully run tests....Uwe Schindler 2012-09-16, 15:46
"Not precommit" -> "now precommit", sorry.
-
Re: being a good citizen is hard when you can't successfully run tests....Dawid Weiss 2012-09-16, 16:02
> have no clue what it means to terminate the test run at this point.
> Are there tests that haven't been run yet that won't get run if I
> ctrl-C? I don't know....

Just to complete here, the answer is that hitting ctrl-c will terminate your ant and all the forked processes, including the test that was hanging (or busy doing whatever it was doing). Yes, there might be other suites still waiting to be executed on that JVM which was stalled.

Again, this has nothing to do with the framework, hopefully. These tests would just hang on a normal ant junit task without any feedback for the user like you...
-
Re: being a good citizen is hard when you can't successfully run tests....Dawid Weiss 2012-09-16, 17:48
> Or my particular system (OS x, Lion) is just screwed up and I've been
> too lazy to dig enough to understand why...

Erick, is this an SSD system or a spindle? Did that test complete? Can you provide the master seed so that I can take a look?

Dawid
-
RE: being a good citizen is hard when you can't successfully run tests....Steven A Rowe 2012-09-16, 18:26
I always run both Lucene and Solr tests. Yes, Solr tests are more likely to fail than Lucene tests. When that happens, I see if I can repro with the same seed (almost never works), then run the remaining modules' tests.
Related question: how hard would it be to set up Ant testing to be like the maven --fail-at-end test option? That way at least when failures do occur, they wouldn't block other testing.

Steve
-
Re: being a good citizen is hard when you can't successfully run tests....Yonik Seeley 2012-09-16, 18:32
On Sun, Sep 16, 2012 at 2:26 PM, Steven A Rowe <[EMAIL PROTECTED]> wrote:
> Related question: how hard would it be to set up Ant testing to be like the maven --fail-at-end test option?

That definitely seems like an avenue worth pursuing.

-Yonik
http://lucidworks.com
-
Re: being a good citizen is hard when you can't successfully run tests....Dawid Weiss 2012-09-16, 18:42
> Related question: how hard would it be to set up Ant testing to be like the maven --fail-at-end test option? That way at least when failures do occur, they wouldn't block other testing.
The problem with this is that we have a multi-module ant build. junit4 _always_ runs all the tests, it never stops on the first failure. But it is invoked several times so every call is considered independent (and is).

Now, I know I'm being a pain but read the output of 'ant test-help' carefully :) Quoting:

# Run all tests without stopping on errors (inspect log files!).
ant -Dtests.haltonfailure=false test

What this does is it won't stop on errors after a junit4 task completes. The problem is, of course, that you'll need to embed some higher-level logic in Ant that will, for example, create a marker file based on a property and fail the build at the very end. What is "the very end" when you have antcalls? Hard to tell, no nesting information is available.

Whatever the consensus is I'll stick with my previous opinion -- notoriously failing tests are bad, we shouldn't make it easier to skim over failing test cases. We should either fix (hard, I know) or disable (yes, we lose coverage) those flaky tests.

Dawid
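The marker-file idea mentioned above can be illustrated outside of Ant. The following is a sketch of the pattern only, written in Java with made-up names (`FailAtEnd`, `recordFailure`, `anyFailure`, the file name `tests.failed`); it is not how the Lucene/Solr build is wired, and it sidesteps the hard part raised above, namely deciding where "the very end" is when antcalls are nested.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class FailAtEnd {
    /** Records that some module failed by creating the marker file (safe to call twice). */
    static void recordFailure(Path marker) throws IOException {
        if (!Files.exists(marker)) {
            Files.createFile(marker);
        }
    }

    /** Final step: true when any earlier module left the marker behind. */
    static boolean anyFailure(Path marker) {
        return Files.exists(marker);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("fail-at-end");
        Path marker = dir.resolve("tests.failed"); // hypothetical marker name

        // Stand-in outcomes for three independent module test runs.
        boolean[] modulePassed = {true, false, true};
        for (boolean passed : modulePassed) {
            if (!passed) {
                recordFailure(marker); // note the failure, but keep going
            }
        }

        // Only now, after every module has run, decide the overall result.
        System.out.println(anyFailure(marker) ? "FAILED (reported at the end)" : "OK");

        Files.deleteIfExists(marker);
        Files.delete(dir);
    }
}
```

The point of the pattern is that each module run stays independent and the only shared state is the marker; whether that makes flaky tests too easy to ignore is the objection raised in the message above.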
-
Re: being a good citizen is hard when you can't successfully run tests....Dawid Weiss 2012-09-16, 18:45
> going on forever, or at least 1,800+ seconds and counting right now. I
Again, for completeness -- I was on mobile -- the "suite timeout" or maximum time for all tests within a single class is currently set to:

@TimeoutSuite(millis = 2 * TimeUnits.HOUR)

I initially wanted this to be like 10 minutes but there are tests in nightly mode (and on slower machines) that really take so long to complete, and since it's an annotation you can't tweak this dynamically.

Dawid
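The reason an annotation value cannot be tweaked dynamically is that Java requires annotation element values to be compile-time constants. The sketch below shows this with a stand-in annotation: `SuiteTimeout`, `HOUR` and `SomeSuite` are defined locally for illustration and are not the real `@TimeoutSuite` or `TimeUnits.HOUR`.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class TimeoutAnnotationDemo {
    // Stand-in for TimeUnits.HOUR: one hour in milliseconds, a compile-time constant.
    static final int HOUR = 60 * 60 * 1000;

    // Hypothetical annotation, defined here only for the example.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface SuiteTimeout {
        int millis();
    }

    // The value must be a constant expression; something like
    // Integer.getInteger("some.property") would not compile here.
    @SuiteTimeout(millis = 2 * HOUR)
    static class SomeSuite {}

    /** Reads the timeout baked into the class at compile time, or -1 if absent. */
    static int timeoutMillis(Class<?> suite) {
        SuiteTimeout t = suite.getAnnotation(SuiteTimeout.class);
        return t == null ? -1 : t.millis();
    }

    public static void main(String[] args) {
        System.out.println(timeoutMillis(SomeSuite.class)); // prints 7200000
    }
}
```

Whatever reads the annotation at run time only ever sees the number that was compiled in, which is why one generous value has to cover both quick local runs and slow nightly ones.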
-
Re: being a good citizen is hard when you can't successfully run tests....Robert Muir 2012-09-16, 18:51
On Sun, Sep 16, 2012 at 2:42 PM, Dawid Weiss <[EMAIL PROTECTED]> wrote:
> Whatever the consensus is I'll stick with my previous opinion --
> notoriously failing tests are bad, we shouldn't make it easier to skim
> over failing test cases. We should either fix (hard, I know) or
> disable (yes, we lose coverage) those flaky tests.

This is coming up every few weeks on the list now. A lot of energy is being spent on ideas to dance around a few shitty solr tests. Not so much energy is spent fixing these few shitty solr tests, some of which (like TestReplicationHandler) are totally useless and have been failing sporadically for, like, years.

--
lucidworks.com
-
Re: being a good citizen is hard when you can't successfully run tests....Erick Erickson 2012-09-16, 22:27
Dawid:
Well, this time it succeeded eventually, 2,400 seconds later.

My machine has an SSD.

Master seed: [junit4:junit4] <JUnit4> says ¡Hola! Master seed: 6785BB3284A15298

The problem for me is this happens virtually all the time and I guess I just get impatient. Having a test that almost invariably takes 30 minutes or longer is hard to get used to.

Of course it may just be something "exciting" about my setup. Hmmm, I wonder what happens if I bump up the memory....

I suppose I went down the wrong track. I'd assumed since this seems to happen (the test takes forever) virtually all the time for me it was happening for others too so there was some "tribal knowledge" out there to make it stop (other than adding @Ignore to the test) and others were finding ways to run differently. But if it's just a problem on my machine, that's something else again....

I've saved the full output file so I can send that to you if you need it.

Erick
-
Re: being a good citizen is hard when you can't successfully run tests....Erick Erickson 2012-09-16, 23:53
58 minutes that time, master seed:
[junit4:junit4] <JUnit4> says cześć! Master seed: C38330C8877B30C9
-
Re: being a good citizen is hard when you can't successfully run tests....Yonik Seeley 2012-09-17, 01:10
On Sun, Sep 16, 2012 at 6:27 PM, Erick Erickson <[EMAIL PROTECTED]> wrote:
> I suppose I went down the wrong track. I'd assumed since this seems to
> happen (the test takes forever) virtually all the time for me it was
> happening for others too so there was some "tribal knowledge" out
> there to make it stop

No, you've definitely hit on something Erick! I can now reproduce this issue on my OS-X Lion macbook 100% of the time, but not on any of my other machines. I never noticed it because due to memory/swapping issues I normally do development on the macbook, then rsync the current code to my linux box to run the tests.

On my linux box (built in '09, PhenomII, HDD) the test takes 50-55 sec.
On my kids old windows box ('08, athlon64, HDD, Win7) the test takes 88-95 sec.
On my mac it always takes forever, and I see loops of stuff like this:

[junit4:junit4] 2> 52748 T215 oash.SnapPuller.fetchLatestIndex SEVERE Master at: http://localhost:62803/solr is not available. Index fetch failed. Exception: org.apache.solr.client.solrj.SolrServerException: Server refused connection at: http://localhost:62803/solr
[junit4:junit4] 2> 52751 T219 C17 UPDATE [collection1] webapp=/solr path=/update params={wt=javabin&version=2} {add=[150]} 0 0
[junit4:junit4] 2> 52755 T219 C17 UPDATE [collection1] webapp=/solr path=/update params={wt=javabin&version=2} {add=[151]} 0 0
[junit4:junit4] 2> 62758 T215 oash.SnapPuller.fetchLatestIndex SEVERE Master at: http://localhost:62803/solr is not available. Index fetch failed. Exception: org.apache.solr.client.solrj.SolrServerException: Server refused connection at: http://localhost:62803/solr
[junit4:junit4] 2> 62761 T219 C17 UPDATE [collection1] webapp=/solr path=/update params={wt=javabin&version=2} {add=[152]} 0 1
[junit4:junit4] HEARTBEAT J0: 2012-09-16T20:46:49, no events in: 67.2s, approx. at: TestReplicationHandler.test
[junit4:junit4] 2> 62786 T219 C17 UPDATE [collection1] webapp=/solr path=/update params={wt=javabin&version=2} {add=[153]} 0 1
[junit4:junit4] 2> 72787 T215 oash.SnapPuller.fetchLatestIndex SEVERE Master at: http://localhost:62803/solr is not available. Index fetch failed. Exception: org.apache.solr.client.solrj.SolrServerException: Server refused connection at: http://localhost:62803/solr

When did you first notice this Erick? Anyone else that regularly runs tests on OS-X notice when this started happening?

Anyway, please open a JIRA issue!

-Yonik
http://lucidworks.com
-
Re: being a good citizen is hard when you can't successfully run tests....Yonik Seeley 2012-09-17, 01:30
On Sun, Sep 16, 2012 at 2:51 PM, Robert Muir <[EMAIL PROTECTED]> wrote:
> not so much energy spent fixing these few shitty solr tests, some of
> which (Like TestReplicationHandler) are totally useless and have been
> failing sporadically for like, years.

Can you explain why it's useless (without the derogatory adjectives please)?

I didn't write the test to begin with, so I don't know off the top of my head all of the functionality it covers. I'd be surprised if it was all redundant and covered by other test suites of course.

Notes:
- I remember it passing for *long* periods of time
- I just ran it in a loop 30 times on my linux box and it passed 100% of the time, and in a timely manner
- It *has* found many bugs when it started failing (i.e. useful, not useless)
- Many of us (including you) *have* worked to improve the situation over time when it does deteriorate - check the logs.

It's not clear what you are suggesting (unless you are volunteering to look into this issue with OS-X apparently, or volunteering to write a new replication test from scratch or something).

-Yonik
http://lucidworks.com
-
Re: being a good citizen is hard when you can't successfully run tests....Chris Male 2012-09-17, 01:51
On Mon, Sep 17, 2012 at 1:30 PM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> On Sun, Sep 16, 2012 at 2:51 PM, Robert Muir <[EMAIL PROTECTED]> wrote:
> > not so much energy spent fixing these few shitty solr tests, some of
> > which (Like TestReplicationHandler) are totally useless and have been
> > failing sporadically for like, years.
>
> Can you explain why it's useless (without the derogatory adjectives
> please)?

I'm not wanting to get into issues of usefulness of tests or not, but I did just look at the build failure messages over the last few months and I've received a build failure message for this test almost every single day. I appreciate that this doesn't happen locally, which makes it hard to fix, but it's hard to work with continuous integration that so commonly fails on one test.

--
Chris Male | Open Source Search Developer | elasticsearch | www.elasticsearch.com
-
Re: being a good citizen is hard when you can't successfully run tests....Robert Muir 2012-09-17, 01:53
On Sun, Sep 16, 2012 at 9:30 PM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> On Sun, Sep 16, 2012 at 2:51 PM, Robert Muir <[EMAIL PROTECTED]> wrote:
>> not so much energy spent fixing these few shitty solr tests, some of
>> which (Like TestReplicationHandler) are totally useless and have been
>> failing sporatically for like, years.
>
> Can you explain why it's useless (without the derogatory adjectives please)?

it fails multiple times a day, that makes it useless. its like the boy that cried wolf, nobody is even taking the time to look at it. its unmaintained, dead code.

> - Many of us (including you) *have* worked to improve the situation
> over time when it does deteriorate - check the logs.

i worked on it a while ago because it was failing every day and nobody else was interested in fixing it.

> It's not clear what you are suggesting (unless you are volunteering to
> look into this issue with OS-X apparently, or volunteering to write a
> new replication test from scratch or something).

I'm volunteering to disable this test: I put my time into it already, I dont care about replication, I just want the test to stop failing every day.

-- lucidworks.com
-
Re: being a good citizen is hard when you can't successfully run tests....Yonik Seeley 2012-09-17, 01:55
On Sun, Sep 16, 2012 at 9:10 PM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> No, you've definitely hit on something Erick!
> I can now reproduce this issue on my OS-X Lion macbook 100% of the
> time, but not on any of my other machines.

Further, it seems to work fine when run from my IDE (intellij) on OS-X and completes in about 35 seconds. It goes back to being problematic if I try "ant test -Dtestcase=TestReplicationHandler"

-Yonik
http://lucidworks.com
-
Re: being a good citizen is hard when you can't successfully run tests....Yonik Seeley 2012-09-17, 02:02
On Sun, Sep 16, 2012 at 9:53 PM, Robert Muir <[EMAIL PROTECTED]> wrote:
> I dont care about replication

Yeah, I know. That's the crux of some of the biggest problems here.

> I'm volunteering to disable this test

-1

-Yonik
http://lucidworks.com
-
Re: being a good citizen is hard when you can't successfully run tests....Erick Erickson 2012-09-17, 02:06
https://issues.apache.org/jira/browse/SOLR-3846
Actually, despite being frustrated, I'm also encouraged. Here I thought this was a problem others were seeing, but were working around. Kinda restores my faith in the process....

Since I can get this very reliably, anyone want to point me at where the problem might lie? Or let me know if there's a patch to test and I'll hammer it....

Erick
-
Re: being a good citizen is hard when you can't successfully run tests....Robert Muir 2012-09-17, 02:08
On Sun, Sep 16, 2012 at 10:02 PM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> On Sun, Sep 16, 2012 at 9:53 PM, Robert Muir <[EMAIL PROTECTED]> wrote:
>> I dont care about replication
>
> Yeah, I know. That's the crux of some of the biggest problems here.

Really? Its somehow my responsibility to fix this test and my fault that its broken?

I think you have seriously lost your mind.
-
Re: being a good citizen is hard when you can't successfully run tests....Mark Miller 2012-09-17, 03:10
I get value from this test - if it was disabled, I'd probably re-enable it.
would be great if it didn't fail so much, but the type of fail tells me something.

I work on improving tests all the time - I also need to fit in a life and further dev. Getting large scale integration tests to pass reliably on so many different systems is hard! Lucene tests are easy in comparison.

I'd rather have a test that tells me something based on the fail than no coverage. If someone doesn't want to help improve the test, fine - doesn't mean it should be disabled. It passes 100% of the time on my machine.

I've worked on both Lucene and solr tests. Due to the nature of each, it's easy to make solid Lucene tests and hard to make solid solr tests in many cases. Boo hoo. Library vs large application with many integration tests.

- Mark
-
Re: being a good citizen is hard when you can't successfully run tests....Dawid Weiss 2012-09-17, 07:06
> life and further dev. Getting large scale integration tests to pass reliably
> on so many different systems is hard! Lucene tests are easy in comparison.

Nobody blames you Mark. I filed this a while ago --

https://issues.apache.org/jira/browse/SOLR-3766

but didn't have time to work on it. The idea was to have a separate test plan for running tests, possibly without stopping on failures, and just record failures (maybe jenkins has this feature already, don't know). This way you'd see a history of failures and could tell the trend, so to speak.

Obviously this has a disadvantage of not running those tests on every build machine out there but it would be a middle ground to resolve the issue, I think? Anyway, does Apache infrastructure make macs available for testing too? Could we apply for one? It'd be nice to have a mac running our tests as well from time to time.

Dawid
-
RE: being a good citizen is hard when you can't successfully run tests....Uwe Schindler 2012-09-17, 07:13
There is one OS-X slave, but it's down most of the time:
https://builds.apache.org/computer/

I was thinking yesterday to clone my local VirtualBOX VM running OS-X...

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: [EMAIL PROTECTED]
-
Re: being a good citizen is hard when you can't successfully run tests....Dawid Weiss 2012-09-17, 07:30
> I was thinking yesterday to clone my local VirtualBOX VM running OS-X...
Does it run on non-apple hardware? I thought you can only run osx on apple hardware :)

D.
-
RE: being a good citizen is hard when you can't successfully run tests....Uwe Schindler 2012-09-17, 07:35
The server edition runs everywhere :-) Just not the client.
My problem is that I don’t have a donated License so I don’t want to do this :(
-
Re: being a good citizen is hard when you can't successfully run tests....Robert Muir 2012-09-17, 11:58
On Sun, Sep 16, 2012 at 11:10 PM, Mark Miller <[EMAIL PROTECTED]> wrote:
> I get value from this test - if it was disabled, I'd probably re-enable it.
> would be great if it didn't fail so much, but the type of fail tells me
> something.

That means the assert in question isnt important at all. I'll remove it.

Again my problem is the idea that having a failing build is "ok" because certain types of failures "don't matter". If they dont matter they should be removed.

It causes a ton of noise when people are lazy about tests in this way, and it wastes a ton of peoples time. R

Remember every time one of these tests fails it sends an email, that I must read (we don't yet have a way to put in the subject header its a SOLR test fail versus a LUCENE one, or i'd filter the solr ones and not be complaining as much).
-
Re: being a good citizen is hard when you can't successfully run tests....Michael McCandless 2012-09-17, 12:42
I agree that a test that frequently fails, and does not get fixed, is
nearly pointless: everybody ignores it so it's as if the test didn't exist. And so it should be disabled.

I say *nearly* because the failures are in fact useful to devs who do have the itch/time to debug/fix them.

So I think we need some middle ground here, where the tests keep failing but only those that are interested in the failures see the notifications. We need to switch from a "push" model (any failure is broadcast to everybody) to a "pull" model (those devs that want to debug the failures go and check the logs), for such tests.

When someone wants to make sure their change didn't break something (Erick's original use case) then these tests should not run.

I like Dawid's idea (a separate test plan that Jenkins runs with these "difficult" tests, and it wouldn't email dev on failure).

Mike McCandless
http://blog.mikemccandless.com
-
Re: being a good citizen is hard when you can't successfully run tests....Dawid Weiss 2012-09-17, 12:45
I think we can even integrate hossman's suggestion and generate a stability
report like weekly or something.

I will take a look at this this week but it is definitely something that will require everyone's consensus.

Dawid
-
Re: being a good citizen is hard when you can't successfully run tests....Erick Erickson 2012-09-17, 12:51
Hmmm, in my other digression I thought I was on this thread.
No disagreement at all with the "stability report" idea etc.

To address being able to run tests, does it make sense to create a new annotation "PeskyTest" or something? It would suit my use case, that people need to be able to reliably run all reasonable tests without much pain before checking code in. We could leave these tests running all the time for everyone, but add a note to "How To Contribute" about "feel free to check code in if it passes all tests when you disable PeskyTest" (and tell them how).

By leaving it on by default, we'd get maximum test coverage on different machines/environments/whatever, but give people a good way to get the warm fuzzy that their new changes didn't break tests.

FWIW,
Erick
-
Re: being a good citizen is hard when you can't successfully run tests....Chris Male 2012-09-17, 12:58
On Tue, Sep 18, 2012 at 12:45 AM, Dawid Weiss <[EMAIL PROTECTED]> wrote:
> I think we can even integrate hossman's suggestion and generate a
> stability report like weekly or something.
>
> I will take a look at this this week but it is definitely something that
> will require everyone's consensus.

What would they add in addition to the test histories you can see on jenkins?
-
Re: being a good citizen is hard when you can't successfully run tests....Dawid Weiss 2012-09-17, 13:10
> To address being able to run tests, does it make sense
> to create a new annotation "PeskyTest" or something? It would

There is an annotation for this already, it's called @BadApple and I marked a few tests with it. To disable those tests you'd run with:

-Dtests.badapples=false

But I see all the annotations I added have been removed so you'd need to add them again.

Dawid
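To illustrate how an annotation-gated test group like the one described above might behave, here is a minimal, self-contained sketch. It does not use the real Lucene @BadApple annotation or the actual test runner: the Flaky annotation, SampleTests, and selectTests are hypothetical names made up for this example, and treating the tests.badapples property as defaulting to true is an assumption.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for an annotation such as @BadApple; not the Lucene class.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Flaky {}

class BadAppleSketch {
    // Hypothetical test methods, used only to have something to select from.
    static class SampleTests {
        public void testStable() {}
        @Flaky public void testReplication() {}
    }

    /** Returns the sorted names of the test methods that would run for the given switch. */
    static List<String> selectTests(Class<?> suite, boolean runBadApples) {
        List<String> selected = new ArrayList<>();
        for (Method m : suite.getDeclaredMethods()) {
            if (!m.getName().startsWith("test")) {
                continue; // ignore helpers and compiler-generated methods
            }
            if (m.isAnnotationPresent(Flaky.class) && !runBadApples) {
                continue; // the group is switched off, so skip the marked test
            }
            selected.add(m.getName());
        }
        Collections.sort(selected); // reflection order is unspecified
        return selected;
    }

    public static void main(String[] args) {
        // Mirrors -Dtests.badapples=false; the default of "true" is an assumption.
        boolean runBadApples =
                Boolean.parseBoolean(System.getProperty("tests.badapples", "true"));
        System.out.println(selectTests(SampleTests.class, runBadApples));
    }
}
```

Running the sketch with -Dtests.badapples=false leaves only the unmarked test in the selection, which is the "run everything except the known-flaky tests" mode discussed in this thread.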
-
Re: being a good citizen is hard when you can't successfully run tests....Dawid Weiss 2012-09-17, 13:11
> What would they add in addition to the test histories you can see on
> jenkins?

Is there a per-test history on jenkins too? I'm more familiar with Atlassian Bamboo. Obviously if it already is in Jenkins there's no need to do anything other than just run tests with

-Dtests.haltonfailure=false

I'm wondering if jenkins also considers a build "failed" if tests fail but ant returns with success (i.e. does it parse log XMLs and derive this information from there)?
-
Re: being a good citizen is hard when you can't successfully run tests....Chris Male 2012-09-17, 13:16
On Tue, Sep 18, 2012 at 1:11 AM, Dawid Weiss <[EMAIL PROTECTED]> wrote:
> > What would they add in addition to the test histories you can see on
> > jenkins?
>
> Is there a per-test history on jenkins too? I'm more familiar with
> Atlassian Bamboo. Obviously if it already is in Jenkins there's no
> need to do anything other than just run tests with

Yeah there is. It's a little messy and hard to navigate, but an example:

https://builds.apache.org/job/Lucene-Solr-Tests-4.x-Java6/661/testReport/junit/org.apache.solr.cloud/SyncSliceTest/testDistribSearch/history/

(wait for it to load)

> -Dtests.haltonfailure=false
>
> I'm wondering if jenkins also considers a build "failed" if tests fail
> but ant returns with success (i.e. does it parse log XMLs and derive
> this information from there)?

No idea sorry.
-
Re: being a good citizen is hard when you can't successfully run tests....Dawid Weiss 2012-09-17, 13:19
> Yeah there is. It's a little messy and hard to navigate, but an example:
Thanks Chris, this looks like what I wanted.

>> I'm wondering if jenkins also considers a build "failed" if tests fail
>> but ant returns with success (i.e. does it parse log XMLs and derive
>> this information from there)?

I'll check later on, no worries.

D.
-
RE: being a good citizen is hard when you can't successfully run tests....Uwe Schindler 2012-09-17, 13:31
It only fails build if exit status of ANT is signaling failure. It parses the XML output and will present the statistics, but not fail. If we run with -Dtests.haltonfailure=false, it will always pass the build (unless some non-test failures like compile errors occur). No mails will be sent on failing tests then.
-
Re: being a good citizen is hard when you can't successfully run tests....Dawid Weiss 2012-09-17, 13:37
Thanks Uwe. I've created this plan to test things out:
https://builds.apache.org/job/Lucene-BadApples-trunk-java7/

> It only fails build if exit status of ANT is signaling failure.

I think what I'll do is make a second build step in which I'll just manually parse those XML files (or scan for an XPath expression) and fail the build if any of the tests failed.

D.
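A second build step of the kind described above could look roughly like the following sketch. It assumes Ant/JUnit-style XML reports in which a failed test shows up as a failure or error element under a testcase element; the class and method names are hypothetical, and the actual report layout produced by the Lucene build may differ.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

// Hypothetical helper, not part of the Lucene/Solr build.
class JUnitReportCheck {
    // Assumed layout: <testsuite><testcase ...><failure/> or <error/></testcase></testsuite>
    private static final String FAILED_TESTS = "count(//testcase[failure or error])";

    /** Counts the test cases that contain a failure or error element. */
    static int countFailedTests(InputStream report) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(report);
            Double n = (Double) XPathFactory.newInstance().newXPath()
                    .evaluate(FAILED_TESTS, doc, XPathConstants.NUMBER);
            return n.intValue();
        } catch (Exception e) {
            // An unreadable report is treated as a hard error rather than a pass.
            throw new IllegalStateException("cannot read test report", e);
        }
    }

    /** Convenience overload for a report held in memory. */
    static int countFailedTests(String xml) {
        return countFailedTests(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        int failed = 0;
        for (String path : args) { // each argument is the path of one XML report
            try (InputStream in = Files.newInputStream(Paths.get(path))) {
                failed += countFailedTests(in);
            }
        }
        System.out.println("failed tests: " + failed);
        if (failed > 0) {
            System.exit(1); // a non-zero exit status is what fails the build step
        }
    }
}
```

Because the build is only marked failed when the process exits with a failure status, the sketch turns the count into an exit code instead of relying on the report parser.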
-
Re: being a good citizen is hard when you can't successfully run tests....Erick Erickson 2012-09-17, 13:53
And you'll see that on the "How To Contribute" page in a few minutes.....
Erick
-
Re: being a good citizen is hard when you can't successfully run tests....Mark Miller 2012-09-17, 13:54
It's not so simple. If the replication test fails with the common fail I tend to see of somehow a searcher not getting closed, I know that's not a big deal for that test.
If it failed on something else, I know there might actually be a problem. It has value beyond anyone wanting to debug or fix it.

The other ideas about how to deal with this are much better than this terrible disable idea. That test in particular is critical coverage. The common fails I see from it (though pretty much never locally) are not at all critical to me, and so relatively low on my already huge priority list. The test certainly has major value beyond people being interested in debugging. If it never passes for you after you make changes, that's a big deal. On my machines it passes 99.9% of the time.

And no, I did not write the test...
-
Re: being a good citizen is hard when you can't successfully run tests....Robert Muir 2012-09-17, 13:58
On Mon, Sep 17, 2012 at 9:54 AM, Mark Miller <[EMAIL PROTECTED]> wrote:
> It's not so simple. If the replication test fails with the common fail I tend to see of somehow a searcher not getting closed, I know that's not a big deal for that test.
>
> If it failed on something else, I know there might actually be a problem.

OK, well thats why i'm suggesting in such a case, where:

1. this fail isn't considered very important
and
2. its failing every day or multiple times a day
and
3. nobody is planning on fixing it anytime soon

that we disable the assert so that when a real fail happens, we know its a problem.

See my commit: http://svn.apache.org/viewvc?rev=1386588&view=rev

I feel like we can do these targeted assume()'s or whatever (i limited this to TestReplicationHandler only on freebsd initially, that seems to be where it happens the most), and thats very simple. then our build isn't crying wolf.
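The idea of a targeted, platform-limited skip of one known-noisy check can be sketched as below. This is not the content of the linked commit, which is not reproduced here. A real JUnit test would normally express the skip with org.junit.Assume; the sketch uses plain Java so it runs standalone, and the class name, method names, and the "open searchers" counter are hypothetical.

```java
import java.util.Locale;

// Hypothetical illustration only; not taken from TestReplicationHandler.
class TargetedAssumeSketch {
    /** True when the known-noisy check should be enforced on the given OS name. */
    static boolean enforceSearcherCheck(String osName) {
        // Assumption: only FreeBSD is exempted, as in the message above.
        return !osName.toLowerCase(Locale.ROOT).contains("freebsd");
    }

    /** Runs the check only where it is enforced and reports what happened. */
    static String checkSearchersClosed(String osName, int openSearchers) {
        if (!enforceSearcherCheck(osName)) {
            return "skipped"; // the noisy check is ignored on this platform only
        }
        if (openSearchers != 0) {
            throw new AssertionError(openSearchers + " searcher(s) still open");
        }
        return "passed";
    }

    public static void main(String[] args) {
        String os = System.getProperty("os.name");
        System.out.println(os + ": " + checkSearchersClosed(os, 0));
    }
}
```

Keeping the exemption this narrow means the same check still fails loudly on every other platform, so a new kind of failure is not hidden along with the familiar one.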
-
Re: being a good citizen is hard when you can't successfully run tests....Mark Miller 2012-09-17, 14:49
On Sep 17, 2012, at 6:58 AM, Robert Muir <[EMAIL PROTECTED]> wrote:
> On Mon, Sep 17, 2012 at 9:54 AM, Mark Miller <[EMAIL PROTECTED]> wrote:
>> It's not so simple. If the replication test fails with the common fail I tend to see of somehow a searcher not getting closed, I know that's not a big deal for that test.
>>
>> If it failed on something else, I know there might actually be a problem.
>
> OK, well thats why i'm suggesting in such a case, where:
>
> 1. this fail isn't considered very important
> and
> 2. its failing every day or multiple times a day
> and
> 3. nobody is planning on fixing it anytime soon
>
> that we disable the assert so that when a real fail happens, we know
> its a problem.
>
> See my commit: http://svn.apache.org/viewvc?rev=1386588&view=rev
>
> I feel like we can do these targeted assume()'s or whatever (i limited
> this to TestReplicationHandler only on freebsd initially, that seems
> to be where it happens the most), and thats very simple. then our
> build isn't crying wolf.

And I'm not arguing against that, or other creative solutions - I'm arguing against simply disabling the test.

- Mark
|