-
Annotation for "run this test, but don't fail build if it fails"? Chris Hostetter 2012-05-04, 22:29
Dawid:

With the new test runner you created, would it be possible to set up an annotation that we could use instead to indicate that a test should in fact be run, and if it fails, include the failure info in the build report, but not fail the build?

I'm thinking in particular about some of the multi-threaded tests that are currently marked @Ignore or @AwaitsFix because they sporadically fail on jenkins in our jail -- but that people haven't been able to reproduce consistently on local dev machines (or that some people have been able to reproduce, but not the people who understand the tests/code well enough to try and fix).

As it stands right now, if someone wants to try and fix a complicated test that's disabled, they have to make a guess at the fix, un-@Ignore, then watch the next few/several builds patiently to see if / how often it fails, then commit the @Ignore back, and repeat.

If we could leave these tests running on every build, then we could at least monitor the relative frequency of the failures -- ie: "last week testFoo failed in 10% of the builds, this week it fails in every build, so somebody definitely broke something" or "last week testFoo failed in 10% of the builds, and after my attempted hardening it only fails in 5% of the builds so i may be on to something."

what do folks think?

-Hoss
-
Re: Annotation for "run this test, but don't fail build if it fails"? Yonik Seeley 2012-05-05, 03:07
On Fri, May 4, 2012 at 6:29 PM, Chris Hostetter wrote:
> what do folks think?

+1

Something like this is definitely needed. Some of the Solr tests that spin up multiple JVMs are particularly tough to get 100% bullet-proof on all platforms (esp this freebsd jail) and there is a lot of information in tests that occasionally fail (esp if said tests may be the *only* tests we have for certain functionalities).

-Yonik
-
Re: Annotation for "run this test, but don't fail build if it fails"? Dawid Weiss 2012-05-05, 07:58
Not a problem. I'll be at work on Monday. Can you file an issue and assign it to me, please? I'm currently on mobile only.

Dawid
-
Re: Annotation for "run this test, but don't fail build if it fails"? Mikhail Khludnev 2012-05-05, 09:24
It's a pretty useful technique, especially from a CI perspective. I use it via JUnit's assumptions. A failed assumption is shown as an ignored test.

--
Obviousnessly Yours
Captain
<http://www.griddynamics.com>
-
Re: Annotation for "run this test, but don't fail build if it fails"? Dawid Weiss 2012-05-05, 09:49
One other way to do it, which is already implemented, is to run the full tests without failing on failures and only touch some marker file to fail at the end. "ant test-help" gives a hint on how to run tests this way currently. Don't know how it'd play with others -- speak up.
-
Re: Annotation for "run this test, but don't fail build if it fails"? Dawid Weiss 2012-05-06, 18:39
So, I started thinking about it -- I can implement something that will report failures (much like we do right now), but it's quite tricky to fit it into the reporting system and continuous integration system. Here's why -- if a test doesn't fail then its output (sysout/syserr) is not currently printed (to provide a cleaner view of what's been executed). The verbose log is on disk but it'd have to be scanned by hand (and copied as a build artifact).

Yet another problem is that jenkins wouldn't _fail_ on such pseudo-failures because the set of JUnit statuses is not extensible (it'd be something like FAILED+IGNORE) so we'd need to either go with IGNORED, ASSUMPTION_IGNORED or SUCCESS, none of which are a good match, really. The ASSUMPTION_IGNORED status is probably most convenient here because of how it can be technically propagated back to JUnit.

Any ideas? Hoss -- how do you envision "monitoring" of these tests? Manually?

Dawid
-
Re: Annotation for "run this test, but don't fail build if it fails"? Yonik Seeley 2012-05-06, 19:12
On Sun, May 6, 2012 at 2:39 PM, Dawid Weiss wrote:
> Any ideas? Hoss -- how do you envision "monitoring" of these tests? Manually?

If the tests are run many times a day, it would be great to get a daily report of the percent of time the tests pass. Then if it goes from 5% to 50%, we can go uh-oh...

The crux of the problem remains that (for solr devs) it's still much more useful to have a test fail intermittently than to disable and not run the test at all.

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference. Boston May 7-10
-
Re: Annotation for "run this test, but don't fail build if it fails"? Dawid Weiss 2012-05-06, 19:38
> If the tests are run many times a day, it would be great to get a
> daily report of the percent of time the tests pass. Then if it goes
> from 5% to 50%, we can go uh-oh...

Yeah, well... but this is beyond the runner as it aggregates over time -- it looks like a jenkins plugin that would analyze test run logs and provide such statistics. I also admit I've never seen anything like this -- a suite of tests with an allowed failure ratio over time and a threshold that would trigger a warning...

> The crux of the problem remains that (for solr devs) it's still much
> more useful to have a test fail intermittently than to disable and not
> run the test at all.

These are weird tests if they allow for a (predictable?) failure from time to time. I don't say it's a bad concept, but I think unit tests may not be a good framework for handling this.

Dawid
-
Re: Annotation for "run this test, but don't fail build if it fails"? Yonik Seeley 2012-05-06, 19:56
On Sun, May 6, 2012 at 3:38 PM, Dawid Weiss wrote:
> I also admit I've never seen anything like
> this -- a suite of tests with an allowed failure ratio over time and a
> threshold that would trigger a warning...

Not so much an "allowed" failure rate... more of "it fails sometimes and no one has had the time to try to get it to pass a greater percentage of the time". And even when people put effort into getting it to pass more often, it's still not 100%.

As those tests exist now, there are a few choices:
a) turn them off (this is bad because it seriously decreases coverage)
b) somehow deal with the intermittent failures

Given that we're not running on a realtime system, the fact that many higher level tests have timing and scheduling dependencies means that we will never achieve a 100% pass rate on such tests.

> These are weird tests if they allow for a (predictable?) failure from
> time to time. I don't say it's a bad concept, but I think unit tests
> may not be a good framework for handling this.

Yeah, these aren't really unit tests. Should we try to move them somewhere else? Or run them separately and email the results to a different list?

-Yonik
-
Re: Annotation for "run this test, but don't fail build if it fails"? Chris Hostetter 2012-05-07, 17:37
: as a build artifact). Yet another problem is that jenkins wouldn't
: _fail_ on such pseudo-failures because the set of JUnit statuses is
: not extensible (it'd be something like FAILED+IGNORE) so we'd need to

That was really the main question i had, as someone not very familiar with the internals of JUnit: whether it was possible for our test runner to make the ultimate decision about the success/fail status of the entire run based on the annotations of the tests that fail/succeed.

I know that things like jenkins are totally fine with the idea of a build succeeding even if some of the junit testsuite.xml files contain failures (many projects don't have tests fail the build, but still report the test status -- it's one of the reasons jenkins has multiple degrees of "build health"), but the key question is: could we have our test runner say "test X failed, therefore the build should fail" but also "test Y failed, and test Y is annotated with @UnstableTest, therefore don't let that failure fail the entire build"?

: are a good match, really. ASSUMPTION_IGNORED status is probably most
: convenient here because of how it can be technically propagated back

Ultimately i think it's important that these failures be reported as failures -- because that's truly what they are -- we shouldn't try to sugar coat it, or pretend something happened that didn't. Ideally these tests should be fixed, and my hope is that if we stop @Ignore-ing them then they are more likely to get fixed because people will see them run, see the frequency/inconsistency that they fail with and experiment with fixes to try and improve that. But in the meantime, it's reasonable to say "we know this test sometimes fails on jenkins, so let's not fail the whole build just because this is one of those times".

: Any ideas? Hoss -- how do you envision "monitoring" of these tests? Manually?

I think a Version 2.0 "feature" would be to see aggregated historic stats on the pass/fail rate of every test, regardless of its annotation, so we can see:
a) statistically, how often does test X fail on jenkins?
b) statistically, how often does test X fail on my box?
c) statistically, how often does test X fail on your box? oh really - that's the same stats that Pete is seeing, but much higher than anyone else including jenkins, and you both run Windows, so maybe there is a platform specific bug in the code and/or test?

But a shorter term, less complicated goal would just be to say: "Tests with the @UnstableTest annotation are run, and their status is recorded just like any other test, but their success/failure doesn't impact the overall success/failure of the build. People who care about these tests can monitor them directly."

So effectively: if you care about the test, then you have data about it you can fetch from jenkins and/or any other machine running all tests, but you have to be proactive about it -- if you don't care about it, then it's just like if the test was @Ignored.

If dealing with this entirely in the runner isn't possible because of the limited junit statuses (and how those test statuses affect the final suite status) then my strawman suggestion would be...

1) "ant test"
   - treats @UnstableTest the same as @AwaitsFix
   - fails the build if any test fails
2) "ant test-unstable"
   - *only* runs @UnstableTest tests
   - doesn't fail the build for any reason
   - puts the result XML files in the same place as "ant test" (so jenkins UI sees them)
3) jenkins runs "ant test test-unstable"

* if a test is flat out broken / flawed in an easy to reproduce way, we mark it @AwaitsFix
* if a test is failing sporadically and in ways that are hard to reproduce, we mark it @UnstableTest
* people doing experiments trying to fix/improve an @UnstableTest can dig through jenkins reports to see how that test is doing before/after various tweaks

-Hoss
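Editor's note: the behaviour Hoss asks for (run the test, record the failure, but keep it from failing the build) can be sketched in plain Java. This is only an illustration under stated assumptions, not RandomizedRunner or JUnit code: the @UnstableTest annotation exists only as a proposal in this thread, and SampleSuite and UnstableAwareRunner are hypothetical names invented for the sketch.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical marker annotation; the thread only proposes the name.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface UnstableTest {}

// Hypothetical toy suite: one reliable test, one test that fails.
class SampleSuite {
    public void testStable() {
        // passes
    }

    @UnstableTest
    public void testFlaky() {
        throw new AssertionError("timing-dependent failure");
    }
}

// Hypothetical mini runner: every failure is recorded, but only failures of
// tests without @UnstableTest count toward the build status.
class UnstableAwareRunner {
    final List<String> stableFailures = new ArrayList<>();
    final List<String> unstableFailures = new ArrayList<>();

    void run(Class<?> suite) {
        try {
            Object instance = suite.getDeclaredConstructor().newInstance();
            for (Method m : suite.getDeclaredMethods()) {
                if (!m.getName().startsWith("test") || m.getParameterCount() != 0) {
                    continue;
                }
                try {
                    m.invoke(instance);
                } catch (InvocationTargetException e) {
                    // The test body threw: file the failure under the right list.
                    boolean unstable = m.isAnnotationPresent(UnstableTest.class);
                    List<String> target = unstable ? unstableFailures : stableFailures;
                    target.add(m.getName() + ": " + e.getCause());
                }
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not run suite " + suite, e);
        }
    }

    boolean buildShouldFail() {
        return !stableFailures.isEmpty();
    }

    public static void main(String[] args) {
        UnstableAwareRunner runner = new UnstableAwareRunner();
        runner.run(SampleSuite.class);
        System.out.println("unstable failures: " + runner.unstableFailures);
        System.out.println("build should fail: " + runner.buildShouldFail());
    }
}
```

The failure of testFlaky still shows up in the recorded results, which matches the "report them as failures" requirement; only the final pass/fail decision ignores it.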
-
Re: Annotation for "run this test, but don't fail build if it fails"? Dawid Weiss 2012-05-09, 08:12
> That was really the main question i had, as someone not very familiar with
> the internals of JUnit, is wether it was possible for our test runner to
> make the ultimate decision about the success/fail status of the entire
> run based on the annotations of the tests that fail/succed

There are two things that need to be distinguished:

1) the "runner" is what's passed to @RunWith(Runner.class). The runner is what is given a suite class and runs its tests (propagating test execution events to any interested listeners). We use RandomizedRunner, which manages certain things on top of the default JUnit runner (thread groups, custom annotation for seeds, custom test methods in junit3 style, etc.).

2) ant's task for executing JUnit tests (junit4). This one is responsible for collecting suites, forking jvms and managing listeners. It is also responsible for failing ant's build (by throwing an exception) if requested -- see the "haltonfailure" property at http://labs.carrotsearch.com/download/randomizedtesting/1.3.0/docs/junit4-ant/Tasks/junit4.html (this is consistent with ANT's default runner).

There is some confusion about the two -- in Lucene we use both, but you could run suites annotated with RandomizedRunner using any container you want (including standard ANT's <junit> task). Unfortunately this also means that whether a build is failed or not is a direct consequence of whether any of the tests failed or not. There are no other conditions for this (including the complex conditions you mentioned).

> I know that things like jenkins are totally fine with the idea of a build
> succeeding even if some of the junit testsuite.xml files contain failures

This is the "haltonfailure" option. You can actually override it from the command line in the Lucene build scripts and run a full build without stopping on errors -- see ant test-help:

  [echo] # Run all tests without stopping on errors (inspect log files!).
  [echo] ant -Dtests.haltonfailure=false test

> health) but the key question is could we have our test runner say "test
> X failed, therefore the build should fail" but also "test Y failed, and
> test Y is annotated with @UnstableTest, therefore don't let that failure
> fail th entire build.

Not really. Not in a way that would be elegant and fit into the JUnit listener infrastructure. Read on.

> Ultimatley i think it's important that these failures be reported as
> faulres -- because that's truly what they are -- we shouldn't try to
> sugar coat it, or pretend something happened that didn't. Ideally these

+1.

> I think a Version 2.0 "feature" would be to see agregated historic stats
> on the pass/fail rate of every test, regardless of it's annotation, so we
> can see:
> a) statistically, how often does test X fail on jenkins?
> b) statistically, how often does test X fail on my box?
> c) statistically, how often does test X fail on your box? oh really -
> that's the same stats that Pete is seeing, but much higher then anyone
> else including jenkins and you both run Windows, so maybe there is a
> platform specific bug in the code and/or test?

This is an interesting idea and I think this could be done by adding a custom report and some marker in an assumption-ignore status... The history is doable (much like execution times currently)... but it's hackish.

> 1) "ant test"
> - treats @UnstableTest the same as @AwaitsFix
> - fails the build if any test fails
> 2) "ant test-unstable"
> - *only* runs @UnstableTest tests
> - doesn't fail the build for any reason
> - puts the result XML files in the same place as "ant test"
> (so jenkins UI sees them)
> 3) jenkins runs "ant test test-unstable"

This is doable by enabling/disabling test groups. A new build plan would need to be created that would do:

  ant -Dtests.haltonfailure=false -Dtests.awaitsfix=true -Dtests.unstable=true test

Both awaitsfix and unstable would be disabled by default (and assumption-ignored) so this wouldn't affect anybody else. The above would run all tests without stopping on errors. A post-processing script could then parse the json reports (or XML reports) and collect historical statistics.

Doable, but honestly this seems like more work (the scripts for collecting stats, that is; test groups are trivial) than trying to fix those two or three tests that fail?

Dawid
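Editor's note: the "disabled by default, enabled with a -D flag" behaviour of test groups can be sketched as below. This is a guess at the idea only, not the randomizedtesting implementation: the property names tests.awaitsfix and tests.unstable come from the command line above, while the TestGroups class and its Decision enum are hypothetical.

```java
// Hypothetical helper: decides whether a test group runs, based on a system
// property such as -Dtests.unstable=true. Groups are off unless enabled.
class TestGroups {
    enum Decision { RUN, ASSUMPTION_IGNORED }

    // A group is enabled only when its property is explicitly set to "true".
    static boolean isEnabled(String group) {
        return Boolean.parseBoolean(System.getProperty("tests." + group, "false"));
    }

    // Disabled groups are reported as assumption-ignored rather than run.
    static Decision decide(String group) {
        return isEnabled(group) ? Decision.RUN : Decision.ASSUMPTION_IGNORED;
    }

    public static void main(String[] args) {
        for (String group : new String[] {"awaitsfix", "unstable"}) {
            System.out.println(group + " -> " + decide(group));
        }
    }
}
```

Run with java -Dtests.unstable=true TestGroups to see one group switch on while the other stays assumption-ignored.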
-
Re: Annotation for "run this test, but don't fail build if it fails"? Chris Hostetter 2012-05-12, 18:55
: This is doable by enabling/disabling test groups. A new build plan
: would need to be created that would do:
:
: ant -Dtests.haltonfailure=false -Dtests.awaitsfix=true
: -Dtests.unstable=true test

right ...that's an idea that came up the other day when i was talking to simon at revolution

another idea that came up this morning talking with rmuir is to have a new ant target (ie: "test-needs-fix") that could be run as part of the same jenkins "run all tests continuously" build target, that only runs the @AwaitsFix group, and overrides haltonfailure when calling the junit macro. (which would save us an extra jenkins run)

: Doable, but honestly this seems like more work (scripts for collecting
: stats, test groups are trivial) than trying to fix those two or three
: tests that fail?

but there are 3 distinct (in my mind) issues here that all of this helps address:

1) those "two or three" tests that fail and should be fixed ... i agree, we should fix them, but when you can't reproduce the failures at all, it's really hard to iterate and figure out what needs to be fixed. if jenkins is where they fail, we need a way for jenkins to run them so we can see if our attempted fixes work.

2) all of that is doubly painful when they only fail sporadically. it might take a week of constant jenkins testing to discover that your fix decreased the likelihood of failure, but didn't completely fix the problem -- a test might fail an average of 1/5 times, and someone might start working on improving it, and reduce the failure rate to 1/20 times - the tests need to be running constantly (with some way to review the rate of failure) to make progress

3) we have some tests that demonstrate bugs no one has ever fixed because they are too hard to fix, or we don't have a good solution for them. some of these tests are committed but @Ignored, some are committed but commented out, some are sitting in patches in jira waiting for the patch to be expanded to include the code -- it would be nice if all of those tests could be committed, and uncommented, and run on every build, failing 100% of the time, so the known weaknesses in the code that no one has time/energy/ideas on how to fix would be more publicly visible, and people evaluating lucene/solr and looking at the tests could see in a very clear way "testDoSomeStuffWithFeatureXandFeatureYTogether() fails 100% of the time, so i probably shouldn't use X and Y together"

I think if running "ant test test-needs-fix" executed all the normal tests (which fail the build) as well as all @AwaitsFix tests (which wouldn't fail the build) and just generated the normal junit test output, with the pretty graph showing all the failing @AwaitsFix tests in red, so people could realistically see "some stuff doesn't work, and some of that stuff doesn't work sporadically", then it would benefit us in multiple ways (more data to help fix sporadically failing tests, more open honesty about what doesn't work), and when people fix @AwaitsFix tests, they just remove the annotation.

-Hoss
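Editor's note: the "some way to review the rate of failure" in point 2 could start as small as the sketch below, which compares the failure rate of one test across two windows of builds. It assumes the per-build outcomes have already been extracted from the reports; the FailureTrend class, its method names and the tolerance parameter are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: compares how often one test failed in two windows of
// builds. Each list holds one entry per build, true meaning the test passed.
class FailureTrend {
    // Fraction of builds in which the test failed (0.0 for an empty window).
    static double failureRate(List<Boolean> passed) {
        if (passed.isEmpty()) {
            return 0.0;
        }
        long failures = passed.stream().filter(p -> !p).count();
        return (double) failures / passed.size();
    }

    // True when the failure rate rose by more than the given tolerance.
    static boolean regressed(List<Boolean> previous, List<Boolean> current, double tolerance) {
        return failureRate(current) - failureRate(previous) > tolerance;
    }

    // Builds a window of `runs` outcomes of which `failures` are failures.
    static List<Boolean> window(int runs, int failures) {
        List<Boolean> outcomes = new ArrayList<>(Collections.nCopies(runs - failures, true));
        outcomes.addAll(Collections.nCopies(failures, false));
        return outcomes;
    }

    public static void main(String[] args) {
        List<Boolean> before = window(20, 4); // fails about 1 in 5 builds
        List<Boolean> after = window(20, 1);  // fails about 1 in 20 builds
        System.out.println("before: " + failureRate(before));
        System.out.println("after:  " + failureRate(after));
        System.out.println("regressed: " + regressed(before, after, 0.1));
    }
}
```

With only a handful of builds per window the rates are noisy, which is the point made above: the tests have to keep running for the comparison to mean anything.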
-
Re: Annotation for "run this test, but don't fail build if it fails"? Dawid Weiss 2012-05-13, 08:28
> jenkins "run all tests continuously" build target that only runs the
> @AwaitsFix group, and overrides haltonfailure when calling the junit
> macro. (which would save us an extra jenkins run)

Yep, doable.

> decreased the likelyhood of failure, but didn't completely fix the
> problem -- a test might fail an average of 1/5 times, and someone might
> start working on improving them, and reduce the failure rate to 1/20

You're making them stochastic, which is an interesting direction for research :) I mean, typically the direction would be to make (unit) tests predictable and repeatable given the same starting conditions. Whether this is always possible is a different discussion, I understand that.

> evaluating lucene/solr and looking at the tests could see in a very clear
> way "testDoSomeStuffWithFeatureXandFeatureYTogether() fails 100%
> of the time, so i probably shouldn't use X and Y together"

Ok.

> build) and just generated the normal junit test output, with the pretty
> graph showing all the failing @AwaitsFix tests in red, so people could

Don't know about the pretty graph, but if jenkins includes individual test tracking (and I think it does) then just running those tests without causing a build failure would give you a time-based overview of their execution status. You would still need to know which tests to look at, because I don't think jenkins tracks "frequently failing" tests (the way Atlassian's Bamboo does, for example).

It would definitely be possible to build an infrastructure that would generate the above from the test logs we produce, but I won't have the time to look into it in the near future given the number of issues assigned to me in Lucene (and elsewhere...). We can start small by adding that "no-fail" test run - this would still leave a trail of failure logs, so that when somebody has the time to write up that view over time they have some data to work with.

This would be an interesting plugin for jenkins in general... perhaps they have a GSoC student or something and we could suggest it to them?

Dawid