IMPORTANT - New Policy Proposal - No More Intermittent Test Failures

IMPORTANT - New Policy Proposal - No More Intermittent Test Failures

jtgreene
Administrator
We always have the problem of having a set of tests which fail one out
of 10 runs, but we leave the test around hoping one day someone will fix
it. The problem is no one does, and it makes regression catching hard.
Right now people that submit pull requests have to scan through test
results and ask around to figure out if they broke something or not.

So I propose a new policy. Any test which fails intermittently will be
ignored for up to a month, and a JIRA will be opened for its author. If
that test is not passing in one month's time, it will be removed from
the codebase.
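
As a rough illustration of the one-month rule, here is a minimal sketch in plain Java. It is not code from the AS7 testsuite: the class `QuarantinedTest`, its fields, and the issue key `AS7-0000` are hypothetical names made up for this example, and it assumes the clock starts on the day the test is marked as ignored.

```java
import java.time.LocalDate;

// Hypothetical model of the proposed policy; none of these names exist in AS7.
final class QuarantinedTest {
    final String testName;     // the intermittently failing test
    final String jiraKey;      // issue opened for the test's author (placeholder key)
    final LocalDate ignoredOn; // assumed start of the grace period

    QuarantinedTest(String testName, String jiraKey, LocalDate ignoredOn) {
        this.testName = testName;
        this.jiraKey = jiraKey;
        this.ignoredOn = ignoredOn;
    }

    // The author has one calendar month from the day the test was ignored.
    LocalDate deadline() {
        return ignoredOn.plusMonths(1);
    }

    // True once the deadline has passed; a test still failing at that
    // point would be removed from the codebase under the proposal.
    boolean dueForRemoval(LocalDate today) {
        return today.isAfter(deadline());
    }
}
```

For a test ignored on 2012-07-09, `deadline()` is 2012-08-09, and `dueForRemoval` only becomes true from 2012-08-10 onwards.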

The biggest problem with this policy is that we might completely lose
coverage. A number of the clustering tests for example fail
intermittently, and if we removed them we would have no other coverage.
So for special cases like clustering, I am thinking of relocating them
to a different test run called "broken-clustering", or something like
that. This run would only be monitored by those working on clustering,
and would not be included in the main "all tests" run.

Any other ideas?

--
Jason T. Greene
JBoss AS Lead / EAP Platform Architect
JBoss, a division of Red Hat

_______________________________________________
jboss-as7-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/jboss-as7-dev

Re: IMPORTANT - New Policy Proposal - No More Intermittent Test Failures

Ondrej Zizka
I'd also suggest documenting, for each DMR operation, whether it is sync or async.
I often see tests broken by a race condition caused by an async operation, such as the removal of something in one test that has not finished when the next test adds it again.

my2c
Ondra







Re: IMPORTANT - New Policy Proposal - No More Intermittent Test Failures

Jason T. Greene
All management ops are synchronous, and execute serially. Maybe you are thinking of test ordering issues?

Sent from my iPhone

On Jul 9, 2012, at 9:30 PM, Ondřej Žižka <[hidden email]> wrote:

> I'd also suggest to add an information to the docs of DMR operations whether they are sync or async.
> Often I can see tests broken due to race condition caused by async operation, like unfinished removal of something in one test while being added in next test.
>
> my2c
> Ondra


Re: IMPORTANT - New Policy Proposal - No More Intermittent Test Failures

Carlo de Wolf
In reply to this post by jtgreene
+1 on the policy.

On 07/09/2012 08:16 PM, Jason T. Greene wrote:
> We always have the problem of having a set of tests which fail one out
> of 10 runs, but we leave the test around hoping one day someone will fix
> it. The problem is no one does, and it makes regression catching hard.

And with a couple of such tests in a big code base you'll never get a
blue run.
> Right now people that submit pull requests have to scan through test
> results and ask around to figure out if they broke something or not.
>
> So I propose a new policy. Any test which intermittently fails will be
> ignored and a JIRA opened to the author for up to a month. If that test
> is not passing in one month time, it will be removed from the codebase.

What we had in the ejb3 testsuite was a mechanism which we could control
via an xml file:
http://viewvc.jboss.org/cgi-bin/viewvc.cgi/jbossas/projects/ejb3/trunk/testsuite/src/test/resources/known-issues.xml?revision=100056&view=markup
That way no code needs to be changed and you can easily do a
'known-issues' test run.
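
The idea of driving exclusions from a file can be sketched as follows. The real format of known-issues.xml is not shown in this thread, so the XML layout below (a `known-issues` root with `test` elements carrying `class` and `issue` attributes) is purely an assumption, and `KnownIssues` is a hypothetical helper, not part of the ejb3 testsuite.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: reads test class names from an assumed XML layout.
final class KnownIssues {

    // Assumed layout:
    // <known-issues><test class="..." issue="..."/></known-issues>
    static Set<String> load(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList tests = doc.getElementsByTagName("test");
            Set<String> classes = new LinkedHashSet<>();
            for (int i = 0; i < tests.getLength(); i++) {
                classes.add(((Element) tests.item(i)).getAttribute("class"));
            }
            return classes;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse known-issues XML", e);
        }
    }

    // The main run would skip listed classes; a 'known-issues' run
    // would select only the listed classes.
    static boolean isKnownIssue(Set<String> known, String testClass) {
        return known.contains(testClass);
    }
}
```

Because the list lives in a data file, moving a test in or out of the main run is a one-line change to the XML rather than an edit to the test source.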

-1 on the removal. There is no incentive for the author to fix the test
at all and we should be careful with what we lose.
I would say: the author gets one month to fix it up, after that SET is
going to fix it up. During that time no contributions are honored from
that author.

>
> The biggest problem with this policy is that we might completely lose
> coverage. A number of the clustering tests for example fail
> intermittently, and if we removed them we would have no other coverage.
> So for special cases like clustering, I am thinking of relocating them
> to a different test run called "broken-clustering", or something like
> that. This run would only be monitored by those working on clustering,
> and would not be included in the main "all tests" run.
>
> Any other ideas?
>

At the end of the day, you are working the left side of the field and
you don't want to get jammed by bits coming from the right side. This is
why I think smaller integration suites (bringing together 2 or 3
components / techs) would make more sense than one large big-bang
suite. In essence you're already on that path by separating a subset of
the clustering tests.

Carlo

Re: IMPORTANT - New Policy Proposal - No More Intermittent Test Failures

Darran Lofthouse
In reply to this post by jtgreene
Please implement this yesterday ;-)




Re: IMPORTANT - New Policy Proposal - No More Intermittent Test Failures

Ondrej Zizka
In reply to this post by Jason T. Greene
There are 3 tests which seemed to me to be failing due to async ops.
https://issues.jboss.org/browse/JBPAPP-9377
They only fail in EC2, which is slow, so I thought that could be the cause.
Thanks for the info, I'll look for a different cause.

Ondra



On Mon, 2012-07-09 at 23:44 -0400, Jason Greene wrote:
> All management ops are synchronous, and execute serially. Maybe you are thinking of test ordering issues?




Re: IMPORTANT - New Policy Proposal - No More Intermittent Test Failures

jtgreene
Administrator
In reply to this post by jtgreene
Since there are no objections, this policy is now in effect. I will send
an update once the intermittently failing tests have all been
disabled/removed.



--
Jason T. Greene
JBoss AS Lead / EAP Platform Architect
JBoss, a division of Red Hat



Re: IMPORTANT - New Policy Proposal - No More Intermittent Test Failures

Thomas Diesler
In reply to this post by jtgreene
I agree with part one of the policy: @Ignore the failing tests and
create a JIRA.

The silent removal of a test if nothing happens is problematic. Instead
I propose progressively increasing the priority of the issue. Whoever is
supposed to make the next step of progress should work on blocking
issues (i.e. the failing test) first, before anything else gets pulled.
The issue can be closed as "won't fix" if there is agreement (documented
in the issue) that we don't lose significant test coverage for a
functional area.


--
xxxxxxxxxxxxxxxxxxxxxxxxxxxx
Thomas Diesler
JBoss OSGi Lead
JBoss, a division of Red Hat
xxxxxxxxxxxxxxxxxxxxxxxxxxxx


