Generally speaking, a well-formed test pipeline should never time out, never contain flaky tests, and never return false positives. In an ideal world, a pipeline is fast and reliable. If the pipeline is not reliable, there should be a task that forces developers to fix it with high priority.
Should I skip tests when the pipeline has flaky tests?
Generally speaking, there are far more arguments against it than for it. Tests should never be skipped just to get something merged into your Git master, for example. However, there are cases that serve a different purpose and can justify skipping particular tests. Let's look at some reasoning that can give you, as an eager developer, an idea of when it is okay to skip tests and when it definitely is not.
Skipping tests is acceptable when:
- you don't intend to use the result of the test pipeline for a merge attempt
- you want to run only a specific subset of tests for optimization attempts (= research purposes)
- a test you merged earlier is now breaking every single pipeline and you want to remove that very test (but can't, because the pipeline would fail otherwise)
- there is a time-critical business demand and you can specify precisely and confidently what needs to be tested and what doesn't (this case should not happen on a regular basis; see the sketch after this list)
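If you do skip in one of these cases, make the skip explicit and documented, so it shows up in the test report instead of silently disappearing. Here is a minimal sketch of what that can look like, assuming a pytest-based suite (the Behat setup mentioned later in this article offers tags for the same purpose); the test names and reason strings are made up for illustration:

```python
import sys

import pytest


# An explicit, documented skip: the reason string appears in the pytest
# report, so nobody can overlook that (and why) this test did not run.
@pytest.mark.skip(reason="breaks every pipeline, removal PR pending")
def test_legacy_export():
    ...


# Conditional skips keep the intent in code instead of in someone's head.
@pytest.mark.skipif(sys.platform == "win32", reason="sandbox has no Windows build")
def test_payment_sandbox():
    ...
```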
You should not skip tests when:
- there is a subset of flaky tests that fail individually (instead, replay the pipeline until it passes, or better, group the flaky tests and exclude them in the first place; see the sketch after this list)
- it just takes too long
- you want to merge before anyone else
- you just don't like the pipeline
- the underlying master has changed but the previous pipeline run was already green
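One way to implement the grouping idea from the first point above is a quarantine marker. This is a sketch assuming pytest; the `flaky` marker name is my own choice, not a built-in:

```python
# conftest.py: quarantine known-flaky tests behind a custom marker.
def pytest_configure(config):
    # Register the marker so pytest does not warn about an unknown mark.
    config.addinivalue_line(
        "markers", "flaky: known-flaky test, excluded from the merge gate"
    )
```

```python
# In a test module: tag the offender instead of ad-hoc skipping it.
import pytest


@pytest.mark.flaky
def test_search_suggestions():
    ...
```

The merge gate then runs `pytest -m "not flaky"`, while a separate scheduled job runs `pytest -m flaky`, so the quarantined tests stay visible until someone fixes them.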
But that is not the whole story. It can be fine for a team to decide together to skip specific tests under specific circumstances; after all, most rules are meant to be broken at some point. Following rules merely because they are rules is usually (and historically) a bad idea: software development must be dynamic and flexible enough to adapt to unknown events, and there is no value in a broken pipeline bringing development to an extended halt.
Test pipelines become a bit more streamlined if you follow these general points:
- don't test what you don't own (e.g. the /vendor folder)
- fix flaky tests with the highest priority
- optimize pipelines to fail early and to follow a logical order (see the sketch after this list)
- engage with the developers who own flaky tests and fix the tests together; it builds the team as a side effect
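As referenced in the fail-early point, one way to do this in a pytest-based pipeline is to reorder the collected tests so the cheap ones run first. This is a sketch only; the `smoke` and `slow` marker names are assumptions for illustration:

```python
# conftest.py: run cheap smoke tests before expensive end-to-end tests,
# so an obviously broken change fails the pipeline as early as possible.
def pytest_collection_modifyitems(config, items):
    def cost(item):
        if item.get_closest_marker("smoke"):
            return 0  # fast sanity checks first
        if item.get_closest_marker("slow"):
            return 2  # expensive end-to-end tests last
        return 1  # everything else in between

    items.sort(key=cost)  # the hook must reorder the list in place
```

Combined with `--maxfail=1` (or `-x`) on the command line, the run stops at the first failure instead of burning another hour of compute.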
So, can I skip my test now?
Personally, I would say that before you consider skipping tests (unless it's for the sole purpose of research or analysis), you should not skip them. In around 99.9% of cases you can simply wait for the result of a long-running pipeline, and if flaky tests cause 9 out of 10 pipelines to fail, your goal should not be to get your feature into the repository; you should fix those tests first. After all, what's the purpose of a pipeline that fails more often than it runs through?
To go a bit deeper into the research argument, let's picture this:
You are working on a flaky test that belongs to a group of Behat tests which, for some reason, only runs at the end of the pipeline. Your change affects only the parts covered by that group. You make your changes, you create your PR, and now you have to wait for all the other tests to finish before your group finally runs. Maybe there are flaky tests ahead of your group, so it can take a long time to get there. Here is the catch: you don't want to skip tests to get your change merged. You want to skip some tests for the sole purpose of finding out whether your code fixed what it was supposed to fix. Instead of waiting 3 or more hours for that answer, I think it's acceptable to skip all the groups that are definitely not impacted by your change. Sometimes it's really hard to tell whether that is the case, but sometimes the call is straightforward. If it is, and you're not skipping in order to merge faster, skip what you don't need. Once you have done your research and you're happy with the result, replay the pipeline, this time without any skips. The result will be the same, but you got there much faster and at a lower cost for the pipeline (and for the business paying for the instances).
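Translated into pytest terms, that research workflow could look like the following sketch; the `RESEARCH_GROUP` environment variable and the per-group markers are hypothetical conventions, not existing pytest features:

```python
# conftest.py: during a research run, skip every group that is definitely
# not impacted by the change; a normal run skips nothing.
import os

import pytest

# Set e.g. RESEARCH_GROUP=checkout in the pipeline to run only that group.
RESEARCH_GROUP = os.environ.get("RESEARCH_GROUP")


def pytest_collection_modifyitems(config, items):
    if not RESEARCH_GROUP:
        return  # normal (merge) run: nothing is skipped
    skip = pytest.mark.skip(reason=f"research run for group '{RESEARCH_GROUP}'")
    for item in items:
        if not item.get_closest_marker(RESEARCH_GROUP):
            item.add_marker(skip)
```

Once the research run has answered your question, replay the pipeline without `RESEARCH_GROUP` set and nothing is skipped; the result you act on is still the full suite.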
However, if your goal is to merge quickly, never, ever skip anything at all.
Over the past few months we've had many discussions with individual developers about pipeline stability, and the feedback from around 10 developers pointed in almost the same direction: flaky tests cause more headaches than long-running pipelines. Fixing flaky tests should be your number one goal, and flakiness is not a good argument for skipping tests.