EXTENT-2016: Managing QA for Complex Systems in Agile Development Framework

Fundamental Principles of QA Management

QA, as we see it, is a continuous learning process. It is about learning how the system should operate in production, then learning the system's actual behaviour by performing tests and, after comparing the two, ensuring that all discrepancies are either fixed or documented. This learning cycle is the essence of all QA activities. The other essential task QA performs is developing the Test Design and building Test Tools and Automated Test Libraries. This, too, is an ongoing process, rather than learning at the beginning with testing as the end result: we are still learning when the system goes into production and still receive feedback from it. That allows us to reveal as many issues as possible.

Anti-patterns

Exactpro's working principles are illustrated below through the following list of anti-patterns: a set of commonly used techniques which, if applied incorrectly, will most likely lead to inefficiencies in QA. They will be discussed one by one.

The Requirements Traceability Matrix

A common misconception is attempting to base the entire testing process on requirements; this is also known as the formal approach. Here are some of the principles that may be misleading if followed blindly:

  • Test Cases are written at the beginning.
  • All Test Cases must contain direct links to Requirements.
  • Stakeholders must see the test coverage and correspondence between the Test Cases and the software requirements.

In these cases, all testing is based on software requirement documents, the test library often being a copy of the requirements. Unfortunately, this formal approach dramatically decreases the test coverage. Testing is only as good as the requirements document, which strictly limits the testers in their activities. Consequently, QA becomes dispensable and does not add any value. It is Exactpro’s firm belief that test scenarios must be based on the deep knowledge of the system acquired by QA, as opposed to formal and/or outdated requirement documents.

Test Automation

Here are some of the common misconceptions that revolve around test automation:

  • 100% of the Test Library has to be automated.
  • Complete regression must happen overnight as part of the build process.
  • As a result of test automation, we should reduce the QA team to a few people.

These are vivid examples of attempts to write Automated Test Scripts for all tests in the regression library. Eventually, a lot of effort is wasted on test automation, and the desired outcome is not achieved. What is not being taken into account here is that change is part of the software development process. Attempts to develop a large automated test library against volatile software, which may also contain bugs and other issues and is developed under constantly changing directions, are therefore doomed: the maintenance of such an automated library takes too much effort.
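One common way to keep that maintenance effort down is to isolate test logic from the volatile interfaces of the system under test. The sketch below illustrates the idea in Python; all names (OrderEntryAdapter, submit_limit_order, the connection object and the message fields) are hypothetical, not a description of any specific tool:

```python
# A minimal sketch of insulating tests from interface change: test logic
# talks to an adapter, never to the system's wire format or screens, so a
# change to the interface is absorbed in one place, not in every script.

class OrderEntryAdapter:
    """The single place that knows how to submit an order to the system."""

    def __init__(self, connection):
        self.connection = connection  # e.g. a protocol session or GUI driver

    def submit_limit_order(self, symbol, side, qty, price):
        # If field names, tags or screens change, only this method changes.
        return self.connection.send({
            "type": "NewOrder", "symbol": symbol,
            "side": side, "qty": qty, "price": price,
        })

def test_limit_order_is_acknowledged(adapter):
    # Hundreds of tests written against the adapter survive interface
    # changes untouched.
    ack = adapter.submit_limit_order("VOD.L", "BUY", 100, 225.5)
    assert ack["status"] == "ACCEPTED"
```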

The Test Library

This anti-pattern describes an attempt to document all test cases in detail before the testing starts. Here are some examples of this mode of thinking:

  • All available Test Scenarios must be documented and put into the Test Management Software.
  • The Test Library consists of Test Suites which consist of Test Scenarios which consist of Test Cases.
  • All Test Cases contain detailed test steps, and Testers must tick off “Pass/Fail” as they execute them.
  • The completion of a test library is a precondition for test automation.
  • All tests need to be documented upfront and in detail, before the testing starts.

Adequate test coverage of a complex trading/post-trade system assumes tens of thousands of test cases. Investing in documenting and maintaining them is inefficient: the paperwork is insurmountable, while QA resources are limited, and the time that should be spent on testing ends up being spent on writing out test scenarios step by step.

As a result, the test library is severely limited by the documenting capability of the team. Moreover, the team spends its time writing instead of testing. Thus, the test coverage is limited too, and the testing process inhibits both the speed of delivery and quality.
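For contrast, a compact model of the test space can generate the cases that would otherwise have to be written out by hand. The following sketch is illustrative only; the dimensions and their values are invented for the example:

```python
# A sketch of deriving test cases from a compact model instead of
# documenting tens of thousands of them step by step.
from itertools import product

order_types    = ["LIMIT", "MARKET", "STOP"]
sides          = ["BUY", "SELL"]
times_in_force = ["DAY", "IOC", "FOK", "GTC"]
session_states = ["CONTINUOUS_TRADING", "AUCTION", "HALT"]

# 3 * 2 * 4 * 3 = 72 cases from four short lists; real models with more
# dimensions reach tens of thousands of cases with no manual write-up.
test_cases = [
    {"order_type": ot, "side": s, "tif": t, "session_state": st}
    for ot, s, t, st in product(order_types, sides, times_in_force, session_states)
]
print(len(test_cases))  # 72
```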

The Test Plan

This is an attempt to draw a large Gantt chart for all QA activities and to follow it carefully, working very hard to reach the milestones and achieve the objectives. Sadly, this approach ends in inadequate management and a focus on tasks that do not lead to successful software delivery.

Because we are dealing with the unknown from beginning to end, we are never really able to predict much of what is going to happen. There is no straightforward way to plan QA: testing is very similar to research, and the testing process constantly changes depending on how many bugs are found and how many other issues are discovered and revealed. The test team must be agile and must deal with software changes, keeping a strategic focus on final project delivery.

Non-Functional Testing

Here is a widespread example: a company decides to perform non-functional testing for certain software, allocates the necessary production-like hardware and performs load testing, measuring the latency and the capacity. Here are some examples of this mode of thinking:

  • We need to test whether the system supports 10,000 orders per second.
  • We need to make sure that the latency is within the expected limits.
  • We will allocate production-like hardware to focus on capacity and latency measurements.

The mistake is in limiting the testing to performance KPIs and avoiding testing the system's behaviour under load. This separation of approaches to functional and non-functional testing leaves a vast area of test conditions uncovered, and, as a result, numerous “load-related issues with unspecified conditions” persist in production. QA must carefully examine the system behaviour under load, without limiting itself to latency/performance measurements: the system behaviour should be verified in detail.
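One way to combine the two concerns is to verify every response functionally while the load is running, instead of only sampling latencies. The sketch below assumes a hypothetical send_order() gateway call and illustrative response fields:

```python
# A sketch of functional verification under load: orders are driven
# through the system concurrently, and every response is checked for
# correctness, not just timed.
import time
from concurrent.futures import ThreadPoolExecutor

def send_and_verify(send_order, order):
    start = time.perf_counter()
    response = send_order(order)          # assumed gateway call
    latency = time.perf_counter() - start
    # Behavioural checks under load, alongside the latency sample:
    assert response["order_id"] == order["order_id"]
    assert response["status"] in ("ACCEPTED", "REJECTED")
    if response["status"] == "REJECTED":
        assert response.get("reason"), "rejects must carry a reason"
    return latency

def run_under_load(send_order, orders, workers=50):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        latencies = sorted(pool.map(lambda o: send_and_verify(send_order, o), orders))
    # Report the median and the 99th percentile.
    return latencies[len(latencies) // 2], latencies[int(len(latencies) * 0.99)]
```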

“The Test Tool”

This is the consequence of using commercially advertised test tools. Here are some examples:

  • We have purchased a license for a magical test automation tool.
  • We will use it for all of our test automation purposes.
  • It allows interacting with the GUI, via FIX or via SWIFT, and has the capacity to record/replay test scenarios.
  • It is expensive.

The outcomes of this approach include the inefficient expenditure of money and resources, with the test tool's paradigm limiting the test approach.

There is nothing wrong with these tools per se, but in the modern world of trading, clearing and settlement systems, they do not, in fact, add much value to test automation. There are no plug-and-play test tools for such complex environments. In addition, most of these tools are aimed at the UI, which represents merely a tiny part of the test universe. So, in the end, the QA team spends a lot of time trying to overcome the limitations of these tools and tests the system anyway, despite those limitations.
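As an illustration of why protocol-level tooling matters more than GUI record/replay in this domain, here is a minimal sketch of composing a raw FIX message by hand. The field values are illustrative, and a real harness would also manage sessions, sequence numbers and timestamps:

```python
# A minimal sketch of protocol-level test tooling: composing a raw FIX
# NewOrderSingle as tag=value pairs. Field values are illustrative.
SOH = "\x01"  # FIX field delimiter

def fix_message(msg_type, body_fields):
    body = f"35={msg_type}{SOH}" + "".join(f"{t}={v}{SOH}" for t, v in body_fields)
    head = f"8=FIX.4.2{SOH}9={len(body)}{SOH}"    # BeginString, BodyLength
    checksum = sum((head + body).encode()) % 256  # standard FIX checksum
    return f"{head}{body}10={checksum:03d}{SOH}"

msg = fix_message("D", [   # 35=D is NewOrderSingle
    (11, "ORD-1"),         # ClOrdID
    (55, "VOD.L"),         # Symbol
    (54, "1"),             # Side: 1 = Buy
    (38, "100"),           # OrderQty
    (40, "2"),             # OrdType: 2 = Limit
    (44, "225.5"),         # Price
])
```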

“The Scrum Team”

This is an attempt to blend all of QA into a development scrum team. It implies that testing should be done by the developers or the testers of the scrum team, and that whatever comes out of the sprint does not require any further testing. Here are some examples of this mode of thinking:

  • We are using scrum and will test everything as a part of the sprint.
  • Testing is done and automated by Developers who can temporarily allocate their time within scrum.
  • We are using DevOps, we do not need QA, QA is dead.
  • We will put some QA engineers into the Scrum team and let them write unit tests.

I believe that this approach will probably work in the world of Google or other start-ups, but in market infrastructure systems we are in dire need of integration QA. If we do not do integration testing, if we do not put together a team of people who understand the business requirements and the infrastructure, and if we do not let them test the system according to their critical judgement, we will have a lot of issues in production.

In conclusion, I would like to draw a comparison between the anti-patterns and our testing principles.

Developing team knowledge vs. “The Requirements Traceability Matrix”

We learn a lot. Our test coverage is defined not by a spreadsheet with links to the requirements, but by the actual knowledge of the people who test the system.

Building software to test software vs. “Test Automation”

We develop test tools for efficient regression, and we assume that there will be changes in the systems that we are going to test, so we design our test tools carefully to make sure that these changes will not force us to constantly rebuild them.

Test Design vs. “The Test Library”

We develop test design, which is our approach to testing. We invest a lot of analytical resources into coming up with efficient ways of performing tests, to increase coverage and to decrease effort.

Agile iterative approach vs. “The Test Plan”

We use agile development techniques in testing in order to stay flexible and to deliver the systems that we are testing to production.

Testing at the confluence of Functional and Non-Functional Testing vs. “Non-Functional Testing”

We perform not only functional testing in its classical understanding, but also put a lot of effort into testing the system under load: we develop the corresponding test tools, use post-transaction analysis, etc.
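Post-transaction analysis can be pictured as an after-the-run reconciliation of what the system reported against what the tests sent. The sketch below is illustrative; the record layouts are invented for the example:

```python
# A sketch of post-transaction analysis: after a run, reconcile the
# execution log against the orders sent, catching behavioural defects
# that latency figures alone would miss.
def reconcile(orders_sent, executions):
    by_id = {o["order_id"]: o for o in orders_sent}
    filled = {oid: 0 for oid in by_id}
    problems = []
    for ex in executions:
        order = by_id.get(ex["order_id"])
        if order is None:
            problems.append(f"execution for unknown order {ex['order_id']}")
            continue
        if ex["side"] != order["side"]:
            problems.append(f"side mismatch on {ex['order_id']}")
        filled[ex["order_id"]] += ex["qty"]
        if filled[ex["order_id"]] > order["qty"]:
            problems.append(f"overfill on {ex['order_id']}")
    return problems  # an empty list means the run reconciles cleanly
```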

Pragmatic in-house development of test tools vs. “The Test Tool”

We develop our own test tools as part of our test design, based on the particular tasks and the particular obstacles at hand, and avoid buying third-party software.

The QA team is a pacemaker vs. “The Scrum Team”

The way it happens is this: when we start testing the system, we raise issues so quickly, and retest and turn them around so fast, that the development team has to follow us and catch up with us by fixing those issues. It usually becomes apparent that our work sets the pace for the whole project.

Q&A

Q: One of the things that business people like about the traditional methods of The Traceability Matrix is its tangibility. So how do you prepare the metrics within your QA management framework?
A: As Chris mentioned earlier, we came up with quality KPIs at LSEG. The main lagging indicator in the quality KPIs is the number of issues in production; ideally, there are very few or none. As for the leading indicator, we measure the test coverage, so basically we have two parameters: 1) how much time and money is spent on testing, 2) the test coverage provided. The measurement of the test coverage can take several forms. By talking to the QA people, business people can estimate their level of knowledge about the system and what they test in it. Another way is listening to a presentation, in whatever format is convenient for our testers, explaining the test coverage. One more way is using a coverage analysis tool: we can run our test library through a code coverage analysis tool, which will indicate that, for instance, this library covers 80% of the code. We do this in our work with Millennium. So, there are several ways to go about it, apart from just producing the test library and the documentation.
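For the code coverage approach, the mechanics look roughly like the following Python sketch, which uses the coverage.py and pytest APIs; the tests/ path is an assumption:

```python
# A sketch of measuring how much of the code base a regression library
# exercises, using coverage.py around an in-process pytest run.
import coverage
import pytest

cov = coverage.Coverage()
cov.start()
pytest.main(["tests/"])      # execute the regression library (path assumed)
cov.stop()
cov.save()
total = cov.report()         # prints a per-file table, returns the overall %
print(f"The test library covers {total:.0f}% of the code.")
```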

Q: When you were presenting the anti-patterns, you mentioned test automation. How much automation is reasonable, in your opinion?
A: It fully depends on the project and its complexity, on the endpoints of the system, on how many third parties the system is connected to, and on how volatile the environment around the system is. I think we can cover a fair percentage with automated or semi-automated tests. We just do it our own way, i.e. by developing a test tool that allows us to achieve this. It may not be a tool that simulates test scenarios step by step; it could be a load injector creating random load, and sometimes it is post-transactional tools. So I would not be able to give you an exact number, and I doubt anyone can, but in our work we aim at achieving 70-80% automation. That, however, only concerns regression. If you are doing functional testing and dealing with software change, you put a lot of effort into learning and getting to know the system, which is manual work.
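A load injector of the kind mentioned above can be pictured as follows; this is a deliberately simple sketch, and the send() callback, symbols and price range are all assumptions:

```python
# A sketch of a simple load injector: instead of replaying fixed scripts,
# it generates randomized orders at an approximate target rate.
import random
import time

def inject(send, rate_per_sec, duration_sec, symbols=("VOD.L", "BARC.L")):
    interval = 1.0 / rate_per_sec
    end = time.monotonic() + duration_sec
    seq = 0
    while time.monotonic() < end:
        seq += 1
        send({
            "order_id": f"LOAD-{seq}",
            "symbol": random.choice(symbols),
            "side": random.choice(["BUY", "SELL"]),
            "qty": random.choice([10, 100, 1000]),
            "price": round(random.uniform(220.0, 230.0), 1),
        })
        time.sleep(interval)  # crude pacing; real injectors compensate for drift
```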