
AI in Testing: yes, but…

With "Artificial Intelligence" (AI), we indicate the automatic execution of tasks which, if executed by a human being would require the use of intelligence. This automatic execution is based on a so-called algorithm, which is nothing else than a clearly defined sequence of assignments that have to be completed. Under that point of view, a recipe is an example of a very easy algorithm.
Such algorithms can already be found in typical applications like navigation systems and traffic lights management and, most prominently, search engines.
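
To make the analogy concrete, here is a toy sketch in Python (the steps are purely illustrative) of a recipe expressed as an algorithm: a fixed, unambiguous sequence of steps, executed in order.

```python
# A recipe expressed as an algorithm: a clearly defined sequence
# of steps carried out in order. The steps are illustrative only.

def brew_tea(water_ml: int) -> list[str]:
    """Return the ordered steps for brewing a cup of tea."""
    return [
        f"Boil {water_ml} ml of water",
        "Put a tea bag in a cup",
        "Pour the hot water over the tea bag",
        "Steep for 3 minutes",
        "Remove the tea bag",
    ]

for step in brew_tea(250):
    print(step)
```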

Companies that develop AI-based software are already using AI in their own processes, for instance for code reviews. We are, though, still far away from a system that can develop software autonomously. In principle, AI is used to support developers in producing good code, thus influencing its final quality as early as possible.

AI in Testing and testing AI

When we talk about code quality, we directly or indirectly talk about Testing. In this area too, Artificial Intelligence is a hot topic, if not the hottest. That is also due to the two-sided relationship between AI and Testing: we can apply AI to Testing and its related activities, but we can also test AI-based systems.

Let’s first consider what applying AI to Testing means. The market is already booming with testing tools which are (or, better said, claim to be) AI-assisted, and many are dreaming of humanless testing in the near future, i.e. a scenario where tests are automatically generated, maintained and executed by a “robot”.

But is this a realistic scenario? In my opinion, no, or at least not to that extent.

AI cannot be directly applied to Testing

The main reason for this “no” is what is called the “test oracle problem”. A test oracle is the source of information about the correctness of a test’s output. It could be a document, a requirement specification or even a human being who is knowledgeable about the system under test. The crucial point is that even the best-written requirement specification leaves room for interpretation, thus presenting gaps which, when defining the expected result of a test, need to be filled by a human being using “common sense”. AI is currently not able to perform this typically human task of applying common sense to fill gaps, and most likely never will be. And even if we were ever able to come to an automated approximation of this human skill, who would blindly trust technology without being sure it is delivering the expected results?
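
To illustrate the oracle problem, consider a minimal sketch (all names and values are hypothetical): a spec says “orders above 100 EUR get a 10% discount”, but does not say whether “above” includes exactly 100 EUR. The expected value at that boundary is a human interpretation, not something derivable from the spec.

```python
# A minimal illustration of the test oracle problem. Spec (hypothetical):
# "Orders above 100 EUR get a 10% discount." Is exactly 100 EUR "above"?
# The spec is silent; amounts are in integer cents to avoid float issues.

def discounted_total(total_cents: int) -> int:
    # The developer read "above" as strictly greater than 100 EUR.
    if total_cents > 100_00:
        return total_cents * 90 // 100
    return total_cents

def test_boundary():
    # The expected values below are a tester's common-sense reading of
    # the spec's gap -- exactly what an AI cannot derive on its own.
    assert discounted_total(100_00) == 100_00  # strict reading: no discount
    assert discounted_total(100_01) == 90_00   # discount applies

test_boundary()
```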

AI can boost testing by assisting in related activities

AI cannot, therefore, be directly applied to Testing; but it can, and will, be applied to testing-related activities, bringing them to unprecedented levels of efficiency and accuracy.
Those activities include, to name a few, bug triaging, anomaly detection, fault prediction, risk estimation, manual testing effort estimation and the generation of automated regression tests. I would like to quickly focus on two further ones, which I find particularly relevant.

The first is test data generation. Synthetic data are currently a must in Testing, as simply copying production data is sometimes not possible (for privacy reasons) or too cumbersome. Moreover, Test Automation demands ever more relevant test data. Finally, Artificial Intelligence itself is a significant consumer of test data, which is needed for the Machine Learning process. The lack of useful test data is one of the biggest challenges AI faces nowadays. Hence the ability of AI to produce relevant test data is a real need!
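
AI-driven generators aside, the sketch below shows what rule-based synthetic test data looks like today, using the well-known Python faker package; an AI-based generator would aim to learn realistic distributions from production data instead of applying fixed rules.

```python
# A minimal sketch of rule-based synthetic test data using the "faker"
# package (pip install faker). Privacy-safe: no production data involved.
from faker import Faker

Faker.seed(42)  # fixed seed for reproducible test data
fake = Faker()

def make_customer() -> dict:
    """Generate one synthetic customer record."""
    return {
        "name": fake.name(),
        "email": fake.email(),
        "address": fake.address(),
        "birth_date": fake.date_of_birth(minimum_age=18, maximum_age=90),
    }

test_customers = [make_customer() for _ in range(5)]
for customer in test_customers:
    print(customer)
```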

The second relevant area is object recognition. Everybody who has dealt with the automation of UI or cross-browser tests has experienced the frustration of a lot of false positives (i.e. tests which fail when they should not) caused by a button being renamed, a text box getting a new shape, etc. Object recognition would make automated UI tests and cross-browser testing more robust and therefore more reliable and much cheaper, drastically reducing the manual work in those areas.
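
As a rough illustration of the problem (not of AI-based object recognition itself), the Selenium sketch below falls back through several locator strategies when a button is renamed; AI-based object recognition would, in essence, automate and generalise this kind of resilience. The locator values are hypothetical.

```python
# A "self-healing" locator sketch with Selenium: try several strategies
# for the same (hypothetical) Submit button before declaring failure.
from selenium.webdriver.common.by import By
from selenium.common.exceptions import NoSuchElementException

SUBMIT_LOCATORS = [
    (By.ID, "submit-btn"),                               # may get renamed
    (By.NAME, "submit"),
    (By.XPATH, "//button[contains(text(), 'Submit')]"),  # last resort
]

def find_with_fallback(driver, locators):
    """Return the first element matched by any of the given locators."""
    for by, value in locators:
        try:
            return driver.find_element(by, value)
        except NoSuchElementException:
            continue  # element not found; try the next strategy
    raise NoSuchElementException(f"No locator matched: {locators}")
```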

The big challenge: testing AI-based systems

As mentioned before, there is a second side to the relationship between AI and Testing: testing AI-based systems. This is probably the most challenging task testers have at hand.
AI systems are by nature complex, and it is therefore hard to determine what the expected result of a test should be.

Moreover, the whole sector is in continuous and rapid development, and no standard techniques, procedures or best practices have emerged yet. For instance, there are no universally accepted metrics to measure the accuracy of an AI system. We could think about measuring the rates of false positives and false negatives and comparing them with those of a human (a procedure resembling the famous Turing Test), accepting the system if the AI scores at least as well as the human. Although that sounds reasonable, it represents a profound change in the science of testing: accepting that the result of a test is not 100% certain, that it is a “probability”. This is similar to what happened in Physics with Heisenberg’s uncertainty principle, and it won’t be accepted immediately by everybody.
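
As a back-of-the-envelope illustration of such a metric (with made-up labels and a simple acceptance rule), one could compare the error rates of the AI and of a human against a shared ground truth:

```python
# Accept an AI system if its false-positive and false-negative rates are
# no worse than a human baseline. Labels below are illustrative only.

def error_rates(predictions, ground_truth):
    """Return (false_positive_rate, false_negative_rate) for binary labels."""
    fp = sum(1 for p, t in zip(predictions, ground_truth) if p and not t)
    fn = sum(1 for p, t in zip(predictions, ground_truth) if not p and t)
    negatives = sum(1 for t in ground_truth if not t)
    positives = sum(1 for t in ground_truth if t)
    return fp / negatives, fn / positives

# Hypothetical verdicts: True = "defect present".
truth = [True, False, True, True,  False, False, True, False]
ai    = [True, False, True, False, False, True,  True, False]
human = [True, True,  True, False, False, False, True, False]

ai_fpr, ai_fnr = error_rates(ai, truth)
h_fpr, h_fnr = error_rates(human, truth)

# Accept the AI only if it is at least as accurate as the human baseline.
accepted = ai_fpr <= h_fpr and ai_fnr <= h_fnr
print(f"AI: FPR={ai_fpr:.2f} FNR={ai_fnr:.2f} -> accepted: {accepted}")
```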

Furthermore, this inherent difficulty in testing an AI system means that its external validity can be fully proven only once the system is in production. I envisage the risk that, since AI systems won’t at first work as expected in production and will exhibit a lower quality than foreseen, they will meet with shallow acceptance, if not outright rejection.

There is, finally, an aspect which should be taken into careful consideration and which will, in my view, have the most significant impact on the skills needed by a tester: ethics. The debate about the ethical implications of AI has already started, with essential discussions about who should determine what is ethical and what is not, and who will be responsible for the “ethical conformity” of an AI system. Without entering into this discussion, I’d like to underline that testers will have to understand which ethical considerations affect the target system and develop strategies and techniques to test its ethical conformity. That, in turn, requires the tester to possess or develop a new set of skills and knowledge, as she/he will need to think ethically and to interpret ethical specifications.

In conclusion, I believe Artificial Intelligence will hugely impact Testing in the future, but not in the near one. We still need to take some necessary steps forward in understanding not only how AI can mimic humans, but also how that magnificent machine, the human brain, really works. Only then will we be able to implement an AI of good quality.