As concerns about the privacy and security implications of contact tracing apps for COVID-19 show, establishing trust in software is still a real challenge.
I’ve been following reports about these apps closely, most notably the UK’s NHSX COVID-19 apps, and have been interested to read medical, technical and ethical analysis from around the world. What all of these perspectives underline for me is the critical role of public trust in the success or failure of such applications.
To a large extent, this is about trust in the people and organisations creating them, the governments that have commissioned them, the value of the data that they will collect in the fight against the virus, and the possible misuse of that data. But trust in the software itself is also a key factor, with commentators expressing concerns about its ability to perform its intended function, as well as the possibility that it might be used for other purposes.
NHSX, the ‘digital innovation unit’ of the National Health Service, which is responsible for developing officially-sanctioned apps in the UK, have taken a number of steps in response to this. They published their goals and objectives for the apps, asked the National Cyber Security Centre (NCSC) to review their design, and made the app source code and documentation available under an open source license.
These actions have been broadly welcomed, but how much can they help to inform our understanding of this software, and the ways it might be used — or misused? And how can we use this information to decide whether or not to trust it?
How do we trust software?
Concerns about privacy and security in smartphone and web-based applications are nothing new. In Europe, legislation such as GDPR and organisations like the NCSC have come into being to try to address some of these concerns, and growing public awareness of these issues has led organisations such as Facebook and Google to add ever more sophisticated privacy options and security features to their software.
Many of these measures are introduced in response to public perception, by organisations that are largely self-regulating. When software has a more critical role to play, as part of a system such as a car, an aircraft, a heart monitor or a power station, then it may be required to satisfy more rigorous safety standards, which are regulated and overseen by a national or international authority.
But how can consumers have confidence in these measures and approaches? How can organisations know whether software components that they use as part of their products — whether from suppliers, partners or other parts of their own organisation — satisfy their criteria, or follow the applicable standards and regulations? Or be confident that the approaches in use offer customers sufficient protection against the threats that such measures are intended to prevent?
My colleagues at Codethink have been thinking about these questions for a number of years, publishing a white paper on the topic with the Institute for Strategy, Resilience & Security (UCL) in 2017 and working with shared initiatives like the Trustable and ELISA projects. I’ve spent a lot of time wrestling with these questions myself, in the course of working on the Trustable methodology.
One thing that we have observed in the course of this work is that the fundamental questions about trust are the same, regardless of the software domain or whether the risks are concerned with safety, security, privacy or liability. The nature and potential consequences of these risks may differ, but if we want to decide whether software can be trusted to address such concerns, then we need to ask:
- What are the potential hazards of using the software?
- What steps can be taken to mitigate the associated risks?
- What claims (if any) does the software’s provider make regarding these risks?
- How can we have confidence in these claims?
Identifying hazards
The first of these questions can be the hardest to answer. There may be an unbounded list of ways in which software can go wrong, have unintended consequences, or be compromised or exploited by a bad actor. Identifying all of these hazards during the development of the software may not be feasible, especially if it is based on a new technology or concerned with a novel application of that technology.
Where software is part of an electronic or electro-mechanical system, hazards may relate to hardware failures, and the software’s ability to deal with them. For software in other roles, identifying hazards may be an exercise of imagination rather than engineering, bounded by the limits of available cost and time. However, past issues identified in comparable software — performing a similar role, using the same technology, or subject to interference by the same external factors — can be a rich source of inspiration.
One approach to this that Codethink has been exploring is the use of System-Theoretic Process Analysis (STPA) to develop safety and security criteria, examining software from a whole system perspective, where the definition of the system may include its users, the environment in which it operates and even the organisations responsible for regulating it.
This technique begins by identifying the losses (of life, personal data, control, time, etc) that we are trying to prevent and then uses control structure models at progressive levels of detail to identify hazards (system conditions leading to loss), before systematically and iteratively determining how these may occur. By identifying unsafe control actions (the circumstances that may lead to such hazards), it is then possible to describe constraints (measures to prevent them) that may be included in the software’s design and testing.
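To make this more concrete, here is a minimal sketch in Python of how the elements of an STPA analysis (losses, hazards, unsafe control actions and constraints) might be recorded so that they can be traced through to design and testing. The identifiers and the single worked example are invented for illustration and are not drawn from any real analysis of a contact tracing app.

```python
# A minimal sketch of how STPA elements might be recorded so that they can be
# traced through to design and testing. The identifiers and the worked example
# are invented for illustration, not drawn from any real analysis.
from dataclasses import dataclass, field
from typing import List


@dataclass
class Loss:
    ident: str
    description: str                      # e.g. loss of life, personal data, time


@dataclass
class Hazard:
    ident: str
    description: str                      # system condition that can lead to a loss
    losses: List[Loss] = field(default_factory=list)


@dataclass
class UnsafeControlAction:
    ident: str
    description: str                      # circumstance that may lead to a hazard
    hazards: List[Hazard] = field(default_factory=list)


@dataclass
class Constraint:
    ident: str
    description: str                      # measure to prevent the unsafe control action
    unsafe_control_actions: List[UnsafeControlAction] = field(default_factory=list)


# Hypothetical example for a contact tracing app
loss = Loss("L-1", "Loss of personal data to a third party")
hazard = Hazard("H-1", "Identifiable contact events exist outside the device", [loss])
uca = UnsafeControlAction("UCA-1", "App uploads contact events before anonymisation", [hazard])
constraint = Constraint("C-1", "Contact events must be anonymised on-device before upload", [uca])
```

Even a structure this simple makes it possible to ask, for any constraint in the design, which losses it is ultimately intended to guard against.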
For a contact tracing application, hazards already identified by commentators include the leaking of personal data that can be exploited by third parties and the possibility that self-reporting mechanisms and existing security vulnerabilities may be exploited by bad actors to obtain such data or disseminate false information. Another danger, which emphasises the importance of public understanding of an app’s purpose, is the possibility that apps will have a negative impact on people’s behaviour, if they are given the false impression that using the app makes them safer.
Further hazards could be identified by pooling the collective experiences of developers from around the world. Even if there is a lack of consensus about the right approach, developers will almost certainly identify hazards that will be common to all solutions. An open database of these hazards might help everyone to achieve their goals faster.
Examples of hazards that other developers have identified include a problem with iOS apps operating in low power mode, which was reported by the Australian government for their app. Advocates of a decentralised approach have also published an analysis of potential issues for a German centralised solution, which might equally apply to the UK’s approach.
Mitigating risk
Having identified some hazards, developers need to consider how to mitigate the risks associated with them. This also requires us to understand and consider the software’s intended function or purpose, as well as that of any larger system (or systems) of which it may be a part. The viability of a product may be compromised if safety or security measures are applied too broadly, so it may be necessary to accept some level of risk in order to preserve its usability.
Not all risks can be managed by the software itself, so external measures such as instructions for users, hardware fail-safes or fault monitoring and reporting may be used for some risks, especially where a software solution would be costly or impractical. Other strategies include redundancy, where a second component is included to achieve the required criteria when the primary one fails, and resiliency, which focuses on rapid recovery from failures. These allow components of a system to fail without jeopardising the safety or performance of the system as a whole. Choosing the appropriate strategy may require an assessment of the relative severity or likelihood of risks, to ensure that measures to address the most critical are built into the software’s design.
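As a rough illustration of that last point, the sketch below ranks some hypothetical risks by a simple severity times likelihood score, and uses a threshold to decide which must be mitigated in the software’s design and which might instead be handled by external measures or accepted. The scales, the threshold and the example risks are all assumptions rather than the output of any real assessment.

```python
# Illustrative only: rank hypothetical risks by a simple severity x likelihood
# score and use a threshold to decide which must be mitigated in the design.
# The scales, threshold and example risks are all assumptions.
SEVERITY = {"negligible": 1, "minor": 2, "major": 3, "critical": 4}
LIKELIHOOD = {"rare": 1, "unlikely": 2, "possible": 3, "likely": 4}

risks = [
    {"name": "personal data leaked in transit", "severity": "critical", "likelihood": "possible"},
    {"name": "false self-reports accepted", "severity": "major", "likelihood": "likely"},
    {"name": "contact logging stops in low power mode", "severity": "major", "likelihood": "possible"},
]

DESIGN_THRESHOLD = 9  # above this, the risk must be addressed in the software's design


def score(risk):
    return SEVERITY[risk["severity"]] * LIKELIHOOD[risk["likelihood"]]


for risk in sorted(risks, key=score, reverse=True):
    action = "mitigate in design" if score(risk) >= DESIGN_THRESHOLD else "external measure or accept"
    print(f"{risk['name']}: score {score(risk)} -> {action}")
```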
Approaches to risk mitigation in the realm of safety, as practised by medical device and vehicle manufacturers, tend to rely on tried and tested solutions, rigorous processes and standards, which are created by authorities such as the ISO or IEC, or industry-specific bodies like MISRA. These are intended to encapsulate the ‘lessons learned’ from past experiences and eliminate the conditions that permitted failures to occur.
In many cases, however, such processes and standards reflect the norms for software — or the characteristics of the hardware — at the time they were developed, which in some cases was several decades ago. Others are specific to an industry or certain types of software. Applying such processes to complex or dynamic systems with rapid development cycles, extensive ecosystems of dependencies, extensible external interactions and continuous update cycles can be nearly impossible.
Open source software and web and smartphone apps are prime examples of this, but they have also pioneered different approaches to testing and the continuous delivery of improvements and bug fixes. Combining the power of continuous integration tools and cloud computing to verify software against safety, security and privacy criteria could revolutionise this domain. Subjecting such tests to public scrutiny by making them open source could have an even greater impact.
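As an illustration of what such a check might look like in practice, here is a pytest-style sketch that could run on every commit, asserting that an app’s upload payload contains no direct personal identifiers. The build_upload_payload function and the forbidden field names are hypothetical stand-ins for whatever code a real app uses to assemble its uploads.

```python
# A sketch of the kind of privacy check that could run in CI on every commit.
# build_upload_payload is a hypothetical stand-in for whatever code a real app
# uses to assemble the data it sends to a central server.
FORBIDDEN_FIELDS = {"name", "phone_number", "email", "gps_location"}


def build_upload_payload(contact_events):
    # Hypothetical: a real app would build this from its own data model.
    return {"anonymous_id": "rotating-token-123", "contact_events": contact_events}


def test_payload_contains_no_direct_identifiers():
    payload = build_upload_payload([{"rssi": -60, "duration_s": 300}])
    assert FORBIDDEN_FIELDS.isdisjoint(payload.keys())


def test_contact_events_contain_no_direct_identifiers():
    payload = build_upload_payload([{"rssi": -60, "duration_s": 300}])
    for event in payload["contact_events"]:
        assert FORBIDDEN_FIELDS.isdisjoint(event.keys())
```

Because checks like these are cheap to run, they can be applied to every change rather than to a single audited version.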
By making their COVID-19 apps open source, NHSX have taken a first step in this direction, but early reports suggest that they have not yet embraced the continuous testing approach. They have, however, benefited from hazard analysis by the wider technical community, who have identified security flaws in private key generation and use of insecure web resources through scrutinising the code.
Claims, confidence and evidence
Assuming that a developer has made some effort to identify hazards and mitigate the associated risks for their software, how can we, as potential users of that software, learn about the measures they have used? And if they make claims about these measures, what can give us confidence that their claims are true — and more specifically, that they remain true for the specific version of the software that we are using?
For software with a safety-critical role, this kind of assurance may come from regulatory oversight, or standards-based audit by an independent safety consultant. However, this approach is both time-consuming and expensive, and does not lend itself to software that requires short time-to-market or regular update cycles. It can also mean that a full audit of software designs, processes and outputs is only conducted for specific milestones or at periodic intervals. Furthermore, simple adherence to standards of this kind does not guarantee the risk mitigation that hazards merit.
Best practice for supporting claims relating to safety, security and privacy would be to document the software’s goals relating to hazards in these areas, the risks that it has been designed to mitigate, and the measures used to verify this. This kind of approach is mandated by many standards, and supported by elaborate tooling and documentation, tracing requirements through design and implementation to test.
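A lightweight version of that traceability might look like the sketch below, which links each claimed goal to the tests intended to verify it and flags any claim with no supporting verification. The claim identifiers, descriptions and test names are invented for illustration.

```python
# A lightweight traceability sketch: link each claimed goal to the tests that
# verify it, and fail the build if any claim has no supporting verification.
# Claim identifiers, descriptions and test names are invented for illustration.
CLAIMS = {
    "PRIV-1": "No direct personal identifiers leave the device",
    "SEC-1": "All uploads use TLS 1.2 or later",
    "SAFE-1": "Contact logging continues in low power mode",
}

TRACE = {
    "PRIV-1": ["test_payload_contains_no_direct_identifiers"],
    "SEC-1": ["test_upload_rejects_plain_http"],
    "SAFE-1": [],  # a gap: a claim with no verifying test yet
}

untraced = {claim: CLAIMS[claim] for claim, tests in TRACE.items() if not tests}
if untraced:
    raise SystemExit(f"Claims without verification: {untraced}")
```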
Another factor that must be considered with most modern software is the near-inevitability of updates. Even if no new features are envisaged, it is almost impossible to identify all of the risks associated with software before it is released, not least because unforeseen hazards or risks may arise as a result of new technologies or interactions. When regular updates are required, lengthy verification and validation processes or oversight by independent external regulators can be very challenging.
Open source software projects might seem to be in a position to offer the gold standard for providing consumers with assurances about their software, as they open up source code and tests to public scrutiny. However, in practice the licenses for such projects explicitly avoid making any claims, and include disclaimers for warranty and liability.
Nevertheless, there is nothing to prevent organisations that use such software as part of their product (or indeed open source projects themselves) from providing evidence that might be used to support such claims, such as designs, tests and records of verification and validation activities. Generating and maintaining such evidence as part of a continuous delivery process could further increase confidence, by ensuring that the risk mitigation measures are updated and applied for every version of the software, not just a ‘sanctified’ audit version.
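One way of doing this, sketched below under assumed field names and example values, would be to emit a machine-readable evidence record for every build, tying per-claim test outcomes to the specific version and commit so that the record can be published alongside each release.

```python
# Sketch of emitting a machine-readable evidence record for every build, so
# claims can be checked against the specific version a user is running.
# Field names and example values are assumptions.
import json
from datetime import datetime, timezone


def build_evidence_record(version, commit, test_results):
    """Tie per-claim test outcomes to a specific version and commit."""
    return {
        "version": version,
        "commit": commit,
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "test_results": test_results,  # e.g. {"PRIV-1": "pass", "SEC-1": "pass"}
    }


if __name__ == "__main__":
    record = build_evidence_record(
        version="1.2.0",
        commit="abc123",  # in a real pipeline this would come from the VCS
        test_results={"PRIV-1": "pass", "SEC-1": "pass", "SAFE-1": "fail"},
    )
    with open("evidence.json", "w") as fh:
        json.dump(record, fh, indent=2)
```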
Closed-source developers might also make use of an open and continuous delivery approach to generating and sharing evidence for their own software, publishing their safety, security and/or privacy goals, along with tests and test results that demonstrate how they have been met, even if they still wish to keep their source code private. Where independent oversight is required, this generated evidence could then be automatically shared with an external safety assessor, or a regulatory authority.
As I noted at the start, NHSX published a detailed set of claims about their design goals for the COVID-19 apps and the way that they will be used, acknowledging the critical role of public trust in the success of their undertaking, and noting: “To earn that trust, we will continue to work based on transparent standards of privacy, security and ethics.”
Proponents of other solutions, many using a different model to the NHSX app, have also shared claims about their software in an effort to reassure the public. Google and Apple published a joint statement about the changes they have introduced in iOS and Android to facilitate the development of contact tracing apps “with user privacy and security central to the design.”
While laudable, publishing this kind of information is only a beginning. Without details of the hazards that have been considered in software’s design and implementation, the measures that have been used to mitigate the associated risks, and evidence to support this for specific versions of software, we have no way to make a truly informed decision about trust. While undertaking the analysis necessary to make such a decision will never be feasible for the average consumer, by enabling independent technical experts to do so, and to share their results for public consumption and peer review, it might become possible to build ‘webs of trust’ to give consumers greater confidence in developers’ claims.
Looking to the future
Evaluating the extent to which we can trust software will always be a challenging process. Competition between rival organisations (and even national governments) to create the best software for a particular purpose is often at odds with the public interest — which is to have confidence in the software that plays an ever-more-critical role in our lives, often without our making a conscious or informed decision.
It is no longer sufficient to evaluate a piece of software once, or even once in a while, in order to have confidence in it. As the NCSC’s report on the NHSX COVID-19 application notes:
“Code is truth. This document is correct at the time of writing, but the system is still in development, so there may be detail changes before release. After release, there will certainly be updates to the system and this document, as we learn more about large scale, real world deployment of the app and the mechanisms by which the virus spreads.”
Clearly identifying the risks that may be associated with using software, and the measures that have been used to mitigate these, can go a long way towards building trust in software, and in the people and organisations that create it. Generating evidence that these measures have been applied as part of the software creation and refinement process has the potential to make continuous scrutiny of a provider’s claims a practical possibility.
As illustrated by the decision of so many developers of COVID-19 contact tracing apps from around the world to make their applications open to public scrutiny, the open source model is a natural choice for products that need to win consumer trust. The software model behind a report that informed the UK government’s COVID-19 strategy has also been made open source, prompting many to question why this hadn’t happened sooner. However, simply sharing source code is not sufficient; transparency about the safety, security and privacy goals that have been considered in software’s design and implementation, and the measures that have been used to achieve these, are also required in order to secure that trust.
For the contact tracing apps, and for other software where trust is critical and many parallel or competing solutions are developed, sharing information and evidence about hazards and risk mitigation strategies in the public domain could help to make such software safer and more secure. While many organisations already share expertise in this way, through their contributions to standards bodies, this is often accessible only to members or at considerable cost. Extending the benefits of the open source model to this domain would serve the public interest, by ensuring that all developers are aware of hazards that apply to their software, while simultaneously serving the commercial interests of software creators, by enabling them to focus on other differentiating factors.