Introduction
As of January 2021, more than 64 countries and regions have developed contact-tracing apps to limit the spread of COVID-19.1
These apps use Bluetooth signals to log when smartphone owners (users) are close to each other, so that if a user later tests positive for COVID-19, an alert can be sent to the other users who have recently been in close contact with that user. Figure 1 shows several screenshots from an example app, depicting the typical features of these apps.
In the UK, three different apps have been developed and publicised by the regional and national governments for different constituent countries: the NHS COVID-19 app (for England and Wales), the StopCOVID NI app for Northern Ireland and the Protect Scotland app for Scotland. Reports1 indicate that the total cost of the NHS COVID-19 app alone is expected to top £35 million.
The apps have been promoted as a promising tool to help bring the COVID-19 outbreak under control. However, there are many articles in the academic literature2 and also the media questioning the efficacy and public adoption of these apps. For example, a systematic review3 of 15 studies in this area found that ‘there is relatively limited evidence for the impact of contact-tracing apps’. A news search for ‘what went wrong with UK contact-tracing apps’ would return a few hundred hits.
One cannot help but wonder about the reasons behind the low adoption of the apps by the general public in the UK and many other countries. The issue is multifaceted, complex and interdisciplinary, as it relates to fields such as public health, behavioural science,4 epidemiology, information technology (IT) and software engineering. Since these apps are essentially software systems, we investigate them through a ‘software in society’ lens5 in this paper. Software in society refers to the role or position of software systems (eg, mobile apps) in society, given that they are used by billions of people. Software systems should be of high quality and should be usable and useful for end users, who are mostly non-technical people.
The software aspects of these apps are also quite diverse in themselves, for example, whether the app software works as intended (eg, will it send the alerts to all the recent contacts?) and whether apps developed by different countries will cooperate/integrate (when people travel between countries). A related news article reported that a large number of developers worldwide have reported a large number of defects in the NHS app (bit.ly/FlawsInNHSApp).
An interesting source of knowledge about user experiences is the large number of user reviews available in the two major app stores: the Google Play Store for Android apps and the Apple App Store for iOS apps. A review often contains information about a user’s experience with the app and opinion of it, feature requests or bug reports.6 Many insights can be mined by analysing the user reviews of these apps to find out what end users think of contact-tracing apps, and that is what we analyse and present in this paper.
The nature of our analysis is ‘exploratory’,7 as we want to extract insights from the app reviews which could be useful for different stakeholders, for example, app developers, public-health experts, decision makers and the public. The two research questions (RQs) that we explore are (1) what are the users’ satisfaction levels with the three UK apps? and (2) what are the main issues (problems) that users have reported about the apps? While some studies8 have shown that there may be some inherent negative bias in public app reviews (in app stores), many researchers6 and practitioners are widely using app reviews to derive improvement recommendations on the apps.
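As a minimal illustration of how RQ1 could be approached, the sketch below summarises star ratings per app. It assumes that each review carries a star rating from 1 to 5, as in both app stores. The sample data, the function name `summarise_ratings` and the use of the mean rating as a proxy for satisfaction are illustrative assumptions and do not represent the data or the method of this study.

```python
from collections import Counter
from statistics import mean


def summarise_ratings(reviews):
    """Summarise (app_name, stars) pairs per app.

    Returns, for each app, the number of reviews, the mean star rating
    (rounded to two decimals) and the count of reviews per star value.
    """
    stars_by_app = {}
    for app, stars in reviews:
        # Assumption: ratings are whole stars between 1 and 5.
        if stars not in (1, 2, 3, 4, 5):
            raise ValueError(f"unexpected star rating: {stars!r}")
        stars_by_app.setdefault(app, []).append(stars)

    return {
        app: {
            "n": len(stars),
            "mean": round(mean(stars), 2),
            "distribution": dict(sorted(Counter(stars).items())),
        }
        for app, stars in stars_by_app.items()
    }


# Hypothetical sample data, invented purely for illustration.
sample_reviews = [
    ("NHS COVID-19", 1),
    ("NHS COVID-19", 5),
    ("NHS COVID-19", 3),
    ("Protect Scotland", 4),
    ("Protect Scotland", 2),
]

if __name__ == "__main__":
    for app, summary in summarise_ratings(sample_reviews).items():
        print(app, summary)
```

A mean rating hides polarisation (eg, many 1-star and many 5-star reviews), which is why the sketch also returns the full star distribution.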
User feedback has long been an important approach for understanding the success or failure of software systems, traditionally in the form of direct feedback or focus groups and more recently through social media (eg, tweets about a given app on Twitter) and reviews submitted in app stores.9 A systematic literature review6 of the approaches used to mine user opinion from app store reviews identified a number of approaches used to analyse such reviews and some interesting findings such as the correlation between app rating and downloads.
Several papers related to this work have been published. For example, a recent paper10 focused on sentiment analysis of user reviews of the Irish app. Another recent paper11 analysed the user reviews of the apps of 16 countries (the UK was not included) and presented thematic findings on what went wrong with the apps, for example, lack of citizen involvement, lack of understanding of the technological context of users and ambitious technical assumptions without cultural considerations.
In another related work, we published online a comprehensive technical report12 analysing the review data of nine European apps from (1) England and Wales, (2) Scotland, (3) Northern Ireland, (4) Ireland, (5) Germany, (6) Switzerland, (7) France, (8) Finland and (9) Austria. In the current paper, our goal is to go into more depth and focus on the three UK apps.
In addition to the academic (peer-reviewed) literature, the grey literature (such as news articles and technical reports) contains plenty of articles on the software engineering aspects of contact-tracing apps. For example, an interesting related news article was entitled ‘UK contact-tracing app launch shows flawed understanding of software development’ (www.verdict.co.uk/contact-tracing-app-launch/). The article argued that ‘In a pandemic, speed is critical. When it comes to developing high-quality software at speed, using open-source is essential, which other nations were quick to recognize’. The article also criticised the approach taken by the UK healthcare authorities in developing their app from scratch: ‘Countries such as Ireland, Germany, and Italy used open-source to build [develop] their own applications months ago. Sadly the UK did not follow suit, and wasted millions of pounds and hours of resources trying to build its own version’.
Another motivating factor for this study is the consulting engagement of the first author in relation to NI’s StopCOVID NI app in the summer of 2020. Some of his activities included peer review and inspection of various software engineering artefacts of the app, for example, design diagrams, test plans and test suites; see page 13 of an online report by NI’s Health and Social Care authority (https://covid-19.hscni.net/wp-content/uploads/2020/07/Expleo-StopCOVIDNI-Closure-Report-V1.0.pdf). In the project, a need was identified to review and mine insights from user reviews in order to be able to make improvements in the app.
In the rest of this paper, we first describe our method and data collection approach, and then present the results of our analysis. We close the paper with a discussion and conclusions.