Introduction
Innovation is defined in different ways: as a product such as a new idea, method or device; as a process, such as the introduction and adoption of new ideas, discoveries and inventions; and as an outcome, such as significant measurable change. Here our focus is on innovation as a process.
In the NHS, annual spending on research and development, including the National Institute for Health Research (NIHR), was £1.2 billion in 2014–15, but over the same period annual spending to support the spread of innovation through the Academic Health Science Networks (AHSNs) was much less, at about £50 million.1
Evaluators seek to understand how and why healthcare innovations do or do not spread. The focus is often on the innovation (the technology) itself, although other factors are frequently critical in determining success or failure.2 Healthcare innovation is seldom a simple linear process; it involves a complex adaptive system in which unpredictability and uncertainty are normal.3
The NASSS (non-adoption, abandonment, scale-up, spread, and sustainability) framework helps explain non-adoption, abandonment and the challenges of scale-up, spread and sustainability of patient-facing health and care technologies across seven domains: the clinical condition(s) being treated; the technologies used; the value proposition; the adopter system (staff, patients, carers); the organisation(s); the wider context; and interaction between domains and adaptation over time.4
The work described here was prompted by our evaluation of digital innovations in health and care services, in particular the digital innovations and new care models led by Wessex AHSN and the Diabetes Digital Coach NHS Testbed led by West of England AHSN.
We looked for short, simple, generic survey tools to meet our evaluation needs but could not find what we sought. As a result, we developed a set of related measures, based on a review of the innovation literature and earlier experience of developing person-reported outcome measures (PROMs) and person-reported experience measures (PREMs). These measures are described below:
Innovation Readiness Score rates where users and organisations lie on the innovativeness spectrum, based on Rogers’ categories of innovator, early adopter, early majority and so on.5
Digital Confidence Score rates users’ digital literacy and their confidence in using digital products, distinguishing between digital natives and digital immigrants.6
Innovation Adoption Score rates the process of adoption before, during and after implementation, based on May’s Normalisation Process Theory (NPT).7
User Satisfaction rates users’ assessment of a specific digital product, combining customer satisfaction and user experience (in its widest sense).8
Behaviour Change identifies factors, such as capability, opportunity and motivation, that enable or prevent people from doing what is proposed, based on Michie’s COM-B model.9
These measures share the look and feel of the R-Outcomes family of short generic PROMs and PREMs.10 11
Design criteria included clarity, brevity, suitability for frequent use, multi-modality (suitable for multiple data collection modes, including smartphones), responsiveness, good psychometric properties, and easily understood scores and data visualisation. The scores generated need to be easy for all stakeholders to interpret and act on, and to be comparable for benchmarking.
The measures are short, with a low reading age, and are generic, applicable to any condition in any setting. Each has four items (exceptions are allowed), each with four response options. Options are labelled, colour-coded and use emojis, with the best option on the left and the least desirable on the right. For scoring, each option is allocated a score on a 0 to 3 scale: Strongly agree=3, Agree=2, Neutral=1 and Disagree=0. A higher score is always better.
A summary score for a group of four items is calculated by adding the scores for each item, giving a 13-point scale ranging from 0 (4 × Disagree) to 12 (4 × Strongly agree). When reporting results for a cohort, the mean score is transformed linearly to a scale from 0 to 100, where 0 indicates that all respondents chose the lowest score and 100 that all chose the highest. The 0–100 scale is familiar and enables comparison of item and summary mean scores on the same scale.
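To make the scoring arithmetic concrete, the minimal Python sketch below computes a respondent's 0–12 summary score and a cohort's linearly transformed 0–100 score. It is illustrative only, assuming responses are captured as the four labelled options; names such as summary_score and cohort_score_0_100 are hypothetical and not part of the R-Outcomes tooling.

# Illustrative scoring sketch (an assumption, not the authors' implementation).
# Responses to four items are mapped to 0-3, summed to 0-12, and the cohort
# mean is transformed linearly onto a 0-100 scale.

RESPONSE_SCORES = {"Strongly agree": 3, "Agree": 2, "Neutral": 1, "Disagree": 0}

def summary_score(responses):
    """Sum of four item scores: 0 (four Disagree) to 12 (four Strongly agree)."""
    if len(responses) != 4:
        raise ValueError("expected responses to four items")
    return sum(RESPONSE_SCORES[r] for r in responses)

def cohort_score_0_100(summary_scores):
    """Transform the cohort mean summary score (0-12) onto the 0-100 scale."""
    mean = sum(summary_scores) / len(summary_scores)
    return 100 * mean / 12

# Example: three respondents
cohort = [
    summary_score(["Strongly agree", "Agree", "Agree", "Neutral"]),  # 8
    summary_score(["Agree", "Agree", "Agree", "Agree"]),             # 8
    summary_score(["Strongly agree"] * 4),                           # 12
]
print(round(cohort_score_0_100(cohort), 1))  # cohort mean 9.33 on 0-12 -> 77.8 on 0-100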
Each measure was developed in a similar way: we identified the need for a measure, reviewed the literature, consulted colleagues and users, and designed prototypes; each measure then evolved through a series of iterations, with input from users and colleagues, over several months or years.