Original research

Women’s attitudes to the use of AI image readers: a case study from a national breast screening programme

Abstract

Background Researchers and developers are evaluating the use of mammogram readers that use artificial intelligence (AI) in clinical settings.

Objectives This study examines the attitudes of women, both current and future users of breast screening, towards the use of AI in mammogram reading.

Methods We used a cross-sectional, mixed methods study design with data from survey responses and focus groups. We conducted the research in four National Health Service hospitals in England, where we approached female workers over the age of 18 years and their immediate friends and family. We collected 4096 responses.

Results Through descriptive statistical analysis, we learnt that women of screening age (≥50 years) were less likely than women under screening age to use technology apps for healthcare advice (likelihood ratio=0.85, 95% CI 0.82 to 0.89, p<0.001). They were also less likely than women under screening age to agree that AI can have a positive effect on society (likelihood ratio=0.89, 95% CI 0.84 to 0.95, p<0.001). However, they were more likely to feel positive about AI used to read mammograms (likelihood ratio=1.09, 95% CI 1.02 to 1.17, p=0.009).

Discussion and Conclusions Women of screening age are ready to accept the use of AI in breast screening but are less likely to use other AI-based health applications. A large number of women are undecided, or have mixed views, about the use of AI generally and they remain to be convinced that it can be trusted.

Introduction

Population breast screening in England aims to detect breast cancer earlier, thus improving outcomes for women between the ages of 50 and 70 years. The National Health Service (NHS) Breast Screening Programme (NHSBSP) invites more than 2 million women for a test every year. In the light of the high volume of images to be read, artificial intelligence (AI) development is focusing on image reading technology.1–3 As studies confirm the diagnostic accuracy of AI products in breast cancer diagnosis, there is an emerging concern among clinicians that AI image reading may not be sufficiently focused on patients. A focus on ‘clinically meaningful endpoints such as survival, symptoms and need for treatment’ could mitigate the risks of overtreatment and false positives.4

In a healthcare context, where shared decision-making is increasing,5 patients are seeking a greater understanding of how a diagnosis is arrived at. Regulators of AI technology are starting to acknowledge the importance of being seen as trustworthy for uptake and adoption.6

Public attitudes to the use of AI and machine learning in healthcare are evolving. Social attitudes to the use of AI to support diagnosis are positive but people still want human involvement.7–11 Specifically in radiology, people want to be fully informed about the use of AI and want to retain human interaction in the diagnostic process.12 13 However, they hold positive views about the use of such technology to support clinician diagnosis and deliver faster, more precise and unbiased results.

The public are not passive recipients of care. They are essential stakeholders in the healthcare system. Their willingness to adopt new innovations can enable or constrain spread and scale.14 There is a need to understand how acceptable AI is in breast cancer screening services as well as the many ethical, social and legal implications of its use.15 A few qualitative studies, although with small sample sizes, have explored public perception of the use of AI in medicine.16–18 A recent survey conducted in the Netherlands involving 922 participants examined the perception of the use of AI to read mammograms. It found that the women surveyed did not support the use of AI without a human reader.19 If the benefits of AI are to be delivered in breast screening and the disbenefits minimised, then the public should be actively engaged in the design, development and monitoring of this technology.20 21

Our study seeks to address the gap in the research into public attitudes towards AI. We did this as part of a wider real-world testing of AI tools in the NHSBSP in England. We developed a short survey which collected both quantitative and qualitative data, and followed up with focus group discussions to understand the attitudes of a sample of women to the use of AI in breast screening. The NHSBSP currently invites women between the ages of 50 and 70 years for screening every 3 years. Mammograms are double read by two human readers.

This paper focuses on women’s attitudes to the possible future use of an AI second reader in the NHSBSP.

Materials and methods

This study used a prospective mixed methods design. It was conducted in four NHS trusts providing acute care in the East Midlands of England. All participants gave electronic informed consent to participate in the survey and focus groups.

Survey tool development and testing

We developed an open e-survey according to good practice guidelines,22 including the Checklist for Reporting Results of Internet E-Surveys,23 which is used for the development, administration and reporting of web-based surveys. Our research question, set out in the study protocol, was: how do the attitudes of women to the use of AI in the breast screening process affect the adoption and spread of these innovations? We conducted a review of the literature on the influence of adopter attitudes to AI in general and innovation adoption in healthcare specifically. Based on this review, we developed a set of open and closed questions. These were tested with a small sample group of women (n=10) for question clarity, underlying assumptions (bias), question sensitivity, problems with Likert scale labels, question order and online user experience.

The final version of the survey had six sections:

  1. Personal attributes, which included age.

  2. Experience of breast cancer (direct or indirect).

  3. Knowledge and experience of breast screening.

  4. Use of AI-based technology in everyday life.

  5. Attitudes towards AI-based technology in general.

  6. Attitudes towards the use of AI in breast screening (figure 1).

Figure 1

Survey map: the topics covered in the survey in the order the questions were presented. NHSBSP, NHS Breast Screening Programme.

The survey tool was submitted for ethical approval along with the study protocol.

Data collection

We used non-probability sampling because the topic being explored was under-researched and the study was exploratory rather than testing a hypothesis. The sample size for the survey was calculated based on a 1% response rate from the ≥18 years female population of the East Midlands of England, a confidence level of 95% and a margin of error of 2% (n=2435). This was submitted to the Health Research Authority as part of the ethical approval process. The survey was set up on a dedicated General Data Protection Regulation-compliant online survey platform. Information about it was shared via a range of site communication channels with women over the age of 18 years working or volunteering at four acute hospital sites in the East Midlands, and with their friends and relatives. Because the NHS is one of the largest and most diverse employers in the region, its workforce provided a good proxy for the wider population. Respondents were recruited between 4 December 2019 and 29 February 2020.

Information was gathered on age, ethnicity and employment status. This enabled us to identify any representation gaps in the sample cohort and guided targeted recruitment for the survey and focus groups. Focus group participants were recruited from the general population with a greater representation of women from black and minority ethnic groups since these were slightly under-represented in the survey. Due to COVID-19 restrictions, focus groups were conducted using a secure online video conferencing platform.

Data analysis

The survey responses were analysed using descriptive statistics to understand the current status of women’s views on AI-based technology generally and in the breast screening programme specifically. Likelihood ratios were used to determine the significance of differences between women under screening age and of screening age.
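The formula behind the likelihood ratios is not stated here. As a minimal sketch, assuming each one is the ratio of two group proportions with a Wald interval on the log scale and a two-sided z-test, a comparison between the two age groups could be computed as follows. The function name `proportion_ratio` and its return format are illustrative and are not taken from the study.

```python
import math
from statistics import NormalDist


def proportion_ratio(events_a, total_a, events_b, total_b, confidence=0.95):
    """Ratio of two proportions (group A / group B) with a log-scale Wald CI.

    Assumption: this is one common way to compare two proportions; the
    study does not specify which method it used.
    """
    p_a = events_a / total_a
    p_b = events_b / total_b
    ratio = p_a / p_b

    # Standard error of log(ratio) for two independent binomial samples.
    se_log = math.sqrt(1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b)

    # Critical value of the standard normal (about 1.96 for 95%).
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2)
    lower = math.exp(math.log(ratio) - z_crit * se_log)
    upper = math.exp(math.log(ratio) + z_crit * se_log)

    # Two-sided p-value for the null hypothesis that the ratio equals 1.
    z = math.log(ratio) / se_log
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return ratio, lower, upper, p_value


# Counts reported in the survey: use of technology for healthcare advice,
# women of screening age (1134/1747) vs women under screening age (1790/2349).
ratio, lower, upper, p_value = proportion_ratio(1134, 1747, 1790, 2349)
print(f"ratio={ratio:.2f}, 95% CI {lower:.2f} to {upper:.2f}, p={p_value:.3g}")
```

With these counts the sketch gives a ratio of 0.85 with a 95% CI of 0.82 to 0.89, which agrees with the values reported for this comparison.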

NVivo software (QSR International, UK), a qualitative and mixed methods data analysis tool, was used to organise and visualise qualitative data from the survey (open-ended questions with free-text responses) and the focus group transcripts. A hierarchical thematic framework was used to classify and organise data according to key themes, concepts and emergent categories. This approach allowed us to explore the data in depth while maintaining an effective and transparent audit trail, which enhances the rigour of the analytical process and the credibility of the findings.

Results

Sample

The survey was distributed to a population of 23 332 men and women working at four NHS trusts in the East Midlands. Of the consenting participants (n=4132), 4096 were identified as women. The respondents (n=4096) covered all the age bands targeted, with the largest group from the 50–59 years age band. Most women who took part were in paid employment (92.8%, 3802/4096), with the remainder retired, self-employed, carers of dependants or volunteers. The ethnicity profile of the respondents was similar to that of the East Midlands, except that Asian/Asian British women were under-represented (2.88% of survey responses as opposed to 6.5% of the East Midlands population). This guided the purposive sampling strategy for the focus groups, where 20% of women recruited were Asian/Asian British.

The 4096 women were segmented into two groups: 1747 (42.7%) were or had recently been of screening age and 2349 (57.3%) were under screening age (<50 years) and, thus, future users of the programme (table 1).

Table 1
Age bands of the survey participants

Differences in self-reported technology use

Women of screening age were less likely to use technology platforms or applications for healthcare advice, 64.9% (1134/1747), than women under screening age, 76.2% (1790/2349); likelihood ratio=0.85, 95% CI 0.82 to 0.89, p<0.001. Women of screening age were also less likely to trust the recommendations of these platforms, 57.1% (997/1747), than women under screening age, 61.7% (1449/2349); likelihood ratio=0.93, 95% CI 0.88 to 0.97, p=0.003 (figure 2). These differences replicate the results of similar studies of attitudes to technology across whole populations.24 25

Figure 2

The self-reported level of trust that women under and of screening age had in everyday artificial intelligence-powered applications when seeking health advice.

Differences in attitudes towards the effect of AI on society

Women of screening age were less likely to agree that AI can have a positive effect on society, 47.1% (822/1747), than women under screening age, 52.9% (1242/2349); likelihood ratio=0.89, 95% CI 0.84 to 0.95, p<0.001. Women of screening age were also more likely to be undecided on the issue, 47.7% (834/1747), than women under screening age, 41.3% (969/2349); likelihood ratio=1.16, 95% CI 1.08 to 1.24, p<0.001 (figure 3). The likelihood of disagreeing that AI can have a positive effect on society was similar among women of screening age, 5.2% (91/1747), and women under screening age, 5.9% (138/2349); likelihood ratio=0.89, 95% CI 0.69 to 1.15, p=0.359.

Figure 3

The self-reported level of agreement with the statement ‘artificial intelligence can have a positive effect on society’ for women under and of screening age.

Sentiment analysis of free-text responses on the issue of whether AI can have a positive effect on society found that many women, who had a negative or mixed view of the effect of AI in society, were unsure of why they felt this way (n=96). However, they described AI as an inevitable part of their lives in the future (n=20). Those who did express a view cited:

  1. Concern about the reliability and safety of technology (n=123).

  2. A lack of trust in the technology itself or the systems that sit around it (n=65).

  3. A fear about a combination of over-reliance on AI and job losses that might ensue (n=32).

  4. Concern about the absence of the human touch in interactions (n=46).

Differences in attitudes towards the use of AI in breast screening

Women’s baseline understanding of the current process of reading mammograms was weak. Only 22% of women under screening age and 27% of women of screening age identified that two human readers blind read all screening mammograms in the NHSBSP. Sentiment analysis of free-text responses (n=3987) showed that the largest proportion of women overall were positive about using AI in breast screening, 47.2% (1880/3987). The next largest group expressed mixed or undecided views, 35.9% (1432/3987), and 16.9% (675/3987) expressed a negative view. A further 109 women did not provide a free-text response, 2.7% of the total 4096 survey respondents (figure 4). Women of screening age were more likely to feel positive about using AI to read mammograms, 49.5% (849/1714), than women under screening age, 45.4% (1031/2273); likelihood ratio=1.09, 95% CI 1.02 to 1.17, p=0.009. Consistent with this, women of screening age were less likely to have mixed or neutral feelings on the issue, 34.1% (584/1714), than women under screening age, 37.3% (848/2273); likelihood ratio=0.91, 95% CI 0.84 to 0.99, p=0.036. Women of screening age, 16.4% (281/1714), and women under screening age, 17.3% (394/2273), were similarly likely to have negative views on the use of AI in breast screening; likelihood ratio=0.95, 95% CI 0.82 to 1.09, p=0.434.

Figure 4

The sentiment expressed in free text by women under and of screening age when asked how they felt about artificial intelligence being used to read mammograms in breast screening.
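The overall sentiment percentages can be recomputed from the group counts reported in this section. The sketch below is illustrative only: the dictionary layout and variable names are assumptions, while the counts are those given in the text.

```python
# Free-text sentiment counts by group, as reported in the text.
counts = {
    "screening age": {"positive": 849, "mixed": 584, "negative": 281},
    "under screening age": {"positive": 1031, "mixed": 848, "negative": 394},
}

# Totals per sentiment across both groups, and the overall number of responses.
overall = {
    sentiment: sum(group[sentiment] for group in counts.values())
    for sentiment in ("positive", "mixed", "negative")
}
n_responses = sum(overall.values())

for sentiment, n in overall.items():
    print(f"{sentiment}: {n}/{n_responses} = {100 * n / n_responses:.1f}%")

# Within-group percentages, the basis for the between-group comparisons.
for group_name, group in counts.items():
    group_total = sum(group.values())
    shares = {s: round(100 * n / group_total, 1) for s, n in group.items()}
    print(group_name, group_total, shares)
```

The two group totals (1714 and 2273) sum to the 3987 free-text responses analysed.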

Thematic analysis of the free-text data focusing on the perceived benefits of using AI in the breast screening programme showed that women were most likely to say that they were not sure what these would be (n=543). When they did express a view, the most frequently mentioned perceived benefits were:

  1. Increased efficiency (n=162).

  2. Improved reliability (n=263).

  3. Greater safety (n=139).

A substantial number of women expressed the view that AI in breast screening would and should happen in the future (n=847).

Overall, women of screening age are less likely to use AI for health advice in their everyday life or have a positive view of its effect on society but are more likely to have a positive view on the use of AI in breast screening (table 2).

Table 2
Survey results summary

Detailed understanding of attitudes towards the use of AI in breast screening

A total of 25 women took part in six focus groups conducted during July 2020. Overall, 19/25 had either experienced a breast cancer diagnosis themselves or knew someone who had and 18/25 had attended a breast cancer screening appointment. Overall, 15/25 of the women who took part knew that two readers looked at mammograms. Therefore, they were a more informed group than the general population surveyed.

Many of the women who took part expressed the view that the use of AI in healthcare and specifically in the breast screening programme was inevitable. Some saw a positive contribution being made by AI generally. They identified the following key benefits from using AI in breast screening:

  1. Increased efficiency.

  2. Improved reliability.

  3. Improved outcomes and improved safety/fewer errors.

They also hypothesised that introducing AI into the breast screening programme might:

  1. Release staff for higher value patient-centred activities.

  2. Save money for the service.

  3. Help to address the workforce shortage within the breast screening programme.

The main concerns that were expressed by the women were:

  1. The absence of the ‘human touch’ in the diagnostic process.

  2. A lack of clarity on how the AI tools will be governed.

  3. Potential discriminatory bias.

  4. A lack of clarity on how data privacy will be protected.

When asked what kind of actions they thought would mitigate some of their concerns, the women suggested that the breast screening process would always need to involve humans. For some women this meant human oversight of the AI technology which undertakes most of the activity including decision-making. For others, the human role is pre-eminent, with AI used only to augment clinical activity and decision-making. The women assumed that this technology would never be used without clear evidence of its effectiveness. They expected the impact on equity of access to breast screening to be closely monitored through governance processes.

Women were divided on whether or not they would want to be informed if AI tools were being used as part of the breast screening process. However, they agreed overall that women should be given information about the role of AI in breast screening as part of the process of informed consent when taking part in the breast screening programme.

Discussion

As the use of AI in the field of radiology accelerates rapidly,26–29 attention has focused on the performance and safety of the algorithms being used. Real-world deployment of these tools is imminent and a greater understanding of radiologist and radiographer attitudes to the technology in different countries across the globe is needed.30–38

This large-scale study, aimed at understanding the attitudes of healthy users to the use of this technology in diagnosis, has shown that women of screening age are open to the use of AI in breast screening. However, they are less likely than women under screening age to use other AI-based health applications. These differences replicate the results of similar studies of attitudes to technology across whole populations.24 25 There are large proportions of women in both groups who are undecided or hold mixed views about the use of AI. They cite a lack of understanding and trust in the technology and a desire to know more. This bears out the findings of recent smaller scale studies.16–18 Women of all ages see human interaction in diagnosis as critical to their experience of high-quality care.

Women of screening age have an immediate interest in screening that is as accurate, quick and reliable as possible. Previous studies7 11 found that those who are identified as ‘patients’ are more likely to perceive positive effects of new technology than those who are identified as ‘healthy users’.

In this case, women of screening age share ‘patient’ attributes as they are currently part of the NHSBSP. The openness of women of screening age to the use of AI in breast screening is moderated by:

  1. A desire to understand more about the technology.39

  2. The evidence to support its performance.40

  3. Its use to augment and not replace clinical interaction and decision-making.13

These moderators are evident in the literature on the adoption of digital health technology generally. Clinical adoption of novel digital technology, including AI, relies on robust evidence of accuracy through high-quality clinical trials.41 There is little evidence yet of a similar direct relationship for public adoption of AI in health. This goes some way to explain the large number of respondents who were equivocal or undecided in their attitudes towards the use of AI in breast screening.

Mass media stories and the views of the clinical professionals they are interacting with are more influential than direct exposure to evidence of accuracy.42 43 Several women responding to the survey highlighted the positive media representation of the Nature article on the performance of AI in breast image reading.1 This influenced their perception of AI in breast screening positively. Women’s views on the importance of retaining human interaction in the diagnostic process confirm the findings of previous studies.9 12

The number of survey responses was substantially greater than targeted in the study protocol (4096 received against a target of 2435). However, women of Asian ethnicity were under-represented (2.88% in the survey, 6.5% in the East Midlands population). To address this, this group was successfully targeted for inclusion in the focus groups by design. Women in paid employment were also over-represented because NHS employees were used as a proxy for the general population. Some of the potential selection biases introduced by the non-probability sampling method were addressed by the mixed methods design of the wider study and purposive sampling for the focus groups. The authors recommend that future survey administration should use probability sampling. The survey itself is not a psychometrically tested tool. This limits the generalisability of the findings, although adherence to accepted standards for research survey development has minimised this limitation.

Women invited to population breast screening are important stakeholders in the service and how it is delivered.44 This study demonstrates that women of screening age are open to the use of AI in breast cancer screening. However, there are large proportions of women who are undecided or have mixed views about the use of AI and remain to be convinced that it can be trusted. Understanding their attitudes will be an important factor in the acceptance and adoption of the AI-based technology. Regulators of health technology are starting to understand this.45 Attitudes change over time in response to multiple intrinsic and extrinsic factors. Education and dissemination of information about the use of AI in the clinical pathway will need to be considered.