Two BMJ Health & Care Informatics editors’ choice papers present insights based on case studies from real-world data and machine learning models for clinical risk prediction use cases. Seneviratne et al focus on case management to demonstrate how one might implement their proposed user-centred design toolkit consisting of process maps, storyboards and four questions.1 This toolkit was developed to address the tendency to develop machine learning models opportunistically, based on the availability of data, rather than through fundamental design principles focused on resolving the actual pain points of stakeholders. Chen et al created a screening tool for the early detection of patients with autism spectrum disorder (ASD).2 Here, we present a critical thought exercise using the four questions in the user-centred design toolkit, applied to the ASD screening tool. Any gaps identified are not intended as criticism of the ASD tool, but as illustrative examples of the potential utility of the user-centred design toolkit. We examine the ASD tool in the form of a conversation between clinicians and developers.
Question 1: Where are the current pain points? From the clinical perspective, early identification of patients with ASD is critical during the period of active brain development, which is influenced by both genetics and experience. Although early intensive applied behaviour analysis can reduce patients’ lifetime need for support services such as occupational, physical and speech therapy, current ASD screening tools perform poorly.2–4 From an artificial intelligence perspective, the aetiology of ASD is often hard to identify. Most cases are idiopathic, but sometimes a cause is known (eg, fragile X syndrome). A structural causal model (often depicted as a directed acyclic graph) can be helpful to explore potential relationships between model features. For example, ASD commonly co-occurs with anxiety, attention-deficit/hyperactivity disorder and other neurological disorders, and recent studies have shown a link between ASD and the gut microbiome, but correlation does not necessarily imply causation.5 More concerning are circumstantial associations: exposures such as vaccination and circumcision have repeatedly been shown not to cause ASD, yet they coincide with it in time. Because ASD unfolds over the first few years of life, a child can appear typical in the first few months before regressing during the period when babies are being vaccinated.6
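The timing overlap described above can be made explicit with a small directed acyclic graph. The sketch below is illustrative only: the node names and edges are assumptions chosen to mirror the example in the text, not a validated causal model of ASD. It lists the shared upstream causes of two variables, which is where a non-causal association between them can arise.

```python
# Hypothetical DAG, stored as child -> list of direct parents.
# Edges are illustrative assumptions, not clinical findings.
PARENTS = {
    "age": [],
    "genetics": [],
    "vaccination": ["age"],
    "asd_signs_observed": ["age", "genetics"],
}


def ancestors(parents, node):
    """Return every node that has a directed path into `node`."""
    seen = set()
    stack = list(parents.get(node, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def shared_causes(parents, a, b):
    """Common ancestors of `a` and `b`: candidate confounders."""
    return ancestors(parents, a) & ancestors(parents, b)


if __name__ == "__main__":
    print(shared_causes(PARENTS, "vaccination", "asd_signs_observed"))
```

In this toy graph the only shared cause is the child's age, which is consistent with the point that vaccination and the first observable signs of ASD coincide in time without one causing the other.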
Question 2: Where could machine learning add unique value? Large-scale retrospective clinical claims data contain potentially meaningful causal signals among spurious correlations, which can be identified by applying various data science techniques. Chen et al used Lasso regularisation and random forests for dimensionality reduction in order to identify complex and hidden correlations in the data. Machine learning may also be helpful in addressing temporal biases in the data. For example, prior research conducted over the same time frame as Chen et al suggests an artificially increased prevalence due to improved awareness and changes in the diagnostic criteria of ASD.7 In Chen et al, the key predictor of ASD was male sex; however, increased clinical attention to identifying ASD in females over the years of the study has coincided with a decrease in the male/female ratio, which may manifest as drift in machine learning models.2 7
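One simple way to watch for this kind of drift is to track the male/female ratio among diagnosed cases per calendar year and flag years that depart from a baseline. The following is a minimal sketch, not the method used by Chen et al; the function names, the 20% tolerance and the yearly counts are all hypothetical and chosen only to show the mechanics.

```python
def male_female_ratio(counts):
    """counts maps year -> (male_cases, female_cases); returns year -> M/F ratio."""
    return {
        year: male / female
        for year, (male, female) in sorted(counts.items())
        if female > 0  # skip years where the ratio is undefined
    }


def drifted_years(ratios, baseline_year, tolerance=0.2):
    """Years whose ratio differs from the baseline by more than `tolerance` (relative)."""
    baseline = ratios[baseline_year]
    return [
        year
        for year, ratio in ratios.items()
        if abs(ratio - baseline) / baseline > tolerance
    ]


if __name__ == "__main__":
    # Hypothetical counts for illustration only, not data from any study.
    toy_counts = {2010: (50, 10), 2015: (45, 10), 2020: (30, 10)}
    ratios = male_female_ratio(toy_counts)
    print(ratios)
    print(drifted_years(ratios, baseline_year=2010))
```

A flagged year would be a prompt to re-examine how heavily the model relies on sex as a predictor, rather than proof that model performance has degraded.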
Question 3: How will the model output be acted on? The authors do not propose a specific clinical workflow within which this ASD screening tool could be used, though they speculate that it could serve as a triaging tool for identifying patients who would benefit from a comprehensive diagnostic evaluation, one that integrates behavioural symptoms in the context of developmental history, family factors and cognitive level.
Question 4: What criteria should the model be optimised for? Screening tests are helpful if they can rule in a diagnosis (a Specific test, when Positive, rules IN the disease) or rule it out (a Sensitive test, when Negative, rules OUT the disease).8 9 Given that ASD is a rare (2.3%), underdiagnosed disorder, choosing a threshold based on the positive predictive value (PPV)-sensitivity curve (also known as the precision-recall curve, summarised by the area under the precision-recall curve (AUPRC)) may be challenging and suboptimal because PPV, in this case, is severely depressed by the low prevalence. Chen et al report PPV at three sensitivity target thresholds, with minimal improvements to PPV. Instead, optimising and choosing thresholds based on the often-forgotten negative predictive value (NPV)-specificity curve may be more appropriate, as early ASD interventions are costly and a high NPV reassures providers that the child is unlikely to be harmed through inaction or delay.3 4
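The dependence of PPV and NPV on prevalence follows directly from Bayes' theorem. The sketch below uses the 2.3% prevalence quoted above together with a sensitivity and specificity of 90% each; those two values are assumptions for illustration and are not figures reported by Chen et al.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value: P(disease | positive test)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def npv(sensitivity, specificity, prevalence):
    """Negative predictive value: P(no disease | negative test)."""
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)


if __name__ == "__main__":
    # Assumed test characteristics (illustrative only).
    sens, spec = 0.90, 0.90
    for prev in (0.023, 0.10, 0.50):
        print(
            f"prevalence={prev:.3f}  "
            f"PPV={ppv(sens, spec, prev):.3f}  "
            f"NPV={npv(sens, spec, prev):.3f}"
        )
```

With these assumed inputs, PPV at 2.3% prevalence is roughly 0.17, whereas it would be 0.90 if prevalence were 50%; NPV at 2.3% prevalence stays above 0.99. The same test therefore looks very different on the two curves purely because of how rare the condition is.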
This thought exercise adds insights to the existing evaluation of the ASD screening tool. This suggests that the Seneviratne et al toolkit is a potentially useful practical addition to the multitude of clinical machine learning guidelines, emphasising the utility of connecting with stakeholders to codesign models that are clinically meaningful and implementable in real-world workflows.1
Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.
Competing interests None declared.
Provenance and peer review Commissioned; externally peer reviewed.