Introduction
Breast cancer is the most common type of cancer in women.1 2 Anatomically, the breast consists of blood vessels, connective tissue, ducts, lobules and lymph nodes.3 Breast cancer arises from the abnormal growth of breast cells. By 2040, the burden of breast cancer is predicted to increase to over three million new cases and one million deaths every year because of population growth and ageing alone.2
Breast cancer is highly treatable if identified at an early stage; hence, early detection is crucial to saving lives. Among the methods of breast cancer detection, the most popular are ultrasound (US),4 mammography5 and MRI. However, traditional computer-aided diagnosis systems generally depend on manually crafted features and the experience of the radiologist, which weakens the overall performance of breast cancer identification. Therefore, artificial intelligence (AI) methods such as machine learning and deep learning have emerged for breast cancer diagnosis with high accuracy. For example, improved breast cancer classification by combining a graph convolutional network with a convolutional neural network6 and abnormal breast identification by a nine-layer convolutional neural network with parametric rectified linear units and rank-based stochastic pooling have been used to support patients' and doctors' decisions.7 However, these algorithms do not address ethical AI, the right to explanation or trustworthy AI, concepts considered critical issues by high-level political and technical bodies (eg, the G20, EU expert groups and the Association for Computing Machinery in the USA).8 9
Additionally, AI algorithms such as machine learning and deep learning are vulnerable to erroneous outputs (bad decisions, incorrect medical diagnoses and poor predictions), which is the most common drawback of AI algorithms today. They are also black boxes, offering no interpretation of their predictions.
To overcome this issue, the science of explainable AI (XAI) has grown exponentially, with successful applications in breast cancer diagnosis. However, a comprehensive review of existing studies is still required to help researchers and practitioners gain insight into and understanding of the field. Therefore, this systematic review was conducted.
XAI concerns the extent to which people can easily understand a model. It has received much attention over the past few years. The purpose of a model explanation is to clarify why the model makes a certain prediction, to increase confidence in the model's predictions10 and to describe exactly how a machine learning model achieves its results.11 Using machine learning explanations can therefore increase the transparency, interpretability, fairness, robustness, privacy, trust and reliability of machine learning models. Recently, various methods have been proposed and used to improve the interpretation of machine learning models.
There are different taxonomies of machine learning explainability. An interactive explanation allows consumers to drill down or ask for different types of explanations until they are satisfied, whereas a static explanation does not change in response to feedback from the consumer.12 A local explanation applies to a single prediction, whereas a global explanation describes the behaviour of the entire model. A directly interpretable model is one that, by its intrinsically transparent nature, is understandable by most consumers, whereas a post hoc explanation uses an auxiliary method to explain a model after it has been trained.13 A self-explaining model is not necessarily directly interpretable; it generates local explanations by itself. A surrogate model is usually a directly interpretable model that approximates a more complex model, while a visualisation of a model may focus on parts of it and is not itself a full-fledged model.
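To make the surrogate idea concrete, the following minimal sketch (an assumption-laden illustration using scikit-learn and its built-in breast cancer dataset, not a pipeline from the reviewed studies) fits a shallow decision tree as a global, directly interpretable surrogate for a black-box random forest:

```python
# Hedged sketch: a global surrogate explanation for a black-box classifier.
# Assumptions: scikit-learn is available; the dataset and random forest are
# illustrative stand-ins for any opaque breast cancer model.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text
from sklearn.metrics import accuracy_score

data = load_breast_cancer()
X, y = data.data, data.target

# Black-box model whose behaviour we want to approximate.
black_box = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Surrogate: a shallow, directly interpretable tree trained to imitate the
# black box's predictions rather than the ground-truth labels.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))

# Fidelity: how often the surrogate agrees with the black box it explains.
fidelity = accuracy_score(black_box.predict(X), surrogate.predict(X))
print(f"Surrogate fidelity to the black box: {fidelity:.3f}")
print(export_text(surrogate, feature_names=list(data.feature_names)))
```

The printed rules of the tree act as a global explanation, while the fidelity score indicates how faithfully they reflect the black box.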
No single method is always the best for interpreting machine learning.12 For this reason, researchers and practitioners need the skills and tooling to close the gap between research and practice. To this end, XAI toolkits such as AIX360,12 Alibi,14 Skater,15 H2O,16 17 InterpretML,18 19 EthicalML-XAI,19 20 DALEX,21 22 tf-explain23 and Investigate24 have been developed. Most interpretations and explanations are post hoc (eg, local interpretable model-agnostic explanations (LIME) and SHapley Additive exPlanations (SHAP)). LIME and SHAP are broadly used to explain machine learning models trained on physical examination datasets, but the resulting explanations can have limited meaning because they lack fidelity and transparency. Deep learning and ensemble-gradient methods, in contrast, perform better for image processing and computer vision. This research focuses on mammography and US images; therefore, deep learning is recommended for breast cancer image processing.
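The post hoc workflow can be illustrated with a hedged sketch using the SHAP package; the tabular dataset and gradient-boosted model below are placeholders chosen for brevity, not the mammography or US pipelines assessed in this review:

```python
# Hedged sketch: post hoc SHAP explanations for a tabular breast cancer model.
# Assumptions: the `shap` package is installed; the scikit-learn dataset and
# gradient-boosted classifier are illustrative substitutes only.
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

data = load_breast_cancer(as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=0)

model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# TreeExplainer yields one additive attribution per feature and prediction
# (local explanations); aggregating them gives a global feature ranking.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

shap.summary_plot(shap_values, X_test)                     # global view
shap.force_plot(explainer.expected_value, shap_values[0],  # local view of one case
                X_test.iloc[0], matplotlib=True)
```

A LIME explanation of the same model would follow an analogous pattern via `lime.lime_tabular.LimeTabularExplainer`.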
Ensemble gradients are used to interpret deep neural networks.11 GradientSHAP is a sampling-based interpretation algorithm that approximates SHAP values.25 Occlusion methods are most useful in settings such as image processing. Bayesian networks (BN) are well suited to clinical decision-making and, in general, to assessments and studies involving multiple interventions and orientations. The oriented, modified integrated gradient (OMIG) interpretability method is inspired by the integrated gradients method. Since there is no one-size-fits-all approach to machine learning explanation, a comprehensive evaluation of published papers and tools is needed to bridge the gap between research and practice.
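For the gradient- and occlusion-based methods named above, a hedged sketch using the Captum library for PyTorch is shown below; the `resnet18` backbone and random tensor are placeholders assumed for illustration, not the breast imaging networks evaluated in the reviewed studies:

```python
# Hedged sketch: gradient- and occlusion-based attributions for an image model.
# Assumptions: PyTorch, torchvision and Captum are installed; the untrained
# resnet18 and the random tensor stand in for a breast imaging network and a
# mammogram/US image.
import torch
from torchvision.models import resnet18
from captum.attr import IntegratedGradients, GradientShap, Occlusion

model = resnet18(weights=None).eval()
image = torch.rand(1, 3, 224, 224)      # placeholder image batch
baseline = torch.zeros_like(image)      # all-black reference image
target = 0                              # class index to explain

# Integrated gradients: accumulate gradients along a path from baseline to input.
ig_attr = IntegratedGradients(model).attribute(image, baselines=baseline, target=target)

# GradientSHAP: approximate SHAP values by sampling from a baseline distribution.
gs_attr = GradientShap(model).attribute(
    image, baselines=torch.cat([baseline, image * 0.5]), target=target)

# Occlusion: slide a patch across the image and record the change in the output.
occ_attr = Occlusion(model).attribute(
    image, sliding_window_shapes=(3, 16, 16), strides=(3, 8, 8), target=target)

print(ig_attr.shape, gs_attr.shape, occ_attr.shape)  # each matches the input shape
```

Each attribution map has the same shape as the input image, so it can be overlaid on a mammogram or US frame to highlight the regions driving the prediction.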
Research that does not consider objective metrics for evaluating XAI may lack significance and invite controversy, especially if negative reviews are not used.8 To avoid these issues, one study8 suggests four metrics: the performance difference, D, between the explanation's logic and the agent's actual performance; the number of rules, R, output by the explanation; the number of features, F, used to generate the explanation; and the stability, S, of the explanation. User studies that focus on the D, R, F and S metrics in their evaluations are considered inherently more valid.
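One possible operationalisation of these metrics is sketched below for a decision-tree surrogate explanation; the formulas are an assumption-laden reading of the description above rather than the exact definitions in the cited study:

```python
# Hedged sketch: approximate D, R, F and S for a surrogate-tree explanation of
# a black-box model. These are illustrative readings of the metrics described
# in the text, not the cited reference's exact definitions.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_breast_cancer(return_X_y=True)
black_box = RandomForestClassifier(random_state=0).fit(X, y)
explanation = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, black_box.predict(X))

# D: performance difference between the explanation's logic and the agent.
D = abs(accuracy_score(y, black_box.predict(X)) - accuracy_score(y, explanation.predict(X)))

# R: number of rules output by the explanation (leaves of the surrogate tree).
R = explanation.get_n_leaves()

# F: number of features actually used to generate the explanation.
F = int(np.sum(explanation.feature_importances_ > 0))

# S: stability, measured here as agreement of the explanation under small input noise.
rng = np.random.default_rng(0)
X_noisy = X + rng.normal(scale=0.01 * X.std(axis=0), size=X.shape)
S = float(np.mean(explanation.predict(X) == explanation.predict(X_noisy)))

print(f"D={D:.3f}, R={R}, F={F}, S={S:.3f}")
```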
The main contributions of this systematic review are:
Investigating the XAI methods most commonly applied for breast cancer diagnosis.
Identifying the relationship between algorithms' explainability and their performance in breast cancer diagnosis.
Summarising the evaluation metrics used for breast cancer diagnosis with XAI methods.
Summarising the existing ethical challenges that XAI overcomes in breast cancer diagnosis.
Analysing the research gaps and future directions for XAI in breast cancer detection.