Signal processing and machine learning algorithm to classify anaesthesia depth
Abstract
Background Poor assessment of anaesthetic depth (AD) has led to overdosing or underdosing of the anaesthetic agent, which requires continuous monitoring to avoid complications. The evaluation of the central nervous system activity and autonomic nervous system could provide additional information on the monitoring of AD during surgical procedures.
Methods This was an observational, analytical, single-centre study. Biological signals were collected during surgical procedures under general anaesthesia and then preprocessed, processed and postprocessed to feed a pattern classifier that determines the AD status of patients. The electroencephalography index was developed through data processing and algorithm development in MATLAB V.8.1.
Results A total of 25 men and 35 women were included, with a mean procedure time of 109.62 min. The results show a high Pearson correlation between the Complexity Brainwave Index and the indices of the entropy module. A greater dispersion is observed in the state entropy and response entropy indices, and a partial overlap can also be seen in the boxes associated with deep anaesthesia and general anaesthesia in these indices. The high Pearson correlation might be explained by the coinciding values corresponding to the awake and general anaesthesia states.
Conclusion Biological signal filtering and a machine learning algorithm may be used to classify AD during a surgical procedure. Further studies will be needed to confirm these results and improve the decision-making of anaesthesiologists in general anaesthesia.
What is already known on this topic
Poor assessment of anaesthetic depth (AD) during general anaesthesia may result in overdosing or underdosing of the anaesthetic agent. Currently, no approach integrates biological signals processed by a machine learning algorithm to analyse AD during surgical procedures and help avoid complications.
What this study adds
A classification system based on monitoring of brain electrical activity was developed to assess the depth of anaesthesia. This investigation describes an AD classification method that includes the collection of biological signals, conditioning of these signals, monitoring of the activity of the central and autonomic nervous systems, measurement of indices and classification of patterns in AD.
How this study might affect research, practice or policy
This algorithm provides a reliable and well-performing tool to estimate and monitor the depth of anaesthesia in surgical procedures. The application of this innovation makes it possible to eliminate ambiguity in monitoring during the reduction of intraoperative consciousness and to reduce the risk of complications associated with deep anaesthesia.
Introduction
Poor assessment of anaesthetic depth (AD) during general anaesthesia can result in overdosing or underdosing of the anaesthetic agent.1 2 In the context of anaesthetic agent overdose, extreme AD has been associated with an increased risk of mortality,3–6 intraoperative hypotension and hypoperfusion of heart and brain,7 perioperative nausea, vomiting and delirium.7–10 In the case of underdosing, there have been reports of intraoperative awareness, with an incidence of 0.1%–0.2%, approximately 26,000 cases per year in the USA.11 12
Assessment of AD through clinical signs such as state of consciousness, limb movement, heart rate, pupil size, blood pressure, arterial blood oxygen and perspiration is used in general anaesthesia because these signs reflect the activity of the autonomic nervous system (ANS) and central nervous system (CNS).13 Evoked potentials, entropy indices, which include state entropy (SE) and response entropy (RE), the Bispectral Index and the Narcotrend indices are objective measurements of the activity of the CNS.14 All these indices are based on different algorithms that analyse and record changes in electroencephalography (EEG) signals and convert them into numerical values that correspond to certain levels of unconsciousness.15–19 Despite the quantification of anaesthetic levels by these new technologies, issues remain, such as reports of ambiguity in the reduction of intraoperative awareness and misinterpretation of the burst suppression pattern.14 20–23
The burst suppression pattern appears during deep anaesthetic levels and may be interpreted as an error by the Bispectral Index and entropy indices,22 24 causing a false estimation of AD and decreasing the safety margin between anaesthetic administration and the optimal anaesthetic level.22 23 25–27 Another issue is that the previously mentioned indices and devices do not take ANS variables into consideration as part of the EEG indices used in AD level quantification and classification.14 28 Therefore, there is no definitive gold standard for the evaluation of AD levels during surgery or in intensive care units.20 29 Regarding the evaluation of ANS activity, heart rate variability is used to determine sympathetic or parasympathetic predominance, which could provide additional information on AD monitoring during surgical procedures. In our study, a machine learning algorithm was created that uses neural networks and physiological variables to classify AD levels.
Methods
An observational analytical study was carried out at the clinic of the Universidad de La Sabana, Chía, Colombia. Biological signals were collected during surgical procedures under general anaesthesia and then preprocessed, processed and postprocessed to feed a pattern classifier that determines the AD status of patients.
Eligibility criteria
Patients between 18 and 65 years old who underwent general anaesthesia after an 8-hour fast, with American Society of Anesthesiologists physical status I or II and a prior outpatient preanaesthetic assessment, were included. Patients taking drugs with effects on the CNS or ANS, premedicated patients (opiates, antiemetics and sedatives such as benzodiazepines) and those who presented ANS alterations during surgical procedures, hearing or communication problems, or allergy to propofol were excluded.
Data acquisition
General anaesthesia was administered with an infusion pump using target-controlled infusion (B. Braun Medical, USA). Anaesthesia induction was done using 5 ng/mL of remifentanil (Minto model) and 2.5 µg/mL of propofol (Schneider model). Data acquisition started 4 min before induction and ended once the patient gave a verbal response after the surgical procedure. The EEG and ECG signals were collected using a frontal entropy sensor and the S/5™ Collect software with a sampling frequency of 300 Hz. SE and RE were collected at 0.2 Hz. The correct functioning of the non-invasive blood pressure (NIBP) sensor was also verified, and NIBP values were collected every 2.5 min. Six clinical states were defined in online supplemental file 1.
CNS signal preprocessing
The main objective of preprocessing is for the signal to reflect the biological phenomenon of interest, by reducing the artefacts that contaminate it, which arise from electrical noise, surgical instruments and physiological sources such as eye movements. Artefacts were filtered with a wavelet-based technique in which coefficients whose values exceed a specific threshold are removed from the signal by setting the respective coefficient to zero.30–33 Initially, 5 s samples of contaminated and non-contaminated EEG signal were selected by visual inspection of 20 records. Subsequently, a six-level stationary discrete wavelet transform, with coiflet-3 as the mother function, was applied to each signal sample (frequency bands 0–2.34 Hz, 2.34–4.69 Hz, 4.69–9.38 Hz, 9.38–18.75 Hz, 18.75–37.5 Hz, 37.5–75 Hz, 75–150 Hz). The coiflet-3 wavelet was chosen because its morphology resembles that of an ocular artefact. Inspection of the wavelet coefficients at high and low frequencies showed significant differences in medians, which suggests that the wavelet approach has the potential to treat high-frequency artefacts. Additionally, a digital filter with a cut-off frequency of 47 Hz was applied to remove power line noise (50 Hz or 60 Hz) and, more generally, high-frequency contamination from surgical instruments.
An additional threshold vector for low-frequency components and a scan of low-frequency wavelet components were defined to determine significant differences between EEG epochs under general anaesthesia and epochs with contaminated EEG recordings from an awake patient (online supplemental file 2).
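The sub-band edges quoted above follow directly from the 300 Hz sampling frequency and the six decomposition levels. The sketch below reproduces that arithmetic and illustrates the zeroing rule on a list of coefficients. It is a minimal illustration, not the authors' MATLAB code: the function names are hypothetical, the threshold value is arbitrary, and comparing the coefficient magnitude (rather than its signed value) against the threshold is an assumption.

```python
def swt_band_edges(fs_hz, levels):
    """Return (low, high) sub-band edges in Hz, lowest band first."""
    nyquist = fs_hz / 2.0
    edges = []
    # Detail band at level j spans nyquist / 2**j up to nyquist / 2**(j - 1).
    for j in range(1, levels + 1):
        edges.append((nyquist / 2 ** j, nyquist / 2 ** (j - 1)))
    # The approximation band at the deepest level spans 0 up to nyquist / 2**levels.
    edges.append((0.0, nyquist / 2 ** levels))
    return sorted(edges)


def zero_above_threshold(coeffs, threshold):
    """Set to zero every coefficient whose magnitude exceeds the threshold (assumed rule)."""
    return [0.0 if abs(c) > threshold else c for c in coeffs]


bands = swt_band_edges(fs_hz=300, levels=6)
for low, high in bands:
    print(f"{low:.2f}-{high:.2f} Hz")

# Hypothetical coefficients and threshold, for illustration only.
cleaned = zero_above_threshold([1.0, -5.0, 2.0], threshold=3.0)
```

With a six-level decomposition there are six detail bands plus one approximation band, which matches the seven frequency ranges listed in the text.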
CNS signal processing
The complexity measures sample entropy (SampEn) and permutation entropy were obtained from successive 5-second rectangular windows. The calculations performed for SampEn are described in online supplemental file 3. Permutation entropy provides a greater probability of prediction in general terms but fails when it must quantify the pattern associated with AD. SampEn, on the other hand, provides a lower probability of prediction in general terms, but it is a good measure of complexity for predicting deep anaesthesia and quantifying the burst suppression pattern. The prediction probability (Pk) values paired with general anaesthesia, light anaesthesia and the awake state were 0.925, 0.942 and 0.967, respectively. Permutation entropy and SampEn are combined in the proposed index as follows: permutation entropy dominates the behaviour of the Complexity Brainwave Index (CBI) in the induction phase. Once the permutation entropy value crosses the median of the respective box diagram for general anaesthesia, the SampEn algorithm is activated to predict AD states. The response of the index is given according to the decision rules in online supplemental file 4.
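As a rough illustration of the two complexity measures, the sketch below computes normalised permutation entropy and sample entropy on 5-second windows. The parameter values are assumptions, since the study's settings are given only in the supplemental files: pattern order 3 for permutation entropy, and embedding dimension m=2 with tolerance r=0.2 times the window SD for SampEn. The decision rules that combine both measures into the CBI are not reproduced here.

```python
import math


def split_windows(signal, fs_hz=300, seconds=5):
    """Cut a signal into successive non-overlapping rectangular windows."""
    size = fs_hz * seconds
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]


def permutation_entropy(x, order=3):
    """Shannon entropy of ordinal patterns, normalised to the range 0..1."""
    counts = {}
    for i in range(len(x) - order + 1):
        # Ordinal pattern: the indices that would sort this short segment.
        pattern = tuple(sorted(range(order), key=lambda k: x[i + k]))
        counts[pattern] = counts.get(pattern, 0) + 1
    total = sum(counts.values())
    h = -sum((c / total) * math.log(c / total) for c in counts.values())
    return h / math.log(math.factorial(order))


def sample_entropy(x, m=2, r_factor=0.2):
    """SampEn = -ln(A / B), with B and A the template matches of length m and m + 1."""
    n = len(x)
    mean = sum(x) / n
    r = r_factor * math.sqrt(sum((v - mean) ** 2 for v in x) / n)

    def count_matches(length):
        count = 0
        # The same n - m template start points are used for both lengths.
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if max(abs(x[i + k] - x[j + k]) for k in range(length)) <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined when no matches are found
    return -math.log(a / b)


# A strictly periodic toy signal: low permutation entropy, zero sample entropy.
toy = [0.0, 1.0] * 50
pe_toy = permutation_entropy(toy)
se_toy = sample_entropy(toy)
```

The quadratic pair loop in `sample_entropy` is acceptable for a 1500-sample window but would need a faster formulation for real-time use.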
ANS signal preprocessing
The power in the LF (low frequency) and HF (high frequency) bands was estimated using the wavelet transform. In contrast to classical methods such as Fourier analysis, the wavelet transform does not assume stationarity of the analysed signal and is therefore better suited to evaluating transient and rapid changes in the heart rate variability series.34 The Daubechies-2 wavelet was used to decompose the signal at eight levels. The high frequency component (WC-HF) was estimated by adding the relative contribution of the coefficients of levels 4–5, and the low frequency component (WC-LF) was estimated by adding the relative contribution of levels 6–7. These values can be normalised to express proportions of a total power defined by the sum of WC-HF and WC-LF.35 It is important to note that the same wavelet filtering method was applied to the ECG signal and the NIBP; subsequently, R peaks were detected with the Pan-Tompkins algorithm36 to form the series of RR intervals.
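The band-power step can be sketched as follows, assuming the wavelet decomposition has already been computed elsewhere. Here "relative contribution" is interpreted as each level's share of the total coefficient energy, which is an assumption; the function name and the toy coefficients are hypothetical.

```python
def wavelet_band_powers(detail_coeffs):
    """detail_coeffs maps decomposition level (1..8) to its list of detail coefficients.

    Returns (wc_hf, wc_lf, wc_hf_n, wc_lf_n).
    """
    # Energy of each level: sum of squared coefficients.
    energy = {lvl: sum(c * c for c in cs) for lvl, cs in detail_coeffs.items()}
    total = sum(energy.values())
    relative = {lvl: e / total for lvl, e in energy.items()}
    wc_hf = relative[4] + relative[5]  # levels 4-5, as stated in the text
    wc_lf = relative[6] + relative[7]  # levels 6-7, as stated in the text
    band_total = wc_hf + wc_lf
    # Normalised values are proportions of WC-HF + WC-LF.
    return wc_hf, wc_lf, wc_hf / band_total, wc_lf / band_total


# Toy coefficients: one value per level, with extra energy at level 4.
toy_levels = {lvl: [1.0] for lvl in range(1, 9)}
toy_levels[4] = [2.0]
wc_hf, wc_lf, wc_hf_n, wc_lf_n = wavelet_band_powers(toy_levels)
```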
ANS signal processing
Poincaré analysis and cardiac regulation: non-linear methods have been proposed to evaluate cardiac function in volunteers through pharmacological experimentation, under controlled conditions of autonomic blockade with atropine and propranolol. Two non-linear indices of autonomic function have been proposed from the Poincaré descriptors: an index sensitive to vagal cardiac function called the Cardiac Vagal Index (CVI), CVI=log10 (SD1*SD2), and an index sensitive to cardiac sympathetic function called the Cardiac Sympathetic Index (CSI), CSI=SD2/SD1. A change in the indices suggests a shift in regulatory activity, not the degree of activity or tone of the ANS.37 The series formed by the duration of the intervals between R peaks in the ECG was analysed in windows of 60 s with an overlap of 91.67%, so each window is composed of 5 s of new information and the last 55 s of the previous epoch. Initially, the classification of the patient's condition is based on the CBI indicator.
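The Poincaré descriptors and the two indices can be sketched as below, using the CVI and CSI formulas exactly as written in the text. SD1 and SD2 are taken as the standard deviations of the points projected across and along the identity line of the plot of each RR interval against the next one. The use of the population SD and the toy RR series are assumptions for illustration only.

```python
import math
import statistics


def poincare_descriptors(rr):
    """SD1 and SD2 of the Poincaré plot of RR(n) against RR(n + 1)."""
    pairs = list(zip(rr[:-1], rr[1:]))
    # Rotate by 45 degrees: spread across and along the identity line.
    across = [(b - a) / math.sqrt(2) for a, b in pairs]
    along = [(b + a) / math.sqrt(2) for a, b in pairs]
    return statistics.pstdev(across), statistics.pstdev(along)


def cvi_csi(rr):
    """CVI = log10(SD1 * SD2) and CSI = SD2 / SD1, as given in the text."""
    sd1, sd2 = poincare_descriptors(rr)
    return math.log10(sd1 * sd2), sd2 / sd1


def window_overlap_percent(window_s, step_s):
    """Percentage of each window shared with the previous one."""
    return 100.0 * (window_s - step_s) / window_s


# Hypothetical RR series in milliseconds.
rr_toy = [800.0, 810.0, 790.0, 805.0, 795.0, 800.0]
sd1, sd2 = poincare_descriptors(rr_toy)
cvi, csi = cvi_csi(rr_toy)
overlap = window_overlap_percent(60, 5)
```

A 60 s window advanced in 5 s steps shares 55 s with its predecessor, which gives the 91.67% overlap quoted above. Note that CVI depends on the unit chosen for the RR intervals, and that CSI is undefined when SD1 is zero.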
Design of pattern classifiers
The algorithms for classifying the patterns produced by the predictors of the CNS and ANS were designed with the aim of minimising the classification error in cross validation. In this way, possible overfitting of the classifier is controlled. The classifiers were designed considering the following combinations of predictive indices: {CBI, CVI}, {CBI, CSI}, {CBI, NIBP}, {CBI, CVI, CSI}, {CBI, CVI, NIBP}, {CBI, CSI, NIBP} and {CBI, CVI, CSI, NIBP}. In K-fold cross validation, the kth partition is used to validate the classification error, and the classifier is adjusted or trained on the remaining partitions of the data set. This is repeated for k=1, 2, …, K, and finally the K classification errors are averaged. In general terms, 5 or 10 partitions are recommended.36
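The cross-validation procedure and the seven predictor combinations can be sketched as follows. The `fit` and `predict` callables are hypothetical placeholders for any classifier, and the majority-class model and toy labels exist only to make the example runnable; none of this reproduces the classifiers evaluated in the study.

```python
from itertools import combinations


def predictor_sets(base="CBI", extras=("CVI", "CSI", "NIBP")):
    """Every combination of CBI with at least one additional predictor."""
    return [(base,) + combo
            for size in range(1, len(extras) + 1)
            for combo in combinations(extras, size)]


def k_fold_indices(n_samples, k):
    """Split sample indices into k interleaved partitions (requires n_samples >= k)."""
    return [list(range(i, n_samples, k)) for i in range(k)]


def cross_validation_error(features, labels, k, fit, predict):
    """Average classification error over the k held-out partitions."""
    errors = []
    for held_out in k_fold_indices(len(labels), k):
        held = set(held_out)
        train = [i for i in range(len(labels)) if i not in held]
        model = fit([features[i] for i in train], [labels[i] for i in train])
        wrong = sum(predict(model, features[i]) != labels[i] for i in held_out)
        errors.append(wrong / len(held_out))
    return sum(errors) / k


# Placeholder classifier: always predicts the most frequent training label.
def fit_majority(train_x, train_y):
    return max(set(train_y), key=train_y.count)


def predict_majority(model, x):
    return model


toy_labels = ["Ga"] * 8 + ["Da"] * 2
toy_features = [[0.0] for _ in toy_labels]
cv_error = cross_validation_error(toy_features, toy_labels, k=5,
                                  fit=fit_majority, predict=predict_majority)
n_sets = len(predictor_sets())
```

In practice the loop would be repeated for each predictor combination and each candidate classifier, and the combination with the lowest averaged error would be retained.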
Postprocessing of CNS–ANS
The entropy parameters were postprocessed with an S-shaped function (Eq. 1) to obtain a mathematical index between 0 and 100. Parameters a and b were estimated from the first quartile of the awake state and the third quartile of deep anaesthesia in the graph on the right in online supplemental file 4. Subsequently, a moving average filter over three entropy calculations was applied to reduce dispersion and achieve a smoother response that takes previous states into account. When a new entropy value was calculated, it was averaged with the two previous entropy calculations, or with however many entropy values were available in the first windows.
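The postprocessing step can be sketched as below. Because Eq. 1 is not reproduced in this text, the logistic form used here is only an assumed example of an S-shaped function with two parameters a and b; the actual equation and parameter values may differ. The moving average follows the description above: the current value is averaged with up to two previous values.

```python
import math


def s_shape_index(entropy, a, b):
    """Map an entropy value to 0..100 with a logistic curve (assumed form of Eq. 1)."""
    return 100.0 / (1.0 + math.exp(-a * (entropy - b)))


def moving_average(values, span=3):
    """Average each value with up to span - 1 previous values."""
    smoothed = []
    for i in range(len(values)):
        # The first windows have fewer than span values available.
        recent = values[max(0, i - span + 1): i + 1]
        smoothed.append(sum(recent) / len(recent))
    return smoothed


# Hypothetical parameters and entropy values, for illustration only.
midpoint = s_shape_index(0.5, a=10.0, b=0.5)
smoothed = moving_average([3.0, 6.0, 9.0, 12.0])
```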
The process of classification of anaesthetic depth
This process comprised two main parts: (1) the analysis and selection of the predictors of the central and autonomic nervous systems and (2) the design of pattern classifiers. The pattern classifier was designed using the patient data set, formed by the biological signals of 60 patients (EEG, ECG, NIBP, SpO2) and the respective anaesthesia records. These records document the use of the drugs and changes in their concentration, as well as the moments at which the patient made some type of movement during the surgical procedure. The response of the predictors to the clinical events listed in online supplemental file 5 was analysed. Clinical events define four states (categories to classify) of AD, and predictors of the CNS and ANS are described in online supplemental file 6.
Sample size and data collection
The sample size was calculated for a correlation coefficient of 0.9, with a confidence level of 95%, an accuracy of 10% and two tests, which required a minimum of 60 subjects. Data were fully collected by the investigators and compiled on a secure server (Research Electronic Data Capture, REDCap software). The EEG index was later developed through data processing and algorithm development using MATLAB V.8.1.
Results
A total of 25 men and 35 women were included, with a mean procedure time of 109.62 min. Regarding the EEG analysis and CBI, the results show a high Pearson correlation between the CBI and the indices of the entropy module. Nevertheless, a high Pearson correlation does not necessarily imply that the behaviour of the indices agrees. On the other hand, the intraclass correlation coefficient between CBI and the entropy module indices showed lower correlation values. Figure 1 shows the probability of prediction and the box diagrams corresponding to the patterns defined in the EEG. Li (light anaesthesia on induction) and Lr (light anaesthesia in recovery) were grouped in the same anaesthetic class or category, as were Ak (awakened) and Rc (awakened, recovery). The CBI provided a higher prediction probability (Pk=0.935) than SE (Pk=0.884) and RE (Pk=0.899).
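For readers unfamiliar with Pk, the sketch below shows one commonly used definition of prediction probability for anaesthetic depth indicators: among pairs of observations with different observed states, Pk is the proportion in which the indicator ranks the pair in the same order as the states, with ties in the indicator counted as one half. Whether the study used exactly this definition is an assumption, and the numeric state coding and toy values are hypothetical.

```python
def prediction_probability(indicator, state):
    """Pk = (Pc + 0.5 * Ptx) / (Pc + Pd + Ptx) over pairs with distinct states.

    state must be coded as ordered numbers, for example 0 = deep, 1 = general, 2 = awake.
    """
    concordant = discordant = tied_indicator = 0
    n = len(state)
    for i in range(n):
        for j in range(i + 1, n):
            if state[i] == state[j]:
                continue  # pairs with the same observed state are not used
            dx = indicator[i] - indicator[j]
            dy = state[i] - state[j]
            if dx == 0:
                tied_indicator += 1
            elif (dx > 0) == (dy > 0):
                concordant += 1
            else:
                discordant += 1
    usable = concordant + discordant + tied_indicator
    if usable == 0:
        raise ValueError("at least two distinct states are required")
    return (concordant + 0.5 * tied_indicator) / usable


# Toy data: an index that rises monotonically with the state gives Pk = 1.
toy_state = [0, 0, 1, 1, 2, 2]
pk_perfect = prediction_probability([20, 25, 50, 55, 90, 95], toy_state)
pk_constant = prediction_probability([50, 50, 50, 50, 50, 50], toy_state)
```

Under this definition a value of 1 means the indicator always orders the states correctly, and 0.5 means it does no better than chance.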
Figure 1 Box plot diagrams for EEG patterns associated with previously defined clinical states, and prediction probability values associated with CBI, SE and RE. Ak, awakened; Bs, deep anaesthesia associated with burst suppression pattern; CBI, Complexity Brainwave Index; Da, deep anaesthesia; Ga, general anaesthesia; La, light dose; Li, light anaesthesia on induction; Lr, light anaesthesia in recovery; Pmk, probability of paired prediction; Rc, awakened, recovery; RE, response entropy; SE, state entropy.
A greater dispersion is observed in the SE and RE indices, and a partial overlap can also be seen in the boxes associated with deep anaesthesia and general anaesthesia in these indices. The high Pearson correlation might be explained by the coinciding values corresponding to the awake and general anaesthesia states. The Bland-Altman graph in figure 2 shows that the differences between CBI and the entropy module indices exceed the limits of agreement mainly for average values between 60 and 80 and between 20 and 40. This suggests a lack of concordance in the states of light anaesthesia (estimated range: 60–80) and deep anaesthesia (estimated range: 20–40). The CBI, SE and RE associated with the defined clinical events are presented in figure 3.
Figure 2 Bland-Altman graphs to evaluate the agreement between CBI and the SE and RE indices. CBI, Complexity Brainwave Index; ICC, intraclass correlation coefficient; RE, response entropy; SE, state entropy. *The limits of agreement are defined as the average value (dashed red line, mean)±2 SD (dashed red lines, upper and lower).
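The limits of agreement described in the caption (mean difference ± 2 SD) can be computed as in the sketch below. The paired values are invented for illustration, and the use of the sample SD is an assumption.

```python
import statistics


def bland_altman(index_a, index_b):
    """Return (bias, lower limit, upper limit, pair means outside the limits)."""
    diffs = [a - b for a, b in zip(index_a, index_b)]
    means = [(a + b) / 2 for a, b in zip(index_a, index_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample SD (assumption)
    lower, upper = bias - 2 * sd, bias + 2 * sd
    # Average values of the pairs whose difference falls outside the limits.
    outside = [m for m, d in zip(means, diffs) if d < lower or d > upper]
    return bias, lower, upper, outside


# Hypothetical paired readings of two indices.
cbi_toy = [10.0, 20.0, 30.0, 40.0]
se_toy_values = [8.0, 22.0, 28.0, 42.0]
bias, lower, upper, outside = bland_altman(cbi_toy, se_toy_values)
```

Listing the pair means that fall outside the limits is what makes it possible to say in which index range, such as 60–80 or 20–40, the two indices disagree.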
Figure 3 Values of CBI, SE and RE at different clinical states. CBI, Complexity Brainwave Index; RE, response entropy; SE, state entropy. *Triangle pointing down: induction of total intravenous anaesthesia; circle: beginning of airway management; diamond: beginning of surgery; square: end of surgery; triangle pointing up: start of extubation. Figure developed by the author.
The probability of prediction of the patient's condition was estimated for all predictors shown in figure 4. For the light dose event (La) in table 1, the CBI performed similarly to the other indices: SD1 was the best with a Pmk of 0.86, followed by CSI with a Pmk of 0.85, CVI with 0.84 and CBI with 0.83.
Figure 4 Box plot diagram for probability of prediction (Pk) of the patient's condition for central nervous system and autonomic nervous system indices. Ak, awakened; Bs, deep anaesthesia associated with burst suppression pattern; CBI, Complexity Brainwave Index; CSI, Cardiac Sympathetic Index; CVI, Cardiac Vagal Index; Da, deep anaesthesia; Ga, general anaesthesia; La, light dose; Li, light anaesthesia on induction; Lr, light anaesthesia in recovery; Rc, awakened, recovery; SD1/SD2, Poincaré chart descriptors; WC-HF, high frequency component; WC-LF, low frequency component; WC-HFn, high frequency power of wavelet coefficients, and respective normalisation; WC-LFn, low frequency power of wavelet coefficients, and respective normalisation. *A total of 25 light analgesia states (La) were identified.
There is a reduction in the performance of CBI (from 0.935 to 0.823) when the light dose event (La) is considered, mainly because of overlap with the range of values associated with general anaesthesia (Ga). It can be noted that ANS-related indices alone provide a poor probability of predicting the anaesthetic depth (around 0.5, which indicates that the prediction is no better than chance). However, the box-and-whisker plots seem to indicate differences with respect to other states in the methods derived from the analysis of the Poincaré chart.
Table 1 Probability of paired prediction
The capacity and clinical skills of trained medical staff may be affected by external factors such as personal problems and work fatigue, among others. In addition, a physician's learning curve is not constant and is not independent of these factors. For this reason, it is necessary to compare the most promising machine learning methods for classifying anaesthetic levels and to identify the one with the best outcome. In this study, the following results were obtained. For the decision tree, the classification error and cross-validation (X-Val) error were lowest with the CBI–CVI–NIBP and CBI–CSI–NIBP predictor combinations. For the bagging and adaptive boosting ensemble methods, the CBI–CSI–NIBP and CBI–CVI–CSI combinations showed the lowest classification and X-Val errors. For the neural network, the lowest classification and X-Val errors were obtained with the CBI–CVI–NIBP combination. For the adaptive neuro-fuzzy inference system, the CBI–CVI combination presented the lowest errors. When all of these methods were compared, the neural network with the CBI–CVI–NIBP combination showed the lowest classification and X-Val errors (table 2).
Table 2 Classifiers performance in deep anaesthesia
Discussion
The present study developed an algorithm that jointly considers changes in ANS and CNS pattern activity, to classify AD. Most devices used to assess anaesthetic effects on cerebral activity rely on EEG-based indices with ambiguity in reduction of intraoperative awareness.14 21 37 Among the most used EEG-based indices, one finds entropy and Bispectral Index.38 However, there have been reports of better performance by RE Index over Bispectral Index as predictor of response to painful stimulus.38 In our study, we demonstrated that an algorithm based on CBI along with other clinical variables related to ANS activity has a better performance in the classification of AD over the already known entropy indices.
As part of the innovation process in medicine, this AD classification method, which includes the collection of biological signals, conditioning of these signals, monitoring of the activity of the central and autonomic nervous systems, measurement of indices and classification of patterns in AD, was patented in the USA (US11504056B2), Brazil (BR112020013317A2), Colombia (CO2016002707A1) and the World Intellectual Property Organization (WO2019179544A1).39 40
The main difference from the other EEG indices previously mentioned lies in the fact that this algorithm uses clinical states to classify anaesthetic states, while combining them with CNS-derived and ANS-derived predictors such as CBI, CVI, CSI and NIBP.41–43 The present algorithm included clinical events such as anaesthetic dose adjustment and movement during surgery as inputs in the classification of AD as a light anaesthesia state. This could explain the low global concordance between the algorithm-related CBI and the entropy indices observed in intermediate and deep anaesthesia states in the Bland-Altman graphs in the Results section.42 This means that our algorithm detects dose adjustments or movement during surgery to classify intermediate anaesthesia depth, thereby providing more opportunities for faster detection and response in the case of intermediate anaesthesia states.42
Another important aspect related to the comparison of EEG index performance was the difference between CBI and the entropy indices. Higher entropy index values in comparison with the CBI were observed. This is most likely explained by a failure of the entropy indices to detect the burst suppression pattern, which could be misread as the awake state.22 This could result in misinterpretation by the anaesthesiologist, which could lead to increased anaesthetic administration. The CBI, in contrast, showed a better response to the burst suppression pattern. These results suggest that CBI is a better alternative, reducing the error of assessing deep anaesthesia as the awake state and the subsequent probability of dangerous anaesthetic overdose and its derived complications.4 43
In recent years, a change of paradigm has been proposed, in which monitoring with indices based on brain electrical activity and monitoring of standard parameters are considered complementary methods rather than competing techniques for patient care. Classification systems that integrate other parameters correlated with ANS activity have been developed. In the present study, different machine learning methods were implemented and compared through their cross-validation errors, together with a confusion matrix for the neural network, to identify the best method for combining predictors derived from the CNS and ANS.43
Among the different classification methods, our study found that the neural network with a hidden layer had the lowest cross-validation error when combining the CBI, CVI and NIBP predictors. This means our machine learning-based classification algorithm had the best performance when neural networks were used, which clinically translates into a better prediction of AD states. However, it is important to mention the lower performance for the awake and general anaesthesia states, where the highest error was seen in the confusion matrix. Such anaesthetic states therefore remain a challenge for current classification systems, as observed in the Results section. Despite the lower prediction values, the CBI used in our algorithm still shows the highest prediction value when compared with the other predictor variables. This means our algorithm, although it presents such limitations, still performs better than the other AD classification methods. Finally, this research was based on retrospective analysis of medical records, which is associated with information biases; however, the research group has adequate training for the analysis and interpretation of the results. Similarly, being a single-centre study may limit the extrapolation of the results.
Conclusion
Biological signal filtering and a machine learning algorithm can be useful to classify AD during a surgical procedure. In our study, we show that an algorithm based on CBI together with other clinical variables related to ANS activity has a better performance in the classification of AD over the already known entropy indices.