Objective Electronic prescribing systems often provide a drop-down list of medications and pre-specified reactions to record a patient’s allergy status. This list is non-exhaustive; less common reaction types require the user to add a free text note.
The Careflow Medicines Management EPMA system provides decision support that prevents a prescriber from initiating a drug to which a patient has a recorded reaction. Where a reaction is recorded as free text, this functionality is not available, which increases the risk to the patient.
The aim of this project was to identify recurring free text reactions and incorporate these into the system. Future avoidance of free text documentation will improve data quality and make reaction data available during built-in prescribing decision support.
Methods Free-text allergy notes added to the electronic prescribing system since implementation in March 2018 were extracted using structured query language. The data were cleaned and analysed using Python.
Natural language processing techniques were employed to clean the data and reduce the dimensionality of the data set. A drug library extracted from the electronic prescribing system was used to tag medications within the text.
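The cleaning and tagging steps described above could be sketched as follows. This is a minimal illustration rather than the authors' code: the stop-word list, the three-item drug library and the function name `preprocess` are all assumptions made for the example, and only the Python standard library is used.

```python
import re

# Hypothetical stop-word list and drug library, invented for illustration.
# The real drug library was extracted from the electronic prescribing system.
STOP_WORDS = {"the", "a", "to", "and", "with", "on", "of", "patient", "had"}
DRUG_LIBRARY = {"amlodipine", "penicillin", "codeine"}

def preprocess(note):
    """Lower-case a note, strip punctuation, drop stop words and tag known drugs."""
    tokens = re.findall(r"[a-z]+", note.lower())
    tokens = [t for t in tokens if t not in STOP_WORDS]
    drugs = [t for t in tokens if t in DRUG_LIBRARY]
    return tokens, drugs

tokens, drugs = preprocess("Patient had ankle swelling with Amlodipine.")
```

Removing stop words and normalising case is one simple way to reduce the vocabulary size, which is the sense in which dimensionality is reduced here.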
After pre-processing, the most commonly occurring phrases were found by counting the most frequent bigrams present in the text. Further analysis was carried out using the apriori algorithm.
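Counting bigrams, as described above, amounts to tallying adjacent token pairs across all notes. The sketch below shows one way to do this with the standard library; the function name `count_bigrams` and the three toy notes are invented for illustration and are not taken from the study data.

```python
from collections import Counter

def count_bigrams(token_lists):
    """Count adjacent token pairs (bigrams) across a collection of tokenised notes."""
    counts = Counter()
    for tokens in token_lists:
        # zip pairs each token with its successor within the same note
        counts.update(zip(tokens, tokens[1:]))
    return counts

# Toy, already pre-processed notes (invented for illustration)
notes = [
    ["ankle", "swelling", "amlodipine"],
    ["chest", "pain"],
    ["ankle", "swelling"],
]
bigram_counts = count_bigrams(notes)
top = bigram_counts.most_common(1)
```

Bigrams are counted within each note only, so the last word of one note is never paired with the first word of the next.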
Results A total of 2872 notes were identified for analysis. The most common terms found were already included in the electronic prescribing system’s allergy documentation function. These included the terms ‘rash’ and ‘penicillin’, which were recorded 480 and 400 times respectively. Of the 20 most frequently appearing terms, two were not included in the system: ‘swelling’, which was recorded 320 times, and ‘pain’, which was recorded 210 times.
Applying a bigram filter showed that the term ‘swelling’ most often appeared in the phrase ‘ankle swelling’, which occurred 60 times. The apriori algorithm identified an association between the terms ‘ankle’ and ‘swelling’ and the drug amlodipine with a high level of confidence.
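In association rule mining, the confidence of a rule X → Y is the proportion of records containing X that also contain Y. The sketch below computes support and confidence directly for a single rule; it is not a full apriori implementation and not the authors' code, and the four toy records are invented for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent (0.0 if antecedent never occurs)."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / base

# Toy records: each note reduced to its set of terms (invented for illustration)
transactions = [
    {"ankle", "swelling", "amlodipine"},
    {"ankle", "swelling", "amlodipine"},
    {"ankle", "swelling"},
    {"rash", "penicillin"},
]
rule_confidence = confidence(transactions, {"ankle", "swelling"}, {"amlodipine"})
```

In this toy data, three records contain both ‘ankle’ and ‘swelling’ and two of those also contain ‘amlodipine’, so the rule has a confidence of two thirds. The apriori algorithm adds an efficient search for frequent itemsets before rules of this kind are scored.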
Pain was most often associated with the phrases ‘chest pain’, which appeared 38 times, and ‘abdominal pain’ or ‘abdo pain’, which appeared a combined 55 times. Both are reaction types that cannot be documented in the prescribing system without the addition of a free-text note.
Conclusion Natural language processing can be applied to large volumes of unstructured clinical documentation to quickly analyse themes and trends. With appropriate cleaning and manipulation of the data, commonly occurring phrases relevant to clinical practice can be identified.
This permitted recurring drug reactions to be identified and added to the electronic prescribing system. It is hoped this will reduce the frequency of free-text notes added in the future and improve reaction documentation. It is anticipated that patient safety will be improved by making more reaction data available for electronic decision support.
Packages used for natural language processing, such as Python’s NLTK, are freely available and allow users to process data that would be too time-consuming to process manually.
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.