TY - JOUR
T1 - 18 Implementing SNOMED CT in the Oxford Royal College of General Practitioners Clinical Informatics Digital Hub (ORCHID): some problems encountered and lessons learned
JF - BMJ Health & Care Informatics
JO - BMJ Health Care Inform
SP - A11
LP - A12
DO - 10.1136/bmjhci-2022-FCIASC.18
VL - 29
IS - Suppl 1
AU - John Williams
Y1 - 2022/11/01
UR - http://informatics.bmj.com/content/29/Suppl_1/A11.abstract
N2 - Objective: Identification of significant problems encountered and solutions adopted while implementing SNOMED CT to replace legacy coding schemes in a busy research and surveillance unit using patient-level coded General Practice data held in a database populated by extraction from a subset of English General Practices: (1) setting up a full SNOMED CT database from scratch; (2) changing data extraction/search processes throughout the unit away from the use of legacy Read version 2 and Clinical Terms Version 3 codelists to reusable SNOMED CT ‘variables’ held in a library; (3) establishing a robust process for curating, storing and maintaining SNOMED CT ‘variables’.
Methods: Retrospective review of an implementation project. Setting up full SNOMED CT database: research required to find clear instructions as to how the release files available from TRUD should be processed to build a fully functional database and to avoid pitfalls.
Further research to develop understanding of SNOMED CT concept inactivation and how to mitigate effects. Collation of legacy codelists into consistent format to pass through cross-mapping tables. Design and implementation of infrastructure to hold reusable SNOMED CT ‘variables’, taking into account naming, provenance, metadata to be included, and handling of inactive concepts. Development of robust and time-efficient SNOMED CT variable curation process: development of supporting tools; training of clinicians to curate. Explaining to researchers the concept of reusable ‘variables’ and the need for them to modify practices in order to match research and surveillance data needs to an existing library of ‘variables’ and to seek curation of new variables to fill gaps, including consideration of problems with defining research/surveillance data requirements. Providing the means to search the library. Explanation of the implications of inactivations. Version control. Consideration of how best to convey the coverage and definition of ‘variables’ to others.
Results: SNOMED CT database successfully set up: combination of experimentation, outdated advice found in grey literature, and informal help from terminology expert colleague. Legacy codelists: found 350 in multiple formats, with little or no provenance or definition and idiosyncratic naming. All translated in batch via cross-mapping tables. Resulting outputs used as substrate for full curation. Only 154 of these taken forward. Full curation typically added many extra active and inactive concepts. Infrastructure developed, supporting: unique naming and numbering of ‘variables’ (agreed editorial principles for naming; recording of dates and names of curator and checker); agreed metadata, including output type and option for free-text comment; storage of ‘variables’ in supertype/subtype format; generation of concept flatlists for searches on demand. Agreed curation process, making best use of supertypes that can be added or subtracted. ‘SNOMED CT helper tool’ developed.
Curating team trained in its use. All ‘variables’ checked by second team member. Interaction with researchers: difficulties with shifting thinking away from fixed code lists; obtaining plain English definitions of requirements; matching requirements to existing ‘variables’ and identifying gaps (help needed from curation team); explaining implications of inactivations; scepticism about re-usability.
Conclusions: SNOMED CT database implementation hampered by poor-quality, inaccessible guidance. Cross-mapping legacy codelists of limited value. Significant time wasted in inferring definition/purpose. Curation against full SNOMED CT led to richer, more complete concept lists, and rejection of some original concepts as erroneous. Fewer than half of legacy codelists were fully processed into the library. Better to start afresh and apply clear definition direct to SNOMED CT. Infrastructure: ‘variables’ stored in supertype/subtype formulation are easily exportable as an Expression Constraint Language (ECL) statement, which is human readable and computable. Built-in mitigation for inactivations occurring over time. Easy to overlook resources required to design and implement fit-for-purpose supporting infrastructure. No agreed standards for naming ‘variables’ or for associated metadata. Curation process: good support tooling essential to achieving major savings in time and increased efficacy. Curators should have clinical knowledge, work as a team, and check each other’s work. Interaction with researchers: reproducibility of ‘variables’ still dependent on code lists, whereas SNOMED CT version plus ECL formulation might be more robust and meaningful. Data requirements evolve as projects develop, leading to variable mapping changes. Version control of documentation essential.
ER -