18 Implementing SNOMED CT in the Oxford Royal College of General Practitioners Clinical Informatics Digital Hub (ORCHID): some problems encountered and lessons learned
  1. John Williams
  1. FCI Trustee


Objective To identify significant problems encountered, and solutions adopted, while implementing SNOMED CT to replace legacy coding schemes in a busy research and surveillance unit. The unit uses patient-level coded general practice data held in a database populated by extraction from a subset of English general practices. The work involved:

  • Setting up a full SNOMED CT database from scratch

  • Moving data extraction/search processes throughout the unit away from legacy Read Version 2 and Clinical Terms Version 3 codelists to reusable SNOMED CT ‘variables’ held in a library

  • Establishing a robust process for curating, storing and maintaining SNOMED CT ‘variables’

Methods Retrospective review of an implementation project.

  • Setting up a full SNOMED CT database. Research was required to find clear instructions on how to process the release files available from TRUD into a fully functional database while avoiding pitfalls. Further research was needed to understand SNOMED CT concept inactivation and how to mitigate its effects

  • Collation of legacy codelists into a consistent format for passing through cross-mapping tables

  • Design and implementation of infrastructure to hold reusable SNOMED CT ‘variables’, taking into account naming, provenance, the metadata to be included and the handling of inactive concepts

  • Development of robust and time-efficient SNOMED CT variable curation process

    • Development of supporting tools

    • Training of clinicians to curate

  • Explaining to researchers the concept of reusable ‘variables’, the need to modify their practices so that research and surveillance data needs are matched to an existing library of ‘variables’, and the need to seek curation of new variables to fill gaps

    • Consideration of problems with defining research/surveillance data requirements

  • Providing the means to search the library

  • Explanation of the implications of inactivations

  • Version controls

  • Consideration of how best to convey the coverage and definition of ‘variables’ to others
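The abstract does not give the database build procedure. As a minimal sketch only, assuming the RF2 snapshot concept and relationship files from a release have been unpacked locally, the core tables could be loaded into SQLite roughly as follows. The table layout, function names and the choice of SQLite are illustrative assumptions, not the ORCHID implementation; the column names follow the RF2 file headers, and 116680003 is the identifier of the |Is a| relationship type.

```python
import csv
import sqlite3
from typing import Iterable, List

IS_A = "116680003"  # SNOMED CT |Is a| relationship type

# Hypothetical schema: one table per RF2 snapshot file, columns named as in the file header.
SCHEMA = """
CREATE TABLE concept (
    id TEXT PRIMARY KEY, effectiveTime TEXT, active INTEGER,
    moduleId TEXT, definitionStatusId TEXT);
CREATE TABLE relationship (
    id TEXT PRIMARY KEY, effectiveTime TEXT, active INTEGER,
    moduleId TEXT, sourceId TEXT, destinationId TEXT,
    relationshipGroup INTEGER, typeId TEXT,
    characteristicTypeId TEXT, modifierId TEXT);
CREATE INDEX rel_parent ON relationship (destinationId, typeId, active);
"""


def create_schema(conn: sqlite3.Connection) -> None:
    conn.executescript(SCHEMA)


def load_rf2(conn: sqlite3.Connection, table: str, lines: Iterable[str]) -> int:
    """Load tab-delimited RF2 lines (header first) into `table`; return the row count."""
    if table not in ("concept", "relationship"):
        raise ValueError(f"unexpected table: {table}")
    # RF2 files are tab-delimited and unquoted, so quote characters are ordinary data.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader)  # assumed to match the table's column names
    rows = [row for row in reader if row]
    sql = "INSERT INTO {} ({}) VALUES ({})".format(
        table, ",".join(header), ",".join("?" * len(header)))
    conn.executemany(sql, rows)
    return len(rows)


def active_children(conn: sqlite3.Connection, concept_id: str) -> List[str]:
    """Direct subtypes of a concept, following active |Is a| relationships only."""
    rows = conn.execute(
        "SELECT sourceId FROM relationship "
        "WHERE destinationId = ? AND typeId = ? AND active = 1 "
        "ORDER BY sourceId", (concept_id, IS_A))
    return [r[0] for r in rows]


def inactive_concepts(conn: sqlite3.Connection) -> List[str]:
    """Concepts flagged inactive in the loaded release."""
    rows = conn.execute("SELECT id FROM concept WHERE active = 0 ORDER BY id")
    return [r[0] for r in rows]
```

In use, each file would be opened with `open(path, encoding="utf-8", newline="")` and passed to `load_rf2`. A snapshot release is assumed because it holds one row per component; a full release carries the history of each component and would need a composite key.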


Results

  • SNOMED CT database successfully set up through a combination of experimentation, outdated advice found in grey literature and informal help from a terminology expert colleague

  • Legacy codelists: 350 found, in multiple formats, with little or no provenance or definition and idiosyncratic naming. All translated in batch via cross-mapping tables. Resulting outputs used as substrate for full curation. Only 154 of these taken forward. Full curation typically added many extra active and inactive concepts

  • Infrastructure developed, supporting:

    • Unique naming and numbering of ‘variables’

      • Agreed editorial principles for naming

      • Recording of dates and names of curator and checker

    • Agreed metadata including output type, option for free text comment

    • Storage of ‘variables’ in supertype/subtype format

    • Generation of concept flatlists for searches on demand

  • Agreed curation process, making best use of supertypes that can be added or subtracted. ‘SNOMED CT helper tool’ developed. Curating team trained in its use. All ‘variables’ checked by second team member

  • Interaction with researchers. Difficulties with:

    • Shifting thinking away from fixed code lists

    • Obtaining plain English definitions of requirements

    • Matching requirements to existing ‘variables’ and identifying gaps; help needed from curation team

    • Explaining implications of inactivations

    • Scepticism about re-usability
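To illustrate how a ‘variable’ stored in supertype/subtype format can yield a concept flatlist on demand, the sketch below expands a definition made of supertypes to add and supertypes to subtract. It is an assumption-laden illustration, not the unit's ‘SNOMED CT helper tool’: the `Variable` class, the function names and the single-letter identifiers in the usage note are hypothetical, and the is-a pairs are assumed to have been read from the active |Is a| relationships of one SNOMED CT release.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set, Tuple


@dataclass
class Variable:
    """Hypothetical library entry: supertypes to add and supertypes to subtract."""
    name: str
    include: List[str]
    exclude: List[str] = field(default_factory=list)


def build_children(is_a_pairs: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Index (child, parent) is-a pairs as parent -> set of direct children."""
    children: Dict[str, Set[str]] = defaultdict(set)
    for child, parent in is_a_pairs:
        children[parent].add(child)
    return children


def descendants_or_self(children: Dict[str, Set[str]], root: str) -> Set[str]:
    """Breadth-first walk down the hierarchy; copes with multiple parents."""
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children.get(node, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def flatlist(variable: Variable, children: Dict[str, Set[str]]) -> List[str]:
    """Union of all included subtrees, minus every excluded subtree."""
    concepts: Set[str] = set()
    for concept_id in variable.include:
        concepts |= descendants_or_self(children, concept_id)
    for concept_id in variable.exclude:
        concepts -= descendants_or_self(children, concept_id)
    return sorted(concepts)
```

For example, with is-a pairs `[("B", "A"), ("C", "A"), ("D", "C"), ("D", "B")]`, a variable that includes `A` and excludes `C` expands to `["A", "B"]`: `D` is dropped because it is also a subtype of the excluded `C`. Because the flatlist is regenerated from the stored definition, it reflects whichever release the is-a pairs were taken from.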


Conclusions

  • SNOMED CT database implementation hampered by poor-quality, inaccessible guidance

  • Cross-mapping legacy codelists was of limited value. Significant time was wasted in inferring definition/purpose. Curation against full SNOMED CT led to richer, more complete concept lists, and to rejection of some original concepts as erroneous. Fewer than half of the legacy codelists were fully processed into the library. Better to start afresh and apply a clear definition directly to SNOMED CT

  • Infrastructure

    • ‘Variables’ stored in supertype/subtype formulation are easily exportable as an Expression Constraint Language (ECL) statement, which is human readable and computable. Built-in mitigation for inactivations occurring over time

    • Easy to overlook resources required to design and implement fit for purpose supporting infrastructure

    • No agreed standards for:

      • Naming ‘variables’

      • Associated metadata

  • Curation process:

    • Good support tooling essential to achieving major savings in time and increased efficacy.

    • Curators should

      • Have clinical knowledge

      • Work as a team

      • Check each other’s work

  • Interaction with researchers

    • Reproducibility of ‘variables’ still dependent on code lists, whereas SNOMED CT version plus ECL formulation might be more robust and meaningful

    • Data requirements evolve as projects develop, leading to variable mapping changes. Version control of documentation essential
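As a rough illustration of the export of a supertype/subtype definition as an ECL statement, the sketch below renders supertypes to add and subtract using the ECL operators `<<` (descendant or self of), `OR` and `MINUS`. The function names and the identifiers in the usage note are hypothetical placeholders rather than real SNOMED CT concepts, and the abstract does not say how ORCHID generates its ECL.

```python
from typing import Dict, Optional, Sequence


def focus(concept_id: str, terms: Dict[str, str]) -> str:
    """Render one 'descendant or self of' clause, with a pipe-delimited term if known."""
    term = terms.get(concept_id)
    return f"<< {concept_id} |{term}|" if term else f"<< {concept_id}"


def union(concept_ids: Sequence[str], terms: Dict[str, str]) -> str:
    """Join clauses with OR; bracket the group when it has more than one member."""
    parts = [focus(concept_id, terms) for concept_id in concept_ids]
    return parts[0] if len(parts) == 1 else "(" + " OR ".join(parts) + ")"


def to_ecl(include: Sequence[str],
           exclude: Sequence[str] = (),
           terms: Optional[Dict[str, str]] = None) -> str:
    """Build an ECL expression constraint: included supertypes MINUS excluded supertypes."""
    if not include:
        raise ValueError("a variable needs at least one included supertype")
    terms = terms or {}
    ecl = union(include, terms)
    if exclude:
        ecl += " MINUS " + union(exclude, terms)
    return ecl
```

For instance, `to_ecl(["100001"], ["100003"], {"100001": "Example disorder"})` returns `<< 100001 |Example disorder| MINUS << 100003`. Recording such a statement alongside the SNOMED CT release it was evaluated against is one way the "version plus ECL" idea could be documented.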

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:
