F1 scores when patients’ transcribed speech is excluded
Model | Including GP and patient speech | Only GP speech |
Naïve Bayes (multilabel) ICPC-2 | 0.323 (0.268, 0.362) | 0.372 (0.300, 0.417) |
Naïve Bayes (multiclass) ICPC-2 | 0.512 (0.462, 0.549) | 0.484 (0.429, 0.521) |
Nearest centroid ICPC-2 | 0.444 (0.384, 0.489) | 0.425 (0.361, 0.470) |
BERT conventional, CKS | 0.550 (0.494, 0.593) | 0.445 (0.384, 0.465) |
BERT NSP, CKS | 0.462 (0.424, 0.488) | 0.436 (0.398, 0.464) |
BERT MLM, CKS | 0.567 (0.512, 0.604) | 0.500 (0.434, 0.539) |
Each classifier was trained using its most effective distant supervision source and evaluated on the OIAM training set (repurposed as a validation set). Bold indicates the better score for each classifier when comparing inclusion and exclusion of patients’ speech.
CKS, Clinical Knowledge Summaries; GP, general practitioner; ICPC-2, International Classification of Primary Care-2; MLM, masked language modelling; NSP, next sentence prediction; OIAM, One in a Million.