F1 scores for different sources of distant supervision, and the effect of removing class A, evaluated on the OIAM training set
Model | ICPC-2 | ICPC-2 without A | CKS | CKS without A | ICPC-2 and CKS combined | Combined without A |
Naïve Bayes (multilabel) | 0.323 (0.268, 0.362) | 0.345 (0.286, 0.389) | 0.234 (0.196, 0.262) | 0.249 (0.207, 0.285) | 0.254 (0.210, 0.287) | 0.271 (0.225, 0.308) |
Naïve Bayes (multiclass) | 0.512 (0.462, 0.549) | 0.508 (0.458, 0.546) | 0.375 (0.325, 0.411) | 0.391 (0.340, 0.428) | 0.378 (0.330, 0.416) | 0.385 (0.338, 0.421) |
Nearest centroid | 0.444 (0.384, 0.489) | 0.093 (0.063, 0.120) | 0.365 (0.312, 0.401) | 0.086 (0.057, 0.107) | 0.367 (0.315, 0.403) | 0.090 (0.063, 0.113) |
BERT conventional | 0.057 (0.049, 0.065) | 0.027 (0.020, 0.037) | 0.550 (0.494, 0.593) | 0.521 (0.459, 0.565) | 0.540 (0.476, 0.576) | 0.545 (0.483, 0.590) |
BERT NSP | 0.285 (0.232, 0.324) | 0.347 (0.309, 0.371) | 0.462 (0.424, 0.488) | 0.434 (0.392, 0.466) | 0.445 (0.402, 0.476) | 0.467 (0.425, 0.498) |
BERT MLM | 0.505 (0.444, 0.544) | 0.486 (0.425, 0.528) | 0.567 (0.512, 0.604) | 0.497 (0.441, 0.535) | 0.532 (0.472, 0.571) | 0.475 (0.424, 0.512) |
Highest F1 scores in bold.
CKS, Clinical Knowledge Summaries; ICPC-2, International Classification of Primary Care-2; MLM, masked language modelling; NSP, next sentence prediction; OIAM, One in a Million.