Table 5

F1 scores for different sources of distant supervision, and the effect of removing class A, evaluated on the OIAM training set

| Model | ICPC-2 | ICPC-2 without A | CKS | CKS without A | ICPC-2 and CKS combined | Combined without A |
| --- | --- | --- | --- | --- | --- | --- |
| Naïve Bayes (multilabel) | 0.323 (0.268, 0.362) | 0.345 (0.286, 0.389) | 0.234 (0.196, 0.262) | 0.249 (0.207, 0.285) | 0.254 (0.21, 0.287) | 0.271 (0.225, 0.308) |
| Naïve Bayes (multiclass) | 0.512 (0.462, 0.549) | 0.508 (0.458, 0.546) | 0.375 (0.325, 0.411) | 0.391 (0.34, 0.428) | 0.378 (0.33, 0.416) | 0.385 (0.338, 0.421) |
| Nearest centroid | 0.444 (0.384, 0.489) | 0.093 (0.063, 0.12) | 0.365 (0.312, 0.401) | 0.086 (0.057, 0.107) | 0.367 (0.315, 0.403) | 0.090 (0.063, 0.113) |
| BERT conventional | 0.057 (0.049, 0.065) | 0.027 (0.02, 0.037) | 0.550 (0.494, 0.593) | 0.521 (0.459, 0.565) | 0.540 (0.476, 0.576) | 0.545 (0.483, 0.590) |
| BERT NSP | 0.285 (0.232, 0.324) | 0.347 (0.309, 0.371) | 0.462 (0.424, 0.488) | 0.434 (0.392, 0.466) | 0.445 (0.402, 0.476) | 0.467 (0.425, 0.498) |
| BERT MLM | 0.505 (0.444, 0.544) | 0.486 (0.425, 0.528) | 0.567 (0.512, 0.604) | 0.497 (0.441, 0.535) | 0.532 (0.472, 0.571) | 0.475 (0.424, 0.512) |

  • Highest F1 scores in bold.

  • CKS, Clinical Knowledge Summaries; ICPC-2, International Classification of Primary Care-2; MLM, masked language modelling; NSP, next sentence prediction; OIAM, One in a Million.