Table 4

F1 scores for distant supervision performance, evaluated on the OIAM training set, with different sets of stopwords, and training on either CKS topics or ICPC-2 descriptions

| Model | No removal | English | Medical | Custom | Medical+custom | English+custom | English+medical+custom |
|---|---|---|---|---|---|---|---|
| NB (multilabel), ICPC-2 | 0.139 | 0.170 | 0.136 | 0.253 | 0.297 | 0.323 | 0.297 |
| NB (multilabel), CKS | 0.096 | 0.160 | 0.119 | 0.126 | 0.191 | 0.207 | 0.234 |
| NB (multiclass), ICPC-2 | 0.324 | 0.354 | 0.307 | 0.461 | 0.471 | 0.512 | 0.470 |
| NB (multiclass), CKS | 0.245 | 0.274 | 0.249 | 0.275 | 0.340 | 0.368 | 0.375 |
| Nearest centroid, ICPC-2 | 0.312 | 0.354 | 0.317 | 0.432 | 0.437 | 0.445 | 0.437 |
| Nearest centroid, CKS | 0.326 | 0.349 | 0.344 | 0.349 | 0.353 | 0.357 | 0.365 |

  • CKS, Clinical Knowledge Summaries; ICPC-2, International Classification of Primary Care-2; NB, Naïve Bayes; OIAM, One in a Million.