F1 scores for fivefold cross-validation performance on the OIAM training set with different sets of stopwords
Model | No removal | English | Medical | Custom | Medical+custom | English+custom | English+medical+custom |
Naïve Bayes (multilabel) | 0.157 | 0.159 | 0.154 | 0.143 | 0.166 | 0.170 | 0.175 |
Naïve Bayes (multiclass) | 0.225 | 0.266 | 0.243 | 0.228 | 0.245 | 0.272 | 0.300 |
SVM (multilabel) | 0.184 | 0.184 | 0.184 | 0.184 | 0.184 | 0.184 | 0.184 |
SVM (multiclass) | 0.141 | 0.151 | 0.141 | 0.142 | 0.142 | 0.150 | 0.154 |
Nearest centroid | 0.234 | 0.256 | 0.239 | 0.234 | 0.247 | 0.252 | 0.278 |
OIAM, One in a Million; SVM, support vector machine.
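
As a reading aid, the table can be summarized per model by finding the stopword set with the highest F1 score and its gain over no removal. The sketch below is only an illustration: the scores are copied from the table, while the names `STOPWORD_SETS`, `F1_SCORES`, and `summarize` are hypothetical and not part of the original study. Ties are resolved in favor of the leftmost column, which is an assumption of this sketch.

```python
# F1 scores copied from the table above; column order follows the table header.
STOPWORD_SETS = [
    "No removal", "English", "Medical", "Custom",
    "Medical+custom", "English+custom", "English+medical+custom",
]

F1_SCORES = {
    "Naïve Bayes (multilabel)": [0.157, 0.159, 0.154, 0.143, 0.166, 0.170, 0.175],
    "Naïve Bayes (multiclass)": [0.225, 0.266, 0.243, 0.228, 0.245, 0.272, 0.300],
    "SVM (multilabel)": [0.184, 0.184, 0.184, 0.184, 0.184, 0.184, 0.184],
    "SVM (multiclass)": [0.141, 0.151, 0.141, 0.142, 0.142, 0.150, 0.154],
    "Nearest centroid": [0.234, 0.256, 0.239, 0.234, 0.247, 0.252, 0.278],
}


def summarize(scores, columns):
    """Return {model: (best_column, best_f1, gain_over_no_removal)}."""
    summary = {}
    for model, row in scores.items():
        # max() returns the first maximal index, so ties go to the leftmost column.
        best_idx = max(range(len(row)), key=row.__getitem__)
        # Round to three decimals to match the precision reported in the table.
        gain = round(row[best_idx] - row[0], 3)
        summary[model] = (columns[best_idx], row[best_idx], gain)
    return summary


SUMMARY = summarize(F1_SCORES, STOPWORD_SETS)

for model, (best_set, best_f1, gain) in SUMMARY.items():
    print(f"{model}: best = {best_set} (F1 {best_f1:.3f}, gain {gain:+.3f})")
```

Under these assumptions, every model except the multilabel SVM scores highest with the combined English+medical+custom set, and the multilabel SVM is unchanged across all stopword sets.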