Table 2B

Studies that performed text classification using a supervised approach, including the number of raters and the associated inter-rater agreement expressed as Cohen’s kappa (κ), and the classifiers and configuration applied, where reported. Studies are reported in chronological order.

Author	Data source	Comments classified	No. of raters	κ	No. of themes	Classifier							Configuration
Author	Data source	Comments classified	No. of raters	κ	No. of themes	SVM	NB	DT	B	RF	GL	KN	Configuration
Alemi et al10 34	RateMDs	100% (n=955)	NR	NR	9	✓	✓	✓	✓				Sparsity rule; SVM: RBF kernel
Greaves et al7	NHS Choices	*17.56% (1000/5695)	2	0.76	3	✓	✓	✓	✓				Prior polarity; information gain; SVM: RBF kernel
Wagland et al48	Cancer experience	14.19% (800/5634)	3	0.64–0.87	11	✓			✓	✓	✓		NR
Doing-Harris et al24	Press Ganey	0.58% (300/51 235)	3	0.73	7		✓						NR
Hawkins et al52	Twitter	7511/11 602†	AMT	0.18–0.52	10	✓	✓						NR

*Only n-grams classified.
†Tweets classified as pertaining to patient experience only.
AMT, Amazon Mechanical Turk; B, bagging; DT, decision trees; GL, generalised linear model; KN, k-nearest neighbour; NB, Naïve Bayes; NR, not reported; RBF, radial basis function; RF, random forest; SVM, support vector machine.
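
The κ column reports Cohen’s kappa, which corrects the observed agreement between raters for the agreement expected by chance: κ = (p_o − p_e) / (1 − p_e). As an illustration only, the following is a minimal sketch of that calculation for two raters. The function name `cohen_kappa` and the example labels are hypothetical and are not taken from any of the included studies; studies with three raters may have used pairwise or multi-rater statistics, which this sketch does not cover.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items (illustrative sketch)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement p_o: share of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement p_e: product of the raters' marginal label proportions, summed over labels.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        # Both raters used one identical label throughout; kappa is undefined here.
        return float("nan")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical sentiment labels for ten comments (not data from the table).
rater_1 = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg", "pos", "neg"]
rater_2 = ["pos", "pos", "neg", "pos", "pos", "neg", "neg", "neg", "pos", "neg"]
kappa = cohen_kappa(rater_1, rater_2)
print(round(kappa, 2))
```

In this hypothetical example the raters agree on 8 of 10 comments (p_o = 0.8) and chance agreement is p_e = 0.5, giving κ = 0.6.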