From: Building a protein name dictionary from full text: a machine learning term extraction approach
Training lists                                                 PG       C        IK       Pr
# n-grams, n>=2                                                304      193      111      254
# occurrences in articles where the n-gram is most frequent    16,543   10,862   5,853    12,547