Table 2. Main features used by the participating teams. The table lists the characteristics, resources and methods adopted by the participants; the Users column gives the reference numbers of the teams that used each one.

From: Evaluation of BioCreAtIvE assessment of task 2

| Characteristics (C), resources (R) and methods (M) | Users |
|---|---|
| (C) Sentence level (retrieval unit) | [19,20,22,25,26] |
| (C) Paragraph level (retrieval unit) | [21,23,24] |
| (C) Full article processed | [19,21,22,24,25] |
| (C) Full article processed except methods section | [26] |
| (C) Only abstract processed | [20] |
| (C) GO term – Protein distance | [22,24,25] |
| (M) Stemming | [20,22,24,26] |
| (M) POS tagging | [25,26] |
| (M) Shallow parsing | [25] |
| (M) Finite state automata | [20,25] |
| (M) Edit distance ranking | [20] |
| (M) Vector space model | [20,21] |
| (M) Machine learning technique | [23-25] |
| (M) Support Vector Machines | [23] |
| (M) Naïve Bayes models | [24,25] |
| (M) N-gram models | [24] |
| (M) External resource – tool: GATE NLP tool | [21] |
| (M) External resource – tool: Morphological normalizer BioMorpher | [21] |
| (M) External resource – tool: qtile query-based ranking tool | [26] |
| (M) External resource – tool: Grok POS tagger | [25] |
| (M) Heuristic rules | [22,24-26] |
| (M) Regular expressions/pattern matching | [19,20,22,24,25] |
| (M) Literal string matching | [22,24] |
| (R) Protein name aliases (link to external databases) | [22,24,26] |
| (R) GO terms used | [19-26] |
| (R) GOA data used | [22-24] |
| (R) GO term forming words/tokens | [19,22,24,26] |
| (R) GO term variants | [22,25] |
| (R) External resource – data: Dictionary of suffixes | [24] |
| (R) External resource – data: UMLS/MeSH dictionary | [20,24] |
| (R) External resource – data: HUGO database | [22,24,26] |
| (R) External resource – data: SGD database | [24] |
| (R) External resource – data: MGI database | [24] |
| (R) External resource – data: RGD database | [24] |
| (R) External resource – data: TAIR database | [24] |
| (R) External resource – data: Procter and Gamble protein synonyms | [21] |