Table 3 Statistics of annotations in different sections of text

From: The Genia Event and Protein Coreference tasks of the BioNLP Shared Task 2011

| Item | Abstract | Full paper: All | Full paper: TIAB | Full paper: Intro. | Full paper: R/D/C | Full paper: Methods | Full paper: Caption |
|---|---|---|---|---|---|---|---|
| Words | 267229 | 80962 | 3538 | 7878 | 43420 | 19406 | 6720 |
| Proteins | 14969 | 6580 | 336 | 597 | 3980 | 916 | 751 |
| (Density: P/W) | (5.60%) | (8.13%) | (9.50%) | (7.58%) | (9.17%) | (4.72%) | (11.18%) |
| Event triggers | 11057 | 3280 | 216 | 312 | 2659 | 136 | 173 |
| Events | 13603 | 4436 | 272 | 427 | 3234 | 198 | 278 |
| (Density: E/W) | (5.09%) | (5.48%) | (7.69%) | (5.42%) | (7.51%) | (1.02%) | (4.14%) |
| (Density: E/P) | (90.87%) | (67.42%) | (80.95%) | (71.52%) | (81.93%) | (21.62%) | (37.02%) |
| (Avg. Coord.: E/T) | (1.23) | (1.27) | (1.26) | (1.37) | (1.23) | (1.46) | (1.61) |
| Gene expression | 2816 | 1193 | 62 | 98 | 841 | 80 | 112 |
| Transcription | 795 | 204 | 7 | 7 | 140 | 30 | 20 |
| Protein catabolism | 145 | 3 | 0 | 0 | 3 | 0 | 0 |
| Phosphorylation | 355 | 137 | 12 | 12 | 101 | 10 | 2 |
| Localization | 492 | 47 | 3 | 15 | 22 | 7 | 0 |
| Binding | 1485 | 380 | 16 | 74 | 266 | 6 | 18 |
| Regulation | 1426 | 371 | 35 | 30 | 281 | 4 | 21 |
| Positive_regulation | 4452 | 1385 | 98 | 131 | 1087 | 15 | 54 |
| Negative_regulation | 1637 | 716 | 39 | 60 | 520 | 46 | 51 |

  1. The Abstract column shows the statistics of the abstract collection (1210 titles and abstracts); the remaining columns show those of the full paper collection (14 full papers). TIAB = title and abstract; Intro. = introduction and background; R/D/C = results, discussions, and conclusions; Methods = methods, materials, and experimental procedures. Some minor sections (supporting information, supplementary material, and synopsis) are ignored. Density = relative density of annotation (P/W = Proteins/Words, E/W = Events/Words, E/P = Events/Proteins). Avg. Coord. = average number of coordinated events (E/T = Events/Triggers).
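
The ratio definitions in the footnote can be checked with a few lines of Python. The following is a minimal sketch, not part of the original paper: the names `SectionCounts` and `densities` are hypothetical, and the counts are copied from the Abstract column of the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SectionCounts:
    """Raw annotation counts for one column of the table (hypothetical helper)."""
    words: int
    proteins: int
    triggers: int
    events: int


def densities(c: SectionCounts) -> dict:
    """Compute the four ratios defined in the footnote, as plain fractions."""
    return {
        "P/W": c.proteins / c.words,    # protein density
        "E/W": c.events / c.words,      # event density per word
        "E/P": c.events / c.proteins,   # events per protein
        "E/T": c.events / c.triggers,   # average coordinated events per trigger
    }


# Counts taken from the Abstract column.
abstract = SectionCounts(words=267229, proteins=14969, triggers=11057, events=13603)
d = densities(abstract)

print(f"P/W = {d['P/W']:.2%}")  # 5.60%
print(f"E/W = {d['E/W']:.2%}")  # 5.09%
print(f"E/P = {d['E/P']:.2%}")  # 90.87%
print(f"E/T = {d['E/T']:.2f}")  # 1.23
```

The same function can be applied to any other column by constructing a `SectionCounts` from that column's Words, Proteins, Event triggers, and Events rows.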