Table 2 Statistics for the most frequent dependency labels and overlapping POS tags, sentence length (i.e. number of words in the sentence), and relative dependency distance i − j from a dependent w_i to its head w_j

From: From POS tagging to dependency parsing for biomedical event extraction

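| Dependency labels (GENIA) | % | Dependency labels (CRAFT) | % | POS tags | % G | % C | Length | % | Distance | % G | % C |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| advmod | 2.3 | ADV | 4.0 | CC | 3.6 | 3.2 | GENIA | | <−5 | 4.1 | 3.9 |
| amod | 9.6 | AMOD | 1.9 | CD | 1.6 | 4.0 | 1-10 | 3.5 | −5 | 1.2 | 1.2 |
| appos | 1.2 | CONJ | 3.6 | DT | 7.6 | 6.6 | 11-20 | 31.0 | −4 | 2.1 | 2.1 |
| aux | 1.4 | COORD | 3.2 | IN | 12.9 | 11.3 | 21-30 | 35.7 | −3 | 4.4 | 3.2 |
| auxpass | 1.5 | DEP | 1.0 | JJ | 10.1 | 7.6 | 31-40 | 19.4 | −2 | 10.6 | 8.5 |
| cc | 3.5 | LOC | 1.7 | NN | 29.3 | 24.2 | 41-50 | 7.1 | −1 | 24.1 | 21.7 |
| conj | 3.9 | NMOD | 33.7 | NNS | 6.9 | 6.6 | >50 | 3.3 | 1 | 19.0 | 26.5 |
| dep | 2.1 | OBJ | 2.8 | RB | 2.5 | 2.4 | | | 2 | 9.4 | 9.8 |
| det | 7.2 | P | 18.4 | TO | 1.6 | 0.6 | CRAFT | | 3 | 6.3 | 5.9 |
| dobj | 3.1 | PMOD | 10.6 | VB | 1.1 | 1.1 | 1-10 | 17.8 | 4 | 4.0 | 3.4 |
| mark | 1.1 | PRD | 0.9 | VBD | 2.1 | 2.2 | 11-20 | 23.1 | 5 | 2.4 | 2.3 |
| nn | 11.6 | PRN | 1.9 | VBG | 1.0 | 1.1 | 21-30 | 25.2 | >5 | 12.3 | 11.6 |
| nsubj | 4.1 | ROOT | 3.9 | VBN | 3.1 | 3.8 | 31-40 | 17.5 | - | - | - |
| nsubjpass | 1.4 | SBJ | 4.9 | VBP | 1.4 | 1.1 | 41-50 | 9.3 | - | - | - |
| num | 1.2 | SUB | 0.9 | VBZ | 1.9 | 1.4 | >50 | 7.1 | - | - | - |
| pobj | 12.2 | TMP | 0.9 | - | - | - | - | - | - | - | - |
| prep | 12.3 | VC | 2.4 | - | - | - | - | - | - | - | - |
| punct | 10.4 | - | - | - | - | - | - | - | - | - | - |
| root | 3.8 | - | - | - | - | - | - | - | - | - | - |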

Note: % G and % C denote the occurrence proportions (in percent) in GENIA and CRAFT, respectively.
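
To make the "Length" and "Distance" columns concrete, here is a minimal Python sketch of how such statistics could be tallied. It is an illustration under stated assumptions, not the authors' procedure: the function names (`distance_bin`, `length_bin`, `distance_distribution`) are hypothetical, each sentence is assumed to be given as a list of 1-based head positions with 0 marking the root (a CoNLL-style convention), and root tokens are assumed to be excluded from the distance counts. Bin labels use an ASCII hyphen where the table prints a minus sign.

```python
from collections import Counter


def distance_bin(d):
    """Map a relative distance d = i - j to the bins used in the Distance column."""
    if d < -5:
        return "<-5"
    if d > 5:
        return ">5"
    return str(d)


def length_bin(n):
    """Map a sentence length n >= 1 (in words) to the bins used in the Length column."""
    if n > 50:
        return ">50"
    lo = ((n - 1) // 10) * 10 + 1
    return f"{lo}-{lo + 9}"


def distance_distribution(sentences):
    """Return the percentage of dependents falling into each distance bin.

    sentences: list of head lists; heads[k] is the 1-based position of the
    head of word k+1, and 0 marks the root (assumed convention).
    """
    counts = Counter()
    for heads in sentences:
        for i, j in enumerate(heads, start=1):
            if j == 0:
                continue  # assumption: the root has no head word, so it is skipped
            counts[distance_bin(i - j)] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {b: 100.0 * c / total for b, c in counts.items()}
```

For example, a three-word sentence whose second word is the root and heads both neighbours, `distance_distribution([[2, 0, 2]])`, puts half of the dependents at distance -1 (the dependent directly precedes its head) and half at distance 1.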