Table 5 Stacking and other model combination techniques

From: Combining joint models for biomedical event extraction

| Model | F1 |
| --- | --- |
| UMass | 54.7 |
| UMass←Stanford | 55.8 |

| Model | Alone | Intersection with UMass | Union with UMass |
| --- | --- | --- | --- |
| Stanford (1N) | 49.9 | 49.0 | 54.7 |
| Stanford (1P) | 49.0 | 48.3 | 54.6 |
| Stanford (2N) | 46.5 | 45.4 | 54.8 |
| Stanford (2P) | 49.5 | 49.1 | 54.4 |
| Stanford (all) | -- | 42.4 | 53.0 |
| Stanford (1N, reranked) | 50.2 | 49.7 | 54.4 |
| Stanford (1P, reranked) | 49.4 | 50.2 | 53.2 |
| Stanford (2N, reranked) | 47.8 | 46.9 | 54.6 |
| Stanford (2P, reranked) | 50.4 | 50.0 | 54.4 |
| Stanford (all, reranked) | 50.7 | 50.0 | 54.7 |

| Model |   | Intersection | Union |
| --- | --- | --- | --- |
| Stanford (all) |   | 43.9 | 50.2 |

  1. Stacking and reranking outperform the intersection and union model combination baselines. The first section of the table summarizes the results from the UMass and stacked models. The second section gives the performance of each Stanford model alone and when combined with the pure UMass model via the intersection and union methods. In the last section, we evaluate the intersection and union baselines using only the four Stanford models as inputs. The "Stanford (all)" line represents using all four individual decoders without model combination; hence its entry in the Alone column of the second section is left empty, since four separate sets of outputs cannot be evaluated as a single model. In "Stanford (all, reranked)", the reranker was used to combine the four decoders into a single output before being intersected or unioned. All results are on the development set for the Genia track.
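
The intersection and union baselines described in the note above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes each model's output can be reduced to a set of hashable event tuples and that a predicted event counts as correct only on an exact match with a gold event, which is simpler than the matching criteria used in the actual shared-task evaluation. The function names and the toy events are hypothetical.

```python
from typing import Hashable, Iterable, Set


def combine_intersection(outputs: Iterable[Set[Hashable]]) -> Set[Hashable]:
    """Keep only the events that every model predicted."""
    outputs = [set(o) for o in outputs]
    if not outputs:
        return set()
    return outputs[0].intersection(*outputs[1:])


def combine_union(outputs: Iterable[Set[Hashable]]) -> Set[Hashable]:
    """Keep every event that at least one model predicted."""
    combined: Set[Hashable] = set()
    for o in outputs:
        combined |= set(o)
    return combined


def f1_score(predicted: Set[Hashable], gold: Set[Hashable]) -> float:
    """F1 under exact matching (an assumption made for this sketch)."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Toy data, invented purely for illustration: (event type, argument) pairs.
gold = {("Phosphorylation", "STAT3"), ("Binding", "TRAF2"), ("Gene_expression", "IL2")}
model_a = {("Phosphorylation", "STAT3"), ("Binding", "TRAF2"), ("Regulation", "X")}
model_b = {("Phosphorylation", "STAT3"), ("Gene_expression", "IL2")}

intersection = combine_intersection([model_a, model_b])
union = combine_union([model_a, model_b])

f1_intersection = f1_score(intersection, gold)
f1_union = f1_score(union, gold)
```

On this toy data, the intersection keeps a single event that both models agree on, so it is precise but misses most gold events, while the union recovers every gold event at the cost of one spurious prediction. This is the general precision and recall trade-off between the two baselines; which one scores higher on real data depends on the models being combined.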