Table 1 Parameters of the modelling functions for the full dataset

From: Metrics for GO based protein semantic similarity: a systematic evaluation

 

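| Measure | Approach | LRBS Mean1 | LRBS Stdev1 | LRBS Mean2 | LRBS Stdev2 | LRBS Res. | RRBS Mean1 | RRBS Stdev1 | RRBS Mean2 | RRBS Stdev2 | RRBS Res. |
|---|---|---|---|---|---|---|---|---|---|---|---|
| simGIC | | 2,1 | 0,20 | 2,6 | 0,06 | **0,62** | 0,21 | 0,07 | 0,58 | 0,10 | **0,65** |
| simUI | | 2,1 | 0,20 | 2,6 | 0,05 | 0,43 | 0,21 | 0,08 | 0,59 | 0,07 | 0,46 |
| Resnik's measure | Avg | 2,1 | 0,28 | 2,7 | 0,08 | 0,16 | 0,21 | 0,10 | 0,49 | 0,03 | 0,35 |
| | Max | 2,1 | 0,05 | 2,6 | 0,18 | 0,24 | 0,16 | 0,06 | 0,49 | 0,22 | 0,37 |
| | BMA | 2,1 | 0,18 | 2,6 | 0,06 | 0,47 | 0,20 | 0,08 | 0,58 | 0,09 | 0,55 |
| | GraSM | 2,2 | 0,19 | 2,6 | 0,03 | **0,59** | 0,23 | 0,08 | 0,50 | 0,01 | **0,67** |
| Lin's measure | Avg | 2,1 | 0,31 | 2,7 | 0,08 | 0,15 | 0,21 | 0,12 | 0,49 | 0,03 | 0,29 |
| | Max | 2,1 | 0,08 | 2,6 | 0,15 | 0,18 | 0,15 | 0,08 | 0,49 | 0,15 | 0,30 |
| | BMA | 2,1 | 0,20 | 2,6 | 0,06 | 0,39 | 0,20 | 0,08 | 0,33 | 0,34 | 0,47 |
| | GraSM | 2,2 | 0,18 | 2,6 | 0,03 | 0,47 | 0,22 | 0,09 | 0,50 | 0,01 | 0,57 |
| Jiang & Conrath's measure | Avg | 2,1 | 0,24 | 2,7 | 0,06 | 0,10 | 0,19 | 0,10 | 0,49 | 0,03 | 0,20 |
| | Max | 2,1 | 0,04 | 2,4 | 0,43 | 0,14 | 0,16 | 0,08 | 0,49 | 0,15 | 0,21 |
| | BMA | 2,1 | 0,16 | 2,6 | 0,09 | 0,22 | 0,19 | 0,08 | 0,45 | 0,25 | 0,28 |
| | GraSM | 2,2 | 0,16 | 2,6 | 0,03 | 0,27 | 0,20 | 0,10 | 0,51 | 0,00 | 0,38 |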

For each semantic similarity measure in the full dataset, and for each of the sequence similarity metrics (LRBS and RRBS), the table shows the mean and standard deviation of the two additive normal cumulative distribution functions (N_CDF) used to model it. Also shown is the global resolution of the measure, which corresponds to the sum of the scale factors applied to the two normal functions. Although the normal parameters show some variability (particularly in the results with the RRBS sequence similarity metric), most of that variability is due to the sensitivity of the modelling method, as the similarity in behaviour between the measures is evident (Figures 3, 4 and 5), with the exception of the average approach. As the main criterion for distinguishing between the measures is their resolution, the highest resolutions (for simGIC and Resnik's measure with the GraSM approach) are highlighted in bold. Abbreviations: Mean1: mean of the first N_CDF; Stdev1: standard deviation of the first N_CDF; Mean2: mean of the second N_CDF; Stdev2: standard deviation of the second N_CDF; Res.: resolution of each measure; LRBS: log reciprocal BLAST score; RRBS: relative reciprocal BLAST score.
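
To make the relationship between the tabulated parameters and the resolution concrete, the following is a minimal sketch of a model built from two additive, scaled normal CDFs. It is an illustration under stated assumptions, not the authors' fitting code: the function names are hypothetical, the model is assumed to have no additional offset term, and the individual scale factors are not reported in the table, so the equal split used here (0.31 + 0.31) is invented purely so that the two factors sum to the simGIC/LRBS resolution of 0,62. Decimal commas from the table are written as decimal points.

```python
import math


def normal_cdf(x, mean, stdev):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (stdev * math.sqrt(2.0))))


def two_cdf_model(x, mean1, stdev1, scale1, mean2, stdev2, scale2):
    """Sum of two scaled normal CDFs (assumed model form, no offset term)."""
    return (scale1 * normal_cdf(x, mean1, stdev1)
            + scale2 * normal_cdf(x, mean2, stdev2))


def resolution(scale1, scale2):
    """Global resolution: the sum of the two scale factors."""
    return scale1 + scale2


# simGIC with LRBS, taken from the table (Mean1, Stdev1, Mean2, Stdev2).
SIMGIC_LRBS = dict(mean1=2.1, stdev1=0.20, mean2=2.6, stdev2=0.06)

# Hypothetical scale factors: only their sum (Res. = 0,62) is reported.
SCALE1, SCALE2 = 0.31, 0.31

if __name__ == "__main__":
    for x in (1.5, 2.1, 2.6, 3.5):
        y = two_cdf_model(x, scale1=SCALE1, scale2=SCALE2, **SIMGIC_LRBS)
        print(f"sequence similarity {x:.1f} -> modelled semantic similarity {y:.3f}")
    print(f"resolution = {resolution(SCALE1, SCALE2):.2f}")
```

Under these assumptions the modelled curve rises from 0 at low sequence similarity to the resolution at high sequence similarity, which is why the resolution can be read as the range of semantic similarity values the measure spans. A standard deviation that rounds to 0,00 (as for Jiang & Conrath's measure with GraSM under RRBS) would need a guard against division by zero before being passed to `normal_cdf`.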