Table 1 Summary of 14 semantic similarity scores for protein pairs.

From: Revealing and avoiding bias in semantic similarity scores for protein pairs

Measure | Description | Range

Similarity scores for term pairs
Resnik [6] | Information content of the most informative common ancestor of two terms | ≥ 0
Lin [5] | Resnik similarity score normalized by how close the two terms are to their most informative common ancestor | [0, 1)
RS [4] | Lin similarity score weighted by the annotation probability of the most informative common ancestor | [0, 1)
Jiang [7] | Based on the difference in information content between the two terms and their most informative common ancestor | (0, 1]

Similarity scores for protein pairs based on pairwise similarity scores between term groups
AVG [2] | The average of the similarity scores over all pairs of terms between the two groups of protein annotations | Same as the range of the corresponding similarity score for term pairs
BMA [3] | The score of the best-matching term pairs between the two groups of protein annotations | Same as the range of the corresponding similarity score for term pairs

Similarity scores for protein pairs based on groupwise similarity scores between term groups
TO [9] | The number of terms shared by the annotations of the two proteins | ≥ 1
NTO [9] | TO divided by the smaller of the two proteins' annotation lengths | (0, 1]
Dice [12] | TO divided by the average of the two proteins' annotation lengths | (0, 1]
Kappa [11] | A chance-corrected measure of co-occurrence between the two groups of protein annotations | [0, 1]
GIC [8] | Jaccard index weighted by the information content of each GO term | [0, 1]
VSM [10] | Cosine similarity weighted by the information content of each GO term | [0, 1]
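
The set-based scores in the table (TO, NTO, Dice) and the two pairwise aggregations (AVG, BMA) can be illustrated with a minimal Python sketch. It assumes that each protein's annotations are available as a non-empty set of term identifiers and that some term-pair similarity function (for example Resnik or Lin) is supplied by the caller. All function names are hypothetical, and the BMA variant shown (the mean of the row-wise and column-wise best matches) is one common formulation that the table does not spell out, so treat it as an assumption rather than the authors' exact definition.

```python
from statistics import mean
from typing import Callable, Set

# Hypothetical type alias: a term-pair similarity such as Resnik or Lin.
TermSim = Callable[[str, str], float]


def _check(a: Set[str], b: Set[str]) -> None:
    # The ratios below are undefined for proteins without annotations.
    if not a or not b:
        raise ValueError("both annotation sets must be non-empty")


def term_overlap(a: Set[str], b: Set[str]) -> int:
    """TO: number of terms shared by the two annotation sets."""
    return len(a & b)


def normalized_term_overlap(a: Set[str], b: Set[str]) -> float:
    """NTO: TO divided by the size of the smaller annotation set."""
    _check(a, b)
    return term_overlap(a, b) / min(len(a), len(b))


def dice(a: Set[str], b: Set[str]) -> float:
    """Dice: TO divided by the average size of the two annotation sets."""
    _check(a, b)
    return term_overlap(a, b) / ((len(a) + len(b)) / 2)


def avg_score(a: Set[str], b: Set[str], sim: TermSim) -> float:
    """AVG: mean term-pair similarity over all cross pairs."""
    _check(a, b)
    return mean(sim(s, t) for s in a for t in b)


def bma_score(a: Set[str], b: Set[str], sim: TermSim) -> float:
    """BMA (assumed variant): average of the mean best match in each direction."""
    _check(a, b)
    best_for_a = [max(sim(s, t) for t in b) for s in a]
    best_for_b = [max(sim(s, t) for s in a) for t in b]
    return (mean(best_for_a) + mean(best_for_b)) / 2
```

With the toy sets {"t1", "t2", "t3"} and {"t2", "t3", "t4", "t5"}, TO is 2, NTO is 2/3, and Dice is 2/3.5 = 4/7. Because TO counts raw set intersections, the result depends on whether ancestor terms have been propagated into each annotation set before scoring; the sketch leaves that choice to the caller.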