Table 2 Accuracy for different component combinations of our proposed distance method

From: ComPhy: prokaryotic composite distance phylogenies inferred from whole-genome gene sets

| Data sets | Number of species | GDD (%) | GCD (%) | GBD (%) | GCD*GDD (%) | GCD*GBD (%) | GDD*GBD (%) | GCD*GDD*GBD (%) |
|---|---|---|---|---|---|---|---|---|
| Dataset1 | 52 | 85.12 | 86.44 | 84.54 | 91.45 | 90.29 | 90.29 | 90.29 |
| Dataset2 | 53 | 87.76 | 86.40 | 84.45 | 90.65 | 90.74 | 90.74 | 90.74 |
| Dataset3 | 82 | 80.37 | 92.58 | 84.19 | 94.46 | 95.93 | 96.06 | 98.46 |
| Dataset4 | 398 | 83.73 | 86.56 | 81.23 | 89.93 | 87.07 | 87.28 | 90.07 |
| Dataset5 | 181 | 95.04 | 89.74 | 90.20 | 94.30 | 95.67 | 98.16 | 98.30 |
| Dataset6 | 96 | 87.39 | 85.45 | 84.88 | 99.36 | 99.26 | 99.36 | 99.26 |
| Dataset7 | 277 | 88.70 | 84.04 | 86.75 | 88.71 | 89.71 | 88.23 | 90.71 |
| Dataset8 | 165 | 85.36 | 77.98 | 77.03 | 94.44 | 94.38 | 94.47 | 94.38 |
| Dataset9 | 54 | 89.31 | 87.34 | 83.76 | 92.31 | 92.31 | 92.37 | 96.55 |

  1. GDD = gene dispersion distance, GCD = gene content distance, GBD = gene breakpoint distance. GCD*GDD is the combination of two distance components and GCD*GDD*GBD is the combination of all three terms. All distances are logarithmically transformed.
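
For readers who want to compare the columns programmatically, the accuracies in Table 2 can be loaded into a short script. The Python sketch below is only an illustration of one way to read the table: the names `COLUMNS`, `ACCURACY`, `best_columns`, and `count_best` are hypothetical and do not come from the paper or from ComPhy itself. It reports, for each data set, which distance component or combination reaches the highest accuracy, keeping ties.

```python
# Column labels as given in Table 2 (all values are accuracies in %).
COLUMNS = ["GDD", "GCD", "GBD", "GCD*GDD", "GCD*GBD", "GDD*GBD", "GCD*GDD*GBD"]

# Accuracy values transcribed from Table 2, one row per data set.
# The variable names here are illustrative, not part of the original work.
ACCURACY = {
    "Dataset1": [85.12, 86.44, 84.54, 91.45, 90.29, 90.29, 90.29],
    "Dataset2": [87.76, 86.40, 84.45, 90.65, 90.74, 90.74, 90.74],
    "Dataset3": [80.37, 92.58, 84.19, 94.46, 95.93, 96.06, 98.46],
    "Dataset4": [83.73, 86.56, 81.23, 89.93, 87.07, 87.28, 90.07],
    "Dataset5": [95.04, 89.74, 90.20, 94.30, 95.67, 98.16, 98.30],
    "Dataset6": [87.39, 85.45, 84.88, 99.36, 99.26, 99.36, 99.26],
    "Dataset7": [88.70, 84.04, 86.75, 88.71, 89.71, 88.23, 90.71],
    "Dataset8": [85.36, 77.98, 77.03, 94.44, 94.38, 94.47, 94.38],
    "Dataset9": [89.31, 87.34, 83.76, 92.31, 92.31, 92.37, 96.55],
}


def best_columns(row):
    """Return every column label whose accuracy equals the row maximum (ties kept)."""
    top = max(row)
    return [name for name, value in zip(COLUMNS, row) if value == top]


def count_best(column):
    """Count the data sets in which `column` is the best or tied-best entry."""
    return sum(column in best_columns(row) for row in ACCURACY.values())


if __name__ == "__main__":
    # Print the best-scoring column(s) for each data set.
    for name, row in ACCURACY.items():
        print(name, best_columns(row))
    # Print how often each column is best or tied for best.
    for column in COLUMNS:
        print(column, count_best(column))
```

Ties are kept deliberately, since several rows list identical accuracies for more than one combination.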