Table 2 Accuracy of three different machine learning prediction algorithms (J48 Decision Tree, Naïve Bayes and SVM with SMO training) using the frequencies of all possible short tripeptide binary segments.a

From: Use of machine learning algorithms to classify binary protein sequences as highly-designable or poorly-designable

 

Metric          J48      Naïve Bayes   SMO

a) Sequences folding to the top 10% and the bottom 10% of designable conformations for the hexagon
Correct         89.7%    78.8%         91.0%
AUC             0.95     0.92          0.91
Sens            0.91     0.85          0.84
Spec            0.90     0.77          0.91

b) Sequences folding to the top 10% and the bottom 10% of designable conformations for the triangle
Correct         67.8%    56.7%         57.8%
AUC             0.69     0.61          0.58
Sens            0.68     0.58          0.64
Spec            0.68     0.57          0.57

a We compare random subsets of sequences corresponding to the top 10% and the bottom 10% of designable structures for a) the hexagon and b) the triangle. Prediction accuracy (% correct), area under the curve (AUC), sensitivity (Sens), and specificity (Spec) are given for each method.
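
For readers less familiar with these measures, the following is a minimal sketch of how accuracy, sensitivity, and specificity relate to the counts in a two-class confusion matrix. It assumes that "highly designable" is treated as the positive class, which the table does not state; the function name `binary_metrics` and the counts used are hypothetical and are not taken from the paper. AUC is omitted because it requires per-sequence classifier scores rather than counts.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts.

    tp/fn: positive-class sequences predicted correctly/incorrectly.
    tn/fp: negative-class sequences predicted correctly/incorrectly.
    """
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total      # fraction of all sequences classified correctly
    sensitivity = tp / (tp + fn)      # true positive rate
    specificity = tn / (tn + fp)      # true negative rate
    return accuracy, sensitivity, specificity


# Hypothetical counts (NOT from the paper): 100 sequences in each class.
acc, sens, spec = binary_metrics(tp=90, fn=10, tn=80, fp=20)
print(f"accuracy={acc:.3f} sens={sens:.2f} spec={spec:.2f}")
# prints: accuracy=0.850 sens=0.90 spec=0.80
```

With equal numbers of sequences in the two classes, as in this illustrative example, accuracy is the mean of sensitivity and specificity; with unequal class sizes it is a weighted mean.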