Table 1 Characterization of the EC families selected

From: Enzyme classification with peptide programs: a comparative study

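| EC family | P Train | N Train | P Test | N Test | Avg SICP | Std SICP | Avg SICN | Std SICN | NNP | NPN |
|---|---|---|---|---|---|---|---|---|---|---|
| 1.1.1.1 | 168 | 3611 | 42 | 893 | 85% | 15% | 47% | 9% | 2 | 9 |
| 1.1.1.25 | 173 | 3607 | 44 | 890 | 79% | 21% | 41% | 6% | 6 | 4 |
| 1.8.4.11 | 163 | 156 | 41 | 40 | 87% | 15% | 18% | 24% | 2 | 5 |
| 2.1.2.10 | 172 | 1300 | 44 | 325 | 86% | 16% | 26% | 17% | 2 | 0 |
| 2.3.2.6 | 160 | 160 | 41 | 41 | 82% | 18% | 31% | 21% | 2 | 1 |
| 2.5.1.55 | 161 | 2652 | 41 | 648 | 94% | 8% | 41% | 12% | 1 | 1 |
| 2.7.1.11 | 162 | 3381 | 41 | 844 | 86% | 20% | 46% | 5% | 4 | 1 |
| 2.7.1.21 | 165 | 3381 | 42 | 840 | 83% | 21% | 39% | 17% | 7 | 1 |
| 2.7.2.1 | 173 | 942 | 44 | 236 | 85% | 15% | 46% | 4% | 1 | 1 |
| 2.7.7.27 | 167 | 5606 | 42 | 1313 | 89% | 13% | 29% | 3% | 0 | 4 |
| 3.1.26.11 | 168 | 1109 | 42 | 278 | 85% | 15% | 24% | 17% | 1 | 3 |
| 3.5.4.19 | 175 | 1549 | 44 | 388 | 84% | 14% | 30% | 21% | 0 | 0 |
| 4.1.1.31 | 161 | 2603 | 41 | 649 | 87% | 17% | 18% | 18% | 1 | 1 |
| 4.2.3.4 | 172 | 488 | 43 | 123 | 79% | 20% | 23% | 21% | 3 | 0 |
| 5.1.1.1 | 175 | 387 | 44 | 97 | 79% | 23% | 20% | 17% | 3 | 2 |
| 5.1.1.3 | 163 | 400 | 41 | 99 | 88% | 17% | 30% | 17% | 2 | 2 |
| 5.3.1.24 | 173 | 1744 | 44 | 431 | 78% | 20% | 43% | 14% | 8 | 3 |
| 6.3.4.3 | 164 | 1071 | 42 | 267 | 84% | 15% | 31% | 14% | 0 | 0 |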
  1. P Train – size of the positive training set; N Train – size of the negative training set; P Test – size of the positive test set; N Test – size of the negative test set; Avg SICP – average sequence identity between each positive test protein and its closest positive training protein; Std SICP – standard deviation of the SICP; Avg SICN – average sequence identity between each positive test protein and its closest negative training protein; Std SICN – standard deviation of the SICN; NNP – near negative positives (i.e. positive test proteins whose closest negative is within 10% sequence identity of the closest positive); NPN – near positive negatives.
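
The summary columns defined in the footnote can be illustrated with a minimal sketch. It assumes that, for each positive test protein, the sequence identity (in percent) to its closest positive and closest negative training protein has already been computed. The function name `summarize`, the input layout, the use of the population standard deviation, and the reading of "within 10%" as a gap of at most 10 percentage points are all assumptions made for illustration, not details confirmed by the source.

```python
from statistics import mean, pstdev


def summarize(pairs, margin=10.0):
    """Summarize (SICP, SICN) pairs, one pair per positive test protein.

    pairs  -- list of (sicp, sicn) tuples, both in percent (hypothetical layout)
    margin -- assumed NNP threshold in percentage points
    """
    sicp = [p for p, _ in pairs]
    sicn = [n for _, n in pairs]
    # Assumed NNP rule: the closest negative is no more than `margin`
    # percentage points below the closest positive.
    nnp = sum(1 for p, n in pairs if p - n <= margin)
    return {
        "avg_sicp": mean(sicp),
        "std_sicp": pstdev(sicp),  # population std; the source does not say which
        "avg_sicn": mean(sicn),
        "std_sicn": pstdev(sicn),
        "nnp": nnp,
    }


# Made-up identities for three positive test proteins (not data from the table).
example = [(90.0, 45.0), (80.0, 75.0), (70.0, 30.0)]
stats = summarize(example)
print(stats)
```

In this toy input only the second protein counts as a near negative positive, since its closest negative (75%) lies within 10 points of its closest positive (80%). The NPN column would presumably be computed symmetrically over the negative test proteins, but the footnote does not spell that rule out.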