From: Efficient use of unlabeled data for protein sequence classification: a comparative study
Method | Error | Top-5 Error | Balanced Error | Top-5 Balanced Error | F1 | Top-5 F1 |
---|---|---|---|---|---|---|
Without clustering | | | | | | |
full seq. | 50.16 | 21.82 | 67.17 | 32.55 | 37.43 | 71.40 |
region | 42.83 | 13.68 | 61.43 | 22.63 | 40.36 | 79.19 |
no tails (full seq.) | 50.16 | 21.82 | 71.81 | 32.59 | 30.17 | 69.12 |
max. length (full seq.) | 52.44 | 24.43 | 77.31 | 39.17 | 23.98 | 65.22 |
With clustering | | | | | | |
full seq. | 50.33 | 19.71 | 70.04 | 27.21 | 32.10 | 75.03 |
region | 40.88 | 13.68 | 57.86 | 22.82 | 47.54 | 79.03 |
no tails (full seq.) | 48.37 | 20.68 | 69.83 | 32.27 | 31.48 | 70.03 |
max. length (full seq.) | 52.44 | 23.29 | 77.05 | 36.52 | 26.84 | 68.02 |