From: Machine learning-based approaches for ubiquitination site prediction in human proteins
Window size | Type 1 (duplicate sequences and labels): all | Type 1: positive samples | Type 1: negative samples | Type 2 (duplicate sequences): all | Type 2: positive samples | Type 2: negative samples | After removing Types 1 and 2: all | After removing Types 1 and 2: positive samples | After removing Types 1 and 2: negative samples |
---|---|---|---|---|---|---|---|---|---|
5 | 143,420 | 3728 | 139,692 | 154,835 | 12,125 | 142,710 | 84,861 | 13,122 | 71,739 |
7 | 6556 | 213 | 6343 | 7150 | 517 | 6633 | 191,493 | 15,066 | 176,427 |
9 | 2619 | 134 | 2485 | 2728 | 191 | 2537 | 193,891 | 15,106 | 178,785 |
15 | 1582 | 87 | 1495 | 1623 | 109 | 1514 | 194,544 | 15,133 | 179,411 |
21 | 1336 | 72 | 1264 | 1361 | 85 | 1276 | 194,699 | 15,141 | 179,558 |
27 | 1212 | 60 | 1152 | 1233 | 71 | 1162 | 194,776 | 15,147 | 179,629 |
33 | 1152 | 54 | 1098 | 1136 | 59 | 1077 | 194,829 | 15,151 | 179,678 |
45 | 1025 | 48 | 977 | 1034 | 53 | 981 | 194,888 | 15,153 | 179,735 |
55 | 969 | 44 | 925 | 978 | 49 | 929 | 194,920 | 15,155 | 179,765 |
77 | 854 | 37 | 817 | 863 | 42 | 821 | 194,992 | 15,159 | 179,833 |
99 | 771 | 34 | 737 | 776 | 39 | 737 | 195,047 | 15,161 | 179,886 |