From: Machine learning-based approaches for ubiquitination site prediction in human proteins
Window size | Type 1 (duplicate sequences and labels): all | Type 1: positive | Type 1: negative | Type 2 (duplicate sequences): all | Type 2: positive | Type 2: negative | After removing Types 1 and 2: all | After removing Types 1 and 2: positive | After removing Types 1 and 2: negative |
---|---|---|---|---|---|---|---|---|---|
5 | 4861 | 47 | 4814 | 5492 | 398 | 5094 | 17,547 | 1589 | 15,958 |
7 | 172 | 6 | 166 | 180 | 10 | 170 | 20,506 | 1610 | 18,896 |
9 | 97 | 0 | 97 | 99 | 1 | 98 | 20,548 | 1613 | 18,935 |
15 | 57 | 0 | 57 | 59 | 1 | 58 | 20,571 | 1613 | 18,958 |
21 | 46 | 0 | 46 | 48 | 1 | 47 | 20,578 | 1613 | 18,965 |
27 | 42 | 0 | 42 | 44 | 1 | 43 | 20,580 | 1613 | 18,967 |
33 | 32 | 0 | 32 | 34 | 1 | 33 | 20,585 | 1613 | 18,972 |
45 | 20 | 0 | 20 | 20 | 0 | 20 | 20,592 | 1613 | 18,979 |
55 | 12 | 0 | 12 | 12 | 0 | 12 | 20,596 | 1613 | 18,983 |
77 | 8 | 0 | 8 | 8 | 0 | 8 | 20,598 | 1613 | 18,985 |
99 | 2 | 0 | 2 | 2 | 0 | 2 | 20,601 | 1613 | 18,988 |
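
The counts above fall sharply as the window grows, because longer windows around a site are less likely to coincide by chance. The following is a minimal sketch of how such duplicates might be counted, not the authors' procedure. It assumes that samples are odd-length windows centered on lysine (K) residues, that sequence ends are padded with a placeholder `X`, that Type 1 counts redundant copies of an identical (window, label) pair, and that Type 2 counts redundant copies of an identical window regardless of label. The function names and the toy sequence are hypothetical, and the sketch is not expected to reproduce the table's numbers.

```python
from collections import Counter


def extract_windows(sequence, positive_sites, window_size, pad="X"):
    """Return (window, label) for every lysine in `sequence`.

    `positive_sites` holds 0-based indices of modified lysines.
    `window_size` is assumed to be odd; `pad` is an assumed placeholder.
    """
    half = window_size // 2
    padded = pad * half + sequence + pad * half
    samples = []
    for i, residue in enumerate(sequence):
        if residue != "K":
            continue
        # Residue i sits at padded[i + half], so this slice is centered on it.
        window = padded[i:i + window_size]
        label = 1 if i in positive_sites else 0
        samples.append((window, label))
    return samples


def count_duplicates(samples):
    """Count redundant copies under the two assumed duplicate definitions."""
    pair_counts = Counter(samples)
    window_counts = Counter(window for window, _ in samples)
    type1 = sum(c - 1 for c in pair_counts.values())    # same window, same label
    type2 = sum(c - 1 for c in window_counts.values())  # same window, any label
    return type1, type2


def deduplicate(samples):
    """Keep the first sample seen for each distinct window (an assumed policy)."""
    seen = set()
    kept = []
    for window, label in samples:
        if window in seen:
            continue
        seen.add(window)
        kept.append((window, label))
    return kept


if __name__ == "__main__":
    toy_sequence = "AKAKAKA"  # hypothetical sequence, not from the dataset
    for size in (3, 5):
        samples = extract_windows(toy_sequence, {1}, size)
        print(size, count_duplicates(samples), len(deduplicate(samples)))
```

On this toy sequence, a window of 3 makes all three lysine windows identical, while a window of 5 makes them all distinct, mirroring the trend in the table. Keeping the first occurrence in `deduplicate` is only one possible policy; a pipeline could instead drop every sample whose window appears with conflicting labels.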