From: Machine learning-based approaches for ubiquitination site prediction in human proteins
| Architecture | LSTM | BERT-small | BERT-tiny | Nystromformer | SqueezeBERT |
| --- | --- | --- | --- | --- | --- |
| Learning rate | 8e−04 | 8e−05 | 8e−05 | 5e−05 | 1e−04 |
| Warmup steps | 1000 | 1000 | 600 | 1000 | 1000 |
| Scheduler | Cosine | Cosine | Cosine | Cosine | Cosine |
| Decouple weight decay | False | True | True | True | True |
| Weight decay | 1.2e−06 | 1e−03 | 1e−04 | 1e−02 | 1e−02 |
| Batch size | 512 | 512 | 512 | 512 | 512 |
| Gradient clip | 5 | 2 | 5 | 2 | 2 |
| Label smoothing | 0.0 | 0.2 | 0.0 | 0.1 | 0.1 |
| Mixed precision | True | True | True | True | True |
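
As a rough illustration of how the settings in the table above might be carried into a training script, the sketch below stores each column as a configuration object and computes a learning rate for a given step. The table specifies only "Cosine" and a warmup step count, so the linear warmup shape, the decay to zero, and the `total_steps` argument are assumptions; `TrainConfig`, `CONFIGS`, and `lr_at_step` are hypothetical names, not part of the original work. The table also does not say whether gradient clipping is by norm or by value, so the sketch stores the threshold without interpreting it.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """One column of the hyperparameter table (hypothetical container)."""
    learning_rate: float
    warmup_steps: int
    decoupled_weight_decay: bool
    weight_decay: float
    gradient_clip: float
    label_smoothing: float
    scheduler: str = "cosine"      # identical for all five architectures
    batch_size: int = 512          # identical for all five architectures
    mixed_precision: bool = True   # identical for all five architectures


# Values copied from the table, one entry per architecture.
CONFIGS = {
    "LSTM": TrainConfig(8e-4, 1000, False, 1.2e-6, 5.0, 0.0),
    "BERT-small": TrainConfig(8e-5, 1000, True, 1e-3, 2.0, 0.2),
    "BERT-tiny": TrainConfig(8e-5, 600, True, 1e-4, 5.0, 0.0),
    "Nystromformer": TrainConfig(5e-5, 1000, True, 1e-2, 2.0, 0.1),
    "SqueezeBERT": TrainConfig(1e-4, 1000, True, 1e-2, 2.0, 0.1),
}


def lr_at_step(cfg: TrainConfig, step: int, total_steps: int) -> float:
    """Learning rate at `step`.

    Assumed shape: linear warmup from 0 to the peak rate over
    cfg.warmup_steps, then cosine decay to 0 at total_steps.
    total_steps is not given in the table and must be supplied.
    """
    if step < cfg.warmup_steps:
        return cfg.learning_rate * step / cfg.warmup_steps
    decay_steps = max(1, total_steps - cfg.warmup_steps)
    progress = min(1.0, (step - cfg.warmup_steps) / decay_steps)
    return cfg.learning_rate * 0.5 * (1.0 + math.cos(math.pi * progress))
```

Under these assumptions, `lr_at_step(CONFIGS["BERT-tiny"], 600, total_steps)` returns the peak rate from the table, since step 600 is where warmup ends for that model. The `decoupled_weight_decay` flag distinguishes AdamW-style decay (applied directly to the weights) from L2-style decay folded into the gradient, which per the table is what the LSTM uses.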