Table 2. The statistics of all sites in the benchmark datasets

From: Protein–protein interaction site prediction by model ensembling with hybrid feature and self-attention

| Dataset | Sequences: number | Sequences: average length | Sequences: length ≤ 200 (%) | Interaction sites: number | Interaction sites: proportion (%) | Non-interaction sites | All sites |
|---|---|---|---|---|---|---|---|
| Dset_186 | 186 | 195 | 65.05 | 5,517 | 15.23 | 30,702 | 36,219 |
| Dset_72 | 72 | 252 | 56.94 | 1,923 | 10.6 | 16,217 | 18,140 |
| PDBset_164 | 164 | 205 | 60.37 | 6,096 | 18.1 | 27,585 | 33,681 |
| Dset_186_72_PDB164 | 422 | 209 | 61.85 | 13,536 | 15.37 | 74,504 | 88,040 |
| Dset_448 | 448 | 260 | 35.94 | 15,810 | 13.57 | 100,690 | 116,500 |
| The large dataset | 9,982 | 426 | 28.01 | 427,687 | 10.05 | 3,826,511 | 4,254,198 |
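
The proportion column appears to be the number of interaction sites divided by the number of all sites, with all sites being the sum of interaction and non-interaction sites. The following is a minimal sketch of that arithmetic, assuming this reading of the columns is correct; the names `TABLE2`, `interaction_proportion`, and `check_table` are hypothetical and do not come from the article.

```python
# Counts copied from Table 2, per dataset:
# (interaction sites, non-interaction sites, all sites, reported proportion in %)
TABLE2 = {
    "Dset_186": (5517, 30702, 36219, 15.23),
    "Dset_72": (1923, 16217, 18140, 10.6),
    "PDBset_164": (6096, 27585, 33681, 18.1),
    "Dset_186_72_PDB164": (13536, 74504, 88040, 15.37),
    "Dset_448": (15810, 100690, 116500, 13.57),
    "The large dataset": (427687, 3826511, 4254198, 10.05),
}


def interaction_proportion(interaction, total):
    """Percentage of all sites that are interaction sites (assumed definition)."""
    return 100.0 * interaction / total


def check_table(rows, tol=0.01):
    """Return the names of rows whose counts or proportion disagree with the table."""
    inconsistent = []
    for name, (pos, neg, total, reported) in rows.items():
        sums_ok = pos + neg == total
        prop_ok = abs(interaction_proportion(pos, total) - reported) <= tol
        if not (sums_ok and prop_ok):
            inconsistent.append(name)
    return inconsistent


if __name__ == "__main__":
    for name, (pos, neg, total, _) in TABLE2.items():
        print(f"{name}: {interaction_proportion(pos, total):.2f}% interaction sites")
    print("Inconsistent rows:", check_table(TABLE2))
```

Under this assumed definition, every row is internally consistent to within rounding, and interaction sites make up roughly 10 to 18% of all sites in each dataset.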