Table 4 Performance results obtained for testing combinations of implicit and explicit compound and protein feature representations

From: A comparison of embedding aggregation strategies in drug–target interaction prediction

| Dataset | Aggregation | Combination | MSE | R\(^2\) | CI | Params (M) |
|---|---|---|---|---|---|---|
| DAVIS | Dot prod. | MLP–MLP (implicit) | 0.2804 | 0.6551 | 0.8653 | 5.2 |
| DAVIS | Dot prod. | MLP–MLP (implicit+explicit) | 0.2859 | 0.6484 | 0.8702 | 5.2 |
| DAVIS | Dot prod. | CNN–CNN (implicit+explicit) | 0.2690 | 0.6692 | 0.8752 | 2.3 |
| DAVIS | Tensor prod. | MLP–MLP (implicit) | 0.3008 | 0.6301 | 0.8654 | 9.6 |
| DAVIS | Tensor prod. | MLP–MLP (implicit+explicit) | 0.3138 | 0.6141 | 0.8632 | 9.6 |
| DAVIS | Tensor prod. | CNN–CNN (implicit+explicit) | 0.3394 | 0.5825 | 0.8574 | 1.3 |
| DAVIS | MLP | MLP–MLP (implicit) | 0.3104 | 0.6182 | 0.8585 | 8.3 |
| DAVIS | MLP | MLP–MLP (implicit+explicit) | 0.3217 | 0.6043 | 0.8554 | 8.3 |
| DAVIS | MLP | CNN–CNN (implicit+explicit) | 0.3135 | 0.6145 | 0.8580 | 2.0 |
| KIBA | Dot prod. | MLP–MLP (implicit) | 0.2481 | 0.6500 | 0.8326 | 7.2 |
| KIBA | Dot prod. | MLP–MLP (implicit+explicit) | 0.2277 | 0.6789 | 0.8365 | 7.2 |
| KIBA | Dot prod. | CNN–CNN (implicit+explicit) | 0.2296 | 0.6761 | 0.8402 | 3.5 |
| KIBA | Tensor prod. | MLP–MLP (implicit) | 0.2625 | 0.6297 | 0.8276 | 12.9 |
| KIBA | Tensor prod. | MLP–MLP (implicit+explicit) | 0.2555 | 0.6396 | 0.8273 | 12.9 |
| KIBA | Tensor prod. | CNN–CNN (implicit+explicit) | 0.2631 | 0.6290 | 0.8221 | 2.6 |
| KIBA | MLP | MLP–MLP (implicit) | 0.2440 | 0.6559 | 0.8387 | 12.0 |
| KIBA | MLP | MLP–MLP (implicit+explicit) | 0.2590 | 0.6347 | 0.8285 | 12.0 |
| KIBA | MLP | CNN–CNN (implicit+explicit) | 0.2489 | 0.6490 | 0.8347 | 3.2 |

  1. The first combination uses two MLP branches that process dummy one-hot encoded features for the compounds and proteins. The second and third combinations represent two-level, two-branch architectures. The outer level is the classic two-branch architecture, whose embeddings can be combined using the three aggregation strategies we investigate. The two outer branches are themselves two-branch models that utilize both explicit and implicit features. For every combination, we report the average MSE, R\(^2\), CI, and number of trainable parameters of the top-5 performing configurations (lowest overall loss)
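The three aggregation strategies compared in the table (dot product, tensor product, MLP) can be sketched as follows. This is a minimal NumPy illustration of how each strategy combines a compound embedding with a protein embedding; the embedding dimensions and the MLP weights are placeholders, not the configurations evaluated in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical compound and protein embeddings (128-d is illustrative only).
drug = rng.standard_normal(128)
prot = rng.standard_normal(128)

def dot_product(d, p):
    """Dot-product aggregation: a single scalar affinity score."""
    return float(d @ p)

def tensor_product(d, p):
    """Tensor-product aggregation: the flattened outer product of the two
    embeddings, typically fed to a learned output head (omitted here)."""
    return np.outer(d, p).ravel()

def mlp_aggregate(d, p, w1, b1, w2, b2):
    """MLP aggregation: concatenate the embeddings and pass them through a
    small feed-forward network (weights here are random placeholders)."""
    h = np.maximum(0.0, np.concatenate([d, p]) @ w1 + b1)  # ReLU hidden layer
    return float(h @ w2 + b2)

# Random placeholder weights for the MLP head (256 -> 64 -> 1).
w1 = rng.standard_normal((256, 64)) * 0.1
b1 = np.zeros(64)
w2 = rng.standard_normal(64) * 0.1
b2 = 0.0

score_dot = dot_product(drug, prot)                     # scalar
feat_tensor = tensor_product(drug, prot)                # shape (16384,)
score_mlp = mlp_aggregate(drug, prot, w1, b1, w2, b2)   # scalar
```

The parameter counts in the table reflect this difference in mechanism: the tensor product inflates the interaction representation quadratically (here 128 × 128 = 16,384 features before the output head), while the dot product adds no trainable parameters of its own.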