Table 4 Performance results obtained for testing combinations of implicit and explicit compound and protein feature representations

From: A comparison of embedding aggregation strategies in drug–target interaction prediction

| Dataset | Aggregation | Combination | MSE | R\(^2\) | CI | Params (M) |
|---|---|---|---|---|---|---|
| DAVIS | Dot prod. | MLP–MLP (implicit) | 0.2804 | 0.6551 | 0.8653 | 5.2 |
| DAVIS | Dot prod. | MLP–MLP (implicit+explicit) | 0.2859 | 0.6484 | 0.8702 | 5.2 |
| DAVIS | Dot prod. | CNN–CNN (implicit+explicit) | 0.2690 | 0.6692 | 0.8752 | 2.3 |
| DAVIS | Tensor prod. | MLP–MLP (implicit) | 0.3008 | 0.6301 | 0.8654 | 9.6 |
| DAVIS | Tensor prod. | MLP–MLP (implicit+explicit) | 0.3138 | 0.6141 | 0.8632 | 9.6 |
| DAVIS | Tensor prod. | CNN–CNN (implicit+explicit) | 0.3394 | 0.5825 | 0.8574 | 1.3 |
| DAVIS | MLP | MLP–MLP (implicit) | 0.3104 | 0.6182 | 0.8585 | 8.3 |
| DAVIS | MLP | MLP–MLP (implicit+explicit) | 0.3217 | 0.6043 | 0.8554 | 8.3 |
| DAVIS | MLP | CNN–CNN (implicit+explicit) | 0.3135 | 0.6145 | 0.8580 | 2.0 |
| KIBA | Dot prod. | MLP–MLP (implicit) | 0.2481 | 0.6500 | 0.8326 | 7.2 |
| KIBA | Dot prod. | MLP–MLP (implicit+explicit) | 0.2277 | 0.6789 | 0.8365 | 7.2 |
| KIBA | Dot prod. | CNN–CNN (implicit+explicit) | 0.2296 | 0.6761 | 0.8402 | 3.5 |
| KIBA | Tensor prod. | MLP–MLP (implicit) | 0.2625 | 0.6297 | 0.8276 | 12.9 |
| KIBA | Tensor prod. | MLP–MLP (implicit+explicit) | 0.2555 | 0.6396 | 0.8273 | 12.9 |
| KIBA | Tensor prod. | CNN–CNN (implicit+explicit) | 0.2631 | 0.6290 | 0.8221 | 2.6 |
| KIBA | MLP | MLP–MLP (implicit) | 0.2440 | 0.6559 | 0.8387 | 12.0 |
| KIBA | MLP | MLP–MLP (implicit+explicit) | 0.2590 | 0.6347 | 0.8285 | 12.0 |
| KIBA | MLP | CNN–CNN (implicit+explicit) | 0.2489 | 0.6490 | 0.8347 | 3.2 |

  1. The first combination uses two MLP branches that process dummy one-hot encoded features for the compounds and proteins. The second and third combinations represent two-level, two-branch architectures. The outer level is the classic two-branch architecture, whose embeddings can be combined using the three aggregation strategies we investigate. The two outer branches are themselves two-branch models that utilize both explicit and implicit features. For every combination, we report the average MSE, R\(^2\), CI, and number of trainable parameters of the top-5 performing configurations (lowest overall loss)
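The three aggregation strategies compared in the table (dot product, tensor product, MLP) can be sketched as follows. This is a minimal NumPy illustration of how each strategy combines a compound embedding with a protein embedding; the embedding dimensions and the MLP weights are placeholders, not the configurations evaluated in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical compound and protein embeddings (128-d is illustrative only).
drug = rng.standard_normal(128)
prot = rng.standard_normal(128)

def dot_product(d, p):
    """Dot-product aggregation: a single scalar affinity score."""
    return float(d @ p)

def tensor_product(d, p):
    """Tensor-product aggregation: the flattened outer product of the two
    embeddings, typically fed to a learned output head (omitted here)."""
    return np.outer(d, p).ravel()

def mlp_aggregate(d, p, w1, b1, w2, b2):
    """MLP aggregation: concatenate the embeddings and pass them through a
    small feed-forward network (weights here are random placeholders)."""
    h = np.maximum(0.0, np.concatenate([d, p]) @ w1 + b1)  # ReLU hidden layer
    return float(h @ w2 + b2)

# Random placeholder weights for the MLP head (256 -> 64 -> 1).
w1 = rng.standard_normal((256, 64)) * 0.1
b1 = np.zeros(64)
w2 = rng.standard_normal(64) * 0.1
b2 = 0.0

score_dot = dot_product(drug, prot)                     # scalar
feat_tensor = tensor_product(drug, prot)                # shape (16384,)
score_mlp = mlp_aggregate(drug, prot, w1, b1, w2, b2)   # scalar
```

The parameter counts in the table reflect this difference in mechanism: the tensor product inflates the interaction representation quadratically (here 128 × 128 = 16,384 features before the output head), while the dot product adds no trainable parameters of its own.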