An adaptive multi-modal hybrid model for classifying thyroid nodules by combining ultrasound and infrared thermal images

Table 4 Comparison of the classification performance of different models with different encoders

Modality	Model	ACC	PRE	SEN	SPE	F1	F2
US	ResNet	0.7570	0.8316	0.5962	0.8958	0.6945	0.8140
IRT	ResNet	0.7098	0.7463	0.5660	0.8339	0.6438	0.7618
US+IRT	ResNet w/o AMCE	0.8444	0.9000	0.7472	0.9283	0.8165	0.8854
US	ViT	0.7395	0.7710	0.6226	0.8404	0.6889	0.7854
IRT	ViT	0.7378	0.7138	0.7245	0.7492	0.7191	0.7441
US+IRT	ViT w/o AMCE	0.8357	0.8178	0.8302	0.8404	0.8240	0.8383
US	Hybrid	0.7640	0.8009	0.6528	0.8599	0.7193	0.8086
IRT	Hybrid	0.7483	0.8040	0.6038	0.8730	0.6897	0.8015
US+IRT	Hybrid w/o AMCE	0.8636	0.8880	0.8075	0.9121	0.8458	0.8891
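
The F1 column can be cross-checked against the PRE and SEN columns. The sketch below assumes that PRE denotes precision, SEN denotes sensitivity (recall), and F1 is the usual harmonic mean of the two; the table itself does not state these definitions. The helper names `f1_score` and `max_f1_deviation` are illustrative only. The F2 column is omitted because its definition is not given in this section.

```python
# Rows copied from Table 4: (modality, model, PRE, SEN, reported F1).
ROWS = [
    ("US", "ResNet", 0.8316, 0.5962, 0.6945),
    ("IRT", "ResNet", 0.7463, 0.5660, 0.6438),
    ("US+IRT", "ResNet w/o AMCE", 0.9000, 0.7472, 0.8165),
    ("US", "ViT", 0.7710, 0.6226, 0.6889),
    ("IRT", "ViT", 0.7138, 0.7245, 0.7191),
    ("US+IRT", "ViT w/o AMCE", 0.8178, 0.8302, 0.8240),
    ("US", "Hybrid", 0.8009, 0.6528, 0.7193),
    ("IRT", "Hybrid", 0.8040, 0.6038, 0.6897),
    ("US+IRT", "Hybrid w/o AMCE", 0.8880, 0.8075, 0.8458),
]


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed definition of F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def max_f1_deviation(rows=ROWS) -> float:
    """Largest absolute gap between the recomputed and the reported F1."""
    return max(abs(f1_score(pre, sen) - f1) for _, _, pre, sen, f1 in rows)


if __name__ == "__main__":
    for modality, model, pre, sen, reported in ROWS:
        recomputed = f1_score(pre, sen)
        print(f"{modality:<8}{model:<18}{recomputed:.4f}  (reported {reported:.4f})")
```

Under these assumptions, the recomputed values agree with the reported F1 column to within the rounding of the four-decimal inputs.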

ISSN: 1471-2105