Table 4 Comparison of the classification performance of different models with different encoders

From: An adaptive multi-modal hybrid model for classifying thyroid nodules by combining ultrasound and infrared thermal images

| Modality | Model | ACC | PRE | SEN | SPE | F1 | F2 |
|---|---|---|---|---|---|---|---|
| US | ResNet | 0.7570 | 0.8316 | 0.5962 | 0.8958 | 0.6945 | 0.8140 |
| IRT | ResNet | 0.7098 | 0.7463 | 0.5660 | 0.8339 | 0.6438 | 0.7618 |
| US+IRT | ResNet w/o AMCE | 0.8444 | **0.9000** | 0.7472 | **0.9283** | 0.8165 | 0.8854 |
| US | ViT | 0.7395 | 0.7710 | 0.6226 | 0.8404 | 0.6889 | 0.7854 |
| IRT | ViT | 0.7378 | 0.7138 | 0.7245 | 0.7492 | 0.7191 | 0.7441 |
| US+IRT | ViT w/o AMCE | 0.8357 | 0.8178 | **0.8302** | 0.8404 | 0.8240 | 0.8383 |
| US | Hybrid | 0.7640 | 0.8009 | 0.6528 | 0.8599 | 0.7193 | 0.8086 |
| IRT | Hybrid | 0.7483 | 0.8040 | 0.6038 | 0.8730 | 0.6897 | 0.8015 |
| US+IRT | Hybrid w/o AMCE | **0.8636** | 0.8880 | 0.8075 | 0.9121 | **0.8458** | **0.8891** |

  1. Bold values indicate the best results achieved in each indicator
  2. “w/o” indicates “without”; “AMCE” indicates the adaptive cross-modal encoder
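
As a quick sanity check on how the columns relate, the sketch below recomputes F1 from the PRE and SEN values of the three US+IRT rows. It assumes PRE is precision, SEN is sensitivity (recall), and F1 is their harmonic mean; these are the conventional definitions, which this table does not state itself. The helper name `f1_score` is illustrative and not taken from the paper. The F2 column is left out because its definition is not given here.

```python
def f1_score(precision: float, sensitivity: float) -> float:
    """Harmonic mean of precision and sensitivity (assumed definition of F1)."""
    if precision + sensitivity == 0:
        return 0.0
    return 2 * precision * sensitivity / (precision + sensitivity)


# (PRE, SEN, reported F1) copied from the US+IRT rows of Table 4
rows = {
    "ResNet w/o AMCE": (0.9000, 0.7472, 0.8165),
    "ViT w/o AMCE": (0.8178, 0.8302, 0.8240),
    "Hybrid w/o AMCE": (0.8880, 0.8075, 0.8458),
}

for name, (pre, sen, reported) in rows.items():
    computed = f1_score(pre, sen)
    print(f"{name}: computed F1 = {computed:.4f}, reported F1 = {reported:.4f}")
```

Under these assumptions, the recomputed values agree with the reported F1 column to within the rounding of the four-decimal inputs.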