Table 1 Performance comparison across different numbers of labeled variants and different types of features. We trained SGAN models on labeled sets of different sizes (ranging from 250 to 4000 variants) and three groups of features: pathogenicity prediction scores, evidence-based scores, and ensemble scores (full features).

From: Model performance and interpretability of semi-supervised generative adversarial networks to predict oncogenic variants with unlabeled data

| Features | Training size | Accuracy | Precision | Recall | Specificity | F1 score | MCC | ROC AUC | PR AUC |
|---|---|---|---|---|---|---|---|---|---|
| Pathogenicity prediction scores | 250 | 0.631 | 0.339 | 0.757 | 0.597 | 0.468 | 0.29 | 0.732 | 0.413 |
| | 500 | 0.643 | 0.346 | 0.748 | 0.614 | 0.473 | 0.299 | 0.741 | 0.421 |
| | 1000 | 0.625 | 0.337 | 0.774 | 0.584 | 0.469 | 0.294 | 0.734 | 0.412 |
| | 2000 | 0.642 | 0.346 | 0.751 | 0.613 | 0.474 | 0.299 | 0.737 | 0.413 |
| | 4000 | 0.631 | 0.34 | 0.767 | 0.594 | 0.471 | 0.297 | 0.734 | 0.406 |
| Evidence-based scores | 250 | 0.727 | 0.415 | 0.667 | 0.743 | 0.512 | 0.355 | 0.815 | 0.543 |
| | 500 | 0.75 | 0.447 | 0.706 | 0.762 | 0.548 | 0.406 | 0.814 | 0.627 |
| | 1000 | 0.729 | 0.428 | 0.778 | 0.716 | 0.552 | 0.416 | 0.848 | 0.676 |
| | 2000 | 0.705 | 0.406 | 0.81 | 0.677 | 0.541 | 0.404 | 0.849 | 0.678 |
| | 4000 | 0.78 | 0.492 | 0.781 | 0.78 | 0.604 | 0.486 | 0.832 | 0.647 |
| Ensemble | 250 | 0.541 | 0.306 | 0.897 | 0.444 | 0.456 | 0.29 | 0.786 | 0.507 |
| | 500 | 0.637 | 0.355 | 0.85 | 0.579 | 0.501 | 0.352 | 0.828 | 0.617 |
| | 1000 | 0.676 | 0.386 | 0.864 | 0.625 | 0.534 | 0.402 | 0.854 | 0.688 |
| | 2000 | 0.653 | 0.371 | 0.889 | 0.589 | 0.524 | 0.392 | 0.859 | 0.691 |
| | 4000 | 0.673 | 0.383 | 0.858 | 0.622 | 0.53 | 0.395 | 0.854 | 0.686 |
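
The table does not state how each metric was computed. As a rough sketch, assuming the standard binary-classification definitions, the threshold-based columns (accuracy, precision, recall, specificity, F1 score, MCC) can be derived from confusion-matrix counts as shown below. The function names and the example counts are hypothetical and are not taken from the paper. ROC AUC and PR AUC depend on the full ranking of prediction scores and cannot be recovered from the counts alone.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Standard threshold-based metrics from binary confusion-matrix counts.

    Assumption: these are the usual textbook definitions; the source does
    not confirm that the reported values were computed exactly this way.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # sensitivity, true-positive rate
    specificity = tn / (tn + fp)       # true-negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * recall / (precision + recall)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "mcc": mcc,
    }


def f1_from_precision_recall(precision, recall):
    """F1 score as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, chosen only to illustrate the calculation.
example = classification_metrics(tp=80, fp=120, tn=180, fn=20)

# Consistency check against reported values: precision 0.339 and
# recall 0.757 (pathogenicity prediction scores, 250 labeled variants).
f1_check = f1_from_precision_recall(0.339, 0.757)
```

Under this definition, the reported precision and recall for the first row reproduce its reported F1 score of 0.468 to three decimals, which suggests the F1 column follows the harmonic-mean form. Because the published values are rounded, small differences in the last decimal are possible for other rows.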