
Table 3 Performance of the classifiers on the Miller data set without feature selection

From: SMOTE for high-dimensional class-imbalanced data

  

| Method | Measure | ER: 1-NN | ER: 3-NN | ER: 5-NN | Grade: 1-NN | Grade: 3-NN | Grade: 5-NN |
|---|---|---|---|---|---|---|---|
| NC (CUT-OFF) | PA | 0.838 | 0.862 | 0.874 (0.777) | 0.779 | 0.839 | 0.835 (0.835) |
| | PA1 | 0.925 | 0.953 | 0.972 (0.789) | 0.897 | 0.954 | 0.949 (0.897) |
| | PA2 | 0.294 | 0.294 | 0.265 (0.706) | 0.352 | 0.426 | 0.426 (0.611) |
| | AUC | 0.610 | 0.692 | 0.772 (0.772) | 0.625 | 0.769 | 0.816 (0.816) |
| | G-mean | 0.522 | 0.529 | 0.507 (0.746) | 0.562 | 0.637 | 0.636 (0.741) |
| SMOTE | PA | 0.271 | 0.249 | 0.249 | 0.364 | 0.373 | 0.384 |
| | | (0.012) | (0.013) | (0.012) | (0.014) | (0.015) | (0.016) |
| | PA1 | 0.156 | 0.130 | 0.132 | 0.194 | 0.209 | 0.223 |
| | | (0.014) | (0.014) | (0.013) | (0.018) | (0.020) | (0.020) |
| | PA2 | 0.996 | 0.992 | 0.984 | 0.979 | 0.966 | 0.966 |
| | | (0.010) | (0.015) | (0.017) | (0.012) | (0.013) | (0.011) |
| | AUC | 0.576 | 0.632 | 0.671 | 0.586 | 0.680 | 0.736 |
| | | (0.009) | (0.014) | (0.013) | (0.011) | (0.013) | (0.010) |
| | G-mean | 0.393 | 0.359 | 0.360 | 0.435 | 0.449 | 0.464 |
| | | (0.018) | (0.020) | (0.019) | (0.020) | (0.021) | (0.021) |
| UNDER | PA | 0.625 | 0.685 | 0.691 | 0.766 | 0.836 | 0.840 |
| | | (0.065) | (0.056) | (0.049) | (0.017) | (0.012) | (0.012) |
| | PA1 | 0.742 | 0.841 | 0.863 | 0.798 | 0.871 | 0.878 |
| | | (0.017) | (0.013) | (0.012) | (0.016) | (0.011) | (0.012) |
| | PA2 | 0.761 | 0.866 | 0.890 | 0.649 | 0.709 | 0.700 |
| | | (0.017) | (0.013) | (0.010) | (0.051) | (0.039) | (0.028) |
| | AUC | 0.693 | 0.822 | 0.861 | 0.723 | 0.833 | 0.850 |
| | | (0.033) | (0.021) | (0.021) | (0.027) | (0.015) | (0.008) |
| | G-mean | 0.689 | 0.770 | 0.784 | 0.719 | 0.786 | 0.784 |
| | | (0.036) | (0.031) | (0.029) | (0.029) | (0.022) | (0.017) |

  1. Overall predictive accuracy (PA), predictive accuracy for Class 1 (PA1), predictive accuracy for Class 2 (PA2), area under the ROC curve (AUC) and G-mean for 1-NN, 3-NN and 5-NN on the Miller data set under different methods of training set manipulation: no correction (NC; the values in brackets are the results obtained by adjusting the threshold for 5-NN, CUT-OFF), SMOTE and undersampling (UNDER). The classifiers predict estrogen receptor status (ER) and tumor grade (Grade). All variables were considered when training the classifiers.
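
The relationship between the reported measures can be illustrated with a minimal Python sketch. It assumes the usual definition of the G-mean as the geometric mean of the two class-specific accuracies, sqrt(PA1 × PA2). The function names and the toy labels are hypothetical and are not taken from the article. Because the tabulated values are rounded (and, for the resampling methods, presumably averaged over repeated runs), recomputing the G-mean from PA1 and PA2 should only approximately reproduce the reported figure.

```python
import math


def class_accuracies(y_true, y_pred, classes=(1, 2)):
    """Return overall accuracy (PA) and per-class accuracy (PA1, PA2)."""
    overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    per_class = {}
    for c in classes:
        # Indices of the samples whose true label is class c.
        idx = [i for i, t in enumerate(y_true) if t == c]
        per_class[c] = sum(y_pred[i] == c for i in idx) / len(idx)
    return overall, per_class


def g_mean(pa1, pa2):
    """Geometric mean of the two class-specific accuracies (assumed definition)."""
    return math.sqrt(pa1 * pa2)


# Hypothetical toy labels: 8 samples from Class 1 and 2 from Class 2.
y_true = [1, 1, 1, 1, 1, 1, 1, 1, 2, 2]
y_pred = [1, 1, 1, 1, 1, 1, 1, 2, 2, 1]
pa, per_class = class_accuracies(y_true, y_pred)
toy_g = g_mean(per_class[1], per_class[2])

# Values from the table: NC, ER, 1-NN (PA1 = 0.925, PA2 = 0.294).
table_g = g_mean(0.925, 0.294)
print(f"toy PA={pa:.3f}, PA1={per_class[1]:.3f}, PA2={per_class[2]:.3f}, G-mean={toy_g:.3f}")
print(f"G-mean recomputed from the table: {table_g:.4f} (reported: 0.522)")
```

The toy example also shows why PA alone can mislead on imbalanced data: the overall accuracy stays high while the minority-class accuracy, and with it the G-mean, is much lower.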