Table 2 Error rates (estimated using the 0.632+ bootstrap method with 200 bootstrap samples) for the microarray data sets using different methods. The results shown for variable selection with random forest used ntree = 2000, fraction.dropped = 0.2, mtryFactor = 1. Note that the OOB error used for variable selection is not the error reported in this table; the error rate reported is obtained using bootstrap on the complete variable selection process. The column "no info" denotes the minimal error we can make if we use no information from the genes (i.e., we always bet on the most frequent class).

From: Gene selection and classification of microarray data using random forest

| Data set | no info | SVM | KNN | DLDA | SC.l | SC.s | NN.vs | random forest | random forest var.sel. (s.e. 0) | random forest var.sel. (s.e. 1) |
|---|---|---|---|---|---|---|---|---|---|---|
| Leukemia | 0.289 | 0.014 | 0.029 | 0.020 | 0.025 | 0.062 | 0.056 | 0.051 | 0.087 | 0.075 |
| Breast 2 cl. | 0.429 | 0.325 | 0.337 | 0.331 | 0.324 | 0.326 | 0.337 | 0.342 | 0.337 | 0.332 |
| Breast 3 cl. | 0.537 | 0.380 | 0.449 | 0.370 | 0.396 | 0.401 | 0.424 | 0.351 | 0.346 | 0.364 |
| NCI 60 | 0.852 | 0.256 | 0.317 | 0.286 | 0.256 | 0.246 | 0.237 | 0.252 | 0.327 | 0.353 |
| Adenocar. | 0.158 | 0.203 | 0.174 | 0.194 | 0.177 | 0.179 | 0.181 | 0.125 | 0.185 | 0.207 |
| Brain | 0.762 | 0.138 | 0.174 | 0.183 | 0.163 | 0.159 | 0.194 | 0.154 | 0.216 | 0.216 |
| Colon | 0.355 | 0.147 | 0.152 | 0.137 | 0.123 | 0.122 | 0.158 | 0.127 | 0.159 | 0.177 |
| Lymphoma | 0.323 | 0.010 | 0.008 | 0.021 | 0.028 | 0.033 | 0.04 | 0.009 | 0.047 | 0.042 |
| Prostate | 0.490 | 0.064 | 0.100 | 0.149 | 0.088 | 0.089 | 0.081 | 0.077 | 0.061 | 0.064 |
| Srbct | 0.635 | 0.017 | 0.023 | 0.011 | 0.012 | 0.025 | 0.031 | 0.021 | 0.039 | 0.038 |
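
The caption refers to two computations that a short sketch may make more concrete: the "no info" error (always predicting the most frequent class) and the 0.632+ bootstrap estimate. The Python below is illustrative only and is not the authors' code. The function names `no_info_error`, `gamma_no_information` and `err632plus` are hypothetical. The 0.632+ combination follows the commonly stated weighted form, and clamping the relative overfitting rate to [0, 1] is a simplifying assumption made here. Note that the no-information rate `gamma` used inside 0.632+ is a different quantity from the "no info" column of the table.

```python
from collections import Counter


def no_info_error(labels):
    """Error rate of always predicting the most frequent class
    (the quantity the caption calls "no info")."""
    if not labels:
        raise ValueError("labels must be non-empty")
    majority_count = max(Counter(labels).values())
    return 1.0 - majority_count / len(labels)


def gamma_no_information(true_labels, predicted_labels):
    """No-information error rate used by 0.632+:
    gamma = sum_k p_k * (1 - q_k), where p_k is the observed proportion of
    class k and q_k is the proportion of predictions equal to class k."""
    n = len(true_labels)
    if n == 0 or n != len(predicted_labels):
        raise ValueError("label sequences must be non-empty and equal length")
    p = Counter(true_labels)
    q = Counter(predicted_labels)
    return sum((p[k] / n) * (1.0 - q.get(k, 0) / n) for k in p)


def err632plus(resub_err, boot_err, gamma):
    """Combine the resubstitution error and the leave-one-out bootstrap
    error into a 0.632+ estimate.

    Assumption: the relative overfitting rate R is clamped to [0, 1].
    """
    if gamma > resub_err and boot_err > resub_err:
        r = (boot_err - resub_err) / (gamma - resub_err)
        r = min(max(r, 0.0), 1.0)
    else:
        r = 0.0  # no measurable overfitting: fall back to plain 0.632 weights
    w = 0.632 / (1.0 - 0.368 * r)
    return (1.0 - w) * resub_err + w * boot_err


if __name__ == "__main__":
    # Hypothetical two-class labels (27 vs. 11 samples), not taken from the paper.
    labels = ["A"] * 27 + ["B"] * 11
    print(round(no_info_error(labels), 3))
    # Hypothetical error values, for illustration only.
    print(round(err632plus(resub_err=0.0, boot_err=0.1, gamma=0.5), 4))
```

In this sketch the weight `w` moves from 0.632 (no overfitting) towards 1 (severe overfitting), so the estimate leans more heavily on the out-of-sample bootstrap error as the gap between it and the resubstitution error grows.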