Table 4 Computing times.

From: A comparison of random forest and its Gini importance with standard chemometric methods for the feature selection and classification of spectral data

  

                       no selection            univariate selection    multivariate selection  multivariate selection
                                                                       (Gini importance)       (PLS/PC)
Data set     Subset       PLS      PC      RF     PLS      PC      RF     PLS      PC      RF     PLS      PC      RF
MIR BSE      orig         5.7    11.1     9.9    46.4    53.9    46.8    88.8    97.0    91.5    87.9    92.4    88.0
             binned       2.8     3.2     3.1    13.6    14.7    15.9    26.1    27.1    29.0    28.7    29.6    31.5
MIR wine     French       8.8     7.8     2.4    26.6    21.8     7.7    47.0    45.9    33.5    17.2    14.7     7.4
             grape       12.1    10.3     2.5    28.9    22.3     8.0    54.0    47.6    33.5    15.8    13.1     6.5
NMR tumor    all          0.3     0.4     0.4     1.4     1.2     2.1     2.9     2.7     3.6     3.6     3.4     4.3
             center       0.2     0.2     0.2     1.1     0.8     1.1     2.2     1.9     2.1     2.1     1.8     2.0
NMR candida  1            4.6     8.8     7.7    22.4    41.2    37.1    43.5    62.5    61.1    59.8    78.4    75.4
             2            3.7     4.8     3.8    18.0    22.0    19.4    34.5    38.5    37.3    36.3    40.3    37.9
             3            3.7     4.7     3.7    17.4    20.1    17.9    33.4    36.0    34.7    34.6    37.8    35.1
             4            3.9     5.1     4.8    18.7    23.4    24.3    36.0    40.5    60.5    41.6    46.2    47.0
             5            3.5     3.9     2.6    31.9    32.4    27.0    62.6    63.0    60.0    58.3    43.4    38.5

The table reports the runtime of the different feature selection and classification approaches on the different data sets (on a 2 GHz personal computer with 2 GB memory). Values are given in minutes for a ten-fold cross-validation, with the parameterisations used for the results shown in Tables 2 and 3. For all methods, univariate feature selection takes about five times as long as classifying the same data set without feature selection. Both multivariate feature selection approaches require approximately the same amount of time for a given data set and classifier. Their computing time is no more than twice that of a recursive feature elimination based on a univariate feature importance measure.
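
As a rough illustration of how the runtime ratios described in this note could be checked against the table, the following Python sketch transcribes the Table 4 values and computes the ratio between any two selection schemes for a given data set and classifier. The names `TIMES`, `SCHEMES`, `CLASSIFIERS`, `runtime` and `ratio`, as well as the short scheme labels, are illustrative assumptions and not part of the original study.

```python
# Runtimes in minutes, transcribed from Table 4.
# Each row holds 12 values: (PLS, PC, RF) for no selection, univariate
# selection, Gini-importance selection and PLS/PC-based selection.
TIMES = {
    ("MIR BSE", "orig"):   [5.7, 11.1, 9.9, 46.4, 53.9, 46.8, 88.8, 97.0, 91.5, 87.9, 92.4, 88.0],
    ("MIR BSE", "binned"): [2.8, 3.2, 3.1, 13.6, 14.7, 15.9, 26.1, 27.1, 29.0, 28.7, 29.6, 31.5],
    ("MIR wine", "French"): [8.8, 7.8, 2.4, 26.6, 21.8, 7.7, 47.0, 45.9, 33.5, 17.2, 14.7, 7.4],
    ("MIR wine", "grape"):  [12.1, 10.3, 2.5, 28.9, 22.3, 8.0, 54.0, 47.6, 33.5, 15.8, 13.1, 6.5],
    ("NMR tumor", "all"):    [0.3, 0.4, 0.4, 1.4, 1.2, 2.1, 2.9, 2.7, 3.6, 3.6, 3.4, 4.3],
    ("NMR tumor", "center"): [0.2, 0.2, 0.2, 1.1, 0.8, 1.1, 2.2, 1.9, 2.1, 2.1, 1.8, 2.0],
    ("NMR candida", "1"): [4.6, 8.8, 7.7, 22.4, 41.2, 37.1, 43.5, 62.5, 61.1, 59.8, 78.4, 75.4],
    ("NMR candida", "2"): [3.7, 4.8, 3.8, 18.0, 22.0, 19.4, 34.5, 38.5, 37.3, 36.3, 40.3, 37.9],
    ("NMR candida", "3"): [3.7, 4.7, 3.7, 17.4, 20.1, 17.9, 33.4, 36.0, 34.7, 34.6, 37.8, 35.1],
    ("NMR candida", "4"): [3.9, 5.1, 4.8, 18.7, 23.4, 24.3, 36.0, 40.5, 60.5, 41.6, 46.2, 47.0],
    ("NMR candida", "5"): [3.5, 3.9, 2.6, 31.9, 32.4, 27.0, 62.6, 63.0, 60.0, 58.3, 43.4, 38.5],
}

# Hypothetical short labels for the four column groups, in table order.
SCHEMES = ["none", "univariate", "gini", "plspc"]
CLASSIFIERS = ["PLS", "PC", "RF"]


def runtime(row, scheme, classifier):
    """Return the runtime in minutes for one table cell."""
    index = 3 * SCHEMES.index(scheme) + CLASSIFIERS.index(classifier)
    return TIMES[row][index]


def ratio(row, numerator, denominator, classifier):
    """Return how many times longer `numerator` runs than `denominator`."""
    return runtime(row, numerator, classifier) / runtime(row, denominator, classifier)


if __name__ == "__main__":
    # Print univariate/none and gini/univariate ratios for every row.
    for row in TIMES:
        for clf in CLASSIFIERS:
            uni = ratio(row, "univariate", "none", clf)
            gini = ratio(row, "gini", "univariate", clf)
            print(row, clf, round(uni, 1), round(gini, 1))
```

Printing the ratios row by row makes it easy to see where individual data sets follow or depart from the approximate factors stated in the note.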