Table 4 Number k discussion

From: Modeling and mining term association for improving biomedical information retrieval performance

 

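| Data set | n | document | passage | passage2 |
|---|---|---|---|---|
| Genomics 2007 | 1 | 0.3012 | 0.0918 | 0.1436 |
| | 5 | 0.3349 | 0.1400 | 0.1588 |
| | 10 | 0.3438 | 0.1422 | 0.1635 |
| | 20 | 0.3438 | 0.1422 | 0.1635 |
| | 100 | 0.3438 | 0.1422 | 0.1635 |
| Genomics 2006 | 1 | 0.3974 | 0.1401 | - |
| | 5 | 0.4049 | 0.1445 | - |
| | 10 | 0.4087 | 0.1467 | - |
| | 20 | 0.4083 | 0.1466 | - |
| | 100 | 0.4083 | 0.1466 | - |
| Genomics 2005 | 1 | 0.3012 | - | - |
| | 5 | 0.3116 | - | - |
| | 10 | 0.3123 | - | - |
| | 20 | 0.3123 | - | - |
| | 100 | 0.3123 | - | - |
| Genomics 2004 | 1 | 0.3470 | - | - |
| | 5 | 0.3555 | - | - |
| | 10 | 0.3584 | - | - |
| | 20 | 0.3584 | - | - |
| | 100 | 0.3584 | - | - |
| HARD 2004 | 1 | 0.2015 | 0.2005 | - |
| | 5 | 0.2223 | 0.2197 | - |
| | 10 | 0.2250 | 0.2208 | - |
| | 20 | 0.2248 | 0.2208 | - |
| | 100 | 0.2248 | 0.2208 | - |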

  1. The number k is the parameter of the recursive re-ranking algorithm, and the column n lists the tested values of k: (1) the empirical study identifies k = 10 as a locally optimal value, which is used as the final depth in the final experiments; (2) k stands for the top k term associations weighted by the factor analysis based model; (3) the recursive re-ranking algorithm re-ranks the baselines according to these k terms; (4) the more of these k terms a result contains, the higher the ranking score it obtains; (5) five values of k are tested: 1, 5, 10, 20 and 100; (6) the five original baselines come from our five data sets, namely Genomics 2007, Genomics 2006, Genomics 2005, Genomics 2004 and HARD 2004; (7) k affects performance strongly when k is smaller than 10 (on Genomics 2007, the document score rises from 0.3012 at k = 1 to 0.3438 at k = 10), while performance is almost unchanged once k exceeds 10.
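
The role of k described in the note can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the multiplicative scoring rule, the `boost` parameter, the whitespace tokenization, and the sample documents and term weights are all assumptions made for illustration, and the sketch performs a single re-ranking pass rather than the recursive procedure. It shows only the stated idea that a result containing more of the top k associated terms receives a higher score.

```python
from typing import Dict, List, Tuple

# (doc_id, baseline_score, text); a hypothetical representation of a baseline run
Result = Tuple[str, float, str]


def top_k_terms(term_weights: Dict[str, float], k: int) -> List[str]:
    """Return the k highest-weighted terms, breaking ties alphabetically."""
    ranked = sorted(term_weights.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


def rerank(baseline: List[Result], term_weights: Dict[str, float],
           k: int, boost: float = 0.1) -> List[Tuple[str, float]]:
    """Re-rank baseline results by how many of the top k terms each contains.

    Assumed scoring rule (not from the source):
        new_score = baseline_score * (1 + boost * number_of_matched_terms)
    """
    terms = set(top_k_terms(term_weights, k))
    rescored = []
    for doc_id, score, text in baseline:
        tokens = set(text.lower().split())  # naive whitespace tokenization
        matches = len(terms & tokens)
        rescored.append((doc_id, score * (1.0 + boost * matches)))
    # sorted() is stable, so results with equal scores keep their baseline order
    return sorted(rescored, key=lambda pair: -pair[1])


# Hypothetical sample data, for illustration only
baseline = [
    ("d1", 0.50, "protein binding assay"),
    ("d2", 0.48, "gene expression protein kinase"),
    ("d3", 0.30, "clinical trial report"),
]
term_weights = {"gene": 0.9, "kinase": 0.8, "protein": 0.7, "trial": 0.1}

for k in (1, 3):
    print(k, [doc_id for doc_id, _ in rerank(baseline, term_weights, k)])
```

In this toy data, raising k beyond the number of terms that actually occur in the results cannot change the ordering, which is one plausible reading of why the scores in the table stop changing for larger k.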