Table 1 Performance of baselines

From: Modeling and mining term association for improving biomedical information retrieval performance

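| k1 | b | Index | Genomics 2007 document | Genomics 2007 passage | Genomics 2007 passage2 | Genomics 2006 document | Genomics 2006 passage | Genomics 2005 document | Genomics 2004 document | HARD 2004 document | HARD 2004 passage |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.4 | 2.0 | word | 0.1584 | 0.0675 | 0.0267 | 0.2662 | 0.0532 | - | - | - | - |
| 0.4 | 2.0 | sentence | 0.1368 | 0.0406 | 0.0154 | 0.2378 | 0.0398 | - | - | - | - |
| 0.4 | 2.0 | paragraph | 0.1086 | 0.0170 | 0.0094 | 0.2036 | 0.0192 | 0.1964 | 0.2952 | 0.2449 | 0.2635 |
| 0.4 | 2.0 | BEST | 0.1584 | 0.0675 | 0.0267 | 0.2662 | 0.0532 | 0.1964 | 0.2952 | 0.2449 | 0.2635 |
| 0.5 | 1.3 | word | 0.2108 | 0.0963 | 0.0364 | 0.3140 | 0.0718 | - | - | - | - |
| 0.5 | 1.3 | sentence | 0.1805 | 0.0700 | 0.0350 | 0.3030 | 0.0550 | - | - | - | - |
| 0.5 | 1.3 | paragraph | 0.1588 | 0.0452 | 0.0333 | 0.3109 | 0.0369 | 0.2602 | 0.3404 | 0.2802 | 0.2985 |
| 0.5 | 1.3 | BEST | 0.2108 | 0.0963 | 0.0364 | 0.3140 | 0.0718 | 0.2602 | 0.3404 | 0.2802 | 0.2985 |
| 1.0 | 1.0 | word | 0.1556 | 0.0434 | 0.0328 | 0.3097 | 0.0659 | - | - | - | - |
| 1.0 | 1.0 | sentence | 0.1809 | 0.0758 | 0.0350 | 0.2918 | 0.0521 | - | - | - | - |
| 1.0 | 1.0 | paragraph | 0.1902 | 0.0893 | 0.0327 | 0.2916 | 0.0337 | 0.2547 | 0.3425 | 0.2522 | 0.2718 |
| 1.0 | 1.0 | BEST | 0.1902 | 0.0893 | 0.0350 | 0.3097 | 0.0659 | 0.2547 | 0.3425 | 0.2522 | 0.2718 |
| 1.2 | 0.75 | word | 0.1809 | 0.0780 | 0.0295 | 0.3045 | 0.0651 | - | - | - | - |
| 1.2 | 0.75 | sentence | 0.1987 | 0.0814 | 0.0394 | 0.3202 | 0.0522 | - | - | - | - |
| 1.2 | 0.75 | paragraph | 0.2013 | 0.0648 | 0.0578 | 0.3381 | 0.0362 | 0.2874 | 0.3584 | 0.2617 | 0.2758 |
| 1.2 | 0.75 | BEST | 0.2013 | 0.0814 | 0.0578 | 0.3381 | 0.0651 | 0.2874 | 0.3584 | 0.2617 | 0.2758 |
| 2.0 | 0.4 | word | 0.1953 | 0.0844 | 0.0317 | 0.3152 | 0.0637 | - | - | - | - |
| 2.0 | 0.4 | sentence | 0.2084 | 0.0758 | 0.0401 | 0.3529 | 0.0490 | - | - | - | - |
| 2.0 | 0.4 | paragraph | 0.2025 | 0.0633 | 0.0641 | 0.3476 | 0.0362 | 0.2779 | 0.3483 | 0.2810 | 0.2895 |
| 2.0 | 0.4 | BEST | 0.2084 | 0.0844 | 0.0641 | 0.3529 | 0.0637 | 0.2779 | 0.3483 | 0.2810 | 0.2895 |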

  1. Baseline results are reported for: (1) five (k1, b) parameter settings, given in the first two columns; (2) three indices, where "word" denotes the word-based index, "sentence" the sentence-based index, and "paragraph" the paragraph-based index; (3) three evaluation measures: document-level, passage-level, and passage2-level; (4) five TREC data sets: the TREC 2004-2007 Genomics data sets and the TREC 2004 HARD data set. Only a paragraph-based index is built for the TREC 2005 and 2004 Genomics data sets and the TREC 2004 HARD data set, as described in the indexing section, so the word and sentence entries for those data sets are marked "-".
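
The table does not define the BEST rows, but their values match the highest score in each column across the word, sentence, and paragraph indices for the same (k1, b) setting. The following minimal sketch checks that reading for the (1.0, 1.0) setting on the Genomics 2007 and 2006 columns. The name `best_row` and the dictionary layout are illustrative assumptions, not part of the original work.

```python
# Scores for the (k1 = 1.0, b = 1.0) setting, copied from Table 1.
# Column order: Genomics 2007 document, passage, passage2;
#               Genomics 2006 document, passage.
scores = {
    "word":      [0.1556, 0.0434, 0.0328, 0.3097, 0.0659],
    "sentence":  [0.1809, 0.0758, 0.0350, 0.2918, 0.0521],
    "paragraph": [0.1902, 0.0893, 0.0327, 0.2916, 0.0337],
}


def best_row(rows):
    """Return the column-wise maximum over all index rows (hypothetical helper)."""
    return [max(column) for column in zip(*rows.values())]


best = best_row(scores)
print(best)
```

Under this reading, the best score in a column can come from a different index for each measure: here the paragraph index gives the best Genomics 2007 document and passage scores, the sentence index the best passage2 score, and the word index the best Genomics 2006 scores.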