Table 4 Comparison with Genbank Annotations

From: Prodigal: prokaryotic gene recognition and translation initiation site identification

| Organism | Genbank Genes with no Joins | Prodigal 1.20 | Prodigal 1.20+TiCo | Prodigal 1.20+TriTisa | GenemarkHMM 2.6 | Glimmer 3.02 | EasyGene 1.2 | MED 2.0 |
|---|---|---|---|---|---|---|---|---|
| Escherichia coli K12 | 4268 | 4118/3823 (96.5%/89.6%) | 4118/3779 (96.5%/88.5%) | 4118/3778 (96.5%/88.5%) | 4122/3685 (96.6%/86.3%) | 4076/3563 (95.5%/83.5%) | 3977/3565 (93.2%/83.5%) | 4102/3711 (96.1%/86.9%) |
| Halobacterium salinarum | 2110 | 2062/1857 (97.7%/88.0%) | 2062/1809 (97.7%/85.7%) | 2061/1790 (97.6%/84.8%) | 2042/1676 (96.7%/79.4%) | 2054/1609 (97.3%/76.2%) | 2018/1692 (95.6%/80.2%) | 2008/1469 (95.1%/69.6%) |
| Natronomonas pharaonis | 2661 | 2630/2398 (98.8%/90.1%) | 2630/2358 (98.8%/88.6%) | 2630/2348 (98.8%/88.2%) | 2624/2251 (98.6%/84.6%) | 2622/2220 (98.5%/83.4%) | 2548/2271 (95.7%/85.3%) | 2586/1953 (97.2%/73.4%) |
| Bacillus subtilis | 4174 | 4113/3705 (98.5%/88.8%) | 4113/3678 (98.5%/88.1%) | 4113/3679 (98.5%/88.1%) | 4136/3713 (99.1%/89.0%) | 4102/3569 (98.3%/85.5%) | 3977/3578 (95.3%/85.7%) | 4127/3596 (98.9%/86.2%) |
| Aeropyrum pernix | 1699 | 1670/1430 (98.3%/84.2%) | 1670/1363 (98.3%/80.2%) | 1670/1353 (98.3%/79.6%) | 1672/1364 (98.4%/80.3%) | 1671/1317 (98.4%/77.5%) | 1652/1389 (97.2%/81.8%) | 1689/1309 (99.4%/77.1%) |
| Synechocystis PCC6803 | 3171 | 3146/2587 (99.2%/81.6%) | 3146/2364 (99.2%/74.6%) | 3146/2447 (99.2%/77.2%) | 3124/2337 (98.5%/73.7%) | 3123/2236 (98.5%/70.5%) | 3053/2288 (96.3%/72.2%) | 3126/2192 (98.6%/69.1%) |
| Pseudomonas aeruginosa | 5565 | 5514/5038 (99.1%/90.5%) | 5514/4885 (99.1%/87.8%) | 5514/4821 (99.1%/86.6%) | 5484/4698 (98.5%/84.4%) | 5491/4705 (98.7%/84.5%) | 5522/4761 (99.2%/85.5%) | 5292/4539 (95.1%/81.6%) |

  1. Table 4 shows the performance of gene-finding algorithms on seven Genbank files. In each entry, the first number is the count of genes whose 3' end was correctly identified, and the second is the count of genes whose 5' and 3' ends (the gene and its correct start) were both identified exactly. The percentages in parentheses express each count relative to the number of Genbank genes with no joins for that organism. Note that Genbank genes are not experimentally verified; this table is meant only to provide a snapshot of performance over entire genomes.
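
As a reading aid, the percentages in each entry can be reproduced from the two counts and the organism's Genbank gene total. The following is a minimal sketch; the function name `match_rates` and the `"a/b"` string format are illustrative assumptions, not part of Prodigal or the original paper.

```python
def match_rates(entry, total_genes):
    """Return (3' end %, exact 5'+3' %) for a table entry such as '4118/3823'.

    `entry` and `total_genes` are hypothetical inputs for this sketch:
    the two counts from one table cell and the organism's number of
    Genbank genes with no joins.
    """
    three_prime, exact = (int(part) for part in entry.split("/"))
    return (
        round(100 * three_prime / total_genes, 1),
        round(100 * exact / total_genes, 1),
    )


# Prodigal 1.20 on Escherichia coli K12 (4268 Genbank genes with no joins)
print(match_rates("4118/3823", 4268))  # (96.5, 89.6)
```

The gap between the two percentages reflects genes whose 3' end was found but whose predicted start site differs from the Genbank annotation.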