
Table 2 Comparison of performance of different methods on reconstruction of three haplotypes for simulated data sets

From: HaploJuice : accurate haplotype assembly from a pool of sequences with known relative concentrations

a. Proportion of three samples: 0.5, 0.4, 0.1 (total length of three haplotypes: 30k)

| Software | # contigs ≥ 500 bp | Longest contig (bp) | N50 (bp) | Haplotype coverage (%) | Error rate (%) |
| --- | --- | --- | --- | --- | --- |
| HaploJuice | 3.0 ± 0.0 | 9975 ± 6.8 | 9971 ± 6.5 | 99.7 ± 0.0 | 0.001 ± 0.004 |
| hmmfreq[8] | 3.0 ± 0.0 | 9855 ± 6.8 | 9850 ± 6.3 | 98.5 ± 0.0 | 0.276 ± 0.254 |
| shoRAH[3] | 30.8 ± 11.7 | 9819 ± 124.8 | 9799 ± 116.7 | 97.5 ± 3.5 | 0.646 ± 0.492 |
| SAVAGE[4] | 9.8 ± 3.5 | 9972 ± 11.8 | 305 ± 300.3 | 51.3 ± 7.1 | 0.001 ± 0.004 |
| PredictHaplo[5] | 2.0 ± 0.2 | 9991 ± 4.2 | 9984 ± 5.6 | 67.7 ± 5.7 | 0.102 ± 0.034 |
| QuRe[6] | 3.7 ± 1.9 | 6993 ± 1306.3 | 7374 ± 686.5 | 43.8 ± 13.5 | 0.331 ± 0.318 |

b. Proportion of three samples: 0.5, 0.3, 0.2 (total length of three haplotypes: 30k)

| Software | # contigs ≥ 500 bp | Longest contig (bp) | N50 (bp) | Haplotype coverage (%) | Error rate (%) |
| --- | --- | --- | --- | --- | --- |
| HaploJuice | 3.0 ± 0.0 | 9975 ± 6.3 | 9971 ± 7.8 | 99.7 ± 0.0 | 0.000 ± 0.001 |
| hmmfreq[8] | 3.0 ± 0.0 | 9854 ± 5.8 | 9850 ± 7.6 | 98.5 ± 0.0 | 0.089 ± 0.104 |
| shoRAH[3] | 27.9 ± 6.6 | 9814 ± 118.3 | 9789 ± 113.9 | 97.1 ± 4.7 | 0.591 ± 0.358 |
| SAVAGE[4] | 11.4 ± 3.4 | 9983 ± 8.2 | 436 ± 281.8 | 54.7 ± 7.1 | 0.001 ± 0.005 |
| PredictHaplo[5] | 2.0 ± 0.2 | 9991 ± 3.7 | 9984 ± 5.8 | 68.0 ± 6.6 | 0.087 ± 0.040 |
| QuRe[6] | 4.2 ± 2.2 | 7348 ± 820.8 | 7436 ± 776.9 | 44.9 ± 15.9 | 0.761 ± 0.851 |

c. Proportion of three samples: 0.6, 0.3, 0.1 (total length of three haplotypes: 30k)

| Software | # contigs ≥ 500 bp | Longest contig (bp) | N50 (bp) | Haplotype coverage (%) | Error rate (%) |
| --- | --- | --- | --- | --- | --- |
| HaploJuice | 3.0 ± 0.0 | 9975 ± 7.3 | 9970 ± 7.7 | 99.7 ± 0.0 | 0.000 ± 0.000 |
| hmmfreq[8] | 3.0 ± 0.0 | 9854 ± 5.6 | 9849 ± 6.2 | 98.5 ± 0.0 | 0.210 ± 0.214 |
| shoRAH[3] | 25.2 ± 5.9 | 9837 ± 115.0 | 9808 ± 113.3 | 97.4 ± 4.8 | 0.749 ± 0.516 |
| SAVAGE[4] | 11.2 ± 3.0 | 9971 ± 20.9 | 419 ± 260.5 | 53.9 ± 6.3 | 0.001 ± 0.006 |
| PredictHaplo[5] | 2.0 ± 0.0 | 9991 ± 3.5 | 9984 ± 4.7 | 66.7 ± 0.0 | 0.089 ± 0.025 |
| QuRe[6] | 3.9 ± 1.9 | 7074 ± 1284.4 | 7300 ± 716.6 | 39.1 ± 14.5 | 0.492 ± 0.597 |

d. Proportion of three samples: 0.7, 0.2, 0.1 (total length of three haplotypes: 30k)

| Software | # contigs ≥ 500 bp | Longest contig (bp) | N50 (bp) | Haplotype coverage (%) | Error rate (%) |
| --- | --- | --- | --- | --- | --- |
| HaploJuice | 3.0 ± 0.0 | 9976 ± 6.1 | 9971 ± 6.3 | 99.7 ± 0.0 | 0.005 ± 0.048 |
| hmmfreq[8] | 3.0 ± 0.0 | 9855 ± 6.2 | 9850 ± 6.7 | 98.5 ± 0.0 | 0.240 ± 0.220 |
| shoRAH[3] | 20.2 ± 4.7 | 9835 ± 115.0 | 9812 ± 106.4 | 93.8 ± 11.2 | 0.912 ± 0.630 |
| SAVAGE[4] | 15.2 ± 3.0 | 9974 ± 10.6 | 708 ± 161.7 | 65.1 ± 7.0 | 0.001 ± 0.005 |
| PredictHaplo[5] | 2.0 ± 0.0 | 9991 ± 3.8 | 9984 ± 4.7 | 66.7 ± 0.0 | 0.088 ± 0.021 |
| QuRe[6] | 3.6 ± 1.9 | 6787 ± 1333.0 | 7121 ± 809.6 | 28.4 ± 11.2 | 0.319 ± 0.535 |

  1. One hundred data sets were generated for each set of sample proportions. Values are reported as mean ± standard deviation. In each column, the best value is highlighted among the software whose contigs cover more than 90% of the haplotypes.
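
The selection rule in the note can be illustrated with a minimal Python sketch. It uses the panel a means copied from the table; the helper names (`eligible`, `best_by`) are hypothetical, and the direction of "best" (higher coverage, lower error rate) is an assumption made for illustration, not something the note states.

```python
# Panel a means (sample proportions 0.5, 0.4, 0.1), copied from the table.
# Tuple order: (# contigs >= 500 bp, longest contig, N50, coverage %, error rate %)
panel_a = {
    "HaploJuice":   (3.0, 9975, 9971, 99.7, 0.001),
    "hmmfreq":      (3.0, 9855, 9850, 98.5, 0.276),
    "shoRAH":       (30.8, 9819, 9799, 97.5, 0.646),
    "SAVAGE":       (9.8, 9972, 305, 51.3, 0.001),
    "PredictHaplo": (2.0, 9991, 9984, 67.7, 0.102),
    "QuRe":         (3.7, 6993, 7374, 43.8, 0.331),
}

COVERAGE, ERROR = 3, 4  # column indices within each tuple


def eligible(table, min_coverage=90.0):
    """Keep only the software whose mean haplotype coverage exceeds the threshold."""
    return {name: row for name, row in table.items() if row[COVERAGE] > min_coverage}


def best_by(table, column, higher_is_better):
    """Return the software with the best mean in one column.

    Assumption: 'best' means highest or lowest, chosen by the caller.
    """
    pick = max if higher_is_better else min
    return pick(table, key=lambda name: table[name][column])


candidates = eligible(panel_a)
print(sorted(candidates))                  # ['HaploJuice', 'hmmfreq', 'shoRAH']
print(best_by(candidates, COVERAGE, True))  # HaploJuice
print(best_by(candidates, ERROR, False))    # HaploJuice
```

Under this reading, SAVAGE ties for the lowest mean error rate in panel a (0.001%) but is excluded from the comparison because its mean coverage (51.3%) falls below the 90% threshold.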