Table 1 Numbers of heterozygous positions correctly identified, and numbers of false-positive predictions from low-copy-number repeats, expressed as a percentage of the identified SNPs.

From: Calling SNPs without a reference sequence

| λ \ x | 4 | 5 | 6 | 7 | 8 |
| --- | --- | --- | --- | --- | --- |
| 0.5 | 473/54.8% | 552/65.0% | 561/68.3% | 561/69.0% | 561/69.1% |
| 1.0 | 4,598/28.6% | 6,131/37.9% | 6,450/43.5% | 6,501/45.9% | 6,508/46.7% |
| 1.5 | 14,119/14.9% | 21,179/21.4% | 23,386/26.8% | 23,915/30.3% | 24,021/32.0% |
| 2.0 | 27,067/7.8% | 45,111/11.8% | 52,630/16.0% | 55,036/19.5% | 55,675/21.8% |
| 2.5 | 40,080/4.1% | 73,481/6.5% | 90,877/9.4% | 97,835/12.2% | 100,145/14.6% |

  1. The values are given as a function of fold coverage (λ, row labels) and the upper bound on the number of overlapping reads (x, column labels). For instance, at λ = 1.0 and x = 5, there are 6,131 correct SNP calls and 37.9% as many duplication-induced erroneous ones. This is a theoretical analysis based on informally fitting a model (see text) to data from the genome of Dr. James Watson.
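To make the cell format concrete, the following is a minimal sketch of how one entry could be converted into an approximate absolute number of false positives. It assumes the reading given in the note above, namely that the percentage is taken relative to the number of correct calls; the function names `parse_cell` and `implied_false_positives` are hypothetical and are not part of the original analysis.

```python
def parse_cell(cell: str) -> tuple[int, float]:
    """Split a table cell such as '6,131/37.9%' into (correct_calls, fp_fraction)."""
    count_text, percent_text = cell.split("/")
    # Remove the thousands separator before converting to an integer.
    correct_calls = int(count_text.replace(",", ""))
    # Convert the percentage string to a fraction, e.g. '37.9%' -> 0.379.
    fp_fraction = float(percent_text.rstrip("%")) / 100.0
    return correct_calls, fp_fraction


def implied_false_positives(cell: str) -> int:
    """Approximate false-positive count, assuming the percentage is
    expressed relative to the number of correct calls (an assumption)."""
    correct_calls, fp_fraction = parse_cell(cell)
    return round(correct_calls * fp_fraction)


# Entry for lambda = 1.0, x = 5 from the table.
print(implied_false_positives("6,131/37.9%"))  # 2324
```

Under that assumption, the entry at λ = 1.0 and x = 5 corresponds to roughly 2,324 duplication-induced erroneous calls alongside the 6,131 correct ones. Because the percentages in the table are rounded to one decimal place, such counts are approximate.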