
Table 3 Summary of the parameters used to generate the synthetic data sets

From: A comparison of methods for differential expression analysis of RNA-seq data

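| Sim. study | $\lvert G_{DE}^{up} \rvert$ | $\lvert G_{DE}^{down} \rvert$ | $\lvert \{g;\ \phi_g = 0\} \rvert$ | ‘Single’ outlier fraction | ‘Random’ outlier fraction |
|---|---|---|---|---|---|
| $B_{0}^{0}$ | 0 | 0 | 0 | 0 | 0 |
| $B_{0}^{1250}$ | 1,250 | 0 | 0 | 0 | 0 |
| $B_{625}^{625}$ | 625 | 625 | 0 | 0 | 0 |
| $B_{0}^{4000}$ | 4,000 | 0 | 0 | 0 | 0 |
| $B_{2000}^{2000}$ | 2,000 | 2,000 | 0 | 0 | 0 |
| $P_{0}^{0}$ | 0 | 0 | 6,250 | 0 | 0 |
| $P_{625}^{625}$ | 625 | 625 | 6,250 | 0 | 0 |
| $S_{0}^{0}$ | 0 | 0 | 0 | 10% | 0 |
| $S_{625}^{625}$ | 625 | 625 | 0 | 10% | 0 |
| $R_{0}^{0}$ | 0 | 0 | 0 | 0 | 5% |
| $R_{625}^{625}$ | 625 | 625 | 0 | 0 | 5% |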

  1. In all synthetic data sets, the observations were distributed between two conditions (denoted S1 and S2), with the same number of observations (2, 5 or 10) in each condition. We let $\lvert G_{DE}^{up} \rvert$ and $\lvert G_{DE}^{down} \rvert$ denote, respectively, the number of genes that were up- and downregulated in condition S2 compared to S1. The number of genes whose counts were drawn from a Poisson distribution (i.e., with the dispersion parameter equal to zero) is given by $\lvert \{g;\ \phi_g = 0\} \rvert$. The ‘single’ outlier fraction denotes the fraction of the genes for which we selected a single sample and multiplied the corresponding count by a factor between 5 and 10. The ‘random’ outlier fraction denotes the fraction of counts that were selected randomly (among all counts) and multiplied by a factor between 5 and 10. The notation for the simulation studies (leftmost column) summarizes the type of simulation (B: ‘baseline’, P: ‘Poisson’, S: ‘single outlier’, R: ‘random outlier’), the number of DE genes that are upregulated in S2 (i.e., $\lvert G_{DE}^{up} \rvert$, in the superscript) and the number of DE genes that are downregulated in S2 (i.e., $\lvert G_{DE}^{down} \rvert$, in the subscript).
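The two outlier schemes described in the footnote can be illustrated with a minimal Python sketch. This is not the authors' simulation code. The function names, the uniform distribution of the multiplication factor, the rounding back to integer counts, and the Poisson base counts with mean 100 are all assumptions made for illustration; the source only states that counts were multiplied by a factor between 5 and 10.

```python
import numpy as np


def add_single_outliers(counts, fraction, rng, low=5.0, high=10.0):
    """'Single' outliers: for a fraction of the genes (rows), pick one
    sample (column) and multiply that count by a factor in [low, high).

    Assumption: the factor is drawn uniformly; the source does not say.
    Returns the modified matrix and the indices of the affected genes.
    """
    out = counts.astype(float)  # astype returns a copy, input is untouched
    n_genes, n_samples = out.shape
    n_pick = int(round(fraction * n_genes))
    genes = rng.choice(n_genes, size=n_pick, replace=False)
    samples = rng.integers(0, n_samples, size=n_pick)
    factors = rng.uniform(low, high, size=n_pick)
    out[genes, samples] *= factors
    # Assumption: results are rounded back to integer counts.
    return np.rint(out).astype(np.int64), genes


def add_random_outliers(counts, fraction, rng, low=5.0, high=10.0):
    """'Random' outliers: pick a fraction of all counts at random,
    regardless of gene, and multiply each by a factor in [low, high)."""
    out = counts.astype(float)
    n_pick = int(round(fraction * out.size))
    flat = rng.choice(out.size, size=n_pick, replace=False)
    rows, cols = np.unravel_index(flat, out.shape)
    factors = rng.uniform(low, high, size=n_pick)
    out[rows, cols] *= factors
    return np.rint(out).astype(np.int64)


# Hypothetical base data: 1,000 genes, 5 samples per condition (10 columns).
rng = np.random.default_rng(0)
base = rng.poisson(lam=100, size=(1000, 10))

s_counts, s_genes = add_single_outliers(base, 0.10, rng)  # as in the S studies
r_counts = add_random_outliers(base, 0.05, rng)           # as in the R studies
```

The difference between the two schemes is the sampling unit: the ‘single’ scheme samples genes and perturbs exactly one count in each, while the ‘random’ scheme samples individual counts, so a gene may receive zero, one, or several outliers.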