Table 1 Overview of analyses implemented in ANGSD

From: ANGSD: Analysis of Next Generation Sequencing Data

| Analysis | Basis | Reference |
| --- | --- | --- |
| Contamination estimates based on the X chromosome | BC | [19] b |
| Type-specific error estimation by simultaneously estimating allele frequencies and genotype likelihoods | GL | [10] |
| Type-specific error estimation based on an outgroup and a high-quality genome | BC | [20] ab |
| Genotype likelihoods (GL) (diploids) | BC/Seq | [6],[8],[10],[15] |
| Allele frequencies for a site | BC/GL/GP | [21] b, [10] |
| SNP discovery: likelihood ratio test (LRT) used for rejecting the hypothesis that the allele frequency is zero | GL | [10] |
| Genotype posteriors (GP); can be used for calling genotypes by specifying a cutoff | GL/SAF | [9],[10] |
| Sample allele frequencies (SAF): the probability of all read data given the sample allele frequency | GL/GP | [9] b |
| Population differentiation statistics (F_ST) | SAF | [14] ac |
| Population structure via principal component analysis (PCA) | GP | [14] ac |
| Admixture analysis of NGS data (NGSadmix) | GL | [22] ab |
| Detection of ancient admixture (ABBA-BABA/D-statistics) | BC | [20] b |
| Estimation of SFS (1D) | SAF | [9] ab |
| Estimation of SFS (2D) | SAF | |
| Selection scans, neutrality tests (e.g. θ's and Tajima's D) | SAF | [12] ab |
| Estimation of individual and site-wise inbreeding coefficients; also MAF and GP estimation for inbred individuals | GL | [13] abc |
| Allele frequency based association for case/control data | GL | [10] |
| Association score test in a generalized linear model framework for both quantitative and case/control data, allowing for additional covariates | GL-GP | [11] b |

Table of the supported analyses in ANGSD. a indicates methods that require a secondary program in the ANGSD package. b indicates methods for which ANGSD is the de facto implementation, and c indicates user-supplied extensions for ANGSD. The basis for each analysis is either the sequencing data (Seq), base counts (BC), genotype likelihoods (GL), sample allele frequencies (SAF) or genotype probabilities (GP).
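
To make the genotype posterior entry of the table concrete, the following is a minimal Python sketch of how genotype likelihoods could be turned into genotype posteriors and then into a genotype call with a cutoff. It is an illustration only, not ANGSD's implementation: the Hardy-Weinberg prior built from a single allele frequency is an assumption made here, and the function names `genotype_posteriors` and `call_genotype`, the example likelihoods, and the default cutoff of 0.95 are all hypothetical.

```python
def genotype_posteriors(gl, freq):
    """Return posteriors for genotypes carrying 0, 1 or 2 copies of an allele.

    gl   -- three genotype likelihoods P(reads | genotype), any common scale
    freq -- frequency of the counted allele (assumed known here)

    Assumption: the prior is Hardy-Weinberg proportions at frequency `freq`.
    """
    if len(gl) != 3:
        raise ValueError("expected three genotype likelihoods (diploid site)")
    if not 0.0 <= freq <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")

    # Hardy-Weinberg prior for 0, 1 and 2 copies of the allele.
    prior = [(1.0 - freq) ** 2, 2.0 * freq * (1.0 - freq), freq ** 2]

    # Bayes' rule: the posterior is proportional to likelihood times prior.
    joint = [like * p for like, p in zip(gl, prior)]
    total = sum(joint)
    if total == 0.0:
        raise ValueError("all genotypes have zero posterior weight")
    return [j / total for j in joint]


def call_genotype(gp, cutoff=0.95):
    """Return the most probable genotype, or None if it falls below the cutoff."""
    best = max(range(len(gp)), key=lambda g: gp[g])
    return best if gp[best] >= cutoff else None


if __name__ == "__main__":
    # Hypothetical likelihoods favouring the heterozygote at one site.
    gp = genotype_posteriors([0.01, 0.90, 0.09], freq=0.2)
    print(gp, call_genotype(gp, cutoff=0.95))
```

Returning `None` when no genotype reaches the cutoff leaves the site uncalled rather than forcing a low-confidence call, which is the trade-off the cutoff is meant to control.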