
Table 2 Layout of sequence processing parameters per pipeline

From: Concatenation of paired-end reads improves taxonomic classification of amplicons for profiling microbial communities

| Pipeline | Cutadapt trimming | Merging/Concatenating | DADA2 processing<sup>h</sup> |
| --- | --- | --- | --- |
| NMd | None | Merged by DADA2<sup>d</sup> | Paired-end |
| LMd | Length-trimmed | Merged by DADA2 | Paired-end |
| QMd | Quality-trimmed | Merged by DADA2 | Paired-end |
| QdMd | None – Quality-trimmed by DADA2<sup>a</sup> | Merged by DADA2 | Paired-end |
| NMp | None | Merged<sup>e</sup> | Single-end |
| LMp | Length-trimmed<sup>b</sup> | Merged | Single-end |
| QMp | Quality-trimmed<sup>c</sup> | Merged | Single-end |
| NBp | None | Merged & Concatenated<sup>f</sup> | Single-end |
| LBp | Length-trimmed | Merged & Concatenated | Single-end |
| QBp | Quality-trimmed | Merged & Concatenated | Single-end |
| NCs | None | All Concatenated<sup>g</sup> | Single-end |
| LCs | Length-trimmed | All Concatenated | Single-end |
| QCs | Quality-trimmed | All Concatenated | Single-end |
| NR1 | None | N/A | Single-end |
| LR1 | Quality-trimmed | N/A | Single-end |
| NR2 | None | N/A | Single-end |
| LR2 | Quality-trimmed | N/A | Single-end |

Default parameters were used for each tool unless otherwise specified. The first step in every pipeline was primer removal with cutadapt. Paired-end reads were then handled in one of three ways: merged by DADA2; merged, concatenated, or merged and concatenated before DADA2 processing (resulting in single-end reads); or treated separately without merging or concatenation. After DADA2 processing, all sequences were taxonomically classified by aligning them to the Greengenes and SILVA reference databases separately.

  - <sup>a</sup> DADA2 performed the trimming, with trunc-q set to 20.
  - <sup>b</sup> Length-trimmed: all sequences were trimmed to the same length, based on the mean length at which the average base quality dropped below a Q-score of 20.
  - <sup>c</sup> Quality-trimmed: sequences were individually trimmed based on a PHRED score threshold of 20.
  - <sup>d</sup> Merged by DADA2: forward and reverse reads were merged by DADA2 with default parameters and a minimum 20 bp overlap.
  - <sup>e</sup> Merged: forward and reverse reads were merged by PANDAseq with a minimum 20 bp overlap.
  - <sup>f</sup> Merged and concatenated: after merging with PANDAseq, sequences that could not be merged were concatenated with PANDAseq and added to the merged sequence file.
  - <sup>g</sup> All concatenated: forward and reverse read pairs were joined together after reverse complementing R2. No merging was performed.
  - <sup>h</sup> DADA2 processing: pipelines with merging and/or concatenating before this step, or that contained only the forward or only the reverse reads, were processed as single-end. Pipelines in which DADA2 performed the merging were processed as paired-end.
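
The trimming and concatenation steps described in the footnotes can be illustrated with a minimal Python sketch. It is not the authors' code, and all function names are hypothetical. It assumes that per-read quality trimming follows the partial-sum approach described in cutadapt's documentation for 3' quality trimming, that length trimming truncates every read at the first position where the mean quality across reads falls below the threshold (one possible reading of the length-trimming footnote), and that concatenation joins the forward read directly to the reverse complement of R2 with no spacer.

```python
from statistics import mean

# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def concatenate_pair(forward, reverse):
    """Join a read pair without merging: forward + reverse-complemented R2.

    Assumption: no spacer is placed between the two reads.
    """
    return forward + reverse_complement(reverse)


def quality_trim_index(quals, cutoff=20):
    """Per-read 3' quality trimming; returns how many leading bases to keep.

    Sketch of the partial-sum approach: subtract the cutoff from each
    quality, accumulate from the 3' end, and cut where the sum is lowest.
    Scanning stops once the running sum becomes positive.
    """
    running = 0
    lowest = 0
    cut = len(quals)
    for i in range(len(quals) - 1, -1, -1):
        running += quals[i] - cutoff
        if running > 0:
            break
        if running < lowest:
            lowest = running
            cut = i
    return cut


def common_truncation_length(all_quals, cutoff=20):
    """Length trimming: one truncation length shared by all reads.

    Assumption: the length is the first position at which the mean
    quality across reads covering that position drops below the cutoff.
    """
    if not all_quals:
        return 0
    longest = max(len(q) for q in all_quals)
    for pos in range(longest):
        column = [q[pos] for q in all_quals if len(q) > pos]
        if mean(column) < cutoff:
            return pos
    return longest


if __name__ == "__main__":
    quals = [42, 40, 26, 27, 8, 7, 11, 4, 2, 3]
    print(quality_trim_index(quals, cutoff=10))   # 4
    print(concatenate_pair("ACGT", "AACG"))       # ACGTCGTT
```

With a cutoff of 10, the example read keeps its first four bases. The partial-sum rule tolerates an isolated higher-quality base (the 11) inside an otherwise low-quality tail, which a simple "cut at the first base below the threshold" rule would handle differently.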