Fig. 1 | BMC Bioinformatics


From: Software for rapid time dependent ChIP-sequencing analysis (TDCA)


TDCA analysis work flow, requirements, and performance. a Simplified work flow. Required input data are genomic coordinates in BED format and folders containing BAM TC sequence files. TDCA normalizes data based on the total sequencing coverage of each time point and also handles input files and replicates using additional normalization procedures. Loci can be modeled as one of the following categories of signal change: rise, fall, hill, or valley. An identity matrix that predicts the locus category is based on the times at which absolute minimum sequencing coverage (black arrows) and absolute maximum sequencing coverage (red arrows) occur, as set by user-defined thresholds. Each sigmoid color indicates a rise or fall with a different combination of absolute maximum and absolute minimum coverage positions in time with genuine leading and trailing points. Alternatively, users can model all their data to a single sigmoidal curve. The resulting parameters from data fitting are then reported to the user along with raw sequencing coverage calculations. Graphical output is provided to the user and can be enriched by specifying a genome and genes. R scripts are provided in case users would like to change the look of the default figures. b Plots show sequencing coverage (y-axis) over time (x-axis) at the locus with coordinates chromosome 1:5,012,338–5,013,264, obtained from an H3.3 ChIP-seq experiment [10], using the previously applied modeling strategies of inverse negative exponential (upper left) and multi-linear (upper right), and the sigmoidal fitting used by TDCA (lower). TDCA requires terminal access to SAMtools [23] for sequencing coverage calculation of BAM files, BEDTools [37] for BED file manipulations, and R with the drc [22] package for curve fitting. In the example shown here, parameters that govern data modeling by TDCA can be fine-tuned to result in either a single or double sigmoid. 
The lower and upper horizontal dashed lines represent the absolute minimum and absolute maximum coverage values, respectively. The overall sequencing coverage range at a locus is shown as a vertical dashed line with red arrows. In this case, the three data points marked with white arrows exceed the plateau range threshold (gray boxes) and are defined as genuine absolute maximum trailing data points. This results in double sigmoid modeling as shown here. Parameters for both sigmoids are reported to users. The plateau range threshold and leading/trailing threshold could be adjusted such that the locus is modeled to a single sigmoid.
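
As a rough illustration of the normalization and category assignment described in this caption, the Python sketch below scales locus coverage by the total sequencing coverage of each time point and then labels the locus as rise, fall, hill, or valley from the positions of its absolute minimum and maximum. It is not TDCA's code: the function names, the `flank` parameter, and the decision rules are assumptions made for illustration, and the sketch ignores the plateau range threshold, replicates, and input files.

```python
def normalize(coverage, totals):
    """Scale locus coverage at each time point by that time point's total coverage.

    Assumption: values are rescaled to the mean total coverage across time points.
    """
    mean_total = sum(totals) / len(totals)
    return [c * mean_total / t for c, t in zip(coverage, totals)]


def is_interior(index, n_points, flank):
    """True if at least `flank` points lie on both sides of `index`."""
    return index >= flank and (n_points - 1 - index) >= flank


def classify(coverage, flank=2):
    """Label a locus as 'rise', 'fall', 'hill', or 'valley'.

    Hypothetical rule set: an extremum with enough leading and trailing
    points counts as interior; an interior maximum gives a hill, an
    interior minimum gives a valley, and otherwise the order of the
    minimum and maximum in time decides between rise and fall.
    """
    n_points = len(coverage)
    i_min = coverage.index(min(coverage))
    i_max = coverage.index(max(coverage))
    max_interior = is_interior(i_max, n_points, flank)
    min_interior = is_interior(i_min, n_points, flank)
    if max_interior and not min_interior:
        return "hill"
    if min_interior and not max_interior:
        return "valley"
    return "rise" if i_min < i_max else "fall"


if __name__ == "__main__":
    raw = [10, 40, 90, 50, 20]          # made-up locus coverage per time point
    totals = [100, 100, 100, 100, 100]  # made-up total coverage per time point
    print(classify(normalize(raw, totals)))
```

In this simplified rule set, a larger `flank` demands more leading and trailing points before an extremum is treated as interior, which loosely mirrors how the leading/trailing threshold can shift a locus between a double sigmoid (hill or valley) and a single sigmoid (rise or fall).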
