Fig. 6 | BMC Bioinformatics

Fig. 6

From: Anomaly detection in genomic catalogues using unsupervised multi-view autoencoders

a Confirmation of the biological meaningfulness of the identified correlations. Scores are taken after applying all normalizations described in Methods. We consider pairs of transcriptional regulators (TRs). For each pair {A,B}, we give the scores of peaks from A when B is also present in the same CRM (blue) or when B is absent (red). This is the same elementary operation as the Q-score, except that we do not average across the X axis but take the actual peak value. Most examples presented are of TRs with high correlation, such as GABPA and ERG in Jurkat, which have many common binding sites, ELL2 and AFF4 in HeLa, or RCOR1 and SFMBT1 in HeLa, which are both repressors. When TRs correlate, our model will have learned this and assigns higher scores to the peaks of a TR when one of its correlators is present. We also provide some counter-examples: CTCF and GABPA in Jurkat have an R coefficient of 0.2, which is high for CTCF but low for GABPA (GABPA is often seen with CTCF, but CTCF has other partners than GABPA), and as such the impact on the score is also unidirectional. Finally, the pairs framed in red, such as CTCF and RUNX in Jurkat or RCOR1 and ZNF143 in HeLa, have a low correlation coefficient. For them, the presence of one TR of the pair has little to no impact on the score of the other. For cases such as AFF4 and ELL2 in HeLa, which have one major correlator (namely, each other), the distribution of all scores (blue and red merged) is rather bimodal, as the presence of the other acts as a binary switch.

b For each processed CRM, average (left) and maximum (right) score of the peaks present in it, as a function of the total number of peaks in the CRM. The number of peaks is given on a log2 axis. As TRs tend to work in complexes, it makes sense that richer CRMs would on average be of better quality. However, the relation is not strictly linear: CRMs with supernumerary peaks likely contain noise, which is reflected here in a lower average score.
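The two groupings described in this caption can be illustrated with a minimal sketch. It assumes each peak is available as a (CRM identifier, TR name, normalized score) triple; the data layout, the toy values, and the function names `split_scores_by_partner` and `crm_summary` are hypothetical and are not taken from the article's code.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: one (crm_id, regulator, score) triple per peak.
PEAKS = [
    ("crm1", "A", 0.9), ("crm1", "B", 0.8),
    ("crm2", "A", 0.3),
    ("crm3", "A", 0.7), ("crm3", "B", 0.6), ("crm3", "C", 0.2),
    ("crm4", "C", 0.5),
]


def split_scores_by_partner(peaks, a, b):
    """Panel a: scores of peaks from `a`, split by whether `b`
    has at least one peak in the same CRM (no averaging)."""
    regulators_in_crm = defaultdict(set)
    for crm, regulator, _ in peaks:
        regulators_in_crm[crm].add(regulator)

    with_b, without_b = [], []
    for crm, regulator, score in peaks:
        if regulator != a:
            continue
        if b in regulators_in_crm[crm]:
            with_b.append(score)      # plotted in blue in the figure
        else:
            without_b.append(score)   # plotted in red in the figure
    return with_b, without_b


def crm_summary(peaks):
    """Panel b: per CRM, (number of peaks, mean score, maximum score)."""
    scores = defaultdict(list)
    for crm, _, score in peaks:
        scores[crm].append(score)
    return {crm: (len(s), mean(s), max(s)) for crm, s in scores.items()}


if __name__ == "__main__":
    blue, red = split_scores_by_partner(PEAKS, "A", "B")
    print("A with B:", blue, "| A without B:", red)
    for crm, (n, avg, top) in sorted(crm_summary(PEAKS).items()):
        print(crm, n, round(avg, 3), top)
```

Calling `split_scores_by_partner` twice with the arguments swapped gives both directions for a pair, which is how a unidirectional effect such as the CTCF and GABPA case would show up in this sketch.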
