Fig. 5 | BMC Bioinformatics

Fig. 5

From: Improved quality metrics for association and reproducibility in chromatin accessibility data using mutual information

Random forest prediction of experimental relationships. A Distributions of the coefficient of determination (\(R^2\)) and normalized mutual information scores calculated on binned counts of WFpkm between ATAC-seq experiments. Blue, orange, and green dots mark comparisons between independent experiments, independent experiments using the same cell line, and true experimental replicates, respectively. B Example confusion matrix from a random forest model that uses \(R^2\) and normalized mutual information as features to predict the experimental relationships (y-axis) presented in A (x-axis). The confusion matrix shows the results of the model on a hold-out set (40% of the data, accuracy = 95.12%). Light to dark colors depict the number of counts per class. C Bivariate plot displaying the change in paired importance scores from ten-fold cross-validation between the normalized mutual information (x-axis) and \(R^2\) (y-axis) features. Dashed lines depict the univariate means of the normalized mutual information and \(R^2\) scores. Blue and yellow colors depict the level of accuracy for each fold.
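
The two features named in the caption can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the function names, the equal-width binning, the default of 10 bins, and the arithmetic-mean normalization of the mutual information are all assumptions made for illustration, and the article may define these differently. The inputs `x` and `y` stand in for per-region signal values from two experiments.

```python
import math
from collections import Counter


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length signal vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant vector carries no linear association
    return cov * cov / (var_x * var_y)


def bin_values(values, n_bins):
    """Map each value to one of n_bins equal-width bins (assumed scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # Clamp so the maximum value falls in the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def entropy(labels):
    """Shannon entropy (in nats) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def normalized_mutual_information(x, y, n_bins=10):
    """Mutual information of the binned vectors, scaled to [0, 1].

    Normalization by the mean of the two marginal entropies is an
    assumption; other normalizations (min, max, geometric mean) exist.
    """
    bx = bin_values(x, n_bins)
    by = bin_values(y, n_bins)
    h_x, h_y = entropy(bx), entropy(by)
    h_xy = entropy(list(zip(bx, by)))  # joint entropy of paired bins
    mutual_information = h_x + h_y - h_xy
    denom = (h_x + h_y) / 2
    return mutual_information / denom if denom > 0 else 0.0
```

Each pair of experiments then yields one (\(R^2\), normalized mutual information) feature pair, which is the kind of two-feature input the caption describes for the random forest classifier.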
