Fig. 3 | BMC Bioinformatics


From: Anomaly detection in genomic catalogues using unsupervised multi-view autoencoders


Scaling of the information budget with the data. We used artificial data of dimension 8 × 16, but the TRs are subdivided into 8 groups instead of 2 as in Additional file 2: Fig. S2 (i.e., TRs 0 and 1 form one group, TRs 2 and 3 another, and so on up to TRs 14 and 15; the groups are indicated on the figure by one grey box per group). At data generation, the stack is placed in one of the 8 groups, all 8 groups being equiprobable. The model parameters were 24 convolutional filters and a learning rate of 1E−4; the number of neurons in the Dense layers changes during the grid search. With lower deep dimensions (and hence a lower information budget), the model is unable to learn the 8 existing correlation groups separately (B, left) and will instead learn fewer, larger groups. A larger budget was needed to learn the 8 groups (B, right). This highlights how the information budget must be adapted to the quantity of information in the data for a satisfactory result. Note that for this larger data, hundreds of neurons are required, compared to the smaller models used for the smaller data of Fig. 2. To help choose the budget, we propose a Q-score to quantify the quality of the rebuilding depending on the budget. This score assesses how well the model learns each existing pairwise correlation. More details about the Q-score of the models involved in this figure are presented in Additional file 7: Fig. S5
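The sketch below is a minimal illustration (not the authors' code) of the artificial-data scheme described in the caption: 8 × 16 matrices whose 16 TRs are split into 8 groups of 2, with each generated sample placing a stack of signal on one equiprobable group. The quality check at the end is only a rough stand-in for the idea behind the Q-score (comparing learned pairwise TR correlations to the true ones); the names generate_sample and pairwise_correlation_quality are hypothetical.

```python
# Hypothetical sketch of the caption's artificial-data setup and a
# pairwise-correlation quality check; not the paper's actual Q-score.
import numpy as np

N_POSITIONS, N_TRS, N_GROUPS = 8, 16, 8
GROUP_SIZE = N_TRS // N_GROUPS  # 2 TRs per group: (0,1), (2,3), ..., (14,15)

rng = np.random.default_rng(0)

def generate_sample():
    """Return one 8 x 16 matrix with a stack placed on a randomly chosen group."""
    x = np.zeros((N_POSITIONS, N_TRS))
    group = rng.integers(N_GROUPS)  # all 8 groups are equiprobable
    x[:, group * GROUP_SIZE:(group + 1) * GROUP_SIZE] = 1.0  # correlated TRs
    return x

def pairwise_correlation_quality(originals, reconstructions):
    """Compare the TR-TR correlation matrices of original and reconstructed
    data; values close to 1 mean the pairwise correlations are preserved."""
    def tr_corr(batch):
        flat = batch.reshape(-1, N_TRS)  # stack all positions of all samples
        return np.corrcoef(flat, rowvar=False)
    diff = np.abs(tr_corr(originals) - tr_corr(reconstructions))
    return 1.0 - diff.mean()

# Example: a reconstruction that merges groups pairwise (4 large groups
# instead of 8, as a too-small budget would do) scores lower than a
# faithful reconstruction.
X = np.stack([generate_sample() for _ in range(2000)])
faithful = X.copy()
merged = X.copy()
for g in range(0, N_GROUPS, 2):  # simulate merging groups g and g+1
    cols = slice(g * GROUP_SIZE, (g + 2) * GROUP_SIZE)
    merged[:, :, cols] = merged[:, :, cols].max(axis=2, keepdims=True)
print(pairwise_correlation_quality(X, faithful))  # close to 1
print(pairwise_correlation_quality(X, merged))    # noticeably lower
```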
