Figure 3 | BMC Bioinformatics

Figure 3

From: Construction and use of gene expression covariation matrix

Neighbourhood similarity for probe sets targeting a common gene. The cumulative distribution frequency (cdf) of the logarithm of the p-values is plotted for the following categories of probe set pairs in network HG-U95-NR10:

– unique pairs, i.e. genes targeted by exactly two probe sets (red: CORR, blue: ANTI);
– multiple pairs, i.e. genes targeted by more than two probe sets. In this case, we considered either the best p-values (magenta: CORR, cyan: ANTI) or the worst p-values (magenta interrupted: CORR, cyan interrupted: ANTI);
– random pairs, where the first probe set of unique pairs is matched with a probe set randomly selected from the second network (green: CORR, black: ANTI);
– random network, where the first probe set of unique pairs is matched with a probe set taken from the second network after its neighbours have been randomized (green interrupted: CORR, black interrupted: ANTI).

If the number of common neighbours is larger than expected, log10(p-value) is calculated (left part of the curves, < 0); otherwise the opposite of the logarithm, -log10(p-value), is calculated (right part of the curves, ≥ 0). A vertical line at log10(p-value) = -10 indicates the position of the inflection point used to tabulate the cdf values in Table 3. The strong inflection point at around -10 is an artefact of the algorithm, which cannot correctly calculate very low p-values.
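The signed logarithm used for the x-axis, and the cdf value read off at the vertical line, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the `floor` guard against log10(0), and the example p-values and neighbour counts are all hypothetical, chosen only to show the sign convention described in the caption.

```python
import math

def signed_log10_p(p_value, observed, expected, floor=1e-300):
    """Signed log10(p-value) following the caption's convention.

    `floor` is an assumption of this sketch (it avoids log10(0));
    the caption does not say how the original algorithm handles
    p-values too small to compute.
    """
    p = min(max(p_value, floor), 1.0)
    log_p = math.log10(p)  # always <= 0 for p in (0, 1]
    if observed > expected:
        return log_p       # more common neighbours than expected: left part, < 0
    return -log_p          # otherwise: opposite of the logarithm, right part, >= 0

def empirical_cdf(values, threshold):
    """Fraction of values that are <= threshold."""
    return sum(v <= threshold for v in values) / len(values)

# Hypothetical probe set pairs: (p-value, observed and expected common neighbours).
scores = [
    signed_log10_p(1e-12, observed=40, expected=5),
    signed_log10_p(1e-3, observed=12, expected=5),
    signed_log10_p(1e-2, observed=2, expected=5),
    signed_log10_p(0.5, observed=5, expected=5),
]

# cdf value at the vertical line log10(p-value) = -10.
fraction_below_minus_10 = empirical_cdf(scores, -10.0)
```

With these made-up inputs, only the first pair falls to the left of the vertical line at -10, so one quarter of the pairs is counted at that threshold.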
