Figure 1 | BMC Bioinformatics

Figure 1

From: Correlated mutations via regularized multinomial regression

Multinomial setup for correlated mutations. (A) Mapping of amino acid characters to numerical factors. Matrix A represents the multiple sequence alignment, in which each amino acid is mapped to an integer (1-21). (B) Matrix M is the binary matrix into which A is converted: each column of A is expanded into 21 columns in M, as shown for one particular column (column 10, A10, indicated by the blue box). In this expansion, for each entry in A, the entry of M whose index equals that integer value (14 and 12 in the example of column 10) is set to 1; the other 20 entries are set to 0. (C) Multinomial regression is used to find links between each column Ai of A and all the columns in M-i, i.e. all columns in M except those representing Ai. To do so, each column of A in turn is used as the dependent variable (Y), and all the columns in M that do not refer to that column of A are used as independent variables (X). In the example in this figure, Y = A1 (indicated by the red box) and X = {M2, ..., M9, M10}. Note that only part of X is shown.
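The expansion in panel B and the choice of Y and X in panel C can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names `one_hot_expand` and `design_for_column`, the toy alignment, and the 0-based column indexing are assumptions made for illustration, and the regularized regression fit itself is left out.

```python
import numpy as np

N_STATES = 21  # number of integer codes per alignment column (1-21, as in panel A)


def one_hot_expand(A, n_states=N_STATES):
    """Expand an integer-coded alignment A (rows = sequences, columns =
    positions, values in 1..n_states) into a 0/1 matrix M that holds
    n_states columns for every column of A."""
    A = np.asarray(A)
    n_seq, n_pos = A.shape
    M = np.zeros((n_seq, n_pos * n_states), dtype=np.int8)
    # Row index of every entry of A, in row-major order.
    rows = np.repeat(np.arange(n_seq), n_pos)
    # Column in M: offset of the position's block plus (code - 1).
    cols = (np.arange(n_pos) * n_states + (A - 1)).ravel()
    M[rows, cols] = 1
    return M


def design_for_column(A, M, i, n_states=N_STATES):
    """Return (Y, X) for alignment column i (0-based, an assumption of this
    sketch): Y is column i of A, X is M without the block that encodes i."""
    A = np.asarray(A)
    keep = np.ones(M.shape[1], dtype=bool)
    keep[i * n_states:(i + 1) * n_states] = False
    return A[:, i], M[:, keep]


# Hypothetical toy alignment: 3 sequences, 2 positions.
A = np.array([[14, 3],
              [12, 3],
              [14, 7]])
M = one_hot_expand(A)
Y, X = design_for_column(A, M, 0)
```

Here `Y` holds the integer codes of the first alignment column and `X` holds only the indicator columns of the remaining position, which is the input a multinomial regression routine would receive for that column.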