 Research
 Open Access
 Published:
Tractable RNA–ligand interaction kinetics
BMC Bioinformatics volume 18, Article number: 424 (2017)
Abstract
Background
The binding of small ligands to RNA elements can cause substantial changes in the RNA structure. This constitutes an important, fastacting mechanism of ligandcontrolled transcriptional and translational gene regulation implemented by a wide variety of riboswitches. The associated refolding processes often cannot be explained by thermodynamic effects alone. Instead, they are governed by the kinetics of RNA folding. While the computational analysis of RNA folding can make use of wellestablished models of the thermodynamics of RNA structures formation, RNA–RNA interaction, and RNA–ligand interaction, kinetic effects pose fundamentally more challenging problems due to the enormous size of the conformation space. The analysis of the combined process of ligand binding and structure formation even for small RNAs is plagued by intractably large state spaces. Moreover, the interaction is concentrationdependent and thus is intrinsically nonlinear. This precludes the direct transfer of the strategies previously used for the analysis of RNA folding kinetics.
Results
In our novel, computationally tractable approach to RNA–ligand kinetics, we overcome the two main difficulties by applying a gradientbased coarse graining to RNA–ligand systems and solving the process in a pseudofirst order approximation. The latter is welljustified for the most common case of ligand excess in RNA–ligand systems. We present the approach rigorously and discuss the parametrization of the model based on empirical data. The method supports the kinetic study of RNA–ligand systems, in particular at different ligand concentrations. As an example, we apply our approach to analyze the concentration dependence of the ligand response of the rationally designed, artificial theophylline riboswitch RS3.
Conclusion
This work demonstrates the tractability of the computational analysis of RNA–ligand interaction. Naturally, the model will profit as more accurate measurements of folding and binding parameters become available. Due to this work, computational analysis is available to support tasks like the design of riboswitches; our analysis of RS3 suggests strong cotranscriptional effects for this riboswitch.
The method used in this study is available online, cf. Section “Availability of data and materials”.
Background
Riboswitches enable the specific response to the presence of ligands by transcriptional or translational control of gene expression. Their ability to switch genes on or off depending on small molecules such as theophylline or tetracycline makes them valuable biotechnological tools. The design of tailored riboswitches for specific applications and advanced control logic is therefore an attractive endeavor in synthetic biology [1]. A riboswitch can be understood as the composition of its aptamer and its actuator domain. It senses the ligand by binding it to a binding pocket of the aptamer domain; this influences the conformations of the actuator domain and thereby leads to a measurable response to ligand binding, e. g. by terminating transcription (OFFswitch) or suppressing the terminator hairpin (ONswitch).
The computational design of artificial riboswitches requires a sufficiently accurate model of the ligand binding process and the structural response of the RNA to ligand binding. The equilibrium thermodynamics of RNA–ligand binding has been studied for RNA–RNA interactions, e. g. in [2, 3], and for small molecule binding in RNA–ligand [4]. As in the case of single molecule RNA folding, purely thermodynamic models are sometimes insufficient because they disregard the dynamics of the process. This can cause dramatic mispredictions. Various approaches have analyzed the kinetics of single molecule RNA folding [5–8]. For tractability, the continuous process is decomposed into elementary steps, simplified based on heuristic assumptions, and/or approximated by a coarsegrained process.
Wolfinger et al. [7] present a coarsegraining approach to approximate RNA folding. Following e. g. [6], they analyze the folding process on the energy landscape of conformations, i. e. secondary structures, R _{ i } of an RNA R. Conformation change is modeled by elementary moves (base pair insertion or deletion) endowed with reaction rates that follow the Arrhenius rule and thus depend on the energy barrier between the source and target conformations. In the approximation of RNA secondary structures, activation energies for opening/closing of single base pairs are approximately constant. The energy barrier thus effectively depends only on the energy difference between source and target [7]. This defines a Markov Process on the state space of all secondary structures, which is too large to make it possible to analyze it by diagonalizing the corresponding rate matrix. To effectively reduce the state space, [7] combine states into basins that consist of all conformations that are connected to the same local minimum by their gradient walk on the energy landscape. Since gradient walks connect states to their lowest energy neighbors, they correspond to the fastest folding paths from a state into a local minimum. This provides the rationale for approximating the full process by the macroprocess on gradient basin macrostates, which are assumed to be equilibrated. Consequently, the rates between the macrostates are canonically derived as weighted sums of microrates of the original process. By employing this heuristic approach, the size of the conformation space of smaller RNA molecules of up to a hundred nucleotides is typically reduced to just a few thousand macrostates. For example, in our analysis of the 81 nucleotide long riboswitch RS3 [1], 11.4 millions of secondary structures are mapped to 1133 macrostates. The macroprocess is finally solved by diagonalization. In our approach we reuse ideas of this coarsegraining, which also allows us to reuse several tools for singlemolecule RNA kinetics (RNAsubopt [9], barriers [10], treekin [7]).
RNA–ligand interaction model
We are going to describe a reaction system of the RNA R (given by its sequence of nucleotides {A,C,G,U}) with the ligand L at the level of RNA and binding complex conformations, such that we can study the kinetics of association, dissociation, and conformation changes. For simplicity, we assume that there is only a single ligand conformation (also denoted L). In the same way as a single RNA molecule transitions between various conformations during the folding process, the complex of RNA and ligand adopts different conformations L R _{ i }. Importantly, only a subset of the RNA conformations binds the ligand. The part of the total state space that corresponds to the RNA–ligand complexes is therefore isomorphic to a subset of the state space of the free RNA molecule. Thus, our system of consideration consists of the reactions
for all i,j∈{1,…,N}, i≠j. According to the rate laws for elementary reactions, the rates of each of these reactions depend on specific rate constants and the concentrations of the molecules. The reactions 1 and 4 only have nonzero rate constants, if the RNA conformations R _{ i } and R _{ j } are related by an elementary move such as the insertion or deletion of a base pair. Moreover, L and R _{ i } can interact only for the subset of the R _{ i } that form an appropriate binding pocket; otherwise, the complex L R _{ i } is deemed unstable and thus excluded from the model. Since RNA conformations correspond to RNA secondary structures, the energies of monomer states can be calculated from the Turner energy model [11]. For dimer states, we add the aptamer–ligandspecific binding energy. For the exemplary studied riboswitch RS3, this energy can be derived from the empirical dissociation constant [1]. Finally, we derive the rate constants as Metropolis rates with appropriate preexponential factors that can be estimated from empirical rates. Note that the rates of base pair opening k ^{−} and closing k ^{+} are directly related by the energy change Δ G due to the closing. Concretely, k ^{−}/k ^{+}= exp(Δ G/R T) for Δ G<0. Experimental values are available for the zippering rate, which corresponds to the rate of closing the last hairpin in a helix. A careful analysis in [12] yields a value in the range 4.7·10^{7} to 10^{9}s^{−1} roughly consistent with earlier estimates [13–15]. In principle, a kinetic constant can be derived for the closing of first base pair in a loop from a wormlike chain model [12, 16]; following earlier work on RNA kinetic models [7], we use here a single kinetic parameter k ^{+} for all base pairs. An empirical rate of one specific theophylline aptamer association was reported as 600 M^{−1}s^{−1} [17], which may serve as rough estimate for comparable systems. Note that [17] measured the macroscopic apparent rate that depends on the rate of dimerization as well as the rate of refolding into structures with theophylline binding pocket.
While the Reactions 1, 3, and 4 are of first order, the second order association in Reaction 2 introduces nonlinearity into the system. Assuming ligand excess, which is a very plausible assumption for small molecular ligands, however, it is possible to devise a pseudofirst order approximation of the system.
Even if the reaction equations above appropriately model the RNA–ligand interaction, this system is still computationally intractable for typical riboswitch sizes. As a remedy, we construct a coarsegrained process based on the separate gradientbasins for the monomer and dimerstates. The monomer states with suitable binding pocket are connected to dimer states, cf. Fig. 1. Importantly, there is no direct mapping from monomer macrostates to dimer macrostates of our coarsegrained system because conformations without binding pockets are absent from the “dimer world”. The upper basin in the “monomer world” of Fig. 1 is subdivided into two basins in the dimer world; conversely, the middle and lower monomer basins correspond to a single basin of the dimer world.
Contributions
We start by elaborating the general macroprocess of RNA–ligand interaction, based on gradient basin macrostates, and derive the corresponding rate constants. This contributes an original description of coarsegrained interaction processes; it is the first fundamental prerequisite for our tractable RNA–ligand kinetics approach. Furthermore, we leverage that a wide spectrum of biological RNA–ligand systems operate under strong ligand excess, justifying the pseudofirst order approximation. On these grounds, we establish the first analytical approach for RNA–ligand interaction kinetics. Based on solving the master equation of the interaction process, this enables the computation of timedependent macrostate probabilities. Finally, we study the kinetics of the artificially designed riboswitch RS3 [1] interacting with theophylline. We analyze the system at different concentrations and present results that strongly suggest cotranscriptional effects.
Methods
We consider the fixed interaction system of the RNA R and the ligand L. Let X denote the set of all monomer microstates, X={R _{ i }∣i=1,…,N}; in our setting, the R _{ i } are the secondary structures of a given RNA sequence. The subset X ^{+}⊆X comprises the conformations that can bind the ligand. Here X ^{+} contains all states with a specific binding pocket. Furthermore, define X ^{∗} as the set of dimer microstates L R _{ i }, X ^{∗}={L R _{ i }∣R _{ i }∈X ^{+}}⊆{L R _{ i }∣i=1,…,N}.
A dimer microstate L R _{ i }∈X ^{∗} has the energy E(R _{ i })+θ, where θ<0 denotes the binding energy of R and L. The inverse temperature is \(\mathfrak {b}=\frac {1}{RT}\), where T is the absolute temperature and R is the universal gas constant. For a set S⊆X∪X ^{∗} of microstates, let \(Z[\,{S}\,] := \sum _{x\in S}\exp (\mathfrak {b} E(x))\) denote the partition function of S. The probability of a microstate x in S is then given by
Let x,y∈X∪X ^{∗} be microstates. The microrate constant from x to y is denoted k(x→y) (or k(y←x)). On microstates x,y, we define the symmetric neighborhood relation \(\,\mathcal {N}{}\) such that \({x}\,\mathcal {N}{y}\) holds if and only if x and y have a distance of exactly one elementary move.
For \({x}\,\mathcal {N}{y}\), let
be the activation energy of the transition x→y as defined by the Metropolis rule. Accordingly, the microrate constants for distinct states x,y∈X∪X ^{∗}, where \({x}\,\mathcal {N}{y}\), are defined as
otherwise, define k(x→y):=0. A(x→y) denotes the reactionspecific preexponential factor. For our purposes, we assume that this factor depends only on the type of reaction and the factors for conformation change in monomers and dimers are equal. Thus, we distinguish the factors A _{a} for association, A _{d} for dissociation, and A _{R} for conformation changes of the RNA secondary structure. As we will show later, A _{a}=A _{d} due to detailed balance.
Denote the powerset of a set S by \(\operatorname {\mathcal {P}}(S)\). A monomer (or dimer) macrostate is a set of monomer (or dimer) microstates, i. e. an element of \(\operatorname {\mathcal {P}}(X)\) (\(\operatorname {\mathcal {P}}({X}^{*}))\). We denote the (macro)rate constant from macrostate α to β by r(α→β) (or r(β←α)). Macrorate constants are defined by summing over microrate constants and their respective state probabilities, i. e.
We emphasize that we use the term macrostates freely to denote general sets of microstates. Only when we introduce specific partitions of the microstates into macrostates, it makes sense to distinguish represented macrostates of our specific coarsegrained system from other sets of microstates.
Results and discussion
Macrostate kinetics of RNA–ligand interaction
For a microstate x∈X ^{+}, we denote its corresponding dimer microstate (after binding to L) by Lx, i. e. for x=R _{ i }, L x=L R _{ i }. This notation is raised to sets of microstates by defining L α:={L x∣x∈α}. Lemma 1 below asserts that the rate constants between dimer microstates and macrostates can be computed exactly like rate constants of monomer states.
Lemma 1
For x,y∈X ^{+}, k(L x→L y)=k(x→y). Furthermore, Pr[ L x∣L α ]=Pr[ x∣α ] holds for all \(\alpha \in \operatorname {\mathcal {P}}({X}^{+})\). Finally, r(L α → L β)=r(α→β), for all macrostates \(\alpha,\beta \in \operatorname {\mathcal {P}}({X}^{+}).\)
Proof
The individual claims follow easily from the definitions. If \(({x},{y})\notin \mathcal {N}\), k(x→y)=0=k(L x→L y), so assume \({x}\,\mathcal {N}{y}\). Since
Furthermore,
Finally,
□
The microrate constant from monomer to dimer states is constant, whereas the back rate depends on the binding energy θ.
Lemma 2
(Association and dissociation microrate constants) For x∈X ^{+}, the rate of association is k(x→L x)=A _{a}, while the dissociation rate is \(k({Lx \rightarrow x})=A_{\mathrm {d}}\exp (\mathfrak {b} \theta).\) All other rates between monomer and dimer microstates are 0.
Proof
By Metropolis rule, for x∈X ^{+},
since \(E^{\Delta }_{xLx} = \max \{E(x),E(Lx)\}E(x)\) and, additionally, E(L x)=E(x)+θ _{ L }≤E(x). Analogously, for the inverse microrate,
□
The association (dissociation) microrates due to Lemma 2 induce corresponding macrorates, which additionally depend on the probability of the associable (dissociable) microstates in the source macrostate, respectively (Lemma 3).
Lemma 3
(Association and dissociation macrorate constants) For arbitrary \(\alpha \in \operatorname {\mathcal {P}}(X)\) and \(\beta \in \operatorname {\mathcal {P}}({X}^{+})\), equation \(r({\alpha {\rightarrow }\,L\beta })= A_{\mathrm {a}}\frac {Z[\,{\alpha \cap \beta }\,]}{Z[\,{\alpha }\,]} \) holds. Additionally, \(r({L\beta \rightarrow \alpha })= A_{\mathrm {d}}\frac {Z[\,{\alpha \cap \beta }\,]}{Z[\,{\beta }\,]}\cdot \exp (\mathfrak {b}\theta).\)
Proof
Let \(\alpha \in \operatorname {\mathcal {P}}(X)\) and \(\beta \in \operatorname {\mathcal {P}}(X^{+})\). Thus
□
A tractable model under ligand excess
For our coarsegrained RNA–ligand interaction process, we partition the monomer microstates X and the dimer microstates X ^{∗} into sets of macrostates Ξ and Ξ ^{∗}, respectively. For the theoretical discussion, we require only that Ξ and Ξ ^{∗} are partitions of the respective sets X and X ^{∗}. Later, in our application, we are going to define macrostates as gradient basins (within their respective component).
We denote the monomer macrostates (in Ξ) by α _{1},…,α _{ n } and the dimer macrostates (in Ξ ^{∗}) by β _{1},…,β _{ m }. Since—by model assumption—the ligand is in large excess, the change of the ligand concentration [L] is essentially negligible in relation to the change of RNA concentrations. Formally, we assume d/d t[L]=0, i. e. at all times [L]=l _{0}, for the initial ligand concentration l _{0}. The change of RNA monomer and RNA–ligand dimer concentrations over time is described by a system of ordinary differential equations (ODEs) corresponding to Reactions (1)–(4).
Following the firstorder rate laws, Reaction (1) causes n ^{2}−n flows r(α _{ i }→α _{ j })[α _{ i }] from α _{ i } to α _{ j } (1≤i,j≤n, i≠j); Reaction (4) m ^{2}−m flows r(β _{ i }→β _{ j })[β _{ i }] from β _{ i } to β _{ j } (1≤i,j≤m, i≠j); and Reaction (3) n·m flows r(β _{ i }→α _{ j })[β _{ i }] from β _{ i } to α _{ j } (1≤i≤m and 1≤j≤n). In contrast to these simple firstorder transitions, the state changes due to Reaction 2 follow secondorder rate laws contributing the n·m flows r(α _{ i }→β _{ j })[L][α _{ i }] from α _{ i } to β _{ j } (1≤i≤n and 1≤j≤m). Without the assumption d/d t[L]=0, the rate would depend on two variable concentrations, causing the system to be nonlinear. However, by our assumption, the concentration [L] is constant.
The system of ODEs. The change of concentrations is now described by summing over single contributions:
for i=1,…,n, and
for j=1,…,m.
We set γ:=(α _{1},…,α _{ n },β _{1},…,β _{ m })^{T} and define the (n+m) × (n+m)matrix R(l _{0}). Then the entire coarsegrained system under ligand excess can be expressed by the linear ODE \( \frac {d}{dt} [\gamma ] = R(l_{0}) [\gamma ], \) where
is constructed from four submatrices:

is an n × nmatrix with entries a _{ i j }=r(α _{ i }←α _{ j }) for 1≤i,j≤n, i≠j. For 1≤i≤n,
$$ a_{ii} := \sum_{{\substack{1\leq k\leq n\\k \neq i}}} r({\alpha_{k} \leftarrow \alpha_{i}}) \sum_{{1\leq k\leq m}} l_{0} r({\beta_{k} \leftarrow \alpha_{i}}). $$ 
is an m × mmatrix with entries b _{ i j }=r(β _{ i }←β _{ j }) for 1≤i,j≤m, i≠j. For 1≤j≤m,
$$ b_{jj} := \sum_{{1\leq k\leq n}}r({\alpha_{k} \leftarrow \beta_{j}})  \sum_{{\substack{1\leq k\leq m\\k \neq j}}} r({\beta_{k} \leftarrow \beta_{j}}). $$ 
is an n × mmatrix with entries c _{ i j }=r(α _{ i }←β _{ j }) for 1≤i≤n,1≤j≤m, and

is an m × nmatrix with entries d _{ i j }=r(β _{ i }←α _{ j }) for 1≤i≤m,1≤j≤n.
Computing RNA–ligand kinetics
The described ODE system can be solved analytically building on existing software. The entire computation pipeline consists of five major steps:

1.
enumeration of the RNA’s structure space

2.
computation of the gradient basins and corresponding rates for

(a)
the monomer landscape

(b)
the dimer landscape

(a)

3.
computation of the rates between the monomer and dimer basins

4.
construction of the full rate matrix R(l _{0})

5.
integration of the linear ODE system
Since an exhaustive enumeration of the structure space is infeasible even for short RNAs, Step 1 generates only a selected part of all possible secondary structures of the input RNA. For this work, we consider only structures up to a certain energy above the minimum free energy of the sequence as computed by RNAsubopt [9]. To further reduce the number of structures, only structures without any isolated base pairs are generated.
Often, the restriction to low energy structures excludes important microstates of relatively high energy such as the open RNA chain from the model. Simply adding such a structure to the system is insufficient without also including transitional structures that connect it to the remaining states. The solution for this work was to develop a heuristic algorithm to partially explore an energy landscape around a given structure of interest, flooding neighbored basins if their local minimum has a lower energy than the current one, until one reaches structures within the already explored energy band. This approach seems to be more adequate in the context of a gradient basin coarse graining than a direct path heuristic (e. g. findPath [6]).
In Step 2a, we compute the gradient basins and rates for the monomer landscape from the list of input structures using barriers [7] (with minh heuristic). For Step 2b, a list of all input structures that contain the binding pocket is generated with RNAsubopt’s constraint folding mode. This enables one to enumerate dimer structures up to a higher energy than possible for the entire landscape, ensuring the dimer world is connected. As shown in Lemma 1, the transition rates in the constrained dimer landscape are independent of the ligand’s binding energy and thus can be computed exactly like those of the monomer landscape.
In Step 3, the transition rates between monomer and dimer macrostates are computed based on Lemma 3 using the mapping of the monomer and dimer structures to their respective basins. For this purpose we modified barriers to output this information.
Step 4 yields the full rate matrix R(l _{0}) for one set of preexponential factors and a certain ligand concentration l _{0} by combining the previously computed rate constants. We emphasize that we can easily compute R(l _{0}) for different values of l _{0}, A _{a} and A _{R} without repeating the previous, more timeconsuming computation steps.
Finally, in Step 5 the system of ODEs is solved directly using the closed form \(\vec c(t) = \exp (tR(l_{0}))\cdot \vec c(0)\), where \(\vec c(t)\) is the vector of macrostate concentrations at time t. The exponential exp(t R(l _{0})) is obtained by diagonalizing R(l _{0}) numerically using the tool treekin [7], which performs this computation efficiently.
Parameters from empirical measurements
The binding energy θ can be derived from an empirically measured dissociation constant \({K}_{\mathrm {d}}^{\mathrm {A}}\) of the aptamer; e. g. in the case of theophylline, [18] measure a \({K}_{\mathrm {d}}^{\mathrm {A}}\) of 0.32 μ M for the theophylline aptamer of R S3. From the macroscopic measurement, we derive the binding energy as
where T _{A}=298K is the temperature of the measurement, R is the gas constant, and Pr[ “pocket”∣A,T _{A} ] denotes the equilibrium probability of the binding pocket in the aptamer at temperature T _{A} as calculated in the Turner energy model (cf. [1], which neglect the probability). This relation allows calculating the effective dissociation constant at temperature T _{R} of a theophylline riboswitch like RS3 that contains the aptamer, due to the inverse relation
For RS3 at T _{R}=313.15 K,
due to the pocketconstrained and unconstrained ensemble free energies in the Turner model. Thus,
For relating the rates of the different reaction types, one needs to estimate the preexponential factors of all reactions. Commonly, one assumes constant factors for each type of reaction. In reasonable approximation, we furthermore equate the factors for monomer and dimer conformation changes.
Given the apparent association rate \(A^{\mathrm {m}}_{\mathrm {a}}\) (which we assume to equal the macroscopic preexponential factor of dimerization), one can bound the microscopic preexponential factor A _{a}. If we assume that refolding is much slower than dimerization, then \(A^{\mathrm {m}}_{\mathrm {a}}\) is a product of the microrate and the equilibrium probability of the binding pocket. Conversely, if we assume the refolding to be much faster, than \(A^{\mathrm {m}}_{\mathrm {a}}\) directly measures the dimerization microrate. Thus,
In the case of theophylline,
and consequently, A _{a}≈Aam.
Finally, the preexponential factor for dissociation A _{d} equals A _{a}. This is a consequence of detailed balance of the dimerization reaction, i. e.
which implies
Empirical results
We apply our system to demonstrate the effect of changes in ligand concentrations to the interaction of the designed ONswitch RS3 from [1] with the ligand theophylline. Using our prototypical software, we precompute the macroprocess for RS3 including rate constants in several hours. As noted above, this yields 1133 macrostates in total, 22 of which are dimer states representing RNA molecules bound to the ligand. From the technical point of view, our system is described by 1133 coupled differential equations for the states α _{1},…,α _{1111} through β _{1},…,β _{22}. Subsequently, we compute kinetics for each combination of concentrations and preexponential factors within seconds (on a Core i5750 @ 4×2.67GHz). Figure 2 summarizes our results; each subfigure plots the probabilities of prominent monomer and dimer states over time. In addition, the minimum energy secondary structures is shown for the most important macrostates. It serves as a suitable representative of the macrostate’s ensemble of structures. In this sense it provides a useful, coarsegrained picture of the most likely refolding paths. We set the preexponential factors to the estimations A _{R}=10^{6}s^{−1} and A _{a}=600 M^{−1}s^{−1} as described before. This allows interpreting the time and ligand concentrations in concrete units and relates the speed of folding and dimerization.
Figure 2 a–c show the results for ligand concentrations 10^{4} M, 10^{5} M, and 10^{6} M. In the RS3 riboswitch, the aptamer domain is fused to a rhoindependent terminator at the 3’end. Thus, during transcription the aptamer is available shortly before the strong terminator stem can be formed and then dominates the entire structure ensemble. Therefore, we study a partially transcribed riboswitch RS3 that is shortened by the 3’half of the terminator stem and the 3’ polyU stretch. The kinetics of the shortened riboswitch are shown for concentrations of 10^{−7} M, 10^{−6} M, and 10^{−3}M in respective Fig. 2 d–f. Note that the time scales for interaction of RS3 with theophylline are in accordance with the computed dissociation constant \(K^{\text {RS3}}_{\mathrm {d}}\), which implies that the monomer and dimer concentrations are balanced at about 10^{4}M ligand concentration. This extreme concentration suggests that the riboswitch would be nonfunctional without further, probably cotranscriptional, effects. This is a plausible hypothesis since RS3 was designed to regulate at the transcriptional level.
The estimated rates are derived from a small number of empirical measurements at different conditions, such as ion concentrations (100 mM NaCl in [12], 5 mM MgCl _{2} and 0.5 M NaCl in [18], no Mg ^{2+} and 100 mM NaCl in [17]), temperatures, and actual sequences; hence they are not directly comparable. Nevertheless, they provide reasonable ball park estimates, because we observed that the qualitative behavior of the system is robust against variations of these parameters by several orders of magnitude.
Conclusions
Several refinements of the model remain for future research. Most importantly, the assumption of only a single binding motif is rather stringent. In general, one would like to support multiple binding sites with different binding energies. Our model can be naturally generalized to such scenarios by introducing multiple “dimer worlds” corresponding to different binding motifs. Furthermore, some ligands, such as Mg ^{2+} have multiple binding sites. The current implementation of the Arrhenius approximation of the RNA folding kinetics, finally, is quite simplistic, using only a single kinetic prefactor for all structural rearrangements. A refined model would presumably distinguishing constants for nucleation, stack extension, base pair sliding, and loop pinching. Moreover, in particular transcriptional riboswitches, which operate temporally coupled with the progressive transcription of the RNA, will be influenced by this kinetic interplay. It is well known that RNA chains commonly change their optimal structure while growing during transcription. Consequently, the RNAs refold during the process of transcription [8]. The framework presented here can be extended to cotranscriptional interaction analysis. However, this will require additional experimental measurements to calibrate the parameters of the model to properly relate the different “reaction” speeds. In particular, this entails accurate measurements of the thermodynamic parameters for the ligand binding and of the kinetic prefactors of folding and dimerization as well as the speed of transcription.
Abbreviations
 ODE:

Ordinary differential equation
References
 1
Wachsmuth M, Findeiss S, Weissheimer N, Stadler PF, Mörl M. De novo design of a synthetic riboswitch that regulates transcription termination. Nucleic Acids Res. 2013; 41(4):2541–51.
 2
Dimitrov RA, Zuker M. Prediction of hybridization and melting for doublestranded nucleic acids. Biophys J. 2004; 87(1):215–26.
 3
Bernhart SH, Tafer H, Mückstein U, Flamm C, Stadler PF, Hofacker IL. Partition function and base pairing probabilities of RNA heterodimers. Algorithms Mol Biol. 2006; 1(1):3.
 4
Borujeni AE, Mishler DM, Wang J, Huso W, Salis HM. Automated physicsbased design of synthetic riboswitches from diverse RNA aptamers. Nucleic Acids Res. 2015; 44(1):1–13. Available from: http://0dx.doi.org.brum.beds.ac.uk/10.1093/nar/gkv1289.
 5
Mann M, Kucharik M, Flamm C, Wolfinger MT. Memory efficient RNA energy landscape exploration. Bioinformatics. 2014; 30(18):2584–91.
 6
Flamm C, Fontana W, Hofacker IL, Schuster P. RNA folding at elementary step resolution. RNA. 2000; 6(3):325–38.
 7
Wolfinger MT, SvrcekSeiler WA, Flamm C, Hofacker IL, Stadler PF. Efficient computation of RNA folding dynamics. J Phys A: Math General. 2004; 37(17):4731–41. Available from http://0stacks.iop.org.brum.beds.ac.uk/03054470/37/4731.
 8
Hofacker IL, Flamm C, Heine C, Wolfinger MT, Scheuermann G, Stadler PF. BarMap: RNA folding on dynamic energy landscapes. RNA. 2010; 16(7):1308–16.
 9
Lorenz R, Bernhart SH, Höner Zu Siederdissen C, Tafer H, Flamm C, Stadler PF, et al. ViennaRNA Package 2.0. Algorithms Mol Biol. 2011; 6:26.
 10
Flamm C, Hofacker IL, Stadler PF, Wolfinger MT. Barrier trees of degenerate landscapes. Zeitschrift für Physikalische Chemie. 2002; 216(2/2002). Available from: http://0dx.doi.org.brum.beds.ac.uk/10.1524/zpch.2002.216.2.155.
 11
Turner DH, Mathews DH. NNDB: the nearest neighbor parameter database for predicting stability of nucleic acid secondary structure. Nucleic Acids Res. 2009; 38((Database)):D280–2. Available from: http://0dx.doi.org.brum.beds.ac.uk/10.1093/nar/gkp892.
 12
Kuznetsov SV, Ansari A. A Kinetic Zipper Model with Intrachain interactions applied to nucleic acid hairpin folding kinetics. Biophys J. 2012; 102(1):101–11. Available from: http://0dx.doi.org.brum.beds.ac.uk/10.1016/j.bpj.2011.11.4017.
 13
Pörschke D. Model Calculations on the Kinetics of Oligonucleotide Double Helix Coil Transitions. Evidence for a fast chain sliding reaction. Biophys Chem. 1974; 2:83–96.
 14
Cocco S, Marko JF, Monasson R. Slow nucleic acid unzipping kinetics from sequencedefined barriers. Eur Phys J E Soft Matter. 2003; 10:153–61.
 15
Zhang W, Chen SJ. RNA hairpinfolding kinetics. Proc Natl Acad Sci USA. 2002; 99:1931–6.
 16
Toan NM, Morrison G, Hyeon C, Thirumalai D. Kinetics of loop formation in polymer chains. J Phys Chem B. 2008; 112:6094–106.
 17
Latham MP, Zimmermann GR, Pardi A. NMR (Chemical Exchange as a Probe for LigandBinding Kin)letics in a TheophyllineBinding RNA Aptamer. J Am Chem Soc. 2009; 131(14):5052–53. Available from: http://0dx.doi.org.brum.beds.ac.uk/10.1021/ja900695m.
 18
Jenison RD, Gill SC, Pardi A, Polisky B. Highresolution molecular discrimination by RNA. Science. 1994; 263(5152):1425–9.
 19
Kühnl F, Stadler PF, Will S. Tractable Kinetics of RNALigand Interaction In: Bourgeois A, Skums P, Wan X, Zelikovsky A, editors. Bioinformatics Research and Applications: 12th International Symposium. vol. 9683 of Lecture Notes in Bioinformatics. Berlin: Springer International Publisher: 2016. p. 337–8.
Acknowledgements
A twopage abstract of this work appeared at ISBRA’16 [19].
Funding
This work is supported by the German Federal Ministry of Education and Research (BMBF; support code 031A538B) within the German Network for Bioinformatics Infrastructure “de.NBI” and by the German Research Foundation (DFG; grant STA 850/151). The authors acknowledge funding support for publication charges from the German Research Foundation (DFG) and Universität Leipzig within the program of Open Access Publishing.
The authors declare that the funding body has not been influencing the design or conclusion of this study.
Availability of data and materials
The method used in this study is available as free software at www.bioinf.unileipzig.de/~felix/software/RLIkin.
About this supplement
This article has been published as part of BMC Bioinformatics Volume 18 Supplement 12, 2017: Selected articles from the 12th International Symposium on Bioinformatics Research and Applications (ISBRA16): bioinformatics. The full contents of the supplement are available online at https://0bmcbioinformaticsbiomedcentralcom.brum.beds.ac.uk/articles/supplements/volume18supplement12.
Author information
Affiliations
Contributions
FK implemented the method and conducted experiments. FK, PFS, and SW designed the algorithms and experiments. All authors wrote the manuscript. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Kühnl, F., Stadler, P.F. & Will, S. Tractable RNA–ligand interaction kinetics. BMC Bioinformatics 18, 424 (2017). https://0doiorg.brum.beds.ac.uk/10.1186/s1285901718235
Published:
DOI: https://0doiorg.brum.beds.ac.uk/10.1186/s1285901718235
Keywords
 RNA secondary structure prediction
 RNA interaction kinetics
 RNA–ligand interaction
 Riboswitches