Cooperative binding: a multiple personality

Cooperative binding has been described in many publications and has been related to or defined by several different properties of the binding behavior of the ligand to the target molecule. In addition to the commonly used Hill coefficient, other characteristics such as a sigmoidal shape of the overall titration curve in a linear plot, a change of ligand affinity of the other binding sites when a site of the target molecule becomes occupied, or complex roots of the binding polynomial have been used to define or to quantify cooperative binding. In this work, we analyze how these properties are related in the most general model for binding curves based on the grand canonical partition function, and we present several examples which highlight differences between the properties used to characterize cooperativity. Our results mainly show that no two of the presented definitions fully coincide. Moreover, this work poses the question whether it makes sense to distinguish between positive and negative cooperativity based on the macroscopic binding isotherm only. This article emphasizes that scientists who investigate cooperative effects in biological systems could help avoid misunderstandings by stating clearly which kind of cooperativity they discuss.


Introduction
The topic of cooperative binding has been addressed in many publications (e.g. Ben-Naim 2001; Gutierrez et al. 2012; Hunter and Anderson 2009; Ullmann and Ullmann 2011) and different underlying equations, measures and definitions of "cooperativity" have been used since the discovery of cooperative binding of oxygen to hemoglobin (Hill 1910, 1913). Among the different equations which have been used to describe binding curves are the Hill equation, the Adair equation and the Klotz equation (for a review of the history of these equations see e.g. Stefan and Le Novère 2013), and several "measures" or "indicators" have been developed, sometimes based on the validity of the respective model. Hill (1985) provided a rigorous treatment of cooperative binding based on statistical mechanics nearly thirty years ago (which included results of even earlier work) and in particular highlighted that different concepts of cooperativity do not necessarily coincide. Nevertheless, these consistency problems are still not fully recognized by the scientific community investigating cooperative phenomena. For instance, Hill (1985) clearly highlighted that the significance of the A. V. Hill coefficient (in the form in which it was initially generalized from the Hill to the Adair equation) as an indicator of cooperativity depends on a certain kind of symmetry between the binding sites, and that it should be substituted by another quantity if an asymmetric system is considered. In this work, we will follow Hill's treatment of binding curves in the setup of the grand canonical ensemble, analyze under which circumstances different measures of cooperativity coincide and highlight difficulties which arise if different measures are used. In particular, we present several hypothetical binding systems to illustrate differences between the concepts of cooperativity.
We start with a recapitulation of the grand partition function in the next section which is followed by a section on definitions of cooperative binding related to the ligand activity ("effective concentration") dependent behavior on the macrostates (the number of bound ligands) and a section on definitions of cooperative binding on the microstates (including the information which sites are occupied). Note that this work deals with cooperative ligand binding. The term "cooperativity" will not be discussed in other contexts.

The grand canonical ensemble
The grand canonical ensemble defines a family of parameterized probability distributions on the possible states of a small subsystem which is embedded in a much larger system with which it can exchange energy and particles. The chemical activity ("effective concentration") of the particle in the whole system and the temperature of the whole system are the parameters. If we regard a molecule with n ligand binding sites in solution as such a subsystem, we obtain the following probability for a binding state $k = (k_1, \ldots, k_n) \in \{0, 1\}^n$, where $k_i = 0$ means that site i is unoccupied and $k_i = 1$ that site i is occupied (Schellman 1975):

$$P_\lambda(k) = \frac{g(k)\,\lambda^{|k|}}{\sum_{k' \in \{0,1\}^n} g(k')\,\lambda^{|k'|}} \qquad (1)$$

with

$$g(k) = \exp\left(-\frac{G(k)}{RT}\right) \qquad (2)$$

the so-called Boltzmann factor, G(k) the free energy of the molecule in state k compared to a fixed, arbitrarily chosen reference state, R the Boltzmann constant (in molar units, the gas constant), T the absolute temperature in K, λ ≥ 0 the thermodynamic activity of the ligand in the environment and |k| the number of occupied sites in microstate k (note that the domain λ ≥ 0 will underlie all statements in this work). The overall titration curve is given by the expectation of the number of bound ligands as a function of the activity of the ligand in the solution at fixed temperature (also called "isotherm"):

$$\Psi(\lambda) = \frac{n\,a_n\lambda^n + (n-1)\,a_{n-1}\lambda^{n-1} + \cdots + a_1\lambda}{a_n\lambda^n + a_{n-1}\lambda^{n-1} + \cdots + a_1\lambda + 1}, \qquad (3)$$

where the denominator is the grand partition function, which is a polynomial $\Phi(\lambda) = a_n\lambda^n + \cdots + 1$ of degree n in the case of a target molecule with n binding sites. Here, the coefficient $a_i$ is the sum of all Boltzmann factors g(k) (Eq. 2) which belong to the respective macrostate i. In the context of ligand binding, Eq. (3) is called the Adair equation (Adair et al. 1925). The overall titration curve Ψ is what can be measured easily in experiments, and we would like to deduce some characteristics of the family of measures described in Eq. (1) from Ψ.
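The construction above can be sketched numerically. The following is a minimal illustration, assuming the reference state has free energy zero (so that the constant coefficient is 1) and using hypothetical function names (`macro_coefficients`, `macro_distribution`, `titration`) that do not appear in the original text.

```python
from itertools import product
from math import exp

def macro_coefficients(free_energy, n, RT=1.0):
    """Sum the Boltzmann factors g(k) = exp(-G(k)/RT) over all microstates
    k in {0,1}^n that belong to the same macrostate |k| = i."""
    a = [0.0] * (n + 1)
    for k in product((0, 1), repeat=n):
        a[sum(k)] += exp(-free_energy(k) / RT)
    return a

def macro_distribution(a, lam):
    """Probabilities of the macrostates 0..n at ligand activity lam."""
    weights = [a_i * lam**i for i, a_i in enumerate(a)]
    phi = sum(weights)  # value of the binding polynomial at lam
    return [w / phi for w in weights]

def titration(a, lam):
    """Expected number of bound ligands (the Adair form of the isotherm)."""
    return sum(i * p for i, p in enumerate(macro_distribution(a, lam)))

# Hypothetical two-site molecule in which every microstate has G(k) = 0.
a = macro_coefficients(lambda k: 0.0, n=2)
print(a)                  # [1.0, 2.0, 1.0]
print(titration(a, 1.0))  # 1.0
```

The list `a` holds the coefficients in ascending order of the macrostate, so `a[i]` multiplies the i-th power of the activity.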
An example for a characteristic could be a change in "affinity" of the ligand to a certain site if another site changes its binding state. Note that even if Eq. (3) describes at first sight only the expectation of the distribution of |k| (macrostates) as a function of λ, it uniquely defines the whole family of distributions on the macrostates (see Martini 2014, p. 26). However, what is not uniquely determined is the original distribution on the microstates: an infinite number of families of distributions on the microstates produce the same overall binding curve. In the following, we will introduce and discuss several quantities related to "cooperativity", starting from definitions by properties of the distributions on the macrostates.

The A. V. Hill equation was developed as a phenomenological description of the overall binding of oxygen to hemoglobin (Hill 1910, 1913) and it is not based on any mechanistic principle of ligand binding. It is given by

$$\Psi(\lambda) = n \cdot \frac{K\lambda^{\alpha}}{K\lambda^{\alpha} + 1} \qquad (4)$$

with n denoting the maximal number of bound ligands, K the positive association constant and α an appropriate positive constant. Even though Eq. (4) is "only" a phenomenological description and consequently not necessarily covered by Eq. (3), we will investigate under which circumstances Eq. (4) fits the setup of Eq. (3), to build an intuition for how to interpret α. Firstly, we see that, e.g. for the case of α = n = 2 and K > 0, Eq. (4) fits the shape of Eq. (3) only if $a_1 = 0$. Strictly speaking, this is not possible, since the coefficients of the polynomial are sums of exponential terms and therefore positive. However, Eq. (4) can be interpreted in this example as a limit case with $a_1$ much smaller than 1. To include these limit cases, we will from now on also use binding polynomials with coefficients equal to zero. Including these polynomials, the question arises whether there are other options than $n = \alpha \in \mathbb{N}$ denoting the number of binding sites to obtain an equation of the shape of Eq. (4) from Eq. (3). An answer is given by Proposition 1 and Example 1 (see the proof section for the derivation of the results of Proposition 1).
Proposition 1 Let Eq. (4) with K > 0 be an overall binding curve of a molecule with n binding sites and $a_n \neq 0$ (all sites can be occupied simultaneously). Then the coefficients $a_{i\alpha}$ are nonzero for all $i \in \{1, \ldots, n/\alpha\}$.
Note that Proposition 1 does not state that the denominator of Eq. (4) is necessarily the binding polynomial, which is the case if α = n. Indeed, binding polynomials can be constructed whose corresponding overall titration curve satisfies Eq. (4) with α ≠ n, which is illustrated by Example 1.
Example 1 An example of a binding polynomial whose corresponding overall titration curve is a Hill equation with a denominator unequal to the binding polynomial is Its overall titration curve is given by Proposition 1 showed that within the framework of Eq. (3), the parameter α can be regarded as a measure which tells us how many ligand particles bind simultaneously as a "package" to the target molecule.
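The "package" interpretation can be checked numerically on a hypothetical instance. The sketch below assumes the Hill form $\Psi(\lambda) = nK\lambda^{\alpha}/(K\lambda^{\alpha} + 1)$ and uses the polynomial $(K\lambda^2 + 1)^2$ with an arbitrarily chosen K; this polynomial is an illustration chosen here and is not necessarily the polynomial of Example 1.

```python
def titration(a, lam):
    """Expected number of bound ligands for coefficients a[0..n]."""
    num = sum(i * a_i * lam**i for i, a_i in enumerate(a))
    den = sum(a_i * lam**i for i, a_i in enumerate(a))
    return num / den

def hill(lam, n, K, alpha):
    """Hill equation with n sites, association constant K, exponent alpha."""
    return n * K * lam**alpha / (1.0 + K * lam**alpha)

K = 3.0  # hypothetical association constant
# Coefficients a_0..a_4 of (K*lam^2 + 1)^2 = K^2*lam^4 + 2K*lam^2 + 1:
# four sites, but ligands only bind in "packages" of two (a_1 = a_3 = 0).
a = [1.0, 0.0, 2.0 * K, 0.0, K**2]

for lam in (0.1, 0.5, 1.0, 4.0):
    # The isotherm is a Hill curve with alpha = 2, although n = 4.
    print(lam, titration(a, lam), hill(lam, n=4, K=K, alpha=2))
```

Here the denominator of the Hill curve, $K\lambda^2 + 1$, differs from the binding polynomial, which is its square.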

The Hill coefficient
In Eq. (4), the parameter α is called the Hill coefficient, but it is also a useful tool (based on the more general Definition 1) if the binding curve of interest is not of the shape of Eq. (4). As illustrated in Proposition 1, the parameter α can be regarded as a measure which tells us how many ligand particles have to bind as a "package" to the target molecule. Even though, in the setup of the grand canonical ensemble, α has to be a natural number (positive integer), we can generalize the domain of α with different concepts. One possibility is to fit measured data to Eq. (4) with $\alpha \in \mathbb{R}^+$, thus "projecting" a curve of the shape of Eq. (3) onto the class of functions described by Eq. (4), and to interpret α in an analogous way, even if α is not a natural number. A value of e.g. 1.5 for α could be interpreted as packages of one and a half ligands which bind simultaneously. Consequently, a value of α > 1 in the best fitting function of the shape of Eq. (4) can be interpreted as an indicator for cooperative binding. A second possibility to transfer the Hill coefficient α to curves of the shape of Eq. (3) is to generalize the definition such that it coincides with α if we consider a curve of the shape of Eq. (4). This is done by Definition 1, which gives a more general definition of the Hill coefficient and which can be motivated by the following idea: We start with the concept that ("positive") cooperativity between different sites means that the binding of the ligand to one site increases the "affinity" of the ligand to the other sites. This view leads to the intuition that cooperativity should lead to a steep slope of the titration curve, since the binding enhances the binding to the other sites (similar to a positive feedback loop). Having the example of hemoglobin in mind, this means that only a "small" change in oxygen activity is required for the transition from a distribution with mainly completely empty molecules to a distribution with mainly fully occupied molecules.
Consequently, "positive" cooperativity should be related to the steepness of the slope of an overall titration curve.
Definition 1 The Hill coefficient η of an overall titration curve Ψ is defined as the slope of

$$H_\Psi(\lambda) := \log\left(\frac{\Psi(\lambda)}{n - \Psi(\lambda)}\right)$$

as a function of the natural logarithm of the ligand activity log(λ), at the activity at which $\Psi = n/2$ (Hill 1985, 1910, 1913).
Note that the Hill coefficient of Definition 1 coincides with α of Eq. (4) if this equation describes the overall titration curve (see the proof section). Moreover, note that the transformation $\Psi \mapsto \log(\Psi/(n - \Psi))$ of the overall titration curve is strictly monotonic on (0, n), which means that we can express the information that we find in the Hill plots $H_\Psi$ also directly through Ψ.
We have summarized up to now that the Hill coefficient is motivated by the phenomenological Eq. (4) and that the usual way to generalize it to curves of the shape of Eq. (3) is the use of Definition 1, which coincides with α of Eq. (4) if the function is of this shape. Since the Hill coefficient is usually applied to curves of the shape of Eq. (3), the question arises which characteristics of the family of measures defined by Eq. (1) are really measured by the Hill coefficient. The answer to this question highlights a certain phenomenon of the grand partition function: Within this setup, characteristics of certain curves as a function of λ are related to characteristics of certain probability measures on {0, . . . , n} when λ is fixed. In particular, the Hill coefficient describes the variance of the distribution on the macrostates when $\Psi = n/2$.
Proposition 2 The Hill coefficient η of Definition 1 satisfies

$$\eta = \frac{4}{n}\, V(\lambda)\Big|_{\Psi = n/2}$$

with V(λ) denoting the variance of the distribution on the set of macrostates as a function of λ and $V(\lambda)|_{\Psi = n/2}$ its value at $\Psi = n/2$.
Proposition 2 states that the Hill coefficient, which measures the steepness of the slope of the expectation as a function of log(λ) at $\Psi = n/2$, also gives (up to the factor 4/n) the variance of the distribution on the macrostates at $\Psi = n/2$. This fact fits well with the idea of ("positive") cooperativity increasing the "affinity" of the ligand to the other sites when a certain site is occupied, since this circumstance will lead to a distribution on the macrostates with more weight towards the extreme occupational states 0 and n than if the ligands bind independently, and thus will increase the variance.
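The relation between slope and variance can be checked numerically. The sketch below assumes the Hill plot $\log(\Psi/(n-\Psi))$ and the relation $\eta = (4/n)\,V$ at half-saturation; the helper names and the bisection routine are illustrative choices, and the test polynomial $4\lambda^3 + 6\lambda^2 + 3\lambda + 1$ is used only as a convenient asymmetric instance.

```python
from math import log

def moments(a, lam):
    """Mean and variance of the number of bound ligands at activity lam."""
    w = [a_i * lam**i for i, a_i in enumerate(a)]
    phi = sum(w)
    mean = sum(i * w_i for i, w_i in enumerate(w)) / phi
    second = sum(i * i * w_i for i, w_i in enumerate(w)) / phi
    return mean, second - mean**2

def half_saturation(a, lo=1e-9, hi=1e9, steps=200):
    """Bisection on a log scale; the titration curve increases with lam."""
    n = len(a) - 1
    for _ in range(steps):
        mid = (lo * hi) ** 0.5
        if moments(a, mid)[0] < n / 2:
            lo = mid
        else:
            hi = mid
    return (lo * hi) ** 0.5

def hill_plot(a, lam):
    n = len(a) - 1
    psi = moments(a, lam)[0]
    return log(psi / (n - psi))

a = [1.0, 3.0, 6.0, 4.0]  # coefficients a_0..a_3
n = len(a) - 1
lam_half = half_saturation(a)

# Central difference of the Hill plot with respect to log(lam).
h = 1e-5
slope = (hill_plot(a, lam_half * (1 + h)) - hill_plot(a, lam_half * (1 - h))) / (
    log(1 + h) - log(1 - h)
)
eta = 4.0 / n * moments(a, lam_half)[1]
print(lam_half, slope, eta)
```

Both routes should agree up to the discretization error of the finite difference.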
However, the question arises why the point $\Psi = n/2$ is considered. The answer is that, under a certain symmetry condition, the slope and consequently the variance of the distribution on the macrostates has a local extremum at this point, and that abnormally high values of the variance of the distribution on the macrostates are the quantities of interest. Before presenting the symmetry condition according to Hill (1985), we briefly describe why high values of the variance are of interest at all: they are important characteristics of the macroscopic system since the variances of independent systems define bounds. If a bound is exceeded, the observation cannot result from an independent system. An "abnormally high" variance is thus an indicator of relevant interaction between the sites. To understand the threshold which is recognized by the Hill coefficient, let us consider an independent system. If the individual binding sites do not interact (in the sense that the interaction energy is zero), the sites will bind the ligand stochastically independently for every value of λ (Martini et al. 2013b). Thus, the variance of the sum (the macrostates) equals the sum of the variances of the Bernoulli variables $X_i$ describing whether site i is occupied or not. The variance of a Bernoulli variable is bounded by 0.25, which is reached if $P(X_i = 1) = P(X_i = 0) = 0.5$. Thus, a variance larger than 0.25n (which at $\Psi = n/2$ corresponds to a Hill coefficient of 1) indicates that the observed behavior cannot result from a system of stochastically independent binding sites [we will show later that this reference point of half-saturation, which is used in Definition 1 and which can often be found in the literature (e.g. Hill 1985; Onufriev and Ullmann 2004; Ge and Qian 2009; Hunter and Anderson 2009), is not in general meaningful].
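The bound of 0.25n for independent sites can be illustrated with a short sketch. The single-site Boltzmann factors below are hypothetical, and the occupation probability $g\lambda/(1 + g\lambda)$ of an isolated site is the one-site case of the Adair equation.

```python
def independent_variance(site_constants, lam):
    """Variance of the number of bound ligands for non-interacting sites:
    the sum of the Bernoulli variances of the individual sites."""
    total = 0.0
    for g in site_constants:
        p = g * lam / (1.0 + g * lam)  # occupation probability of one site
        total += p * (1.0 - p)         # Bernoulli variance, at most 0.25
    return total

sites = [0.1, 1.0, 25.0]  # hypothetical single-site Boltzmann factors
grid = [10 ** (e / 50.0) for e in range(-300, 301)]  # activities 1e-6 .. 1e6
largest = max(independent_variance(sites, lam) for lam in grid)
print(largest, 0.25 * len(sites))
```

Because the sites differ, their individual variances peak at different activities and the total stays strictly below 0.25n on the whole grid.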
Hill (1985) gives the following symmetry condition: a $\lambda_0$ exists such that

$$\Psi(\mu\lambda_0) + \Psi(\lambda_0/\mu) = n \quad \text{for all } \mu > 0 \qquad (9)$$

and states that the "definition of the Hill coefficient [. . .] can be used for all systems with the above symmetry" (Hill 1985, p. 83). Since we did not find a proof for this statement in the literature, we will prove Proposition 3 (proofs can be found in Sect. 6).
In the following, we will use the notation

$$f'(\lambda) := \frac{\partial f(\lambda)}{\partial \log(\lambda)} = \lambda\,\frac{\partial f(\lambda)}{\partial \lambda}$$

for the derivative of a function with respect to the natural logarithm of the activity of the ligand. Moreover, note that

$$\Psi'(\lambda) = V(\lambda),$$

i.e. the derivative of the titration curve with respect to log λ is the variance of the distribution on the macrostates |k| ∈ {0, . . . , n} (calculate the derivative or see e.g. Hill 1985).
Proposition 3 Let Ψ be an overall titration curve satisfying Eq. (9). Then $\Psi'$ and thus the variance of the distribution on the macrostates |k| has a local extremum at $\lambda_0$.
Hill (1985) also rewrote the condition of Eq. (9) to the equivalent statement that the coefficients of the binding polynomial $\Phi(\lambda) = a_n\lambda^n + \cdots + a_1\lambda + 1$ fulfill

$$a_r \lambda_0^{r} = a_{n-r} \lambda_0^{n-r} \quad \text{for all } r \in \{0, \ldots, n\} \qquad (11)$$

(with $a_0 = 1$), which means for the distribution at $\lambda_0$ that

$$P_{\lambda_0}(|k| = r) = P_{\lambda_0}(|k| = n - r).$$

In particular, this equation illustrates that if a $\lambda_0$ exists which satisfies Eq. (9) [or equivalently Eq. (11)], this implies that $\lambda_0$ also satisfies $\Psi(\lambda_0) = n/2$. Note that this kind of symmetry does not necessarily mean that the binding sites are identical, i.e. that they have the same energy levels. We will illustrate this point in Example 2.
Example 2 (Homooligomers and symmetry) Let M be a molecule with three binding sites. Moreover, let $g_i$ be the Boltzmann factor of a microstate of macrostate i. Then the binding polynomial is given by

$$\Phi(\lambda) = g_3\lambda^3 + 3g_2\lambda^2 + 3g_1\lambda + 1.$$

This special choice of the coefficients means that all three sites are equal in terms of the free energy of binding or releasing a ligand from the site: the Boltzmann factor of a microstate only depends on the corresponding macrostate (this could be a valid assumption for homooligomers). In spite of the binding polynomial belonging to a symmetric system (in terms of being composed of energetically identical sites), it does not necessarily fulfill the symmetry condition of Eq. (9), which is illustrated by $g_3 = 4$, $g_2 = 2$, $g_1 = 1$ and $\lambda_0 = 0.603$ as the activity of half-saturation. Moreover, any arbitrarily chosen binding polynomial with coefficients $a_i$ can result from a molecule with identical sites with $g_i = a_i \binom{n}{i}^{-1}$, which illustrates that the discussed symmetry condition is not related to a chemical symmetry of the target molecule. Conversely, a binding polynomial fulfilling Eq. (9) does not necessarily come from a molecule in which all sites are identical. Eq. (9) describes a property of the distributions on the macrostates and not on the microstates. However, note here that the relation $g_i = a_i \binom{n}{i}^{-1}$ allows us to deduce the free energies of the microstates from the binding polynomial (if we know that we are dealing with a chemically symmetric system).

Fig. 1 Left side: the mean of the number of occupied sites as a function of the ligand activity (the "overall titration curve" or "isotherm", a decadic logarithmic scale is used) of the polynomial of Example 3. Right side: the variance as a function of the ligand activity (at constant temperature) of the system described by the polynomial of Example 3
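The coefficient form of the symmetry condition, $a_r\lambda_0^r = a_{n-r}\lambda_0^{n-r}$, is easy to test numerically. The following is a minimal sketch for the numbers of Example 2, assuming coefficients $a_i = \binom{3}{i} g_i$ for identical sites; the function name `symmetric_at` and the tolerance are illustrative.

```python
from math import comb

def symmetric_at(a, lam0, tol=1e-9):
    """Check a_r * lam0^r == a_(n-r) * lam0^(n-r) for all r = 0..n."""
    n = len(a) - 1
    return all(
        abs(a[r] * lam0**r - a[n - r] * lam0 ** (n - r)) <= tol
        for r in range(n + 1)
    )

g = [1.0, 1.0, 2.0, 4.0]  # g_0..g_3 of Example 2 (g_0 = 1 for the empty state)
a = [comb(3, i) * g_i for i, g_i in enumerate(g)]  # [1.0, 3.0, 6.0, 4.0]

# The case r = 0 forces a_n * lam0^n = 1, so only one candidate lam0 exists.
lam0 = (1.0 / a[-1]) ** (1.0 / 3)
print(symmetric_at(a, lam0))                    # False
print(symmetric_at([1.0, 3.0, 3.0, 1.0], 1.0))  # True
```

The second call uses three identical, non-interacting sites with unit Boltzmann factors, for which the condition holds at activity 1.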
We have seen that the Hill coefficient measures the variance of a certain distribution and that the variance is extreme at the ligand activity of half-saturation if condition Eq. (9) is satisfied. An important observation is that even if Eq. (9) is fulfilled, the Hill coefficient is only a locally extreme variance, but not necessarily a locally maximal variance which is illustrated by Example 3.
We have demonstrated up to now that, from a theoretical point of view, the use of the reference point $\Psi = n/2$ is not fully convincing. Firstly, Example 3 showed that even if Eq. (9) is fulfilled by the system, we can measure a locally minimal variance. Secondly, if Eq. (9) is not fulfilled, the reference point $\Psi = n/2$ does not in general give any special information. In particular, the variance at this point being smaller than 0.25n does not necessarily mean that there is no other point at which the variance is "abnormally high". These observations motivate the following generalizations of the Hill coefficient.

An extension of the Hill coefficient definition: the maximal variance
An idea for the extension of the Hill coefficient to systems which do not satisfy the symmetry condition of Eq. (9) can be found in the literature (Hill 1985; Onufriev and Ullmann 2004). The concept is to substitute the point $\Psi = n/2$ at which the variance (or the slope of the titration curve) is measured by the point at which the maximal variance is reached (see Hill 1985, p. 73). This idea is a direct consequence of the reference point $\Psi = n/2$ not necessarily offering special information and of the initial question whether the threshold of 0.25n for the variance is exceeded. The variance of a system of independent binding sites will always be smaller than or equal to this bound, and thus exceeding this threshold anywhere is an indicator for non-negligible interaction. Moreover, if the Hill coefficient is greater than 1 (which means that the variance at $\Psi = n/2$ exceeds 0.25n), the maximal variance exceeds this threshold as well. Thus, the criterion of the maximal variance exceeding this threshold is a generalization of the Hill coefficient.
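The maximal-variance criterion can be approximated by scanning the variance over a logarithmic grid of activities. This is only a sketch: the grid range and resolution are arbitrary choices, and the second polynomial is a hypothetical system in which the intermediate macrostates are strongly disfavored.

```python
def variance(a, lam):
    """Variance of the number of bound ligands for coefficients a[0..n]."""
    w = [a_i * lam**i for i, a_i in enumerate(a)]
    phi = sum(w)
    mean = sum(i * w_i for i, w_i in enumerate(w)) / phi
    second = sum(i * i * w_i for i, w_i in enumerate(w)) / phi
    return second - mean**2

def max_variance(a, decades=6, per_decade=100):
    """Largest variance on a log grid from 10^-decades to 10^decades."""
    steps = decades * per_decade
    grid = [10 ** (e / per_decade) for e in range(-steps, steps + 1)]
    return max(variance(a, lam) for lam in grid)

independent = [1.0, 3.0, 3.0, 1.0]  # (1 + lam)^3: three identical free sites
cooperative = [1.0, 0.1, 0.1, 1.0]  # hypothetical: states 1 and 2 disfavored

n = 3
print(max_variance(independent), 0.25 * n)  # reaches the bound 0.75
print(max_variance(cooperative), 0.25 * n)  # exceeds the bound
```

A grid search can miss a narrow peak, so a value below the threshold on the grid is only suggestive, whereas a value above it is conclusive.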

A further extension of the Hill coefficient: different binomial distributions as reference for the variance
Instead of using the variance of the binomial distribution Bin(n, p = 0.5) as a constant reference, which defines the upper bound of 0.25n for the variance of a system of independent variables, we can also compare the variance of the considered system point-wise to the variance of a binomial distribution $\mathrm{Bin}(n, p = \Psi/n)$ (Abeliovich 2005):
Proposition 4 (Binomial variance as reference) Let us consider a target molecule with n binding sites. Moreover, let a $\lambda_0$ exist such that

$$V(\lambda_0) > n \cdot \frac{\Psi(\lambda_0)}{n}\left(1 - \frac{\Psi(\lambda_0)}{n}\right). \qquad (12)$$

Then the observed overall titration curve cannot result from a system of n independent binding sites.
The derivation of this result which we found in the literature was based on the additional assumption of identical binding sites. Note that this assumption is not made here (for a proof see Sect. 6). Proposition 4 further generalizes the criterion of the maximal variance exceeding 0.25n: if the maximal variance exceeds 0.25n, the variance also exceeds the right-hand side of Eq. (12), since the right-hand side is bounded by 0.25n. Moreover, this criterion gives much sharper bounds for the variance that an independent system can exhibit and extends an elegant generalization procedure: The Hill coefficient of Definition 1 compares a variance at a specific point with a reference variance. The maximum variance criterion uses the same fixed threshold of 0.25n but compares all variances of the system to this fixed boundary. Finally, the criterion given by Eq. (12) defines saturation-dependent bounds and checks whether this boundary function is exceeded at any point.
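A point-wise comparison with the binomial reference variance can be sketched as follows, assuming the bound takes the form $\Psi(1 - \Psi/n)$, i.e. the variance of $\mathrm{Bin}(n, \Psi/n)$. The site constants, the second polynomial and the small numerical tolerance are hypothetical choices made for this illustration.

```python
def moments(a, lam):
    """Mean and variance of the number of bound ligands at activity lam."""
    w = [a_i * lam**i for i, a_i in enumerate(a)]
    phi = sum(w)
    mean = sum(i * w_i for i, w_i in enumerate(w)) / phi
    second = sum(i * i * w_i for i, w_i in enumerate(w)) / phi
    return mean, second - mean**2

def exceeds_binomial_bound(a, lam, tol=1e-12):
    """True if the variance exceeds that of Bin(n, Psi/n) at activity lam."""
    n = len(a) - 1
    psi, var = moments(a, lam)
    return var > psi * (1.0 - psi / n) + tol

def poly_mul(p, q):
    """Multiply two polynomials given as ascending coefficient lists."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return out

# Independent, non-identical sites: the polynomial is a product of (1 + g*lam).
independent = [1.0]
for g in (0.1, 1.0, 25.0):
    independent = poly_mul(independent, [1.0, g])

grid = [10 ** (e / 50.0) for e in range(-200, 201)]  # activities 1e-4 .. 1e4
print(any(exceeds_binomial_bound(independent, lam) for lam in grid))  # False
print(exceeds_binomial_bound([1.0, 0.1, 0.1, 1.0], 1.0))              # True
```

The independent system never crosses the bound even though its sites are not identical, in line with the remark that identical sites are not required.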
However, it is important to note here that both generalizations of the Hill coefficient are based on the idea of comparing the variance at fixed λ with the variance of a reference system. The following criterion is based on an alternative concept of investigating the change of the variance, when λ changes.

A sigmoidal shape in a linear plot
Even though Stefan and Le Novère (2013) use a definition of positive cooperativity based on microstates (a change in affinity of the ligand to a certain site if the ligand binds to another site), they state that if the normed titration curve "as a function of ligand concentration is sigmoidal in shape, as observed by Bohr for hemoglobin, this indicates positive cooperativity. If it is not, no statement can be made about cooperativity from looking at this plot alone". We will investigate how this property of a sigmoidal binding curve is related to large variances which were the characteristics of interest in Sect. 3.1. We start with a definition of a sigmoidal shape.
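Definition 2 An overall titration curve Ψ is said to be of sigmoidal shape in a linear plot if a $\lambda_0 > 0$ exists such that

$$\frac{\partial^2 \Psi}{\partial \lambda^2}(\lambda_0) > 0.$$

Since $\partial\Psi/\partial\lambda = \lambda^{-1}V(\lambda)$, this condition can be rewritten to

$$\frac{\partial}{\partial \lambda}\left(\lambda^{-1}V(\lambda)\right)\Big|_{\lambda = \lambda_0} > 0.$$

Proposition 5 states why a sigmoidal shape in a linear plot can be regarded as an indicator of cooperative binding.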
Proposition 5 Let Ψ (λ) be an overall titration curve of sigmoidal shape. Then Ψ (λ) cannot result from a system of independent binding sites.
Proposition 5 gives another criterion for an "abnormal" ligand binding behavior of the system by comparing the change in variance to a linear function. The question arises how the indications of cooperativity by the different criteria are related. The answer is that there is no implication which is illustrated by the following examples (first we will compare the criterion of maximal variance exceeding 0.25n to sigmoidal shape and then add an example comparing the latter to criterion (12)).
Then the maximal variance of the system is smaller than 0.75 (for a formal proof see Sect. 6), which is the maximal value an independent system of three binding sites could reach. However, the corresponding function $\lambda^{-1}V(\lambda)$ has a local maximum (criterion (12) is also satisfied). Both curves are illustrated in Fig. 2 (for earlier work on differences between the Hill coefficient and a sigmoidal shape, see also Ge and Qian 2009).

Figure: Left side, the variance of the number of occupied sites of the system of Example 5 as a function of the ligand activity (constant temperature, a decadic logarithmic scale is used). Right side, the variance divided by λ of the same binding system
Example 5 Let us consider the binding polynomial $4\lambda^3 + 2\lambda^2 + 4\lambda + 1$. Then the maximal variance of the system is approximately 1.06, which is greater than 0.75, the maximal value an independent system could reach. However, the corresponding function $\lambda^{-1}V(\lambda)$ does not have a local maximum (for a formal proof see Sect. 6). Since the maximal variance exceeds the threshold 0.25n here, criterion (12) is fulfilled, too. Both curves are illustrated in Fig. 3.
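The two statements of Example 5 can be inspected numerically. The sketch below scans a logarithmic grid (an arbitrary choice) and uses the fact that the slope of the titration curve in a linear plot equals the variance divided by the activity; a grid scan is of course no substitute for the formal proof.

```python
def variance(a, lam):
    """Variance of the number of bound ligands for coefficients a[0..n]."""
    w = [a_i * lam**i for i, a_i in enumerate(a)]
    phi = sum(w)
    mean = sum(i * w_i for i, w_i in enumerate(w)) / phi
    second = sum(i * i * w_i for i, w_i in enumerate(w)) / phi
    return second - mean**2

a = [1.0, 4.0, 2.0, 4.0]  # 4*lam^3 + 2*lam^2 + 4*lam + 1, ascending order
grid = [10 ** (e / 100.0) for e in range(-400, 401)]  # activities 1e-4 .. 1e4

# Maximal variance on the grid versus the independent bound 0.25 * 3 = 0.75.
v_max = max(variance(a, lam) for lam in grid)
print(v_max)

# Slope of the titration curve in a linear plot: dPsi/dlam = V(lam) / lam.
slopes = [variance(a, lam) / lam for lam in grid]
strictly_decreasing = all(x > y for x, y in zip(slopes, slopes[1:]))
print(strictly_decreasing)  # no interior local maximum on this grid
```

A strictly decreasing slope means the curve is concave on the scanned range, i.e. not sigmoidal, although the variance clearly exceeds 0.75.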
Example 5 shows that there are cases in which Eq. (12) is fulfilled but the curve is not sigmoidal. What remains to be shown is that the reverse case, which means having sigmoidal shape but not fulfilling Eq. (12) exists as well. We did not find an example for this in the case of three binding sites. However, Example 6 describes a molecule with five binding sites and these properties.

Non-real roots of the binding polynomial as a definition of cooperative ligand binding
The use of the existence of non-real roots of the binding polynomial as a definition of cooperative binding (Onufriev and Ullmann 2004; Martini 2014) has the big advantage that it unifies and generalizes both previously mentioned concepts: A system with non-interacting binding sites, which are stochastically independent for every value of λ, possesses a binding polynomial with real roots only (see e.g. Martini and Ullmann 2013). Consequently, the appearance of non-real roots indicates that the polynomial cannot belong to a system of independent binding sites. Conversely, if a binding polynomial has only real roots, then none of the previously described criteria can be fulfilled, since it is not possible to distinguish the system from an independent system in which the binding energies of the binding sites are transformed roots of the polynomial (see e.g. Martini and Ullmann 2013; Martini et al. 2013b). Since both previously described types of criteria measure an abnormal deviation from the behavior of an independent system, both criteria imply that the binding polynomial has at least two roots which are non-real. Moreover, there are cases in which the system exhibits "abnormal" behavior which cannot result from an independent system, since the binding polynomial has complex roots, but in which neither of the previously discussed criteria is fulfilled, which is illustrated by Example 7. Still, the cooperativity of this system is captured by the property of the polynomial possessing non-real roots.
Example 7 An example of a polynomial which shows a titration behavior which cannot result from an independent system, but whose variance does not exceed the boundary given by Eq. (12) and whose corresponding titration curve is not of sigmoidal shape, is $100\lambda^3 + 191\lambda^2 + 101.9\lambda + 1$.
Two of its roots are non-real. The fact that the two criteria of a maximal variance exceeding the threshold of 0.75 and of being of sigmoidal shape are both not fulfilled is illustrated in Fig. 4. A theoretical treatment of $\lambda^{-1}V(\lambda)$ not having a local maximum and the bound (12) not being exceeded is given in Sect. 6. However, compared to the criterion of measuring the maximal variance, the existence of non-real roots as a definition for cooperativity has the disadvantage that it is not directly obvious whether and how the cooperative behavior can be quantified. For a system with three binding sites, the number of non-real roots will either be zero or two and a quantification of the cooperative effects in the system might only
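For three binding sites, the presence of non-real roots can be decided without computing the roots, because a real cubic has exactly two non-real roots if and only if its discriminant is negative. The sketch below applies this standard formula to the polynomial of Example 7; the comparison polynomial with three distinct real roots is a hypothetical independent system.

```python
def cubic_discriminant(a3, a2, a1, a0):
    """Discriminant of a3*x^3 + a2*x^2 + a1*x + a0."""
    return (
        18.0 * a3 * a2 * a1 * a0
        - 4.0 * a2**3 * a0
        + a2**2 * a1**2
        - 4.0 * a3 * a1**3
        - 27.0 * a3**2 * a0**2
    )

def has_non_real_roots(a3, a2, a1, a0):
    """A real cubic has two non-real roots iff its discriminant is negative."""
    return cubic_discriminant(a3, a2, a1, a0) < 0.0

# Example 7: 100*lam^3 + 191*lam^2 + 101.9*lam + 1
print(has_non_real_roots(100.0, 191.0, 101.9, 1.0))  # True

# Independent sites (1 + lam)(1 + 2*lam)(1 + 4*lam) = 8*lam^3 + 14*lam^2 + 7*lam + 1
print(has_non_real_roots(8.0, 14.0, 7.0, 1.0))       # False
```

For more than three sites no such closed-form shortcut is used here; a numerical root finder would be the natural tool.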

Other measures of cooperativity based on the macroscopic behavior
Other additional approaches of capturing cooperative behavior as a real number exist. However, these concepts will have consistency problems if no additional conditions, typically such as a restriction to two binding sites [in this case several implications between the different types of cooperativity exist (Onufriev and Ullmann 2004;Martini 2014)] or the assumption of identical binding sites are imposed.
An example of another coefficient which captures cooperative effects is the α coefficient used in the review by Hunter and Anderson (2009). Assuming that the concentration of the free ligand is constant (this assumption is more or less fulfilled if the number of ligands is much higher than that of the target molecule, and it is the basic assumption for the validity of the grand canonical ensemble), the coefficient α of a system with binding polynomial $a_2\lambda^2 + a_1\lambda + 1$ is given by

$$\alpha = \frac{4a_2}{a_1^2}.$$

This expression represents more or less the discriminant of the polynomial. Consequently, the polynomial will have non-real roots if α > 1, and thus the criterion α > 1 coincides with the appearance of non-real roots (n = 2). An open question is how best to define α in the case of more than two binding sites (as an $\alpha_i$ for each binding step or as an overall α) and how to decide then whether the target molecule binds its ligand cooperatively or not.
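The link between this coefficient and the discriminant can be sketched for two binding sites. The code assumes the form $\alpha = 4a_2/a_1^2$, which is inferred here from the remark that the expression corresponds to the discriminant of the quadratic; the function names and test coefficients are hypothetical.

```python
def alpha_coefficient(a1, a2):
    """Assumed two-site coefficient alpha = 4*a2 / a1^2."""
    return 4.0 * a2 / a1**2

def has_non_real_roots(a1, a2):
    """Roots of a2*lam^2 + a1*lam + 1 are non-real iff a1^2 - 4*a2 < 0."""
    return a1**2 - 4.0 * a2 < 0.0

# (a1, a2) pairs: independent identical sites, a cooperative case,
# and independent sites with different affinities.
for a1, a2 in [(2.0, 1.0), (1.0, 1.0), (11.0, 10.0)]:
    alpha = alpha_coefficient(a1, a2)
    print(a1, a2, alpha, alpha > 1.0, has_non_real_roots(a1, a2))
```

Under the stated assumption the last two columns agree for every choice of positive coefficients, since both conditions are rearrangements of the same inequality.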

Cooperative binding as a property of the family of distributions on the microstates

Cooperative binding defined by interaction energies of binding sites
We consider a molecule with n = 2 binding sites and the free energy differences of each state k with respect to the chosen reference state (0, 0) (all sites unoccupied). As mentioned previously, G(k) denotes the difference between the free energies of the molecule in state k and the reference state. For a molecule with only two binding sites, the (relative) energies of the four states are G((0, 0)) = 0, G((1, 0)), G((0, 1)) and G((1, 1)). We call G((1, 0)) the binding energy of site 1, G((0, 1)) the binding energy of site 2 and $W_{1,2} := G((1, 1)) - G((1, 0)) - G((0, 1))$ the interaction energy. If the number of binding sites n is greater than 2, we can also obtain interaction terms of higher order, which will be illustrated later. For the moment, we consider a molecule with only pairwise interaction, which means that for any number of binding sites, the energies of every state are determined by the energies of states with one and two sites occupied. We can use the following definition of cooperative binding based on microstate properties.

Definition 3 Let M be a molecule with n binding sites and only pairwise interaction, and let $w_{i,j} := \exp(-W_{i,j}/(RT))$ denote the Boltzmann factor of the interaction energy $W_{i,j}$ of sites i and j. Then sites i and j are said to bind the ligand
- positive cooperatively if $w_{i,j} > 1$ (which corresponds to $W_{i,j} < 0$)
- non-cooperatively if $w_{i,j} = 1$ (which corresponds to $W_{i,j} = 0$)
- negative cooperatively if $w_{i,j} < 1$ (which corresponds to $W_{i,j} > 0$).
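The classification of Definition 3 can be written down directly for two sites. The sketch assumes that the lower-case factor is the Boltzmann factor $\exp(-W/(RT))$ of the interaction energy, measures energies in units of RT, and uses hypothetical state energies; comparing a floating-point value with exactly 1 is only meaningful for such hand-picked inputs.

```python
from math import exp

def interaction_energy(G10, G01, G11):
    """W_{1,2} = G((1,1)) - G((1,0)) - G((0,1)) for a two-site molecule."""
    return G11 - G10 - G01

def classify(W, RT=1.0):
    """Classify a pair of sites by the factor w = exp(-W/RT)."""
    w = exp(-W / RT)
    if w > 1.0:
        return "positive"
    if w < 1.0:
        return "negative"
    return "non-cooperative"

# Hypothetical free energies (in units of RT) of the states (1,0), (0,1), (1,1).
print(classify(interaction_energy(-1.0, -1.0, -3.0)))  # positive (W = -1)
print(classify(interaction_energy(-1.0, -2.0, -3.0)))  # non-cooperative (W = 0)
print(classify(interaction_energy(-1.0, -1.0, -1.5)))  # negative (W = 0.5)
```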
At first sight, Definition 3 seems to be an appropriate way to define cooperativity. Advantages are that the definition is simple and unambiguous and that it captures the concept of an altered affinity of a ligand to a certain site if another site is occupied which often is the motivating idea (Berg et al. 2007;Stefan and Le Novère 2013): The binding of a ligand to the first binding site of a molecule with two binding sites leads to a change of the energy level G(k 1 ) if the other site is unoccupied but the molecule gains G(k 1 ) + W 1,2 if the second site is already occupied.
As a first observation, we see that there are molecules which exhibit different types of cooperativity according to Definition 3 but which are not distinguishable on the macroscopic level (Example 8 is taken from Martini and Ullmann 2013):

Example 8 The following molecules $(g_1, g_2, w_{1,2})$ are examples of positive, negative, or non-cooperative ligand binding according to Definition 3, but share the same overall titration curve and thus all macroscopic properties.
Since the roots of the binding polynomial are real, the overall titration curve is the sum of two standard Henderson-Hasselbalch (Henderson 1913; Hasselbalch 1916) curves describing the independent binding curves of the sites of the second molecule (Onufriev et al. 2001; Martini and Ullmann 2013; Martini et al. 2013b). Moreover, since the sites are energetically not identical, the points of maximal variance of the two sites do not coincide, which means that the Hill coefficient of this system is smaller than 1. This leads to a classification as undecidable or negative cooperative binding if the underlying assumption that the sites are equal is made. The curve is not sigmoidal [undecidable (Stefan and Le Novère 2013)], the roots are real (non-cooperative binding), the coefficient α (Hunter and Anderson 2009) is smaller than 1 (analogous to the Hill coefficient) and Definition 3 classifies the three systems as negative, non- and positive cooperative molecules. Example 8 illustrates that it is not in general possible to distinguish between the different types of cooperativity of Definition 3 based on the macroscopic binding isotherm. However, for two binding sites, we still have the implication that non-real roots of the binding polynomial imply a negative interaction energy (positive cooperative binding according to Definition 3; Martini and Ullmann 2013, Corollary 2). For more than four binding sites, it is not possible to deduce positive cooperativity as defined in Definition 3 from the existence of non-real roots. This point can be illustrated by a molecule with five binding sites (Example 3 of Martini and Ullmann 2013) showing negative cooperativity between all binding sites (according to Definition 3) and two non-real roots.
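The indistinguishability described for Example 8 can be checked numerically. The following minimal sketch assumes that a two-site molecule $(g_1, g_2, w_{1,2})$ has the binding polynomial $\Phi(\lambda) = 1 + (g_1 + g_2)\lambda + g_1 g_2 w_{1,2} \lambda^2$ and that the overall titration curve is $\lambda \, (d\Phi/d\lambda) / \Phi$. The three parameter triples are hypothetical illustrative values, not the ones of Example 8, and the function names are chosen for this sketch only.

```python
import math


def binding_polynomial(g1, g2, w12):
    """Coefficients (a0, a1, a2) of Phi(lam) = a0 + a1*lam + a2*lam**2."""
    return (1.0, g1 + g2, g1 * g2 * w12)


def titration_curve(coeffs, lam):
    """Mean number of bound ligands, lam * dPhi/dlam / Phi."""
    a0, a1, a2 = coeffs
    phi = a0 + a1 * lam + a2 * lam ** 2
    return (a1 * lam + 2.0 * a2 * lam ** 2) / phi


def classify(w12):
    """Classification in the sense of Definition 3 (w > 1, = 1, < 1)."""
    if math.isclose(w12, 1.0):
        return "non-cooperative"
    return "positive" if w12 > 1.0 else "negative"


# Hypothetical molecules (g1, g2, w12), all tuned so that
# g1 + g2 = 101 and g1 * g2 * w12 = 100.
molecules = {
    "A": (1.0, 100.0, 1.0),
    "B": (0.5, 100.5, 100.0 / (0.5 * 100.5)),
    "C": (50.5, 50.5, 100.0 / (50.5 * 50.5)),
}

for name, (g1, g2, w12) in molecules.items():
    coeffs = binding_polynomial(g1, g2, w12)
    # Same polynomial, hence same titration curve, but different labels.
    print(name, classify(w12), titration_curve(coeffs, 0.1))
```

Under these assumptions, the three molecules receive three different labels from Definition 3 although their binding polynomials, and therefore their titration curves, agree up to floating-point rounding.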
Another problem is that in larger systems (n > 2), negative cooperativity in terms of interaction energy between different sites can also lead to microscopic effects which are usually expected for "positive" cooperative relations (depending on the definition of "affinity"): Example 9 illustrates that an occupation of a certain site can increase the probability of another site being occupied in a molecule which is purely negative cooperative according to Definition 3.
However, if we compare the marginal probability of site 2 being occupied with its conditional probabilities given that site 3 is unoccupied or occupied (Fig. 5), we see that the probability of site 2 being occupied can be increased if site 3 is occupied. For molecule $M_2$, an occupation of site 3 does not decrease the probability of site 2 being occupied for any activity of the ligand. This effect arises because the influence of an occupied site 1 on the binding to site 2 is more negative than that of site 3, and an occupation of site 3 reduces the probability of site 1 being occupied.
Example 9 shows that in a system which is fully negative cooperative according to Definition 3, the binding of a ligand to a certain site can increase the probability of another site being occupied. This is caused by the principle that "the enemy of my enemy is my friend" and this principle can also lead to a macroscopic behavior which will be classified as positive cooperativity, which is illustrated by Example 10.
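The "enemy of my enemy" effect can be reproduced with a small enumeration of microstates. The sketch below assumes a three-site molecule with only pairwise interaction, in which the weight of a microstate is the product of $g_i \lambda$ over the occupied sites and of $w_{i,j}$ over the occupied pairs. The parameter values are hypothetical and are not those of $M_1$ or $M_2$ of Example 9; sites are indexed from 0 in the code, so index 1 is site 2 and index 2 is site 3.

```python
from itertools import product


def state_weight(k, g, w, lam):
    """Unnormalised weight of microstate k with pairwise interactions only."""
    weight = 1.0
    for i, occupied in enumerate(k):
        if occupied:
            weight *= g[i] * lam
    for (i, j), w_ij in w.items():
        if k[i] and k[j]:
            weight *= w_ij
    return weight


def conditional_occupation(i, j, j_state, g, w, lam):
    """P(k_i = 1 | k_j = j_state) at ligand activity lam."""
    numerator = 0.0
    denominator = 0.0
    for k in product((0, 1), repeat=len(g)):
        if k[j] != j_state:
            continue
        weight = state_weight(k, g, w, lam)
        denominator += weight
        if k[i]:
            numerator += weight
    return numerator / denominator


# Hypothetical parameters: every w_ij < 1, i.e. all pairs are
# negative cooperative in the sense of Definition 3.
g = (10.0, 1.0, 1.0)
w = {(0, 1): 0.001, (0, 2): 0.001, (1, 2): 0.9}

for lam in (1e-6, 1e-2, 1.0, 10.0):
    p_given_empty = conditional_occupation(1, 2, 0, g, w, lam)
    p_given_bound = conditional_occupation(1, 2, 1, g, w, lam)
    print(lam, p_given_empty, p_given_bound)
```

With these illustrative values, occupying site 3 lowers the occupation probability of site 2 at very low ligand activity but raises it at λ = 1, because site 3 displaces the ligand from site 1, which repels site 2 far more strongly.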
Finally, another problem with Definition 3 appears when we deal with larger systems: The advantage of a simple discrimination between positive, non and negative cooperativity can vanish in larger systems, since additional interaction terms of higher order can contradict the pairwise interaction terms.

Conditional probability functions
Another idea to distinguish between different types of cooperativity on microstates, which has already been mentioned in the context of Example 9, is the comparison of the conditional probabilities $P(k_i = 1 \mid k_j = 0)$ and $P(k_i = 1 \mid k_j = 1)$ as functions of λ (for which the conditional probabilities exist, i.e. $P(k_j = 0) \neq 0 \neq P(k_j = 1)$). Here the problem arises that a simple separation into positive, non- and negative cooperativity may not be possible if the whole functions are considered, which has already been illustrated by molecule $M_1$ of Example 9 (see Fig. 5), in which the conditional probability functions intersect. A variant of characterizing a system's cooperative behavior by conditional probabilities was used by Ben-Naim (2001), who used the function of Eq. (14) for systems with two binding sites and analogous generalizations for larger systems.
Of special interest here is the limit $\lim_{\lambda \to 0} P(k_1 = 1 \mid k_2 = 1)$, which can be used to distinguish between negative, positive and non-cooperativity by comparing the limit to the reference value 1 (Ben-Naim 2001). Obviously, Eq. (14)

Fig. 7 Left: the pairwise "correlation" defined by Eq. (14) of sites 2 and 3 of system $M_1$ of Example 9 as a function of the ligand activity (constant temperature; a decadic logarithmic scale is used). Middle: analogously for system $M_2$ of Example 9. Right: the pairwise "correlation" of molecule M of Example 11, which includes interaction terms of higher order. Axes: ligand activity (horizontal), pairwise correlation (Eq. 14) between sites 2 and 3 (vertical)

The covariance function
Another possibility to characterize the cooperative behavior is to use the covariance function (or the stochastic correlation) $\mathrm{Cov}(k_1, k_2) := E[k_1 k_2] - E[k_1] E[k_2]$.
The basic concept that underlies this approach coincides with that of Eq. (14): If the sites bind the ligand independently, the covariance is zero, and a function rather than a single value still has to be used to characterize the cooperativity between the sites fully. Compared to the approach of Eq. (14), we have the disadvantage here that the limit for λ → 0 does not give any information. The stochastic correlations (covariance of $k_1$, $k_2$ divided by the square root of the product of both variances) of sites 2 and 3 of the molecule M of Example 11 and of molecules $M_1$ and $M_2$ of Example 9 are illustrated in Fig. 8. Both concepts of correlation highlight the fact that the degree of interaction between sites depends on the state of the environment. In particular, interacting binding sites can behave in an independent way at certain non-trivial ligand concentrations (when Eq. (14) equals one and Eq. (16) equals zero). The use of correlation functions instead of single values seems to be more appropriate to capture the interaction between sites. In particular, this concept also solves the problem of possibly contrary interaction energy terms of different order.
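How such correlation functions depend on the ligand activity can be sketched with a few lines of code. The example assumes a hypothetical three-site molecule with only pairwise interaction (weights built from $g_i \lambda$ and $w_{i,j}$) and computes, for one pair of sites, the covariance and the ratio $P(k_i = 1, k_j = 1) / (P(k_i = 1) P(k_j = 1))$. Treating this ratio as the pairwise "correlation" is an assumption made for this sketch, motivated by its reference value 1 for independent sites; the parameter values are illustrative and not taken from Example 9 or Example 11.

```python
from itertools import product


def microstate_probabilities(g, w, lam):
    """Probability of every microstate of a pairwise-interacting molecule."""
    states = list(product((0, 1), repeat=len(g)))
    weights = []
    for k in states:
        weight = 1.0
        for i, occupied in enumerate(k):
            if occupied:
                weight *= g[i] * lam
        for (i, j), w_ij in w.items():
            if k[i] and k[j]:
                weight *= w_ij
        weights.append(weight)
    total = sum(weights)
    return {k: weight / total for k, weight in zip(states, weights)}


def pair_statistics(i, j, g, w, lam):
    """Return (covariance, ratio P_ij / (P_i * P_j)) for sites i and j."""
    probs = microstate_probabilities(g, w, lam)
    p_i = sum(p for k, p in probs.items() if k[i])
    p_j = sum(p for k, p in probs.items() if k[j])
    p_ij = sum(p for k, p in probs.items() if k[i] and k[j])
    return p_ij - p_i * p_j, p_ij / (p_i * p_j)


# Hypothetical parameters with all pairwise constants below 1 (0-based sites).
g = (10.0, 1.0, 1.0)
w = {(0, 1): 0.001, (0, 2): 0.001, (1, 2): 0.9}

for exponent in range(-4, 3):
    lam = 10.0 ** exponent
    covariance, ratio = pair_statistics(1, 2, g, w, lam)
    print(lam, covariance, ratio)
```

For these values the covariance of the two sites is negative at low ligand activity and positive at λ = 1, so by continuity it passes through zero at some intermediate activity, where the two interacting sites behave as if they were independent.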

Summary and outlook
In this work, we discussed the relation between different definitions of cooperative binding and presented extreme examples to illustrate differences between the concepts of cooperativity. For cooperativity defined on the overall titration curve, i.e. on macroscopic properties, we highlighted that a Hill coefficient larger than 1 shows that the variance of the distribution on the macrostates of the considered system at half-saturation is higher than the maximal variance that a system consisting of independent binding sites can reach. In this context, we underlined that the reference point of half-saturation does not in general offer special information and discussed generalizations of the Hill coefficient which also consider "abnormally high" variances. The most general criterion measuring "abnormally high" variances is the threshold given by Eq. (12) (Abeliovich 2005), which we proved also for systems consisting of non-identical binding sites. Moreover, we showed that a sigmoidal shape of an overall titration curve in a linear plot also indicates that the considered system cannot consist of independent binding sites and that, contrary to the Hill coefficient of Definition 1 and its generalizations, this criterion does not treat the value of the variance as the important characteristic, but rather the change in variance with respect to a change of λ. We presented different examples to show that these two concepts of measuring "abnormal" variance (by considering its value or its change) do not coincide. In this regard, we highlighted that non-real roots of the binding polynomial can be a unifying definition for cooperativity. This criterion offers the advantage that both previously mentioned concepts imply that the binding polynomial has at least two non-real roots, so that it generalizes both.
Moreover, the appearance of non-real roots is actually also the only relevant observation to see that the observed titration curve cannot be a result of an independent system. Any binding polynomial which has real roots only can be regarded as an independent system of binding sites whose binding energies are given by a transformation of the roots of the binding polynomial (Onufriev et al. 2001; Martini and Ullmann 2013; Martini et al. 2013b). Conversely, if the polynomial has non-real roots, it cannot belong to a system of independent binding sites, since the independence of the sites would lead to a binding polynomial which is the product of the polynomials (linear factors) of all individual sites and which consequently would have only real roots. However, based on this definition, it is not clear how the degree of cooperativity can be quantified and whether the discrimination between positive and negative cooperativity can be well defined. For cooperativity defined on the measures on microstates, we also showed that a single value alone might not be enough to characterize the cooperative ligand binding behavior of a molecule. Instead, different correlation functions could be used. However, even based on the definitions on microstates, it is not clear how to distinguish well between positive and negative cooperativity, since the qualitative behavior of the correlation can depend on the ligand activity, even in systems with only three binding sites. Concerning the relation between the definitions of cooperativity on the macro- and microstates, we presented examples showing that "negative" cooperativity on the microscopic level can produce phenomena which are usually assigned to "positive" cooperative ligand binding. Considering additional information on the macroscopic binding kinetics might allow the deduction of more microscopic properties from the macroscopic binding behavior (Martini et al. 2013a; Martini and Habeck 2015). The presented examples create the suspicion that a real, ligand-activity-independent discrimination between "positive" and "negative" cooperativity, based on the overall binding isotherm only, is impossible if general binding systems are considered. Overall, this work showed that scientists investigating cooperative effects should clearly state what cooperativity means in the context of their work and check whether preconditions have to be satisfied by the target molecule for consistency of the used definition(s).
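The role of the roots can be made concrete for two binding sites. The following minimal sketch assumes a binding polynomial $\Phi(\lambda) = 1 + a_1 \lambda + a_2 \lambda^2$ with $a_2 > 0$ and uses the factorization $\Phi(\lambda) = \prod_i (1 + g'_i \lambda)$, so that each real root $r_i$ yields the binding constant $g'_i = -1/r_i$ of a hypothetical independent site. The function names and the numerical coefficients are illustrative choices, not values from the examples of this work.

```python
import math


def decoupled_sites(a1, a2):
    """Binding constants of two independent sites reproducing
    Phi = 1 + a1*lam + a2*lam**2, or None if the roots are non-real."""
    discriminant = a1 * a1 - 4.0 * a2
    if discriminant < 0.0:
        return None  # non-real roots: no independent system exists
    root_disc = math.sqrt(discriminant)
    roots = ((-a1 + root_disc) / (2.0 * a2), (-a1 - root_disc) / (2.0 * a2))
    return tuple(-1.0 / r for r in roots)


def titration_from_polynomial(a1, a2, lam):
    """Overall titration curve lam * dPhi/dlam / Phi."""
    return (a1 * lam + 2.0 * a2 * lam ** 2) / (1.0 + a1 * lam + a2 * lam ** 2)


def titration_from_sites(constants, lam):
    """Sum of the titration curves of independent sites."""
    return sum(c * lam / (1.0 + c * lam) for c in constants)


# Real roots: Phi = 1 + 101*lam + 100*lam**2 = (1 + lam) * (1 + 100*lam).
sites = decoupled_sites(101.0, 100.0)
print(sites)
print(titration_from_polynomial(101.0, 100.0, 0.3), titration_from_sites(sites, 0.3))

# Non-real roots: Phi = 1 + 2*lam + 100*lam**2 has no independent counterpart.
print(decoupled_sites(2.0, 100.0))
```

The first polynomial is reproduced exactly by two independent sites, whereas the second one has a negative discriminant and therefore cannot be written as a product of real linear factors.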

Proofs
Proof of Proposition 1 (a) Let $\Phi'$ denote the numerator of Eq. (3) (this is also the derivative of $\Phi$ with respect to $\log(\lambda)$). Then the constant n of Eq. (4) has to coincide with the number of binding sites, since if $a_n \neq 0$ the overall binding curve will tend to the number of binding sites for $\lambda \to \infty$. By precondition, we know which implies and thus $\beta(\lambda)$ is a polynomial of degree smaller than or equal to $(n - 1)$, since it is a difference of two polynomials with the same leading coefficient. Moreover, the constant term of $\beta(\lambda)$ is 1, since the constant term of $\Phi$ is 1. Thus, α has to be a natural number (a positive integer), since otherwise $\beta(\lambda)(1 + K\lambda^{\alpha})$ would not be a polynomial. (b) In the case of α = 1, the statement is true. Let us now consider the case α > 1. Let $\beta(\lambda) = b_{n-1}\lambda^{n-1} + \cdots + b_1 \lambda + 1$. Equation (17) implies that $b_i = 0$, $\forall i > n - \alpha$. Moreover, it shows that which implies For $i = n$, Eq. (20) is always fulfilled, since both sides are zero, independent of the value of $b_{n-\alpha}$. However, for $i \neq n$, this equation states that $b_i = 0$ implies $b_{i-\alpha} = 0$, which gives and thus α divides n. (c) $0 < a_n = b_{n-\alpha} K \Rightarrow b_{n-\alpha} > 0$, since $K > 0$. Eq. (20) gives $0 < b_{n-\alpha} = \frac{\alpha}{n-\alpha} b_{n-2\alpha} K$, which implies $b_{n-2\alpha} > 0$, as long as $2\alpha \leq n$. This implies that

Proof (Hill coefficient of a Hill equation) Using Definition 1, the Hill coefficient of an equation of shape (4) is

Proof of Proposition 2 See Hill (1985), result in Eq. (11.41).
Proof of Proposition 3 $\Psi = \Phi'/\Phi$ is a quotient of two polynomials in the variable λ. We introduced the notation $f'$ for the derivative with respect to $\log(\lambda)$.
is also a rational function in the variable λ. To see that $\Psi'$ has a local extremum, we consider its derivative with respect to λ, which is a rational function in λ, since $\Psi'$ is a rational function. Thus, $\frac{\partial}{\partial\lambda}\Psi' = 0$ means that its numerator is equal to zero. Since this numerator is a polynomial, it equals zero at a finite number of points $\lambda_1, \ldots, \lambda_m$. We assume the equality which we will deduce from Eq. (9) later. $\Psi'$ is continuous (on $\lambda \geq 0$) and it has only a finite number of local extrema, since $\frac{\partial}{\partial\lambda}\Psi'$ has only a finite number of zeros. Let $\varepsilon > 0$ be a number smaller than the distance of $\lambda_0$ to the nearest root of the numerator of $\frac{\partial}{\partial\lambda}\Psi'$. Then the ball $B_\varepsilon(\lambda_0)$ does not include more than one extremum (which would have to be at $\lambda_0$). Let $(a_n)_{n \in \mathbb{N}} \to 1$ be a strictly monotonously increasing sequence with $a_n \lambda_0 \in B_\varepsilon(\lambda_0)$ $\forall n$. Then $a_n \lambda_0 \to \lambda_0$ increases strictly monotonously, and $\Psi'(a_n \lambda_0) \to \Psi'(\lambda_0)$ is strictly monotonously (either increasing or) decreasing, since otherwise another root of the numerator of $\frac{\partial}{\partial\lambda}\Psi'$ within the ball $B_\varepsilon(\lambda_0)$ would exist. Then, the sequence $\Psi'(a_n^{-1} \lambda_0) \to \Psi'(\lambda_0)$ is strictly monotonously (either increasing or) decreasing, due to Eq. (22), but with $a_n^{-1} \lambda_0$ strictly monotonously decreasing. Consequently, $\Psi'$ has a local extremum at $\lambda_0$. To prove Eq. (22), we regard the following: Thus, Eq. (22) is true if To see that Eq. (24) is true, we consider the functions $f(a) := a\lambda_0$ and $g(a) = a^{-1}\lambda_0$ and rewrite Eq. Calculating the derivative with respect to a gives

Proof of Proposition 4 Let $X_i$ denote the Bernoulli variable describing whether site i is occupied or not. The variance of the distribution on the macrostates $V(\lambda)$ is given by the sum of the variances $V_\lambda(X_i)$ and covariances $\mathrm{Cov}_\lambda(X_i, X_j)$ We will show that $V(\lambda)$ exceeding the threshold of Eq. (12) implies that which in particular implies that the sum of the covariances is not zero and which again means that the considered system cannot consist of independent binding sites. For this purpose, we show that With $p_{i,\lambda}$ denoting the probability of site i being occupied at ligand activity λ, and thus with $\Psi = \sum_{i=1}^{n} p_{i,\lambda}$, the upper equation means This "sum of squares equation" is a result of Lagrange's identity (see e.g. Weisstein 2002, with one of the vectors of Lagrange's identity with constant entry 1). This means that Eq. (12) implies $\sum_{i,j=1,\, i \neq j}^{n} \mathrm{Cov}_\lambda(X_i, X_j) > 0$ and consequently that the observed overall titration curve cannot result from a system of independent binding sites.
Proof of Lemma 1 Let a $\lambda_0$ exist such that Since the second derivative of an overall titration curve will be negative if λ is sufficiently large, and since it is continuous, a $\lambda_1$ exists such that $\frac{d\Psi}{d\lambda}$ has a local maximum at $\lambda_1$. Moreover, since this implies that $\frac{V(\lambda)}{\lambda}$ has a local maximum at $\lambda_1$ (analogously the converse).
Proof of Proposition 5 Let us assume that an independent system exists which produces the observed overall titration curve. We will prove the statement by contradiction.
We can write the variance of |k| as the sum of the variances of the independent Bernoulli variables The variance of a Bernoulli variable which is independent of the remaining system is given by $\frac{g_i \lambda}{(g_i \lambda + 1)^2}$, with $g_i$ characterizing the affinity of the ligand to the ith binding site, and thus The right side of this equation is a sum of monotonously decreasing functions and thus cannot have a local maximum at $\lambda_1$.
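The monotonicity argument for independent sites can be checked numerically. The sketch below assumes independent sites with hypothetical binding constants $g_i$, compares the variance of the number of occupied sites obtained by enumerating all microstates with the sum of the Bernoulli variances $g_i \lambda / (g_i \lambda + 1)^2$, and then verifies on a logarithmic grid that $V(\lambda)/\lambda$ only decreases. The constants and the grid are arbitrary illustrative choices.

```python
import math
from itertools import product


def variance_by_enumeration(g, lam):
    """Variance of the number of occupied sites from all microstates."""
    total = first = second = 0.0
    for k in product((0, 1), repeat=len(g)):
        weight = 1.0
        for g_i, occupied in zip(g, k):
            if occupied:
                weight *= g_i * lam
        count = sum(k)
        total += weight
        first += count * weight
        second += count * count * weight
    mean = first / total
    return second / total - mean * mean


def variance_closed_form(g, lam):
    """Sum of the Bernoulli variances of independent sites."""
    return sum(g_i * lam / (g_i * lam + 1.0) ** 2 for g_i in g)


g = (0.5, 3.0, 40.0)  # hypothetical binding constants
grid = [10.0 ** (e / 10.0) for e in range(-30, 31)]

# The closed form agrees with the brute-force enumeration.
agree = all(
    math.isclose(variance_by_enumeration(g, lam), variance_closed_form(g, lam), rel_tol=1e-6)
    for lam in grid
)

# V(lam) / lam = sum_i g_i / (g_i*lam + 1)**2 decreases along the grid.
scaled = [variance_closed_form(g, lam) / lam for lam in grid]
decreasing = all(a > b for a, b in zip(scaled, scaled[1:]))
print(agree, decreasing)
```

Each term $g_i / (g_i \lambda + 1)^2$ decreases in λ, so no choice of independent sites can produce the local maximum of $V(\lambda)/\lambda$ that a sigmoidal titration curve requires.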
Proof of Example 4 Example 4 shall demonstrate that a system in which the variance of the number of occupied sites is always below the bound 0.25 · n can be of sigmoidal shape in a linear plot, which means that $V(\lambda)/\lambda$ has a local maximum. The first property, $V(\lambda)$ being smaller than 0.75, can be seen in Fig. 2. However, we give a short proof to exclude the possibility that the local maximum is higher but is not recognized in the plot because of the choice of the lattice on which the function is evaluated. It is clear from the plot that values of λ exist such that $V(\lambda) < 0.75$. What we have to show is (since $V(\lambda)$ is continuous) that $V(\lambda) - 0.75$ does not have a root on the positive real numbers. Writing with E denoting the expectation operator, $\Phi$ the binding polynomial and $Z_i$ the corresponding numerators gives that we are interested in the roots of the polynomial The computer algebra program Maxima tells us that a positive root does not exist.
Moreover, to avoid discussions about the exact definition of a sigmoidal shape (exactly one vs. at least one inflection point), we show that the function has exactly one inflection point in the following way: We use the computer algebra program Maxima, calculate the derivative of $V(\lambda)/\lambda$ with respect to λ (diff(%, x, 1)) and calculate the exact roots of the numerator, which is a polynomial (allroots(..), ratnumer(..)). There is exactly one root in the positive numbers.

Proof of Example 5
The fact that $V(\lambda)$ is not bounded by 0.75 is obvious. What has to be shown is that the curve is not even slightly sigmoidal. Analogously to the procedure described in the former proof, we calculate the derivative of $V(\lambda)/\lambda$ using the computer algebra program Maxima and determine the roots of the numerator. None of them is positive.

Proof of Example 6
The statement is that the binding isotherm defined by $10\lambda^5 + 10^3\lambda^4 + (10^4 + 1)\lambda^3 + 110\lambda^2 + 10^3\lambda + 1$ is sigmoidal, but Eq. (12) is not fulfilled. To see that the binding curve is sigmoidal, plot the function $\lambda^{-1} V(\lambda)$ on the interval $\lambda \in [10^{-2}, 1]$ to see that it has a local maximum. Equation (12) is not satisfied if To see that this is true, we rewrite this equation to The binding polynomial $\Phi$ has only real non-negative coefficients with constant term 1, and is consequently positive for all non-negative λ. We multiply both sides with $\Phi^2$ to obtain Plugging the polynomial of Example 6 into this equation, we have to show that $-0.2 \cdot (2999900\lambda^8 + 59973000\lambda^7 + 598860006\lambda^6 - 19200120\lambda^5 - 59993400\lambda^4 + 359970\lambda^3 + 3998900\lambda^2) \leq 0 \;\; \forall \lambda \geq 0$. The left side represents a polynomial with negative leading coefficient with value 0 at λ = 0. Calculating its roots shows that it does not have any other root on the positive real numbers, which implies the statement.
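The polynomial quoted in this proof can be re-derived with exact integer arithmetic. The sketch below assumes that the rearranged inequality has the form $\Phi''\Phi - (1 - 1/n)\Phi'\Phi' - \Phi'\Phi \leq 0$, with primes denoting derivatives with respect to $\log(\lambda)$ and n = 5; this form is an assumption, chosen because it reproduces the quoted coefficients. To stay in integers, the expression is multiplied by n, so the result should equal −1 times the polynomial in parentheses. The helper names are illustrative.

```python
def poly_mul(p, q):
    """Product of two polynomials given as ascending coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def log_derivative(p):
    """Coefficients of lam * dP/dlam, i.e. the derivative w.r.t. log(lam)."""
    return [i * a for i, a in enumerate(p)]


def evaluate(p, lam):
    """Evaluate a polynomial (ascending coefficients) with Horner's scheme."""
    value = 0.0
    for a in reversed(p):
        value = value * lam + a
    return value


# Binding polynomial of Example 6, ascending powers of lam.
phi = [1, 10 ** 3, 110, 10 ** 4 + 1, 10 ** 3, 10]
n = len(phi) - 1
d1 = log_derivative(phi)  # Phi'
d2 = log_derivative(d1)   # Phi''

# n * (Phi''*Phi - Phi'*Phi) - (n - 1) * Phi'*Phi'
scaled = [
    n * (a - b) - (n - 1) * c
    for a, b, c in zip(poly_mul(d2, phi), poly_mul(d1, phi), poly_mul(d1, d1))
]

# -1 times the polynomial in parentheses quoted in the proof (ascending).
quoted = [0, 0, -3998900, -359970, 59993400, 19200120,
          -598860006, -59973000, -2999900, 0, 0]

print(scaled == quoted)
grid = [10.0 ** (e / 20.0) for e in range(-80, 81)]
print(all(evaluate(scaled, lam) <= 0.0 for lam in grid))
```

The grid evaluation is only a plausibility check of the sign on a finite set of points; it does not replace the root calculation referred to in the proof.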
Proof of Example 7 The corresponding titration curve is not sigmoidal: We used the computer algebra program Maxima to calculate the derivative of $\lambda^{-1} V(\lambda)$, which is a rational function. A local maximum has to be a root of its numerator. None of its roots is a positive real number.
To see that Eq. (12) is not fulfilled, calculate $\Phi''\Phi - (1 - 1/n)\Phi'\Phi' - \Phi'\Phi$, which is a polynomial of degree four with negative leading coefficient and no positive root.