A study of the compactness of wave functions based on Shannon entropy indices: a seniority number approach

This work reports the formulation of Shannon entropy indices in terms of seniority numbers of the Slater determinants expanding an N-electron wave function. Numerical determinations of those indices prove that they provide a suitable quantitative procedure to evaluate compactness of wave functions and to describe their configurational structures. An analysis of the results, calculated for full configuration interaction wave functions in selected atomic and molecular systems, allows one to compare and to discuss the behavior of several types of molecular orbital basis sets in order to achieve more compact wave function expansions, and to study their multiconfigurational character.


Introduction
Configuration interaction (CI) methods have played an important role in describing N-electron systems, since they demand a lower computational cost than that required for determining the full configuration interaction (FCI) expansions, which provide the exact descriptions for a given Hilbert space. Consequently, there has been considerable interest in formulating N-electron wave functions in terms of CI expansions that converge rapidly to the FCI ones [1][2][3][4]. Traditionally, the CI wave functions have been expanded by means of N-electron Slater determinants selected according to their excitation degrees with respect to a given reference determinant. More recently, however, another selection criterion has been proposed, based on the seniority number of the Slater determinants used to construct the CI expansions [5][6][7][8][9][10][11]. Results arising from both excitation- and seniority number-based CI schemes show that the seniority number-based selection procedure is particularly suitable to describe systems exhibiting strong (static) correlation [5], which has increased the interest in this approach [9][10][11]. As is well known, the seniority number of a Slater determinant is defined as the number of singly occupied orbitals in that determinant [12,13]. The seniority number concept has been extended in Refs. [6,9] to N-electron wave functions which describe electronic states of atomic and molecular systems, as well as to N-electron spin-adapted Hilbert spaces. The expectation value of the seniority number operator with respect to an N-electron wave function is a weighted sum of the seniority numbers of all determinants involved in the expansion of that wave function. The weights that determine those expectation values depend on the molecular orbital basis set used to express the wave function, and consequently, this feature has been utilized to evaluate the compactness of the FCI and CI expansions in several molecular basis sets. Likewise, the seniority number value with respect to a wave function allows one to analyze the multiconfigurational character of the N-electron expansion, which is useful to describe the static and dynamic correlation of a given state [14][15][16][17].
On the other hand, in Ref. [18], the extent of the multiconfigurational character of an N-electron wave function was evaluated by means of numerical determinations of an index set formulated within the Shannon information entropy approach [19][20][21]. This treatment provides suitable information concerning the distribution of the wave function among different configurations characterized by the excitation degree of the Slater determinants. The aim of this work is to extend this methodology to the seniority-based CI scheme and to report the corresponding Shannon index numerical values in terms of the contributions of Slater determinants classified according to the seniority number criterion. More recently, in Ref. [6], we have proposed unitary transformations which lead to the construction of basis sets of molecular orbitals in which the expectation values of the seniority number operator with respect to N-electron wave functions reach minimum values. The results found using this type of molecular orbitals show that the wave function expansions converge more rapidly than those arising from the use of other molecular orbitals [9,11]. Another aim of this work is to evaluate and compare quantitatively, by means of the proposed Shannon entropy indices, the compactness of wave functions expressed in canonical molecular orbital (CMO) basis sets, natural orbitals (NO), and the above-mentioned orbitals $M_{min}$, which minimize the expectation value of the seniority number operator.
This article has been organized as follows. Section 2 summarizes the notation and formulation of the main concepts used in this work; it also reports the formulation of the Shannon entropy indices in terms of the seniority numbers of the Slater determinants. In Sect. 3, we present numerical values of those indices for wave functions of selected atomic and molecular systems; these values allow one to characterize the compactness of the wave function expansions. The calculation level and the computational details are also indicated in this section. An analysis and discussion of these results are reported in Sect. 4. Finally, in the last section, we highlight the main conclusions and perspectives of this work.

Theoretical framework
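The K orbitals of an orthonormal basis set will be denoted by $i, j, k, l, \ldots$ and their corresponding spin-orbitals by $i^{\sigma}, j^{\sigma'}, \ldots$ ($\sigma$ and $\sigma'$ mean the spin coordinates $\alpha$ or $\beta$). The spin-free version of the N-electron seniority number operator $\hat{\Omega}$ has been formulated as [6,9,11]

$$\hat{\Omega} = \sum_{i=1}^{K} \hat{E}^{i}_{i} - \sum_{i=1}^{K} \hat{E}^{ii}_{ii} \tag{1}$$

where

$$\hat{E}^{i}_{j} = \sum_{\sigma} a^{\dagger}_{i^{\sigma}} a_{j^{\sigma}}, \qquad \hat{E}^{ij}_{kl} = \sum_{\sigma,\sigma'} a^{\dagger}_{i^{\sigma}} a^{\dagger}_{j^{\sigma'}} a_{l^{\sigma'}} a_{k^{\sigma}}$$

are the spin-free first- and second-order replacement operators, respectively [22][23][24][25], and $a^{\dagger}_{i^{\sigma}}$ and $a_{i^{\sigma}}$ are the usual creation and annihilation fermion operators [26].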
Closing both sides of Eq. (1) by an N-electron Slater determinant of $S_z$ spin projection quantum number, one obtains

$$\langle \hat{\Omega} \rangle = \Omega \tag{2}$$

where, according to Eq. (1), the expectation value $\langle \hat{\Omega} \rangle$ is the difference $\sum_i \langle \hat{E}^{i}_{i} \rangle - \sum_i \langle \hat{E}^{ii}_{ii} \rangle$, which is the total number of electrons N minus the number of electrons corresponding to doubly occupied orbitals. The possible values for the parameter $\Omega$ are non-negative integers belonging to the sequence $2|S_z|, 2|S_z|+2, 2|S_z|+4, \ldots$; their meaning is the number of non-repeated (singly occupied) orbitals in each determinant. That parameter allows one to classify the Slater determinants of $S_z$ quantum number according to the corresponding seniority level, and they will be denoted hereafter by $\Lambda(\Omega)$. Consequently, an FCI N-electron wave function with given spin quantum numbers S and $S_z$ will be expressed by

$$|\Psi(N,S,S_z)\rangle = \sum_{\Omega = 2S,\, 2S+2,\, \ldots} \; \sum_{\Lambda(\Omega)} C_{\Lambda(\Omega)} \, |\Lambda(\Omega)\rangle \tag{3}$$

where $C_{\Lambda(\Omega)}$ stands for the coefficient corresponding to the Slater determinant $\Lambda(\Omega)$. Since there is no contribution of Slater determinants with $\Omega < 2S$ to spin-adapted N-electron wave functions ($\langle \Lambda(\Omega < 2S) | \Psi(N,S,S_z) \rangle = 0$), the lowest integer in the sum is 2S. If we truncate the series in Eq. (3), we obtain CI($\Omega$) wave function expansions involving only Slater determinants belonging to the selected $\Omega$ levels.
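As a minimal sketch of the classification described above, the seniority number of a determinant can be obtained directly from its orbital occupations. The encoding used here (two collections of spatial-orbital indices, one per spin) and the function name `seniority` are illustrative assumptions, not part of the original formulation.

```python
def seniority(alpha_occ, beta_occ):
    """Number of singly occupied spatial orbitals in a Slater determinant.

    alpha_occ, beta_occ: iterables of spatial-orbital indices that hold an
    alpha or a beta electron (hypothetical encoding chosen for this sketch).
    """
    alpha, beta = set(alpha_occ), set(beta_occ)
    n_electrons = len(alpha) + len(beta)
    n_doubly = len(alpha & beta)  # orbitals holding both an alpha and a beta electron
    # Eq. (2): Omega = N minus the electrons sitting in doubly occupied orbitals
    return n_electrons - 2 * n_doubly


# Four electrons with S_z = 0: the three possible seniority levels
print(seniority([0, 1], [0, 1]))  # closed shell
print(seniority([0, 1], [0, 2]))  # one broken pair
print(seniority([0, 1], [2, 3]))  # all electrons unpaired
```

For a determinant with $S_z = 1$, such as `seniority([0, 1, 2], [0])`, the result cannot fall below $2|S_z| = 2$, in line with the lower bound on $\Omega$.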
According to Eqs. (1) and (3), the expectation value of the operator $\hat{\Omega}$ with respect to the FCI wave function $\Psi(N,S,S_z)$, normalized to unity, is [6]

$$\langle \hat{\Omega} \rangle_{\Psi(N,S)} = \sum_{\Omega} \sum_{\Lambda(\Omega)} |C_{\Lambda(\Omega)}|^2 \, \Omega = N - \sum_{\Omega} \sum_{\Lambda(\Omega)} |C_{\Lambda(\Omega)}|^2 \sum_{i=1}^{K} \langle \Lambda(\Omega) | \hat{E}^{ii}_{ii} | \Lambda(\Omega) \rangle \tag{4}$$

which is a spin-free quantity, independent of the $S_z$ value, and consequently, we have dropped that quantum number. The coefficients $C_{\Lambda(\Omega)}$ and the $\langle \hat{\Omega} \rangle_{\Psi(N,S)}$ values are strongly dependent on the molecular orbital set utilized to formulate the Slater determinants $\Lambda(\Omega)$ in the expansion of the wave function expressed by Eq. (3). As mentioned in the Introduction, in Refs. [6,9,11], we have performed unitary transformations of the molecular orbitals, based on iterative procedures [27], which lead to the minimization of the $\langle \hat{\Omega} \rangle_{\Psi(N,S)}$ values; the resulting molecular orbital basis sets have been denoted $M_{min}$. That minimization requires the search for molecular orbitals leading to high values of the coefficients $|C_{\Lambda(\Omega)}|$ corresponding to the determinants which possess a greater number of doubly occupied orbitals, i.e., those providing higher values of the $\sum_{i}^{K} \langle \Lambda(\Omega) | \hat{E}^{ii}_{ii} | \Lambda(\Omega) \rangle$ quantities. Our results [6,9,11] have shown that the expansions for ground-state wave functions of atomic and molecular systems expressed in the molecular orbital basis sets $M_{min}$ turn out to be more compact than those arising from the canonical molecular orbitals (CMO) or natural orbitals (NO).
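The weighted-sum character of this expectation value can be illustrated with a short sketch. It assumes the expansion is available as a list of (seniority, coefficient) pairs; that representation, the function name, and the numerical coefficients are hypothetical and serve only as an example.

```python
def seniority_expectation(expansion):
    """<Omega> for an expansion given as (seniority, coefficient) pairs.

    The squared moduli are divided by their sum, so the input does not
    need to be normalized beforehand.
    """
    norm = sum(abs(c) ** 2 for _, c in expansion)
    return sum(omega * abs(c) ** 2 for omega, c in expansion) / norm


# Hypothetical four-determinant expansion; squared coefficients add up to 1
toy = [(0, 0.9), (0, 0.3), (2, 0.3), (4, 0.1)]
print(seniority_expectation(toy))
```

Only the determinants with $\Omega \neq 0$ contribute to the sum, so an orbital basis that shifts weight toward the $\Omega = 0$ determinants lowers the expectation value.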
Quantitative measures of the compactness of an N-electron wave function have been reported in Ref. [18] by means of the informational content $I_C$ (or Shannon entropy) within the traditional CI expansion method based on Slater determinants classified according to the excitation level with respect to a given reference determinant. Assuming that the N-electron wave function is normalized to unity,

$$\sum_{\Omega} \sum_{\Lambda(\Omega)} |C_{\Lambda(\Omega)}|^2 = 1,$$

the counterpart formulation of that index for the seniority-based CI approach is

$$I_C = -\sum_{\Omega} \sum_{\Lambda(\Omega)} |C_{\Lambda(\Omega)}|^2 \log_2 |C_{\Lambda(\Omega)}|^2 \tag{5}$$

in which the index $\Omega$ runs over all the values defining the chosen CI($\Omega$) expansion seniority levels. According to Eq. (5), the $I_C$ index accounts for the wave function configurational distribution, having a minimum value ($I_C = 0$) in the case of a single-determinant wave function.
The values of this $I_C$ index quantify the multiconfigurational character of the CI wave function but do not report any detailed information on the contributions corresponding to different seniority subspaces. For CI($\Omega$) expansions involving several values of the $\Omega$ index, we can define a weight $W_{\Omega}$ which groups the contributions of all the Slater determinants with given seniority number $\Omega$ in that expansion

$$W_{\Omega} = \sum_{\Lambda(\Omega)} |C_{\Lambda(\Omega)}|^2 \tag{6}$$

These weights provide the definition of the cumulative index $I_W$, which in the seniority number approach is

$$I_W = -\sum_{\Omega} W_{\Omega} \log_2 W_{\Omega} \tag{7}$$

which evaluates that entropic quantity in terms of the weights corresponding to the seniority numbers $\Omega$, providing a measure of the distribution of the wave function on different seniority subspaces. One can also consider the distribution of each $\Omega$ subspace in terms of its corresponding Slater determinants and calculate its specific entropic index $I_{\Omega}$, which can be evaluated by means of the relationship

$$I_{\Omega} = -\sum_{\Lambda(\Omega)} \frac{|C_{\Lambda(\Omega)}|^2}{W_{\Omega}} \log_2 \frac{|C_{\Lambda(\Omega)}|^2}{W_{\Omega}} \tag{8}$$

where the denominators $W_{\Omega}$ have been introduced for normalization requirements. Formula (8) accounts for the configuration distribution within a given seniority number level.
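The three indices defined above can be evaluated together in a few lines. The following is a sketch under stated assumptions: the expansion is supplied as (seniority, coefficient) pairs, the logarithm base is exposed as a parameter with base 2 as the default, and the function name `shannon_indices` and the toy coefficients are hypothetical.

```python
import math
from collections import defaultdict


def shannon_indices(expansion, base=2.0):
    """Return (I_C, I_W, {Omega: I_Omega}) for (seniority, coefficient) pairs."""
    norm = sum(abs(c) ** 2 for _, c in expansion)
    by_level = defaultdict(list)
    for omega, c in expansion:
        p = abs(c) ** 2 / norm
        if p > 0.0:  # vanishing coefficients contribute nothing to the entropy
            by_level[omega].append(p)

    def entropy(probs):
        return -sum(p * math.log(p, base) for p in probs)

    # Informational content over all determinants
    i_c = entropy(p for probs in by_level.values() for p in probs)
    # Weight of each seniority level and the cumulative index built from them
    weights = {omega: sum(probs) for omega, probs in by_level.items()}
    i_w = entropy(weights.values())
    # Specific index of each level, using probabilities renormalized by its weight
    i_omega = {omega: entropy(p / weights[omega] for p in probs)
               for omega, probs in by_level.items()}
    return i_c, i_w, i_omega


# Hypothetical expansion: two seniority-0 determinants and one seniority-2 determinant
i_c, i_w, i_omega = shannon_indices([(0, 0.9), (0, 0.3), (2, math.sqrt(0.10))])
print(i_c, i_w, i_omega)
```

A single-determinant input gives zero for all three indices, while two equally weighted determinants in the same seniority level give $I_C = 1$, $I_W = 0$, and $I_{\Omega} = 1$ when base-2 logarithms are used.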
As mentioned above, the multiconfigurational character of an N-electron wave function expanded in terms of Slater determinants allows one to distinguish between systems exhibiting static (strong) correlation (in which a suitable zeroth-order description requires several Slater determinants) and those possessing dynamic correlation (in which a single Slater determinant is a good zeroth-order wave function). In the next sections, we report numerical values of the $I_C$ and $I_W$ indices in selected atomic and molecular systems, in order to assess the ability of these devices to describe quantitatively both types of electronic correlation within the seniority number approach. Likewise, we present values of the $I_{\Omega}$ index which show the influence of the bond stretching on the configurational distribution within the seniority number subspaces. As these Shannon entropy indices and the seniority number of a given wave function are not invariant under a unitary single-particle transformation, it is possible to perform unitary transformations of the molecular basis set and to compare the values of these entropic indices in the different molecular basis sets utilized. In particular, we compare values of Shannon indices arising from the molecular basis sets $M_{min}$ (in which the seniority number reaches its minimum value) with those provided by the CMO and NO sets.
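The lack of invariance under orbital rotations can be seen in the smallest possible model, a two-electron singlet in two orbitals. In this sketch, which is not taken from the systems studied in this work, the wave function is written as $\sum_{ij} C_{ij} |i^{\alpha} j^{\beta}\rangle$ with a symmetric matrix $C$; the diagonal elements belong to $\Omega = 0$ determinants and the off-diagonal ones to $\Omega = 2$ determinants. For this model an orthogonal orbital rotation $U$ changes the coefficients as $C' = U^{T} C U$. The numerical values and the function names are hypothetical.

```python
import math


def transform(c, u):
    """Return C' = U^T C U for an orthogonal orbital rotation U."""
    n = len(c)
    return [[sum(u[i][p] * c[i][j] * u[j][q]
                 for i in range(n) for j in range(n))
             for q in range(n)]
            for p in range(n)]


def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def analyze(c):
    """Return (<Omega>, I_C, I_W) for a normalized two-electron singlet."""
    n = len(c)
    paired = [c[p][p] ** 2 for p in range(n)]                 # seniority 0
    unpaired = [c[p][q] ** 2
                for p in range(n) for q in range(n) if p != q]  # seniority 2
    w0, w2 = sum(paired), sum(unpaired)
    return 2.0 * w2, entropy(paired + unpaired), entropy([w0, w2])


# Coefficients in the basis where C is diagonal (hypothetical values, 0.96^2 + 0.28^2 = 1)
c_diag = [[0.96, 0.0], [0.0, -0.28]]

# The same state after rotating the two orbitals by 30 degrees
theta = math.radians(30.0)
u = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]
c_rot = transform(c_diag, u)

print("diagonal basis:", analyze(c_diag))
print("rotated basis: ", analyze(c_rot))
```

In the basis that diagonalizes $C$ the state contains only $\Omega = 0$ determinants, so the seniority expectation value and $I_W$ vanish; in the rotated basis the same state acquires $\Omega = 2$ contributions and both $I_C$ and $I_W$ increase. For a two-electron singlet, the diagonalizing orbitals are the natural orbitals, so in this toy case they also minimize the seniority number.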

Results
We have determined expansions of wave functions of several atomic and molecular systems in their ground states, at the FCI level. These wave functions have been expressed in the three mentioned molecular basis sets CMO, NO, and $M_{min}$, in order to study their compactness in different molecular orbital basis sets. Our aim is to analyze the structure and compactness of those expansions by means of the entropic indices proposed in Eqs. (5), (7), and (8) according to the seniority numbers. We have mainly chosen the four- and six-electron systems Be, LiH, BeH$^+$, Li$_2$, BH, BH$_2^+$, and BeH$_2$ and the STO-3G basis sets, in order to keep the computational cost affordable. Moreover, we also report results corresponding to the Be atom in the cc-pVDZ basis set and the Mg atom in the 6-31G basis set, which are prototype examples of strongly correlated systems due to the near-degeneracies between their s and p shells. The molecular systems have been studied at equilibrium distances ($R_e$) and at stretched ones ($R_{st}$). The experimental geometrical distances have been used for the neutral species LiH, Li$_2$, BH, and BeH$_2$ [28]; in the molecular ion BeH$^+$, we have used the internuclear distance reported in Refs. [29] and [30], while in the system BH$_2^+$, the geometry was optimized with the GAUSSIAN package [31] at the level of single and double excitations. The one- and two-electron integrals and the Hartree-Fock canonical molecular orbital basis sets required for our calculations have been obtained from a modified version of the PSI 3.3 code [32]. We have constructed our own codes to determine the ground-state FCI wave functions for these systems expressed in the basis sets of CMO and NO; the orbitals minimizing the seniority number for a given wave function have been obtained from an iterative procedure reported in Ref. [27], using the CMO sets as initial bases of that iteration. The results found for the $I_C$ and $I_W$ quantities in those systems are gathered in Table 1, while Table 2 collects the $I_{\Omega}$ index values of each seniority number level in the corresponding wave function expansion.

Discussion
The numerical results reported in Tables 1 and 2 have been obtained from the FCI method, which coincides with the CI($\Omega = 0, 2, 4$) one, in the CI framework, for the 4-electron systems (Be, LiH, and BeH$^+$). Likewise, for the 6-electron systems (Li$_2$, BH, BH$_2^+$, and BeH$_2$), the FCI and the CI($\Omega = 0, 2, 4, 6$) methods are identical. A survey of the results included in Table 1 shows that all described systems present low values for the $I_W$ index, mainly at equilibrium geometries as well as at stretched ones in the NO and $M_{min}$ molecular basis sets. This means that most of the weight of expansion (3) is gathered in a single weight $W_{\Omega}$, constituting a narrow $\Omega$-level distribution. In fact, the weights corresponding to $\Omega = 0$ for these closed-shell singlet ground states are close to unity ($W_{\Omega=0} \sim 1$) [6]. Consequently, the determinants $\Lambda(\Omega = 0)$ are quite dominant in those expansions, while the others, $\Lambda(\Omega \neq 0)$, can be neglected. The CI($\Omega = 0$) method has also been called doubly occupied configuration interaction (DOCI) [33], since its N-electron wave functions are expanded on all possible $\Lambda(\Omega = 0)$ determinants. The low values found for the $I_W$ index show that, in these FCI wave functions, the CI($\Omega = 0$) or DOCI expansions are close to the FCI ones, which is in agreement with the conclusions reported in Ref. [34], where the DOCI method has been identified as a valuable tool to describe a wide variety of systems possessing strong correlation. As shown in Table 1, the $I_C$ values are higher than their counterpart $I_W$ ones, indicating that the expansions (3) involve significant contributions of $\Lambda(\Omega = 0)$ Slater determinants other than the ground closed-shell ones, which are usually chosen as reference determinants within the traditional excitation CI approach. This result is confirmed by the values reported in Table 2, where the $I_{\Omega=0}$ indices show a configurational distribution that cannot be considered as narrow. The results for the Be and Mg atoms show that these general trends are kept when basis sets larger than the minimal STO-3G ones are used. The presence of strong correlation in the Be atom is well known, and consequently, its wave functions possess a multiconfigurational character even at zeroth-order descriptions; identical behavior has been found in the Mg atom.
Our results confirm this feature, showing that in the three isoelectronic species Be, LiH($R_e$), and BeH$^+$($R_e$), the highest $I_C$ index value corresponds to the Be atom (the widest multiconfigurational distribution), while the $I_W$ index presents small values for that atomic system; its wave functions have a narrow distribution in terms of seniority levels, with a low contribution of $\Lambda(\Omega \neq 0)$ determinants.
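The relation between the $I_C$ and $I_W$ values discussed above can be rationalized with the grouping property of the Shannon entropy, a standard identity that is not stated explicitly in this work: with the definitions of Eqs. (5)-(8), $I_C = I_W + \sum_{\Omega} W_{\Omega} I_{\Omega}$, so that $I_C$ can never be lower than $I_W$. The sketch below checks this numerically; the squared coefficients are hypothetical and base-2 logarithms are assumed.

```python
import math


def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


# Hypothetical squared coefficients |C|^2 grouped by seniority level (they add up to 1)
levels = {
    0: [0.80, 0.10, 0.05],
    2: [0.02, 0.02],
    4: [0.01],
}

weights = {omega: sum(probs) for omega, probs in levels.items()}          # Eq. (6)
i_c = entropy(p for probs in levels.values() for p in probs)              # Eq. (5)
i_w = entropy(weights.values())                                           # Eq. (7)
i_omega = {omega: entropy(p / weights[omega] for p in probs)              # Eq. (8)
           for omega, probs in levels.items()}

# Grouping identity: I_C = I_W + sum over levels of W_Omega * I_Omega
reconstructed = i_w + sum(weights[omega] * i_omega[omega] for omega in levels)
print(i_c, i_w, reconstructed)
```

In this toy distribution the $\Omega = 0$ level carries 95% of the weight, which keeps $I_W$ small, while the spread of that weight over three determinants is what raises $I_C$; this mirrors the behavior described for the Be atom.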
The results reported in Table 1 also allow one to compare, in terms of the values of the indices $I_C$ and $I_W$, the expansions of the wave functions of these systems according to the molecular orbital basis sets in which they are expressed. As can be seen from that table, the values of both indices are considerably lower in the NO and $M_{min}$ basis sets than in their CMO counterparts (except for the Be atom in the STO-3G basis set); the Be atom recovers the improvement in the $M_{min}$ and NO molecular basis sets when the larger cc-pVDZ basis set is used. These results again confirm that the NO and $M_{min}$ molecular basis sets lead to more compact wave functions, as has been reported in Refs. [6,9,11]. These values also point out that the $I_C$ and $I_W$ indices constitute suitable devices to describe quantitatively the compactness of a wave function. The high values found for the $I_C$ indices in the Be and Mg atoms in the three molecular basis sets can be interpreted in terms of the strong correlation exhibited by those systems. The appropriate ground-state wave functions for these atoms require several dominant Slater determinants. The $I_{\Omega}$ values reported in Table 2 reflect that seniority levels with very low contribution to the wave functions can present a broad determinantal distribution; e.g., the Li$_2$ molecule exhibits $I_{\Omega=4} > 5$ values because its $W_{\Omega=4} = 10^{-4}$ weight is expanded on 7560 Slater determinants in the STO-3G basis set [6]. Moreover, the $I_{\Omega=0}$ index values reported in that table indicate that all systems possess a narrower distribution at equilibrium distances in the three molecular basis sets. Likewise, the molecular system descriptions at the stretched geometries systematically present higher values for the $I_C$ and $I_W$ indices (Table 1) than their counterparts at the equilibrium distances. This effect is interpreted in the framework of the progressive opening of the chemical bonds until their complete dissociation, which is reflected in the values of both indices. However, the index $I_C$ is more sensitive than the $I_W$ one, and consequently, its use should be favored in order to account for the influence of the bond stretching on the wave function features.
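The number of determinants quoted above for the $\Omega = 4$ level of Li$_2$ can be recovered by simple counting. The sketch below assumes 10 spatial orbitals (five STO-3G functions per Li atom), three alpha and three beta electrons, all electrons correlated, and no use of spatial symmetry; the function name is hypothetical.

```python
import math


def determinants_per_seniority(n_orbitals, n_alpha, n_beta):
    """Count the S_z determinants at each seniority level (no spatial symmetry)."""
    counts = {}
    for n_doubly in range(min(n_alpha, n_beta) + 1):
        a_single = n_alpha - n_doubly  # orbitals holding only an alpha electron
        b_single = n_beta - n_doubly   # orbitals holding only a beta electron
        if n_doubly + a_single + b_single > n_orbitals:
            continue  # not enough orbitals for this occupation pattern
        ways = (math.comb(n_orbitals, n_doubly)
                * math.comb(n_orbitals - n_doubly, a_single)
                * math.comb(n_orbitals - n_doubly - a_single, b_single))
        counts[a_single + b_single] = ways
    return counts


li2 = determinants_per_seniority(10, 3, 3)
print(li2)                 # seniority 4 holds 7560 determinants
print(math.log2(li2[4]))   # largest possible I_(Omega=4) with base-2 logarithms, about 12.9
```

Under these assumptions the four levels $\Omega = 0, 2, 4, 6$ add up to the full set of $\binom{10}{3}^2 = 14400$ determinants, and a call such as `determinants_per_seniority(5, 2, 2)` gives the three levels $\Omega = 0, 2, 4$ of a four-electron system in five orbitals.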

Concluding remarks and perspectives
In this work, we have extended the formulation of the Shannon entropy indices, informational content ($I_C$), cumulative ($I_W$), and specific $\Omega$-subspace ($I_{\Omega}$), within the framework of the seniority number criterion for constructing N-electron wave function expansions in terms of Slater determinants.
The quantitative evaluation of these indices has allowed us to implement analyses of the wave function expansions, determining their compactness in the well-known canonical molecular orbital and natural orbital basis sets, as well as in the recently proposed molecular orbital basis set which minimizes the seniority number of a given wave function.
The results obtained for several atomic and molecular systems described at the FCI level show the suitability of the seniority-based formulation of these indices to measure quantitatively the wave function expansion compactness, as well as to analyze their multiconfigurational structure.
We have also studied the ability of these indices to provide information on the evolution of the wave functions according to the stretching of the chemical bonds. We are currently working in our laboratories on the formulation of unitary transformations of molecular basis sets leading to the minimization of the Shannon entropy indices, in order to achieve a further improvement in the compactness of wave function expansions.

Table 1
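Calculated values of the $I_C$ and $I_W$ quantities (Eqs. (5) and (7)) for the ground states of atomic and molecular systems described by FCI expansions expressed in the canonical molecular orbitals (CMO), in the orbitals which minimize the seniority number ($M_{min}$) and in the natural orbitals (NO). Equilibrium distances ($R_e$) at experimental or optimized bond lengths and symmetrically stretched ones ($R_{st}$) at $R_{st} = 2.002\,R_e$ (for LiH), … $R_e$ (for BeH$_2$). Results for molecules correspond to standard STO-3G basis sets.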

Table 2
Calculated values of the quantities $I_{\Omega}$ (Eq. (8)) for the ground states of atomic and molecular systems described by FCI expansions expressed in the canonical molecular orbitals (CMO), in the orbitals which minimize the seniority number ($M_{min}$) and in the natural orbitals (NO). Equilibrium distances ($R_e$) at experimental or optimized bond lengths and symmetrically stretched ones … $R_e$ (for BeH$_2$). Results for molecules correspond to standard STO-3G basis sets.