Wavelet entropy and fractional Brownian motion time series

We study the functional link between the Hurst parameter and the Normalized Total Wavelet Entropy when analyzing synthetically generated fractional Brownian motion (fBm) time series. Both quantifiers are mainly used to identify fractional Brownian motion processes (Fractals 12 (2004) 223). The aim of this work is to understand the differences, if any, in the information obtained from them.


Introduction
When studying laser beam propagation through laboratory-generated turbulence [1], we introduced two quantifiers: the Hurst parameter, $H$, and the Normalized Total Wavelet Entropy (NTWS), $S_{WT}$. The former was introduced to test how well the family of fractional Brownian motion (fBm) processes [2] models the wandering of such a laser beam, while the NTWS is a more general quantifier aimed at studying any given dynamic system [3]. In a recent work we also analyzed the dynamic case, in which the laboratory-generated turbulence was set up to change in time [4]. We observed that these quantifiers are correlated, but at the time only a qualitative argument was given. Furthermore, each of these quantifiers has been used separately to obtain information from the biospeckle phenomenon [5,6].
The fBm is the only family of processes that are self-similar, Gaussian, and have stationary increments [7]. The normalized family of these Gaussian processes, $B_H$, is the one with $B_H(0) = 0$ almost surely, $\mathbb{E}[B_H(t)] = 0$, and covariance
$$\mathbb{E}[B_H(t)B_H(s)] = \frac{1}{2}\left(|t|^{2H} + |s|^{2H} - |t-s|^{2H}\right) \qquad (1)$$
for $s, t \in \mathbb{R}$. Here $\mathbb{E}[\,\cdot\,]$ refers to the average with Gaussian probability density. The power exponent $H$ has a bounded range between 0 and 1. These processes exhibit memory, as can be observed from Eq. (1), for any Hurst parameter but $H = 1/2$. In that case successive Brownian motion increments are as likely to have the same sign as the opposite, and thus there is no correlation. Brownian motion therefore splits the family of fBm processes in two. When $H > 1/2$ the correlations of successive increments decay hyperbolically, and this sub-family of processes has long memory. Moreover, consecutive increments tend to have the same sign: these processes are persistent. For $H < 1/2$ the correlations of the increments also decay, but fast enough to be summable, and this sub-family presents short memory. Since consecutive increments are more likely to have opposite signs, these processes are said to be anti-persistent.
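The sign change of the increment correlations at $H = 1/2$ can be checked directly from the covariance in Eq. (1). The following is a minimal sketch; the function names `fbm_cov` and `increment_cov` are illustrative and not part of the original work.

```python
def fbm_cov(t, s, H):
    """Covariance of the normalized fBm, Eq. (1)."""
    return 0.5 * (abs(t) ** (2 * H) + abs(s) ** (2 * H) - abs(t - s) ** (2 * H))


def increment_cov(k, H):
    """Covariance between the unit increments B_H(1)-B_H(0) and B_H(k+1)-B_H(k)."""
    # Expand E[(B(1)-B(0)) (B(k+1)-B(k))] using the bilinearity of the covariance.
    return (fbm_cov(1, k + 1, H) - fbm_cov(1, k, H)
            - fbm_cov(0, k + 1, H) + fbm_cov(0, k, H))


# Anti-persistent (H < 1/2), uncorrelated (H = 1/2), and persistent (H > 1/2) cases.
for H in (0.3, 0.5, 0.7):
    print(H, [round(increment_cov(k, H), 4) for k in (1, 2, 4, 8)])
```

The printed values are negative for $H = 0.3$, vanish for $H = 0.5$, and are positive for $H = 0.7$, in line with the persistent and anti-persistent behaviour described above.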
Wavelet analysis is one of the most useful tools when dealing with data samples. Any signal can be decomposed by using a dyadic discrete family $\{2^{j/2}\psi(2^j t - k)\}$, an orthonormal basis for $L^2(\mathbb{R})$, of translations and scalings of a function $\psi$: the mother wavelet. This wavelet expansion has associated wavelet coefficients given by $C_j(k) = \langle S, 2^{j/2}\psi(2^j\cdot - k)\rangle$. Each resolution level $j$ has an associated energy given by $E_j = \mathbb{E}|C_j(k)|^2$. If the signal has stationary increments, this energy does not depend on $k$, and the relative wavelet energy, RWE, is
$$p_j = \frac{E_j}{E_{tot}} \qquad (2)$$
with $j \in \{-N, \ldots, -1\}$, where $N = \log_2 M$, $M$ is the number of sample points, and $E_{tot} = \sum_{j=-N}^{-1}\mathbb{E}|C_j|^2$ is the total energy. Thus the NTWS is defined as (see [1] and references therein)
$$S_{WT} = -\frac{1}{\log_2 N}\sum_{j=-N}^{-1} p_j \log_2 p_j. \qquad (3)$$
For a signal originating from an fBm, the energy per resolution level, Eq. (4), can be calculated using the formalism introduced in Ref. [8] (see Appendix A) for any choice of mother wavelet satisfying $\int_{\mathbb{R}}\psi = 0$. From (4) the relative wavelet energy for a finite data sample is
$$p_j = 2^{(j+1)(1+2H)}\,\frac{1 - 2^{-(1+2H)}}{1 - 2^{-N(1+2H)}}, \qquad (5)$$
which is independent of the wavelet basis. So is the normalized total wavelet entropy,
$$S_{WT}(N,H) = \frac{1}{\log_2 N}\left\{(1+2H)\left[\frac{1}{2^{1+2H}-1} - \frac{N}{2^{N(1+2H)}-1}\right] - \log_2\frac{1 - 2^{-(1+2H)}}{1 - 2^{-N(1+2H)}}\right\}. \qquad (6)$$
As expected, the entropy decreases when $H$ increases, with $H$ measuring the level of order of the signal.
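The dependence of the entropy on $H$ can be explored numerically. The sketch below assumes the geometric form of the relative wavelet energy that follows from the Appendix, $\log_2 p_j = (1+2H)(j+1) + \text{const}$ with $\sum_j p_j = 1$ over $j = -N, \ldots, -1$, and the normalization by $\log_2 N$. The function names are illustrative, and the closed form in `ntws_closed` is obtained here by summing the truncated geometric series, so it is best read as a cross-check rather than as the authors' own code.

```python
import numpy as np


def rwe_fbm(H, N):
    """Relative wavelet energy p_j, j = -N..-1, assuming a geometric law in j."""
    a = 1.0 + 2.0 * H
    j = np.arange(-N, 0)
    return 2.0 ** ((j + 1) * a) * (1 - 2.0 ** (-a)) / (1 - 2.0 ** (-N * a))


def ntws_direct(H, N):
    """Normalized Shannon entropy of p_j, summed term by term."""
    p = rwe_fbm(H, N)
    return -np.sum(p * np.log2(p)) / np.log2(N)


def ntws_closed(H, N):
    """Same quantity from the summed geometric series."""
    a = 1.0 + 2.0 * H
    mean_term = a * (1 / (2.0 ** a - 1) - N / (2.0 ** (N * a) - 1))
    log_term = np.log2((1 - 2.0 ** (-a)) / (1 - 2.0 ** (-N * a)))
    return (mean_term - log_term) / np.log2(N)


for H in (0.1, 0.5, 0.9):
    print(H, ntws_direct(H, 12), ntws_closed(H, 12))
```

Both evaluations agree to rounding error, and the value decreases monotonically as $H$ goes from 0.1 to 0.9 for $N = 12$.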

Simulations and tests
To test the functional relation between the Hurst exponent and the NTWS we have simulated 50 fractional Brownian motion data samples [9] for each $H \in \{0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9\}$. Since the data examined in Ref. [1] were 5000 points in length, these samples are set to the same length. For each set we estimate $H$ and $S_{WT}$. Moreover, we employ an orthogonal cubic spline function as mother wavelet. Among several alternatives, the cubic spline function is symmetric and combines smoothness with numerical advantages in a suitable proportion. It has become a recommended tool for representing natural signals [10,11].
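For readers who want to generate comparable synthetic data, the following sketch draws fBm paths by Cholesky factorization of the fBm covariance $\frac{1}{2}(|t|^{2H} + |s|^{2H} - |t-s|^{2H})$. This is a textbook method swapped in for illustration; it is not the generator of Ref. [9]. The function names, the sampling times $t = 1, \ldots, n$, and the short length used in the demo are assumptions. The cost grows as $n^3$, so 5000-point samples are feasible but slow.

```python
import numpy as np


def fbm_covariance(n, H):
    """Covariance matrix of the normalized fBm sampled at t = 1, ..., n."""
    t = np.arange(1, n + 1, dtype=float)
    return 0.5 * (t[:, None] ** (2 * H) + t[None, :] ** (2 * H)
                  - np.abs(t[:, None] - t[None, :]) ** (2 * H))


def fbm_samples(n, H, n_samples, rng):
    """Draw n_samples independent fBm paths of length n; returns shape (n_samples, n)."""
    chol = np.linalg.cholesky(fbm_covariance(n, H))   # factor once per H
    white = rng.standard_normal((n, n_samples))       # independent Gaussian inputs
    return (chol @ white).T


# 50 samples for each H in {0.1, ..., 0.9}; n = 256 keeps the demo fast.
rng = np.random.default_rng(0)
samples = {k / 10: fbm_samples(256, k / 10, 50, rng) for k in range(1, 10)}
print({H: paths.shape for H, paths in samples.items()})
```

Factoring the covariance once per value of $H$ and reusing it for all 50 samples keeps the cost of the ensemble close to that of a single factorization.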
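The Hurst parameter is estimated as usual: by plotting the logarithm of the estimated energy per resolution level $j$,
$$\hat{E}_j = \frac{1}{N_j}\sum_{k=1}^{N_j}|C_j(k)|^2,$$
with $N_j = 2^{-j}M$ the number of coefficients at resolution level $j$, versus $j$ and fitting a least-squares line. The slope of the line gives the desired estimator, since for an fBm the logarithm of the energy is linear in $j$ with a slope of magnitude $1 + 2H$.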
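The log-energy regression can be sketched as follows. To keep the example self-contained it uses a Haar transform written in NumPy instead of the orthogonal cubic spline wavelet employed in this work, and it numbers the levels 1 (finest) to `n_levels` (coarsest) rather than $j = -N, \ldots, -1$; both choices, and the function names, are assumptions of the sketch. With this numbering the energy grows with the level and the fitted slope is $2H + 1$.

```python
import numpy as np


def haar_detail_energies(x, n_levels):
    """Mean squared Haar detail coefficient at levels 1 (finest) .. n_levels."""
    approx = np.asarray(x, dtype=float)
    energies = []
    for _ in range(n_levels):
        approx = approx[: 2 * (len(approx) // 2)]   # drop a trailing odd sample
        even, odd = approx[0::2], approx[1::2]
        detail = (odd - even) / np.sqrt(2.0)        # orthonormal Haar detail
        approx = (odd + even) / np.sqrt(2.0)        # orthonormal Haar approximation
        energies.append(np.mean(detail ** 2))
    return np.array(energies)


def hurst_from_energies(energies):
    """Least-squares line through log2(energy) versus level; slope = 2H + 1."""
    levels = np.arange(1, len(energies) + 1)
    slope, _ = np.polyfit(levels, np.log2(energies), 1)
    return (slope - 1.0) / 2.0


# Ordinary Brownian motion (H = 1/2) as a quick sanity check.
rng = np.random.default_rng(1)
brownian = np.cumsum(rng.standard_normal(2 ** 14))
print(hurst_from_energies(haar_detail_energies(brownian, 8)))
```

The choice of which levels enter the fit corresponds to the scaling region; here all eight levels are used with equal weight, which is the simplest option rather than a recommended one.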
For the NTWS we start by dividing the signal into $N_T$ non-overlapping temporal windows of length $L$, $N_T = M/L$. The wavelet energy estimator at resolution level $j$ for the time window $i$ is given by
$$\hat{E}_j^{(i)} = \frac{1}{N_j}\sum_{k \in \text{window } i}|C_j(k)|^2, \qquad i = 1, \ldots, N_T,$$
where $N_j$ represents the number of wavelet coefficients at resolution level $j$ corresponding to the time window $i$, while the total energy estimator in this time window is $\hat{E}_{tot}^{(i)} = \sum_{j<0}\hat{E}_j^{(i)}$. Hence, the time evolution estimators of RWE and NTWS are given by
$$\hat{p}_j^{(i)} = \frac{\hat{E}_j^{(i)}}{\hat{E}_{tot}^{(i)}}, \qquad \hat{S}_{WT}^{(i)} = -\frac{1}{\log_2 N}\sum_{j<0}\hat{p}_j^{(i)}\log_2\hat{p}_j^{(i)}.$$
In order to obtain a quantifier for the whole time period under analysis [3], the temporal average is evaluated. The temporal average of the NTWS is given by
$$\langle S_{WT}\rangle = \frac{1}{N_T}\sum_{i=1}^{N_T}\hat{S}_{WT}^{(i)},$$
and for the wavelet energy at resolution level $j$ by
$$\langle E_j\rangle = \frac{1}{N_T}\sum_{i=1}^{N_T}\hat{E}_j^{(i)};$$
the total wavelet energy temporal average is then defined as $\langle E_{tot}\rangle = \sum_{j<0}\langle E_j\rangle$. In consequence, a mean probability distribution $\{q_j\}$ representative of the whole time interval (the complete signal) can be defined as
$$q_j = \frac{\langle E_j\rangle}{\langle E_{tot}\rangle},$$
with $\sum_j q_j = 1$, and the corresponding mean NTWS as
$$\widetilde{S}_{WT} = -\frac{1}{\log_2 N}\sum_{j<0} q_j\log_2 q_j.$$
In Figure 1 we compare $H$ against its estimator. The estimator performs well for $0.3 < H < 0.9$ and fails outside this range. Furthermore, it fits the larger values better. Figure 2 represents the temporal average of the NTWS, $\langle S_{WT}\rangle$, and the mean NTWS, $\widetilde{S}_{WT}$, estimated with our procedure and compared against the theoretical result in eq. (6) with $N = 12$. As usual, boxplots [12] show lower and upper lines at the lower quartile (25th percentile of the sample) and upper quartile (75th percentile of the sample), and the line in the middle of the box is the sample median. The whiskers are lines extending from each end of the box indicating the extent of the rest of the sample. Outliers are marked by plus signs. These points may be considered the result of a data entry error or a poor measurement.
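The two whole-signal quantifiers differ only in where the averaging over windows takes place: the temporal average averages the per-window entropies, while the mean NTWS averages the energies first and takes the entropy afterwards. A minimal sketch, assuming the per-window, per-level energies have already been computed and stored in an array `window_energies[i, j]` (a hypothetical layout, as are the function names), with every window having non-zero total energy:

```python
import numpy as np


def shannon_normalized(p):
    """-sum p log2 p / log2(number of levels), with 0 log 0 taken as 0."""
    p = np.asarray(p, dtype=float)
    nonzero = p > 0
    return -np.sum(p[nonzero] * np.log2(p[nonzero])) / np.log2(len(p))


def ntws_estimators(window_energies):
    """Return (temporal average of the NTWS, mean NTWS) from energies[i, j]."""
    energies = np.asarray(window_energies, dtype=float)
    # Relative wavelet energy in each window, then the entropy of each window.
    p = energies / energies.sum(axis=1, keepdims=True)
    temporal_average = np.mean([shannon_normalized(row) for row in p])
    # Average the energies over windows first, then take a single entropy.
    mean_energy = energies.mean(axis=0)
    q = mean_energy / mean_energy.sum()
    return temporal_average, shannon_normalized(q)


# Two windows whose energy sits in different levels: each window has zero
# entropy, but the averaged distribution q is uniform.
print(ntws_estimators([[1.0, 0.0], [0.0, 1.0]]))
```

The toy input shows that the two estimators need not coincide when the energy distribution changes from window to window; for a stationary-increment signal such as the fBm they are expected to be close.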

Conclusions
For an fBm we have found, eq. (6), an inverse dependence: as $H$ grows, the temporal average, $\langle S_{WT}\rangle$, and the mean NTWS, $\widetilde{S}_{WT}$, diminish. This is verified with the synthetic fBm data samples. The relation is logical: the spectrum has fewer high-frequency components as $H$ gets higher and the energy concentrates closer to the origin, whereas if $H$ gets lower the energy contribution at high frequencies becomes relevant. Observe that the closer $\hat{H}$ is to the exact value, the better the results are for both estimators of the entropy.
From an analytical point of view $H$ and $S_{WT}$ are equivalent, although the NTWS also contains information about the extension of the data set. Nevertheless, from a computational point of view the latter is independent of the scaling region, making the entropy less subjective. On the other hand, the logarithm in the entropy definition introduces important errors, as we see in Figure 2. To narrow these errors more data are needed; in particular, extending the length of the data samples reduces the statistical error.

APPENDIX A
Let us take as the signal $S(t) = B_H(t, \omega)$, where $\omega$ is fixed and represents one element of the statistical ensemble; it will be omitted hereafter. The wavelet coefficients are calculated using the orthonormal wavelet basis $\{2^{-j/2}\psi(2^{-j}\cdot - k)\}_{j,k\in\mathbb{Z}}$. In the last step of this calculation we use the self-similar property of the fBm; that is, $B_H(at) \overset{d}{=} a^H B_H(t)$ for $a > 0$. The fBm can be written using the chaos expansion described in Ref. [8], in which $\{\xi_n\}_{n\in\mathbb{N}}$ are the Hermite functions and the operator $M_H$ is defined as follows [13]:
$$\widehat{M_H\phi}(\nu) = c_H|\nu|^{1/2-H}\hat{\phi}(\nu),$$
where the hat stands for the Fourier transform, $c_H^2 = \Gamma(2H+1)\sin(\pi H)$, and $\phi$ is any function in $L^2(\mathbb{R})$. Introducing the coefficients $d_n^H(k)$ then leads to the chaos expansion of the wavelet coefficients. The evaluation of the coefficients $d_n^H(k)$ is straightforward from their definition and eq. (16), where $\Psi(\nu) = \hat{\psi}(\nu)$.
The chaos expansion in eq. (18) corresponds to a Gaussian process; then, for integers $j, k, j', k'$, the correlation $\mathbb{E}[C_j(k)C_{j'}(k')]$ can be evaluated following Ref. [14]. From eq. (19) and the orthogonality of the Hermite functions, it is rewritten as eq. (20), which is the usual expression found in many works; see [15] and references therein. The integral has convergence problems near the origin. These are resolved by choosing a mother wavelet $\psi$ with $K$ null moments. That is, $\int_{\mathbb{R}} t^k\psi(t)\,dt = 0$ for $k = 0, \ldots, K-1$. Therefore, $|\Psi(\nu)|^2 = a_1|\nu|^{2K} + a_2|\nu|^{2K+1} + o(|\nu|^{2K+1})$. When $k$ and $k'$ are far apart, i.e., $m = k - k' \to \infty$, the integral in eq. (20) is dominated by the contribution of frequencies in the interval $[0, 1]$, which gives the asymptotic behaviour of the correlation for $K > H$. The coefficients of a wavelet expansion are highly correlated. But, for $j = j'$ and $k = k'$, we recover the mean energy by resolution level $j$. Therefore, the RWE is obtained by replacing the above into eq. (2):
$$p_j = \frac{\mathbb{E}|C_j|^2}{\sum_{j'=-N}^{-1}\mathbb{E}|C_{j'}|^2} = 2^{(j+1)(1+2H)}\,\frac{1 - 2^{-(1+2H)}}{1 - 2^{-N(1+2H)}},$$
where the last equality comes from the evaluation of the geometric series corresponding to the total energy. Its logarithm (base 2) is simply log 2 p j = (1 + 2H)(j + 1) + log