Electrocardiograms have long been studied in order to analyze cardiac function and predict health, disease and mortality. In many cases, linear methods in the time and frequency domains are used to analyze the information from the electrocardiogram. One such linear method, is referred to as heart rate variability (HRV). In time domain analysis, a range of normal values for HRV analyzed in the time domain, frequency domain and geometrically are established based on 24-hour ambulatory recordings. Similar metrics, particularly in the time domain, are not universally accepted for short-term recording so stratification of continuous data can be used.
In contrast to time domain analyses, that do little to account for irregularities, the irregularity in the time-sampled intervals of electrocardiographic ventricular activation (RR) can be accounted for in frequency domain analyses in order to calculate an estimate of the power spectrum density (PSD). Because the typical PSD estimators implicitly assume equidistant sampling, the interval time series is for example, first converted to equidistantly sample a series using a cubic spline interpolation method to avoid generating additional harmonic components in the spectrum.
PSD estimations are performed as a method of cardiac assessment using the FFT (Welch's periodogram) and the parametric maximum-entropy “forward-backward linear least squares” autoregressive (AR) methods. In the FFT method, spectrum powers are calculated by integrating the spectrum over the frequency bands. In contrast, the parametric AR method models the time series as a linear combination of complex harmonic functions, which include pure sinusoids and real exponentials as special cases, and fits a function of frequency with a predefined number of poles (frequencies of infinite density) to the spectrum. The AR method asserts that the position and shape of a spectral peak is determined by the corresponding complex frequency and that the height of the spectral peak contains little information about the complex amplitude of the complex harmonic functions. In the AR method, the spectrum is divided into components and the band powers are obtained as powers of these components.
There are several fundamental limitations to all forms of frequency domain analyses. Nonstationarity in time series severely limits the range of frequencies that can be studied by all methods of frequency-domain analyses. Frequency-domain analyses, while retaining some information relating to ordering of observations, conceal details of interactions between mechanisms. (e.g., respiration-mediated change in heart rate may stimulate other mechanisms). Heart rates have “self-similar” fluctuations, affected not only by the most recent value but also by much more remote events, or in other words, a “memory” effect. In time series, these phenomena may be quantified as a repetitive pattern of fluctuation, but in the frequency domain, it may be indistinguishable from uncorrelated fluctuations.
Nonlinear dynamic analyses are an alternate approach for understanding the complexity of biological systems. By definition, a nonlinear system has an output that is simply “not linear,” i.e., any information that fails criteria for linearity, i.e., output is proportional to input (e.g., Ohm's law), and superposition (behavior predicted by dissecting out individual input/output relationships of sub-components).
Virtually all biological signals demonstrate nonlinear properties. A simple common example is nonstationarity (e.g., drift in heart rate or blood pressure during sleep-wake cycles). Although a variety of stationarity tests provide useful measures, some arbitrary criteria are needed to judge stationarity including statistical properties and relevant time scale. Moreover, important information on pathological states and natural processes (e.g., aging) is contained within the nonstationary properties of biological signals, as illustrated in FIGS. 1A-1C. FIG. 1A illustrates fractal temporal processes of a healthy RR. FIG. 1B illustrates wavelet analysis of healthy RR time series of >1500 beats (x-axis is time, y-axis is wavelet scale (5 to 300 secs). FIG. 1C illustrates the wavelet amplitudes.
It is quite common for the output of nonlinearly coupled control systems to generate behaviors that defy explanation based on conventional linear models, as illustrated in FIGS. 2A-2E. Characteristic behaviors of nonlinear systems include self-sustained, periodic waves (e.g., ventricular tachycardia), abrupt changes in output (e.g., sudden onset of ventricular fibrillation) and, possibly, chaos. FIGS. 2A-2E illustrate an RR time series demonstrating quantifiable nonlinear dynamics that are distinct within patients with OSA, as illustrated in FIGS. 2A-2C, and also distinct within healthy individuals at high altitude, as illustrated in FIGS. 2D-2E.
On the other hand, nonlinear systems that appear to be very different in their specific details may exhibit certain common output patterns, a characteristic referred to as universality. Moreover, outputs may change in a sudden, discontinuous fashion (e.g., bifurcation), often resulting from a very small change in one of the control modules. For example, the same system may produce a wildly irregular output that becomes highly periodic or vice versa (e.g., electrical alternans, ST-T wave alternans preceding ventricular fibrillation, pulsus alternans during congestive heart failure)
Prior studies have used various nonlinear measures of RR interval complexity, including Poincaré plot, various forms of entropy analysis, and detrended fluctuation analysis to provide insight into heart rate regulatory mechanisms and prediction of adverse events.
The Poincaré plot is a graphical representation of the correlation between successive RR intervals, i.e. plot of RRn+1 as a function of RRn. The significance of this plot is that it is the two-dimensional reconstructed phase space (i.e., the projection of the system attractor that describes the dynamics of the time series). Because an essential feature of this analysis method is the shape of the plot, a common approach used by previous studies has been to parameterize the shape to fit an ellipse oriented according to the line-of-identity (i.e., for a first order plot, RRn=RRn+1). For example, a cigar-shaped plot along the principal diagonal (x=y) would reveal high autocorrelation within the time series, while a circular plot would reveal periodicity (e.g., the Poincaré plot of a sine wave or a pendulum is a circle).
Detrended fluctuation analysis (DFA) is another nonlinear form of analysis to offer insight into temporal dynamics by measuring correlations within the HRV signal. Typically, the correlations are divided into short-term (α1, range 4≦n≦16) and long-term (α2, range 16≦n≦64) fluctuation [0<α<0.5 indicates a large value is followed by a small value and vice versa, 0.5<α<1.0 indicates a large value is likely to be followed by a large value]. An a value of 0.5, 1.0, >1.0, or >1.5 indicates white noise, 1/f noise, different kinds of noise, or brown noise (integral of white noise), respectively.
Classical information theory, founded by Claude Shannon has been widely utilized for the study of nonlinear signals. Related to thermodynamic entropy, the information entropy can be calculated for any probability distribution (i.e., occurrence of an event that had a probability of occurring out of the space of possible events). The information entropy quantifies the amount of information needed to define the detailed microscopic state of a system, given its macroscopic description, and can be converted into its thermodynamic counterpart based on the Boltzmann distribution. Recent experimental evidence supports this method of conversion.
Shannon entropy (ShanEn) measures information as the decrease of uncertainty at a receiver (or physiological process). ShanEn of the line length distribution is defined as
  ShanEn  =            ∑              l        min                    l        max              ⁢                  n        l            ⁢                          ⁢      ln      ⁢                          ⁢              n        l            where nl is the number of length l lines such that
      n    l    =            N      l                      ∑                  l          min          ′                          l          max                    ⁢              N                  l          ′                    
From a chemical thermodynamics perspective, the reduced AG would be equal to the minimum number of yes/no questions (using log2) that needed to be answered in order to fully specify the microscopic state, given the macroscopic state. An increase in Shannon entropy indicates loss of information.
For clinical application to short and noisy time series, another measure “approximate entropy” (ApEn) was developed based on the Kolmogorov entropy, which is the rate of generation of new information. ApEn examines time series for similar epochs such that the presence of more frequent and more similar epochs (i.e., a high degree of regularity) lead to lower ApEn values.
A related method but much more accurate than ShanEn or ApEn is Sample entropy (SampEn), which unlike ApEn, does not count self-matches of templates, does not employ a template-wise strategy for calculating probability and is more reliable for short time series. SampEn is the conditional probability that that two short templates of length m that match within an arbitrary tolerance r will continue to match at the next point m+1.
SampEn is calculated by first forming a set of vectors uj of length muj=(RRj,RRj+1, . . . ,RRj+m−1), j=1,2, . . . N−m+1where m represents the embedding dimension and N is the number of measured RR intervals. The distance between these vectors is defined as the maximum absolute difference between the corresponding elementsd(uj,uk)=max[|RRj+n−RRk+n∥n=0, . . . ,m−1]
For each uj, the relative number of vectors uk for which d(uj,uk)≦r is calculated as
            C      j      m        ⁡          (      r      )        =                    nbr        ⁢                                  ⁢                  of          ⁢                                          [                                    u              k                        |                                          d                ⁡                                  (                                                            u                      j                                        ,                                          u                      k                                                        )                                            ≤              r                                ]                            N        -        m        +        1              ⁢          ∀              k        ≠        j            with values of Cjm(r) ranging between 0 and 1. Average of Cjm(r) yields
                    C        m            ⁡              (        r        )              =                  1                  N          -          m          +          1                    ⁢                        ∑                      j            =            1                                N            -            m            +            1                          ⁢                              C            j            m                    ⁡                      (            r            )                                and            SampEn      ⁡              (                  m          ,          r          ,          N                )              =                  -        ln            ⁢                                    C                          m              +              1                                ⁡                      (            r            )                                                C            m                    ⁡                      (            r            )                              
Although the development of SampEn was a major advancement in application of information theory to heart rate dynamics, SampEn has a few significant limitations. What is the optimal value of m? How does one pick r? The usual suggestion is that m should be 1 or 2, noting that there are more template matches and thus less bias for m=1, but that m=2 reveals more of the dynamics of the data. The convention has been that m=2 and r=0.2×SD of the epoch, and these criteria were set on empirical grounds.
COSEn, an optimized form of the SampEn measure, was originally designed and developed at the University of Virginia for the specific purpose of discriminating atrial fibrillation (AF) from normal sinus rhythm (NSR) at all heart rates using very short time series of RR intervals from surface ECGs (i.e., as few as 12 heart beats). As with ApEn and SampEn, smaller values of COSEn indicate a greater likelihood that similar patterns of RR fluctuation will be followed by additional similar measurements. If the time series is highly irregular, the occurrence of similar patterns will not be predictive for the following RR fluctuations and the COSEn value will be relatively large.
Using the same parameters [i.e., length of template or embedding dimension (m)=1], COSEn was subsequently optimized and validated in the Johns Hopkins PROSE-ICD study, requiring only 9 RR intervals before ICD shock to accurately distinguish AF from VT/VF [ROC curve area=0.98 (95% CI:0.93-1.0)] and outperforming representative ICD discrimination algorithms (Circulation Arrhythmia and Electrophysiology, in press).
Because nonlinear metrics such as COSEn have better discrimination ability than other conventional methods and the normal sinus rhythm has been shown to reflect health and disease, it would therefore be advantageous to provide a more accurate method for nonlinearly quantifying the self-similar fluctuation patterns in the RR intervals of NSR for prediction of health and mortality.