In this section, we present a description of the automatic arrhythmia detection algorithm (Section 5.3.1), followed by results for a range of detection response times (Section 5.3.2).

#### 5.3.1 Arrhythmia detection algorithm

The arrhythmia detection algorithm uses thresholds in the level and variance of spectral entropy values observed in the cardiac disorder map to automatically detect and label rhythms in patient event series data. The afdb contains significantly fewer periods of atrial flutter than of atrial fibrillation and normal sinus rhythm (periods of flutter total 1.27 h, whereas periods of fibrillation total 91.59 h), and the typical length of a period of flutter is of the order of tens of seconds. Of the eight patients annotated as having flutter, only patients 04936 and 08378 have periods of flutter long enough (i.e., > ?) for analysis by the algorithm.

For this reason, we do not include the flutter prediction method of the algorithm here, although extensions that include flutter follow a similar principle and are simple to implement in practice. Other studies using the afdb (e.g., Tateno & Glass 2000, 2001) restrict themselves to methods differentiating only between fibrillation and normal sinus rhythm. Additional comments on the practicality of detecting atrial flutter, together with selected results for flutter, are given in the Discussion section (Section 5.4.1).

The five stages of the algorithm are shown in Figure 5.1. The first three stages have been covered in depth as part of the Data Analysis section, but we include a brief summary here for completeness. We first obtain a binary string representing the dynamics of the heart for a given patient by discretizing the PhysioNet data every ? = 30 ms (stage 1 to stage 2). In stage 3, the spectral entropy measure is applied for windows of duration ? = L?, with L chosen for each patient such that there are on average ten beats within the spectral entropy window, giving ? as 6 s for a typical patient. Using an overlap parameter a (typically 1.5 s) leads to a series of spectral entropy values separated in time by this amount.
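Stages 1 to 3 can be sketched as follows. This is a minimal illustration, not the implementation used here: the function names are hypothetical, the spectral entropy is taken to be the normalized Shannon entropy of the power spectrum (an assumed definition; the exact form is the one given in the Data Analysis section), and the defaults L = 200 samples and a step of 50 samples are inferred from the typical values of 6 s and 1.5 s at a 30 ms sampling interval.

```python
import numpy as np

def discretize_beats(beat_times_s, tau_s=0.030):
    """Binary event series: 1 in every bin of width tau_s that contains a beat."""
    beat_times_s = np.asarray(beat_times_s, dtype=float)
    n_bins = int(np.floor(beat_times_s.max() / tau_s)) + 1
    series = np.zeros(n_bins, dtype=int)
    series[np.floor(beat_times_s / tau_s).astype(int)] = 1
    return series

def spectral_entropy(window):
    """Normalized Shannon entropy of the power spectrum (assumed definition)."""
    x = np.asarray(window, dtype=float)
    x = x - x.mean()
    power = np.abs(np.fft.rfft(x))[1:] ** 2  # drop the zero-frequency bin
    total = power.sum()
    if len(power) < 2 or total == 0:
        return 0.0
    p = power / total
    p = p[p > 0]  # avoid log(0)
    return float(-(p * np.log(p)).sum() / np.log(len(power)))

def running_spectral_entropy(series, L=200, step=50):
    """Spectral entropy in windows of L samples, advanced by `step` samples."""
    starts = range(0, len(series) - L + 1, step)
    return np.array([spectral_entropy(series[s:s + L]) for s in starts])
```

Under this assumed definition, a regular beat train concentrates power in a few harmonics and scores lower than an irregular one, which is the contrast the level threshold relies on.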

Given no prior knowledge of the provided rhythm assessments, we calculate the standard deviation and average magnitude of M spectral entropy values in variance windows of length ? = Ma preceding a given time point. We use the example case of M equal to 20 (giving ? as 30 s for a typical patient). The level and standard deviation thresholds for atrial fibrillation are set consistent with values obtained from the cardiac disorder map; for this case we determine ?fib = 0.84 for the level and ?fib = 0.018 for the standard deviation. Stage 4 generates preliminary predictions for the rhythm state of the heart: we denote as fibrillating (AF) instances where the spectral entropy level is greater than the level threshold and the standard deviation is less than the standard deviation threshold, with all other combinations considered to be normal sinus rhythm (N). Setting the overlap of variance windows such that b = a, we obtain a string of rhythm predictions drawn from the set {AF, N} and separated in time by b.
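The stage 4 rule can be illustrated with a short sketch. The function name and argument names are hypothetical, and the use of the population standard deviation (NumPy's default) is an assumption, since the text does not state which estimator is used.

```python
import numpy as np

def preliminary_predictions(entropy, M=20, level_thr=0.84, sd_thr=0.018):
    """Label each time point AF or N from the M preceding spectral entropy values."""
    entropy = np.asarray(entropy, dtype=float)
    labels = []
    for end in range(M, len(entropy) + 1):
        window = entropy[end - M:end]  # the M most recent values
        # AF requires BOTH a high level and a low standard deviation.
        is_af = window.mean() > level_thr and window.std() < sd_thr
        labels.append("AF" if is_af else "N")
    return labels
```

Because both conditions must hold, a window with a high but strongly fluctuating spectral entropy is labeled N, which is relevant to the false negatives discussed in Section 5.4.2.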

Finally, in stage 5 we apply a rudimentary smoothing procedure to the initial string of rhythm predictions. For a particular prediction, we consider a preceding period ? = 2? + b = (2M + 1)L?/4, leading in this example to a typical length for ? of 61.5 s. We find the modal prediction, that is, the prediction from {AF, N} occurring most frequently in ?, and label it with a primed symbol from {AF′, N′}. We call ? the modal smoothing window. In this form, we understand the variance window and the modal smoothing window as setting the response time of the algorithm: the variance window is defined in terms of the number of preceding spectral entropy values required for a given prediction; for the modal smoothing window to register a change in rhythm, over half of the predictions must suggest the new rhythm. The response time is then ?/2 (half the modal smoothing window), which is approximately equal to the variance window ?. We have the modal smoothing windows overlapping with parameter c = b = a. This results in a final time series of predictions and constitutes the output of the arrhythmia detection algorithm for a given patient. An example of the algorithm output for patient 08378 (including a threshold for atrial flutter) is shown in Figure 5.2.
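A minimal sketch of the modal smoothing step follows. The function name is hypothetical, and treating the modal smoothing window as holding 2M + 1 consecutive predictions (an odd count, so a two-label vote cannot tie) is an assumption based on the window length (2M + 1)a with predictions spaced by b = a.

```python
from collections import Counter

def modal_smoothing(predictions, M=20):
    """Replace each prediction by the most frequent label in the preceding window."""
    width = 2 * M + 1  # assumed number of predictions per modal smoothing window
    smoothed = []
    for end in range(width, len(predictions) + 1):
        window = predictions[end - width:end]
        smoothed.append(Counter(window).most_common(1)[0][0])
    return smoothed
```

With M = 20, a switch from N to AF is registered once 21 of the 41 predictions in the window are AF, which is consistent with a response time of roughly half the modal smoothing window.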

We apply the above steps, comprising the three data windows (?, ?, ?; the spectral entropy, variance, and modal smoothing windows, respectively), to each patient in the afdb. Specifying ?, L and M fixes the remaining parameters, their exact magnitude determined by L. A summary of windowing symbols can be found in Table 5.1. Values for the atrial fibrillation threshold parameters (?fib for the level and ?fib for the standard deviation) are kept the same for each patient for a given response time. The results obtained from the algorithm are described in the following section.
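The way the sampling interval, L and M fix the remaining window lengths can be checked with a few lines of arithmetic. The function and key names below are hypothetical, and L = 200 is inferred from a 6 s spectral entropy window at a 30 ms sampling interval rather than quoted from the text.

```python
def window_durations(tau_s=0.030, L=200, M=20):
    """Window lengths in seconds implied by the sampling interval, L and M."""
    spectral_entropy_window = L * tau_s       # about 6 s for a typical patient
    overlap = spectral_entropy_window / 4     # a = b = c
    variance_window = M * overlap             # M spectral entropy values
    modal_window = 2 * variance_window + overlap
    return {
        "spectral_entropy_window": spectral_entropy_window,
        "overlap": overlap,
        "variance_window": variance_window,
        "modal_window": modal_window,
        "response_time": modal_window / 2,
    }
```

For the defaults this reproduces the typical values quoted in the text: approximately 6 s, 1.5 s, 30 s and 61.5 s, with a response time close to the variance window.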

#### 5.3.2 Algorithm results

We now present the results of the cardiac arrhythmia detection algorithm for atrial fibrillation. The following window parameters were used: ? is set to 30 ms, L is chosen such that ? is expected to contain 10 beats, and M is set to 20; windows have overlap parameters c = b = a = ?/4 (for typical patients in the afdb, the spectral entropy window ? ≈ 6 s, the variance window ? ≈ 30 s, the modal smoothing window ? ≈ 61.5 s, and a ≈ 1.5 s). Threshold values for fibrillation are set at ?fib = 0.84 for the spectral entropy level and ?fib = 0.018 for the standard deviation. Each prediction produced by the algorithm (denoted by a primed symbol) is compared with the rhythm assessment documented in the database and can be classified into one of four categories (Hulley & Cumming 1988): true positive (TP), AF is classified as AF′; true negative (TN), non-AF is classified as non-AF′; false negative (FN), AF is classified as non-AF′; false positive (FP), non-AF is classified as AF′.

Percentages of these quantities for each patient and for the entire afdb are given in Table 5.2. Overall, we obtain a predictive capability (assessed using the percentage of predictions agreeing with the provided annotations) of 89.5%. The sensitivity and specificity metrics are defined by TP/(TP+FN) and TN/(TN+FP), respectively. The predictive value of a positive test (PV+) and the predictive value of a negative test (PV−) are defined by TP/(TP+FP) and TN/(TN+FN), respectively. These, and results for other values of ?, are given in Table 5.3.
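The metrics defined above follow directly from the four counts. The sketch below uses a hypothetical function name and made-up counts purely for illustration; they are not values from Table 5.2 or Table 5.3.

```python
def detection_metrics(tp, tn, fp, fn):
    """Agreement, sensitivity, specificity, PV+ and PV- from the four counts."""
    total = tp + tn + fp + fn
    return {
        "predictive_capability": (tp + tn) / total,  # fraction agreeing with annotations
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "pv_positive": tp / (tp + fp),
        "pv_negative": tn / (tn + fn),
    }

# Illustrative counts only (hypothetical, not taken from the afdb results).
example = detection_metrics(tp=40, tn=50, fp=5, fn=5)
```

A sensitivity lower than the specificity, as in this made-up example, corresponds to false negatives being relatively more common than false positives.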

In repeating the algorithm with different values for the variance window, shorter ? represents a quicker response time. We obtain for each ? a new disorder map to determine the relevant threshold values. For the rapid response case, ? typically 6 s, we alter the fibrillating thresholds in the arrhythmia detection algorithm to be ?fib = 0.855 (level) and ?fib = 0.016 (standard deviation); we find a predictive capability of 85.7%. With ? typically 60 s, the fibrillating thresholds become ?fib = 0.84 (level) and ?fib = 0.019 (standard deviation); the predictive capability is 90.3%.

### 5.4 Discussion

We begin with an exposition of the results presented in the previous section and the effects of different parameter values on the output of the arrhythmia detection algorithm. This is followed by a discussion, with reference to the electrocardiograms provided as part of the afdb, of disagreements between the provided rhythm annotations, measures relying solely on the heart rate, and the predictions of our algorithm (Section 5.4.1). Having shown that some of the annotations may be unreliable, we comment on situations where the algorithm may still present incorrect predictions (Section 5.4.2). The benefits of the spectral entropy measure compared to other fibrillation detection methods are then given (Section 5.4.3). We close the section with a discussion of the systematic windowing errors present in our procedure (Section 5.4.4).

Instances of atrial fibrillation constitute approximately 40% of the afdb. If we consider a null model in which we constantly predict normal sinus rhythm, we would expect a predictive capability of around 60%. In Table 5.3, we observe an improvement in the predictive capability of the detection algorithm when the length of the variance window, ?, is increased from 6 s (85.7%) to 60 s (90.3%) for a typical patient. The choice of shorter ? improves the response time of the algorithm by requiring less data per prediction; values for ? less than 6 s do not incorporate enough data to give meaningful results. Increasing ? beyond 30 s improves the predictive capability very little.

This suggests that additional factors, independent of the specific parameters chosen here, need to be considered. Results in Table 5.2 for the case ? typically 30 s indicate an overall predictive capability of 89.5%. For individual patients, the predictive capability ranges from 60.2% (patient 03665) to 100% (patient 07162). To explain this variation, we investigate the form of patient ECGs during periods of disagreement between annotation and prediction. Examples of the ECGs referred to in Sections 5.4.1 and 5.4.2 are included in Appendix 5.1.

#### 5.4.1 Disagreements with annotations

Rhythm assessments have been questioned before (Tateno & Glass 2000, 2001); here, we give explicit examples where we believe the ECGs suggest a rhythm different from that given by the annotation. We observe in the ECGs of patients 08219 and 08434 periods of atrial fibrillation that we believe to have been missed in the annotations but are correctly identified by our detection algorithm. Cases such as these unfairly penalize the results of the algorithm; however, we note that such instances comprise a small proportion of the afdb. Atrial flutter may have been misannotated in patients 04936 and 08219; in particular, two considerable periods of flutter may have been annotated incorrectly in patient 04936.

This unreliability of rhythm assessment, compounded with the limited number of periods of atrial flutter in the database, prevents us from drawing meaningful quantitative conclusions regarding the success of the detection algorithm in identifying flutter. Despite this, we believe that the spectral entropy is in principle still capable of identifying flutter (see Figure 5.2). Returning to the two patients with significant periods of flutter, we run the algorithm with the inclusion of a threshold for atrial flutter, ?fl, motivated by each patient’s individual disorder map (other parameters as per the Algorithm results section with M = 20). For patient 08378 with ?fl = 0.70, we find 86.3% agreement with the annotations for flutter; for patient 04936 with ?fl = 0.81, we find 66.9% agreement, bearing in mind the points raised above.

Consideration of ECGs demonstrates the inability of measures relying solely on the heart rate and its derivatives to consistently distinguish between fibrillation, flutter and other rhythms. Atrial fibrillation is characteristically associated with an elevated heart rate (100–200 bpm) (Bennett 2002); atrial flutter exhibits an even higher heart rate (>150 bpm) with a sharp transition from normal sinus rhythm.

This expected behavior, whilst found to hold qualitatively for the majority of patients, fails during large periods for patient 06453 and is completely reversed for patient 08215. The resting heart rate is also found to differ dramatically between patients in the afdb. The spectral entropy, being less susceptible to variations in the heart rate, is better suited to form the basis of a detection algorithm than a measure relying solely on heart rate (for discussions on nonstationarities in heart rate time series, see Bernaola-Galvan et al. 2001; Cammarota & Rogora 2005).

#### 5.4.2 Other rhythms

The unreliability of parts of the annotations still does not account for all false predictions produced by the detection algorithm. We suggest the presence of other rhythms within the afdb to be an additional factor that needs to be considered. Table 5.3 shows the sensitivity metric to be consistently lower for all values of ?, suggesting a bias towards false negatives (FNs occur when AF is classified as non-AF′). FNs total 6.5% for ? typically 30 s in Table 5.2, and comprise 36.3% of predictions for patient 04936. Given our requirement in the detection algorithm for periods that are classed as AF to satisfy both a spectral entropy level and variance condition, FNs are most likely to arise when one threshold condition fails to be met.

Cases where the variance threshold is not satisfied may be associated with the physiological phenomena of fib-flutter and paroxysmal atrial fibrillation, and would be located right of the standard deviation threshold on the disorder map (Figure 5.3). Fib-flutter corresponds to periods where the rhythm transitions in quick succession between atrial fibrillation and flutter (Horvath et al. 2000), with paroxysmal fibrillation describing periods where atrial fibrillation stops and starts with high frequency. Such behavior naturally causes the variance to increase and one might question whether it is still appropriate to classify those periods as standard atrial fibrillation.

We identify in the ECG of patient 04936 periods of fib-flutter, which likely account for the high proportion of FN results; by inspecting the patient’s disorder map, we indeed observe points annotated as atrial fibrillation with uncharacteristically high standard deviation, signifying that fib-flutter would be a more accurate rhythm classification. Cases where the spectral entropy level threshold is not met can occur when QRS complexes indicative of atrial fibrillation appear with unusually regular rhythm; such behavior would lie below the level threshold on the disorder map. Owing to the small number of beats contained within each window, such occurrences inevitably arise; the process of modal smoothing lessens the impact of this phenomenon in the arrhythmia detection algorithm.

False positives (FPs occur when non-AF is classified as AF′), which comprise 4.0% of the afdb for ? typically 30 s, may also have a physiological explanation. During sinus arrhythmia, there are alternating periods of slowing and increasing node firing rate, while still retaining QRS complexes indicative of normal sinus rhythm. These alternating periods increase the irregularity of beats within the spectral entropy window. If the variance threshold is also satisfied, sinus arrhythmia may be incorrectly classified as AF′ by the arrhythmia detection algorithm. Sinus arrest occurs when the sinoatrial node fails to fire and results in behavior that is similar in principle to sinus arrhythmia; these two conditions are likely responsible for the high proportion of FPs (14.2%) that are observed in patient 05091.

#### 5.4.3 Comparison to other methods

Vikman et al. (1999, 2005) showed that decreased ApEn values of heart beat fluctuations precede (at timescales of the order of an hour) spontaneous episodes of atrial fibrillation in patients without structural heart disease. We stress that the algorithm presented here is not intended to predict in advance occurrences of fibrillation; rather, it is designed to detect the onset of fibrillation as quickly as possible using only interbeat intervals. Tateno and Glass (2000, 2001) present an atrial fibrillation detection method that is statistical in principle and based upon an observed difference in the standard density histograms of ΔRR intervals (the difference in successive interbeat intervals).

A series of reference standard density histograms characteristic of atrial fibrillation (as assessed in the annotations) are first obtained from the afdb. Their detection algorithm is then re-run on the afdb by taking 100 interbeat intervals and comparing them to the reference histograms, from which predictions are made. The reference histograms rely on the correctness of the annotations in order to determine fibrillation, whereas the thresholds in our algorithm are only weakly dependent on the data set under consideration. Figure 5.3 is an empirical observation; in future analyses we would like to use fibrillation thresholds derived from a data set separate from the one under consideration.

Sarkar et al. (2008) have developed a detector of atrial fibrillation and tachycardia that uses a Lorenz plot of ΔRR intervals to differentiate between rhythms. The detector is shown to perform better for episodes of fibrillation greater than 3 min and has a minimum response time of 2 min. By contrast, our method is applicable to short sections of data, enabling quicker response times to be used. We see our algorithm complementing other detection techniques, with the potential for an implementation that combines more than one method. Combining methods becomes increasingly relevant when running algorithms on data sets containing a variety of arrhythmias. As noted by Tateno and Glass (2000, 2001), other arrhythmias often show irregular RR intervals, and previous studies have found difficulty in detecting atrial fibrillation based solely on RR intervals (Pinciroli & Castelli 1986; Slocum et al. 1987; Murgatroyd et al. 1995; Andresen & Brüggemann 1998).

#### 5.4.4 Systematic error

There are two intrinsic sources of error in the spectral entropy measure related to the phenomenon of spectral leakage: that due to the “picket-fence effect” (where frequencies in the power spectrum fall between discrete bins, see Salvatore & Trotta 1988) and that due to finite window effects (where, for a given frequency, an integer number of periods does not fall into the spectral entropy window, see Harris 1978; Nuttall 1981).

We attempt to quantify this error by applying the measure (with parameters as per the Data Analysis section) to synthetic event series: a periodic series with constant interbeat interval. For a heart rate range of 50–200 bpm in 1-bpm increments we obtain 150 synthetic time series. We find the average error in the spectral entropy over the 150 time series to be 0.02. The average standard deviation value (with variance windows having M equal to 20 spectral entropy values) over the 150 time series is 0.011 ± 0.009; the average error on these standard deviation values due to windowing is 0.0002.
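The synthetic event series described above can be generated as in the following sketch. The function name, the 60 s duration, and the 30 ms bin width default are assumptions for illustration; the text also does not state which endpoint convention yields exactly 150 series, so the half-open range 50 to 199 bpm used here is one possible reading.

```python
import numpy as np

def periodic_event_series(bpm, duration_s=60.0, tau_s=0.030):
    """Binary event series for a perfectly regular rhythm at `bpm` beats per minute."""
    interval_s = 60.0 / bpm                      # constant interbeat interval
    n_bins = int(round(duration_s / tau_s))
    n_beats = int(duration_s * bpm / 60.0)
    beat_times = np.arange(n_beats) * interval_s
    series = np.zeros(n_bins, dtype=int)
    idx = np.floor(beat_times / tau_s).astype(int)
    series[idx[idx < n_bins]] = 1                # one event per beat
    return series

# One series per heart rate; range(50, 200) gives 150 rates (assumed convention).
rates_bpm = range(50, 200)
synthetic = {bpm: periodic_event_series(bpm) for bpm in rates_bpm}
```

Because the interbeat interval is generally not an integer multiple of the bin width, beats land in bins with a small timing jitter, which is one way windowing and discretization error can enter even for an ideal periodic rhythm.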

The presence of some form of error associated with finite windows is unavoidable. We have attempted to minimize such errors by choosing parameters that achieve a balance between usability and error magnitude. There is still scope for fine-tuning parameters, in particular by trying a variety of window shapes to further reduce the effect of spectral leakage. However, we find the general results to be robust to a range of window parameters, implying any practical effect of windowing errors to be minimal when compared to the other issues discussed in this section.
