Artificial Intelligence in Medicine (2008) 44, 51—64

http://www.intl.elsevierhealth.com/journals/aiim

Support vector machine-based arrhythmia classification using reduced features of heart rate variability signal

Babak Mohammadzadeh Asl a,*, Seyed Kamaledin Setarehdan b, Maryam Mohebbi a

a Department of Biomedical Engineering, Tarbiat Modares University, Tehran, Iran
b Control and Intelligent Processing Center of Excellence, Faculty of Electrical and Computer Engineering, University of Tehran, P.O. Box 14395/515, Tehran, Iran

Received 14 July 2007; received in revised form 24 April 2008; accepted 28 April 2008

KEYWORDS: Arrhythmia classification; Generalized discriminant analysis; Heart rate variability; Nonlinear analysis; Support vector machine

Summary

Objective: This paper presents an effective cardiac arrhythmia classification algorithm based on the heart rate variability (HRV) signal. The proposed algorithm combines the generalized discriminant analysis (GDA) feature reduction scheme with the support vector machine (SVM) classifier.
Methodology: Initially, 15 different features are extracted from the input HRV signal by means of linear and nonlinear methods. These features are then reduced to only five features by the GDA technique. This not only reduces the number of input features but also increases the classification accuracy by selecting the most discriminating features. Finally, the SVM combined with the one-against-all strategy is used to classify the HRV signals.
Results: The proposed GDA- and SVM-based cardiac arrhythmia classification algorithm is applied to input HRV signals, obtained from the MIT-BIH arrhythmia database, to discriminate six different types of cardiac arrhythmia. In particular, the HRV signals representing the six arrhythmia classes of normal sinus rhythm, premature ventricular contraction, atrial fibrillation, sick sinus syndrome, ventricular fibrillation and second-degree (2°) heart block are classified with accuracies of 98.94%, 98.96%, 98.53%, 98.51%, 100% and 100%, respectively, which are better than any previously reported results.
Conclusion: An effective cardiac arrhythmia classification algorithm is presented. A main advantage of the proposed algorithm, compared to approaches which use the ECG signal itself, is that it is based entirely on the HRV (R—R interval) signal, which can be extracted with relatively high accuracy from even a very noisy ECG signal. Moreover, the use of the HRV signal leads to an effective reduction of the processing time, which makes an online arrhythmia classification system possible. A main drawback of the proposed algorithm, however, is that some arrhythmia types, such as left bundle branch block and right bundle branch block beats, cannot be detected using only the features extracted from the HRV signal.
© 2008 Elsevier B.V. All rights reserved.

* Corresponding author. Tel.: +98 912 4715235; fax: +98 21 88633029.
E-mail address: b.mohammadzadeh@modares.ac.ir (B.M. Asl).

0933-3657/$ — see front matter © 2008 Elsevier B.V. All rights reserved.
doi:10.1016/j.artmed.2008.04.007

1. Introduction

Heart diseases are a major cause of mortality in the developed countries. Many different instruments and methods have been developed and are in daily use for analyzing the behavior of the heart. One of the relatively new methods for assessing heart activity and discriminating different cardiac abnormalities is the analysis of the so-called heart rate variability (HRV) signal. The HRV signal, which is generated from the electrocardiogram (ECG) by calculating the inter-beat intervals, is a nonlinear and nonstationary signal that represents the autonomic activity of the nervous system and the way it influences the cardiovascular system. Hence, measurement and analysis of the heart rate variations is a non-invasive tool for assessing both the autonomic nervous system and the cardiovascular autonomic regulatory system. Furthermore, it can provide useful information about current and/or future heart deficiencies [1]. Therefore, HRV analysis can be considered an important diagnostic tool in cardiology.
Several methods have been proposed in the literature for automatic cardiac arrhythmia detection and classification. Examples of the techniques used include: threshold-crossing intervals [2], neural networks [3—10], wavelet transforms [11], wavelet analysis combined with radial basis function neural networks [12], support vector machines [13], Bayesian classifiers [14], fuzzy logic combined with Markov models [15], fuzzy equivalence relations [16], and rule-based algorithms [17]. Most of these studies [2—6,11—13] are based on the analysis of the ECG signal itself. In most methods, various features of the ECG signal, including morphological features, are extracted and used for classification of the cardiac arrhythmias. This is a time-consuming procedure and the results are very sensitive to the amount of noise.
An alternative approach is to first extract the HRV signal from the ECG signal by recording the R—R time intervals and then process the HRV signal instead. This is a more robust method, since the R—R time intervals are less affected by noise. Different HRV signal analysis methods for cardiac arrhythmia detection and classification have been introduced in the past. Tsipouras and Fotiadis [8] proposed an algorithm based on both time and time—frequency analysis of the HRV signal using a set of neural networks. Their method could only classify the input ECG segments as "normal" or "arrhythmic", without the ability to identify the type of the arrhythmia. Acharya et al. [16] employed a multilayer perceptron (MLP) together with a fuzzy classifier for arrhythmia classification using the HRV signal. They could classify the input ECG segments into one of four different arrhythmia classes. In [17], Tsipouras et al. proposed a knowledge-based method for arrhythmia classification into four different categories. The main drawback of their algorithm was that atrial fibrillation, an important life-threatening arrhythmia, was excluded from the ECG database.
In this paper a new arrhythmia classification algorithm is proposed which is able to effectively identify six different and frequently occurring types of cardiac arrhythmia, namely normal sinus rhythm (NSR), premature ventricular contraction (PVC), atrial fibrillation (AF), sick sinus syndrome (SSS), ventricular fibrillation (VF) and second-degree (2°) heart block (BII). The proposed algorithm is based on two kernel learning machines: generalized discriminant analysis (GDA) and the support vector machine (SVM). By cascading the SVM with GDA, the input features are nonlinearly mapped twice by a radial basis function (RBF). As a result, a linear optimal separating hyperplane can be found with the largest margin of separation between each pair of arrhythmia classes in the implicit dot product feature space.
GDA is a data transformation technique first introduced by Baudat and Anouar [18]. It can be considered a generalization of the well-known linear discriminant analysis (LDA) algorithm and has become a promising feature extraction scheme [19—24] in recent years. The main steps in GDA are to first map the input data into a convenient higher dimensional feature space F and then to perform the LDA algorithm on F instead of on the original input space. By GDA, therefore, both dimensionality reduction of the input feature space and selection of useful discriminating features can be achieved simultaneously.
SVM, first proposed by Vapnik [25], has recently been considered an effective classification scheme in many pattern recognition problems [22—24,26,27]. It is often reported that SVM provides better classification results than other widely used methods such as neural network classifiers [28,29]. This is partly because SVM aims to obtain the optimal answer from the available information while at the same time showing better generalization ability on unseen data.

In the following, the details of the proposed algorithm for cardiac arrhythmia classification from the HRV signal are presented. Section 2 provides the overall block diagram of the proposed algorithm together with the details of each block. The results of applying the proposed algorithm to the MIT-BIH arrhythmia database are presented in Section 3. Section 4 compares the results obtained by the proposed algorithm with those obtained by other previously reported techniques; this is followed by a discussion of the results and the methods. Finally, Section 5 concludes the paper.

2. Materials and methods

2.1. Database

The HRV data used in this work are generated from the ECG signals provided by the MIT-BIH Arrhythmia Database [30]. The database was created in 1980 as a reference standard for all those conducting research on the cardiac arrhythmia detection and classification problem [31].

The MIT-BIH Arrhythmia Database includes 48 ECG recordings, each 30 min long, with a total of about 109,000 R—R intervals. The ECG signals were bandpass-filtered to the frequency range of 0.1—100 Hz and sampled at a frequency of 360 Hz. Each of the roughly 109,000 beats was manually annotated by at least two cardiologists working independently. Their annotations were compared, consensus on disagreements was reached, and the reference annotation files were prepared [31]. The reference annotation files include beat, rhythm, and signal quality annotations. Due to the lack of the VF data needed for the current study in the MIT-BIH arrhythmia database, the Creighton University Ventricular Tachyarrhythmia Database was added to the MIT-BIH data as the VF arrhythmia class after resampling it at a rate of 360 Hz.

Finally, a total of 1367 ECG segments, each containing 32 R—R intervals, were selected from the above-mentioned databases and used in this work; together they contain all six arrhythmia classes considered in this study. The specialist-defined rhythm annotations for each segment were also kept along with the segments.

2.2. The proposed algorithm

The block diagram of the proposed algorithm is shown in Fig. 1. As seen, it comprises four steps: preprocessing, feature extraction, GDA-based feature reduction and SVM-based arrhythmia classification. In the following, each block is described in more detail.

2.2.1. Preprocessing
As a first step, it is necessary to extract the HRV signals from the ECG signals within the database. In general, this process can be affected by many interfering signals, such as the 50 Hz mains interference, electromyogram (EMG) signals and baseline wander. The interfering signals are effectively eliminated from the input ECG signal using a 5—15 Hz bandpass filter. Furthermore, cubic splines are used to approximate the baseline, which is then subtracted from the signal [32].

Next, the tachograms are extracted from the filtered ECG signals as follows. Initially, using the Hamilton and Tompkins algorithm [33,34], a point within the QRS complex is detected (QRS point). Afterwards, the main wave of the QRS complex, i.e., the R wave, is identified by locating the maximum absolute value of the signal within the time window [QRS − 280 ms, QRS + 120 ms]. The HRV signal is then constructed by measuring the time intervals between successive R waves (R—R intervals). Plotting the R—R intervals against the time indices yields the so-called tachograms. The tachograms are then divided into small segments, each containing 32 R—R intervals, and characterized using the database rhythm annotation. It must be noted that the resulting tachograms are sequences of unevenly sampled beat-to-beat intervals. Therefore, for the frequency domain analysis in the forthcoming Section 2.2.2, cubic spline interpolation at a sampling rate of 4 Hz is used to produce evenly sampled data. This resampling procedure is necessary prior to using the well-known power spectral density (PSD) estimation methods, which are only applicable to evenly sampled signals.

Figure 1. Block diagram of the proposed arrhythmia classification algorithm.

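The tachogram construction and resampling steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the QRS/R-wave detection stage is assumed to be already done (the function takes R-peak sample indices directly), and linear interpolation stands in for the cubic splines used in the paper.

```python
import numpy as np

def rr_intervals(r_peak_samples, fs=360.0):
    """Convert R-peak sample indices into R-R intervals in seconds."""
    r_times = np.asarray(r_peak_samples, dtype=float) / fs
    return np.diff(r_times)

def resample_tachogram(rr, fs_out=4.0):
    """Resample the unevenly sampled tachogram onto a uniform 4 Hz grid.

    Each interval rr[i] is taken to occur at the time of its ending beat.
    Linear interpolation is used here as a stand-in for cubic splines.
    """
    beat_times = np.cumsum(rr)                      # time of each beat (s)
    t_uniform = np.arange(beat_times[0], beat_times[-1], 1.0 / fs_out)
    return t_uniform, np.interp(t_uniform, beat_times, rr)
```

Dividing the resulting evenly sampled series into 32-interval segments then proceeds as described above.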
2.2.2. Feature extraction
The next step in the block diagram is feature extraction. In general, the cardiovascular system, and hence the HRV signal, demonstrates both linear and nonlinear behavior, and different linear and nonlinear parameters have been defined and used for HRV signal description. In this work, a combination of both linear and nonlinear features of the HRV signal is considered. Time and frequency domain features are among the standard linear measures of HRV signals, strongly recommended in the special report published by the Task Force of the European Society of Cardiology and the North American Society of Pacing and Electrophysiology in 1996 [1]. As in most previous works, these features are used in the current study.

2.2.2.1. Linear analysis: time domain features. Seven commonly used time domain parameters of the HRV signal, which are also considered in this work, are as follows:

Mean: the mean value of the 32 R—R intervals within each segment.
RMSSD: the root mean square of the successive differences of the 32 R—R intervals in each segment.
SDNN: the standard deviation of the 32 R—R intervals within each segment.
SDSD: the standard deviation of the differences between adjacent R—R intervals within each segment.
pNN50, pNN10, pNN5: the number of successive interval differences greater than 50, 10 and 5 ms, respectively, divided by 32, the total number of R—R intervals within each segment.

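The seven time domain parameters listed above can be computed directly; the following sketch (function name ours) assumes the intervals are given in milliseconds and uses the population standard deviation, one plausible reading of the definitions:

```python
import numpy as np

def time_domain_features(rr_ms):
    """Seven time domain HRV features for one 32-interval segment."""
    rr = np.asarray(rr_ms, dtype=float)
    diff = np.diff(rr)                       # successive differences
    n = len(rr)                              # 32 in the paper
    return {
        "Mean":  rr.mean(),
        "SDNN":  rr.std(ddof=0),             # std of the intervals
        "RMSSD": np.sqrt(np.mean(diff ** 2)),# RMS of successive differences
        "SDSD":  diff.std(ddof=0),           # std of successive differences
        # fraction of successive differences exceeding 50 / 10 / 5 ms
        "pNN50": np.sum(np.abs(diff) > 50) / n,
        "pNN10": np.sum(np.abs(diff) > 10) / n,
        "pNN5":  np.sum(np.abs(diff) > 5) / n,
    }
```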
2.2.2.2. Linear analysis: frequency domain features. Although the time domain parameters are computationally efficient, they lack the ability to discriminate between the sympathetic and parasympathetic contents of the HRV signal [35]. Frequency domain analysis, the most popular linear technique in HRV signal analysis, does have this ability. In fact, it is generally accepted that the spectral power in the high-frequency (HF) band (0.15—0.4 Hz) of the HRV signal reflects the respiratory sinus arrhythmia (RSA) and thus cardiac vagal activity. The low-frequency (LF) band (0.04—0.15 Hz), on the other hand, is related to baroreceptor control and is mediated by both the vagal and sympathetic systems [1,7]. In this work, the ratio of the LF and HF band powers (LF/HF) is used as the frequency domain feature of the HRV signal.

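The LF/HF feature could be estimated as follows; a plain FFT periodogram stands in here for whichever PSD estimator the authors used, with the band limits taken from the definitions above:

```python
import numpy as np

def lf_hf_ratio(x, fs=4.0):
    """LF/HF power ratio of an evenly resampled (4 Hz) tachogram segment."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                         # remove DC before the PSD
    psd = np.abs(np.fft.rfft(x)) ** 2        # simple periodogram
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    lf = psd[(freqs >= 0.04) & (freqs < 0.15)].sum()   # LF band power
    hf = psd[(freqs >= 0.15) & (freqs <= 0.40)].sum()  # HF band power
    return lf / hf
```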
2.2.2.3. Nonlinear analysis. Seven different nonlinear parameters of the HRV signal are used in this work; they are listed and described below.

SD1/SD2: Let us consider the HRV signal as a time series of R—R intervals denoted by RR(i). If each interval RR(n + 1) is plotted as a function of the previous interval RR(n), the resulting plot is known as the Poincaré plot, a relatively new tool for HRV signal analysis. A useful property of this tool is that it does not require the HRV to be treated as a stationary signal [36]. The Poincaré plot can be seen as a graphical representation of the correlation between successive R—R intervals. It can be quantitatively analyzed by calculating the standard deviations of the distances of the points (RR(n), RR(n + 1)) from the lines y = x and y = −x + 2RRm, where RRm is the mean of all RR(i) values. These standard deviations are denoted by SD1 and SD2, respectively. SD1 represents the fast beat-to-beat variability, while SD2 describes the relatively long-term variability in the HRV signal [37]. The ratio SD1/SD2 is usually used to describe the relation between the two components.
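The SD1 and SD2 dispersions just described are conveniently obtained by rotating the Poincaré scatter by 45°, which is equivalent to measuring distances from the two lines; a minimal sketch (function name ours):

```python
import numpy as np

def poincare_sd1_sd2(rr):
    """SD1, SD2 and their ratio from the Poincare plot of an R-R series."""
    rr = np.asarray(rr, dtype=float)
    x, y = rr[:-1], rr[1:]                   # RR(n) vs. RR(n+1)
    sd1 = np.std((y - x) / np.sqrt(2))       # dispersion across y = x (fast)
    sd2 = np.std((y + x) / np.sqrt(2))       # dispersion along y = x (slow)
    return sd1, sd2, sd1 / sd2
```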
ApEn: Approximate entropy (ApEn) is a measure of the unpredictability of the fluctuations in a time series; it reflects the likelihood that particular patterns of observations will not be followed by additional similar observations. A time series containing many repetitive patterns has a relatively small ApEn, while a more complex (i.e., less predictable) process has a relatively high ApEn [38]. We have used the method proposed in [39] to calculate ApEn for each HRV segment, setting the pattern length m = 2 and the similarity tolerance r = 20% of the standard deviation of the segment, as proposed in [40].
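A compact implementation of ApEn with these settings might look as follows; this is the standard Pincus formulation rather than a reproduction of the exact code in [39]:

```python
import numpy as np

def approximate_entropy(x, m=2, r_factor=0.2):
    """ApEn with pattern length m and tolerance r = r_factor * std(x),
    matching the settings used in the paper (m = 2, r = 20% of std)."""
    x = np.asarray(x, dtype=float)
    r = r_factor * x.std()

    def phi(m):
        n = len(x) - m + 1
        patterns = np.array([x[i:i + m] for i in range(n)])
        # Chebyshev distance between every pair of m-length patterns
        dist = np.max(np.abs(patterns[:, None, :] - patterns[None, :, :]), axis=2)
        c = np.mean(dist <= r, axis=1)       # fraction of similar patterns
        return np.mean(np.log(c))

    return phi(m) - phi(m + 1)
```

A strongly repetitive segment yields an ApEn near zero, while an irregular one yields a clearly larger value, which is the behavior the feature exploits.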
SpEn: Similar to ApEn, spectral entropy (SpEn) quantifies the complexity of the input time series (HRV segment), but in the frequency domain [41]. Shannon's channel entropy is used in this work to obtain an estimate of the spectral entropy of the process as

H = −Σ_f p_f log(p_f)   (1)

where p_f is the value of the probability density function (PDF) of the process at frequency f [7]. Heuristically, the entropy can be interpreted as a measure of uncertainty about the event at frequency f.
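Eq. (1) can be evaluated by normalizing the PSD into a probability distribution; the normalization by the log of the number of bins below is our addition, so that segments of different lengths are comparable:

```python
import numpy as np

def spectral_entropy(x, fs=4.0):
    """Shannon entropy of the normalized PSD (Eq. (1)), scaled to [0, 1]."""
    x = np.asarray(x, dtype=float)
    psd = np.abs(np.fft.rfft(x - x.mean())) ** 2
    p = psd / psd.sum()                      # PSD as a probability density
    p = p[p > 0]                             # 0 * log 0 is taken as 0
    return -np.sum(p * np.log(p)) / np.log(len(psd))
```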
LLE: The Lyapunov exponent is a measure of how fast two initially nearby points on a trajectory diverge from each other as the system evolves, hence providing useful information about the system's dependency on initial conditions [42]. A positive Lyapunov exponent strongly indicates that the system is chaotic [43,44]. Although an m-dimensional system has m Lyapunov exponents, in most applications it is sufficient to obtain only the average largest Lyapunov exponent (LLE), as follows. First, a starting point is selected within the reconstructed phase space of the system, and all points residing within a neighborhood of a predetermined radius e of the starting point are determined. Next, the mean distances between the trajectory of the initial point and the trajectories of the neighboring points are calculated as the system evolves. Plotting the logarithm of these mean distances against the time index gives a line whose slope is the LLE. To remove the dependency of the calculated values on the starting point, this procedure is repeated for different starting points, and the average, taken as the average LLE [45], is used as a feature to quantify the chaotic behavior of the HRV signal.

DFA: Detrended fluctuation analysis (DFA) is a useful technique for quantifying the fractal scaling properties of the R—R intervals. It is a modification of the root-mean-square analysis of random walks, applied to nonstationary signals [46]. For the detrended fluctuation analysis of the HRV signal, the R—R time series (of total length N) is first integrated:

y(k) = Σ_{i=1}^{k} (RR(i) − RRm)   (2)

where y(k) is the kth value of the integrated series, RR(i) is the ith inter-beat interval and RRm is the average inter-beat interval over the entire series. Then, the integrated time series is divided into windows of equal length n. In each window of length n, a least-squares line is fitted to the data, representing the trend in that window; the y coordinates of these straight line segments are denoted by yn(k). Next, the integrated time series is detrended within each window, and the root-mean-square fluctuation of the integrated and detrended series is calculated as

F(n) = sqrt( (1/N) Σ_{k=1}^{N} [y(k) − yn(k)]² )   (3)

This computation is repeated over all window sizes (time scales) to obtain the relationship between F(n) and the window size n (i.e., the number of beats within the observation window). F(n) is usually plotted against the window size n on a log—log scale. Typically, F(n) increases with the window size. The fluctuation in the windows can be characterized by a scaling exponent (self-similarity factor), a, which represents the slope of the line relating log F(n) to log n [35].

Sequential trend analysis: Sequential trend analysis of the HRV signal not only evaluates the sympathetic—parasympathetic balance but also characterizes the signal without the need to assume that it is stationary. To perform the sequential trend analysis, DRR(n) is plotted against DRR(n + 1) and the plane is divided into four quadrants. Points located in the +/+ quadrant indicate two consecutive interval increments, which means the heart rate is decreasing, and points in the −/− quadrant indicate two consecutive interval decrements, which means the heart rate is increasing. In this work the densities of the points within the −/− and +/+ quadrants are used as two features that measure the sympathetic and parasympathetic activities, respectively [36].

2.2.3. Feature dimension reduction by GDA
Having defined the above-mentioned linear and nonlinear features, it should be noted that, due to the large variations in the HRV patterns of the various arrhythmia classes, there is usually considerable overlap between some of these classes in the feature space. For example, the SSS and NSR classes overlap substantially, making it difficult to distinguish between the two. In this situation, a feature transformation that minimizes the within-class scatter and maximizes the between-class scatter is very beneficial. GDA [18] is such a transform and is employed in this work.

GDA is a nonlinear extension of ordinary LDA. The input training data are mapped by a kernel function to a high-dimensional feature space, where the different classes of objects are supposed to be linearly separable. The LDA scheme is then applied to the mapped data, where it searches for those vectors that best discriminate among the classes rather than those that best describe the data [47]. In fact, the goal of LDA is to seek a transformation matrix that maximizes the ratio of the between-class scatter to the within-class scatter. Furthermore, given a number of independent features which describe the data, LDA creates a linear combination of the features that yields the largest mean differences between the desired classes [48]. As a result, if there are N classes in the data set, the dimension of the feature space can be reduced to N − 1.

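A minimal sketch of the DFA procedure of Eqs. (2) and (3) follows; the window sizes and the fitting-by-`numpy.polyfit` are our illustrative choices:

```python
import numpy as np

def dfa_alpha(rr, window_sizes=(4, 8, 16, 32)):
    """DFA scaling exponent: integrate the mean-centered R-R series (Eq. (2)),
    detrend with a least-squares line per window, compute F(n) (Eq. (3)),
    and fit the slope of log F(n) versus log n."""
    rr = np.asarray(rr, dtype=float)
    y = np.cumsum(rr - rr.mean())            # Eq. (2): integrated series
    fluctuations = []
    for n in window_sizes:
        n_windows = len(y) // n
        sq = 0.0
        for w in range(n_windows):
            seg = y[w * n:(w + 1) * n]
            t = np.arange(n)
            trend = np.polyval(np.polyfit(t, seg, 1), t)   # local trend yn(k)
            sq += np.sum((seg - trend) ** 2)
        fluctuations.append(np.sqrt(sq / (n_windows * n))) # Eq. (3)
    slope, _ = np.polyfit(np.log(window_sizes), np.log(fluctuations), 1)
    return slope
```

For uncorrelated (white-noise-like) interval series the exponent is expected to be near 0.5, with larger values indicating long-range correlation.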
Let us assume that the training data set X contains M feature vectors from N classes. Let xpq denote the qth HRV feature vector in the pth class, let np be the size of the pth class, and let f be a nonlinear mapping function so that the space X is mapped into a higher dimensional feature space: f: xi ∈ R^f → f(xi) ∈ R^F, F ≫ f. The observations f(xi) are assumed to be centered in the space F. Before projecting the training data set X into a new set Y by means of GDA, the within-class scatter matrix V and the between-class scatter matrix B in the space F are defined as

V = (1/M) Σ_{p=1}^{N} Σ_{q=1}^{np} f(xpq) f^T(xpq)   (4)

B = (1/M) Σ_{p=1}^{N} np f̄p f̄p^T,  where f̄p = (1/np) Σ_{q=1}^{np} f(xpq)   (5)

The purpose of GDA is to find the projection vector v such that the between-class inertia is maximized and the within-class inertia is minimized in the space F, which is equivalent to solving the following maximization problem:

v* = arg max_v (v^T B v) / (v^T V v)   (6)

The projection vector v is the eigenvector of the matrix V^{-1}B associated with the eigenvalue l = v^T B v / v^T V v. All solutions v lie in the span of the f(xi), so there exist expansion coefficients ai such that

v = Σ_{i=1}^{M} ai f(xi)   (7)

By using the kernel function k(xi, xj) = kij = f(xi)·f(xj) and performing the eigenvector decomposition of the kernel matrix K = (kij), i = 1, ..., M; j = 1, ..., M, the M normalized expansion coefficients for each projection vector, a ← a/(a^T K a)^{1/2}, are obtained. Now, for a feature vector x from the test HRV data set, the projection onto the ith eigenvector vi can be computed as

yi = vi^T f(x) = Σ_{j=1}^{M} a_j^i f(xj)·f(x) = Σ_{j=1}^{M} a_j^i k(xj, x)   (8)

where a_j^i denotes the jth expansion coefficient of the ith eigenvector. For the purpose of feature dimension reduction, the N − 1 eigenvectors associated with the N − 1 largest nonzero eigenvalues are selected to form the transformation matrix W^T = (v1, ..., v_{N−1}). Thus, each HRV feature vector is projected into new coordinates using the N − 1 projection vectors.

Figure 2. Box-plots of the new five features for the different arrhythmia classes (1 = NSR, 2 = PVC, 3 = AF, 4 = SSS, 5 = VF, 6 = BII). For more information see the text.

It is worth noticing that the optimal number of eigenvectors for the data transformation is generally equal to N − 1 [22]. In this paper, since the number of classes (different arrhythmias to be identified) is 6, the 15 original features were reduced to 5 by GDA, with performance comparable to that of the non-reduced feature set.

The box-plots and the feature space plots of the new five features for the different arrhythmia classes, generated by this process, are presented in Figs. 2 and 3, respectively. As seen, the patterns belonging to the same arrhythmia class are located close to each other and relatively well separated from the other classes within the feature space. Therefore, the new reduced feature set not only speeds up the classification procedure in the next step but also provides an appropriate basis for better discrimination of the different arrhythmia classes. To demonstrate the usefulness of the GDA technique, a comparative study is carried out in the next section, comparing GDA to the commonly used feature dimension reduction techniques of principal component analysis (PCA) and LDA.

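The idea of Eqs. (4)-(8) can be illustrated with a simplified sketch: instead of reproducing Baudat and Anouar's exact algorithm, we perform ordinary LDA on the empirical kernel map (the rows of the RBF kernel matrix), which realizes the same "LDA in kernel feature space" principle. Function names, the regularization term and the kernel width default are our assumptions.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    """RBF kernel matrix between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2 * sigma ** 2))

def gda_fit(X, labels, sigma=1.0, reg=1e-6):
    """Illustrative GDA: LDA on the empirical kernel map K (Eqs. (4)-(6)).

    Returns the N-1 projection directions (in kernel-coefficient space)
    with the largest eigenvalues of V^{-1} B; scaling by 1/M is omitted
    since it does not change the eigenvectors.
    """
    K = rbf_kernel(X, X, sigma)              # row i plays the role of phi(x_i)
    classes = np.unique(labels)
    mean_all = K.mean(axis=0)
    V = np.zeros((len(X), len(X)))           # within-class scatter
    B = np.zeros_like(V)                     # between-class scatter
    for c in classes:
        Kc = K[labels == c]
        mc = Kc.mean(axis=0)
        V += (Kc - mc).T @ (Kc - mc)
        B += len(Kc) * np.outer(mc - mean_all, mc - mean_all)
    eigvals, eigvecs = np.linalg.eig(
        np.linalg.solve(V + reg * np.eye(len(X)), B))
    order = np.argsort(-eigvals.real)[: len(classes) - 1]
    return eigvecs[:, order].real, X, sigma

def gda_transform(model, Xnew):
    """Project new data with the fitted directions, mirroring Eq. (8)."""
    W, Xtrain, sigma = model
    return rbf_kernel(Xnew, Xtrain, sigma) @ W
```

With 6 arrhythmia classes this yields a 5-dimensional output, matching the reduction from 15 to 5 features described above.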
2.2.4. SVM-based arrhythmia classifier
The next step in the block diagram is the classification of the HRV segments based on their reduced features. Different classification methods have been used for cardiac arrhythmia classification in the past [2—17]. In this work the SVM scheme is used. SVM is a machine-learning technique which has established itself as a powerful tool in many classification problems [22—24,26,27]. Simply stated, the SVM identifies the best separating hyperplane (the plane with maximum margins) between the two classes of training samples within the feature space by focusing on the training cases placed at the edge of the class descriptors. In this way, not only is an optimal hyperplane fitted, but fewer training samples are effectively used; thus high classification accuracy is achieved with small training sets [49].
Figure 3. Feature space plots of (a) the first, second, and third new features and (b) the third, fourth, and fifth new features for the different arrhythmia classes (NSR, PVC, AF, SSS, VF and BII).

Given a training set (xi, yi), i = 1, 2, ..., l, where xi ∈ R^n and yi ∈ {−1, 1}, the traditional SVM algorithm is summarized as the following optimization problem:

min_{w,b,ξ} (1/2) w^T w + C Σ_{i=1}^{l} ξi
subject to: yi(w^T f(xi) + b) ≥ 1 − ξi,  ξi ≥ 0 for all i   (9)

where f(x) is a nonlinear function that maps x into a higher dimensional space [50]. w, b and ξi are the weight vector, bias and slack variables, respectively.
C is a constant determined a priori. Searching for the optimal hyperplane in (9) is a quadratic programming problem, which can be solved by constructing a Lagrangian and transforming it into the dual maximization problem of the function Q(a), defined as follows:

max Q(a) = Σ_{i=1}^{l} ai − (1/2) Σ_{i=1}^{l} Σ_{j=1}^{l} ai aj yi yj K(xi, xj)
subject to: Σ_{i=1}^{l} ai yi = 0;  0 ≤ ai ≤ C for i = 1, 2, ..., l   (10)

where K(xi, xj) = f(xi)^T f(xj) is the kernel function and a = (a1, a2, ..., al) is the vector of nonnegative Lagrange multipliers.

Assuming that the optimum values of the Lagrange multipliers are denoted by a_{o,i} (i = 1, 2, ..., l), it is then possible to determine the corresponding optimum value of the linear weight vector wo and the optimal hyperplane as in (11) and (12), respectively:

wo = Σ_{i=1}^{l} a_{o,i} yi f(xi)   (11)

Σ_{i=1}^{l} a_{o,i} yi K(x, xi) + b = 0   (12)

The decision function can be written as

f(x) = sign( Σ_{i=1}^{l} a_{o,i} yi K(x, xi) + b )   (13)

Although the SVM separates the data into only two classes, classification into additional classes is possible by applying either the one-against-all (OAA) or the one-against-one (OAO) method. In the OAA method, a set of binary classifiers (k parallel SVMs, where k denotes the number of classes) is trained to separate each class from all the others. Each data object is then assigned to the class for which the largest decision value is obtained. The OAO method constructs k(k − 1)/2 parallel SVMs, where each SVM is trained on the data from two classes [51]. A voting strategy [51] then aggregates the decisions and assigns each data object to the class with the largest vote. A comparative study is carried out in the next section comparing the performances of the OAO and OAA decomposition methods in training the proposed multiclass SVMs. In the following, the results of applying the proposed algorithm to the data set are presented.

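The OAA decomposition just described can be sketched as follows. To keep the example self-contained, regularized least-squares scorers stand in for the paper's SVMs; only the decomposition and the argmax-of-decision-values aggregation rule are the point of the sketch.

```python
import numpy as np

def train_one_vs_all(X, labels):
    """One-against-all: train one binary scorer per class (class c vs. rest)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])    # append a bias column
    classes = np.unique(labels)
    W = []
    for c in classes:
        y = np.where(labels == c, 1.0, -1.0)     # +1 for class c, -1 otherwise
        # regularized least-squares stand-in for an SVM decision function
        w = np.linalg.solve(Xb.T @ Xb + 1e-6 * np.eye(Xb.shape[1]), Xb.T @ y)
        W.append(w)
    return classes, np.array(W)

def predict_one_vs_all(model, X):
    """Assign each object to the class with the largest decision value."""
    classes, W = model
    Xb = np.hstack([X, np.ones((len(X), 1))])
    scores = Xb @ W.T                            # one decision value per class
    return classes[np.argmax(scores, axis=1)]
```

The OAO variant would instead train k(k − 1)/2 pairwise scorers and aggregate by majority vote.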
3. Results

To evaluate the performance of the proposed arrhythmia classification algorithm, a total of 1367 HRV segments are used, comprising 835 NSR segments, 57 PVC segments, 322 AF segments, 50 SSS segments, 78 VF segments, and 25 BII segments. The relatively high percentage of NSR segments in the data set is not far from reality, as ECG recordings usually contain a higher percentage of normal beats than of arrhythmic segments. The HRV signals in each class are randomly divided into train and test sets in an approximate ratio of 2/3 to 1/3. The exact numbers of train and test segments for each class are shown in Table 1.

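One way the per-class 2/3-1/3 split described above could be implemented (function name and rounding rule are our assumptions) is:

```python
import numpy as np

def stratified_split(labels, train_frac=2.0 / 3.0, seed=0):
    """Randomly split each class into train/test indices (about 2/3 vs. 1/3),
    as done per arrhythmia class before each train-and-test run."""
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for c in np.unique(labels):
        idx = rng.permutation(np.where(labels == c)[0])  # shuffle the class
        n_train = int(round(train_frac * len(idx)))
        train_idx.extend(idx[:n_train])
        test_idx.extend(idx[n_train:])
    return np.array(train_idx), np.array(test_idx)
```

Repeating the split with 100 different seeds, retraining and retesting each time, reproduces the evaluation protocol used below.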
In this work, the radial basis function (RBF) is used as the kernel function, and its parameters, namely the kernel width s and the regularization constant C, were tuned experimentally to achieve the best classification result.

Table 1. Correct classification results on the test set for each class by the proposed algorithm, together with the spread of the erroneous classifications into the other classes (averages over 100 train-and-test procedures).

Table 2. Performance analysis of the SVM classifier on the original features (ORG) and the reduced features (by GDA), in terms of the average values of four commonly used measures in % (numbers inside the parentheses are the standard deviations).

The 15 aforementioned linear and nonlinear arrhythmia features are then calculated for each HRV segment in the train and test sets. Next, the 15 original features are reduced to only 5 new features by means of the GDA algorithm. A radial basis function (RBF) is used as the kernel, where a kernel width of 7 was chosen empirically. Afterwards, the SVM classifier is trained using the reduced feature vectors of the training set. To optimize the learning cost and the classification performance, the SVM classifier parameters, the kernel width σ and the regularization constant C, have to be chosen appropriately. For this purpose, the training data set itself is divided into train and validation sets. The optimum values of the parameters σ and C are then chosen such that the minimum error is achieved on the validation data set. The resulting optimum values were 0.08 and 30 for σ and C, respectively. The test data for each class are then used for performance analysis of the classifier.
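The validation-based selection of σ and C can be sketched as a simple grid search; `fit` and `error` stand in for training a classifier and measuring its validation error, and the candidate grids are assumptions, since the paper does not detail its search procedure:

```python
def select_params(train_set, val_set, fit, error, sigmas, Cs):
    """Return the (sigma, C) pair whose trained model attains the
    smallest error on the validation set (exhaustive grid search)."""
    best_err, best_pair = float("inf"), None
    for sigma in sigmas:
        for C in Cs:
            model = fit(train_set, sigma, C)
            err = error(model, val_set)
            if err < best_err:
                best_err, best_pair = err, (sigma, C)
    return best_pair
```

Because the validation set is carved out of the training data, the test segments play no role in choosing σ and C, keeping the reported test accuracy unbiased.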
The whole procedure, including randomly dividing the data set into train and test sets, training the classifier, and testing it on the test set, was repeated 100 times. The average correct classification results over the 100 runs on the test set for each class, together with the spread of the erroneous classifications into the other classes, are shown in Table 1. As seen, for the NSR, on average only 2.1 segments are misclassified as SSS (0.75%). For the PVC, on average only one segment is misclassified as AF (5.26%). For the AF, on average 3.8 segments are misclassified as PVC (3.52%) and 2 segments as SSS (1.11%). For the SSS, on average 2.8 segments are misclassified as NSR (14%). For the VF and the BII, there are no misclassifications into any other classes.
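The percentages quoted above follow from dividing the averaged misclassification counts by the per-class test-set sizes (assumed here to be roughly one third of each class, e.g. 19 PVC test segments):

```python
def misclass_rate(avg_confused, n_test):
    """Percentage of a class's test segments assigned to another class,
    given the confusion count averaged over the 100 runs."""
    return 100.0 * avg_confused / n_test

# e.g. PVC -> AF: one of 19 test segments
print(round(misclass_rate(1, 19), 2))  # -> 5.26
```

The same arithmetic reproduces the SSS figure: 2.8 of 20 test segments gives 14%.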
For a m
