`
`Classification of cardiac arrhythmia with respect to ECG and HRV signal by
`genetic programming
`
`Article · January 2012
`
`CITATIONS
`12
`
`3 authors, including:
`
`Mohammad Mehdi Ebadzadeh
`Amirkabir University of Technology
`
`111 PUBLICATIONS 1,380 CITATIONS
`
`SEE PROFILE
`
`READS
`885
`
`Some of the authors of this publication are also working on these related projects:
`
`Evolutionary Computing View project
`
`Fuzzy Neural Networks View project
`
`All content following this page was uploaded by Mohammad Mehdi Ebadzadeh on 01 June 2014.
`
`The user has requested enhancement of the downloaded file.
`
`APPLE 1038
`
`1
`
`
`
`Canadian Journal on Artificial Intelligence, Machine Learning and Pattern Recognition Vol. 3 No. 1, January 2012
`
`
`Classification of cardiac arrhythmia with respect to
`ECG and HRV signal by genetic programming
`
`Masih Tavassoli, Mohammad Mehdi Ebadzadeh, Hamed Malek
`
`
`
`
`
`Abstract — Consistent or periodical heart rhythm disorders may
`result cardiac arrhythmias. In this article, heart rate variability
`(HRV) signals are analyzed and various features including time
`domain, frequency domain and nonlinear parameters are extracted.
`Moreover, additional nonlinear features are extracted from
`electrocardiogram (ECG) signals. These features are helpful in
`classifying cardiac arrhythmias. In this paper, genetic programming
`is applied to classify heart arrhythmias using both HRV and ECG
`features. Genetic programming selects effective features, and then
`finds the most suitable trees to distinguish between different types
`of arrhythmia. By considering the variety of extracted parameters
`from ECG and HRV signals, genetic programming can precisely
`differentiate various arrhythmias. The performance of proposed
`algorithm is evaluated on MIT–BIH Database. The results show
`that seven different types of arrhythmia classes including normal
`beat , left bundle branch block beat, right bundle branch beat,
`premature ventricular contraction, fusion of ventricular and normal
`beat, atrial premature contraction and paced beat are classified
`with an accuracy of 98.75%, 98.93% , 99.10%, 99.46%, 99.82%,
`99.46% and 99.82% respectively..
`
`
`
`Key Words — Arrhythmia, Electrocardiogram (ECG),
`Heart Rate Variability (HRV), Genetic Programming (GP),
`Feature Selection
`
`I. INTRODUCTION
`
`Heart is a muscular organ which is responsible to pump
`oxygenated blood throughout blood vessels by rhythmic
`contractions. Any disturbance in the heart rhythm can be
`very dangerous. Although cardiac arrhythmia is one of the
`leading causes of death, it can be treated if detected on time.
`Heart arrhythmia can cause too slow or too fast performance
`of the heart. To detect it, ECG and HRV signals are widely
`used. ECG signal records electrical performance of the heart.
`It contains a lot of important information related to the
`condition of the heart and one of the most important tools in
`detecting heart diseases. A typical ECG signal consists of
`the P-wave, QRS complex, and T-waves. The P wave is the
`result of slow-moving depolarization of the atria. QRS
`complex which is made of Q, R and S waves shows
`ventricular depolarization. The T wave
`represents
`repolarization of the ventricles, and is longer in duration
`than depolarization. HRV signal describes the variations
`between consecutive heartbeats. It
`is a non-offensive
`evaluation method of the nervous system which controls
`cardiovascular system and
`is a measurement of
`the
`interaction between sympathetic and parasympathetic
`activity in autonomic nervous system. HRV signal is a non-
`stationary signal and its changes can be interpreted as a
`current or upcoming disease.
`
`and
`detection
`automatic
`for
`Several methods
`classification of cardiac arrhythmias have been proposed in
`literature, including: artificial immune recognition system
`with fuzzy weighted [1], threshold-crossing intervals [2],
`neural networks [3], fuzzy neural networks [8], fuzzy
`equivalence relations [12], Bayesian classifiers [16], support
`vector machines [17,22], wavelet
`transforms [18-20],
`combined wavelet transformation and radial basis neural
`networks [21], fuzzy logic combined with the Markov
`models [23] and the rule-based algorithms [24]. Some papers
`used
`techniques which are based on ECG segment
`[2,9,11,21,22,23,25]. In these papers, the various features of
`the ECG signal including the morphological features are
`extracted and used for classification of
`the cardiac
`arrhythmias. This is a time consuming procedure and the
`results are very sensitive to the amount of noise. An
`alternative approach would be to extract the HRV signal
`from the ECG signal [12-18,26,29] first by recording the R-
`R time intervals and then processing the HRV signal instead.
`This is a more robust method since the R-R time intervals
`are less affected by noise. One drawback of the proposed
`HRV-based algorithm is that some of the arrhythmia types
`such as the left bundle branch block and the right bundle
`branch block beats cannot be detected using only the heart
`rate variability features.
`In this paper a new arrhythmia classification algorithm is
`proposed which is able to effectively classify seven types of
`arrhythmia. These arrhythmias are namely the normal beat
`(NB), left bundle branch block beat (LBBB), right bundle
`branch beat (RBBB), premature ventricular contraction
`(PVC), fusion of ventricular and normal beat (FUSION),
`atrial premature contraction (APC) and paced beat (PACE).
`In this article, various features from both ECG and HRV
`signal are extracted and given to a genetic programming to
`produce the suitable solution trees to distinguish between
`different types of arrhythmia. From the various identified
`features the proposed method selects the effective ones and
`categorizes the seven classes of heart arrhythmia highly
`precisely.
`The reminder of this paper is formatted as follows:
`section 2 provides the overall block diagram of the proposed
`algorithm. In section 3, the database which is being utilized
`in this paper is introduced. Extracted features from HRV and
`ECG signal are explained in section 4. Genetic programming
`algorithm is introduced in the fifth section. The results are
`shown in section 6. Section 7 discusses about the result
`.Finally, Section 8 concludes the paper.
`
`
`
`1
`
`2
`
`
`
`Canadian Journal on Artificial Intelligence, Machine Learning and Pattern Recognition Vol. 3 No. 1, January 2012
`
`II. THE PROPOSED ALGORITHM
`
`IV. FEATURE EXTRACTION
`
`The block diagram of
`is
`the proposed algorithm
`demonstrated in Fig. 1. As seen, it consists of four steps:
`pre-processing on ECG signals to divide ECG signals to
`eight consecutive RR intervals and extract HRV signal,
`feature extraction from ECG and HRV signal, creating
`seven optimal trees to detect each arrhythmia by a genetic
`programming
`algorithm
`and
`finally
`arrhythmia
`classification. The following sections describe each block of
`this algorithm in more details.
`
`Input Data
`
`•
`
`Extract eight consecutive RR intervals
`from ECG signals
`
`• HRV signal extraction
`
`Extract features from HRV and ECG signals
`
`Create seven trees to distinguish
`between different types of arrhythmia
`genetic programming
`
`Arrhythmia classification
`
`Fig. 1. Block diagram of the proposed arrhythmia classification
`algorithm.
`
`III. DATABASE
`
`In this paper, the proposed approach is tested using the
`MIT-BIH arrhythmia database. This database contains 48
`ECG records. Each record is approximately 30 minutes long
`and
`it
`includes 109000 R-R
`intervals with sampling
`frequency of 360 Hz. Each beat has been annotated
`independently by two cardiologists. Their annotations were
`compared, consensus on disagreements was obtained, and
`the reference annotation files were prepared [27]. In most
`records, the upper signal is a modified limb lead II (ML II)
`and the lower signal is a modified lead V1 (VI). All of the
`ECG records of this article are chosen from lead II and
`include 8 sequential R-R intervals. Extracted records of this
`database include all seven classes of heart beat.
`After dividing the ECG signals into 8 sequential RR
`intervals, HRV signal is extracted from calculating the time
`intervals between every two consequential R-wave in an
`ECG signal (R-wave is located in the maximum absolute
`value of the signal within the time window).
`
`As seen in Fig. 1, features are extracted from both HRV
`and ECG signals. In the following subsections extracted
`features are explained.
`
`A. Extracted features from HRV signal
`
`Since the behavior of HRV signal includes both linear
`and non-linear behavior, a combination of these features is
`considered. These features include time domain, frequency
`domain and nonlinear parameters. Each ECG signal is
`divided into 8 consequential R-R intervals and each segment
`of HRV signal includes time distances between every two
`consequential R-wave in ECG signal.
`1) Time domain features
`Time-domain parameters of HRV are the easiest as they
`are based on common statistical methods. In this paper,
`seven commonly used time domain features are as follows:
`[28]:
`
`•
`
`•
`
`The mean value of the eight R-R intervals within
`each segment (Mean).
`The root mean square successive difference of the
`eight R-R intervals in each segment (RMSSD).
`The standard deviation of the 8 R-R intervals
`within each segment (SDNN).
`The standard deviation of differences between the
`adjacent R-R
`intervals within each segment
`(SDSD).
`The number of successive difference of intervals
`which differ by more than 50, 10 and 5 ms,
`respectively, divided by 8, the total number of the
`R-R
`Intervals within each segment
`(pNN50,
`pNN10, pNN5).
`2) Frequency domain features
`LF/HF: Although time domain features are important in
`classifying arrhythmia, they are not capable of distinction of
`sympathetic and parasympathetic content of the HRV signal
`[29]. For this purpose, HRV signal is transformed into
`frequency domain and the ratio of spectral power in lower
`bound (0.04-0.15Hz) to spectral power in upper bound
`(0.15-0.5Hz) is calculated. The lower bound frequency
`power
`is
`related
`to
`controlling
`temperature
`and
`cardiovascular mechanism and the upper frequency is related
`to the cardiac vagal activity.
`3) Nonlinear features
`The nonlinear properties of HRV can be analyzed using
`such as follow measures:
`
`•
`
`•
`
`•
`
`ApEn: Approximate entropy measures the complexity or
`irregularity of the signal. Large values of ApEn indicate high
`irregularity and smaller values of ApEn implies higher
`regularity [30]. The ApEn is computed as follows.
`For each segment in HRV signal with length N, uj is
`defined as follow:
`
`6
`
`3
`
`
`
`Canadian Journal on Artificial Intelligence, Machine Learning and Pattern Recognition Vol. 3 No. 1, January 2012
`
`u
`
`j
`
`=
`
`,RR(
`j
`
`...
`
`RR,
`
`−+
`1
`mj
`
`j,)
`
`=
`
`1
`,
`
`...
`
`+−
`1
`,mN,
`
`(1)
`
`RRj(ms)
`
`RRj+1(ms)
`
`=
`
`RR(
`
`,...,
`
`=
`
`where m is called the embedding dimension and N is the
`number of measured RR intervals. The distance between
`these vectors is defined as the maximum absolute difference
`between the corresponding elements. For each uj the relative
`number of vectors uk for which d(uj , uk) ≤ r is calculated by
`Eqs. (2) and (3)
`=
`max
`)u,u(d
`j
`k
`
`RR{|
`
`+
`nj
`
`−
`
`RR
`
`+
`nk
`
`n|
`
`=
`
`0
`
`−
`1
`,}m,...,
`
`
`
`
`
`(2)
`
`.
`
`
`
`
`
`(3)
`
`{|
`u
`
`k
`
`|
`
`,
`(
`uud
`j
`
`k
`
`)
`
`≤
`
`|}
`r
`
`−
`mN
`
`+
`
`1
`
`=
`
`m j
`
`)(
`rC
`
`Due to the normalization, the value of
`
`is always
`
`)(rC m
`j
`smaller or equal to 1. Afterward, the mean of natural
`)(rC m
`j
`
`logarithm of each
`
` over j is taken to yield:
`
`
`
`(4)
`
`)5(
`
`
`
`+−
`1
`mN
`.)r(Cln
`
`∑
`
`mj
`
`m
`
`Φ
`
`)r(
`
`=
`
`1
`+−
`mN
`
`1
`
`=
`1
`j
` Finally approximate entropy calculated by Eq. (5)
`1+Φ−
`m
`m
`
`
`
`
`
` )N,r,m(ApEn
`
`Φ=
`
`)r(
`
`).r(
`
`For calculating ApEn for each HRV segment, the value of
`m and r are chosen as m = 2 and r = 0.2SDNN.
`
`Fig. 2. Poincar´e plot analysis with the ellipse fitting procedure. SD1
`and SD2 are the standard deviations in the directions x1 and x2, where
`x2 is the line-of-identity for which RRj = RRj+1.
`Correlation Dimension: This parameter is an index of
`system complexity and indicates the number of independent
`variabilities to describe the behaviour of the system [33].
`Correlation Dimension of chaotic system is always a
`fraction, but it can be either a fraction or an integer number
`for a random system. By considering HRV signal as a time
`series, the correlation dimension can be calculated as
`follows:
`For each segment in HRV signal with length N, uj is defined
`as follow:
`u
`
`j
`
`j
`
`RR
`
`−+
`1
`mj
`
`j),
`
`−
`+
`1
`1
`,mN,...,
`
`
`
` (7)
`
`afterward by calculating d(uj,uk) distances, the number of
`vectors shorter than r is calculated, as in Eqs. (8) and (9).
`
`
`
`(8)
`
`(9)
`
`−
`))l(u)l(u(
`j
`k
`
`2
`
`,
`
`m
`
`∑ =
`
`l
`
`1
`
`
`
`)u,u(d
`j
`k
`
`=
`
`)u,u(d|u{|
`k
`j
`k
`
`≤
`
`|}r
`
`−
`mN
`
`+
`
`1
`
` .k
`∀
`
`=
`
`mj
`
`)r(C
`
`Then the mean value of
`
`)r(Cm
`j
`
` is calculated:
`
`
`
`
`
`(10)
`
`+−
`1
`mN
`.)r(C
`
`∑
`
`mj
`
`m
`)r(C
`
`=
`
`1
`+−
`mN
`
`1
`
`=
`1
`j
`The correlation dimension is calculated by Eq. (11)
`m
`)r(Clog
`
`lim
`
`r
`
`lim
`
`∞
`
`0
`
`N
`
`=
`)m(D
`
`2 A
`
`.
`
`
`
`
`
`)11(
`
`log
`r
`s shown in Fig. 3, in practice, this limit value is
`)r(Clog m
`
`approximated by slope of
`
` versus
`
`
`
`rlog when m
`
`is increased.
`
`7
`
`the HRV signal
`SpEn: Spectral entropy evaluates
`complexity in frequency domain [31]. Shanon channel
`entropy estimates HRV entropy as
`
`H
`
`f∑−=
`
`P
`
`log(
`
`,)P
`f
`
`
`
`(6)
`
`f
`where Pf is the value of the probability density function
`(PDF) of the process at frequency f.
`
`Poincare´ plot: This plot is another technique for analysis
`of HRV signal [32]. It is a graphical representation of the
`correlation between
`successive R-R
`intervals. By
`considering Poincare´plot as a time series of RRi, if each
`interval RRn + 1 is plotted as a function of the previous
`interval RRn, then the resulting plot is known as the
`Poincare´ plot. This plot shows the heart problem and some
`information about short term and long term oscillations. The
`Poincare´plot is derived by calculating SD1 and SD2
`parameters which are standard deviations of RRi distances
`from y=x and y=-x+2(RRm) lines respectively. RRm is the
`relation between
`mean of RRi. SD1/SD2 describes
`parameters. A common approach to parameterize the shape
`is to fit an ellipse to the plot as shown in Fig. 2.
`
`4
`
`
`
`Canadian Journal on Artificial Intelligence, Machine Learning and Pattern Recognition Vol. 3 No. 1, January 2012
`
`
`dc1 to dcn-2 are calculated. LE is approximately calculated by
`Eq. (13).
`
`1
`t
`
`.
`
`
`
`
`
`)13(
`
`t
`
`dd
`
`log
`
`∑−
`
`1
`
`n
`
`=
`
`1
`−
`
`1
`
`n
`
`λ
`
`=
`
`0
`t
`Hurst Exponent: The Hurst exponent is based on Hurst
`investigations to detect the incoming water flow in dam that
`he built on the Nile River. The incoming water flow in dams
`was assumed to be random but Hurst found out a non-
`periodic cycles in incoming flows based on the previous
`data. Hurst test was over generalized to other phenomena
`which seem to be random but may have an organized
`pattern. The Hurst exponent is a measure of the smoothness
`of a fractal time-series based on the asymptotic behavior of
`the rescaled range of the process. The test procedure is as
`follows [34]:
`time series of
`By considering ECG signal as a
`U=u0,u1,…,uT-1, divide this series to a consequential time
`series of length n into a contiguous subperiods. Each
`=
`−
`10
`1
`j
`,
`a,...,
`subperiod Ij is labeled with
` and each element
`
`
`Fig. 3. Approximation of the correlation dimension D2 from the log r,
`m
`
` (r) plot.
`log C
`B. Extracted features from ECG signal
`In this section, ECG signal is divided into 8 consequential
`RR intervals and by employing fractal dimension, lyapunov
`exponent and Hurst exponent, follow nonlinear features are
`extracted.
`
`Fractal Dimension: The concept of fractal dimension that
`refers to a non-integer or fractional dimension originates
`from fractal geometry. This feature has been used in the
`analysis of ECG and EEG to identify and distinguish
`specific states of physiologic function [35]. Box-counting
`
`
`dimension method is used to calculate the fractal dimension.
`The fractal dimension of an ECG signal is based on chaos
`theory and is a quantitative measurement of the roughness of
`that signal. It can be used as an effective parameter in
`categorizing heart arrhythmias. To calculate
`fractal
`dimension, each dimension of the signal is divided to S
`segments and the box which contain ECG signals are
`counted. N is the number of such boxes.
`Nlog
`
`in Ij is labeled N[j][k] such that k = 0, 1,..., n-1. For each
`subperiod Ij, the mean value is calculated and data scale is
`normalized by Eqs. (14) and (15):
`
`
`E
`
`=
`
`j ∑
`
`1
`
`n
`
`n
`
`−
`1
`
`k
`
`=
`
`0
`
` ,]k][j[N
`
` ) 14(
`
`]k][j[X
`
`k
`
`=∑
`
`−
`k),E]k][j[N(
`j
`
`=
`
`−
` .n,...,
`210
`1
`,,
`
`)15(
`
`−
`
`k][j[Xmin(
`
`]).
`
`
`
`) 16(
`
`
`
`
`
`=
`)S(
`
`.
`
`=
`0
`i
`The sample standard deviation calculated for each
`subperiod Ij is:
`=
`R
`max(
`])k][j[X
`jI
`−≤≤
`1
`0
`k
`n
`For each subperiod Ij, the value of R vector is calculated
`by Eq. (17).
`−
`1
`n
`
`1 ∑
`
`−
`)E]k][j[N(
`j
`
`2
`
`21
`)
`
`.
`
`
`
`
`
`
`
`) 17(
`
`0
`k
`The RIj is normalized with respect to a particular length n
`with Eq. (18).
`a
`
`=
`0
`j
`As a result amplitude Rij is always non-negative. By
`increasing n such that T/n is always an integer, this
`procedure is repeated until n=T/2. Hurst offered these
`equations by using the half rule in statistics.
`
`
`
`log(
`
`=
`
`log(
`
`+
`H)c
`
`log(
`
` ),n
`
`)S/R
`n
`in which R is the rescaled amplitude. S is the standard
`deviation of the time series, c is a constant, n is the length of
`the subperiod and the slope of the equation is the estimate of
`the Hurst exponent, H. According to Hurst results, if the
`
`(19)
`
`−
`1
`
`j∑
`
`SR
`
`II
`
` .
`
`)18(
`
`j
`
`
`
`S
`
`I j
`
`=
`
`(
`
`n
`
`
`
`)SR(
`n
`
`=
`
`=
`
`1
`
`a
`
`dim
`box
`
`lim
`
`
`
`
`
`)12(
`
`log
`S
`0
`S
`In practice, this limit value is approximated by slope of
`
`Slog when m is decreased.
`Nlog
` versus
`
`
`
`
`the
`system,
`In a dynamic
`Lyapunov Exponent:
`dependence on initial condition is described by Lyapunov
`Exponent (LE), and calculates the rate of deviating from
`roots. A positive LE shows that the distance of two points is
`increasing exponentially. This means that system tends to
`chaos. A negative LE indicates stable behaviour and a zero
`LE means that two close points on a root, keep their
`distance. By considering ECG signal as a time series of
`X=x0, x1,…,xn-1, it can be calculated as follows [34]:
`The procedure is started at t0 with x0, next the point in the
`time series which is closest to x0 is found and is called xi.
`Then d0 the absolute difference of these two points is
`calculated. dc0 is the absolute difference of the two
`consecutive points of x0 and xi, namely x1 and xi+1. The same
`procedure is repeated for points x1 to xn-1 and d1 to dn-2 and
`
`
`
`8
`
`5
`
`
`
`Canadian Journal on Artificial Intelligence, Machine Learning and Pattern Recognition Vol. 3 No. 1, January 2012
`
`value of Hurst exponent is equal to 0.5 it indicates an
`independent time series. It means that there is no correlation
`between any of the current element and a future element. If
`the Hurst exponent is between 0.5 and 1, it indicates a
`persistent behaviour. It means that if the time series has been
`increasing for a while, it is likely to increase for another
`period, and if the time series has been decreasing, it
`probably continue decreasing. Finally, if the value of the
`Hurst exponent is positive but less than 0.5, it indicates an
`anti-persistent behaviour. It means that if the time-series
`increases, it is more likely to decrease in future, and vice-
`versa.
`
` Fig. 4. Schematical overview of GP.
`
`
`
`
`accuracy of output of genetic programming with respect to
`ideal output.
`
`
`B. Generating new population
`To generate a new population, some processes should be
`repeated until the number of new population becomes equal
`to number of old population.
`
`
`C. Genetic operators
`After defining the fitness function, genetic operators
`should also be defined in order to improve the population.
`Common genetic operators are reproduction, crossover and
`mutation. In reproduction, 10% of population in each
`generation is transmitted to the new population without any
`changes. The procedure of choosing individuals to transmit
`to the new population depends on selecting strategy. In
`crossover, random nodes are chosen from both parent trees,
`and the subtrees starting at these two nodes are swapped
`resulting in two offspring. There is no bias in choosing
`internal or terminal nodes as the crossing sites. In tree
`mutation, a random node is chosen from the parent tree and
`the subtree starting by a randomly generated tree. This new
`random tree is created with the Grow initialization method
`and obeys the size/depth restrictions imposed on the trees
`created for the initial generation.
`
`
`D. Selection strategy
`A genetic programming without a suitable selection
`strategy is not different than random search [36]. Some
`
`TABLE I
` THE RATIO OF TRAINING DATA TO DISTINGUISH EACH ARRHYTHMIA
`FROM OTHERS.
`
`Arrhythmia
` NB
`LBBB
`
`Type
`
`
`RBBB
`
`PVC
`
`
`FUSION
`
`APC
`
`
`PACE
`
`
`
`9
`
`NB
`
`LBBB
`
`RBBB
`PVC
`
`FUSION
`APC
`PACE
`
`
`300
`
`60
`
`30
`35
`
`35
`
`10
`20
`20
`
`
`120
`35
`
`35
`
`10
`20
`20
`
`60
`
`30
`200
`
`
`35
`
`10
`20
`20
`
`60
`
`30
`35
`
`
`
`200
`
`10
`20
`20
`
`22
`
`22
`22
`
`22
`
`30
`20
`20
`
`50
`
`30
`35
`
`35
`
`50
`
`30
`35
`
`35
`
`10
`120
`
`20
`
`10
`20
`120
`
`
`
`
`correctly
`classified
`
`(%)
`
`
`
`99.00
`
`
`
`
`95.86
`95.34
`
`
`
`98.21
`
`
`98.36
`
`98.47
`
`100
`
`V. GENETIC PROGRAMMING
`
`Several evolutionary algorithms have already been
`offered by different researchers. The difference between
`these algorithms is in process of representing chromosomes,
`fitness function and the way that it works. In this article, we
`offer a genetic programming to classify arrhythmias. Genetic
`programming was first introduced by Koza using tree
`representation [36]. A member in a genetic programming has
`tree structure and as a result two types of genes, called
`functions and terminals are defined in this method. In fact,
`intermediate nodes act as functions and the leaves act as
`terminals. In our method, extracted features and some
`random numbers are used as terminals of the tree. The
`initialization of population is random and each tree is
`evaluated by fitness function. Different genetic operators
`like crossover and mutation are used to make new trees and
`the new generation is generated. As shown in Fig. 4, this
`procedure continues until the ending requirements are met.
`The procedure is explained in details in the following sub
`section.
`A. Create an initial population:
`The initial population is created by ramped half-and-half
`method. In this method, both Grow and Full methods are
`used to produce the initial population [36]. For each depth
`level of trees considered, half of the individuals are
`initialized using the Full method, and the other half using the
`Grow method. The population of trees resulting from this
`initialization method is very diverse with balanced and
`unbalanced
`trees of
`several different depths. The
`individual`s terminals are selected from extracted features
`and a set of random numbers. In this paper, we use the
`following operators as function set: add, subtract, times,
`divide, sin, cos and if-then-else structures. While calculating
`X1/X2, if X2 is zero, X1 is returned as the response. The if-
`then-else structure is defined as follows:
`
`
` (21)
`
`<
`0
`a
`Otherwise
`
`
`
`c
`b
`
`)c,b,a(myif
`
`=
`
`
`
`
`When the initial population is created, each individual`s
`fitness is measured. The value of each individual reflects the
`
`
`
`6
`
`
`
`Canadian Journal on Artificial Intelligence, Machine Learning and Pattern Recognition Vol. 3 No. 1, January 2012
`
`common approaches for selecting suitable parents are fitness
`proportionate selection, Greedy selection and Tournament
`selection. In this paper Lexicographic Parsimony Pressure
`Tournament strategy is chosen for selecting parents. In this
`strategy, like the Tournament selection, a random number of
`individuals are selected from the population and the best of
`them is determined. The main difference is that if two
`individuals are equally fit, the tree with fewer nodes is
`chosen as the best. This technique has shown to effectively
`control bloat in different types of problems [37].
`
`VI. RESULT
`
`To classify arrhythmia of Normal beat (NB), Left Bundle
`Branch Block beat (LBBB), Right Bundle Branch Block
`beat (RBBB), Premature Ventricular Contraction (PVC),
`fusion of ventricular and normal beat (FUSION), Atrial
`Premature Contraction (APC) and paced beat (PACE),
`genetic programming produce
`the appropriate
`tree
`to
`distinguish any arrhythmia from others. For avoiding bias,
`the number of training data of each specific arrhythmia type
`has about the same size of the summation of other classes.
`Table I shows the ratio of training data and the percentage of
`correctly classified training data in each class.
`Table II shows the results of test data for each class. Seven
`optimum trees were derived to distinguish each class from
`other classes by genetic programming. The four commonly
`used measures of
`sensitivity,
`specificity, positive
`predictivity, and accuracy are derived for the proposed
`algorithm [6,19]. As seen, the proposed method can
`discriminate the NB with an accuracy of 98.75%, the LBBB
`with 98.93%, RBBB with 99.10%, PVC with 99.46%,
`FUSION with 99.82%, APC with 99.46% and PACE
`99.82%. These results demonstrate the effectiveness of the
`proposed
`arrhythmia
`classification
`algorithm
`in
`discriminating the seven types of arrhythmias.
`
`7. Discussion
`
`In this study, an automatic procedure for detection of
`arrhythmias using ECG and HRV signals has been
`developed. For classification of arrhythmias, a genetic
`programming algorithm proposed which can select effective
`
`ApEn
`
`yes
`
`SpEn
`
`yes
`
`TABLE II
` THE CLASSIFICATION RESULT OF CARDIAC ARRHYTHMIAS.
`Correlation
`yes
`Dimension
`
`Poincare´
`plot
`
`features to detect each type of arrhythmia. Table III shows
`the effective
`features
`that are selected by genetic
`programming to distinguish each arrhythmia from others. As
`it can be seen in Table III, each optimum tree use features
`from time domain, frequency domain and nonlinear feature
`for arrhythmia detection. All of features are used to create
`optimum trees.
`the arrhythmia
`Several researchers have addressed
`detection and classification problem using the ECG or HRV
`signals. A summary of different methods together with their
`reported results in terms of the four commonly used
`measures of sensitivity, specificity, positive predictivity, and
`accuracy is showed in Table IV. Most papers have focused
`on the detection of a single arrhythmia type (mostly the VF
`and AF) within normal sinus rhythms. In [2] the authors
`have detected ventricular fibrillation with four techniques.
`The author of [11] has classified the ECG signals into the
`normal or arrhythmic classes via Adaptive Neuro-Fuzzy
`Inference System (ANFIS) and feature reduction by Linear
`Discriminate Analysis (LDA). In another attempt, different
`heart rhythms were detected and classified into the three
`arrhythmia types using short-time multifractality and fuzzy
`Kohonen network [9]. In another study, arrhythmias are
`classified into 6 classes using feature dimensions reduction
`
`TABLE III
` EFFECTIVE FEATURES THAT ARE SELECTED TO DISTINGUISH EACH
`ARRHYTHMIA FROM OTHERS.
`
`Features NB LBBB RBBB PVC FUSION APC PACE
`
`Mean
`
`Yes
`
`RMSSD
`
`SDNN
`
`SDSD
`
`pNN50
`
`pNN10
`
`yes
`
`yes
`
`yes
`
`yes
`
`yes
`
`yes
`
`yes
`
`yes
`
`yes
`
`yes
`
`yes
`
`yes
`
`yes
`
`yes
`
`yes
`
`pNN5
`
`Frequency
` domain
`
`yes
`
`yes
`
`yes
`
`yes
`
`yes
`
`yes
`
`yes
`
`
`Arrhythmia type
`
`
`Number of segment
`
`LBBB NB
`
`RBBB
`
`
`
` FUSION PVC
`
`Fractal
`PACE APC
`Dimension
`
`NB
`LBBB
`RBBB
`PVC
`
`FUSION
`APC
`PACE
`
`Average
`
`251
`52
`56
`90
`24
`30
`56
`
`-
`
`247
`3
`49 1
`0 1
`0 1
`0 0
`0 0
`0 0
`
`1
`0
`2
`0
`54
`1
`88 0
`0
`0
`0
`0
`0
`0
`
`-
`
`-
`
`-
`
`-
`
`0
`0
`0
`0
`23
`0
`0
`
`-
`
`0 0
`Lyapunov
`0 0
` Exponent
`0 0
`Hurst
`0 1
` Exponent
`0 1
`29
`1
`56 0
`
`10
`
`
`
`Sensitivity*
`
`yes
`(%)
`98.41
`
`yes
`94.23
`96.43
`97.78
`yes
`95.84
`96.67
`100
`
`
`
`Specificity*
`yes
`
`(%)
`99.03
`yes
`99.41
`99.40
`99.79
`yes
`100
`99.62
`99.80
`
`yes
`
`Positive*
`
`Predictivity
`(%)
`98.80
`
`94.23
`94.74
`98.88
`
`100
`93.55
`98.25
`
`*Sensitivity= TP/ (TP+ FN), Specificity= TN/ (TN +FP), Positive Predictivity= TP/ (TP+FP), Accuracy= (TP +TN)/ (TP+ FN+TN+FP), where TP true
`positive, FN false negative, TN true negative, and FP false positive.
`
`-
`
`-
`
`97.05
`
`99.58
`
`96.92
`
`99.33
`
`yes
`
`yes
`
`yes
`
`yes
`
`yes
`
`yes
`
`yes
`
`Accuracy*
`yes
`yes
`(%)
`
`
`98.75
`yes
`98.93
`99.10
`99.46
`yes
`
`99.82
`99.46
`99.82
`
`7
`
`
`
`Canadian Journal on Artificial Intelligence, Machine Learning and Pattern Recognition Vol. 3 No. 1, January 2012
`
`by LDA and support vector machine [22]. In [21] a classifier
`is developed based on wavelet transforms for extracting
`features and then using a Radial Basis Function Neural
`Network (RBFNN) to classify the arrhythmias. The authors
`of [13] have classified the ECG signal segments into the
`normal or arrhythmic classes. It is based on time analysis
`and time-frequency analysis features via neural networks. In
`a recent paper [17] six different types of arrhythmia classes
`are differentiated. It is based on the LDA and Generalized
`Discriminate Analysis (GDA) feature reduction scheme and
`
`the SVM classifier. The effective ECG and HRV based
`algorithm which is proposed in the current work provides a
`better accuracy over a wider range of different types of
`cardiac arrhythmia, comparing
`to other studies. The
`proposed method can select the effective features to create
`each type of arrhythmia. The only signals used in this
`method are selected from lead II. Since the features are
`extracted from both HRV and ECG signals, the proposed
`method is not sensitive to noise and has the ability to classify
`different types of arrhythmias.
`
`11
`
`8
`
`
`
`Positive
`Specificity
`Method
`Dataset
`Arrhythmia type
`Sensitivity
`Author
`(%)
`Predictivity
`
`
`Canadian Journal on Artificial Intelligence, Machine Learning and Pattern Recognition Vol. 3 No. 1, January 2012
`(%)
`
`(%)
`
`
`Accuracy
`(%)
`
`TABLE IV
` SUMMARY OF PERVIOUS WORK.
`
`Clayton et al. [2]
`
`Threshold crossing intervals
`Autocorrelation function
`VF filter leakage
`Signal spectrum shape
`
`ECG
`
`Ventricular fibrillation
`
`Sengur [11]
`
`ANFIS+ feature reduction
`by LDA
`
`ECG
`
`Wang et al. Error!
`Reference source
`not found.
`
`Short-Time Multifractal
`
`ECG
`
`Song et al.[22]
`
`Support Vector
`Machine+feature reduction
`by LDA
`
`ECG
`
`AlFahoum et al. [21]
`
`wavelet transformation and
`radial basis neural networks
`
`ECG
`
`Tsipouras et al.[13]
`
`Time analysis
`t-f analysis
`
`HRV
`
`Mohammadzadeh Asl
`et al.[17]
`
`SVM+linear and nonlinear
`analysis
`
`HRV
`
`Mohammadzadeh Asl
`et al.[17]
`
`SVM+feature reduction by
`GDA
`
`HRV
`
`Proposed method
`
`Genetic programming HRV+ECG
`
`Normal
`Abnormal
`Ventricular Tachycardia
`Ventricular fibrillation
`Atrial fibrillation
`
`Normal sinus rhythm
`APC
` supraventricular
`tachycardia
`PVC
`ventricular tachycardia
`ventricular fibrillation
`
`Normal
`Atrial fibrillation
`Ventricular tachycardia
`Ventricular fibrillation
`
` Normal
`arrhythmatic
`
`Normal sinus rhythm
`PVC
`Atrial fibrillation
`Ventricular fibrillation
`sick sinus syndrome
`2° heart block
`
`Normal sinus rhythm
`PVC
`Atrial fibrillation
`Ventricular fibrillation
`sick sinus syndrome
`2° heart block
`
`NB
`LBBB
`RBBB
`PVC
`FUSION
`APC
`PACE
`
`46
`67
`77
`53
`
`95.9
`
`95.0
`98.3
`98.3
`
`99.657
`88.294
`82.821
`92.157
`91.695
`99.751
`
`92.51
`95.2
`100
`100
`
`87.53
`89.95
`
`97.86
`100
`92.59
`65
`100
`100
`
`99.25
`94.74
`94.63
`86
`100
`100
`
`
`98.41
`
`94.23
`
`96.43
`
`97.78
`
`95.84
`
`96.67
`
`100
`
`72
`38
`55
`93
`
`94
`
`99.2
`96.7
`100
`
`96.215
`99.951
`100
`98.937
`99.632
`99.993
`
`97.5
`85.7
`100
`100
`
`89.48
`92.91
`
`96.72
`98.65
`99.72
`98.19
`100
`100
`
`98.47
`99.14
`99.72
`99.07
`100
`100
`
`
`98.80
`
`94.23
`
`94.74
`
`98.88
`
`
`100
`93.55
`
`98.25
`
`
`97.8
`97.2
`99.4
`
`99.307
`99.274
`99.854
`98.344
`99.441
`99.883
`
`97.41
`98.70
`98.06
`96.76
`100
`100
`
`98.94
`98.96
`98.53
`98.51
`100
`100
`
`
`98.41
`
`94.23
`
`96.43
`
`97.78
`
`95.84
`
`96.67
`100
`
`
`97.86
`76.00
`99.01
`68.42
`100
`100
`
`99.00
`82.57
`99.03
`80.75
`100
`100
`
`
`99.03
`
`99.41
`
`99.40
`
`99.79
`100
`
`
`99.62
`
`99.80
`
`
`8. Conclusion
`
`In closing, for classification of cardiac arrhythmia, each
`ECG signals divi