Series
`
`
`
`Treating Individuals 2
`
`Subgroup analysis in randomised controlled trials:
`
`importance, indications, and interpretation
`Peter M Rothwell
`
`Lancet 2005; 365: 176–86
`
`Stroke Prevention Research Unit, University Department of
`Clinical Neurology, Radcliffe Infirmary, Oxford OX2 6HE, UK
`(P M Rothwell FRCP)
`
`Large pragmatic trials provide the most reliable data about the effects of treatments, but should be designed,
`analysed, and reported to enable the most effective use of treatments in routine practice. Subgroup analyses are
`important if there are potentially large differences between groups in the risk of a poor outcome with or without
`treatment, if there is potential heterogeneity of treatment effect in relation to pathophysiology, if there are
`practical questions about when to treat, or if there are doubts about benefit in specific groups, such as elderly
`people, which are leading to potentially inappropriate undertreatment. Analyses must be predefined, carefully
`justified, and limited to a few clinically important questions, and post-hoc observations should be treated with
`scepticism irrespective of their statistical significance. If important subgroup effects are anticipated, trials should
`either be powered to detect them reliably or pooled analyses of several trials should be undertaken. Formal rules
`for the planning, analysis, and reporting of subgroup analyses are proposed.
`
`Introduction
`
`“The essence of tragedy has been described as the
`destructive collision of two sets of protagonists, both
`of whom are correct. The statisticians are right in
`denouncing subgroups that are formed post hoc from
`exercises in pure data dredging. The clinicians are also
`right, however,
`in insisting that a subgroup is
`respectable and worthwhile when established a priori
`from pathophysiological principles.”
`
`A R Feinstein, 19981
`
`Randomised controlled trials (RCTs) and systematic
`reviews are the most reliable methods of determining the
`effects of treatments. However, when trials were first
`developed for use in agriculture, researchers were
`presumably concerned about the effect of interventions on
`the overall size and quality of the crop rather than on the
`wellbeing of any individual plant. Clinicians have to make
`decisions about individuals, and how best to use results of
`RCTs and systematic reviews to do this has generated
`considerable debate. Unfortunately, this debate has
`polarised, with statisticians and predominantly
`nonclinical (or non-practising) epidemiologists warning of
`the dangers of subgroup analysis and other attempts to
`target treatment, and clinicians warning of the dangers of
`applying the overall results of large trials to individual
`patients without consideration of pathophysiology or other
`determinants of individual response. This rift, described
`by Feinstein as a “clinicostatistical tragedy”,1 has been
`widened by some of the more enthusiastic proclamations on
`the extent to which the overall results of trials can
`properly inform decisions at the bedside or in the clinic.
`
`The results of small explanatory trials with well-defined
`eligibility criteria should be easy to apply, but
`generalisability is often undermined by highly selective
`recruitment, resulting in trial populations that are
`unrepresentative even of the few patients in routine
`practice who fit the eligibility criteria. Recruitment of a
`higher proportion of eligible patients is a major strength
`of large pragmatic trials, but deliberately broad and
`sometimes ill-defined entry criteria mean that the overall
`result can be difficult to apply to particular groups, and
`that subgroup analyses are necessary if heterogeneity of
`treatment effect is likely to occur. Yet despite the adverse
`effects on patient care that can result from misinterpreted
`or inappropriate subgroup analyses (table 1), there are no
`reviews or guidelines on the clinical indications for
`subgroup analysis and no consensus on the implications for
`trial design, analysis, and interpretation of subgroup
`effects, and the CONSORT statement on reporting of trials
`includes only a few lines on subgroup analysis. This
`article discusses arguments for and against subgroup
`analyses, the clinical situations in which they can be
`useful, and rules for their performance and interpretation.
`Illustrative examples are taken mainly from treatments
`
`Table 1 (caption illegible in source)
`
`Aspirin is ineffective in secondary prevention of stroke in women
`Antihypertensive treatment for primary prevention is ineffective in women
`[row illegible in source]
`Angiotensin-converting enzyme inhibitors do not reduce mortality and hospital admission
`in patients with heart failure who are also taking aspirin
`β blockers are ineffective after acute myocardial infarction in elderly people, and in patients
`with inferior myocardial infarction
`Thrombolysis is ineffective >6 hours after acute myocardial infarction
`Thrombolysis for acute myocardial infarction is ineffective or harmful in patients
`with a previous myocardial infarction
`Tamoxifen citrate is ineffective in women with breast cancer aged <50 years
`[start of row illegible in source] only low-dose aspirin due to increased operative risk
`Amlodipine reduces mortality in patients with chronic heart failure due to non-ischaemic
`cardiomyopathy but not in patients with ischaemic cardiomyopathy
`
`www.thelancet.com Vol 365 January 8, 2005
`For personal use. Only reproduce with permission from Elsevier Ltd
`
`Biogen Exhibit 2070
`Mylan v. Biogen
`IPR 2018-01403
`
`for cerebrovascular or cardiovascular disease but the
`principles are relevant to all areas of medicine and
`surgery.
`
`Arguments against subgroup analysis
`
`“ . . . it would be unfortunate if desire for the perfect (ie,
`knowledge of exactly who will benefit from treatment)
`were to become the enemy of the possible (ie, knowledge
`of the direction and approximate size of the effects of
`treatment of wide categories of patient).”
`
`S Yusuf et al, 19844
`
`The main argument against subgroup analysis is that
`qualitative heterogeneity of relative treatment effect
`(defined as the treatment effect being in different
`directions in different groups of patients, ie, benefit in
`one subgroup and harm in another) is very rare.2–5
`However, this observation is much less reassuring than it
`seems. First, it automatically excludes most treatments
`because they do not have a substantial risk of harm and
`can only be effective or ineffective. Yet use of an
`ineffective treatment can be highly detrimental if this
`prevents the use of a more effective alternative or if
`adverse effects impair quality of life. Second, the
`
`Panel 1: Rules of subgroup analysis: a proposed guideline for design, analysis, interpretation, and reporting
`
`Trial design
`• Subgroup analyses should be defined before starting the
`  trial and should be limited to a small number of clinically
`  important questions.
`• Expert clinical input into the design of subgroup analyses is
`  needed to ensure that all relevant baseline clinical and
`  other data are recorded.
`• The direction and magnitude of anticipated subgroup
`  effects should be stated at the outset.
`• The exact definitions and categories of the subgroup
`  variables should be defined explicitly at the outset in order
`  to avoid post hoc data-dependent variable or category
`  definitions. For continuous or hierarchical variables the
`  cut-off points for analysis should be predefined.
`• Stratification of randomisation by important subgroup
`  variables should be considered.
`• If important subgroup-treatment effect interactions are
`  anticipated, trials should ideally be powered to detect them
`  reliably.
`• Trial stopping rules should take into account anticipated
`  subgroup-treatment effect interactions and not simply the
`  overall effect of treatment.
`• If relative treatment effect is likely to be related to baseline
`  risk, the analysis plan should include a stratification of the
`  results by predicted risk. The risk score or model should be
`  selected in advance so that the relevant baseline data can
`  be recorded.
`
`Analysis and reporting
`• The above design issues should be reported in the methods
`  section along with details of how and why subgroups were
`  selected.
`• Significance of the effect of treatment in individual
`  subgroups should not be reported; rates of false negative
`  and false positive results are extremely high. The only
`  reliable statistical approach is to test for a
`  subgroup-treatment effect interaction.
`• All subgroup analyses that were done should be reported,
`  ie, not only the number of subgroup variables but also the
`  number of different outcomes analysed by subgroup,
`  different lengths of follow-up etc.
`• Significance of pre hoc subgroup-treatment effect
`  interactions should be adjusted when multiple subgroup
`  analyses are done.
`• Subgroup analyses should be reported as absolute risk
`  reductions and relative risk reductions. Where relevant the
`  statistical significance of differences in absolute risk
`  reductions should be tested.
`• Ideally, only one outcome should be studied and this
`  should usually be the primary trial outcome, irrespective of
`  whether this is one outcome or a clinically important
`  composite outcome.
`• Comparability of treatment groups for prognostic factors
`  should be checked within subgroups.
`• If multiple subgroup-treatment effect interactions are
`  identified, further analysis is needed to check whether their
`  effects are independent.
`
`Interpretation
`• Reports of the significance of the effect of treatment in
`  individual subgroups should be ignored, especially
`  reports of lack of benefit in a particular subgroup in a trial
`  in which there is overall benefit, unless there is a
`  significant subgroup-treatment effect interaction.
`• Genuine unanticipated subgroup-treatment effect
`  interactions are rare (assuming that expert clinical
`  opinion was sought in order to pre-define potentially
`  important subgroups) and so apparent interactions that
`  are discovered post hoc should be interpreted with
`  caution. No test of significance is reliable in this situation.
`• Pre hoc subgroup analyses are not intrinsically valid and
`  should still be interpreted with caution. The false
`  positive rate for tests of subgroup-treatment effect
`  interaction when no true interaction exists is 5% per
`  subgroup.
`• The best test of validity of subgroup-treatment effect
`  interactions is their reproducibility in other trials.
`• Few trials are powered to detect subgroup effects and so
`  the false negative rate for tests of subgroup-treatment
`  effect interaction when a true interaction exists will usually
`  be high.
`
`
`observation refers only to so-called unanticipated
`heterogeneity.2–5 As outlined below, there are many
`examples in which qualitative heterogeneity of relative
`treatment effect has been correctly anticipated. Third, the
`observation only applies to single outcome events; it is
`argued that subgroup analyses based on composite
`outcomes are inappropriate.2–5,51 However, since qualitative
`heterogeneity of relative treatment effect is only possible
`for treatments that have a risk of harm, and such
`treatments almost always need a composite outcome to
`express the balance of both risk and benefit, qualitative
`heterogeneity as defined will inevitably be rare—a Catch-
`22, in fact.
`There are several other arguments against attempts to
`target treatment. First, it is said that clinicians already tend
`to undertreat patients,52 and we should not risk effective
`treatments being further restricted. However, one of the
`main purposes of subgroup analysis is to extend the use of
`treatments to subgroups that are not currently treated in
`routine practice. Subgroup analyses in epidemiological
`studies and trials often show that benefit from treatment
`is likely to be more universal than expected and that
`current indications for treatments in routine clinical
`practice are inappropriately narrow, as is now clear, for
`example, with treatment thresholds for blood pressure
`lowering or lipid lowering.53,54 Second, it is argued that
`subgroup analyses are almost always underpowered,55–60
`
`Panel 2: The four main clinical indications for subgroup
`analysis
`
`Potential heterogeneity of treatment effect related to risk
`• Differences in risks of treatment
`• Differences in risk without treatment
`
`Potential heterogeneity of treatment effect related to
`pathophysiology
`• Multiple pathologies underlying a clinical syndrome
`• Differences in the biological response to a single
`  pathology
`• Genetic variation
`
`Clinically important questions related to the practical
`application of treatment
`• Does benefit differ with severity of disease?
`• Does benefit differ with stage in the natural history of
`  disease?
`• Is benefit related to the timing of treatment after a
`  clinical event?
`• Is benefit dependent on comorbidity?
`
`Underuse of treatment in routine clinical practice due to
`uncertainty about benefit
`• Underuse of treatment in specific groups of patients, eg,
`  elderly people
`• Confinement of treatment according to a narrow range of
`  values of a relevant physiological variable, eg, treatment
`  thresholds for cholesterol level or blood pressure
`
`but this is simply an argument for larger trials and for
`meta-analysis of individual patient data. Third, it has also
`been argued that false positive subgroup effects might be
`more common than genuine heterogeneity,2–5,55–60 and
`these
`false observations might harm patients—
`“subgroups kill people.”61 Subgroup analyses have
`certainly led to mistaken clinical recommendations (table
`1), but these analyses would not have satisfied the rules
`suggested in panel 1. Moreover, not doing subgroup
`analysis can also be harmful. Properly powered subgroup
`analyses most commonly show that relative treatment
`effect is consistent across subgroups and/or that
`treatments should be used more extensively than is
`currently the case.53,62,63 Without such evidence, unfounded
`clinical concerns about possible heterogeneity or
`inappropriately narrow indications for treatment would
`reduce the use of effective treatments in routine practice.26
`Not doing subgroup analyses has very probably killed
`more people.
`
`Situations in which subgroup analyses should be
`considered
`
`“The tragedy of excluding cogent pathophysiologic
`subgroup analyses merely because they happen to be
`subgroups will occur if statisticians do not know the
`distinction, and if clinicians who do know it remain
`mute, inarticulate or intimidated.”
`
`A R Feinstein, 19981
`
`Subgroup analyses should be predefined and carefully
`justified. Feinstein and others have emphasised the need
`for determination of pathophysiological heterogeneity,
`but there are three other indications for subgroup analysis
`(panel 2), each of which is discussed below and which are
`probably more important.
`
`Heterogeneity related to risk
`Clinically important heterogeneity of treatment effect is
`common when different groups of patients have very
`different absolute risks with or without treatment. The
`need for reliable data about risks and benefits in
`subgroups and individuals is greatest for potentially
`harmful interventions, such as warfarin or carotid
`endarterectomy, which are of overall benefit but that kill
`or disable a proportion of patients. However, evidence-
`based guidelines usually recommend these treatments in
`all cases similar to those in the relevant RCTs.64–66 In
`considering this approach, it is useful to draw an analogy
`with the criminal justice system. Suppose that research
`showed that individuals charged by the police with
`specific crimes were usually guilty. Few would argue that
`they should therefore be sentenced without trial.
`Automatic sentencing would, on average, do more good
`than harm, with most criminals correctly convicted, but
`any avoidable miscarriages of justice are widely regarded
`as unacceptable. In contrast, relatively high rates of
`
`[Table 2: stroke risk by systolic blood pressure in patients with and without flow-limiting (≥70%) carotid stenosis who were randomly assigned to medical treatment in RCTs of endarterectomy; table body and caption illegible in source]
`
`relation to underlying pathology is seen with thrombolysis
`for acute ischaemic stroke, with aspirin in primary
`prevention of vascular disease (in which benefit may be
`largely confined to men with elevated levels of C-reactive
`protein, probably indicating underlying atherosclerosis),
`and with blood pressure-lowering in secondary prevention
`of transient ischaemic attack and stroke, in which
`guidelines suggest that all patients be treated.
`However, there is clinical concern about patients with
`carotid stenosis or occlusion in whom cerebral perfusion
`is often severely impaired. Table 2 shows stroke risk by
`systolic blood pressure in patients with and without
`flow-limiting (≥70%) carotid stenosis who were randomly
`assigned to medical treatment in RCTs of endarterectomy.
`Major increases in stroke risk were noted in patients with
`flow-limiting stenosis, but only if systolic blood pressure
`was <150 mm Hg: 5-year risk in patients with bilateral
`(≥70%) stenosis was 64·3% versus 24·2% (p=0·002) at higher
`blood pressures. This difference in risk was absent in
`patients who had been randomly assigned to endarterectomy
`(13·4% vs 18·3%, p=0·6), suggesting a causal effect and
`indicating that aggressive blood pressure-lowering would
`very probably be harmful in patients with bilateral severe
`carotid disease in whom endarterectomy was not possible.
`
`Biological heterogeneity
`Subgroup analyses can also be useful when there are
`predictable differences in the biological response to the
`underlying disease. For example, perioperative
`administration of antilymphocyte antibodies reduces
`rejection in cadaveric renal transplantation by 30%, but is
`expensive and has serious adverse effects. Clinical concern
`that benefit might depend on preexisting immune
`sensitisation prompted a meta-analysis of individual
`patient data from five RCTs. As predicted, treatment was
`highly effective in sensitised patients (hazard ratio for
`allograft failure at 5 years=0·20, 95% CI=0·09–0·47) but
`was ineffective in the remaining 85% (0·97, 0·71–1·32). The
`subgroup-treatment effect interaction was significant
`(p=0·009), ie, the effect of treatment was significantly
`different between the subgroups. A similar pre-specified
`immunological subgroup analysis in a large trial of
`
`treatment-related death or disability (miscarriages of
`treatment) are tolerated by the medical scientific
`community precisely because, on average, treatment will
`do more good than harm. In both situations systems need
`to be in place to avoid doing harm. Yet the contrast
`between the effort that is put into the defence of the
`accused in order to avoid wrongful conviction and the
`very limited efforts of the medical scientific community to
`identify patients at high risk of harm is obvious.
`Admittedly, determination of guilt in a criminal trial is
`based on knowledge of past events, which can often be
`established with certainty, whereas probable benefit or
`harm from medical treatment depends on future events,
`which are usually less certain. However, the probable
`balance of risk and benefit in individual patients can be
`predicted to some extent with subgroup analysis and risk
`models, as has been shown, for example, with carotid
`endarterectomy. In view of the fact that treatment
`complications are now a leading cause of death in
`developed countries, effort is needed to more effectively
`target potentially harmful interventions.
`Differences in the risk of a poor outcome without
`treatment can also lead to clinically important
`heterogeneity of treatment effect. Trial populations are
`often skewed in terms of control group risk, with a few
`individuals contributing much of the observed risk, and
`treatment may be ineffective or harmful in the low risk
`majority. In vascular medicine, this is the case with
`endarterectomy for symptomatic carotid stenosis,
`anticoagulation for uncomplicated non-valvular atrial
`fibrillation, coronary artery bypass grafting, and
`antiarrhythmic drugs after myocardial infarction.75 Clinically
`important heterogeneity of relative treatment effect by
`baseline risk has also been shown for blood pressure
`lowering, aspirin, and lipid lowering in primary
`prevention of vascular disease, and in treatment of acute
`coronary syndromes with clopidogrel, and with
`enoxaparin versus unfractionated heparin. There are
`many similar examples in other areas of medicine, and
`this issue is the subject of the next article in this series.
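
The arithmetic behind heterogeneity by baseline risk can be made concrete with a short sketch. It assumes, purely for illustration, a treatment with the same relative risk reduction and the same absolute risk of treatment-related harm in every patient, so that only the baseline (control group) risk varies. All numbers and the function name are hypothetical; none are taken from the trials cited above.

```python
def net_absolute_benefit(baseline_risk, relative_risk_reduction, harm_risk):
    """Absolute risk reduction minus the absolute risk of treatment-related harm.

    Hypothetical model: the relative risk reduction and the risk of harm are
    assumed constant across patients, so only baseline risk differs.
    """
    absolute_risk_reduction = baseline_risk * relative_risk_reduction
    return absolute_risk_reduction - harm_risk


# Hypothetical values: 30% relative risk reduction, 3% risk of serious harm.
RRR = 0.30
HARM = 0.03

for baseline in (0.02, 0.05, 0.15, 0.30):
    net = net_absolute_benefit(baseline, RRR, HARM)
    verdict = "net benefit" if net > 0 else "net harm"
    print(f"baseline risk {baseline:4.0%}: net change {net:+.1%} ({verdict})")
```

Under these assumptions the same treatment does net harm in patients at low baseline risk and net good in those at high baseline risk, even though the relative effect is identical in every group.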
`
`Pathophysiological heterogeneity
`Differences between groups of patients in underlying
`pathology, biology, or genetics can each lead to clinically
`important heterogeneity of treatment effects. Examples
`will probably be identified more frequently as our
`understanding of the molecular mechanisms of disease is
`enhanced.
`
`Multiple underlying pathologies
`Clinicians often have to treat patients with ill-defined
`clinical syndromes, which probably have many underlying
`pathologies, rather than one disease. Primary generalised
`epilepsy is a typical example in which treatment effects
`differ between patients, probably because of the different
`underlying molecular pathologies. In vascular disease,
`clinically important heterogeneity of treatment effect in
`
`[Figure 1: bar chart. x axis: weeks from event to randomisation (0–2, 2–4, 4–12, 12+); y axis: ARR (%), 95% CI. ARR values printed above the bars, in source order: 30·2, 17·6, 14·8, 11·4, 8·9, 3·3, 4·0, –2·9]
`
`Figure 1: Effect of carotid endarterectomy in patients with 50–69% and
`≥70% symptomatic stenosis in relation to time from last symptomatic
`ischaemic event to randomisation70
`Numbers above bars indicate actual absolute risk reduction. Vertical bars are
`95% CIs. ARR=absolute risk reduction.
`
`benefit in patients with more marked changes
`(interaction p=0·006).107 The stage of disease can also
`determine the effect of treatment of non-vascular
`disease, as is seen in people with cancer,108,109 or
`HIV/AIDS.110–112
`
`Timing of treatment and comorbidity
`Effect of treatment is often critically dependent on
`timing, as shown in figure 1, for benefit from
`endarterectomy
`for recently symptomatic carotid
`stenosis. The risk of a stroke is very high during the first
`few days and weeks after a transient ischaemic attack,113
`especially in patients with carotid stenosis,114 but falls
`rapidly with time, as therefore does benefit from
`endarterectomy.70 Similar time-dependence has been
`shown for benefit from thrombolysis for both acute
`myocardial infarction106 and acute ischaemic stroke.115
`Treatment effects may also depend on comorbidity.
`For example, angiotensin-converting enzyme inhibitors
`and angiotensin II receptor blocking drugs are harmful
`in patients with renovascular disease but highly
`beneficial in other hypertensive patients.116 Benefit from
`diltiazem after myocardial infarction may depend on
`the presence of heart failure because of the negative
`chronotropic and inotropic effects of the drug.117
`
`Underuse of treatment in specific groups
`Treatments that are effective in trials are often underused
`in specific groups of patients in routine practice. For
`example, statins were not used in elderly people for many
`years until the drugs were proved highly effective by
`subgroup analysis in the Heart Protection Study.53 Proof
`of some benefit by subgroup analysis was also needed to
`counter underuse in elderly patients of thrombolysis for
`acute myocardial infarction,106 and
`similar underuse of endarterectomy for symptomatic
`carotid stenosis.70 In each case, treatment had already
`been shown to be highly effective overall. Use of
`treatment in routine clinical practice is also often
`inappropriately limited to patients with measurements of
`
`roxithromycin versus placebo after coronary
`angioplasty showed that treatment reduced restenosis and
`the need for revascularisation if the titre of Chlamydia
`pneumoniae antibody was high but was ineffective or
`harmful if the titre was low (interaction p=0·006).95
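
When only subgroup estimates and their confidence intervals are published, a rough check of a subgroup-treatment effect interaction can be made by comparing the two estimates on the log scale. The sketch below applies this standard approximation to the hazard ratios quoted above for antilymphocyte antibody treatment (0·20, 95% CI 0·09–0·47 in sensitised patients; 0·97, 0·71–1·32 in the remainder). It assumes that the two subgroups are independent and that the log hazard ratios are approximately normally distributed, and the function name is illustrative. The published interaction test was based on individual patient data, so this approximation should not be expected to reproduce the reported p value.

```python
import math


def interaction_from_estimates(est1, lo1, hi1, est2, lo2, hi2):
    """Approximate z test for a difference between two ratio estimates.

    Each estimate is a ratio (for example a hazard ratio) with its 95% CI.
    Standard errors are recovered from the CI width on the log scale.
    Returns (ratio of the two ratios, z statistic, two-sided p value).
    """
    se1 = (math.log(hi1) - math.log(lo1)) / (2 * 1.96)
    se2 = (math.log(hi2) - math.log(lo2)) / (2 * 1.96)
    diff = math.log(est1) - math.log(est2)
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2)
    z = diff / se_diff
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return math.exp(diff), z, p


# Hazard ratios and 95% CIs as quoted in the text for the two subgroups.
ratio, z, p = interaction_from_estimates(0.20, 0.09, 0.47, 0.97, 0.71, 1.32)
print(f"ratio of hazard ratios = {ratio:.2f}, z = {z:.2f}, two-sided p = {p:.4f}")
```

The point of the calculation is that the comparison is made between the subgroups, not within each one: neither subgroup's own p value enters the test.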
`
`Genetic heterogeneity
`Individuals respond differently to some drugs and this
`tendency can be inherited.96,97 Genotype is an important
`determinant of both the response to treatment and the
`susceptibility to adverse reactions for a wide range of
`drugs.98,99 For example, response to chemotherapy is
`dependent on gene expression in both colon cancer100 and
`breast cancer,101 and HDL cholesterol response to
`oestrogen replacement therapy is highly dependent on
`sequence variants in the gene encoding oestrogen
`receptor α.102 In each of these cases, significant
`subgroup-treatment effect interactions have been reported.
`There is also great interest in the effects of genetics on the
`response to treatment in patients with HIV-1.103 Subgroup
`analyses based on genotype have particular methodological
`problems since many genotypes may be studied and
`analyses will often be post hoc.
`
`Heterogeneity related to practical application
`Many of the arguments used against subgroup analyses
`misinterpret their main function. The main potential
`of subgroup analysis is not in the identification of
`groups that differ in their response to treatment for
`reasons of pathophysiology, but is in answering
`practical questions about how treatments should be
`used most effectively, such as at what stage of the
`disease is treatment most effective, how soon after a
`clinical event is treatment sufficiently safe or most
`effective, or how are the risks and benefits related to
`comorbidity? Subgroup analyses related to questions of
`the practical application of interventions can be vital to
`effective clinical practice.
`
`Severity or stage of disease
`Treatment effects often depend on severity of disease.
`In primary prevention of vascular disease, a pooled
`analysis of RCTs of pravastatin showed that the
`relative risk reduction with treatment increased with
`baseline LDL cholesterol (interaction p=0·01):
`relative risk reduction=3% in the lowest quintile and
`29% in the two highest quintiles.104 In stroke medicine,
`carotid endarterectomy is highly effective for
`≥70% recently symptomatic stenosis, modestly
`effective for 50–69% stenosis, but harmful for <50%
`stenosis (interaction p<0·0001).105 In cardiology,
`thrombolysis for acute myocardial infarction is
`ineffective or harmful in patients with ST segment
`depression, but highly beneficial in patients with ST
`elevation (interaction p<0·01),106 and early invasive
`treatment of unstable angina is of no benefit in patients
`with only minor ST segment change but of major
`
`              Events/patients
`Day of birth  Surgical  Medical  ARR (%)  95% CI          p value
`Sunday        7/56      6/41      3·1     –11·3 to 17·5   0·34
`Monday        4/66      10/44    16·7     3·0 to 30·3     0·008
`Tuesday       8/76      6/28     10·5     –6·9 to 27·9    0·12
`Wednesday     8/67      13/47    18·3     2·3 to 34·2     0·01
`Thursday      9/75      9/36     12·8     –3·8 to 29·4    0·07
`Friday        1/56      6/37     15·1     2·3 to 27·9     0·01
`Saturday      6/51      8/41      9·5     –6·6 to 25·6    0·12
`Total         43/447    58/274   12·3     6·5 to 18·1     <0·001
`Heterogeneity: p=0·83
`
`[Forest plot. x axis: % absolute risk reduction (95% CI), –20 to 40]
`
`Figure 2: Effect of carotid endarterectomy in patients with ≥70% symptomatic stenosis in ECST126 according to day of week on which patients were born
`
`physiological parameters above specific arbitrary cut-off
`points, such as treatment thresholds for blood pressure
`and total cholesterol in prevention of vascular disease.
`There is increasing evidence from subgroup analysis in
`large trials that such thresholds are inappropriate.53,87
`Proof of the generalisability of benefit is a major function
`of subgroup analysis. However, such analyses should be
`sufficiently powered to detect benefit, and pooled
`analyses of multiple trials will often be needed for
`subgroups such as elderly people who are commonly
`under-represented in trials.26
`
`Estimation and interpretation of subgroup
`effects
`
`“Far better an approximate answer to the right question,
`which is often vague, than an exact answer to the wrong
`question, which can always be made precise.”
`
`J W Tukey, 1962118
`
`Multiplicity, post hoc analyses, and publication bias
`In one trial of β blockers after myocardial infarction,119 146
`subgroup analyses were done,120 several of which showed
`apparent differences in the effect of treatment. However,
`none of the differences were confirmed by subsequent
`studies.40 Pocock reviewed 50 trials published in major
`journals in 1997 and noted that 70% reported a median of
`four subgroup analyses,55 which was little changed from
`10 years previously.121 The reliability of these subgroups
`depends to a great extent on whether they were predefined
`and how many other analyses were done but not reported.
`Selective reporting of post hoc subgroup observations,
`which are generated by the data rather than tested by
`them, is analogous to placing a bet on a horse after
`watching the race. There is certainly evidence of selective
`reporting of significant analyses,122–124 but this is difficult to
`judge when assessing an individual trial. The only
`solution is for a small number of potentially important
`subgroups to be pre-defined in the trial protocol, along
`
`with their anticipated directions. Post hoc observations are
`not automatically invalid (many medical discoveries have
`been fortuitous), but they should be regarded as unreliable
`unless they can be replicated.
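
The arithmetic of multiplicity can be sketched directly. Assuming, purely for illustration, that each subgroup analysis is an independent test at the 5% significance level and that no true subgroup effect exists, both the expected number of false positive findings and the chance of at least one grow quickly with the number of analyses. Independence is a simplification, because subgroup analyses within one trial are usually correlated, and the Bonferroni threshold shown is one common adjustment rather than a method prescribed in this article. The function name is illustrative.

```python
def multiplicity_summary(n_tests, alpha=0.05):
    """False positive arithmetic for n independent tests when no effect exists.

    Returns (expected number of false positives,
             probability of at least one false positive,
             Bonferroni-adjusted significance threshold per test).
    """
    expected_false_positives = n_tests * alpha
    prob_at_least_one = 1 - (1 - alpha) ** n_tests
    bonferroni_threshold = alpha / n_tests
    return expected_false_positives, prob_at_least_one, bonferroni_threshold


# 4 is the median number of subgroup analyses in the review quoted above;
# 146 is the number done in the beta blocker trial.
for n in (1, 4, 10, 146):
    expected, at_least_one, threshold = multiplicity_summary(n)
    print(f"{n:3d} analyses: expected false positives = {expected:.2f}, "
          f"P(at least one) = {at_least_one:.3f}, "
          f"Bonferroni threshold = {threshold:.5f}")
```

With 146 analyses, roughly seven nominally significant results would be expected by chance alone under these assumptions, which is why unreported analyses matter so much when a single subgroup finding is judged.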
`
`Statistical significance
`Subgroup analyses can be wrong in two ways.
`First, they can falsely indicate that treatment is beneficial
`in a particular subgroup when the trial shows no overall
`effect—the situation in which subgroup analyses are most
`commonly done.56,57 Simulations of RCTs powered to
`determine the overall effect of treatment suggest that false
`subgroup effects will be noted by chance in 7%–21% of
`analyses depending on other factors.58 More commonly (in
`41%–66% of simulated subgroups) simulations can falsely
`indicate that there is no treatment effect in a particular
`subgroup when the trial shows benefit overall.58 Benefit is
`most likely to be absent in small subgroups, which
`probably explains the recurrent and usually mistaken
`finding that treatments are ineffective in women29,32,125 and
`in elderly people,32,35 who tend to be under-represented in
`RCTs.26 The correct analysis is not the significance of the
`treatment effect in one subgroup or the other, but whether
`the effect differed significantly between the subgroups—
`the test of subgroup-treatment effect interaction. For
`example, although endarterectomy for severe stenosis in
`the European Carotid Surgery Trial (ECST)126 was only
`significantly beneficial in patients born on specific days of
`the week (figure 2), this was, of course, due to chance and
`there was no subgroup-treatment effect
`interaction
`(p=0·83). Data from simulation studies have shown that
`tests of subgroup-treatment effect interaction are reliable,
`with a false positive rate of 5% at p<0·05, which is robust
`to differences in the size of subgroups, the number of
`categories, and to continuous data.58 However, although
`testing of subgroup-treatment effect interactions is widely
`recommended,51,55–57,121 Pocock’s review showed that 37% of
`RCTs reported only p values for treatment effect within
`subgroups and only 43% reported tests of interaction.55
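The contrast between testing within each subgroup and testing the interaction can be illustrated with a small simulation. This is a minimal sketch and not the method of the simulation studies cited above. The event risks (30% on control, 20% on treatment), the cell size of 150 patients, and all function names are illustrative assumptions. The true treatment effect is identical in both subgroups, so any "significant" interaction is a false positive, and any subgroup without a significant effect is a false negative.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p value for a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def risk_difference(events_c, n_c, events_t, n_t):
    """Absolute risk reduction (control minus treated) and its standard error."""
    p_c, p_t = events_c / n_c, events_t / n_t
    se = math.sqrt(p_c * (1 - p_c) / n_c + p_t * (1 - p_t) / n_t)
    return p_c - p_t, se


def effect_p(subgroup):
    """p value for the treatment effect within one subgroup."""
    arr, se = risk_difference(*subgroup)
    return two_sided_p(arr / se) if se > 0 else 1.0


def interaction_p(sub_a, sub_b):
    """p value for the difference in ARR between two subgroups.

    Each subgroup is (events_control, n_control, events_treated, n_treated).
    """
    arr_a, se_a = risk_difference(*sub_a)
    arr_b, se_b = risk_difference(*sub_b)
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return two_sided_p((arr_a - arr_b) / se_diff) if se_diff > 0 else 1.0


def simulate(n_trials=2000, n_per_cell=150, risk_control=0.30,
             risk_treated=0.20, seed=1):
    """Simulate trials with the SAME true effect in two subgroups.

    Returns (proportion of trials with interaction p < 0.05,
             proportion with at least one subgroup at p >= 0.05).
    """
    rng = random.Random(seed)

    def draw(n, p):
        # Number of events among n patients with event risk p.
        return sum(rng.random() < p for _ in range(n))

    false_interaction = 0
    missed_in_subgroup = 0
    for _ in range(n_trials):
        subs = [
            (draw(n_per_cell, risk_control), n_per_cell,
             draw(n_per_cell, risk_treated), n_per_cell)
            for _ in range(2)
        ]
        if interaction_p(subs[0], subs[1]) < 0.05:
            false_interaction += 1
        if max(effect_p(s) for s in subs) >= 0.05:
            missed_in_subgroup += 1
    return false_interaction / n_trials, missed_in_subgroup / n_trials


if __name__ == "__main__":
    fp_interaction, missed = simulate()
    print(f"False-positive interaction tests: {fp_interaction:.1%}")
    print(f"Trials with a non-significant subgroup: {missed:.1%}")
```

Under these assumptions the interaction test should flag a spurious difference in roughly one trial in 20, whereas a large share of trials will contain at least one subgroup in which the effect is not significant on its own. The exact proportions depend on the assumed risks and sample sizes.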
`
`Month of birth   Events/patients   Events/patients   ARR (%)   95% CI
`                 (surgical)        (medical)
`May–Jun          6/83              18/47             33·4      18·2 to 48·6
`Jul–Aug          8/84              16/58             20·7      7·0 to 34·4
`Sept–Oct         10/87             7/34              9·6       –6·2 to 25·3
`Nov–Dec          6/56              9/39              11·2      –5·2 to 27·6
`Jan–Feb          9/73              6/43              0·1       –13·1 to 13·2
`Mar–Apr          12/64             6/53              –7·7      –20·8 to 5·3
`Total            51/447            62/274            11·6      5·6 to 17·6
`
`Heterogeneity: p<0·0001
`Plot axis: % absolute risk reduction (95% CI)
`
`Figure 3: Effect of carotid endarterectomy in patients with ≥70% symptomatic stenosis in ECST126 according to month of birth in six 2 month periods
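A heterogeneity test across the six month-of-birth groups can be sketched from the event counts in Figure 3. This is a rough illustration using Cochran's Q on crude risk differences with inverse-variance weights. It is an assumed method, not necessarily the one used for the figure. The published ARRs are not simple differences in crude proportions (for May–Jun the crude difference is about 31%, against 33·4% in the figure), so the result will not reproduce the published p value exactly. The function names are hypothetical.

```python
import math

# (label, surgical events, surgical n, medical events, medical n) from Figure 3
GROUPS = [
    ("May–Jun", 6, 83, 18, 47),
    ("Jul–Aug", 8, 84, 16, 58),
    ("Sept–Oct", 10, 87, 7, 34),
    ("Nov–Dec", 6, 56, 9, 39),
    ("Jan–Feb", 9, 73, 6, 43),
    ("Mar–Apr", 12, 64, 6, 53),
]


def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variate with integer df."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2-1} (x/2)^r / r!
        term = total = 1.0
        for r in range(1, df // 2):
            term *= (x / 2) / r
            total += term
        return math.exp(-x / 2) * total
    # Odd df: normal tail plus a finite series in odd powers of sqrt(x).
    total = 0.0
    term = math.sqrt(x)
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 / math.pi) * math.exp(-x / 2) * total)


def heterogeneity(groups):
    """Cochran's Q and its p value for crude ARRs across subgroups."""
    arrs, weights = [], []
    for _, ev_s, n_s, ev_m, n_m in groups:
        p_s, p_m = ev_s / n_s, ev_m / n_m
        variance = p_s * (1 - p_s) / n_s + p_m * (1 - p_m) / n_m
        arrs.append(p_m - p_s)          # medical minus surgical risk
        weights.append(1 / variance)    # inverse-variance weight
    pooled = sum(w * a for w, a in zip(weights, arrs)) / sum(weights)
    q = sum(w * (a - pooled) ** 2 for w, a in zip(weights, arrs))
    return q, chi2_sf(q, len(groups) - 1)


if __name__ == "__main__":
    q, p = heterogeneity(GROUPS)
    print(f"Cochran's Q = {q:.2f} on {len(GROUPS) - 1} df, p = {p:.4f}")
```

Even with crude proportions the heterogeneity p value is well below 0·05, although month of birth has no plausible bearing on the effect of surgery.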
`
`Chance
`The effect of chance on subgroup analyses is usually
`illustrated with the ISIS-2 trial example (aspirin vs placebo
`in acute myocardial infarction), in which aspirin was
`ineffective in patients born under the star signs of Libra
`and Gemini (150 deaths on aspirin vs 147 on placebo,
`2p=0·5), but was beneficial in the remainder (654 deaths
`on aspirin vs 869 on placebo, 2p<<0·0001).3–5 The
`significance of this subgroup-treatment effect interaction
`has never been reported, but it seems to be p=0·01
`(Breslow-Day test). However, Li
