Concurrent Validity of the Hamilton Depression Rating Scale and
the Beck Depression Inventory versus the ICD-10 Diagnostic
Criteria among Patients with Parkinson’s Disease
Marcos Serrano-Dueñas, MD, MSc;1,2,3 Sabrina Sevilla,
MD;2 Paola Lastra, MD2.
Abnormal Movement Disorder Clinic, Neurological Service,
Hospital Carlos Andrade Marín
Medicine Faculty, Pontifical Catholic University of Ecuador
Mathematical Department, Sciences Faculty, National Polytechnic
Serrano-Dueñas, MD: has made a contribution either to the
conception and design, to the acquisition of data, to the
analysis and interpretation of the data; to drafting the article
and reviewing it critically.
MD and Paola Lastra, MD: has made a contribution to the
acquisition of data, to drafting the article or reviewing it
MD; Sabrina Sevilla, MD and M Paola Lastra, MD has given final
approval of the version of the article to be published and can
certify that no other individuals not listed as authors have
made substantial contributions to the paper.
Kingdom PD Society Brain Bank criteria
of Daily Living
Yahr scale. The HY scale is universally used as a severity
indicator, both in clinical practice and research, and for
selection of patients to be included in clinical trials. It is
made up of 6 points of increasing progression (from 0, No signs
of disease, to 5, Wheelchair bound or bedridden unless aided)
England scale. SES scores are expressed in terms of percentage,
in 11 steps from 100 to 0, where 100% denotes complete
independence and 0% denotes bedridden with vegetative
Pfeiffer’s Short Portable Mental Status Questionnaire. The SPMSQ
is a 10-item scale rated by interview, and is used to detect and
estimate the severity of cognitive impairment. It explores
memory (short- and long-term), orientation (time, place and self),
and basic calculations. The SPMSQ is hardly influenced by
educational level and, as a result, proves quite useful and
reliable for screening general and elderly populations. Five
errors or more are indicative of moderate or severe impairment.
Unified Parkinson’s Disease Rating Scale. The UPDRS, officially
known as UPDRS version 3.0, is the most widely applied clinical
rating scale for PD and a gold standard reference scale. UPDRS
consists of the following 4 subscales: 1) Part I (4 items;
scoring: 0-16 points), mentation, behaviour and mood; 2) Part II
(13 items; 0-52 points), activities of daily living (ADL), which
may be scored in “on” or “off” states; 3) Part III (14 items;
0-108 points), motor examination (this section produces 27
scores due to assessment of several signs in different parts of
the body); and 4) Part IV (11 items; 23 points; not applied in
the present study), assessing dyskinesias, fluctuations
and other complications. Items belonging to the sections I to
III score from 0 (normal) to 4 (severe), whereas scoring of the
part IV is heterogeneous.
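The section maxima quoted above follow from the number of scores in each section multiplied by the top item score of 4. The following is a minimal sketch of that arithmetic; the names `UPDRS_SCORES_PER_PART`, `max_section_score` and `section_total` are illustrative assumptions and are not part of the UPDRS itself.

```python
# Number of 0-4 scores per UPDRS section, as described in the text above.
# Part III has 14 items but yields 27 scores because several signs are
# rated in different parts of the body.
UPDRS_SCORES_PER_PART = {"I": 4, "II": 13, "III": 27}
MAX_ITEM_SCORE = 4  # sections I to III score each item from 0 to 4


def max_section_score(part):
    """Return the highest possible total for one UPDRS section."""
    return UPDRS_SCORES_PER_PART[part] * MAX_ITEM_SCORE


def section_total(part, scores):
    """Sum the scores of one section after basic range and length checks."""
    expected = UPDRS_SCORES_PER_PART[part]
    if len(scores) != expected:
        raise ValueError(f"Part {part} expects {expected} scores, got {len(scores)}")
    if any(s < 0 or s > MAX_ITEM_SCORE for s in scores):
        raise ValueError("each score must lie between 0 and 4")
    return sum(scores)
```

Part IV is left out of the sketch because its scoring is heterogeneous and it was not applied in the present study.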
examine the concurrent validity of the Hamilton Depression
Rating Scale and the Beck Depression Inventory for quantifying
depression in patients with Parkinson’s disease, using the ICD-10
Diagnostic Criteria as the gold standard, and to determine if
the somatization items considered are pertinent.
study involved one hundred and forty consecutive PD patients
–102 men and 38 women– with a mean age of 68.7 years and mean
disease duration of 6.7 years. Sensitivity, specificity,
positive and negative predictive values and likelihood ratios
were obtained with a 95% CI. ROC Curves (AUC) were also
Based on ROC measurement of discriminative ability, our results
suggest that both scales were poor at recognizing mild
depression, somewhat better at recognizing moderate depression
and adequate for distinguishing severe depression, though with
poor specificity. Comparisons of HDRS-21, HDRS-12, BDI-21 and
BDI-16 to determine concurrent validity all gave similar results
for each depression level and no important differences between
the complete scales (all 21 items) and abbreviated forms (without
somatic items) were noted.
conclude that both scales possess similar psychometric
properties, but our results cannot be compared with those of
other studies that used DSM-IV criteria as their gold standard.
These observations led to the following conclusions: (1) the
evaluation scales and criteria that comprise them were not
designed for PD; (2) the somatic items observed in our patients
were a product of PD; and (3) as the severity of the illness
increased, so did the number of items that were confused as
elements of depression.
(ii) Parkinson’s disease, (iii) Hamilton Depression Rating Scale,
(iv) Beck Depression Inventory, (v) Somatic items in depression, (vi) Sensitivity,
the most frequently observed neuropsychiatric symptom in
Parkinson’s disease (PD), with a reported prevalence of 40%.1 In
a community-based study, Tandberg et al.2 found major depressive
disorder in 7.7% of patients, whereas among hospitalized
patients major depression may be present in up to 70%.3
of depression in the course of PD is a critical clinical goal2
due to the impact that this disease has on patients. Its
diagnosis can be difficult, however, because of the similarity
of signs and symptoms that are characteristic of both PD and
depressive illness.4 This may be why depression is often under-diagnosed
in patients with PD, as indicated in a study conducted by
Shulman et al.5 that determined that doctors failed to diagnose
the presence of depression in more than 50% of cases.
Two of the most
commonly used scales to measure the severity of depression in PD
are the Hamilton Depression Rating Scale (HDRS)6 and the Beck
Depression Inventory (BDI).7 These scales were designed to
evaluate the severity of depression in adult subjects without
progressive degenerative neurological illness; they are not
scales specifically designed to evaluate depression in PD.8 The
HDRS has been questioned as a tool even in the area of
depression, with critics citing two potential pitfalls in its
psychometric properties: (1) excessive use of items that measure
somatization;9,10 and (2) lack of unidimensionality.11,12 Even
so, the HDRS was considered more useful than the BDI to measure
depression in patients with PD,13 while other researchers
concluded that the BDI could also be appropriately used in such
analyzing the psychometric qualities of the HDRS and the BDI
scales to quantify depression in PD patients13,14 used the DSM-IV
criteria for major depressive disorders as the gold standard.15
As far as we know, these scale characteristics have not been
analyzed with respect to the ICD-10 criteria16 that classify
depression at three levels –mild, moderate and severe– and note,
as well, the presence or absence of somatic syndrome at each
level. In contrast, the DSM-IV criteria use only two categories:
depressed and non-depressed patients.
cross-sectional study at one point in time is designed to
investigate the concurrent validity of the 21-item HDRS (HDRS-21)
and the 21-item BDI (BDI-21) evaluation scales. We evaluated
both scales against the ICD-10 criteria for depressive episodes,
which were used as the gold standard. A second objective of this
study was to determine the impact of somatization on the quality
of each scale.
Materials and Methods
One hundred and
forty (140) patients diagnosed with PD by UKPDSBB17 were
consecutively selected from a pool of outpatients treated at the
Clinic of Movement Disorders of the Carlos Andrade Marin
Hospital (HCAM) Neurology Service in Quito, Ecuador. Patients
who presented any of the following characteristics were
excluded: severe cognitive impairment (evaluated by the Short
Portable Mental Status Questionnaire of Pfeiffer –SPMSQ–,18 with
a score of over 5/10), serious concomitant illness, blindness,
hypoacusis or limb amputation. This study was approved by the
HCAM Dept. of Research and Teaching and all patients involved
provided prior written informed consent.
collected through interviews and examination, and included
demographic and historical data (age, gender, years of illness,
treatment, L-dopa dosage). Patients were always evaluated during
the ON period. All psychometric tools were applied to patients
by the authors of this study.
During testing, patients were assessed using the following
scales and parameters: Unified Parkinson’s Disease Rating Scale
(UPDRS: sections I, II and III);19 Illness Stage, as per Hoehn
and Yahr Staging (H&Y);20 and ADL: Activities of Daily Living,
as per Schwab and England (S&E).21
In this study, one researcher used either the HDRS-21 (Spanish
version by Bobes et al.22) or the BDI-21 (Spanish version by
Sanz et al.23) to evaluate randomly selected patients. A week
later, the alternate scale was used as an evaluation tool by a
second researcher and, finally, a third researcher evaluated
patients with the ICD-10 criteria. All the authors remained
blind to ongoing test results during psychometric patient
evaluations. The recommendations of Spitzer et al.24 were used
during application of the depression scales and the ICD-10
consistency of the HDRS-21 and BDI-21 criteria (Cronbach’s α or
α C) was statistically analyzed, as were ROC curves used to
evaluate the discriminative capacity of each scale. A cutoff
point was established to determine the maximum sensitivity (Se),
specificity (Sp), positive predictive value (PPV), negative
predictive value (NPV), positive likelihood ratio (LR+) and
negative likelihood ratio (LR-), as well as their 95% confidence
intervals (CI) for each level of depressive episode. We also
measured the concordance level of the scales, corrected for chance
agreement (Cohen’s Kappa).
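The indices listed above can all be derived from a 2x2 table that crosses a scale result at a given cutoff with the ICD-10 diagnosis. The following is a minimal sketch of those standard formulas; the class and function names are illustrative assumptions, the counts in the usage line are hypothetical and not taken from this study, and the confidence intervals are omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Scale result (positive/negative) crossed with the gold standard."""
    tp: int  # scale positive, gold standard depressed
    fp: int  # scale positive, gold standard not depressed
    fn: int  # scale negative, gold standard depressed
    tn: int  # scale negative, gold standard not depressed


def diagnostic_indices(t):
    """Return Se, Sp, PPV, NPV, LR+ and LR- (assumes no zero denominators)."""
    se = t.tp / (t.tp + t.fn)
    sp = t.tn / (t.tn + t.fp)
    return {
        "Se": se,
        "Sp": sp,
        "PPV": t.tp / (t.tp + t.fp),
        "NPV": t.tn / (t.tn + t.fn),
        "LR+": se / (1 - sp),
        "LR-": (1 - se) / sp,
    }


def cohen_kappa(t):
    """Agreement between scale and gold standard, corrected for chance."""
    n = t.tp + t.fp + t.fn + t.tn
    observed = (t.tp + t.tn) / n
    expected = ((t.tp + t.fp) * (t.tp + t.fn)
                + (t.fn + t.tn) * (t.fp + t.tn)) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical counts, for illustration only.
example = TwoByTwo(tp=40, fp=10, fn=10, tn=40)
indices = diagnostic_indices(example)
kappa = cohen_kappa(example)
```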
analyzed statistical differences between the depressed and
non-depressed PD groups in the study using the Student’s t
somatization items removed in the HDRS-21 were early insomnia
(4), middle insomnia (5), late insomnia (6), retardation (8),
somatic anxiety (11), gastrointestinal symptoms (12), general
somatic symptoms (13), genital symptoms (14) and weight loss
(17). In the BDI-21, the following items were removed: sleep
disturbance (16), fatigability (17), loss of appetite (18),
weight loss (19) and loss of libido (20).
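Removing the items listed above leaves 12 HDRS items and 16 BDI items, which is where the HDRS-12 and BDI-16 labels come from. The following is a minimal sketch of how an abbreviated total could be computed from a full 21-item record; the function name and the dictionary layout (item number mapped to item score) are assumptions made for illustration.

```python
# Item numbers removed as somatic, exactly as listed in the text above.
HDRS_SOMATIC_ITEMS = frozenset({4, 5, 6, 8, 11, 12, 13, 14, 17})
BDI_SOMATIC_ITEMS = frozenset({16, 17, 18, 19, 20})


def abbreviated_total(item_scores, removed_items, n_items=21):
    """Sum a 21-item record while skipping the somatic items.

    item_scores maps item number (1..n_items) to that item's score.
    """
    if set(item_scores) != set(range(1, n_items + 1)):
        raise ValueError(f"expected scores for items 1..{n_items}")
    return sum(score for item, score in item_scores.items()
               if item not in removed_items)


# Hypothetical record in which every item scores 1: the abbreviated total
# then equals the number of items that remain.
all_ones = {item: 1 for item in range(1, 22)}
hdrs_12_total = abbreviated_total(all_ones, HDRS_SOMATIC_ITEMS)
bdi_16_total = abbreviated_total(all_ones, BDI_SOMATIC_ITEMS)
```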
Table 1. Descriptive statistics of the
sample was comprised of 140 patients, of whom 38 (27.1%) were
female (Table 1). Based on H&Y stage classification, 8 patients
were stage 1; 68 patients stage 2; 57 patients stage 3; and 7
patients either stage 4 or 5. According to ICD-10 criteria, 62
patients (44.3%) were found to be depressed; of those, 11
patients exhibited signs of mild depression, 28 of moderate
depression and 23 of severe depression. In the non-depressive
patient group, 66.6% were in H&Y stages 1 and 2. Among
severely depressed patients, 69.5% were in the more advanced
stages (between 3 and 5). The somatic syndrome recognized by
ICD-10 was present in 18.2% of the subjects suffering from mild
depression, in 60.8% of patients with moderate depression and
in 95.7% of severely depressed patients.
patients showed significantly higher UPDRS scores than
non-depressed patients in evaluation sections and total score
(Student’s t p<0.001). They also showed higher scores on the
S&E Scale (Student’s t p<0.001) (Table 1). Both scales were
found to be poor at recognizing mild depression, whether or not
somatic items were considered. The HDRS-21 yielded a ROC value of 0.5
(cutoff 9/10), the highest recorded for mild depression. The best
LR+ was 1.43 (cutoff 10/11), recorded by BDI-21.
scores for moderate depression were recognized by BDI-21 (cutoff
13/14) and BDI-16 (cutoff 8/9). The highest LR+ of 1.93 was
obtained using BDI-16, but the 95% CI range was too broad.
For severe depression, both HDRS-21 (cutoff 22/23) and BDI-21
(cutoff 17/18) reached a high ROC value of 0.88.
both scales were found to be sensitive, with or without somatic
items, but both provided poor specificity, except in cases of
severe depression (HDRS-21 >0.79). Concordance values,
following Kappa randomness correction, resulted in a cutoff
point of 22/23 for HDRS-21, equivalent to 56% of the severely
depressed patients (Table 2, Fig. 1).
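The ROC values and cutoffs reported above can be illustrated with a short sketch. It computes the area under the ROC curve as the probability that a randomly chosen depressed patient scores higher than a randomly chosen non-depressed patient (ties count one half), and it evaluates sensitivity and specificity at one cutoff. Two assumptions are made here: a cutoff written as "22/23" is read as "scores of 23 or more are positive", and the function names are illustrative rather than the software actually used in the study.

```python
def roc_auc(scores_depressed, scores_not_depressed):
    """Area under the ROC curve via pairwise comparison of the two groups."""
    wins = 0.0
    for d in scores_depressed:
        for n in scores_not_depressed:
            if d > n:
                wins += 1.0
            elif d == n:
                wins += 0.5  # ties contribute one half
    return wins / (len(scores_depressed) * len(scores_not_depressed))


def se_sp_at_cutoff(scores_depressed, scores_not_depressed, first_positive):
    """Sensitivity and specificity when scores >= first_positive are positive.

    For a cutoff written "22/23", first_positive would be 23 (assumed reading).
    """
    se = sum(s >= first_positive for s in scores_depressed) / len(scores_depressed)
    sp = sum(s < first_positive for s in scores_not_depressed) / len(scores_not_depressed)
    return se, sp
```

An area of 0.5 corresponds to a scale that separates the two groups no better than chance, which is why the value reported for mild depression indicates poor discrimination.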
objective of this study was to assess the concurrent validity of
the HDRS-21 and BDI-21 evaluation scales, often used to detect
and measure depression in PD patients, using the ICD-10 criteria
as a gold standard. Based on ROC measurement of discriminative
ability, our results suggest that both scales were poor at
recognizing mild depression, somewhat better at recognizing
moderate depression and adequate for discerning severe
depression, though with poor specificity.
A comparison of
HDRS-21, HDRS-12, BDI-21 and BDI-16 determinations of Se, Sp,
PPV, NPV, LR+ and LR- provided similar results for each level of
depression (Table 2). No important differences were noted
between the complete scales (using all 21 items) and their
abbreviated forms (without somatic items). We conclude that HDRS
and BDI possess similar psychometric properties; our results,
however, cannot be compared with those of other studies that
have used DSM-IV criteria as their gold standard.
We find that
BDI-21 is inadequate for evaluating depression in PD patients.
Our results are similar to those reported by Leentjens et al. in
2000,13 yet contrary to the conclusion of that same group of
researchers six years later (2006), when they found BDI-21 to be
a legitimate scale for depression evaluation in PD patients,
stating: “The BDI is a valid, reliable, and potential responsive
instrument to assess the severity of depression in PD. However,
an adjusted cutoff is recommended.”29 Our results indicate that
if the cut-off is raised, specificity improves, but sensitivity
to distinguish between mild, moderate and severe depression in
We believe that
the aforementioned contradictory results can be partially
explained by the use of the ICD-10 criteria as the gold standard.
We suggest that this scale is more appropriately used with
“continuity criteria.”30 We also consider it advantageous to
classify patients into groups –mild, moderate or severe–
according to disease severity, since such grouping is likely to
make therapeutic approaches more effective.31,32
conclusion drawn from our results is that the somatic items do
not improve the psychometric qualities of these scales. Their
use made little difference in α C values obtained and results
were not influenced by the number of items in a scale.25,33 The
ROC, Se, Sp, PPV, NPV, LR+ and LR- values were also only
minimally modified by somatic items that did not contribute to
an improvement in patient classification. It appears, therefore,
that an evaluation of these somatic items is an unnecessary,
time-consuming step in a setting of high healthcare demand such
as an outpatient clinic.34
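The internal consistency coefficient referred to above (Cronbach’s α) compares the sum of the item variances with the variance of the total score. The following is a minimal sketch of the standard formula, α = k/(k−1) × (1 − Σ item variances / total-score variance); the function name and the row-per-patient data layout are assumptions, and the usage data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of patients, each a list of item scores.

    Uses sample variances throughout; needs at least two patients, at
    least two items, and a non-zero variance of the total scores.
    """
    k = len(rows[0])
    if k < 2 or any(len(r) != k for r in rows):
        raise ValueError("every row must have the same number (>= 2) of items")
    item_variances = [variance(column) for column in zip(*rows)]
    total_variance = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical scores: three patients, three items that move together.
alpha = cronbach_alpha([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
```

Running the function on the full item set and again on the set without the somatic items would show how much those items change α.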
Table 2. Cutoff points for maximum sensitivity
We thus conclude that
somatic items should not be evaluated in depressed PD
patients, contrary to the conclusion
reached by Levin et al.,14 who proposed seven somatic
items to be used in patient evaluation.
We observed that the greater
the severity of the illness, the higher the possibility
of suffering from somatic syndrome and the greater the
disability (UPDRS) or detriment to daily activities
(S&E) (p<0.001). These observations led to the following
conclusions: (1) that the evaluation scales and criteria
that comprise them were not designed for PD; (2) that
the somatic items observed in our patients were a
product of PD; and (3) as the severity of the
illness increased, so did the number of items that were
confused as elements of depression.
The problem of somatic items
is complex. Some authors contend that symptoms of this
kind may exaggerate the prevalence of depression in
PD. As an example, Hoogendijk et al.35
used an exclusively diagnostic and etiological
methodology that appeared to reduce the prevalence of
depression in their study from 23% to 13% of their
patients. In addition, the use of somatic items in
depression scales for diagnosing the general population
has been questioned.9,11 An international survey
on depression and somatization36 concluded that the
enormous variability in frequency of somatic items was
determined by cultural differences rather than by the
items themselves. Finally, evaluation scales that do not
incorporate these items have been used convincingly to
gauge therapeutic efficacy in depression among the
The scales in question were designed to quantify
depression intensity. Following Haynes et al.,39
when an evaluation tool for an illness such as
depression is used in a distinct context (situational or
patient group), its validity is likely to be affected.
Consequently, we are concerned about the ambiguity that
arises in diagnosing patients who test as false
positives, especially those receiving high scores on
We believe that some PD patients with significant levels
of depression may not be properly diagnosed due to a
lack of appropriate criteria and the use of an
inadequate gold standard. For example, in another
community-based study on the prevalence of depression in
PD patients, Tandberg et al.2
found that only 7.7% of the sample met DSM-IV criteria
for MDD, but 24.1% had scores >18 on the BDI. Such
results generate important questions: What is the
patient’s actual diagnosis and should anti-depressive
treatment be given?
A possible explanation for these results is
inappropriate content of evaluation tools that were used
out of context, on patients different from those they
were designed to evaluate. We think that such results
may be due to PD-specific symptoms and signs erroneously
attributed to depression35
and we share the belief that depression, as an illness,
may need to be redefined.39
A recent publication on non-motor symptoms in PD40
showed that constipation was present in 46.7% of PD
patients, nocturia in 66.7%, weight problems in 22.0%,
sexual difficulties in 24.4% and insomnia in 40.6%,
among other medical problems. It would seem extremely
difficult to ask patients to determine for themselves if
their own medical problems are symptoms of PD or of a
depressive episode. Their doctors face the same
Finally, we believe that the development of a scale
specifically designed to measure depression in PD is of
vital importance. We also stress the importance of
evaluating patients using diagnostic criteria when they
present numerous or significant depressive symptoms.41
1. Kostic VS, Stefanova E, Dragasevic N, Potrebic S.
Diagnosis and Treatment of Depression in Parkinson’s
Disease. In: Bédard M-A, Agid Y, Chouinard S, Fahn S,
Korczyn AD, Lespérance P, editors. Mental and Behavioral
Dysfunction in Movement Disorders. Totowa: Humana Press;
2003. p. 351-368.
2. Tandberg E, Larsen JP, Aarsland D, Cummings JL. The
Occurrence of Depression in Parkinson’s Disease. A
Community-Based Study. Arch Neurol 1996; 53: 175-9.
3. Mindham RHS. Psychiatric Symptoms in Parkinsonism. J
Neurol Neurosurg Psychiatry 1970;33:577-83.
4. Kremer J, Starkstein SE. Affective Disorders in
Parkinson’s Disease. International Review of Psychiatry
5. Shulman LM, Taback RL, Rabinstein AA, Weiner WJ.
Non-Recognition of Depression and Other Non-Motor
Symptoms in Parkinson’s Disease. Park Rel Dis 2002; 8:
6. Hamilton M. A Rating Scale for Depression. J Neurol
Neurosurg Psychiatry 1960; 23: 56-62
7. Beck AT, Ward CH, Mendelson M, Mock J, Erbaugh J. An
Inventory for Measuring Depression. Arch Gen Psychiatry
1961; 4: 561-71.
8. Chaudhuri KR, Yates L, Martínez-Martín P. The
Non-Motor Symptom Complex of Parkinson’s Disease: A
Comprehensive Assessment Is Essential. Current Neurology
and Neuroscience Reports 2005; 5: 275-83.
9. Linden M, Borchelt M, Barnow S, Geiselman B. The
Impact of Somatic Morbidity on the Hamilton Depression
Rating Scale in the Very Old. Acta Psychiatr Scand 1995;
10. Leentjens AFG, Marinus J, Van Hilten JJ, Lousberg R,
Verhey FGJ. The Contribution of Somatic Symptoms to the
Diagnosis of Depressive Disorder in Parkinson’s Disease:
A Discriminant Analytic Approach. J Neuropsychiatry Clin
Neurosci 2003; 15: 74-7.
11. Licht RW, Qvitzau S, Allerup P, Bech P. Validation
of the Bech-Rafaelsen Melancholia Scale and the
Hamilton Depression Scale in Patients with Major
Depression: Is the Total Score A Valid Measure of
Illness Severity? Acta Psychiatr Scand 2005; 111: 144-9.
12. Hammond M. Rating Depression Severity in the Elderly
Physically Ill Patient: Reliability and Factor Structure
of the Hamilton and the Montgomery-Asberg Depression
Rating Scales. Int J Geriat Psychiatry 1998; 13: 257-61.
13. Leentjens AFG, Verhey FRJ, Luijckx G-J, Troost J.
The Validity of the Beck Depression Inventory As a
Screening and Diagnostic Instrument for Depression in
Patients with Parkinson’s Disease. Mov Disord 2000; 15:
14. Levin BE, Llabre MM, Weiner WJ. Parkinson’s Disease
and Depression: Psychometric Properties of the Beck
Depression Inventory. J Neurol Neurosurg Psychiatry
1988; 51: 1401-4.
15. American Psychiatric Association. Diagnostic and
Statistical Manual of Mental Disorders. 4th ed (DSM-IV).
Washington DC: American Psychiatric Association; 1994.
16. World Health Organization. Pocket Guide to the ICD-10
Classification of Mental and Behavioural Disorders.
London: Churchill Livingstone; 1994.
17. Gibb WRG, Lees AJ. The Relevance of the Lewy Body to
the Pathogenesis of Idiopathic Parkinson’s Disease. J
Neurol Neurosurg Psychiatry 1988; 51: 745-52.
18. Pfeiffer E. A Short Portable Mental Status
Questionnaire for the Assessment of Organic Brain
Deficit in Elderly Patients. J Am Geriatr Soc 1975; 23:
19. Fahn S, Elton RL, and Members of the UPDRS
Development Committee. Unified Parkinson’s Disease
Rating Scale. In: Fahn S, Marsden CD, Goldstein M, Calne
DB, editors. Recent Development in Parkinson’s Disease,
Vol 2. New Jersey: McMillan; 1987. p. 153-163.
20. Hoehn MM, Yahr MD. Parkinsonism: Onset, Progression,
and Mortality. Neurology 1967; 17: 427-42.
21. Schwab RS, England AC. Projection Technique for
Evaluating Surgery in Parkinson’s Disease. In:
Gillingham FJ, Donaldson IML, editors. Third Symposium
on Parkinson’s Disease. Edinburgh: E and S Livingstone;
1969. p. 152-157.
22. Bobes J, Bulbena A, Luque A, Dal-Ré R, Ballesteros
J, Ibarra N y el Grupo de Validación en Español de
Escalas Psicométricas (GVEEP). Evaluación psicométrica
comparativa de las versiones en español de 6, 17 y 21
ítems de la Escala de valoración de Hamilton para la
evaluación de la depresión. Med Clin (Barc) 2003; 120:
23. Sanz J, Vásquez C. Fiabilidad, validez y datos
normativos del Inventario para la Depresión de Beck.
Psicothema 1998; 10: 303-18.
24. Spitzer RL, Gibbon M, Williams JB. Structured
Clinical Interview for Axis I DSM-IV Disorders (SCID).
Washington, DC: American Psychiatric Association Press.
25. Streiner DL. Starting at the Beginning: An
Introduction to Coefficient Alpha and Internal
Consistency. Journal of Personality Assessment 2003; 80:
26. Fletcher RW, Fletcher SW. Clinical Epidemiology. The
Essentials. 4th ed. Philadelphia: Lippincott Williams &
27. Norman GR, Streiner DL. Biostatistics. The Bare
Essentials. 2nd ed. Hamilton: BC Decker Inc; 2000.
28. Streiner DL, Norman GR. PDQ Epidemiology. 2nd ed.
Hamilton: BC Decker Inc; 1998.
29. Visser M, Leentjens AFG, Marinus J, Stiggelbout AM,
van Hilten J. Reliability and Validity of the Beck
Depression Inventory in Patients with Parkinson’s
Disease. Mov Disord 2006; 21: 668-72.
30. Streiner DL, Norman GR. Health Measurement Scales: A
Practical Guide to Their Development and Use. Oxford:
Oxford University Press; 2003.
31. Mann JJ. The Medical Management of Depression. N
Engl J Med 2005; 353: 1819-34.
32. Leentjens AFG. Depression in Parkinson’s Disease:
Conceptual Issues and Clinical Challenges. J Geriatr
Psychiatry Neurol 2004; 17: 120-6.
33. Schulzer M, Calne S, Calne DB. Present and Future of
Quality of Life Questionnaires in Parkinson’s Disease: A
Statistical Evaluation. In: Martínez-Martín P, Koller WC,
editors. Quality of Life in Parkinson’s Disease.
Barcelona: Masson; 1999. p. 161-180.
34. Pies R, Rogers D. The Recognition and Treatment of
Depression: A Review for the Primary Care. URL: http://Medscape.
Release date, September 30, 2005, [20/10/2005].
35. Hoogendijk WJG, Sommer IEC, Tissingh G, Deeg DJH,
Wolter Ech. Depression in Parkinson’s Disease. The
Impact of Symptom Overlap on Prevalence. Psychosomatics
1998; 39: 416-21.
36. Simon GE, VonKorff M, Piccinelli M, Fullerton C,
Ormel J. An International Study of the Relation between
Somatic Symptoms and Depression. N Engl J Med 1999; 341:
37. McIntyre R, Kennedy S, Bagby RM, Bakish D. Assessing
Full Remission. J Psychiatry Neurosci 2002; 27: 235-9.
38. O’Sullivan RL, Fava M, Agustin C, Baer L, Rosenbaum
JF. Sensitivity of the Six-Item Hamilton Depression
Rating Scale. Acta Psychiatr Scand 1997; 95: 379-84.
39. Haynes SN, Richard DCS, Kubany ES. Content Validity
in Psychological Assessment: A Functional Approach to
Concepts and Methods. Psychological Assessment 1995; 7:
40. Chaudhuri KR, Martínez-Martín P, Schapira AHV,
Stocchi F, Sethi K, Odin P, et al. International
Multicenter Pilot Study of the First Comprehensive Self-Completed
Nonmotor Symptoms Questionnaire for Parkinson’s Disease:
The NMSQuest Study. Mov Disord 2006; 21: 916-23.
41. Marsh L, McDonald WM, Cummings J, Ravina B, and the
NINDS/NIMH Work Group on Depression and Parkinson’s
Disease. Provisional Diagnostic Criteria for Depression
in Parkinson’s Disease: Report of an NINDS/NIMH Work
Group. Mov Disord 2006; 21: 148-58.