Effectiveness of Psychotherapy for Personality Disorders
1999; American Psychiatric Association; Volume: 156; Issue: 9; Language: English
10.1176/ajp.156.9.1312
ISSN: 1535-7228
Authors: J. Christopher Perry, Elisabeth Banon, Floriana Ianni
Topic(s): Mental Health and Psychiatry
Abstract

Special Article. J. Christopher Perry, M.P.H., M.D., Elisabeth Banon, M.D., and Floriana Ianni, M.D. Published online: 1 Sep 1999.

OBJECTIVE: The authors examined the evidence for the effectiveness of psychotherapy for personality disorders in psychotherapy outcome studies. METHOD: Fifteen studies were located that reported data on pretreatment-to-posttreatment effects and/or recovery at follow-up, including three randomized, controlled treatment trials, three randomized comparisons of active treatments, and nine uncontrolled observational studies. They included psychodynamic/interpersonal, cognitive behavior, mixed, and supportive therapies. RESULTS: All studies reported improvement in personality disorders with psychotherapy. The mean pre-post effect sizes within treatments were large: 1.11 for self-report measures and 1.29 for observational measures. Among the three randomized, controlled treatment trials, active psychotherapy was more effective than no treatment according to self-report measures. In four studies, a mean of 52% of patients remaining in therapy recovered (defined as no longer meeting the full criteria for personality disorder) after a mean of 1.3 years of treatment.
A heuristic model based on these findings estimated that 25.8% of personality disorder patients recovered per year of therapy, a rate sevenfold larger than that in a published model of the natural history of borderline personality disorder (3.7% recovered per year, with recovery of 50% of patients requiring 10.5 years of naturalistic follow-up). CONCLUSIONS: Psychotherapy is an effective treatment for personality disorders and may be associated with up to a sevenfold faster rate of recovery in comparison with the natural history of disorders. Future studies should examine specific therapies for specific personality disorders, using more uniform assessment of core pathology and outcome.

Individuals with personality disorders have pervasive and long-standing traits affecting their perception and thinking about themselves and others, their impulse and affect regulation, and their interpersonal and other social role functioning. Their long-standing impairment in functioning and their personal distress are well documented (1–4). They are also heavy users of mental health resources (5). Nonetheless, there is a paucity of empirical studies examining the outcome of personality disorders treated with psychotherapy, with few controlled studies. Given current concerns about cost containment and accountability, there is reason to examine the evidence to date for the efficacy of psychotherapy for personality disorders. This need for evidence of efficacy applies also to psychopharmacologic treatments for personality disorders, which have tended to produce exciting findings in early and/or open studies followed by more disappointing or equivocal findings in later and/or more-controlled studies (6). We considered it timely to review the small but growing empirical literature on the outcome of personality disorders treated with psychotherapy.
While controlled treatment studies are few, systematic examination of these along with naturalistic treatment studies may reveal consistent findings relevant to the efficacy of psychotherapy for personality disorders.

We found 15 studies reporting empirical data on the outcome of personality disorders after either short- or long-term psychotherapy that fitted our criteria for analysis. These studies used a variety of both observer- and self-rated outcome measures, assessing response to treatment at follow-ups of variable durations. Taking these differences across studies into account, we addressed four questions. 1) What is the evidence for improvement in symptoms, social role functioning, or core psychopathology after psychotherapy? 2) Do different patterns of change emerge when one compares observer-rated and self-rated outcome measures? 3) What is the relation between treatment duration and the amount of improvement? 4) What is the evidence that individuals with personality disorders recover following psychotherapy, and how long does it take?

There is a widely held belief that personality disorders are intractable to psychotherapy. This review should help clinicians test this belief against the available empirical evidence. We also hope that the findings will stimulate further research to improve the psychotherapeutic treatments currently available.

METHOD

We collected reports of empirical outcome studies on the psychotherapy of personality disorders that were published from 1974 to 1998, using computer searches (MEDLINE and PsycINFO) supplemented by a manual search. We included studies that 1) used systematic methods to make personality disorder diagnoses, 2) used validated outcome assessments, and 3) reported data that allowed either calculation of within-condition effect sizes or determination of recovery from the personality disorder.
Given the potential for systematic differences due to measurement perspective (7), we examined self-report and observer-rated outcome measures separately. Whenever the data were available, we examined the percentage of subjects with personality disorders who recovered versus time in treatment. When multiple reports were available from the same study, we selected the one with the longest posttreatment follow-up. Fifteen studies (8–22) met these inclusion criteria.

Because different studies used different outcome measures, durations of treatment, and follow-up periods, we converted their results into effect sizes to facilitate comparing results across studies. In a randomized, controlled treatment trial, the usual way of calculating an effect size for a given measure represents whether the final mean scores of the experimental and control treatment groups differ. The difference between these means at the end of treatment is then divided by the standard deviation of the difference between the pretreatment score and the posttreatment score of all subjects. This number represents the degree to which the results deviate from the null hypothesis. Cohen (23, p. 40) considers the magnitude of such effect sizes to be interpretable as follows: 0.20=small effect, 0.50=medium, and 0.80=large.

The 15 studies required us to make the following deviations from the method described above. First, there were only three randomized, controlled treatment trials (10, 16, 17). The remaining studies presented naturalistic observations of patient groups in treatment or comparisons of two active treatments. We adjusted for this by calculating within-condition effect sizes for all studies. We subtracted the pretreatment score from the posttreatment score for each measure and then divided by the standard deviation of the score at intake. As appropriate, signs were reversed so that a positive effect size always indicated improvement.
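The within-condition effect size described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name, the `higher_is_better` flag, and all numeric inputs are hypothetical and chosen only to show the sign convention.

```python
def within_condition_effect_size(pre_mean, post_mean, sd_pre, higher_is_better=True):
    """Return (post - pre) / SD at intake, signed so that positive = improvement.

    higher_is_better is a hypothetical flag: True for scales where a higher
    score is healthier (e.g., global functioning), False for symptom scales
    where a lower score is healthier (e.g., a depression inventory).
    """
    if sd_pre <= 0:
        raise ValueError("sd_pre must be positive")
    change = post_mean - pre_mean
    if not higher_is_better:
        change = -change  # reverse the sign so that improvement is positive
    return change / sd_pre


# Hypothetical values, not taken from any of the 15 studies.
functioning_es = within_condition_effect_size(40.0, 52.0, 10.0, higher_is_better=True)
symptom_es = within_condition_effect_size(24.0, 14.0, 8.0, higher_is_better=False)
print(functioning_es, symptom_es)
```

Because the denominator is the intake standard deviation of a single condition, this quantity is not adjusted for change in a control group.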
Because this method does not adjust for change that might occur in the control group of a randomized, controlled treatment trial, the magnitude of the effect size may be different than it would be with the use of Cohen’s method.

In the randomized, controlled treatment trials (10, 16, 17), because the comparison groups were small and usually of unequal size, the standard deviations at baseline often varied between the comparison groups. Using the smaller of two standard deviations in the denominator would yield a larger effect size. To make the effect sizes more comparable across studies, whenever there was more than one patient group, we calculated a pooled baseline standard deviation, which has also been suggested by Rosenthal (24).

We were interested in long-term change and, therefore, examined results at the last follow-up. Whenever studies included multiple measures, we calculated the effect size for each measure separately, then reported the median values of the effect sizes for the self-report and observer-rated measures, thus summarizing the study’s overall results. We then examined the relationships between other variables of interest, such as duration of treatment, and these summary effect sizes, using nonparametric Spearman correlations. Data on recovery from a personality disorder were plotted as a function of treatment duration, and then number of treatment sessions, with the use of simple linear regression to estimate the percentage of each study group that had recovered at follow-up. Survival analysis was not used because the unit of observation was the study group rather than the individual case. Finally, the number of studies meeting our criteria was relatively small, and therefore statistical power was quite low. Not including analyses for potential confounders and follow-up analyses, we planned eight a priori tests for the four hypotheses.
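Three of the steps in this section lend themselves to a short sketch: pooling baseline standard deviations, summarizing a study by the median of its effect sizes, and dividing alpha across the eight planned tests. The text does not give the pooling formula, so the degrees-of-freedom-weighted version below is an assumption, as is the conventional familywise alpha of 0.05. All function names and sample values are hypothetical.

```python
import math
from statistics import median


def pooled_sd(groups):
    """Pool baseline SDs across groups given as (n, sd) pairs.

    Assumes the standard df-weighted formula:
    sqrt(sum((n_i - 1) * sd_i**2) / sum(n_i - 1)).
    """
    numerator = sum((n - 1) * sd ** 2 for n, sd in groups)
    denominator = sum(n - 1 for n, _ in groups)
    if denominator <= 0:
        raise ValueError("need at least two subjects in total")
    return math.sqrt(numerator / denominator)


def study_summary(effect_sizes):
    """Summarize one study by the median of its per-measure effect sizes."""
    return median(effect_sizes)


def bonferroni_alpha(familywise_alpha, n_tests):
    """Per-test alpha under a Bonferroni correction."""
    return familywise_alpha / n_tests


# Hypothetical two-group baseline: the pooled SD lies between the two group SDs.
sd_pooled = pooled_sd([(10, 8.0), (14, 12.0)])
summary = study_summary([0.6, 1.1, 1.4])
alpha = bonferroni_alpha(0.05, 8)  # eight planned tests; 0.05 is an assumed starting alpha
print(sd_pooled, summary, alpha)
```

Pooling avoids the inflation that would come from dividing by the smaller of two baseline standard deviations.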
The Bonferroni correction would set alpha at p=0.006, although given the interdependence of the outcomes, this may be too conservative. We present statistical trends (e.g., p>0.05) alongside significant findings in order not to overlook potentially important findings and to make the best use of the data for heuristic purposes.

RESULTS

Description of Studies

The 15 studies differed with respect to the characteristics of the patients, the treatments, and the study designs.

Patients

Four studies (8, 11, 17, 19) focused largely on borderline personality disorder, one (12) mostly on borderline personality disorder and schizotypal personality disorder, two on other specific types, avoidant (10) and antisocial (9), and eight (13–16, 18, 20–22) on mixed types from one to all three clusters of DSM personality disorders. Thirteen studies involved outpatients, while one study (8) involved hospitalized patients and one (12) day hospital patients. Most study subjects were self-referred. Potential sources of selection bias were largely unreported.

Four studies reported severity of illness at intake as the mean score on the Global Assessment Scale (GAS) or the Health-Sickness Rating Scale (in Karterud et al. [12], the mean Health-Sickness Rating Scale score=40; in Hoglend [14; and personal communication, 1996], the mean Global Assessment of Functioning Scale score=57; in Diguer et al. [15], the mean Health-Sickness Rating Scale score=47.5; and in Linehan et al. [17], the mean GAS score=35). This yielded an overall weighted mean score of 41.7 (table 1), which falls in the low DSM-IV Global Assessment of Functioning Scale score range of 41–50, characterized by “serious symptoms…OR any serious impairment in social, occupational, or school functioning.”

Four studies required the presence of an axis I disorder for inclusion, specifically, opiate dependence (9), bulimia nervosa (13), or major depression (15, 21).
Four studies (9, 12, 14, 16) listed the overall prevalence of one or more comorbid axis I diagnoses other than those required for inclusion, yielding a mean of 63.8% (SD=21.7%). The most prevalent diagnoses, in descending order, were mood, adjustment, anxiety, substance, somatoform, other, and eating disorders.

Treatment modality and duration

Six studies (11, 12, 14–16, 18) used dynamic psychotherapy, three (10, 13, 17) used cognitive behavior therapy, and three (8, 9, 21) compared the two. One (22) examined supportive psychotherapy. Two (19, 20) studied interpersonal group therapy, one of which (19) included an individual dynamic control therapy but pooled the results.

Among studies of dynamic treatments, Stevenson and Meares (11) looked at the effects of a 1-year course of twice-weekly dynamic psychotherapy based on self psychology theory. Karterud et al. (12) examined the outcome of long-term dynamic psychotherapy in a 6-month day hospital program. Hoglend (14) studied the effects of intermediate-term dynamic psychotherapy lasting an average of 27.5 sessions. Winston et al. (16) examined the differential effectiveness of two dynamic therapies (short-term anxiety-provoking psychotherapy and brief adaptive psychotherapy), each given for 40 weeks, compared with a 15-week waiting-list control condition before treatment for the same patients. Monsen et al. (18) looked at the effect of intensive psychodynamic psychotherapy, based on self psychology and an object relations model, administered for an average of 25.4 months.

Among studies of cognitive behavior treatments, Linehan et al.
(17) studied the effects of dialectical behavior therapy on parasuicidal women with borderline personality disorder, treated for 1 year, compared with unspecified, community “treatment as usual.” Alden (10) studied three types of short-term behavioral therapy (graded exposure, graded exposure plus social skills training, and graded exposure plus social skills training plus an intimacy focus) administered over 10 weeks, compared with a waiting-list control group. Having found no significant differences among active treatments, she compared the pooled results with those from the control group. Fahy et al. (13) compared the outcome, after 8 weeks of cognitive behavior therapy, of patients with bulimia nervosa with and without a comorbid personality disorder.

Liberman and Eckman (8) compared the effects of insight-oriented psychotherapy and behavioral therapy in a 10-day hospitalization program. Woody et al. (9) compared the effects of adding either supportive-expressive or cognitive behavior psychotherapy to drug counseling among opiate addicts for 24 weeks. The two types of psychotherapy were of equal efficacy, so the researchers compared the pooled results with those of drug counseling alone. Hardy et al. (21) compared dynamic/interpersonal and cognitive behavior therapies for outpatients with a major depressive episode, with or without a personality disorder, over either 8 or 16 weeks. The two therapies were generally equally effective.

Rosenthal et al. (22) treated individuals with cluster C personality disorders with 40 sessions of supportive psychotherapy.

Munroe-Blum and Marziali (19) compared interpersonal group psychotherapy for 35 weeks with open-ended individual dynamic psychotherapy for borderline personality disorder. Budman et al. (20) conducted 72 sessions of interpersonal group psychotherapy, each lasting 90 minutes, over 18 months.

Nine studies (9, 14–17, 19–22) used explicit treatment manuals.
Stevenson and Meares (11) used weekly seminars and therapist supervision instead of a manual to increase adherence to a particular type of therapy.

Concurrent use of medication was rarely reported. Woody et al. (9) reported that all subjects were on methadone maintenance. Linehan et al. (17) reported less use of psychotropic medication by the experimental group than by the treatment-as-usual group at follow-up.

Treatment duration was highly variable, with a median of 28 weeks and a median of 40 sessions (table 1). Follow-up was done in 14 studies, with a median of 10.5 months. Frequency of sessions varied from daily, for inpatients (8) and day hospital patients (12), to once or twice weekly for outpatient psychotherapies (9–11, 13–22).

Study designs

Three studies (10, 16, 17) were randomized, controlled treatment trials with a waiting-list or nonspecific treatment condition; three (8, 19, 21) were randomized comparisons of two active treatments, although one (19) pooled the results of the comparison groups. Stevenson and Meares (11) used a patient-as-own-control design, comparing their sample 1 year before and 1 year after active treatment. The other studies (9, 12–15, 18, 20, 22) reported naturalistic observation of treatment groups.

Outcome measures

Three studies (13, 16, 22) reported only self-rated measures, and two (14, 20) only observer-rated measures, while the remaining ones (9–12, 15, 17–19, 21) reported both. The most frequently used self-report outcome measures were the Symptom Checklist-90-R, target complaints, the Inventory of Interpersonal Problems, and the Beck Depression Inventory. The most frequent observer-rated measures were the Health-Sickness Rating Scale or the GAS and the Social Adjustment Scale. Two studies (14, 18) measured dynamic change with the use of reliable, valid measures.

Substantive Findings

Retention-attrition

The percentages of dropouts varied greatly, with a mean of 21.8% (table 2).
The highest percentages of dropouts, 42% and 51%, were found in the two longer-term group therapy conditions (19, 20). The five shorter-duration treatments (16 weeks or less) had fewer dropouts than did the nine longer-duration treatments (8.2% versus 29.3%; t=3.54, df=12, p=0.004). When duration was controlled, dropout rate did not correlate significantly with other study variables, including effect sizes.

Effect sizes

At follow-up, active psychotherapies for personality disorder groups yielded unweighted mean effect sizes of 1.11 for self-report measures and 1.29 for observer-rated measures (table 2). These mean effect sizes were significantly greater than zero for both self-report measures (t=10.75, df=10, p=0.0001) and observer-rated measures (t=5.55, df=11, p=0.0002). By contrast, waiting-list or treatment-as-usual control conditions yielded lower unweighted mean effect sizes at follow-up or at the end of the waiting-list period (table 2). However, these probability estimates did not take into account the fact that some improvement might have been due to regression to the mean. Among the three randomized, controlled treatment trials, the differences in within-condition effect size between psychotherapy and the control condition for self-report measures yielded an unweighted mean difference of 0.75, which is significantly greater than zero (t=13.18, df=2, p=0.006). The mean difference in effect sizes weighted by sample size was 0.78 (t=21.03, df=2, p=0.002). Among the two randomized, controlled treatment trials reporting observer-rated measures, the differences in effect size yielded an unweighted mean difference of 0.50 (t=4.54, df=1, p=0.14); the mean difference in effect size weighted by sample size was 0.57 (t=7.43, df=1, p=0.085).
Finally, no differences were attributable to study design for either self-report (F=1.63, df=2, 9, p=0.24) or observer-rated (F=0.14, df=2, 9, p=0.87) effect sizes.

One concern is a possible publication bias against studies reporting negative findings, the so-called “file-drawer problem” (24, 25). To consider this, we calculated the effect of potential unpublished studies by assuming a zero difference in effect size between active therapy and the control condition. Adding one such study would diminish our self-report findings to a trend (t=2.93, df=3, p=0.06), which would persist even if the other 12 of our 15 studies had been randomized, controlled treatment trials with unreported null effects (t=1.86, df=14, p=0.08).

Table 3 displays the mean effect sizes for measures used in two or more studies. The largest were for self-report target complaints, the Beck Depression Inventory scores in the two depressed samples, and observer-rated global functioning. Next came two self-reports: the Inventory of Interpersonal Problems and general symptoms. The lowest was for ratings of social adjustment (e.g., the Social Adjustment Scale). In addition, Karterud et al. (12) reported change on the Health-Sickness Rating Scale by personality disorder type, in increasing magnitude of effect size: schizotypal personality disorder (–0.03), borderline personality disorder (0.45), other largely cluster C personality disorders (0.96), and patients without a personality disorder (1.46). Diguer et al. (15) also reported higher effect sizes for patients without a personality disorder than for those with a personality disorder. These differences suggest that diagnosis influences change in global functioning.
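The file-drawer check reported in this section can be reproduced from the summary statistics alone. The sketch below is a reconstruction under one assumption: each hypothetical unpublished study contributes a difference of exactly zero, as the text describes. The individual trial differences are not given in the text, so the code recovers only their sum and sum of squares from the reported mean (0.75), t value (13.18), and number of trials (3). The function name is hypothetical.

```python
import math


def t_after_adding_nulls(mean_diff, t_value, n_studies, n_null):
    """One-sample t (against zero) after appending n_null zero-difference studies.

    Works from summary statistics: the observed mean difference, its one-sample
    t value, and the number of studies it was based on.
    """
    se = mean_diff / t_value                      # standard error of the observed mean
    variance = se ** 2 * n_studies                # sample variance of the observed differences
    sum_x = mean_diff * n_studies
    sum_x2 = variance * (n_studies - 1) + n_studies * mean_diff ** 2
    n = n_studies + n_null                        # zeros leave both sums unchanged
    new_mean = sum_x / n
    new_variance = (sum_x2 - n * new_mean ** 2) / (n - 1)
    new_t = new_mean / math.sqrt(new_variance / n)
    return new_t, n - 1


t_one_null, df_one_null = t_after_adding_nulls(0.75, 13.18, 3, 1)
t_twelve_null, df_twelve_null = t_after_adding_nulls(0.75, 13.18, 3, 12)
print(round(t_one_null, 2), df_one_null)        # 2.93 3
print(round(t_twelve_null, 2), df_twelve_null)  # 1.86 14
```

Both results agree with the t values and degrees of freedom quoted above, which suggests the reported check was a one-sample t test with zeros appended.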
Finally, the two studies requiring major depression reported larger mean effect sizes than the remaining 13 studies for both self-report measures (mean=1.17 versus mean=0.99; t=3.86, df=10, p=0.003) and observer-rated measures (mean=2.29 versus mean=1.10; t=2.20, df=10, p=0.05).

Treatment duration and effect sizes

The correlation between duration and effect size was negative for self-reported outcomes (rs=–0.46, N=12, p=0.13), whereas it was positive for observer-rated outcomes (rs=0.14, N=12, p=0.66). However, when length of follow-up was partialed out, self-report effect size correlated with treatment duration (rs=–0.71, N=9, p=0.04), while observer-rated effect size was still nonsignificant. Following up on this, we compared the mean self-report effect sizes of the five shorter-term studies (16 weeks or less) with those of the seven longer-term studies (mean=1.38 versus mean=0.92; t=2.86, df=10, p=0.02). The larger self-report effect size for the shorter-term treatments raises the question of whether self-report measures reflect some transient change that diminishes with longer treatments, something not found with observer-rated measures.

Recovery from personality disorder

Four studies (11, 14, 18, 20) reported the percentage of subjects no longer meeting criteria for a personality disorder at follow-up (table 2). All used medium- to long-term dynamic/interpersonal therapies. The diagnostic composition of the study groups included cluster B and C patients, with 53% (N=42 of 79) having borderline personality disorder. The mean proportion recovered was 51.8% (t=5.23, df=3, p=0.01) after a mean of 78 sessions over a mean of 67 weeks (1.3 years).

We examined percentage recovered as a function of treatment length. Inspection of a scatterplot indicated a relationship between these two variables (somewhat less so with number of sessions).
We performed simple linear regressions predicting the percentage of patients in each sample who recovered, weighted by sample size, by entering the number of therapy sessions (model 1a) and treatment duration in years (model 1b). While neither model was statistically significant, they allowed us to calculate the hypothetical values for the treatment duration associated with recovery for a range of 25%–75% of cases, the approximate range of the studies’ observations. We then compared these models with a similar linear regression model derived from five natural history (not treatment) studies of recovery from borderline personality disorder previously published by the first author (1). Table 4 displays the results of this comparison.

The natural history studies of borderline personality disorder (model 2) yielded an estimated recovery rate of 3.7% per year (t=3.30, df=3, p=0.05; 95% CI=0.14% to 7.28%). On the basis of the four active treatment studies, model 1b produced a recovery rate of 25.8% per year (t=1.98, df=2, p=0.19; 95% CI=–5.8% to 67.4%), a rate seven times greater than that observed in the naturalistic follow-up studies. Model 1a indicated a recovery rate of 0.20% of cases per therapy session (t=0.55, df=2, p=0.64; 95% CI=–0.96% to 1.36%). The 95% confidence intervals for models 1a and 1b include a recovery rate of zero. This indicates that while these models can serve heuristic purposes, they should not be accepted as validated. In table 4, models 1a and 1b suggest that 92 treatment sessions or 1.3 years of treatment would yield recovery from personality disorder according to the full criteria in 50% of mixed personality disorder subjects. By comparison, model 2 suggests that 10.5 years of naturalistic follow-up would yield recovery in 50% of subjects with borderline personality disorder.
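A back-of-envelope check of these figures is sketched below. The slopes (25.8% and 3.7% per year) and the times to 50% recovery (1.3 and 10.5 years) come from the text. The intercepts are not reported here, so they are back-calculated under the assumption that each model is a straight line passing through its stated 50% point. Treat them as illustrative and rounded, not as the fitted values in table 4. The function names are hypothetical.

```python
def implied_intercept(rate_per_year, years_to_half, half_pct=50.0):
    """Intercept of a straight line with the given slope through (years_to_half, half_pct)."""
    return half_pct - rate_per_year * years_to_half


def years_to_recover(target_pct, rate_per_year, intercept):
    """Invert percent_recovered = intercept + rate_per_year * years."""
    return (target_pct - intercept) / rate_per_year


treated_rate, natural_rate = 25.8, 3.7      # percent recovered per year, from the text
rate_ratio = treated_rate / natural_rate    # about 6.97, the "sevenfold" figure

# Back-calculated (assumed) intercepts from the reported 50% recovery times.
treated_intercept = implied_intercept(treated_rate, 1.3)
natural_intercept = implied_intercept(natural_rate, 10.5)

print(round(rate_ratio, 2))
print(round(treated_intercept, 2), round(natural_intercept, 2))
```

A nonzero intercept explains why 50% recovery is reached at 1.3 years and not at 50/25.8, or roughly 1.9 years, as a line through the origin would imply.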
All models included only subjects still in follow-up.

DISCUSSION

Limitations of the Review

The major limitation of this review is the availability of only 15 studies from which our conclusions are derived. This is especially problematic given differences across studies in diagnoses, severity of illness, design, treatment modality and duration, and assessment methods. However, by using meta-analysis we were able to detect some consistent patterns. Nonetheless, further validation and detection of more specific effects will require substantially more studies. Meta-analysis itself has limitations (25), such as equating studies within broad categories (e.g., dynamic or cognitive behavior therapy), which may obscure meaningful differences within treatment modalities (26).

Another concern is generalizability to community populations seeking treatment. Patients not referred to a study, refusing to join, or dropping out before follow-up may differ in some significant way from patients admitted to and continuing in treatment. Any bias would limit generalization from these findings. This may be especially problematic when one is considering the results from a few studies, as we have done in comparing recovery from personality disorders. Few studies reported these data. In one exception, Stevenson and Meares (11) reported that 48 (81%) of 59 eligible patients joined the study, 11 (23%) of the 48 dropped out, and a further seven (15%) were omitted from analyses because they decided to continue treatment beyond the 1-year study period. While intention-to-treat analyses would mitigate the effects of bias due to dropout, patients with personality disorders often drop out from follow-up assessments as well as treatment.

Treatment dropouts represent a special case of the potential for bias. The percentage of dropouts was significantly lower for treatments of shorter duration than for those of longer duration.
After control for duration of treatment, the percentage of dropouts did not correlate with other study variables, decreasing the likelihood that dropout was a source of bias in our overall results. The overall mean rate of attrition (21%) compares favorably with that of the National Institute of Mental Health (NIMH) Treatment of Depression Collaborative Research Program (27), which had a 31% dropout rate for personality disorders across all treatments, with the largest for clusters B (40%) and A (36%) and the lowest for cluster C (28%). The mean dropout rate of 28% for our longer-duration treatment studies is comparable to the mean dropout rate of 28% for the natural history follow-up studies (1). This suggests that the present treatment studies were at no higher risk for bias due to dropout than these other studies of personality disorders. However, patient characteristics that predict dropout should be examined.

It is interesting that subjects with borderline personality disorder who agreed to participate in a randomized, controlled treatment trial comparing group therapy with individual therapy (19) had a high dropout rate even before treatment began, after learning of their random treatment assignment (9% refused individual therapy and 19% group therapy, 28% total), as well as during the course of therapy (39% of those accepting assignment). In both cases the dropout rate was higher for group therapy. Budman et al. (20) reported that 51% of patients dropped out of group therapy, especially those with borderline personality disorder. These investigators subsequently modified their treatment model to include individual sessions for patients with borderline personality disorder, similar to the model of Linehan et al. (17). This suggests that acceptability to patients is a problem for group therapy in comparison with individual treatments. Further study is warranted, given the popularity of the group modality as a response to concerns about the cost of treatment.
If limitation of treatment choice results in a high proportion of treatment refusal, especially for patients with borderline personality disorder, then clinical settings may ipso facto exclude patients needing treatment.

There was much heterogeneity in sample selection, including differences in personality disorder types, severity of illness, comorbidity, and treatment setting. Generally, cluster A disorders were least represented. Cluster B and C disorders were about equally represented, with cluster C disorders generally involving less impairment. However, the single type most often studied was borderline personality disorder. Thus, most of our conclusions are generalizable to a mix of personality disorder types with a high proportion of borderline patients.

The heterogeneity of diagnostic assessments across studies hampers comparison. This is worsened by the demonstrated lack of agreement between most diagnostic instruments when they have been compared (4, 28, 29). However, whenever a high proportion of studies report similar findings despite such differences, it indicates a robust finding or signal, despite the noise. This is the case here.

The confounding of personality disorder types with treatment types and duration of treatment makes it difficult to conclude that any one type of treatment consistently demonstrates greater effects than no treatment or a comparison treatment. However, in the randomized, controlled treatment trials, all experimental treatments were superior to waiting-list or control treatment conditions.

The studies assessed outcome in a variety of ways, with no single measure used by most studies. While most studies included both self-report and observer-rated measurement perspectives, several used only one.
Finally, in many instances it was not clear how clinically significant the results were, or whether the patients improved into a healthy range of scores.

Using pretreatment and posttreatment within-condition effect sizes permitted direct comparison of studies with different personality disorder diagnoses, study designs, outcome measures, and treatments, given that most lacked control/comparison groups. Lack of such a strategy makes meaningful summary even more difficult (30). One criticism is that the conclusions
Reference(s)