Letter · Open access · National production · Peer reviewed

Evaluating Communication Skills of Geriatrics Fellows: Interrater Agreement of an Objective Structured Clinical Examination

2016; Wiley; Volume: 64; Issue: 1; Language: English

DOI

10.1111/jgs.13918

ISSN

1532-5415

Authors

Marcos Daniel Saraiva, Maria Luiza de Melo Paulo, Thiago Junqueira Avelino‐Silva, Luiz Antonio Gil, Elina Lika Kikuchi, Luciana Farías, Rafael Lyra Rodrigues Alves, G Suzuki, Fábio Cesar Olivieri, Valmari Cristina Aranha, Leonardo da Costa Lopes, Maria Cristina Guerra Passarelli, Júlio César Moriguti, Eduardo Ferrioli, Chao Lung Wen, Daniel Apolinário, Wilson Jacob Filho

Topic(s)

Clinical Reasoning and Diagnostic Skills

Abstract

To the Editor: Evaluating the communication skills of geriatrics fellows is a matter of paramount importance, but there has been limited work on this topic.1, 2 The Division of Geriatrics of the University of São Paulo Medical School conducts an annual Objective Structured Clinical Examination (OSCE), but a concern has been raised regarding interrater agreement on communication skills.1 The purpose of this study was to evaluate the interrater reliability of a communication assessment scale.

The OSCE model consists of eight stations, each to be completed in 10 minutes. A standardized communication assessment scale was developed in 2009 and has been amended over the years. In 2013, one of the communication stations was video-recorded to analyze its reliability. In this station, the fellow's main task is to address the concerns of a fictitious patient's daughter. During the hospitalization, the patient had developed delirium and received inappropriate sedative drugs, resulting in drowsiness and pulmonary aspiration. Two members of the faculty played the standardized role of the patient's daughter, who behaved in a defiant and hostile way. Six examiners (five geriatricians, one psychologist) who had no previous contact with the fellows evaluated the video-recorded station. Examiners received written instructions on how to rate the fellow's performance according to a 16-item, 5-point Likert scale (1 = very poor, 2 = poor, 3 = fair, 4 = good, 5 = perfect). Written consent was obtained from all examinees.

Interrater agreement for each of the 16 Likert items was calculated using an unweighted Cohen kappa. Ratings were averaged into a global score, and intraclass correlation coefficients (ICCs) were used to estimate interrater reliability. Although the cutoff for defining good interrater agreement is context dependent, there is general agreement that the ICC should exceed 0.70. A two-way random-effects model with measures of absolute agreement was used. A single-measure ICC was calculated to represent the situation in which the grading system would rely on a single examiner at a time; an average-measure ICC represented the situation in which the grading system relied on all six examiners simultaneously. In addition, medians of average-measure ICCs were calculated using all possible combinations of two, three, four, and five examiners. Analyses were conducted using Stata version 13.1 (StataCorp, College Station, TX). All statistical tests were two-tailed, and an alpha level of less than 0.05 was used to indicate statistical significance.

Thirty-two first- and second-year geriatrics fellows participated in the study. The Cronbach alpha of the 16-item assessment was 0.87, revealing adequate internal consistency. Table 1 describes the interrater agreement for each item. According to a previously proposed interpretation,3 seven of the 16 Likert items did not have significant agreement beyond what would be expected by chance, seven had slight agreement, and two had fair agreement. Interrater agreement of the global score for single measures fell below the a priori criterion proposed for acceptability (ICC = 0.45, 95% confidence interval (CI) = 0.29–0.62). The average-measure ICC for all six examiners largely exceeded the acceptability threshold (0.83, 95% CI = 0.71–0.91). Medians of average-measure ICCs were 0.61 (95% CI = 0.54–0.68) for two examiners, 0.71 (95% CI = 0.66–0.72) for three examiners, 0.76 (95% CI = 0.73–0.78) for four examiners, and 0.81 (95% CI = 0.77–0.82) for five examiners.
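Editor's note: as a back-of-the-envelope check (not part of the original letter), the average-measure figures above are close to what the Spearman–Brown prophecy formula predicts from the single-measure ICC of 0.45, even though the letter derived the two- to five-examiner values as medians over all examiner combinations:

\[
\mathrm{ICC}_{k} = \frac{k \cdot \mathrm{ICC}_{1}}{1 + (k - 1)\,\mathrm{ICC}_{1}},
\qquad
\mathrm{ICC}_{3} = \frac{3 \times 0.45}{1 + 2 \times 0.45} \approx 0.71,
\qquad
\mathrm{ICC}_{6} = \frac{6 \times 0.45}{1 + 5 \times 0.45} \approx 0.83.
\]

This relation makes clear why roughly three examiners is the point at which average-measure reliability first crosses the 0.70 acceptability threshold.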
This study revealed that basing the evaluation of communication skills on a single examiner results in considerably lower reliability than is generally considered adequate; grading systems need to rely on three or more examiners to achieve acceptable interrater agreement. In prior studies, interrater agreement for communication skill assessments varied from 0.38 to 0.74 in different contexts, but none of these studies targeted geriatrics fellows.4, 5 A systematic review designed to investigate the reliability of OSCEs found that communication stations have lower interrater reliability than clinical stations, probably because of the more-subjective criteria used to rate communication.6 Other recent studies have also shown that it is challenging to assess communication skills; the Liverpool Undergraduate Communication Assessment Scale, an OSCE-based evaluation, required average measures of four examiners to achieve an ICC of 0.73.7 In this context, this study demonstrated that the video-recorded OSCE is a promising method of providing accurate ratings of the communication skills of geriatrics fellows by allowing the participation of multiple examiners.8 Further studies are needed to develop better rating strategies appropriate to a variety of clinical scenarios in geriatric medicine.

The authors wish to thank Flávia Campora, MD, and Aline Thomaz Soares, MD, for playing the role of the patient's daughter in our OSCE station.

Conflict of Interest: The editor in chief has reviewed the conflict of interest checklist provided by the authors and has determined that the authors have no financial or any other kind of personal conflicts with this paper.

Author Contributions: Louzada Farias, Rodrigues Alves, Sayuri Suzuki, Jacob-Filho: study concept and design. Olivieri, Aranha, da Costa Lopes, Guerra Passarelli, Moriguti, Ferrioli, Wen: acquisition of subjects and data. Saraiva, de Melo Paulo, Avelino-Silva, Lika Kikuchi, Apolinário, Jacob-Filho: data analysis and interpretation. Saraiva, de Melo Paulo, Avelino-Silva, Lika Kikuchi, Apolinário, Jacob-Filho: manuscript preparation.

Sponsor's Role: None.
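Editor's note: for readers who want to reproduce this kind of reliability analysis, the sketch below shows how the two-way random-effects ICC (absolute agreement, single and average measures) and an item-level kappa summary could be computed in Python. The authors report using Stata 13.1, so the package choices (pingouin, scikit-learn), the synthetic data, and all variable and column names here are assumptions for illustration only, not the authors' code.

```python
# Sketch of the reliability analysis described in the letter, using Python
# instead of the Stata 13.1 workflow the authors report. All names and the
# simulated data below are hypothetical.

from itertools import combinations

import numpy as np
import pandas as pd
import pingouin as pg
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(0)

# Hypothetical long-format ratings: 32 fellows x 6 examiners. In the letter
# the global score is the mean of 16 Likert items; here it is simulated as a
# uniform stand-in on the 1-5 scale.
fellows = [f"fellow_{i:02d}" for i in range(1, 33)]
examiners = [f"examiner_{j}" for j in range(1, 7)]
records = [
    {"fellow": f, "examiner": e, "global_score": rng.uniform(1, 5)}
    for f in fellows
    for e in examiners
]
ratings = pd.DataFrame(records)

# Two-way random-effects ICC with absolute agreement:
# ICC2  = single measure (grading rests on one examiner at a time),
# ICC2k = average measure (grading rests on the mean of all six examiners).
icc = pg.intraclass_corr(
    data=ratings, targets="fellow", raters="examiner", ratings="global_score"
)
print(icc.set_index("Type").loc[["ICC2", "ICC2k"], ["ICC", "CI95%"]])

# Item-level agreement: the letter reports an unweighted Cohen kappa per
# Likert item. With six examiners, one simple summary (an assumption here,
# not necessarily the authors' choice) is the mean of all pairwise kappas.
one_item = ratings.pivot(index="fellow", columns="examiner",
                         values="global_score").round()  # stand-in for one item
pairwise = [
    cohen_kappa_score(one_item[a], one_item[b])
    for a, b in combinations(one_item.columns, 2)
]
print("mean pairwise kappa:", np.mean(pairwise))
```

With real ratings in place of the simulated values, the ICC2 and ICC2k rows correspond to the single-measure and average-measure reliabilities reported in the letter.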

Reference(s)