An Automated Approach to Identifying Patients with Dementia Using Electronic Medical Records
2017; Wiley; Volume: 65; Issue: 3; Language: English
10.1111/jgs.14744
ISSN: 1532-5415
Authors: David B. Reuben, Andrew S. Hackbarth, Neil S. Wenger, Zaldy S. Tan, Lee A. Jennings
Topic(s): Machine Learning in Healthcare
Abstract: To the Editor: With the increased interest in the clinical detection and management of Alzheimer's disease and related dementias, health systems and researchers have needed to quickly identify persons with these disorders to enroll them in care programs, recruit them into trials, and study the natural history and outcomes of dementia. This identification is generally done prospectively using a two-step process of screening followed by diagnostic assessment. However, this process is slow and expensive. To efficiently identify persons with dementia who could serve as a comparison group for a dementia management program [1], we created and validated an automated electronic health record dementia identification method.
We initially focused on 3 data sources in the UCLA electronic health record (an Epic-based record) that would indicate the presence of dementia: (1) a recorded International Classification of Diseases, Ninth Revision (ICD-9) diagnosis of dementia, comprising the following codes: 290.0, 290.1X, 290.2X, 290.3, 290.4X, 290.8, 290.9, 291.1, 291.2, 292.82, 294.0, 294.10, 294.11, 294.20, 331.0, 331.82, 331.11, and 331.19; (2) documentation that the patient was taking medications whose primary indication is to treat dementia (cholinesterase inhibitors and memantine); and (3) natural language processing (NLP) of history and physical notes, consult notes, discharge summary notes, and progress notes for evidence that the patient had dementia. In the NLP logic, dementia was operationalized as the presence of the terms "dementia" or "neurodegenerative" without any of the following markers in the same statement: (1) negating words, such as "not", "negative", or "ruled out"; (2) words that referred to the patient's family history or a family member's dementia status, such as "family history", "wife", or "husband"; or (3) words that indicated uncertainty, such as "suspected", "possible", or "risk".
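The ICD-9 and NLP elements described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, which the letter does not publish. The function names are hypothetical, "X" in a code is assumed to stand for exactly one digit, a "statement" is assumed to be a span delimited by sentence punctuation or a line break, and the exclusion lists contain only the example cue words quoted in the letter.

```python
import re

# ICD-9 codes as listed in the letter. ASSUMPTION: "X" means any single digit.
ICD9_PATTERNS = [
    "290.0", "290.1X", "290.2X", "290.3", "290.4X", "290.8", "290.9",
    "291.1", "291.2", "292.82", "294.0", "294.10", "294.11", "294.20",
    "331.0", "331.82", "331.11", "331.19",
]
_ICD9_REGEXES = [
    re.compile(re.escape(p).replace("X", r"\d")) for p in ICD9_PATTERNS
]

# Cue words quoted in the letter; the source says "such as", so the
# full lists used by the authors are not known.
TARGET_TERMS = ("dementia", "neurodegenerative")
NEGATION_CUES = ("not", "negative", "ruled out")
FAMILY_CUES = ("family history", "wife", "husband")
UNCERTAINTY_CUES = ("suspected", "possible", "risk")
EXCLUSION_CUES = NEGATION_CUES + FAMILY_CUES + UNCERTAINTY_CUES


def is_dementia_icd9(code: str) -> bool:
    """Return True if the code matches one of the listed ICD-9 patterns."""
    code = code.strip()
    return any(rx.fullmatch(code) for rx in _ICD9_REGEXES)


def _contains_any(text: str, terms) -> bool:
    """Whole-word (or whole-phrase) search for any of the given terms."""
    return any(re.search(r"\b" + re.escape(t) + r"\b", text) for t in terms)


def statement_indicates_dementia(statement: str) -> bool:
    """A target term is present and no exclusion cue shares the statement."""
    text = statement.lower()
    return _contains_any(text, TARGET_TERMS) and not _contains_any(
        text, EXCLUSION_CUES
    )


def note_indicates_dementia(note: str) -> bool:
    """ASSUMPTION: statements are separated by . ; ! ? or a newline."""
    statements = re.split(r"[.;!?\n]+", note)
    return any(statement_indicates_dementia(s) for s in statements)
```

For example, `note_indicates_dementia("Wife reports memory loss. Dementia diagnosed in 2014.")` is true because the family cue and the target term fall in different statements, whereas `"Family history of dementia."` is rejected. Because only the quoted example cues are included, a phrase such as "no dementia" would not be excluded by this sketch.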
We first examined the positive predictive value (PPV; true positives divided by all positives) of various combinations of these elements (all 3 elements and 3 different combinations of 2 elements) in diagnosing dementia, compared with the gold standard of documentation of a dementia diagnosis identified by a physician during medical record review. Physicians reviewed 60 medical records (15 patients meeting each combination, 5 in each of 3 age groups: 40–64, 65–84, and 85 and older). Early on, we noted a high number of false positives in the younger age group (40–64 years) and restricted the identification method in younger patients to only those who had a dementia ICD-9 code documented on at least two encounters. Based on initial analyses, the medications element was dropped, and we reviewed an additional 64 cases that were identified only by the NLP + ICD-9 elements. PPVs for patients of all ages and for those ≥65 years were weighted by their age representation in the entire dementia sample identified by the algorithm. Lastly, we estimated the sensitivity of the final (NLP + ICD-9) and 3-element algorithms by applying them to the 989 patients with verified dementia who were enrolled in the UCLA Alzheimer's and Dementia Care (ADC) program. To produce a better estimate of how the algorithm would perform under typical circumstances, the dementia status of each patient was evaluated as of 3 weeks prior to ADC program start; this prevented the algorithm from taking advantage of the large volume of dementia-related documentation generated when a patient joined the program.
Findings are presented in Table 1. The 3-element model had a PPV of 87%, but the 2-element models that included medications had considerably lower PPVs (27% and 47%). Based on this examination, we dropped medications from the method. When analyzed by age group, the approach was much less accurate among those 40–64 years; when this younger group was excluded, the PPV was high (93%).
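The evaluation quantities used above (PPV, age-weighted PPV, sensitivity, and the 3-week look-back) can be written out as a short Python sketch. The function names are hypothetical, and the letter does not describe how the calculations were implemented. Weighting is assumed to mean a weighted average of per-age-group PPVs, with weights proportional to each group's share of the algorithm-identified sample.

```python
from datetime import date, timedelta


def ppv(true_positives: int, false_positives: int) -> float:
    """Positive predictive value: true positives / all algorithm positives."""
    return true_positives / (true_positives + false_positives)


def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Share of verified dementia cases that the algorithm flags."""
    return true_positives / (true_positives + false_negatives)


def weighted_ppv(ppv_by_group: dict, weight_by_group: dict) -> float:
    """Average per-group PPVs, weighted by each group's representation.

    ASSUMPTION: weights may be counts or proportions; they are
    normalized here so that they sum to 1.
    """
    total = sum(weight_by_group[g] for g in ppv_by_group)
    return sum(ppv_by_group[g] * weight_by_group[g] for g in ppv_by_group) / total


def evaluation_date(program_start: date) -> date:
    """Date as of which dementia status is assessed: 3 weeks before enrollment."""
    return program_start - timedelta(weeks=3)


if __name__ == "__main__":
    # Illustrative values only; these are NOT the figures from Table 1.
    example_ppv = {"40-64": 0.5, "65-84": 0.9, "85+": 1.0}
    example_weights = {"40-64": 0.1, "65-84": 0.5, "85+": 0.4}
    print(weighted_ppv(example_ppv, example_weights))
    print(evaluation_date(date(2015, 3, 22)))
```

Weighting matters here because the review sample drew equal numbers of records from each age group, while the age groups are unlikely to be equally represented among all patients the algorithm identifies.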
When the final algorithm was tested on patients of all ages with verified dementia, the sensitivity was 63%, but only 35% when all 3 elements were required.
In summary, using a combination of ICD diagnoses and NLP, we created a method for identifying older persons with dementia; the method has high positive predictive value, especially among older patients, and reasonably high sensitivity. Although we initially thought that dementia medications might be a useful indicator, this was not the case, perhaps because physicians may be prescribing these medications for "off-label" conditions (e.g., cholinesterase inhibitors for mild cognitive impairment, memantine for migraine headaches). Moreover, requiring all 3 elements reduced the sensitivity considerably. We also learned that the method was less accurate for younger patients, who may have neurologic disorders or conditions (e.g., HIV) that affect cognition but may not be dementia, and who as an age group have a lower prevalence of dementia.
There are several advantages to such an automated dementia identification system. First, health systems seeking to implement programs aimed at older persons with existing dementia can readily identify appropriate patients. Second, researchers can quickly identify potential research participants and then confirm eligibility criteria. Finally, data on the patients identified can be used for observational studies or to create a comparison group for studies that are not randomized clinical trials.
The authors would like to acknowledge Emmett Keeler, PhD, for comments on drafts of the manuscript; Robin Clarke, MD, for facilitating access to data; and Anthony Yaney for project management.
Conflicts of interest: Dr. Hackbarth: The database code used to implement the dementia discovery algorithm, which is owned by the University of California, is licensed to Ursa Health, a commercial organization in which I hold equity and serve as an officer. Dr. Wenger: Grants currently affiliated with: HRSA, ABIM, Unithealth. All other authors report no conflicts of interest.
Author Contributions: All authors: study conceptualization, study design, data acquisition, interpretation, and drafting of this manuscript. ASH, LAJ: statistical analysis. All authors: critical revision of the manuscript for important intellectual content.
Sponsor's Role: N/A.
Reference(s)