Artigo Acesso aberto Revisado por pares

Using Common Table Expressions to Build a Scalable Boolean Query Generator for Clinical Data Warehouses

2014; Institute of Electrical and Electronics Engineers; Volume: 18; Issue: 5 Linguagem: Inglês

10.1109/jbhi.2013.2292591

ISSN

2168-2208

Autores

Daniel R. Harris, Darren W. Henderson, Ramakanth Kavuluru, Arnold J. Stromberg, Todd R. Johnson,

Tópico(s)

Data Quality and Management

Resumo

We present a custom, Boolean query generator utilizing common-table expressions (CTEs) that is capable of scaling with big datasets. The generator maps user-defined Boolean queries, such as those interactively created in clinical-research and general-purpose healthcare tools, into SQL. We demonstrate the effectiveness of this generator by integrating our study into the Informatics for Integrating Biology and the Bedside (i2b2) query tool and show that it is capable of scaling. Our custom generator replaces and outperforms the default query generator found within the Clinical Research Chart cell of i2b2. In our experiments, 16 different types of i2b2 queries were identified by varying four constraints: date, frequency, exclusion criteria, and whether selected concepts occurred in the same encounter. We generated nontrivial, random Boolean queries based on these 16 types; the corresponding SQL queries produced by both generators were compared by execution times. The CTE-based solution significantly outperformed the default query generator and provided a much more consistent response time across all query types (M = 2.03, SD = 6.64 versus M = 75.82, SD = 238.88 s). Without costly hardware upgrades, we provide a scalable solution based on CTEs with very promising empirical results centered on performance gains. The evaluation methodology used for this provides a means of profiling clinical data warehouse performance.

Referência(s)