Methods for chronic disease epidemiology: longitudinal data and case definitions

Date
2024-07-02
Authors
Hamm, Naomi
Abstract

Administrative health data, such as physician billing claims and hospital discharge abstracts, are routinely used to produce chronic disease estimates (i.e., incidence and prevalence). These measures are estimated using case definitions, which are sets of rules for identifying disease cases that include disease-specific diagnosis codes. Creating valid case definitions can be challenging because of errors in diagnosis codes and changes in their use and meaning over time. Using multiple years of data that capture an individual’s longitudinal health history may be advantageous for constructing accurate case definitions. The goal of this research was to develop and evaluate chronic disease case definitions using longitudinal administrative health data. Four related studies were conducted. The first study assessed case definition sensitivity to changes in data quality using control charts, which were originally developed to monitor out-of-control processes in manufacturing. Control charts were applied to juvenile diabetes (JD) incidence and prevalence trends that were estimated using previously validated case definitions. The frequency of out-of-control observations, which may be influenced by nonrandom errors in data, was compared across case definitions using McNemar’s test with a Holm-Bonferroni adjustment and control limits based on Cohen’s effect size. No differences in incidence and prevalence trends were detected across case definitions.
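The paired comparison described in the first study can be illustrated with a minimal sketch. It is not the thesis's implementation. It assumes that observations are paired by year, that `b` and `c` (hypothetical names) count the years in which only one of two case definitions produced an out-of-control observation, and that an exact binomial form of McNemar's test is acceptable. The function names and the 0.05 significance level are also assumptions.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the discordant pair counts b and c."""
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Lower tail of Binomial(n, 0.5), doubled for a two-sided test
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def holm_bonferroni(p_values, alpha=0.05):
    """Return one boolean per hypothesis: True where it is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The smallest p-value is tested at alpha/m, the next at alpha/(m-1), ...
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # step-down procedure stops at the first non-rejection
    return reject
```

With several pairwise comparisons between case definitions, each pair would yield one p-value from `mcnemar_exact`, and the full list would then be passed to `holm_bonferroni`.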
The second study applied control charts to multiple sclerosis (MS) incidence and prevalence trends to determine whether the control limit calculations used to identify out-of-control observations could be generalized across diseases. As with JD, there were no differences in incidence and prevalence trends across case definitions. However, results indicated that wider control limits may be more appropriate for MS than for JD. The third study developed and evaluated the performance of model-based case definitions for MS that relied on trends in healthcare use to identify cases. Dynamic classification, which ascertains cases and non-cases annually, was used to estimate the average trend needed for case classification. When an observation period of unlimited duration was used to ascertain cases, the trend-based case definition produced validity estimates similar to those of a deterministic case definition requiring three or more MS contacts. However, the trend-based case definition had higher sensitivity than the deterministic case definition when the number of data years used for classification was reduced to five years, which was the estimated average trend needed. In the fourth study, I created and validated MS and JD model-based case definitions that incorporated a reclassification exit rule to account for biased prevalence trends due to misclassification. Case probabilities were calculated annually and used to reclassify individuals when the probabilities dropped below a cut-off criterion. Comparisons of prevalence trends obtained from the exit rule case definition with trends obtained from existing national case definitions revealed differences in trend slopes for MS, but not for JD.
This research contributes to the literature on the use of administrative health data for health research and surveillance. It demonstrates the importance of considering how changes over time in data-, disease-, and case definition-based factors can be incorporated into chronic disease case definition development and application. Findings are beneficial to epidemiologists and researchers who rely on the Public Health Agency of Canada’s Canadian Chronic Disease Surveillance System to routinely and systematically monitor population health using administrative health data.
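Two ideas from the abstract can be sketched in code as a hedged illustration rather than the thesis's actual models: a deterministic case definition of three or more disease-coded contacts, and an exit rule that reclassifies a case once the annual case probability drops below a cut-off. The function names, the per-year data layout, the 0.5 cut-off, and the no-re-entry behaviour are all assumptions made for this example. How the annual probabilities are modelled is not specified here, so they are taken as given inputs.

```python
def is_case_deterministic(contact_years, as_of_year, min_contacts=3, window_years=None):
    """True if at least min_contacts disease-coded contacts fall in the lookback window.

    window_years=None means an observation period of unlimited duration.
    """
    if window_years is None:
        eligible = [y for y in contact_years if y <= as_of_year]
    else:
        start = as_of_year - window_years + 1
        eligible = [y for y in contact_years if start <= y <= as_of_year]
    return len(eligible) >= min_contacts


def classify_with_exit(entry_year, annual_probs, cutoff=0.5):
    """Return {year: case status}, applying an exit rule to annual case probabilities.

    Assumption: a person is a case from entry_year until the first year in which
    the probability drops below cutoff, and does not re-enter afterwards.
    """
    status = {}
    active = False
    exited = False
    for year in sorted(annual_probs):
        if not exited and year >= entry_year:
            if annual_probs[year] < cutoff:
                exited = True   # reclassified as a non-case from this year on
                active = False
            else:
                active = True
        status[year] = active
    return status
```

For a person with contacts in 2001, 2005, and 2012, `is_case_deterministic` returns True with an unlimited observation period but False with `window_years=5`. This is one way a shorter window can lower the sensitivity of a contact-count definition.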

Keywords
Administrative Health Data, Population Health Research, Epidemiology Methods