Improving accuracy of disease prevalence estimates by combining information from administrative health records and electronic medical records

Al-Azazi, Saeed
Administrative health records (AHRs) and electronic medical records (EMRs) are the two main sources of population-based data for chronic disease surveillance in Canada. Misclassification errors exist in both databases, which can bias estimates of disease prevalence and incidence. The objectives were to evaluate the accuracy of rule-based and probabilistic-based methods to combine error-prone sources using computer simulation and to demonstrate how to use these methods with a numeric example. Four data-combining methods were compared: rule-based ‘OR’ method, rule-based ‘AND’ method, rule-based sensitivity-specificity adjusted (RSSA) method and probabilistic-based sensitivity-specificity adjusted (PSSA) method. The methods were demonstrated using linked AHRs and EMRs to ascertain cases of hypertension. The ‘OR’ and ‘AND’ methods are recommended when there is sufficient overlap between measures of disease status. The RSSA method depends on the choice of sensitivity and specificity estimates. The PSSA method performs well when true prevalence is high and correlations amongst covariates are low.
Prevalence, Misclassification, Data source, Data-combining methods
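
To make the rule-based methods in the abstract concrete, the following is a minimal sketch in Python. It assumes each linked individual has a binary case flag from each source. The function names and the toy data are hypothetical and do not come from the thesis. The adjustment step uses the standard Rogan-Gladen correction as a stand-in for a sensitivity-specificity adjustment; the abstract does not give the exact form of the RSSA method, so this should not be read as the author's estimator.

```python
def combine_or(ahr, emr):
    """'OR' rule: a person is a case if flagged in either source."""
    return [int(a or e) for a, e in zip(ahr, emr)]


def combine_and(ahr, emr):
    """'AND' rule: a person is a case only if flagged in both sources."""
    return [int(a and e) for a, e in zip(ahr, emr)]


def prevalence(flags):
    """Observed prevalence: proportion of individuals flagged as cases."""
    return sum(flags) / len(flags)


def adjust_prevalence(observed, sensitivity, specificity):
    """Rogan-Gladen correction (assumed stand-in for the adjustment step).

    Corrects an observed prevalence for known misclassification:
        adjusted = (observed + specificity - 1) / (sensitivity + specificity - 1)
    The result is clipped to the interval [0, 1].
    """
    denom = sensitivity + specificity - 1
    if denom <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    estimate = (observed + specificity - 1) / denom
    return min(1.0, max(0.0, estimate))


# Toy linked data (hypothetical): 1 = hypertension case, 0 = non-case.
ahr = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
emr = [1, 0, 1, 0, 1, 0, 0, 0, 0, 0]

prev_or = prevalence(combine_or(ahr, emr))
prev_and = prevalence(combine_and(ahr, emr))

# Sensitivity and specificity values here are illustrative assumptions.
prev_adjusted = adjust_prevalence(prev_or, sensitivity=0.90, specificity=0.95)

print(prev_or, prev_and, round(prev_adjusted, 3))
```

On this toy data the 'OR' rule gives an observed prevalence of 0.4 and the 'AND' rule gives 0.2. The gap between them illustrates why the two rules bracket the estimate when the sources disagree, and the adjusted value changes with the sensitivity and specificity supplied, consistent with the abstract's note that the RSSA method depends on that choice.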