Validation of algorithms to identify human immunodeficiency virus cases using administrative data in Manitoba

Anderson, Alexandrea
Introduction: Administrative data are valuable for describing Human Immunodeficiency Virus (HIV) cases and their health outcomes, but these data must first be validated to assess their accuracy. To date, most HIV and Acquired Immunodeficiency Syndrome (AIDS) algorithms have been validated using USA Medicare/Medicaid data. Two Canadian studies (in Ontario and British Columbia) validated HIV algorithms, but the prescription and laboratory data they used differed from those available in Manitoba, and their reference standards were not population-based. The objective of this study was to validate algorithms consisting of physician visit, hospitalization, and antiretroviral prescription data against positive confirmatory HIV laboratory tests to identify Manitobans living with HIV.

Methods: The validation cohort consisted of Manitobans with a valid Personal Health Identification Number and at least three years of continuous health coverage between 2007 and 2018. Positive confirmatory HIV tests from Cadham Provincial Laboratory were the reference standard. Fifteen algorithms requiring two or three years of data were evaluated. Seven measures of accuracy were calculated for each algorithm: sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), Youden’s J statistic, kappa statistic, and area under the receiver operating characteristic curve (AUC). Four sensitivity analyses were also completed.

Results: The validation cohort included 1,454,010 individuals, of whom 1,589 were HIV cases and 1,452,421 were HIV non-cases. Algorithm sensitivity ranged from 81.1% to 96.5%. PPV ranged from 44.1% to 96.0%. Specificity and NPV were very high for all algorithms. Youden’s J statistic ranged from 0.81 to 0.96. Kappa ranged from 0.61 to 0.91. AUC ranged from 0.91 to 0.98. The sensitivity analyses produced similar results.

Conclusion: Different HIV algorithms performed best under different scenarios. One or more physician visits for HIV, one or more hospitalizations for HIV, or two or more antiretroviral prescriptions in two years was best for identifying all possible HIV cases, without concern for the number of false positives. Six or more physician visits in two years was best for identifying as many true positive HIV cases as possible with minimal false positives. Three or more physician visits in two years most accurately distinguished between HIV cases and HIV non-cases. This study demonstrates that administrative data in Manitoba can accurately identify people living with HIV who interact with the healthcare system.
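As an illustration of how such an algorithm and its accuracy measures fit together, the following is a minimal Python sketch, not the study's actual code. It assumes that per-person claim counts have already been tallied over a two-year window, and the names `ClaimCounts`, `flagged_by_broadest_rule`, and `accuracy_measures` are hypothetical. The AUC is computed as the average of sensitivity and specificity, which is the area under a single-threshold ROC curve; whether the study computed AUC this way is an assumption. The counts passed in at the end are invented toy values, not study results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClaimCounts:
    """Hypothetical per-person tallies over a two-year window."""
    physician_visits: int
    hospitalizations: int
    antiretroviral_prescriptions: int


def flagged_by_broadest_rule(c: ClaimCounts) -> bool:
    """Broadest rule in the abstract: >=1 physician visit, >=1
    hospitalization, or >=2 antiretroviral prescriptions in two years."""
    return (
        c.physician_visits >= 1
        or c.hospitalizations >= 1
        or c.antiretroviral_prescriptions >= 2
    )


def accuracy_measures(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Accuracy measures from a 2x2 table of algorithm result versus
    reference standard (tp/fn = true cases, fp/tn = true non-cases)."""
    n = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    youden_j = sensitivity + specificity - 1
    # Cohen's kappa: observed agreement corrected for chance agreement.
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (observed - expected) / (1 - expected)
    # Assumption: AUC of a single binary classifier (one ROC point).
    auc = (sensitivity + specificity) / 2
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "youden_j": youden_j,
        "kappa": kappa,
        "auc": auc,
    }


# Toy counts for illustration only (not taken from the study).
measures = accuracy_measures(tp=90, fn=10, fp=30, tn=9870)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Under the single-threshold assumption, AUC equals (J + 1) / 2, which agrees with the reported ranges for Youden’s J (0.81 to 0.96) and AUC (0.91 to 0.98).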
HIV, AIDS, Validation, Algorithm, Administrative data, Health services