Bias Associated with Mining Electronic Health Records
Abstract
Large-scale electronic health record research introduces biases compared to traditional manually curated retrospective research. We used data from a community-acquired pneumonia study for which we had a gold standard to illustrate such biases. The challenges include data inaccuracy, incompleteness, and complexity, and they can produce distorted results. We found that a naïve approach approximated the gold standard, but errors on a minority of cases shifted the estimated mortality substantially. Manual review revealed errors in both selecting and characterizing the cohort, and narrowing the cohort improved the result. Nevertheless, a significantly narrowed cohort might contain its own biases that would be difficult to estimate.
DOI: http://dx.doi.org/10.5210%2Fdisco.v6i0.3581