Unlocking the Value in Unstructured Data
Advances in natural language processing and the ability to read large volumes of data are creating new insights for clinicians.
This article first appeared in the November 2015 issue of HealthLeaders magazine.
Steve Morgan, MD
Until now, the electronic health record has been largely about structuring data to drive initiatives such as value-based care and population health. But technology is beginning to unlock unstructured information from clinical narratives, making it useful and actionable.
"It's still a work in progress, but a necessary piece of technology that we need to learn to leverage to fully get information out of our EMRs," says Steve Morgan, MD, senior vice president and chief medical information officer for Carilion Clinic, an integrated delivery network headquartered in Roanoke, Virginia.
Carilion's first effort to tap unstructured EHR information was an attempt at predictive analytics on its congestive heart failure population, Morgan says. Using a previously developed algorithm, the clinic worked with EHR vendor Epic and data analytics vendor IBM to perform regression analysis on patients with specific risk factors to see whether they later developed heart failure. It then applied those risk factors to a larger group of patients who may not yet have symptoms, as a way to establish a larger at-risk cohort.
When the analysis confirmed that the patients treated had, indeed, developed heart failure based on the early indicators, Carilion and its vendors began to refine the model.
To increase the model's accuracy, Carilion, Epic, and IBM together developed a proof of concept using the natural language processing capabilities of IBM's Watson technology, which can quickly read large volumes of unstructured documents and produce assessments.
By doing so, Carilion was able to add 3,000 more patients into its predictive model for a total of 8,000 patients, Morgan says.
"It was fairly significant," he says. "It was not unanticipated, because we knew, based on where we were at the time when we looked back over that data, that one of the key elements that we did not and now do capture discretely was a piece of data called ejection fraction, which really tells you about heart function. It did give us some guidance on where we might be able to change workflows to capture the data discretely going forward."