
Unlocking the Value in Unstructured Data

By Scott Mace  |  December 08, 2015

Advances in natural language processing and the ability to read large volumes of data are creating new insights for clinicians.

This article first appeared in the November 2015 issue of HealthLeaders magazine.


Until now, the electronic health record has been largely about structuring data to drive initiatives such as value-based care and population health. But now technology is unlocking unstructured information from clinical narratives, making it useful and actionable.

"It's still a work in progress, but a necessary piece of technology that we need to learn to leverage to fully get information out of our EMRs," says Steve Morgan, MD, senior vice president and chief medical information officer for Carilion Clinic, an integrated delivery network headquartered in Roanoke, Virginia.

Carilion's first effort to tap unstructured EHR information was to attempt predictive analytics on its congestive heart failure population, Morgan says. Using a previously developed algorithm, the clinic engaged with EHR vendor Epic and data analytics vendor IBM to perform regression analysis on patients with specific risk factors, to see if they later developed heart failure, and then applied these risk factors to a larger cohort of patients who may not yet have symptoms, as a way to establish a larger at-risk cohort.
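The approach described above—scoring patients on known risk factors and flagging those above a threshold—can be sketched in a few lines. The factors, weights, and threshold below are purely illustrative assumptions; the article does not disclose Carilion's actual model.

```python
import math

# Hypothetical coefficients standing in for a previously fitted regression
# model; real risk factors and weights are not described in the article.
WEIGHTS = {
    "hypertension": 0.9,
    "diabetes": 0.6,
    "prior_mi": 1.4,
    "age_over_65": 0.7,
}
INTERCEPT = -2.5

def heart_failure_risk(patient: dict) -> float:
    """Return a 0-1 risk score from binary risk factors (logistic form)."""
    z = INTERCEPT + sum(w for factor, w in WEIGHTS.items() if patient.get(factor))
    return 1 / (1 + math.exp(-z))

def at_risk_cohort(patients: dict, threshold: float = 0.5) -> list:
    """Flag patients whose predicted risk meets or exceeds the threshold."""
    return [pid for pid, p in patients.items()
            if heart_failure_risk(p) >= threshold]

patients = {
    "A": {"hypertension": True, "diabetes": True, "prior_mi": True, "age_over_65": True},
    "B": {"hypertension": True},
}
print(at_risk_cohort(patients))  # ['A'] — only patient A scores above threshold
```

Applied across the full population, a score like this is what produces the "larger at-risk cohort" of patients who have not yet developed symptoms.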

When the analysis confirmed that the patients treated had, indeed, developed heart failure based on the early indicators, Carilion and its vendors began to refine the model.

To increase the model's accuracy, Carilion, Epic, and IBM together developed a proof of concept using the natural language processing capabilities of IBM's Watson technology, which is capable of quickly reading large volumes of unstructured documents and producing assessments.

By doing so, Carilion was able to add 3,000 more patients into its predictive model for a total of 8,000 patients, Morgan says.

"It was fairly significant," he says. "It was not unanticipated, because we knew, based on where we were at the time when we looked back over that data, that one of the key elements that we did not and now do capture discretely was a piece of data called ejection fraction, which really tells you about heart function. It did give us some guidance on where we might be able to change workflows to capture the data discretely going forward."
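Ejection fraction is a good example of a value that lives in narrative text rather than a discrete field. A minimal sketch of pulling it out of free text might use a regular expression, as below; Watson's actual NLP is far more sophisticated, and the pattern here is an illustrative assumption.

```python
import re

# Hypothetical extraction rule: match "ejection fraction", "LVEF", or "EF"
# followed within a few words by a percentage or percentage range.
EF_PATTERN = re.compile(
    r"(?:ejection fraction|LVEF|EF)\D{0,15}?(\d{1,2})\s*(?:-|to)?\s*(\d{1,2})?\s*%",
    re.IGNORECASE,
)

def extract_ejection_fraction(note: str):
    """Pull an ejection-fraction percentage out of free-text narrative."""
    m = EF_PATTERN.search(note)
    if not m:
        return None
    low = int(m.group(1))
    high = int(m.group(2)) if m.group(2) else low
    return (low + high) / 2  # midpoint when a range is reported

note = "Echo today. LVEF estimated at 35-40%. Continue diuretics."
print(extract_ejection_fraction(note))  # 37.5
```

Once a value like this is captured discretely—whether by extraction or by a changed workflow—it can feed the predictive model directly.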


Carilion has ongoing research to determine better ways to approach patients in the larger at-risk cohort. "We would like to be able to release this list to our providers and say these are folks we have identified as being at risk," Morgan says. "Here are the areas that you need to focus on with these individuals. Could these patients have different medications that could be added? Could these patients be treated in a different way? We know that a lot of these folks, and no surprise, needed some behavior modification in their lifestyle such as smoking cessation and weight reduction."

The staffing investment on Carilion's part in the unstructured data initiative was minimal. "We had one analyst that worked directly with IBM," Morgan says. "There was also an analyst from Epic that was involved. We also had, just to be able to feed the information, some of our database administrators involved, and to be able to generate the reports, but it was a true collaborative between the three entities, trying to figure out how we can leverage this type of technology."

Carilion is preparing to have discussions with IBM about using this technology to find specific diagnoses in unstructured text in pathology reports, as a foundation for building decision-support systems to live inside the Epic EHR for providers, Morgan says.

Morgan notes that he often discusses the untapped potential of unstructured data with other CMIOs and CIOs. "We all struggle with the same problem with EMRs," he says. "Do you try to push physicians and providers and nurses into documenting into discrete data blocks, or do you give more leeway to be able to do transcription or voice recognition?"

Personal insights
Eskenazi Health, anchored by a 315-bed campus in Indianapolis, recently piloted an initiative using Watson to analyze the contents of call center conversations, not just for the words being said but also "the duration of the phone call, the sentence structure that's being used, the pitch and tone of the voice—to basically look at all that information against Watson analytics and what Watson has categorized as personal insights or persona insights," says Parveen Chand, MHA, FACHE, chief operating officer of Eskenazi Health.


"If we can potentially discern whether or not you are a patient that's at risk of leaving our system, and I know that because we've got a voice profile on you, I may want to tweak my conversation a little differently," Chand says.

Chand says that with a payer mix that is highly government- or self-pay-dependent, Eskenazi is looking for new ways to engage its community. "We see unstructured data as really a new pathway for us to mine and better refine our current relationships we actually have with our patients, as well as exploring new markets," Chand says.

With Patient Insights, an application by Indianapolis-based hc1.com that uses Watson's data, Eskenazi can also identify and address at-risk populations.

One sign of such populations turns out to be no-show rates for appointments. With such rates previously running as high as 40%, "you can already imagine what that means for staffing, what that means for physician resources and clinical resources overall," Chand says.

Using hc1 since February to analyze daily incoming admission, scheduling, discharge, and transfer feeds, Eskenazi identified such patients and ways to engage them via text blasts and reminder notifications. "We have seen at least a 10% to a 15% drop in our no-show rates," Chand says. "In some of our clinics we're down to 25% no-show rates now. And a few others, we're actually down to 8%." The urgent need to get this no-show rate under control was emphasized in January when Indiana passed Healthy Indiana Plan 2.0, the state's equivalent to Medicaid expansion, he adds.

Other potential data sources for Watson to feed hc1 include analysis from social media accounts that patients have chosen to make public on services such as Twitter, Tumblr, Pinterest, or Facebook, Chand says. Such analyses could trigger invitations from Eskenazi to patients for classes of special interest to them, he says.

Part of utilizing such assembled knowledge includes properly safeguarding aggregated patient data as part of HIPAA, Chand notes. "The pure number of lab results that are sitting out there in all these information exchanges across the country, that's an endless potential for things like fast-tracking drug delivery and moving clinical trials along faster," he says.

Imaging insights
Radiology reports represent a huge repository of unstructured data, and a national virtual radiology practice is using natural language processing from SyTrue to improve quality and identify unnecessary imaging orders.

Based in Eden Prairie, Minnesota, Virtual Radiologic supplies virtual radiology services to more than 2,000 healthcare facilities, mostly in the United States. With more than 350 radiologists working as independent contractors, vRad is able to provide specialized radiology services, says CIO Shannon Werb.


"We're estimating to complete close to 6 million examinations this year, so we ingest a significant amount of information—order information metadata, images, prior reports—and present that to the radiologist," he says. "Ultimately what they produce by work product is that report. Historically that report has been highly unstructured and it hasn't been able to be leveraged. That's what we're trying to do with natural language processing."

To produce meaningful reports for its clients, vRad needs to ensure all procedure data is consistent—for example, that all of its client hospitals use the same name for an "x-ray of the foot"—which is not common practice. One hospital's "x-ray of the foot" could be a "lower-extremity study" at the hospital across the street, Werb says. Normalized, standard data is the essence of producing meaningful reports, and vRad's internally developed data normalization tool, called vCoder, provides that consistency, he says.
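The normalization problem Werb describes—many local names for one procedure—reduces to mapping site-specific strings onto canonical codes. The table and codes below are illustrative assumptions; vCoder's actual mappings are proprietary.

```python
# Hypothetical alias table mapping local procedure names to canonical codes.
CANONICAL = {
    "x-ray of the foot": "XR FOOT",
    "xray foot": "XR FOOT",
    "lower-extremity study": "XR FOOT",   # one hospital's local alias
    "ct head without contrast": "CT HEAD WO",
    "head ct w/o contrast": "CT HEAD WO",
}

def normalize_procedure(name: str) -> str:
    """Map a site-specific procedure name onto one canonical code."""
    key = " ".join(name.lower().split())  # collapse case and whitespace
    return CANONICAL.get(key, "UNMAPPED")

print(normalize_procedure("X-Ray of the Foot"))      # XR FOOT
print(normalize_procedure("Lower-Extremity Study"))  # XR FOOT
```

With every site's orders resolved to the same codes, utilization can be compared apples-to-apples across 2,000 facilities.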

Together, the SyTrue technology and vCoder were able to determine whether a particular imaging referral was warranted, Werb says.

"The radiology industry and hospitals are grappling with things like clinical decision support and training the ER physicians around appropriateness criteria," he says. "Overutilization of radiation and overutilization of imaging in the ER is a common topic, especially with pediatric patients."

One result: Healthcare organizations can increase their training of physicians in appropriateness criteria for referrals for advanced imaging.

Werb acknowledges that some vendors already utilize natural language processing purely around billing, but he asserts that the vRad practice is part of a new generation of virtual services that focus more on the clinical informatics side.

Analyzing the radiology report's unstructured text, vRad can distinguish between incidental findings such as a benign cyst in a patient's lung, and an acute critical finding such as an intracranial hemorrhage in the patient's brain, Werb says.
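In its simplest form, distinguishing incidental from critical findings is a text-classification task. The keyword rules below are a deliberately naive illustrative assumption—the article does not describe vRad's actual method—but they show the shape of the triage step.

```python
# Hypothetical keyword rules; real systems use trained NLP models, not lists.
CRITICAL_TERMS = ("intracranial hemorrhage", "pulmonary embolism", "aortic dissection")
INCIDENTAL_TERMS = ("benign cyst", "simple cyst", "granuloma")

def triage_finding(report_text: str) -> str:
    """Label report text as critical, incidental, or unclassified."""
    text = report_text.lower()
    if any(term in text for term in CRITICAL_TERMS):
        return "critical"    # needs immediate communication
    if any(term in text for term in INCIDENTAL_TERMS):
        return "incidental"  # routine follow-up
    return "unclassified"

print(triage_finding("Acute intracranial hemorrhage in the left frontal lobe."))   # critical
print(triage_finding("Small benign cyst in the right lung, no acute findings."))   # incidental
```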

From its internal data warehouse, vRad can help its healthcare customers "tell stories about their internal performance, maybe our performance for the services we're offering them, or their performance for the services that they're offering their hospital customers."

The investment in natural language processing technology isn't cost-free, but Werb believes it pays for itself over time.

"Are we seeing a return on our investment I can measure in actual dollars? No. But are we seeing the results associated with the implementation, and how it's driving a better experience from our clients? Absolutely."

Perhaps another telling indication of vRad's success: In May, the 14-year-old company was acquired for $500 million by Mednax, Inc., a publicly traded company that provides maternal-fetal, newborn, pediatric subspecialty, and anesthesia physician services.

"We recognize that improving quality of radiology services could result in a downturn in volume," Werb says. "We make decisions in our business based upon providing the best possible patient care, so we believe solutions like this allow us to continue to drive toward that goal. Now, imaging volume and imaging procedure volume continue to be on the increase, but as we move to more of a fee-for-value-based model, I think solutions like NLP are going to help us significantly.

"They [the radiologists] may have thousands of images hung on the screen—the current study," he says. "They may have literally tens of thousands of images of prior studies. They may have many or dozens of prior reports and clinical history in front of them. And they need to evaluate that in an emergent situation in a very rapid time and, of course, produce very high-quality, accurate diagnoses for patients. We think when you put NLP in the front, we could interpret that information for them and drive their eyes to the most relevant prior reports and the most relevant elements in the clinical history. We think that would allow our physicians to provide a better service with a more rapid turnaround time."

Reprint HLR1115-8

Scott Mace is the former senior technology editor for HealthLeaders Media. He is now the senior editor, custom content at H3.Group.
