The rise of NLP in the healthcare industry

Nivash J P

The wide adoption of electronic health record (EHR) frameworks in healthcare produces enormous certifiable information that opens up new possibilities to direct clinical research. EHR contains patient information, which is a combination of both structured and unstructured data. Structured data recorded in an EHR follows a predefined format, whereas the unstructured data is the information that is stored without any specific format. Structured healthcare data includes the patient vitals, demographics, allergy information, existing health issues etc. According to NCBI, about 80% of data remains unstructured and is sometimes referred to as “the text blob”, which is stored in EHRs. The unstructured data includes clinical narratives and written texts, which are recorded by clinicians in natural language.

When it comes to understanding natural language, it becomes complex for rule based systems to identify the information even by using artificial intelligence (AI) based systems. Advanced natural language processing (NLP) engines utilize deep learning methods, which is a combination of traditional rule based mechanism and advanced machine learning (ML) algorithms to train datasets. This method enables NLP systems to learn how to classify new records based on the accumulated knowledge of all historical records.

In a research study published in JAMA, led by Harvey J. Murff and associates, NLP is advocated as an incredible asset to unlock unstructured information from EHRs. The outcome of the research proves that utilization of NLP in healthcare isn’t restricted to data inquiries; it uncovers NLP as an approach to identify what went wrong in negative occasions after a medical procedure. This utilization case is only a hint of something larger with regards to the estimation of NLP advances for healthcare.

The problem with the huge amount of unstructured data is that all the valuable clinical details are inaccessible. Industry experts anticipate that the NLP technique will play an incremental role in bringing life to the unstructured data, which leads to better clinical decisions in improving patient care and reducing healthcare costs.

The NLP cascade of benefits

As a lot of important clinical data is locked in clinical accounts, in this blog we discuss:

  • How NLP technique, as an artificial intelligence approach, is utilized to extract information from clinical stories in electronic health records
  • Various NLP use cases – mainstay, emerging and next gen

What is natural language processing (NLP) used for?

Natural language is the language used by humans to communicate, which is complex and diverse in nature. In this context, NLP is a process that allows computers to analyze natural language data, interpret it in a meaningful way and communicate back with humans. NLP facilitates machines in reading texts, deciphering voice data and identifying the relevant parts that bring structure to the unstructured data.

To build an NLP system, the data needs to be preprocessed, which includes:

Data cleaning: lower or upper casing, punctuation removal, HTML tag removal, stop word removal, spell check

Normalization: stemming, lemmatization, segmentation

Once the data is preprocessed, NLP pipelines can be built to extract desired outcomes. Here is an overview of an NLP pipeline:

Image courtesy: Matt Fortier

Here is a list of significant tasks that can be achieved with NLP:

  • Summarizing lengthy blocks of narrative text, by identifying key concepts or phrases present in the source material.
  • Mapping data elements present in unstructured text to structured fields, to improve data integrity
  • Converting machine-readable formats into natural language for reporting and educational purposes
  • Answering unique free-text queries that require the synthesis of multiple data sources
  • Engaging in optical character recognition to turn images, like PDF documents or scans of care summaries and imaging reports, into text files that can then be parsed and analyzed
  • Conducting speech recognition to allow users to dictate notes or other information that can then be turned into text

All these factors show that NLP is capable of bringing in clarity to unstructured data.

Key drivers of NLP in healthcare

Thanks to the modernization efforts in the healthcare industry, availability of large datasets is one of the factors that has led to the growth of NLP in healthcare. New pop health, clinical and operational use cases are evolving with the growth of NLP. The critical drivers of NLP in healthcare are:

  • Understanding and analyzing the unstructured information effectively to verify and identify medical codes in medical billing
  • Improving the quality of services offered, due to the rising needs of the value-based care (VBC)
  • Reducing the workload of medical practitioners

Based on these concepts, let us see some of the NLP-driven healthcare use cases listed in a leading NLP market scan research report. The use cases are grouped by their levels of maturity in the commercial space.

Mainstay NLP use case in healthcare

These are the commercially available solutions from multiple vendors that offer a proven return on investment.

Speech recognition

Speech recognition (SR) systems built using NLP algorithms are becoming an integral part of a medical practitioner’s gadgets. By using voice recognition devices, medical practitioners directly dictate their notes into the healthcare solutions and EHRs. There are two types of SR systems. The back-end SRs enable the voice data recording and the voice-to-text processing happens after the dictation is complete. Here the transcriptionists are expected to proofread the content. In the front-end SRs, real-time conversion of voice to text happens, which allows the clinicians to complete a patient report faster than having to wait for a transcriptionist. While the front-end SR offers faster report generation, it still requires the dictators to verify the recording.

As SR systems rely on dictionaries for voice-to-text conversions, healthtech companies have started providing an option to add custom medical dictionaries for different medical specializations like general medicine, pathology, and CT/MRT. These types of systems, with rich medical vocabulary, can iron out issues arising from misconstrued transcriptions, verbal communication or notes exchanged between experts.

Clinical documentation improvement

Clinical documentation is at the center of care delivery. It includes all the sensitive health records of a patient. As the information stored is meant for treating patients and future referencing, the details captured need to be exact, convenient for retrieval, and reflect the scope of services provided. To facilitate this process, the American Health Information Management Association (AHIMA) advocates clinical documentation improvement (CDI) programs that encourage the conversion of a patient’s clinical status into coded data. The coded data can be used to generate quality reporting, physician report cards, reimbursement, public health data, and many more.

AHIMA states that the “CDI has a direct impact on patient care by providing information to all members of the care team, as well as those downstream who may be treating the patient at a later date.’’ To make this consistent, NLP is widely used in identifying the key information from unstructured and semi-structured written and voice data, and automatically assigning it with appropriate medical codes.

Data mining research

Data mining is a knowledge discovery process that is based on identifying patterns in large datasets. In healthcare, this is routinely used to discover potentially useful and understandable correlations and recognize patterns that can benefit all parties involved in the healthcare industry. At one end of the spectrum, data mining can help medical practitioners identify effective treatments, which effectively leads to affordable healthcare services for the patients. On the other end, it helps healthcare organizations and health insurers analyze the effectiveness of treatments and predict the potential of readmissions, which leads to better customer relationships and reduced fraud and abuse.

Computer-assisted coding

Introduction of EHR into the medical coding process like ICD and CPT can be hard. Computer assisted coding (CAC), powered by NLP algorithms, connects the documentation in EHR with transcription systems and financial systems in the healthcare field. With NLP, CAC applications look for specific phrases and associate them with the right medical codes directly from clinical documents. It can make accurate correlations and improve the efficiency of the coding process.

Emerging NLP healthcare use cases

These are the NLP Healthcare use cases that are likely to gain momentum in the coming years.

Clinical Trial Matching

Clinical Trial Matching helps physicians to locate a list of clinical trials for an appropriate patient more conveniently and quickly. It also helps identify patients who are suitable for all trials in a particular clinical trial office. Improving screening quality and more efficient patient recruitment will help increase the goals and incentives for clinical trial enrolment.

Manual eligibility screening (ES) typically involves a laboratory-intensive examination of patient records using various tools. Automated assessment of electronic health records (EHR) can help, but a large proportion of the information is not computable as it is in free text. It would be best to boost the efficiency of a physician’s decision-making in clinical trial enrolment by using state-of-the-art natural language processing (NLP) and information extraction (IE) technologies. Typically, a predictive algorithm will classify patients who meet the core eligibility criteria of the clinical trial in order to dramatically minimize the pool of possible personnel screening candidates.

Prior authorization

One of the most challenging problems for doctors is securing prior authorizations from payers, that is, insurers. As per a recent survey by the American Medical Association (AMA), 84% of providers think prior authorizations are very burdensome, and 91% consider they have a detrimental effect on patient outcomes. However, the main worry is not the idea but the manual or partially automated process of prior authorizations. It consumes a lot of time and involves a lot of complexities regarding duplication of data, review requirements, etc. that impact both providers and patients.

The present solutions in the market don’t do justice in removing the pain-point completely. Since prior auths consist of a lot of repetitive and complex tasks, NLP can be very conveniently used to deal with data and make recommendations efficiently. IBM Watson and Anthem are already working on developing an NLP framework to automate prior authorizations.

Clinical decision support

Clinical Decision Support (CDS) offers information, insights and expertise to aid physicians, staff, patients etc. in optimizing decision-making — through computerized notifications, alerts and reminders. As manual data entry into CDS systems has its own shortcomings, CDS systems are incorporated into EHRs to streamline workflows through clinical guidelines, condition-specific order sets, patient data reports and summaries, templates for documentation, diagnostic assistance, context-relevant reference details, etc.

Today, a major portion of patient clinical notes are in free text form. NLP can be instrumental in using these free-text informations to drive CDS, representing clinical knowledge and CDS interventions in standardized formats, and leveraging clinical narrative.

Risk adjustment and hierarchical condition categories

Coding for Hierarchical Condition Category (HCC) is a risk-adjustment model originally developed to predict potential patient healthcare costs. As the world transitions to value-based payment models, HCC models are designed to forecast health outlays for a particular population of patients. Alongside, risk adjustment factor (RAF) ratings are used to change the quality and cost metrics. It is possible to calculate the quality and cost output more accurately by accounting for variation in patient complexity.

As opposed to manual coding, NLP-based coding can lower coding costs per chart and accelerate the time from chart retrieval to final data output. With NLP, there’s no need to code a chart to know whether there is an HCC mapped diagnosis code in it. Running NLP before coding not only gives better information about the financial value of each chart but also helps in chart segmentation. Thanks to NLP, coding accuracy reviews are more efficient, which means higher coding accuracy can be obtained at a lower cost. For RAF reviews, NLP can enhance risk adjustment programs by bringing more improvements in coding operations.

Next-generation NLP healthcare use cases

Finally, these are the NLP Healthcare use cases that are on the horizon.

Ambient virtual scribe

While the introduction of EHRs into physician practices over the past few years has had positive effects on several areas of patient care, this technological revolution has had its own share of drawbacks: patients often find their physician’s attention focused on a monitor or laptop in doctor’s examination rooms these days. Some physicians have employed scribes or assistants to make notes in the computer as the doctor is talking to the patient; while some write the details themselves as they speak to the patient.

With an ambient virtual scribe, this issue gets largely resolved. The system works through microphones in the examining room that record the conversation between the patient and doctor. The NLP kicks in by automatically transcribing the conversation into an EHR for follow-up treatments.

Computational phenotyping and biomarker discovery

The distinguishing proof of genomic biomarkers is a key advance towards improving indicative tests and treatments. The appearance of high throughput technologies capable of comprehensive analysis of genes, transcripts, proteins and other significant biological molecules has given a phenomenal chance to identify the subatomic markers of infection forms. Nonetheless, it has also confounded the issue of separating important atomic marks of organic procedures from these complex datasets. The procedure of biomarker disclosure and portrayal results in modern ways to coordinate, measure and master information-based methodologies. NLP may also be used to evaluate speech patterns and display predictive potential when it comes to neurocognitive disorders such as Alzheimer’s, autism, or other neurological or cardiovascular diseases.

Disaster surveillance

Disaster surveillance is the systematic collection, review, and evaluation of data on death, injury, and disease that allows for the detection of adverse health effects in the population. However, traditional models can only leverage the heterogeneous nature of population data.

Using NLP to better identify relative risks of patient populations based on EHRs, socio-economic and lifestyle factors is the way forward. NLP helps in the transition from basic descriptive analytics to predictive modeling and insights. It has the ability to identify and extract specific details from both structured and unstructured data.

NLP-based surveillance of disasters helps stakeholders to recognise risk factors, monitor disease patterns, recognise action items and plan interventions. This also assists to determine the effects of a catastrophe on public health and to identify future risks with an aim to prevent them.

Perfecting the NLP experience

Everything said and done, NLP in healthcare is still at a nascent stage. But the industry is eager to make strides in the effort. Semantic big data analytics and semantic processing ventures of NLP foundations are seeing major healthcare investments from some recognised players. In healthcare and life sciences, the global NLP market size will rise from $1.5 billion in 2020 to $3.7 billion by 2025, with a CAGR of 20.5%. NLP’s key growth drivers in the healthcare and life sciences industry include the growing urge for predictive modelling to minimise risks and optimise critical health issues, and the need for enhancing the usability of EHR data to boost patient treatment.

The key is perfecting the NLP experience in healthcare. NLP makes it easier for healthcare professionals to interact with machines without having to learn programming or technical skills. It is up to the healthcare providers to improve their strategic use of NLP, whether in improving their value-based care, billing or workloads. With the NLP landscape changing at an unprecedented speed, leaders in the healthcare industry are overwhelmed to make the right choices from the available options. To thrive in this context, having the right industry knowledge may not be sufficient. The need of the hour is a strong, able and expert technology partner that comes with a proven track record. This will help them in building resilient systems that can tap into the right opportunities and withstand uncertainties. Doing this will ultimately lead to more positive outcomes for key stakeholders in the healthcare industry.


Your email address will not be published. Required fields are marked *