Unlocking EHRs – The Nirvana for Automated Underwriting

Technology-driven innovations have optimized several pieces of the insurance value chain, yet customers continue to be less than thrilled. Applicant drop-out rates are still as high as 40%, and the fully automated digital sale of life insurance policies remains a distant dream. Traditional methods of gathering risk-relevant data, like attending physician's statements (APS) and lab results, sit on the critical path to any further automation. The recent explosion of digital data and access to new risk-relevant data points has made the underwriting process more time-consuming and costly. Accelerated underwriting initiatives have attempted to substitute APS with less invasive ways of gathering clinical data, but the results are less than optimal.

Life risk data has been increasingly complemented by new sources of digital data (from aggregators), with the goal of striking the right balance between risk evaluation and underwriting efficiency. One of the more recent trends is driven by the emergence of lifestyle data generated by wellness devices and apps. Insurers are finding ways to integrate it meaningfully into the underwriting process in order to deliver a more personalized offering to their customers.

Electronic health data combined with lifestyle data holds tremendous potential to anticipate, manage and improve life and disability risks. Reinsurance giant Swiss Re is leading the research on leveraging lifestyle data to personalize risk assessment, help customers manage their health and wellbeing, and improve the customer's journey. Read more on Swiss Re's "Big Six" lifestyle factors and Personal Resilience Score model, which are destined to transform underwriting, risk pricing and new product design in the digital era.

Digital transformation of the insurance value chain has helped automate several tasks that once required humans. Advanced technologies, including AI and ML, show tremendous potential to fully automate processes and achieve ultimate sophistication in the form of a touchless function (i.e., discover, analyze, design, automate, measure, monitor, reassess). Digitization of healthcare data and the rapid emergence of health ecosystems/platforms provide enormous opportunities for life insurers to take a giant leap forward in achieving touchless underwriting.

What are EHRs?

Life insurers have been exploring ways to gather Electronic Health Record (EHR) data. An EHR is an electronic version of a person's medical history. It is maintained by the healthcare provider over time, and may include all of the key administrative and clinical data relevant to the person's care under a particular provider, including demographics, progress notes, problems, medications, vital signs, past medical history, immunizations, laboratory data, and radiology reports. The US healthcare system is a highly complex, decentralized system with multiple private and public players. Typically, we see five players in the healthcare system, each holding its own view (version) of a person's EHR based on its interests (incentives):

  1. Providers: Interested in providing healthcare to their patients. Obtain and protect EHR data which is heavily regulated. Providers include medical professionals (physicians, nurses, emergency services, etc.) and organizations (hospitals, labs, rehabs, etc.)
  2. Insurers: Interested in managing healthcare risk. Obtain healthcare service/cost data needed to pay providers as per insurance contract.
  3. Pharmacies: Interested in providing drugs to their patients. Obtain and protect prescription drug data.
  4. Government: Interested in understanding patterns of disease, healthcare cost and regulating services. Also keeps track of hospital and physician licenses.
  5. Patients: Interested in keeping themselves healthy. Share their personal sensitive data with other players.

Risk-relevant data comes to insurers from different sources for different reasons, with different issues. A patient may see physicians in different practices, so multiple systems each own a piece of the record and no single system holds a complete account of the care provided. The average person sees more than 18 providers in their lifetime. This is the key issue with provider EHRs: they lack care information across organizations. This data fragmentation has led to the rise of third-party aggregators and Health Information Exchanges (HIEs), which collect medical records from various data sources and aggregate them to simplify data collection for insurance carriers. Fear, confusion, and misunderstanding around data security and privacy have artificially raised the barriers to accessing clinical text. Thanks to modern technology, there are ways to anonymize, de-identify, or redact personal data in a compliant manner.

Types of Healthcare data in EHRs

The healthcare data in EHRs varies along several dimensions, occurs over different timescales, and is generated at different points in the patient's journey, with different possible values and different patterns of missing values. It can be broken down into two categories:

  1. Structured data (~20% of EHR): Primarily includes demographics, medications, lab tests, etc. This data, which is machine readable for further processing and analytics purposes, is organized into tabular structures.
  2. Unstructured data (~80% of EHR): Primarily includes clinician notes written with a maze of acronyms, as well as images and signals from sensors. Unstructured data is growing by ~60% annually, and needs heavy processing before it becomes machine readable.

Reading an EHR is a knowledge extraction task in which a combination of structured and unstructured data must be processed, including medical coding classifications, images, lab test results and narrative text. Data produced by a healthcare entity might be inaccurate or biased depending on that entity's incentives in treating a patient. Diagnostic and procedure codes are documented to generate a bill of service and may be influenced by reimbursement considerations. One way to overcome some of these data issues is to combine data sources; however, this is technically challenging and may itself introduce biases and errors.

Healthcare Knowledge Graphs

One of the first challenges in making the EHR data useful is to organize a timeline view of patients’ data from all the different data types that are generated at various places in the healthcare ecosystem. Healthcare happens over time, and the timeline view explicitly captures when the patient experienced each event. The timing of events is important for insurers because the correlation between the customer's age and the sequence of events plays a crucial role in accurate risk assessment. The timeline view also helps in learning exposure (something that could happen) and outcome (condition happening usually at some point after the exposure), which fuels the predictive power of modern algorithms.
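
As a rough illustration of the timeline view, the sketch below merges events from several sources into a single date-ordered list and computes the applicant's age at each event. The `ClinicalEvent` fields, the source labels, and the sample events are all hypothetical and are not taken from any particular EHR system.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ClinicalEvent:
    when: date
    source: str        # hypothetical labels: "provider", "pharmacy", "lab"
    kind: str          # hypothetical labels: "diagnosis", "medication", "lab_result"
    description: str


def build_timeline(*feeds):
    """Merge event lists from several sources into one date-ordered timeline."""
    merged = [event for feed in feeds for event in feed]
    return sorted(merged, key=lambda e: e.when)


def age_at(event, birth_date):
    """Age in whole years on the day of the event."""
    had_birthday = (event.when.month, event.when.day) >= (birth_date.month, birth_date.day)
    return event.when.year - birth_date.year - (0 if had_birthday else 1)


# Made-up sample data, one feed per source.
provider_feed = [ClinicalEvent(date(2019, 3, 14), "provider", "diagnosis", "type 2 diabetes")]
pharmacy_feed = [ClinicalEvent(date(2019, 3, 20), "pharmacy", "medication", "metformin")]
lab_feed = [ClinicalEvent(date(2019, 3, 10), "lab", "lab_result", "HbA1c 7.9%")]

timeline = build_timeline(provider_feed, pharmacy_feed, lab_feed)
birth = date(1975, 6, 1)
for e in timeline:
    print(e.when.isoformat(), age_at(e, birth), e.source, e.kind, e.description)
```

Sorting by event date puts the lab result ahead of the diagnosis and the prescription, which is the kind of exposure-then-outcome ordering the timeline view is meant to expose.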

Healthcare knowledge graphs are extremely useful in understanding clinical data and processing clinical text. Knowledge graphs, also known as ontologies, declare entities (and their synonyms) and the relationships between them. This helps in identifying different terms with the same meaning. Numerous knowledge graphs exist in the medical science domain; the UMLS (Unified Medical Language System) Metathesaurus is a union of over 140 knowledge graphs, along with declarations of relationships between these knowledge graphs. Here are some of the important knowledge graphs that play a vital role in processing EHRs:

  1. The International Classification of Diseases (ICD) is the official system of assigning codes to diagnoses and procedures associated with hospital utilization in the United States. ICD-10, the 10th revision of the ICD, is also used to code and classify mortality data. The National Center for Health Statistics (NCHS) and the Centers for Medicare and Medicaid Services are the U.S. governmental agencies responsible for overseeing all changes and modifications to the ICD.
  2. Logical Observation Identifiers Names and Codes (LOINC) is a database and universal standard for identifying medical laboratory observations. LOINC applies universal code names and identifiers to medical terminology related to electronic health records. The purpose is to assist in the electronic exchange and gathering of clinical results (such as laboratory tests, clinical observations, outcomes management and research).
  3. Current Procedural Terminology (CPT) is a set of codes, descriptions, and guidelines intended to describe procedures and services performed by physicians and other health care providers. CPT is maintained by the American Medical Association.
  4. Systematized Nomenclature of Medicine (SNOMED) is a systematic, computer-processable collection of medical terms in human and veterinary medicine. It provides codes, terms, synonyms, and definitions that cover anatomy, diseases, findings, procedures, microorganisms, substances, etc. It allows a consistent way to index, store, retrieve, and aggregate medical data across specialties and sites of care.
  5. RxNorm provides normalized names for clinical drugs and links its names to many of the drug vocabularies commonly used in pharmacy management and drug interaction software, including those of First Databank, Micromedex, and Gold Standard Drug Database. By providing links between these vocabularies, RxNorm can mediate messages between systems not using the same software and vocabulary. RxNorm is provided by the US National Library of Medicine.
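
To make the "different terms with the same meaning" idea concrete, here is a minimal sketch of synonym normalization against a toy concept table. The concept IDs (`C-HTN`, `C-T2DM`) and the synonym lists are invented for illustration; a real system would load them from resources such as UMLS or SNOMED rather than hard-code them. The ICD-10-CM codes shown (I10, E11.9) are the usual codes for these conditions.

```python
# Hypothetical miniature concept table, hard-coded only for illustration.
CONCEPTS = {
    "C-HTN": {
        "preferred": "hypertension",
        "synonyms": {"high blood pressure", "htn"},
        "icd10cm": "I10",
    },
    "C-T2DM": {
        "preferred": "type 2 diabetes mellitus",
        "synonyms": {"type 2 diabetes", "t2dm", "adult-onset diabetes"},
        "icd10cm": "E11.9",
    },
}


def build_index(concepts):
    """Map every lower-cased term (preferred name or synonym) to its concept ID."""
    index = {}
    for concept_id, concept in concepts.items():
        for term in {concept["preferred"], *concept["synonyms"]}:
            index[term.lower()] = concept_id
    return index


INDEX = build_index(CONCEPTS)


def normalize(term):
    """Return the concept ID for a term, or None if the term is unknown."""
    return INDEX.get(term.strip().lower())


for raw in ["HTN", "High blood pressure", "T2DM", "migraine"]:
    concept_id = normalize(raw)
    code = CONCEPTS[concept_id]["icd10cm"] if concept_id else None
    print(raw, "->", concept_id, code)
```

An exact-match lookup like this only handles known spellings; it does nothing for misspellings or ambiguous abbreviations.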

Healthcare Data Exchange Standards

The Health Information Technology for Economic and Clinical Health Act (HITECH) created incentives for providers to use EHR systems. These systems worked well for siloed provider organizations until the need for cross-provider communication surfaced in order to deliver coordinated care to patients. Interoperability of healthcare data remains a big challenge for the industry, and it's probably the biggest tech hurdle in unlocking EHRs. The need to harmonize the data formats and structures of different providers has accelerated the evolution and adoption of healthcare data exchange standards like Health Level Seven (HL7). HL7 is a set of international standards for the transfer of clinical and administrative data between software applications used by various healthcare providers. The HL7 standards are produced by Health Level Seven International, an international standards organization, and are adopted by other standards-issuing bodies such as the American National Standards Institute and the International Organization for Standardization.

The HL7 Clinical Document Architecture (CDA) is an XML-based markup standard intended to specify the encoding, structure and semantics of clinical documents for exchange. The Continuity of Care Document (CCD) specification, based on CDA, is the U.S. specification for sharing patient summaries. The CDA specifies that the contents of a document consist of a mandatory textual part (which ensures human interpretation of the document contents) and optional structured parts (for machine processing). The structured part is based on the HL7 Reference Information Model and provides a framework for referring to concepts from coding systems such as SNOMED or LOINC. At the time of writing, Consolidated CDA (C-CDA) 2.1 is the latest version of the standard.
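
The split between the human-readable narrative and the coded, machine-readable entries can be sketched with Python's standard XML parser. The XML fragment below is a heavily simplified, hand-written stand-in for a CDA section and is not a schema-valid CCD. The `extract_sections` helper and its output shape are assumptions made for this example. The namespace `urn:hl7-org:v3` and the two code-system OIDs (LOINC and SNOMED CT) are the standard identifiers.

```python
import xml.etree.ElementTree as ET

NS = {"hl7": "urn:hl7-org:v3"}

# Standard OIDs for two common code systems.
CODE_SYSTEMS = {
    "2.16.840.1.113883.6.1": "LOINC",
    "2.16.840.1.113883.6.96": "SNOMED CT",
}

# Simplified, hand-written fragment; a real CCD carries far more structure.
SAMPLE = """<ClinicalDocument xmlns="urn:hl7-org:v3">
  <component><structuredBody><component><section>
    <code code="11450-4" codeSystem="2.16.840.1.113883.6.1" displayName="Problem list"/>
    <text>Hypertension, diagnosed 2018.</text>
    <entry><observation classCode="OBS" moodCode="EVN">
      <value code="38341003" codeSystem="2.16.840.1.113883.6.96"
             displayName="Hypertensive disorder"/>
    </observation></entry>
  </section></component></structuredBody></component>
</ClinicalDocument>"""


def extract_sections(xml_text):
    """Return each section's code, narrative text, and coded values."""
    root = ET.fromstring(xml_text)
    sections = []
    for section in root.iterfind(".//hl7:section", NS):
        code_el = section.find("hl7:code", NS)
        text_el = section.find("hl7:text", NS)
        coded = [
            (
                CODE_SYSTEMS.get(value.get("codeSystem"), "unknown"),
                value.get("code"),
                value.get("displayName"),
            )
            for value in section.iterfind(".//hl7:value[@code]", NS)
        ]
        sections.append({
            "section_code": code_el.get("code") if code_el is not None else None,
            "narrative": "".join(text_el.itertext()).strip() if text_el is not None else "",
            "coded_values": coded,
        })
    return sections


for section in extract_sections(SAMPLE):
    print(section)
```

The narrative string is what an underwriter would read, while the coded values are what an automated rule or model could consume directly.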

Technical challenges in unlocking EHR value

Digital health data available in EHRs holds the key to faster, cheaper, and more accurate risk assessment for life insurers. Some medical conditions are accurately represented by codes, while many others need detailed analysis of the text in clinical notes in order to derive the hidden medical condition. Natural language processing (NLP) is a branch of Artificial Intelligence which helps computers understand, interpret, and manipulate human language in text form. NLP techniques can capture unstructured data, analyze its grammatical structure, determine the meaning of the information and summarize it. For EHR use cases, NLP techniques are used to extract key information, including diagnoses, recommendations, the timeline, symptoms, and events. The real power of NLP and ML models extends beyond data extraction into enabling predictive analytics, fulfilling the ultimate wish of an underwriter.

NLP successes remain largely in the lab, since developing highly accurate NLP programs is still a very difficult problem. Processing free text in a clinically meaningful way faces several challenges and needs a powerful combination of computational linguistics and healthcare domain knowledge. EHR data needs cleansing, normalization, and enrichment before it can be used in an automated underwriting process. The normalization process should aim to produce structured health data that fully complies with national document standards like C-CDA 2.1. Here are some common, yet tricky, technology challenges in extracting information (truth) from clinical text:

  1. Content recognition – Word sense disambiguation for acronyms and abbreviations, e.g., BK could mean below knee or a virus, and SOB could mean shortness of breath or crying.
  2. Context recognition – Distinguishing a condition that the patient has or had from one that a family member has or had, in other words separating personal history from family history, e.g., heart disease on the father's side of the family.
  3. Named entity recognition – Identifying words/phrases of interest, e.g., problems, tests, treatments, diagnoses, symptoms, etc.
  4. Relation extraction between entities or events – Recognizing the subject and object in a sentence, e.g., distinguishing “wife helps applicant with meds” from “applicant helps wife with meds”.
  5. Negation identification – Narrative clinical reports use terms such as “no”, “denied”, and “negative” to indicate the absence of a condition. ~40% of content in clinical text is stated as a negation.
  6. Timescale association to the right concept – e.g., “patient suffering from headache for the last three days” needs additional context (such as the date of the note) to be placed accurately on the timeline.
  7. Conflicting information spread across multiple documents/results – Linking information across documents to derive the correct clinical terminology/codes, e.g., the doctor prescribed a medication that the patient didn't buy or use.
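
Two of the challenges above, negation identification and context recognition, can be illustrated with a deliberately naive rule-based sketch. It is loosely inspired by trigger-word approaches such as NegEx but is not an implementation of any published algorithm. The cue lists and the `classify_mention` function are hypothetical, and a production system would need scoping rules, many more cues, and clinical validation.

```python
import re

# Hypothetical, far-from-complete cue lists.
NEGATION_CUES = ("no", "denies", "denied", "negative for", "without")
FAMILY_CUES = ("father", "mother", "brother", "sister", "family history")


def _has_cue(text, cues):
    """True if any cue appears in the text as a whole word or phrase."""
    return any(re.search(rf"\b{re.escape(cue)}\b", text) for cue in cues)


def classify_mention(sentence, condition):
    """Classify how a condition is mentioned within a single sentence.

    Returns "family_history", "absent", "present", or None when the
    condition is not mentioned at all.
    """
    text = sentence.lower()
    position = text.find(condition.lower())
    if position == -1:
        return None
    if _has_cue(text, FAMILY_CUES):
        return "family_history"
    # Only look for negation cues in the words preceding the mention.
    if _has_cue(text[:position], NEGATION_CUES):
        return "absent"
    return "present"


examples = [
    ("Patient denies chest pain.", "chest pain"),
    ("Father had heart disease at age 60.", "heart disease"),
    ("Reports chest pain on exertion.", "chest pain"),
    ("Patient has known diabetes.", "diabetes"),
]
for sentence, condition in examples:
    print(f"{sentence!r} -> {classify_mention(sentence, condition)}")
```

The whole-word matching keeps "known" or "diagnosed" from being read as the cue "no". The sketch would still mislabel sentences where a cue and the condition are unrelated, which is part of why the list above calls these challenges tricky.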

Even though NLP has challenges to resolve, insurers can still benefit while the capabilities evolve, starting with more attainable goals and moving to more complex use cases. Some areas where NLP is already effective are in flagging applicants who have a specific medical condition (cancer or diabetes) or who are at an increased risk of a specific condition based on their family history or other available information.

The nirvana for automated underwriting

Digital health ecosystems are on the rise, with healthcare services centered on customers. Ecosystems are advancing the concept of healthcare interoperability and trying to aggregate data, making it available to the caregiver in a meaningful and timely fashion. NLP technologies are showing some maturity for targeted use cases involving insight generation from unstructured clinical data. Insurers should bolster NLP proficiency by first deploying it in low-risk scenarios, which could help build confidence in its capabilities while avoiding the risk of adverse events. Smart integration of NLP with the underwriting workflow is critical to realizing any productivity gains. If we don't integrate it with the workflow, we're just creating more work without more value.

Hit rates on EHRs for life & disability insurance applicants are growing as more and more healthcare providers implement platforms or join a digital health ecosystem. EHR system interoperability (exchanging and interpreting shared data) is improving with the adoption of standards and technological advancements. Widespread adoption of EHRs will replace the APS (attending physician statement) with real-time digital health information, which can be processed automatically (using standard ICD/LOINC codes) to achieve automated risk assessment, thereby significantly improving customer experience.

Getting started on your EHR-based underwriting journey could involve substantial effort, but it also gives you the opportunity to influence data aggregators as they develop solutions, and to achieve competitive advantage. It's just a matter of time before healthcare information becomes available as a digital data stream for insurers to achieve their accelerated underwriting aspirations.

EHRs hold tremendous potential to be the missing link for life & disability insurers to achieve nirvana for automated underwriting. Traditional insurers must start developing programs that leverage EHRs and improve customer experience, or else be prepared to face competition from InsurTech start-ups (Health API, InnovAccer, etc.) and tech giants (Apple, Google, etc.) who aren't leaving any stone unturned to capitalize on the EHR opportunity.

Kevin Smith, AALU CLU FLMI

Adventurer / Full Time Papa

4 years ago

I fully agree with Chris. This article takes all the pieces and the challenges together in one well written article. Well done!

Deepak Krishna Nayak

Technologist @ICE Bonds at Intercontinental Exchange

4 years ago

Thanks for sharing

Chris Behling

Executive Leadership Group, Northwestern Mutual

4 years ago

Easily the most info on EHRs and life insurance I’ve seen in one place. Thanks Rajendra Prasad!
