Insurance claims automation

Case study

In the document-driven insurance industry, manually processing invoices and receipts every day is highly monotonous. It is also tedious for staff to switch between multiple applications to find specific, client-related claim information.

From entering the claim information to reimbursing the claim amount, claims processing has become highly time-consuming and prone to data errors. Automation helps insurers quickly extract, categorize, and analyze data. It also helps them reduce costs, improve operational efficiency, and process claims faster for their customers.

Our client is one of the top-rated pet insurance providers in the USA. They provide insurance cover for dogs and cats, which helps pet parents lessen their financial burden and obtain the best possible medical care for their pets.


From minor skin lesions to advanced cancer treatments, our client has to process more than 3,000 pet insurance claims every day. These claims contain digital copies of medical invoices, received from various hospitals and clinics in different formats.

The claims processors had to look up individual accounts in Salesforce manually and update the claims invoice data. Processing a single claim invoice took 5 to 15 minutes, which created a backlog and delayed processing by 2-3 days. To reduce operational costs, increase operational efficiency, and lower claims Average Handling Time (AHT), our client required an automated process that could:

  • Extract data from invoices
  • Update the extracted information in Salesforce against the respective pet account
  • Extract data from 80% of the total volume of invoices (coverage)
  • Maintain 90% accuracy in data extraction


  • The medical invoices come from any of 24,000 veterinary hospitals spread across the USA, in different formats and with varying numbers of pages.
  • Unnecessary documents, such as CC slips and medical reports, are attached along with the invoice.
  • Invoices and claim forms have no predefined format across hospitals (placement of invoice data such as logo, treatment provided, cost incurred, etc.).
  • Pet and parent names in the claims or medical invoices may not always match the names in the policy account.
  • The scanned copies of the medical invoices are not of high quality; they may be blurred, distorted, folded, or of low resolution.
  • Claims submitted via email and fax can be difficult to link to a pet account.


Imaginea proposed an RPA bot coupled with an OCR (Optical Character Recognition) tool. This enabled complete automation of the data capture and validation process.

How our solution helped

Accuracy in data

Reduction in claims processing time

Reduction in claims cycle turnaround time

Overall approach

Imaginea's strategy was to categorize the invoices by template. Although invoices were received from 24,000 hospitals, the billing software and templates used to create them were limited to a few hundred. We ran the DBSCAN clustering algorithm over the master data set of 24,000 invoice samples and categorized them into 1,000 invoice formats. We then refined the clustering and reduced these 1,000 formats to 170.
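
The clustering step can be illustrated with a minimal sketch. The case study does not describe how invoices were represented or which parameters were used, so everything here is an assumption for illustration: each invoice is reduced to the set of field labels printed on it, similarity is measured with Jaccard distance, and the `eps` and `min_samples` values and the sample label sets are hypothetical.

```python
from typing import List, Optional, Set

NOISE = -1  # cluster id given to invoices that fit no dense group


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """1 - |a & b| / |a | b|; 0.0 means the two label sets are identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def dbscan(items: List[Set[str]], eps: float, min_samples: int) -> List[int]:
    """Plain DBSCAN: returns one cluster id per item, -1 for noise."""
    labels: List[Optional[int]] = [None] * len(items)

    def neighbours(i: int) -> List[int]:
        # All items within eps of item i (includes i itself).
        return [j for j in range(len(items))
                if jaccard_distance(items[i], items[j]) <= eps]

    cluster = -1
    for i in range(len(items)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        seeds = neighbours(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_samples:
                queue.extend(reachable)  # core point: keep growing the cluster
    return labels


# Hypothetical label sets, one per invoice sample.
invoices = [
    {"invoice", "patient", "description", "qty", "total"},
    {"invoice", "patient", "description", "qty", "total", "tax"},
    {"invoice", "patient", "description", "qty", "amount"},
    {"statement", "client", "animal", "service", "charge", "balance"},
    {"statement", "client", "animal", "service", "charge"},
    {"receipt", "card", "approval"},  # e.g. an unrelated attachment
]

labels = dbscan(invoices, eps=0.4, min_samples=2)
print(labels)
```

In this toy data the first three samples fall into one format, the next two into another, and the last one is left as noise. DBSCAN suits this kind of problem because the number of formats does not have to be chosen in advance.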

To shortlist the invoice formats further, we applied the 80/20 rule. The idea was to identify the invoice formats used by the top 20% of the hospitals, on the expectation that they contribute 80% of the total invoice volume. We analysed the top 20% to 30% of the veterinary hospitals and found that their invoice formats did contribute 80% of the total invoice volume. Applying the 80/20 rule reduced the number of formats from 170 to 39. In practice, these 39 formats covered about 90% of the total volume.
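
As a rough illustration of this shortlisting, the sketch below ranks formats by invoice volume and keeps the highest-volume ones until a target share of the total is covered. The function name, the format names, and the volumes are hypothetical; the case study does not publish the underlying figures.

```python
from typing import Dict, List


def formats_for_coverage(volume_by_format: Dict[str, int], target: float) -> List[str]:
    """Pick the highest-volume formats until `target` share of volume is covered."""
    total = sum(volume_by_format.values())
    if total == 0:
        return []
    chosen: List[str] = []
    covered = 0
    ranked = sorted(volume_by_format.items(), key=lambda kv: kv[1], reverse=True)
    for name, volume in ranked:
        if covered / total >= target:
            break  # target share already reached
        chosen.append(name)
        covered += volume
    return chosen


# Hypothetical monthly invoice counts per format (illustrative only).
volumes = {
    "format_a": 520,
    "format_b": 310,
    "format_c": 90,
    "format_d": 50,
    "format_e": 30,
}

shortlist = formats_for_coverage(volumes, target=0.80)
print(shortlist)
```

Here two of the five formats already cover 83% of the volume, which is the same effect the shortlist of 39 formats relies on.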

We then created a document definition for each of the 39 invoice formats in the ABBYY OCR tool to extract the required data. A document definition helps the tool identify the format first and then locate the required data in the invoice. The solution relies on the labels present in the invoice, their locations, and their positions relative to each other.

Another major hurdle was account mapping, so we developed our own string comparison algorithm to map each invoice to a pet account in Salesforce.
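
The algorithm itself is not described in this case study. As a stand-in, the sketch below shows one common way to do fuzzy name matching, using `difflib.SequenceMatcher` from the Python standard library. The account records, field names, and the 0.75 threshold are assumptions for illustration, and a plain list stands in for the Salesforce lookup.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional


def normalise(text: str) -> str:
    """Lower-case and collapse whitespace so formatting noise is ignored."""
    return " ".join(text.lower().split())


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; 1.0 means the normalised strings are identical."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def best_account(pet: str, owner: str, accounts: List[Dict[str, str]],
                 threshold: float = 0.75) -> Optional[Dict[str, str]]:
    """Return the account whose pet and owner names match best, or None."""
    best, best_score = None, 0.0
    for account in accounts:
        # Average the pet-name and owner-name similarity (an assumed weighting).
        score = (similarity(pet, account["pet"])
                 + similarity(owner, account["owner"])) / 2
        if score > best_score:
            best, best_score = account, score
    return best if best_score >= threshold else None


# Hypothetical policy accounts.
accounts = [
    {"id": "A-1", "pet": "Bella", "owner": "Jonathan Smith"},
    {"id": "A-2", "pet": "Max", "owner": "Maria Lopez"},
]

match = best_account("Bela", "Jon Smith", accounts)    # misspelt and shortened names
no_match = best_account("Rocky", "Dana White", accounts)
print(match, no_match)
```

Returning `None` below the threshold leaves unmatched invoices to a human processor instead of linking them to the wrong account.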

For audit and reporting purposes, we created a variance report to monitor the coverage and accuracy of the system.
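
A minimal sketch of the two monitored figures follows. The definitions are assumptions, since the case study does not spell them out: coverage is taken as the share of invoices from which data was extracted, and accuracy as the share of extracted fields that matched a manually verified value. The targets come from the 80% coverage and 90% accuracy requirements stated earlier; the sample results are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InvoiceResult:
    extracted: bool       # did the tool extract data from this invoice?
    fields_checked: int   # fields compared against a manual check
    fields_correct: int   # fields that matched the manual check


def coverage(results: List[InvoiceResult]) -> float:
    """Share of invoices that were processed automatically."""
    if not results:
        return 0.0
    return sum(r.extracted for r in results) / len(results)


def accuracy(results: List[InvoiceResult]) -> float:
    """Share of checked fields that were extracted correctly."""
    checked = sum(r.fields_checked for r in results if r.extracted)
    if checked == 0:
        return 0.0
    correct = sum(r.fields_correct for r in results if r.extracted)
    return correct / checked


# Hypothetical results for five invoices.
results = [
    InvoiceResult(True, 10, 10),
    InvoiceResult(True, 10, 9),
    InvoiceResult(True, 10, 8),
    InvoiceResult(True, 10, 10),
    InvoiceResult(False, 0, 0),  # format not recognised, handled manually
]

report = {
    "coverage": coverage(results),
    "accuracy": accuracy(results),
    "coverage_vs_target": coverage(results) - 0.80,
    "accuracy_vs_target": accuracy(results) - 0.90,
}
print(report)
```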

The diagram below illustrates how the invoice data is extracted and processed through our solution:


  • Reduced the claims processing backlog from 2-3 days to same-day processing
  • Cut data entry time for invoice information from 15 minutes per claim to 45 seconds per claim
  • Improved account mapping accuracy of 85% to 90% 
  • Increased process efficiency
  • Improved outcome quality
  • Reduced transaction time and costs
