- The medical invoices come from any of 24,000 veterinary hospitals across the USA, in different formats and with varying numbers of pages.
- Unnecessary documents, such as CC slip, medical report, and so on, are attached along with the invoice.
- Invoices and claim forms lack a predefined format across hospitals (placement of invoice data such as the logo, treatment provided, cost incurred, etc.).
- Pet and parent names in the claims/medical invoices may not always match with the names in the policy account.
- The scanned copies of the medical invoices are not of high quality: they are blurred, distorted, folded, or of low resolution.
- Claims submitted via email and fax can be difficult to link to a pet account.
Imaginea's solution was to implement an RPA bot coupled with an OCR (Optical Character Recognition) tool. This enabled complete automation of the data capture and validation process.
How our solution helped
Accuracy in data
Reduction in claims
Reduction in claims cycle turnaround time
Imaginea came up with a strategy to categorize the invoices by template. Even though invoices were received from 24,000 hospitals, the billing software and templates used to create them were limited to a few hundred. We ran the DBSCAN clustering algorithm over a master data set of 24,000 invoice samples and grouped them into 1,000 invoice formats. After refining the clustering, we consolidated these 1,000 formats into 170.
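As a rough illustration of density-based grouping, the sketch below runs a small pure-Python DBSCAN over invoices. It is not the production pipeline: it assumes each invoice has already been reduced to the set of label strings found on the page, it uses Jaccard distance between those sets, and the sample data, `eps`, and `min_samples` values are all hypothetical.

```python
from typing import List, Optional, Set

NOISE = -1  # cluster id given to invoices that fit no dense group


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """Return 1 - |a & b| / |a | b|; 0.0 means identical label sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def region_query(points: List[Set[str]], i: int, eps: float) -> List[int]:
    """Indices of all points within eps of point i (including i itself)."""
    return [j for j in range(len(points))
            if jaccard_distance(points[i], points[j]) <= eps]


def dbscan(points: List[Set[str]], eps: float, min_samples: int) -> List[int]:
    """Assign a cluster id to every point; NOISE marks outliers."""
    labels: List[Optional[int]] = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned or marked as noise
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_samples:
            labels[i] = NOISE
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # noise reachable from a core point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_samples:
                queue.extend(j_neighbours)  # j is a core point, keep expanding
    return [NOISE if label is None else label for label in labels]


# Hypothetical label sets, one per scanned invoice.
invoice_labels = [
    {"invoice", "patient", "doctor", "total due"},
    {"invoice", "patient", "doctor", "total due", "tax"},
    {"invoice", "patient", "doctor", "amount due"},
    {"statement", "client id", "pet", "balance"},
    {"statement", "client id", "pet", "balance", "discount"},
    {"receipt", "cashier"},
]
cluster_ids = dbscan(invoice_labels, eps=0.45, min_samples=2)
```

Invoices that share a cluster id would be treated as one candidate format, while those marked `NOISE` would need manual review or a looser `eps`.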
To shortlist the invoice formats further, we applied the 80/20 rule. The proposal was to identify the invoice formats used by the top 20% of the hospitals, on the premise that they contribute 80% of the total invoice volume. We analysed the top 20% to 30% of the veterinary hospitals and found that their invoice formats did contribute 80% of the total invoice volume. By applying the 80/20 rule, we reduced the number of formats from 170 to 39. In reality, these 39 formats covered about 90% of the total volume.
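The shortlisting step can be sketched as a cumulative-volume cut-off: rank formats by invoice volume and keep adding them until a target share is covered. The function name, format names, and volumes below are made up for illustration and are not figures from the project.

```python
from typing import Dict, List, Tuple


def formats_for_coverage(volume_by_format: Dict[str, int],
                         target: float) -> Tuple[List[str], float]:
    """Pick the highest-volume formats until `target` share of volume is covered.

    Returns the selected format names and the share they actually cover.
    """
    total = sum(volume_by_format.values())
    if total == 0:
        return [], 0.0
    ranked = sorted(volume_by_format.items(), key=lambda kv: kv[1], reverse=True)
    selected: List[str] = []
    covered = 0
    for name, volume in ranked:
        if covered / total >= target:
            break  # target already reached, stop adding formats
        selected.append(name)
        covered += volume
    return selected, covered / total


# Hypothetical invoice counts per format.
sample_volumes = {"fmt_a": 500, "fmt_b": 250, "fmt_c": 150, "fmt_d": 60, "fmt_e": 40}
shortlist, coverage = formats_for_coverage(sample_volumes, target=0.8)
```

Because whole formats are added one at a time, the covered share usually overshoots the target, which is consistent with a shortlist chosen for 80% ending up covering more.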
We then created a document definition for each of the 39 invoice formats in the ABBYY OCR tool to extract the required data. A document definition helps the tool identify the format first and then locate the required data in the invoice. The solution relies on the labels present in the invoice, their location, and their position relative to each other.
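To make the idea of label-relative extraction concrete, here is a minimal sketch that finds a label on the page and returns the nearest word to its right on the same line. It does not use or imitate the ABBYY API: the `Word` type, the `value_right_of` helper, the coordinates, and the tolerance are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Word:
    text: str
    x: float  # left edge of the word on the page
    y: float  # top edge of the word on the page


def value_right_of(words: List[Word], label: str,
                   line_tolerance: float = 5.0) -> Optional[str]:
    """Return the text of the closest word to the right of `label` on its line."""
    anchor = next((w for w in words if w.text.lower() == label.lower()), None)
    if anchor is None:
        return None  # label not present in this format
    same_line = [
        w for w in words
        if w is not anchor
        and abs(w.y - anchor.y) <= line_tolerance  # roughly the same baseline
        and w.x > anchor.x                         # positioned to the right
    ]
    if not same_line:
        return None
    return min(same_line, key=lambda w: w.x).text


# Hypothetical OCR output for one invoice page.
page_words = [
    Word("Invoice", 40, 20),
    Word("Patient:", 40, 100),
    Word("Bella", 120, 101),
    Word("Total:", 40, 300),
    Word("184.50", 120, 299),
]
patient = value_right_of(page_words, "patient:")
total = value_right_of(page_words, "Total:")
missing = value_right_of(page_words, "Discount:")
```

A per-format definition would then amount to a list of such labels and the direction in which each value is expected.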
Another major hurdle was account mapping, so we developed a unique string comparison algorithm to map an invoice to a pet account in Salesforce.
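The algorithm itself is not described here, so the following is only a sketch of how fuzzy name matching could work, using Python's standard `difflib.SequenceMatcher` as a stand-in. The account fields, the normalisation step, and the 0.8 threshold are assumptions, not details of the actual Salesforce integration.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional


def normalise(name: str) -> str:
    """Lowercase, drop commas, and sort tokens so 'Smith, John' matches 'John Smith'."""
    return " ".join(sorted(name.lower().replace(",", " ").split()))


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two normalised names."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def best_account(pet: str, parent: str, accounts: List[Dict[str, str]],
                 threshold: float = 0.8) -> Optional[str]:
    """Return the id of the best-matching account, or None if nothing is close enough."""
    best_id: Optional[str] = None
    best_score = 0.0
    for account in accounts:
        # Average the pet-name and parent-name similarity.
        score = (similarity(pet, account["pet"])
                 + similarity(parent, account["parent"])) / 2
        if score > best_score:
            best_id, best_score = account["id"], score
    return best_id if best_score >= threshold else None


# Hypothetical policy accounts.
sample_accounts = [
    {"id": "A-1", "pet": "Bella", "parent": "John Smith"},
    {"id": "A-2", "pet": "Max", "parent": "Joan Smythe"},
]
matched = best_account("Bela", "Smith, John", sample_accounts)
unmatched = best_account("Rocky", "Alice Jones", sample_accounts)
```

Returning `None` below the threshold leaves ambiguous invoices for manual mapping instead of attaching them to the wrong account.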
For audit and reporting purposes, we created a variance report to monitor the coverage and accuracy of the system.
The diagram below illustrates how the invoice data is extracted and processed through our solution:
- Reduced claims processing backlog time from 2-3 days to same-day processing
- Substantial decrease in data entry time for invoice information, from 15 minutes per claim to 45 seconds per claim
- Improved account mapping accuracy of 85% to 90%
- Increased process efficiency
- Improved outcome quality
- Reduced transaction time and costs