Cure for the Common Code: San Francisco Startup Uses AI to Automate Medical Coding

Inception program startup accelerates natural language processing with NVIDIA GPUs to improve the speed and accuracy of medical coding.
by Isha Salian

Doctors’ handwriting is notoriously difficult to read. Even more cryptic is medical coding — the process of turning a clinician’s notes into a set of alphanumeric codes representing every diagnosis and procedure.

Although this system is used in over 100 countries worldwide, accurate coding is of particular significance in the U.S., where medical codes form the basis for the bills doctors, clinics and hospitals issue to insurance providers and patients.

More than 150,000 codes are used in the U.S.’s adaptation of the International Classification of Diseases, a cataloging standard developed by the World Health Organization.

The diagnostic code for a pedestrian hit by a pickup truck? V03.10XA. Type 2 diabetes diagnosis? E11.9. There is also a set of procedural codes for everything a doctor might do, like put a cast on a patient’s broken right forearm (2W3CX2Z) or insert a pacemaker into a coronary vein (02H40NZ).

After every doctor’s appointment or procedure, a clinician’s summary of the interaction is converted into these codes. When done by humans, the turnaround time for medical chart coding — within a healthcare organization or at a private firm — is often two days or more. Natural language processing AI, accelerated by GPUs, can shrink that time to minutes or seconds.

San Francisco-based Fathom is developing deep learning tools to automate the painstaking medical coding process while increasing accuracy. The startup’s tools can help address the shortage of trained clinical coders, improve the speed and precision of billing, and allow human coders to focus on complex cases and follow-up queries.

“Sometimes you have to go back to the doctor to ask for clarification,” said Christopher Bockman, co-founder and chief technology officer of Fathom, a member of the NVIDIA Inception virtual accelerator program. “The longer that process takes, the harder it is for the doctor to remember what happened.”

Fathom uses NVIDIA P100 GPUs and V100 Tensor Core GPUs in Google Cloud for both training and inference of its deep learning algorithms. Founded in 2016, the company now works with several of the largest medical coding operations in the U.S., representing more than 200 million annual patient encounters. Its tools can reduce human time spent on medical coding by as much as 90 percent.

Deciphering the Doctor

At any doctor’s appointment, emergency room visit or surgical procedure, healthcare providers type up notes describing the interaction. While there are some standardized formats, these medical records differ by hospital, by type of appointment or procedure, and by whether the note is written during the patient interaction or after.

Medical coders make sense of this unstructured text, categorizing every test, treatment and procedure into a list of codes. Once the notes are coded, a healthcare provider’s billing department turns the reports into an invoice to collect payments from insurance providers and patients.

It’s a messy process, for a human or an AI. Human coders agree with each other less than two-thirds of the time in key scenarios, studies show. And research has found that half or more of medical charts have coding errors.
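To make the idea of coder agreement concrete, here is a minimal sketch of two ways it could be measured: exact agreement per chart, and overlap between the two code sets (Jaccard similarity). The article does not say which metric the cited studies used, so both functions, their names and the sample charts are illustrative assumptions rather than a description of that research.

```python
def exact_agreement(coder_a, coder_b):
    """Fraction of charts on which both coders assigned the same set of codes."""
    matches = sum(1 for a, b in zip(coder_a, coder_b) if set(a) == set(b))
    return matches / len(coder_a)


def jaccard(codes_a, codes_b):
    """Overlap between two code sets: size of intersection over size of union."""
    a, b = set(codes_a), set(codes_b)
    if not a and not b:
        return 1.0  # two empty code lists agree trivially
    return len(a & b) / len(a | b)


# Invented example: three charts, each coded independently by two people.
coder_a = [["E11.9"], ["V03.10XA", "2W3CX2Z"], ["02H40NZ"]]
coder_b = [["E11.9"], ["V03.10XA"], ["02H40NZ", "E11.9"]]

print(exact_agreement(coder_a, coder_b))
print([jaccard(a, b) for a, b in zip(coder_a, coder_b)])
```

Exact agreement is strict, since one extra or missing code counts as a full disagreement, while the overlap score gives partial credit for charts where the coders mostly agree.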

“The challenge for us is these notes can vary quite a bit,” Bockman said. “There’s a push to standardize, but that tends to make the doctor’s job a lot harder. Human health is complex, so it’s hard to come up with a format that works for every case.”

Coding an AI that Codes

As a machine learning problem, medical coding shares elements of two kinds of tasks: multilabel classification and sequence-to-sequence NLP. An effective AI must understand the text in a doctor’s note and accurately tag it with a list of diagnoses and procedures organized in the right order for billing.
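The multilabel part of the task can be sketched in a few lines of Python. In multilabel classification, each code gets its own independent probability, so a single note can receive several codes or none. The label set, the scores and the ordering rule below are hypothetical: sorting by confidence is only a stand-in for the billing-order step, and none of this is Fathom’s actual model.

```python
import math

# Hypothetical label set: the four codes quoted earlier in the article.
CODES = ["V03.10XA", "E11.9", "2W3CX2Z", "02H40NZ"]


def sigmoid(x):
    """Map a raw score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_codes(logits, threshold=0.5):
    """Keep every code whose probability meets the threshold, most confident first."""
    scored = [(sigmoid(z), code) for z, code in zip(logits, CODES)]
    kept = [(p, code) for p, code in scored if p >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [code for _, code in kept]


# Made-up scores standing in for a model's output on one doctor's note.
example_logits = [2.0, 0.3, -1.5, -3.0]
print(predict_codes(example_logits))
```

Because each code is scored with its own sigmoid rather than a softmax across all codes, the probabilities need not sum to one, which is what lets one note carry both a diagnosis and a procedure.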

Fathom is tackling this challenge, aided by tools such as NVIDIA’s GPU-optimized version of BERT, a leading natural language understanding model. The team uses the TensorFlow deep learning framework and relies on the mixed-precision training provided by Tensor Cores to accelerate the large-scale processing of medical documents that vary widely in size.
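Mixed-precision training keeps much of the arithmetic in 16-bit floating point, which is faster on Tensor Cores but cannot represent very small numbers. The sketch below illustrates that limit and one common workaround, loss scaling, using only the Python standard library. The gradient value and scale factor are made up for illustration, and the article does not say whether Fathom’s training setup uses loss scaling.

```python
import struct


def to_half(x):
    """Round a Python float to IEEE 754 half precision (16-bit) and back."""
    return struct.unpack("e", struct.pack("e", x))[0]


tiny_gradient = 1e-8   # made-up value, too small for 16-bit floats
loss_scale = 1024.0    # hypothetical scale factor

# Stored directly in half precision, the gradient underflows to zero.
lost = to_half(tiny_gradient)

# Scaled up before the 16-bit step, then scaled back down in higher
# precision, the gradient survives with a small rounding error.
recovered = to_half(tiny_gradient * loss_scale) / loss_scale

print(lost)
print(recovered)
```

The first value printed is zero, meaning the update would be silently dropped, while the second stays close to the original gradient.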

Using NVIDIA GPUs for inference allows Fathom to easily scale up to process millions of healthcare encounters per hour.

“While lowering costs matters, the ability to instantly add the capacity of thousands of medical coders to their operations has been the game-changer for our clients,” said Andrew Lockhart, Fathom’s co-founder and CEO.

Relying on NVIDIA GPUs on Google Cloud helps the team ramp its usage up and down based on demand.

“We have very bursty needs,” Bockman said, referring to the team’s fluctuating computational workload. “Sometimes we might be trying to retrain different variants of the same large model, while other times we’re doing a lot of experimentation or just doing inference. We might need a single GPU or many dozens of them.”

The startup chose Google Cloud, Bockman said, in part because data is encrypted by default, one of the conditions for complying with HIPAA and SOC 2 privacy requirements.

While medical coding is the main activity done today with doctors’ notes, unlocking the information contained in these health records could enable a wide range of use cases beyond billing and reimbursement, Bockman said.

AI that quickly and accurately analyzes medical charts and appointment records at scale can help doctors spot patient illnesses that may otherwise have been missed, predict likely patient outcomes, suggest treatment options — and even identify promising patient candidates for clinical trials.