Blueprint to Build: AI Leverage for Healthcare Operations

Meet the author
Daniel Kluesing, VP of Product, Medallion

Daniel Kluesing is a product expert known for simplifying complex processes. As the vice president of product at Medallion, he focuses on making provider operations smoother and faster by streamlining credentialing and enrollment for healthcare organizations.

Before joining Medallion, he worked at Stripe, where he scaled global payments, and at Dropbox, where he helped scale the mobile and desktop clients to hundreds of millions of users.

Daniel holds a Bachelor of Science in Electrical Engineering and Computer Science from the University of California, Berkeley, and a Master of Science in Computer Science from Stanford University. He lives in San Francisco.

Healthcare operations teams are versatile problem solvers: you can throw almost anything their way, and they’ll figure out how to make it work.

Their superpower is flexibility, adapting to everything from client escalations to internal processes.

What often holds them back is software.

Rigid tools designed for specific use cases don’t easily adapt to others. This is why spreadsheets are so ubiquitous—they're the duct tape of operations, offering just enough freedom to allow custom workflows without being locked into a developer's assumptions.

What if healthcare operations teams could access tools more powerful than spreadsheets—tools that allow them to build custom workflows as easily as writing an email?






Enter the new generation of AI tools like ChatGPT, Claude, Gemini, and Replit

These platforms are transforming the landscape, letting teams rapidly create tailored SaaS-like workflows without a formal engineering background.

In this multi-part series, we’ll use these new models to build an automated quality assurance system that an operations team can use to check the accuracy of files, customized to the team’s exact needs, with no engineering required.

We’ll start by building a simple workflow in ChatGPT to check the accuracy of common documents filled out by operations teams.

Then, we’ll show how it can be embedded in Excel or Google Sheets to enable project management for a full workflow with multiple team members. 

Finally, we’ll cover how to automate execution, monitor accuracy metrics and create an audit trail for regulatory compliance. 

No technical or coding experience is necessary, but by the end of the series, you’ll have built a custom workflow tool to meet the QA needs of your organization.

This is part of a new wave of software that builds software, allowing anyone—regardless of technical skill—to create software that meets exact, specific and personal needs. 

Quality assurance

At the end of any operations workflow comes quality assurance (QA)—hopefully. Despite its importance, QA is often where operations teams are forced to cut corners.

Pressure from deadlines and staffing, or sometimes just complacency because things have worked well so far, leads teams to skip QA.

However, a QA process is an important step in keeping delivery on time and highly accurate.

If we could automate a QA check, teams could run it all the time. 

We’ll start with a simple example of validating W-9 forms, and by the end of this multi-part series, you’ll have built a fully functional QA tool, done entirely without the need for engineering resources.

W-9 validations

The first thing we need is to collect the form and details about how it should be filled out.

For this example, we’ll use IRS Form W-9. Form W-9 is used to collect tax identification information from vendors and contractors, a common requirement in healthcare when dealing with numerous service providers.

We can get a copy of Form W-9 directly from the IRS website: https://www.irs.gov/pub/irs-pdf/fw9.pdf

As forms go, the W-9 is fairly straightforward to QA. Our approach here is to list, in a table, the “rules” we want to verify for each W-9 submitted to our operations team.

The rules listed here are a combination of validation rules from the document, and some rules our organization might choose to require as part of our operational review.

For instance, allowing only U.S. addresses isn’t a strict requirement of the W-9 form, but your organization might expect to work only with U.S. entities and would want to flag and carefully inspect any foreign addresses provided.

Each organization may have different rules that are important to check based on the business.

As long as we can write them down in a table format like this, we can check them against the forms we are QA-ing.
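For teams that do have a little scripting help, one way to keep such a table easy to edit and reuse is sketched below in Python. This is only an illustration and not part of the no-code workflow in this series: the `Rule` class, its `source` column, and the `rules_to_bullets` helper are hypothetical names, and only three of the rules are shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    field: str        # part of the form the rule applies to
    requirement: str  # plain-language rule text, as it would appear in a prompt
    source: str       # assumed column: "form" (from the W-9) or "org" (our own policy)


# A small sample of the rule table; a real table would list every rule.
RULES = [
    Rule("Box 1", "Box 1 is required.", "form"),
    Rule("Box 2", "Box 2 is optional.", "form"),
    Rule(
        "Boxes 5 and 6",
        "The data entered in Box 5 and Box 6 must combine to be a valid U.S. address.",
        "org",
    ),
]


def rules_to_bullets(rules):
    """Render the rule table as a bulleted list, one rule per line."""
    return "\n".join(f"  • {rule.requirement}" for rule in rules)


if __name__ == "__main__":
    print(rules_to_bullets(RULES))
```

Keeping the rules as rows rather than free text makes it simple to add, remove, or reword a rule without touching anything else.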

Building our QA prompt

We now need to build the prompt that we’ll use to instruct a foundation model acting as our QA assistant.

One of the major advantages of using AI is its ability to understand and respond to natural language.

This means you can ‘talk’ to your QA Agent and adjust its rules or parameters just by typing a question or instruction, no coding necessary.

You are a helpful Quality Assurance (QA) agent for a healthcare operations team. Your job is to read and evaluate IRS Form W-9 submissions and identify missing, inaccurate, ambiguous or contradictory information.

You must carefully read the instructions on the form and evaluate the user-provided input against those instructions.

Additionally, you must check the following specific rules against the data provided on the form.

  • Box 1 is required.
  • Box 2 is optional.
  • Box 3a is required. Only one box should be marked. If LLC is selected, the tax code must be “C”, “S”, or “P.”
  • Box 5 is required.
  • Box 6 is required.
  • The data entered in Box 5 and Box 6 must combine to be a valid U.S. address.
  • Only one of Social Security Number or Employer Identification Number should be filled out.
  • If provided, Social Security Number should be 9 numeric digits in the format XXX-XX-XXXX.
  • If provided, Employer Identification Number should be 9 numeric digits in the format XX-XXXXXXX.
  • There must be a signature on the Signature of U.S. Persons line.
  • The date of signature must be within the last 6 months from today’s date.

After carefully checking each field of the form, respond with either “QA Check Failed” (if any problems were found with the form) or “QA Check Passed” (if no problems were found). You must start all responses with either “QA Check Failed” or “QA Check Passed.” Any type of problem, including blank forms, missing forms, or a non-W-9 form, must be first labeled as “QA Check Failed”.

If the QA Check Failed, provide a short, concise list of the issues found and your reasoning for why an issue failed the check. You must provide the reason for any failure case. You must not provide any reasoning for fields that pass the QA check.

The user-filled form is attached. Please run QA against it.
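A few of the rules in the prompt above are purely mechanical (the SSN and EIN formats, and the six-month signature window), so they can also be written as ordinary code and used to spot-check the QA agent's answers. The Python sketch below is one possible version, not the method used in this series. It assumes the form values have already been typed or extracted into plain strings and a date, and the function names are hypothetical.

```python
import calendar
import re
from datetime import date

SSN_PATTERN = re.compile(r"\d{3}-\d{2}-\d{4}")  # XXX-XX-XXXX
EIN_PATTERN = re.compile(r"\d{2}-\d{7}")        # XX-XXXXXXX


def six_months_before(today: date) -> date:
    """Return the calendar date six months before `today`."""
    month_index = today.year * 12 + (today.month - 1) - 6
    year, month_zero_based = divmod(month_index, 12)
    month = month_zero_based + 1
    # Clamp the day so that, for example, Aug 31 maps to the last day of February.
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(today.day, last_day))


def check_tin_and_date(ssn: str, ein: str, signed_on: date, today: date) -> list:
    """Return a list of issues; an empty list means these rules passed."""
    issues = []
    if ssn and ein:
        issues.append("Both SSN and EIN are filled out; only one is allowed.")
    if not ssn and not ein:
        # Assumption: a form with neither number is also flagged.
        issues.append("Neither SSN nor EIN is filled out.")
    if ssn and not SSN_PATTERN.fullmatch(ssn):
        issues.append("SSN is not in the format XXX-XX-XXXX.")
    if ein and not EIN_PATTERN.fullmatch(ein):
        issues.append("EIN is not in the format XX-XXXXXXX.")
    if not (six_months_before(today) <= signed_on <= today):
        issues.append("Signature date is not within the last 6 months.")
    return issues
```

For example, `check_tin_and_date("123-45-6789", "", date(2024, 5, 1), date(2024, 6, 1))` returns an empty list, while an SSN typed without dashes produces a format issue. Rules that need judgment, such as whether an address is a valid U.S. address, are left to the QA agent.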

Testing

We can now test our “QA Agent” and see how well it performs. We’ll be using GPT-4o in ChatGPT for testing. Simply copy and paste the above prompt into the web interface, and attach a PDF to test.

For the simplest test, just upload the blank W-9 form, and our agent should correctly flag it as a blank form.

Try some more complex tests by filling out the form with incorrect information, and your new QA Agent should correctly flag forms with bad data.

We can quickly catch obvious “fat-finger” errors, for instance a user accidentally submitting a Form W-4 instead of a Form W-9. (Bad W-4 example)

Providing an otherwise valid Form W-9 with a Canadian address flags the non-U.S. address for further follow-up. (See bad address Form W-9 example)

Finally, we can check that a valid Form W-9 passes our QA agent. (valid W-9)
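Because the prompt requires every response to start with “QA Check Passed” or “QA Check Failed”, the four tests above can be tracked as a small regression suite. The Python sketch below assumes the agent's responses are pasted in by hand as text. The file names and function names are hypothetical placeholders, not files from this post.

```python
PASSED = "QA Check Passed"
FAILED = "QA Check Failed"

# Expected verdict for each test form (True means the form should pass QA).
EXPECTED = {
    "blank_w9.pdf": False,
    "w4_instead_of_w9.pdf": False,
    "w9_canadian_address.pdf": False,
    "valid_w9.pdf": True,
}


def parse_verdict(response: str) -> bool:
    """Return True for a pass and False for a fail, based on the required prefix."""
    text = response.strip()
    if text.startswith(PASSED):
        return True
    if text.startswith(FAILED):
        return False
    raise ValueError("Response does not start with a QA verdict")


def find_mismatches(responses: dict) -> list:
    """List the test forms whose verdict differs from the expected one."""
    return [
        name
        for name, expected in EXPECTED.items()
        if parse_verdict(responses[name]) != expected
    ]
```

An empty list from `find_mismatches` means the agent agreed with every expected verdict. Rerunning the same set after each change to the prompt shows whether an edit broke a case that used to work.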

Next

This quick example shows the power of having foundation models sitting alongside operations team specialists. But it also highlights the effort required to set up and explicitly list all the conditions to evaluate. For a simple task like checking Form W-9, we can capture every condition in a short list of rules. For other documents, however, the list of rules could be hundreds of items long and take hours to write out.

In our next blog post, we’ll build a more advanced QA system that uses a set of reference documents to learn the rules instead of requiring our team to list out every one. We’ll then start to show examples of how these systems can be built into the workflow an operations team is using to support high-scale operations.