Validate, Review, Reimburse: Automating Desk Reviews with AI Coding Agents (Part 2)

Right now, the most powerful way to use AI tools in this space isn’t chasing some generic software platform or product; it’s building small, custom tools that match your workflows exactly, and that you actually own.

I've seen a lot of state-run programs where the state pays hundreds of thousands of dollars to accounting and consulting firms to:

Collect hundreds (400+) of cost report spreadsheets from districts or organizations
Validate the data
Perform desk reviews on the data based on program rules
Sign off that the reports met certain limits and ratios

Two core steps: validation and desk review

In many cost report processes that have a reimbursement mechanism, there are usually two big steps after you get the spreadsheets:

Validation - Should we accept the cost report?

Do percentages add up to 100%?
Are there names in the name fields?
Are there numbers in the numeric columns?
Are dates actually valid dates?

Desk review - Are there any problems with the data in the cost report?

Does the data conform to policy limits or thresholds?
Did the district charge a payroll tax rate above the 1.45% Medicare rate?
Did they charge executive salaries over a threshold?
Are benefit percentages (healthcare, retirement) within allowed ranges?

Depending on the results, we might send the report back to the district to resolve, or write up a finding and an adjustment to correct the amounts.
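To make one of those desk review questions concrete, here is a minimal Python sketch of the payroll tax check. The function name, the finding fields, and the choice to compare dollar amounts rather than rates are my own assumptions for illustration; only the 1.45% figure comes from the rule above.

```python
from decimal import Decimal

MEDICARE_RATE = Decimal("0.0145")  # the 1.45% rate named in the rule above


def review_payroll_tax(salary, payroll_tax_charged):
    """Return a finding dict if the tax charged exceeds salary * 1.45%, else None.

    Hypothetical helper: the field names are illustrative only.
    """
    salary = Decimal(str(salary))
    charged = Decimal(str(payroll_tax_charged))
    allowed = (salary * MEDICARE_RATE).quantize(Decimal("0.01"))
    if charged <= allowed:
        return None  # within the limit, nothing to write up
    return {
        "rule": "payroll_tax_rate",
        "allowed": allowed,
        "charged": charged,
        "adjustment": charged - allowed,  # amount to disallow
    }
```

A result of None means the line passes; anything else is the raw material for a finding and adjustment write-up.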

These requirements:

Are clearly defined
Apply equally to every cost report
Take a long time when you have hundreds of reports

Important work, but also highly manual, repeatable, and rule-driven. That’s exactly the kind of process that can be automated in 2025. These kinds of automations are within reach if you can learn a bit about how to work with AI coding agents like Codex.

Why use an AI coding agent like Codex?

I’m using OpenAI's Codex (an AI coding agent) for this project, but the pattern works with any strong coding agent.

The reason tools like this are powerful isn’t that they “do the work” for you in some mysterious way. It’s that they let you:

Harness the full power of computer programming
Without spending months learning every detail of a language and its ecosystem
While ending up with a tool you can run locally, securely, and on your own terms

Instead of paying every month for a black box SaaS you don’t control, you:

Spin up a simple Python project
Point your coding agent at your real cost report files
Describe your validation and desk review rules
Let it generate the code and tests
Own the script, database, and CSV outputs at the end

Walkthrough

In the first video in this series, I showed how to use Codex to:

Take a folder of synthetic test cost reports
Extract the data
Compile it into a single dataset
Store it in a small database and CSV export
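For readers who skipped the first video, the storage step can be sketched roughly like this. It is not the code Codex generated in that video; it assumes the rows have already been extracted from the spreadsheets into dictionaries, and the table name and the three columns are hypothetical.

```python
import csv
import sqlite3

FIELDS = ["district", "employee_name", "salary"]  # hypothetical columns


def compile_reports(rows, db_path, csv_path):
    """Store extracted rows in a SQLite table and write the same rows to a CSV.

    Returns the number of rows in the table after the insert.
    """
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS cost_report_rows "
            "(district TEXT, employee_name TEXT, salary REAL)"
        )
        # Named placeholders let each dict supply its own values.
        conn.executemany(
            "INSERT INTO cost_report_rows "
            "VALUES (:district, :employee_name, :salary)",
            rows,
        )
        conn.commit()
        count = conn.execute(
            "SELECT COUNT(*) FROM cost_report_rows"
        ).fetchone()[0]
    finally:
        conn.close()

    with open(csv_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return count
```

Both outputs are plain files on your own machine, which is the point: nothing here depends on a hosted service.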

That replaced a manual data extraction workflow I’ve seen replayed for years in real jobs.

In this second step, we’re layering on:

A validation step. Check that:

Name fields are non-empty text
Amount fields (salary, state share, federal share, healthcare, retirement, etc.) are numeric and non-null
Percent fields are numeric, between 0 and 1, and sum to 1
Date fields parse as valid dates

Each cost report either:

Passes validation, or
Fails, gets a clear explanation of what went wrong, and gets flagged to send back to the district
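Here is a minimal sketch of what those validation checks could look like in plain Python. The field names, the YYYY-MM-DD date format, and the rule that a report passes only when every row passes are assumptions I'm making for the example; your own cost report layout will differ.

```python
from datetime import datetime

# Hypothetical field names; swap in the columns from your own cost report.
NAME_FIELDS = ["employee_name"]
AMOUNT_FIELDS = ["salary", "healthcare", "retirement"]
PERCENT_FIELDS = ["state_pct", "federal_pct"]
DATE_FIELDS = ["hire_date"]


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_row(row):
    """Return a list of readable problems; an empty list means the row passes."""
    problems = []
    for field in NAME_FIELDS:
        value = row.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{field}: expected non-empty text")
    for field in AMOUNT_FIELDS:
        if not _is_number(row.get(field)):
            problems.append(f"{field}: expected a number")
    pcts = [row.get(field) for field in PERCENT_FIELDS]
    if not all(_is_number(p) and 0 <= p <= 1 for p in pcts):
        problems.append("percent fields: each must be a number between 0 and 1")
    elif abs(sum(pcts) - 1) > 1e-6:  # small tolerance for float rounding
        problems.append("percent fields: must sum to 1")
    for field in DATE_FIELDS:
        try:
            datetime.strptime(str(row.get(field)), "%Y-%m-%d")
        except ValueError:
            problems.append(f"{field}: not a valid YYYY-MM-DD date")
    return problems


def validate_report(rows):
    """Assumed rule: a report passes only if every row passes."""
    problems = [
        f"row {i}: {p}"
        for i, row in enumerate(rows, start=1)
        for p in validate_row(row)
    ]
    return {"passed": not problems, "problems": problems}
```

The problems list doubles as the explanation you send back to the district, so it is worth keeping the messages in plain language.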

A desk review step:

Calculate per-employee totals and percentages (total payroll costs, state/federal portions, healthcare %, retirement %)
Apply simple policy rules

Example policy rules:

Salaries over a threshold (e.g., $60,000 of state-related payroll costs) generate a finding and adjustment
Healthcare costs over a threshold percentage (e.g., 7% of salaries) generate a finding and adjustment
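The desk review rules can be sketched the same way. This is an illustration under assumptions, not the generated project code: the field names are hypothetical, I'm reading "state-related payroll costs" as salary multiplied by the state percentage, and each adjustment is simply the amount over the limit.

```python
from decimal import Decimal

SALARY_CAP = Decimal("60000")         # example threshold from the text
HEALTHCARE_CAP_PCT = Decimal("0.07")  # example 7% of salary


def desk_review(row):
    """Return per-employee totals plus any findings for one validated row."""
    salary = Decimal(str(row["salary"]))
    healthcare = Decimal(str(row["healthcare"]))
    retirement = Decimal(str(row["retirement"]))
    state_pct = Decimal(str(row["state_pct"]))

    # Assumption: the salary cap applies to the state share of salary.
    state_salary = salary * state_pct
    # Guard against division by zero for rows with no salary.
    healthcare_pct = healthcare / salary if salary else Decimal("0")

    findings = []
    if state_salary > SALARY_CAP:
        findings.append({
            "rule": "salary_cap",
            "adjustment": state_salary - SALARY_CAP,
        })
    if healthcare_pct > HEALTHCARE_CAP_PCT:
        findings.append({
            "rule": "healthcare_pct",
            "adjustment": healthcare - salary * HEALTHCARE_CAP_PCT,
        })
    return {
        "total_payroll": salary + healthcare + retirement,
        "state_salary": state_salary,
        "healthcare_pct": healthcare_pct,
        "findings": findings,
    }
```

Using Decimal instead of float keeps the dollar amounts exact, which matters when an adjustment ends up in a written finding.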

The exact thresholds and details aren’t the main point. I’m using small round numbers and simple rules. The important part is:

If you can define the rules you want applied to the data, you can encode your own rules, for your own program, as code you understand and control.

Why this matters for government programs

In the private sector, AI often gets pitched as a way to grow revenue: serve more customers, launch new products, and so on.

In government programs, the win is usually more straightforward:

Reduce the cost of necessary or excessive administrative work
Free human staff for tasks that actually require human judgment and communication
Make repeatable processes faster, more consistent, and more transparent

A validation and desk review workflow like this checks all the boxes:

High volume, repetitive tasks
Clear rules and thresholds
Little professional judgment required once the rules are set
Huge time savings when automated, especially at scale

And most importantly: with a coding agent and a bit of guidance, you can build these tools yourself, on top of spreadsheets and rules you already understand.

You don’t need a massive IT team or project. You need a series of small, repeatable wins—and a mindset shift from “we should buy a tool for this” to “we can build a lightweight tool we own.”