
The Examiner's Checklist: What Financial Regulators Actually Look for in AI Governance

A former bank examiner explains what financial regulators actually look for in AI governance — documentation trails, named accountability, monitoring evidence, and the FS AI RMF's 230 control objectives.

I spent four years as a bank examiner at the FDIC. I sat across from CEOs, CROs, and compliance officers at dozens of institutions and told them what was wrong. Sometimes it was lending concentrations. Sometimes it was BSA gaps. Sometimes it was the look on a chief credit officer's face when I asked for documentation he knew didn't exist.

That job teaches you one thing above all else: the difference between what an institution says it does and what it can prove it does. That gap is where findings live. And with AI now embedded in credit decisioning, fraud detection, customer service, and risk modeling across the banking sector, that gap is about to get a lot of attention.

Here's what I looked for then — and what I'd look for now.

[Figure: FS AI RMF Examination Readiness Framework. Five priority areas examiners actually focus on. (1) Documentation Trail: Can you trace from business intent to model selection to testing to deployment to monitoring? (2) Accountability Chain: Who signed off on each governance decision? Named individuals, not committees. (3) Monitoring Evidence: Not dashboards, but evidence that someone reviewed data, made a decision, and documented it. (4) Third-Party Governance: If you use vendor AI, how do you govern what you didn't build? (5) Policy-Practice Alignment: The policy says quarterly reviews. When was the last one? Can you produce minutes? Label: Examination Readiness Score. Sources: Treasury FS AI RMF (Feb 2026), OCC Bulletin 2025-26, FDIC Examination Procedures, NIST AI RMF.]

The documentation trail

Examiners don't start with the technology. They start with the paper. Can you trace a clean line from business intent to model selection to testing to deployment to monitoring?

That means: Why did the institution decide to use AI for this function? Who evaluated which model or vendor to use, and what were the alternatives? What testing was performed before deployment, and what were the results? Who approved deployment, and on what basis? What monitoring is in place, and what has it shown?

If any link in that chain is missing, that's a finding. Not because the examiner is being difficult, but because the institution can't demonstrate that it understood and controlled the risk it took on. Taking on risk you can't show you controlled is the core of what examiners mean by an unsafe or unsound practice, and that hasn't changed just because the risk is algorithmic instead of credit-based.
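One way to make the trail concrete is to treat it as a checklist of stages and ask which ones have a dated document behind them. The sketch below is only an illustration: the stage names follow the questions above, while the `Artifact` fields, the document IDs, and the people named are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Stages mirror the trail described above; the identifiers are illustrative.
TRAIL_STAGES = [
    "business_intent",
    "model_selection",
    "testing",
    "deployment_approval",
    "monitoring",
]

@dataclass
class Artifact:
    stage: str
    document_ref: Optional[str]  # hypothetical memo or report ID; None if nothing is on file
    owner: Optional[str]
    dated: Optional[str]         # ISO date string

def missing_links(artifacts: list[Artifact]) -> list[str]:
    """Return the stages that have no referenced, dated artifact."""
    evidenced = {a.stage for a in artifacts if a.document_ref and a.dated}
    return [stage for stage in TRAIL_STAGES if stage not in evidenced]

# Made-up example: testing happened but nothing was filed, and monitoring has no record.
trail = [
    Artifact("business_intent", "MEMO-001", "J. Rivera", "2025-01-15"),
    Artifact("model_selection", "EVAL-007", "J. Rivera", "2025-02-20"),
    Artifact("testing", None, None, None),
    Artifact("deployment_approval", "APPR-012", "M. Chen", "2025-04-01"),
]
trail_gaps = missing_links(trail)
print(trail_gaps)
```

Every stage the function returns is a question the institution cannot answer with paper, which is how an examiner would read it.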

The institutions that pass examination aren't the ones with the fanciest AI. They're the ones with the cleanest trail.

Ownership — a name, not a committee

Every examiner has had this conversation. "Who's responsible for this?" And the answer is a committee. The Model Risk Management Committee. The AI Governance Board. The Technology Oversight Working Group.

Committees don't own risk. People do. When I examined a bank's loan portfolio and found a concentration issue, I didn't cite the Credit Committee. I identified who approved the loans, who was supposed to be monitoring the concentration, and who failed to escalate when limits were breached. Names.

AI governance is the same. For every AI system in production, an examiner wants to know: Who approved this for deployment? Who is monitoring its performance? Who is accountable if it produces discriminatory outcomes? If the answer to any of those questions is "the committee," the examiner will note that accountability is diffuse — and diffuse accountability is a control weakness.
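The three questions above can be run against an AI system inventory before an examiner does it for you. This is a minimal sketch, assuming the inventory stores one owner string per role; the role keys, the list of "collective" terms, and the example names are all hypothetical.

```python
# Hypothetical markers that an owner is a body rather than a person.
COLLECTIVE_TERMS = ("committee", "board", "working group", "team", "council")

# Roles taken from the three examiner questions above; key names are illustrative.
ROLES = ("deployment_approver", "performance_monitor", "fair_lending_accountable")

def diffuse_roles(system: dict[str, str]) -> list[str]:
    """Return roles that are unassigned or assigned to a collective body."""
    flagged = []
    for role in ROLES:
        owner = system.get(role, "").strip().lower()
        if not owner or any(term in owner for term in COLLECTIVE_TERMS):
            flagged.append(role)
    return flagged

# Made-up inventory entry for a credit model.
credit_model = {
    "deployment_approver": "Dana Whitfield, Chief Risk Officer",
    "performance_monitor": "Model Risk Management Committee",
    # nobody recorded as accountable for discriminatory outcomes
}
flagged = diffuse_roles(credit_model)
print(flagged)
```

A string match is a crude test, and a real inventory would tie each role to an employee record, but even this catches the most common answer: "the committee."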

SR 11-7 has required clear accountability for model risk management since 2011. The FS AI RMF makes it explicit for AI systems. Examiners will test it.

Evidence of monitoring — not the dashboard, the decisions

I've reviewed monitoring programs at institutions where the dashboards were impressive. Color-coded risk heat maps. Real-time performance metrics. Exception tracking with drill-down capability.

Then I'd ask: When was the last time someone looked at this dashboard and made a decision based on what they saw? Show me the decision. Show me the documentation.

Silence.

A dashboard that nobody acts on is not monitoring. It's decoration. Monitoring means: defined thresholds, documented reviews at a defined cadence, evidence that threshold breaches triggered action, and evidence that the action was appropriate. An examiner will pull the monitoring logs and check whether anyone responded when the metrics moved. If nobody did, the institution has monitoring infrastructure but not monitoring practice — and it's the practice that matters.
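The test an examiner applies to a monitoring log is simple enough to write down. The sketch below assumes a log where each entry records the metric, the threshold, and who reviewed it; the field names, metric, thresholds, and reviewer are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class MonitoringEntry:
    metric: str
    observed: float
    threshold: float                  # a breach is any observed value above this
    review_date: date
    reviewer: Optional[str] = None    # named individual who looked at the result
    action_taken: Optional[str] = None

def unanswered_breaches(log: list[MonitoringEntry]) -> list[MonitoringEntry]:
    """Return breaches that lack a named reviewer or a documented action."""
    return [
        entry for entry in log
        if entry.observed > entry.threshold
        and not (entry.reviewer and entry.action_taken)
    ]

# Made-up quarterly log for a fraud model.
log = [
    MonitoringEntry("false_positive_rate", 0.04, 0.05, date(2025, 3, 31), "A. Patel", "No action needed"),
    MonitoringEntry("false_positive_rate", 0.07, 0.05, date(2025, 6, 30)),
    MonitoringEntry("false_positive_rate", 0.08, 0.05, date(2025, 9, 30), "A. Patel", "Retrained; see ticket"),
]
open_items = unanswered_breaches(log)
for entry in open_items:
    print(f"{entry.review_date}: {entry.metric}={entry.observed} breached {entry.threshold}, no documented response")
```

The June entry is the one that becomes a finding: the metric moved, the infrastructure recorded it, and nobody can show they responded.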

This is especially critical for AI systems because of drift. A model that performed well at deployment can degrade silently as data distributions shift. If no one is watching — really watching, not just technically capable of watching — the institution is running uncontrolled risk.
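Drift can be measured in several ways. One widely used metric is the Population Stability Index (PSI), which compares the distribution of a score or input at deployment with its distribution today. I am using it here as an example, not as something the FS AI RMF prescribes; the bin proportions are made up, and any alert threshold would be the institution's own to set and justify.

```python
import math

def psi(expected: list[float], actual: list[float], eps: float = 1e-6) -> float:
    """Population Stability Index over matching bins of proportions.

    expected: share of the population in each bin at deployment (baseline)
    actual:   share of the population in each bin today
    """
    if len(expected) != len(actual):
        raise ValueError("bin counts must match")
    total = 0.0
    for e, a in zip(expected, actual):
        e, a = max(e, eps), max(a, eps)  # guard against log(0) on empty bins
        total += (a - e) * math.log(a / e)
    return total

# Illustrative score distributions across five bins.
baseline = [0.10, 0.20, 0.40, 0.20, 0.10]
current = [0.05, 0.15, 0.35, 0.25, 0.20]

score = psi(baseline, current)
print(f"PSI = {score:.3f}")
```

The number alone proves nothing to an examiner. What matters is the record showing who reviewed it, on what date, against what threshold, and what they decided.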

The gap between policy and practice

Every institution I examined had policies. The question was never whether the policy existed. The question was whether anyone followed it.

The policy says model validation occurs annually. When was the last validation? The policy says the AI Governance Board meets quarterly. Produce the minutes from the last four meetings. The policy says high-risk AI use cases require a documented impact assessment before deployment. Show me the impact assessment for this system that went live in November.

Examiners check. They pull the calendar against the policy. They compare what was supposed to happen with what actually happened. And the gap between those two things — the policy-practice gap — is one of the most common sources of findings in any examination.
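Pulling the calendar against the policy is arithmetic, and an institution can do it before the examiner arrives. A minimal sketch, assuming each policy commitment is stored as a cadence in days alongside the last date the activity can be evidenced; the commitments and dates below are hypothetical.

```python
from datetime import date, timedelta

def days_overdue(last_performed: date, cadence_days: int, as_of: date) -> int:
    """Days past due as of a given date; 0 if still within the cadence."""
    due = last_performed + timedelta(days=cadence_days)
    return max((as_of - due).days, 0)

# Hypothetical commitments: name -> (cadence in days, last evidenced date).
policy = {
    "model_validation": (365, date(2024, 5, 10)),
    "governance_board_meeting": (92, date(2025, 9, 18)),
}

as_of = date(2025, 11, 1)
policy_gaps = {
    name: days_overdue(last, cadence, as_of)
    for name, (cadence, last) in policy.items()
}
print(policy_gaps)
```

The "last evidenced date" is the important input. A meeting that happened without minutes counts as a meeting that did not happen.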

With AI governance, the policy-practice gap is often enormous, because institutions adopted frameworks and published policies under pressure without building the operational infrastructure to execute them. The policy looks good. The practice doesn't exist yet. Examiners will find that gap.

Third-party governance

This is where most banks will stumble hardest. The institution uses a vendor's AI for credit scoring, fraud detection, or customer interaction. The vendor built it, trained it, and hosts it. The institution deployed it.

Show me how you govern what you didn't build.

Can you explain how the vendor's model was developed and validated? Do you have access to testing results? Do you have contractual rights to audit? Are you monitoring the vendor's model performance independently, or are you relying on the vendor's own reporting? If the vendor updates the model, how do you know, and what review process triggers?

Most banks can't answer these questions. They treated vendor AI the way they used to treat vendor software: buy it, deploy it, trust it. But AI systems aren't static software. They change behavior as data changes. A vendor model that performed well at purchase can perform differently six months later, and if the institution isn't independently monitoring, it won't know until the harm has occurred.

OCC guidance on third-party risk management already requires this oversight. The FS AI RMF's 230 control objectives make it specific and testable. Examiners will have a checklist.

The FS AI RMF changes the game

Before the Financial Services AI Risk Management Framework, examiners assessed AI governance against general principles: SR 11-7 for model risk, third-party guidance, and the institution's own policies. The assessment was somewhat subjective. A well-prepared institution could argue its approach was reasonable.

Two hundred thirty control objectives change that calculus. The FS AI RMF gives examiners a specific, granular checklist. The question shifts from "do you have governance?" to "can you evidence these specific controls?" That's a fundamentally different examination. It's the difference between a conversation and a test.

Institutions that haven't mapped their current controls against the FS AI RMF's 230 objectives are preparing for the wrong examination. They're preparing for the conversation. The test is coming.
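The mapping exercise itself can start as a simple table: each control objective, the internal control that addresses it, and the evidence on file. The sketch below uses placeholder objective IDs, which are not the framework's real numbering, along with hypothetical control names and file names.

```python
# Placeholder IDs only; substitute the framework's actual control objectives.
objectives = ["OBJ-001", "OBJ-002", "OBJ-003", "OBJ-004"]

# Hypothetical mapping of objectives to internal controls and their evidence.
mappings = {
    "OBJ-001": {"control": "Model inventory procedure", "evidence": ["inventory_2025Q3.xlsx"]},
    "OBJ-002": {"control": "Pre-deployment impact assessment", "evidence": []},
}

def coverage(objectives: list[str], mappings: dict[str, dict]) -> dict:
    """Split objectives into unmapped, mapped without evidence, and evidenced."""
    unmapped = [o for o in objectives if o not in mappings]
    unevidenced = [o for o in objectives if o in mappings and not mappings[o]["evidence"]]
    evidenced = len(objectives) - len(unmapped) - len(unevidenced)
    return {
        "unmapped": unmapped,
        "unevidenced": unevidenced,
        "evidenced_share": evidenced / len(objectives),
    }

report = coverage(objectives, mappings)
print(report)
```

The middle category deserves the most attention. A control that exists on paper with no evidence behind it is the policy-practice gap in table form.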

What examiner-ready documentation actually looks like

Compliance vendors will sell you AI governance documentation packages. Templated risk assessments. Pre-built policy libraries. Dashboard-ready monitoring frameworks.

I've reviewed vendor-produced examination prep materials. The pattern is consistent: they look professional, they check surface-level boxes, and they collapse under the first substantive question.

Examiner-ready documentation is not about volume or polish. It's about specificity and evidence. A risk assessment that says "this system may produce biased outcomes" is not examiner-ready. A risk assessment that says "we tested this system for disparate impact across these protected classes, using this methodology, on this date, with these results, and took these specific actions in response" — that's examiner-ready.
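To show what "this methodology, on this date, with these results" might look like in a record, here is a sketch using one common screening metric, the adverse impact ratio: a group's approval rate divided by the reference group's approval rate. The counts are made up, the 0.80 screening threshold is an assumed convention rather than a regulatory requirement, and a real fair lending analysis would go well beyond a single ratio.

```python
def approval_rate(approved: int, applicants: int) -> float:
    """Share of applicants approved."""
    if applicants <= 0:
        raise ValueError("applicants must be positive")
    return approved / applicants

def adverse_impact_ratio(group_rate: float, reference_rate: float) -> float:
    """Ratio of a group's approval rate to the reference group's rate."""
    return group_rate / reference_rate

# Made-up counts for illustration only.
reference_rate = approval_rate(approved=600, applicants=1000)
group_rate = approval_rate(approved=270, applicants=600)
ratio = adverse_impact_ratio(group_rate, reference_rate)

SCREENING_THRESHOLD = 0.80  # assumption; set and justified by institution policy

# The record captures the specifics an examiner asks for.
record = {
    "system": "hypothetical credit decisioning model",
    "methodology": "adverse impact ratio on approval rates",
    "test_date": "2025-10-15",
    "ratio": round(ratio, 2),
    "screening_threshold": SCREENING_THRESHOLD,
    "action": "escalated for review" if ratio < SCREENING_THRESHOLD else "none required",
}
print(record)
```

The calculation is trivial. The record around it, with a date, a method, a result, and an action, is what separates the second risk assessment from the first.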

That is the gap between describing a risk and demonstrating that you managed it.

The banks that prepare now

Financial regulators are building their AI examination capabilities. OCC, FDIC, and the Fed are training examiners, developing examination procedures, and aligning on the FS AI RMF as a reference framework. The examination cycle is coming. It's not a question of whether — it's a question of when your institution is in scope.

The banks that prepare now will pass examination. They'll have the trail, the names, the evidence, and the specificity that examiners require. The ones that wait will explain to their board why they didn't — and that's a conversation no executive wants to have.

I've sat on the examiner's side of that table, and I've built the systems that get examined. Prepare now.