The AI Inventory Problem: Most Companies Don't Know What They're Running
The first time someone at a Fortune 50 company tried to build an AI inventory, they sent an email: a polite, well-formatted message to every VP and director, asking teams to self-report any AI tools, models, or APIs currently in use. The response rate was about 30%.
The teams that responded were the ones already doing governance well. They had documentation. They had model cards. They could name the datasets, the owners, the downstream consumers. They were the teams that didn't need the email.
The teams that didn't respond were the problem. And nobody had a mechanism to find out what they were running.
Shadow AI isn't theoretical
Shadow IT has been a governance challenge for twenty years. Shadow AI is the same problem with higher stakes and less visibility. A marketing team signs up for a content generation tool using a corporate credit card. A product team embeds an API call to a third-party model inside a microservice. An analyst downloads an open-source model and runs it on a laptop against production data.
None of these show up in the IT asset inventory. None of them go through procurement review. None of them get a security assessment or a data protection impact assessment. They exist in the gap between what the organization officially deploys and what individuals and teams actually use.
I watched this happen in real time. A team embedded a summarization API into an internal tool. It worked well. People liked it. Six months later, during a routine data flow audit, someone realized the tool was sending customer support tickets — including customer names, account numbers, and complaint details — to a third-party API endpoint. The terms of service for that API included the right to use submitted data for model training.
Nobody had done anything malicious. Nobody had even done anything unusual by the standards of how software gets built today. They found a tool that solved a problem, they integrated it, and they moved on. The gap wasn't intent. It was visibility.
The self-report trap
The email approach fails for predictable reasons. Self-reporting requires teams to know what counts as AI. That sounds obvious, but it isn't. Does a rules-based recommendation engine count? What about a vendor product that uses ML internally but presents itself as a SaaS analytics tool? What about an Excel plugin that calls a model? The boundary of "AI" is blurry, and teams on the wrong side of that boundary don't self-identify.
Self-reporting also requires teams to see the value in responding. If the governance team is perceived as a compliance function that slows things down, the rational response from a shipping team is silence. They know that reporting a tool might trigger a review. The review might take months. The tool might get blocked. The project has a deadline. The calculus is straightforward.
The teams most likely to be running unvetted AI are the ones with the least incentive to report it. This isn't a communication problem. It's a structural one.
What an actual inventory looks like
The organizations that have solved this — and there aren't many — didn't start with a survey. They started with infrastructure.
Network traffic analysis reveals API calls to known AI service endpoints. Cloud billing records show which teams are paying for compute that maps to model training or inference. Code repository scans identify imports of ML libraries, API client integrations, and model artifacts. Procurement records surface vendor contracts that include AI capabilities, even when the primary product isn't marketed as AI.
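As a rough illustration of the code repository scan, here is a minimal Python sketch. The watchlists (ML_IMPORTS, AI_HOSTS) and the function names are assumptions made for this example, not a description of any particular organization's tooling. A real scan would cover more languages, dependency manifests, and model artifact files.

```python
import re
from pathlib import Path

# Illustrative watchlists (assumptions): extend these for your own environment.
ML_IMPORTS = {"torch", "tensorflow", "sklearn", "transformers", "openai", "anthropic"}
AI_HOSTS = {"api.openai.com", "api.anthropic.com"}

# Captures the top-level package in "import x" or "from x.y import z" lines.
IMPORT_RE = re.compile(r"^[ \t]*(?:import|from)[ \t]+([A-Za-z_][A-Za-z0-9_]*)", re.MULTILINE)


def scan_source(text: str) -> dict:
    """Return the watched ML imports and AI hostnames found in one file's text."""
    imports = {name for name in IMPORT_RE.findall(text) if name in ML_IMPORTS}
    hosts = {host for host in AI_HOSTS if host in text}
    return {"imports": sorted(imports), "hosts": sorted(hosts)}


def scan_repo(root: str) -> dict:
    """Walk a checkout and report every Python file with at least one hit."""
    findings = {}
    for path in Path(root).rglob("*.py"):
        result = scan_source(path.read_text(encoding="utf-8", errors="ignore"))
        if result["imports"] or result["hosts"]:
            findings[str(path)] = result
    return findings
```

The same hostname list could be matched against proxy or DNS logs for the network traffic signal. The output is a list of files for a human to review, not a verdict on whether a given use is approved.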
This is unglamorous work. It requires coordination between security, infrastructure, procurement, and finance teams that rarely operate in concert. It produces a spreadsheet, not a dashboard. But it produces something that reflects reality rather than self-perception.
One organization discovered, through this process, that they had over 60 distinct AI-adjacent tools and models in use across the enterprise. Their self-reported inventory had captured 11.
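The gap between the two lists is easy to quantify once both exist. The sketch below is a hypothetical reconciliation helper; the function name and the returned field names are assumptions made for illustration.

```python
def reconcile(self_reported: set, discovered: set) -> dict:
    """Compare a self-reported inventory against what discovery actually found."""
    # Found running, but never reported by any team.
    unreported = discovered - self_reported
    # Reported by a team, but not seen by discovery (possibly retired or misnamed).
    unconfirmed = self_reported - discovered
    # Share of discovered systems that the self-report captured.
    coverage = len(self_reported & discovered) / len(discovered) if discovered else 1.0
    return {
        "unreported": sorted(unreported),
        "unconfirmed": sorted(unconfirmed),
        "coverage": coverage,
    }
```

Applied to the figures above, and assuming all 11 self-reported items were also among the discovered ones, 11 out of more than 60 works out to a coverage of at most about 18%.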
The catalog is the foundation
Every downstream governance activity depends on knowing what exists. Risk assessment requires a catalog. Compliance mapping requires a catalog. Incident response requires knowing which systems use AI so you can assess blast radius. Data protection impact assessments require knowing which models touch personal data.
Without the inventory, governance is performative. You can write policies, build frameworks, assign roles, and publish principles. None of it connects to reality until you know what's actually running, where it's running, what data it consumes, and who is responsible for it.
This is the part that doesn't make it into the AI governance maturity model slide deck. The maturity model assumes you know what you're governing. The first real step is finding out.
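To make the catalog concrete, here is a minimal sketch of what one entry might record, following the four questions above: what is running, where, on what data, and who owns it. The class name, field names, and the needs_attention rule are assumptions for illustration, not a standard schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CatalogEntry:
    """One row in a hypothetical AI inventory."""
    name: str                                   # what is running
    location: str                               # where it runs (service, account, laptop)
    data_consumed: List[str] = field(default_factory=list)  # what data it consumes
    owner: Optional[str] = None                 # who is responsible; None if unknown
    touches_personal_data: bool = False         # drives data protection impact assessments
    authorized: bool = False                    # went through procurement and review


def needs_attention(entries: List[CatalogEntry]) -> List[CatalogEntry]:
    """Flag entries with no owner, or unreviewed entries that touch personal data."""
    return [
        e for e in entries
        if e.owner is None or (e.touches_personal_data and not e.authorized)
    ]
```

Even a structure this small supports the downstream activities: filtering on touches_personal_data scopes the impact assessments, and a missing owner is itself a finding.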
The uncomfortable question
Most organizations, if they're honest, can't answer a basic question: how many AI systems are currently operating in the enterprise?
Not "how many did we officially deploy." How many are running — authorized or not, procured or downloaded, managed or forgotten. The number is always larger than leadership expects. Sometimes dramatically larger.
The inventory won't build itself. The email won't work. The only path is to instrument the infrastructure, audit the procurement records, scan the code, and accept that the number you find will be uncomfortable.
That discomfort is the starting point. Everything else in AI governance is built on top of it, or built on nothing.