The AI Inventory Problem: Most Companies Don't Know What They're Running

The first time most large enterprises try to build an AI inventory, they send an email. A polite, well-formatted email to every VP and director, asking teams to self-report any AI tools, models, or APIs currently in use. The response rate hovers around 30%.

The teams that respond are the ones already doing governance well. They have documentation. They have model cards. They can name the datasets, the owners, the downstream consumers. They are the teams that don't need the email.

The teams that don't respond are the problem. And nobody has a mechanism to find out what they're running.

Shadow AI isn't theoretical

Shadow IT has been a governance challenge for twenty years. Shadow AI is the same problem with higher stakes and less visibility. A marketing team signs up for a content generation tool using a corporate credit card. A product team embeds an API call to a third-party model inside a microservice. An analyst downloads an open-source model and runs it on a laptop against production data.

None of these show up in the IT asset inventory. None of them go through procurement review. None of them get a security assessment or a data protection impact analysis. They exist in the gap between what the organization officially deploys and what individuals and teams actually use.

Consider a team that embeds a summarization API into an internal tool. It works well. People like it. Six months later, during a routine data flow audit, someone realizes the tool has been sending customer support tickets — including customer names, account numbers, and complaint details — to a third-party API endpoint. The terms of service for that API include the right to use submitted data for model training.

Nobody did anything malicious. Nobody even did anything unusual by the standards of how software gets built today. They found a tool that solved a problem, they integrated it, and they moved on. The gap wasn't intent. It was visibility.

The self-report trap

The email approach fails for predictable reasons. Self-reporting requires teams to know what counts as AI. That sounds obvious, but it isn't. Does a rules-based recommendation engine count? What about a vendor product that uses ML internally but presents itself as a SaaS analytics tool? What about an Excel plugin that calls a model? The boundary of "AI" is blurry, and teams on the wrong side of that boundary don't self-identify.

Self-reporting also requires teams to see the value in responding. If the governance team is perceived as a compliance function that slows things down, the rational response from a shipping team is silence. They know that reporting a tool might trigger a review. The review might take months. The tool might get blocked. The project has a deadline. The calculus is straightforward.

The teams most likely to be running unvetted AI are the ones with the least incentive to report it. This isn't a communication problem. It's a structural one.

What an actual inventory looks like

The organizations that have solved this — and there aren't many — didn't start with a survey. They started with infrastructure.

Network traffic analysis reveals API calls to known AI service endpoints. Cloud billing records show which teams are paying for compute that maps to model training or inference. Code repository scans identify imports of ML libraries, API client integrations, and model artifacts. Procurement records surface vendor contracts that include AI capabilities, even when the primary product isn't marketed as AI.
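
Two of these signals, the code scan and the traffic check, are simple enough to sketch. The Python below is a minimal illustration under stated assumptions, not a production scanner: the package and hostname watchlists are illustrative and far from complete, the function names are hypothetical, and the traffic check assumes egress or proxy logs that record full URLs.

```python
import ast
from collections import Counter
from urllib.parse import urlsplit

# Illustrative watchlists only (assumptions); a real scan needs far longer lists.
ML_PACKAGES = {"torch", "tensorflow", "sklearn", "transformers", "openai", "anthropic"}
AI_HOST_SUFFIXES = ("api.openai.com", "api.anthropic.com")


def ml_imports(source: str) -> set[str]:
    """Return the watched top-level packages imported by a Python source string."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 skips relative imports such as "from . import x"
            names = [node.module]
        else:
            continue
        found.update(name.split(".")[0] for name in names)
    return found & ML_PACKAGES


def ai_hosts(urls: list[str]) -> Counter:
    """Count logged requests whose hostname matches a watched AI endpoint."""
    hits = Counter()
    for url in urls:
        host = (urlsplit(url).hostname or "").lower()
        if any(host == s or host.endswith("." + s) for s in AI_HOST_SUFFIXES):
            hits[host] += 1
    return hits
```

Neither check proves anything on its own. An import may be dead code, and a hostname says nothing about what data was sent. Each hit is a lead that points at a team to talk to.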

This is unglamorous work. It requires coordination between security, infrastructure, procurement, and finance teams that rarely operate in concert. It produces a spreadsheet, not a dashboard. But it produces something that reflects reality rather than self-perception.

Organizations that complete this process routinely discover 60+ distinct AI-adjacent tools and models in use across the enterprise. Their self-reported inventories typically capture only a fraction of that.
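
The gap between the two lists is worth measuring rather than estimating. A minimal sketch in Python, assuming both inventories have already been normalized to comparable tool names; the function name and the returned keys are hypothetical.

```python
def reconcile(discovered: set[str], self_reported: set[str]) -> dict:
    """Compare infrastructure findings against what teams reported themselves."""
    confirmed = discovered & self_reported
    return {
        # Found in traffic, billing, code, or contracts, but never reported.
        "unreported": sorted(discovered - self_reported),
        # Reported by a team, but not yet seen in any infrastructure signal.
        "unverified": sorted(self_reported - discovered),
        # Share of discovered tools that the self-report actually captured.
        "coverage": len(confirmed) / len(discovered) if discovered else 0.0,
    }
```

The unverified list matters too. A tool that a team reported but that no scan detected points to a blind spot in the instrumentation.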

The catalog is the foundation

Every downstream governance activity depends on knowing what exists. Risk assessment requires a catalog. Compliance mapping requires a catalog. Incident response requires knowing which systems use AI so you can assess blast radius. Data protection impact assessments require knowing which models touch personal data.

Without the inventory, governance is performative. You can write policies, build frameworks, assign roles, and publish principles. None of it connects to reality until you know what's actually running, where it's running, what data it consumes, and who is responsible for it.
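
A catalog entry does not need to be elaborate to support those activities. The sketch below, in Python, shows one possible shape for a record that answers the four questions above, with a few of the queries that risk, incident response, and data protection teams would run against it. Every field and function name here is a hypothetical choice for illustration, not a standard schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CatalogEntry:
    name: str                         # what is running
    runs_on: str                      # where it is running
    data_categories: frozenset[str]   # what data it consumes
    owner: Optional[str] = None       # who is responsible; None means nobody yet
    vendor: Optional[str] = None      # third party behind it, if any


def touching(catalog: list[CatalogEntry], category: str) -> list[CatalogEntry]:
    """Entries that consume a given data category, e.g. for impact assessments."""
    return [e for e in catalog if category in e.data_categories]


def unowned(catalog: list[CatalogEntry]) -> list[CatalogEntry]:
    """Entries with no accountable owner recorded."""
    return [e for e in catalog if e.owner is None]


def from_vendor(catalog: list[CatalogEntry], vendor: str) -> list[CatalogEntry]:
    """Entries that depend on one vendor, a starting point for blast radius."""
    return [e for e in catalog if e.vendor == vendor]
```

A spreadsheet with these five columns does the same job. The point is that each governance question becomes a filter over the catalog rather than a new round of emails.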

This is the part that doesn't make it into the AI governance maturity model slide deck. The maturity model assumes you know what you're governing. The first real step is finding out.

The uncomfortable question

Most organizations, if they're honest, can't answer a basic question: how many AI systems are currently operating in the enterprise?

Not "how many did we officially deploy." How many are running — authorized or not, procured or downloaded, managed or forgotten. The number is always larger than leadership expects. Sometimes dramatically larger.

The inventory won't build itself. The email won't work. The only path is to instrument the infrastructure, audit the procurement records, scan the code, and accept that the number you find will be uncomfortable.

That discomfort is the starting point. Everything else in AI governance is built on top of it, or built on nothing.