Enterprise AI

While AI Demos May Succeed, Deployments Fail Without Architecture-Led Strategy

AI Data Press News Team | April 28, 2026

Chris Obermeier, Senior Vice President of Engineering at Xpanse, explains why top-down mandates to "do AI" produce fragile demos that collapse in production, and why durable gains come only from treating AI as a governed tool within existing architecture.

"AI is a tool in the toolbox. If you’re trying to ‘do AI’ as a standalone thing, you’re already setting yourself up to fail because it ends up as a bolt-on that works in a demo but breaks in production."

Chris Obermeier

SVP of Engineering
Xpanse

There is a phrase making the rounds in enterprise engineering that nobody wants to hear but everybody recognizes: "Do AI." It arrives in Slack messages from the C-suite. It lands on engineering leaders as a mandate that assumes the hard part is building the model, when the hard part is everything else.

The pattern that follows is well-documented. A team builds a demo. Leadership sees it work on a small dataset. They assume it can be productionized. The demo gets pushed toward enterprise scale, and then it either stalls in UAT or crashes because nobody thought about security, compliance, or how the system fits into the architecture that already exists.

Chris Obermeier has watched this pattern play out from inside the kinds of organizations where the stakes are highest.

Obermeier has spent 18 years in software engineering, including nearly a decade leading engineering leaders. He came up through GE Healthcare, Johnson Controls, and Rocket Mortgage, where he directed the modernization of mission-critical capital markets systems and embedded ML models into pricing and risk workflows. Now, as SVP of Engineering at Xpanse, he leads over 100 engineers across four divisions building an AI-driven document validation platform for mortgage lenders and financial institutions, one of the most tightly regulated corners of enterprise software.

Treating agents like employees

The question of AI agent access control is straightforward in theory and deeply complex in practice. Obermeier describes an approach that mirrors traditional identity management: giving agents role-based permissions the same way you would a human employee. While the principle isn't new, the implementation breaks down when agents take indirect paths to conclusions.

"What do you do if it takes a roundabout way and then comes to the conclusion that it needs this access?" Obermeier says. "It becomes very intricate very quickly, how you're going to handle people requesting data and then having multiple agents being the arbiters of whether or not they can."

The result is a growing density of human-in-the-loop checkpoints. That tradeoff between innovation speed and security is manageable when the number of agents is small. At enterprise scale, the governance infrastructure required to support autonomous agents may rival the complexity of the agents themselves, a challenge that zero-trust architectures are increasingly being adapted to address.
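The shape of that problem is easier to see in miniature. Below is a minimal sketch of the pattern Obermeier describes, not any particular product's API: agent identities carry role-scoped permissions exactly as an employee account would, and any request outside that scope falls through to a human checkpoint instead of being auto-granted. The `AgentIdentity` and `AccessDecision` types and the role names are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical role-to-permission map, mirroring how an employee's
# role would scope their data access in a traditional IAM system.
ROLE_PERMISSIONS = {
    "pricing_agent": {"read:rate_sheets", "read:loan_pipeline"},
    "doc_validation_agent": {"read:documents", "write:validation_results"},
}

@dataclass
class AgentIdentity:
    name: str
    role: str

@dataclass
class AccessDecision:
    allowed: bool
    needs_human_review: bool = False
    reason: str = ""

def check_access(agent: AgentIdentity, permission: str) -> AccessDecision:
    """Grant only what the agent's role explicitly scopes.

    The 'roundabout path' case -- an agent concluding mid-task that it
    needs data outside its role -- is escalated to a human checkpoint,
    never auto-granted.
    """
    granted = ROLE_PERMISSIONS.get(agent.role, set())
    if permission in granted:
        return AccessDecision(allowed=True, reason="within role scope")
    return AccessDecision(
        allowed=False,
        needs_human_review=True,
        reason=f"{permission!r} is outside role {agent.role!r}",
    )

agent = AgentIdentity(name="pricer-01", role="pricing_agent")
print(check_access(agent, "read:rate_sheets"))    # allowed
print(check_access(agent, "read:trade_secrets"))  # escalates to a human
```

Every `needs_human_review` escalation in a sketch like this is a checkpoint someone has to staff, which is exactly the governance density Obermeier warns grows with the agent count.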

The risk you already have

Obermeier is focused on agent governance, but he's clear about what he considers the larger threat: the humans.

Shadow AI has become the operational risk enterprise leadership is least prepared for. "If you don't provide a solution that's going to work for people at your company, and you don't completely lock it down, you should expect that they're going to use it," Obermeier says. "You literally have a human that could accidentally copy and paste all of your key financial data or trade secrets and just upload it."

The options are binary: lock down third-party tools entirely and lose the force-multiplier effect, or build sanctioned alternatives fast enough that employees never need unsanctioned ones. Obermeier's view is pragmatic: the second option is the only realistic one. The shadow AI problem is not a technology problem. It is a leadership problem that manifests as a technology risk.

On the coding side, Obermeier's teams use AI heavily, but the ownership model is non-negotiable. "There's no bucket that says, 'this is owned by Claude,'" he says. "There's still an engineer that has to go and push that code and sign off for it. And then there's going to be another engineer or two that do code reviews."

The GenAI and ML ops teams have broader access to models and experimentation sandboxes. Everyone else uses AI as an augment: IDE plugins, code assistants, generative tools for scaffolding. But the same review standard applies across every team. The friction he watches for is junior engineers who lean on AI without understanding what it produces: velocity that looks great right up until something breaks and nobody can explain why.
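That ownership rule is enforceable mechanically, not just culturally. As a minimal sketch, assuming commits follow the common `Signed-off-by:` trailer convention and a reasonably recent Git, a CI step can refuse to merge any change that lacks a human sign-off. The bot identity list below is illustrative, not Xpanse's actual tooling.

```python
import subprocess
import sys

# Illustrative bot identities; a real setup would list whatever
# automated committer accounts the org's AI tooling uses.
BOT_SIGNERS = {"claude[bot]", "github-actions[bot]"}

def commits_in_range(rev_range: str) -> list[str]:
    """Commit hashes in a range such as 'origin/main..HEAD'."""
    out = subprocess.run(["git", "rev-list", rev_range],
                         capture_output=True, text=True, check=True)
    return out.stdout.split()

def has_human_signoff(commit: str) -> bool:
    """True if a non-bot person added a Signed-off-by trailer."""
    trailers = subprocess.run(
        ["git", "show", "-s",
         "--format=%(trailers:key=Signed-off-by,valueonly)", commit],
        capture_output=True, text=True, check=True,
    ).stdout
    signers = [s.strip() for s in trailers.splitlines() if s.strip()]
    return any(not any(bot in s for bot in BOT_SIGNERS) for s in signers)

if __name__ == "__main__":
    missing = [c for c in commits_in_range("origin/main..HEAD")
               if not has_human_signoff(c)]
    if missing:
        print("Commits missing a human Signed-off-by:", *missing, sep="\n  ")
        sys.exit(1)  # block the merge; a named engineer must own the change
```

A gate like this makes the "no bucket owned by Claude" rule auditable rather than aspirational.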

Two to three times, not ten

Obermeier is blunt about the productivity numbers. "I would say it's more of a two to three times. A lot of people like to say ten. I would really push back on that." He thinks the honest number might be closer to two, because responsible adoption means slowing down enough to catch the mistakes AI introduces. The failure modes are predictable: engineers get caught in loops where the model fixes the symptom instead of diagnosing the disease.

"It doesn't give you the right answer. It gives you what it thinks you want to hear," Obermeier notes. "If you say 'fix my import statements' and there's a different problem, it's not going to be like your engineer friend that says, 'Why are you doing that?' It's just going to do it."

His advice maps to the same principle CIOs are applying to broader AI strategy: be prescriptive, use test-driven approaches, and never treat velocity as a proxy for quality.
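Concretely, test-driven here means writing the contract before prompting the model. A minimal sketch, assuming pytest; `parse_loan_amount` and its module are hypothetical examples, not Xpanse code. The engineer writes the failing tests first, so the prompt becomes "make these pass," an instruction the model cannot quietly reinterpret the way it can "fix my import statements."

```python
# test_parse_loan_amount.py -- written by the engineer *before* asking
# an assistant for the implementation, so the spec is explicit.
import pytest

from loan_parsing import parse_loan_amount  # hypothetical module under test

def test_plain_number():
    assert parse_loan_amount("250000") == 250_000

def test_currency_formatting():
    assert parse_loan_amount("$1,250,000.00") == 1_250_000

def test_rejects_non_numeric_input():
    # An assistant that only 'fixes imports' can't make this pass;
    # it has to handle the actual failure mode.
    with pytest.raises(ValueError):
        parse_loan_amount("TBD")
```

The tests are the prescription; velocity then measures progress against a spec rather than the volume of generated code.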

Zero to negative one

The strongest pattern in Obermeier's experience is the failure mode at the boundary between demo and production. The trigger is almost always top-down pressure to ship AI without architectural grounding. "They try to bolt these things on, and it might work in that demo. It might work for a hundred or a thousand unique visitors," he says. "And then you put it into enterprise scale and it crashes because all the vetting around security, compliance, risk, auditability, none of those things were ever thought about."

The teams that succeed take the opposite approach. They start with the existing architecture, identify where AI creates the highest-leverage improvement, and treat it as one tool among many. "If the car is pointed in the wrong direction, you're just going to go the wrong way faster," Obermeier says. "But if you have good architecture and it's part of your operating model, then it will absolutely speed you up to your destination."

On KPIs, his position is that AI should not get its own scorecard. "If your goal is to drive revenue and that's what it's always been, why would it change?" The one early-stage metric he endorses is adoption, the leading indicator that tells you whether a tool is earning its place in the workflow. Everything else should revert to the KPIs that existed before AI entered the conversation.
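Adoption is also cheap to measure. A minimal sketch of that leading indicator, assuming per-engineer usage events can be exported from the tool's audit log; the event shape and numbers here are hypothetical. It is weekly active users over the eligible population, read as a trend.

```python
from collections import defaultdict

# Hypothetical (engineer_id, ISO week) usage events from an audit log.
events = [("eng-1", "2026-W15"), ("eng-2", "2026-W15"),
          ("eng-1", "2026-W16"), ("eng-3", "2026-W16")]
ELIGIBLE_ENGINEERS = 100  # everyone who could be using the tool

weekly_users: dict[str, set[str]] = defaultdict(set)
for engineer, week in events:
    weekly_users[week].add(engineer)

for week in sorted(weekly_users):
    rate = len(weekly_users[week]) / ELIGIBLE_ENGINEERS
    print(f"{week}: {rate:.0%} adoption")  # the trend is the signal, not the snapshot
```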

The enterprise AI landscape is littered with demos that worked and deployments that didn't. The distance between the two is not the model, the framework, or the executive enthusiasm behind the initiative. It is whether the organization treated AI adoption as an architecture discipline or a feature request.