The generative AI honeymoon is winding down, and the reality of enterprise IT plumbing is setting in. As companies try to scale up their early experiments, many discover their AI initiatives stall out under the weight of tangled infrastructure. The friction in moving from a proof of concept to a real-world deployment rarely stems from the models themselves. More often, the bottleneck is the chaotic tooling and undocumented processes wrapped around those models. Engineering teams tend to assume modern cloud AI services are ready to ship out of the box, but turning a local experiment into a reliable enterprise system requires rigorous, boring IT discipline.
Sanjar Sapar spends his days deep in these exact infrastructure trenches. Currently a Senior Site Reliability Engineer at Early Warning and the Founder of both Adams Intelligence Consulting and the upskilling academy Remoder, he has co-authored over 120,000 lines of infrastructure as code and managed more than 100 AWS accounts for Fortune 100 companies. From his vantage point, the teams that successfully push AI into production are the ones that treat it exactly like traditional software: with strict architecture, slow rollouts, and a heavy focus on cost and reliability.
When tasked with building AI agents for his own organization, Sapar deliberately resists the urge to design an all-encompassing system on day one. "Start small, think big—incremental AI deployments win where over-engineering fails," he says. "When you try to architect the entire system upfront and deploy it all at once, that's where everything starts to break."
Skip the magic switch: To explain why so many teams struggle to deploy a simple agentic AI platform, he points back to the early days of DevOps. Back then, enthusiasm for new toys outpaced operational discipline, leaving organizations with overlapping platforms that broke deployment workflows. Today, he watches AI teams repeat the exact same pattern. "This whole idea that it's going to be overnight, or just take it and put it in production, it doesn't work that way. It has to be a well-architected framework," he adds.
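In practice, skipping the magic switch usually means gating a new agent behind a gradual rollout instead of a global cutover. Here is a minimal sketch of that idea in Python; the feature name, user ID, and percentage thresholds are all hypothetical:

```python
import hashlib

def in_rollout(user_id: str, feature: str, rollout_pct: float) -> bool:
    """Deterministically assign a user to a gradual rollout bucket.

    Hashing (feature, user_id) gives each user a stable value in [0, 100),
    so raising rollout_pct from 1 to 5 to 25 widens the audience gradually,
    with no single overnight switch.
    """
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) % 10000 / 100.0  # stable value in [0, 100)
    return bucket < rollout_pct

# Start small: route 5% of users to the new AI agent; everyone else
# stays on the existing workflow until the slice proves itself.
if in_rollout(user_id="user-4217", feature="support-agent-v2", rollout_pct=5.0):
    print("route to new agent")
else:
    print("route to legacy workflow")
```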
Industry observers frequently blame the human element for stalled rollouts, but Sapar sees a 50/50 split between people and technology. When engineers hastily stack proprietary models to compress timelines, they create massive cognitive overload and scatter their focus, a problem compounded by the immaturity of the tooling itself.
The ten-tool pileup: The market is flooded with immature solutions, and the current state of enterprise AI forces developers to constantly chase moving targets. Sapar notes that out of ten tools, only one or two mature properly. "The rest just scatter the whole thing. If a company starts using the other eight tools, it makes it very hard for the rest of the engineering world to start implementing and making bug fixes."
That tool sprawl inevitably intersects with older, revenue-generating legacy infrastructure. Rather than ripping and replacing entire stacks or forcing a massive cloud migration, Sapar advocates for a decoupled, incremental approach. Teams should break applications into microservices and modernize in thin slices. That strategy requires making deliberate, sometimes contrarian choices about what actually belongs in the public cloud. For workloads that are already stable and cost-effective, he actively encourages keeping them hybrid or on-premise.
Slice, don't shift: Sapar describes his approach to large migration projects: "If I have 20 apps, I start with the first one or two, then slowly transition the rest of them. If we try to move all 20 in two months, I'm not really fond of that because you don't know what kind of problems it's going to bring you."
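His "one or two apps first" approach is essentially the strangler-fig pattern: a routing layer sends traffic for already-migrated slices to new services while everything else keeps hitting the legacy system. A minimal sketch of that routing seam, with hypothetical service names and URLs:

```python
# Strangler-fig routing sketch: only explicitly migrated slices leave the
# legacy system; everything else stays put until its turn comes.
LEGACY_BACKEND = "https://legacy.internal.example.com"

# Migrate one or two apps at a time; extend this map as slices stabilize.
MIGRATED_SLICES = {
    "billing": "https://billing.svc.example.com",
    "invoicing": "https://invoicing.svc.example.com",
}

def route(path: str) -> str:
    """Return the backend that should serve this request path."""
    slice_name = path.strip("/").split("/", 1)[0]
    return MIGRATED_SLICES.get(slice_name, LEGACY_BACKEND)

print(route("/billing/statements"))  # new microservice
print(route("/reports/quarterly"))   # still the legacy monolith
```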
Penny-pinching on-prem: He pushes back on cloud-first absolutism, seeing it as a microcosm of the dogmatic decision-making that inflates budgets and limits scalability. "Why do they need to be on-premise? Because it's way cheaper. Why would I take something that's working perfectly fine, saving my client money, and bring it to the cloud just because it's modern tech?"
Financial pragmatism sits at the very center of Sapar's philosophy. The friction in transitioning from a pilot to full deployment is often an economic hurdle masquerading as a technical one. Architects frequently weigh the upfront time investment of open-source stacks against the instant gratification of proprietary services. By choosing open-source sandboxes, engineering teams can build the same capabilities while maintaining control over their architecture and their wallets.
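One common way teams keep that control is a thin abstraction between application code and the model provider, so a proprietary API can later be swapped for a self-hosted open-source model without a rewrite. A sketch of that seam in Python; the class names and placeholder completions are illustrative, not any vendor's actual SDK:

```python
from typing import Protocol

class TextModel(Protocol):
    """The only model surface application code may depend on."""
    def complete(self, prompt: str) -> str: ...

class HostedAPIModel:
    """Pay-per-token vendor API; the real SDK call would live here."""
    def complete(self, prompt: str) -> str:
        return "<hosted-api completion>"  # placeholder for a vendor call

class SelfHostedModel:
    """Open-source model on owned hardware; the inference call lives here."""
    def complete(self, prompt: str) -> str:
        return "<self-hosted completion>"  # placeholder for an HTTP call

def summarize(ticket: str, model: TextModel) -> str:
    # Call sites depend on the protocol, not a vendor, so swapping the
    # proprietary service for an open-source stack is a config change.
    return model.complete(f"Summarize this ticket: {ticket}")

print(summarize("Customer cannot reset password.", SelfHostedModel()))
```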
Billing shock, part two: Sapar points out that the current AI market is overpriced, leading companies to burn through budgets over time, and compares the moment to the cloud boom of the previous decade. "Everyone started migrating to the cloud, and after five years, they saw the bills going up. That's what's happening with AI, too. Organizations are overpaying for a lot of stuff."
Ultimately, surviving the scale-up phase means enforcing a standard multi-stage pipeline. Sapar leans heavily on guidelines such as the AWS Well-Architected Framework to bring discipline to the process. Before writing a single line of production code, teams must draft how security, performance, identity management, and cost optimization will apply to their agents, especially when scaling Kubernetes clusters for AI workloads.
Pillars to production: Rather than hunting for specialized AI researchers, Sapar suggests that organizations empower their existing cloud engineers to run these standard architecture reviews. "If you have a good team, it takes about one to two months to put a really good draft of all these six pillars and how to run all these AI agents. Start one slice at a time."
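A lightweight way to make that draft enforceable is to encode the review as a gate the pipeline must clear before an agent ships. The pillar names below are the six from the AWS Well-Architected Framework; the sign-off items and gate logic are an illustrative toy, not Sapar's actual process:

```python
# The six AWS Well-Architected pillars, used here as a pre-deployment gate.
PILLARS = [
    "operational_excellence",
    "security",
    "reliability",
    "performance_efficiency",
    "cost_optimization",
    "sustainability",
]

# Illustrative sign-offs an agent must collect before production.
review = {
    "operational_excellence": True,   # runbooks and rollback plan drafted
    "security": True,                 # identity and data access reviewed
    "reliability": True,              # failure modes and retries tested
    "performance_efficiency": False,  # load test on the K8s cluster pending
    "cost_optimization": True,        # token and compute budgets set
    "sustainability": True,           # right-sized instance types chosen
}

def ready_to_ship(review: dict[str, bool]) -> bool:
    """Block deployment until every pillar has a completed review."""
    missing = [p for p in PILLARS if not review.get(p, False)]
    if missing:
        print(f"Blocked: unresolved pillars -> {', '.join(missing)}")
        return False
    return True

ready_to_ship(review)  # flags the pending performance review and blocks
```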
The home Wi-Fi rule: As an SRE, Sapar knows 100 percent uptime is a myth. He prefers his manager's analogy for deploying reliable infrastructure: "As long as it's like Wi-Fi at home and it operates, that's good enough."
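That "good enough" bar can be made concrete with error-budget arithmetic: each extra nine of availability fixes how much downtime a month can absorb. A quick illustration of the math:

```python
# Downtime allowed per 30-day month at common availability targets.
MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200 minutes

for target in (0.99, 0.999, 0.9999):
    budget = MINUTES_PER_MONTH * (1 - target)
    print(f"{target:.2%} uptime -> {budget:,.1f} min/month of downtime budget")

# 99.00% uptime -> 432.0 min/month of downtime budget
# 99.90% uptime -> 43.2 min/month of downtime budget
# 99.99% uptime -> 4.3 min/month of downtime budget
```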
Sapar leaves teams navigating the leap from experimentation to execution with a straightforward baseline. Success isn't necessarily a radical replatforming. It just requires treating AI agents like any other piece of enterprise software. "You have to account for the well-architected framework within the agentic world," he says. "That would be the main thing." The unglamorous work of securing, scaling, and stress-testing before a single agent touches production is what separates initiatives that compound from those that collapse.