The illusion of correctness: How AI's functional prowess can hide security flaws

While large language models are rapidly improving their ability to write functional code, their capacity for writing secure code has remained alarmingly stagnant. According to new research, nearly 45% of code generated by AI contains at least one major security flaw. The growing chasm between functionality and safety is creating a new class of risk, driven by a culture of "vibe coding" where speed outweighs security and code is judged only by whether it runs. The problem is that code that runs perfectly can still be a ticking time bomb, especially when the person who generated it lacks the expertise to spot the hidden dangers.

To understand the real-world implications of the trend, we spoke with Yotam Perkal, Senior Manager of Threat Research at Avalor Security, a Zscaler Company. With a career built on vulnerability research, data science at PayPal, and multiple patents in cybersecurity, Perkal has spent years on the front lines of digital security. He argued that the core of the problem lies not just in the technology itself, but in a critical and dangerous perception gap among its rapidly growing user base.

"You tell the model to write your code, the code works, and you're happy. But for the security aspects, not a lot of the users even have the technical experience to understand that the code that was generated has security issues."

For Perkal, the illusion of correctness is the central threat. The satisfaction of seeing AI-generated code run successfully masks underlying vulnerabilities that most users are ill-equipped to identify. "Even the ones that do, often it's not top of mind," Perkal noted. "It's very easy to neglect the security aspect and focus on the correctness." This blind spot is exacerbated by the very benchmarks used to measure AI progress, which overwhelmingly prioritize functional accuracy over security resilience.

It's not an entirely new problem—human developers have been writing insecure code for decades. Perkal explained that the difference is that the traditional software industry has spent years building a protective ecosystem of processes and tools. AI shatters that model by democratizing code creation without providing the corresponding safety net.

A lowered bar for risk: "When you have regular users that don't necessarily know how to code generating code, if you don't have the guardrails and processes in place to ensure that code is secure, you get insecure code. The bar is now lowered for creating software." The danger, he added, is amplified by the immaturity of the ecosystem itself. "And it's new. You still don't have all of the support system around it to make sure that the security aspects are taken into account."

With the guardrails missing, the responsibility must shift from the end-user back to the source. Perkal argued that the onus is on the creators of foundational models to build security in by default, rather than treating it as an optional add-on that users must remember to request.

Responsibility at the source: "Most users won't know to put in these security instructions, so we can't leave it up to them. As a foundational model provider, like OpenAI, Gemini, or Microsoft Copilot, it's your responsibility to ensure that even if a user doesn't supply those instructions, the model already takes security into account."

But in the current AI gold rush, the primary incentives for vendors are speed to market and business value, leaving security in the backseat. According to Perkal, this dynamic is unlikely to change without a significant external catalyst.

The catalyst for change: "It will take either regulation stepping in to force a baseline, or a hit to reputation. The first major breach or embarrassing incident, where a model creates insecure code that causes critical systems to break, will be the incentive. The damage to reputation hurts the business, and that will drive the issue."

This moment of apathy, however, may just be a predictable phase of technological immaturity. Perkal drew a powerful parallel to the early days of the internet, when the concept of digital security was itself a foreign idea.

A historical parallel: "It's like the beginning of the Internet. When Check Point started selling their firewall, they had to convince people they even needed to secure their website. The thinking was, 'It's just a website. Why do I need to protect it? It's not something physical.' We are at that phase with AI now. It's a matter of time before the entire ecosystem around it will mature."

But while the industry matures, a far greater threat is already emerging on the horizon. Perkal’s biggest concern isn't just flawed static code, but the rise of autonomous AI agents with the power to act on their own. These systems represent a monumental leap in the potential attack surface. "When you have AI agents that can access the Internet, access your machine, and take actions on your behalf," he warned, "that's the core of the problem."

The challenge of securing these agents is fundamentally different and exponentially more difficult than traditional software. Unlike a structured language with defined rules, an AI's input space is nearly infinite, especially as models become multimodal. This makes them vulnerable to creative attacks that are impossible to fully anticipate. "It's not like SQL injection where you at least have the syntax of the language that you need to comply with," Perkal explained. "The input space in an AI system is so wide that to protect it is not an easy challenge."

All articles

The illusion of correctness: How AI's functional prowess can hide security flaws

Key Points

Yotam Perkal

Yotam Perkal

All articles

Security & Governance

The illusion of correctness: How AI's functional prowess can hide security flaws

Key Points

Yotam Perkal

Yotam Perkal