AI-First Platform Engineering: 3 Signals From PlatformCon


Platform engineering and AI are both becoming standard practice in companies, and the two are tightly interconnected. In a recent report, Gartner found that 86% of organizations believe that platform engineering is essential to realizing the business value of AI. At the same time, 94% identify AI as critical to the future of platform engineering.

The rapid integration of AI into platform engineering is reshaping how developers build, manage, and deploy software. Rather than merely being “bolt-on” features, AI capabilities are becoming the foundation upon which platforms evolve. In this article, I explore three points that platform teams should consider based on learnings from an AI panel at PlatformCon NYC.

AI as an Enabler of Platform Evolution

One clear signal: companies are merging their DevOps, data/ML, and business platforms into more unified internal developer platforms, using AI as the glue. Vishaka Sadhwani, a cloud architect at Google, said that “generative AI has become foundational rather than just a feature add-on.” And this shift isn’t just architectural; it’s also operational. AI agents enable teams to experiment faster, integrate intelligence into their infrastructure, and reduce friction between previously disconnected systems.

This shift is reflected in the increasing granularity of platform services. AI systems are beginning to operate with more refined components, moving from the coarse-grained Duplo blocks of early automation to the more precise Technic Lego pieces that allow for complex assembly. This granularity is what makes it possible to bring AI directly into developer workflows. Sylvain Kalache, Head of AI Lab at Rootly, notes: “Engineering teams are consolidating the consumption layer into coding assistants,” a trend powered by protocols like MCP and ACP. Embedding AI into IDEs enables developers to work with smaller, more precise services without context switching, thereby streamlining workflows and making automation more accessible.

In this new landscape, the role of the platform is evolving into a deterministic foundation, argues Aaron Ericson, Founder of the Applied AI Lab at NVIDIA’s DGX Cloud. He believes platforms will become the much-needed “system of record” that offers reliable, trusted data. These deterministic layers serve as the grounding infrastructure for AI agents, giving them the best internal company context for their tasks and increasing the chances that their outputs are accurate.

Governance, Safety, and Human-Centric Design

As AI becomes integrated into platform engineering, governance and security become increasingly crucial. AI agents must now be treated with the same rigor as microservices, strictly adhering to security policies and operational guardrails.

Ericson joked that AI stands for “Angry Interns” in Docker containers. The analogy isn’t far from the truth, which is why engineers must clearly define the scope and boundaries of agentic systems to avoid unintended consequences. The focus should be on partnering with AI rather than simply deploying it. Treating agents as real architecture components reinforces the need for human oversight, especially as platform accessibility expands to less technical users through conversational building interfaces and low-code tools. Vibe coding isn’t going away anytime soon.
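To make "scope and boundaries" concrete, here is a minimal sketch of a guardrail check that sits between an agent and the platform. Everything in it is hypothetical and for illustration only: the `AgentAction` and `AgentPolicy` types, the verb names, and the `staging/` resource prefix are assumptions, not something the panel described.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AgentAction:
    """A single operation an agent wants to perform (hypothetical shape)."""
    agent: str
    verb: str        # e.g. "read", "restart", "delete"
    resource: str    # e.g. "staging/web", "prod/db"


@dataclass
class AgentPolicy:
    """The scope an agent is allowed to operate in (hypothetical shape)."""
    allowed_verbs: set[str]
    allowed_prefixes: tuple[str, ...]
    needs_human_approval: set[str] = field(default_factory=set)


def evaluate(action: AgentAction, policy: AgentPolicy) -> str:
    """Return "deny", "needs_approval", or "allow" for the requested action."""
    # Anything outside the declared verbs is rejected outright.
    if action.verb not in policy.allowed_verbs:
        return "deny"
    # str.startswith accepts a tuple, so this checks every allowed prefix.
    if not action.resource.startswith(policy.allowed_prefixes):
        return "deny"
    # In-scope but risky verbs are routed to a human instead of executed.
    if action.verb in policy.needs_human_approval:
        return "needs_approval"
    return "allow"


policy = AgentPolicy(
    allowed_verbs={"read", "restart"},
    allowed_prefixes=("staging/",),
    needs_human_approval={"restart"},
)
print(evaluate(AgentAction("rca-bot", "read", "staging/web"), policy))
print(evaluate(AgentAction("rca-bot", "restart", "staging/web"), policy))
print(evaluate(AgentAction("rca-bot", "delete", "prod/db"), policy))
```

The design choice worth noting is the default: an action is denied unless the policy explicitly allows it, and the "needs_approval" outcome is where a human stays in the loop.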

AI as a Force Multiplier for Reliability and Operations

AI isn’t only reshaping how platforms are built and consumed, but also how they are maintained. Agentic workflows are already proving their value in reliability and operational efficiency. Rather than replacing SREs and platform engineers, they boost their capabilities.

For example, AI-assisted root cause analysis (RCA) can reduce investigation time by 80-90% on simple production incidents, according to Kalache, drastically cutting incident response times and improving reliability. Rootly’s customers are already experiencing this with its AI SRE.

Ericson takes it a step further, sharing that some incidents can even be entirely prevented. For example, NVIDIA DGX Cloud uses time-series transformer models to predict emerging system issues before they escalate, much like how a language model predicts the next word in a sentence. Only here, it’s spotting patterns in infrastructure data, and alerting platform operators before things break.
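The general "spot the pattern, alert before it breaks" loop can be illustrated with a much simpler stand-in. The sketch below is not the time-series transformer approach Ericson described; it swaps in a plain trailing-window z-score, and the function name, window size, and threshold are assumptions chosen for illustration.

```python
from statistics import mean, stdev


def drift_alerts(samples, window=5, threshold=3.0):
    """Return indices of samples that deviate sharply from their trailing window.

    A sample is flagged when it sits more than `threshold` standard deviations
    away from the mean of the `window` samples that precede it.
    """
    alerts = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: flag any change at all.
            if samples[i] != mu:
                alerts.append(i)
            continue
        if abs(samples[i] - mu) / sigma > threshold:
            alerts.append(i)
    return alerts


# Hypothetical CPU utilization readings (percent), ending in a sudden spike.
cpu = [50, 51, 49, 50, 52, 51, 50, 90]
print(drift_alerts(cpu))
```

A baseline like this only reacts once a reading is already unusual. The appeal of a learned forecasting model, as described above, is that it can pick up on patterns that precede the spike.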

Building the Foundations of an AI-Powered Platform

AI isn’t just an add-on feature; it’s fundamentally reshaping how platform engineers work. By automating complex tasks like root cause analysis, embedding intelligence directly into infrastructure, and engineering context for AI agents, platform teams can significantly accelerate the maturity of their engineering teams.

However, as these tools become integral to operations, maintaining a rigorous approach to security, determinism, and contextual integration is a must. As Sadhwani shared, these tools provide intelligence. Decision-making, however, must stay with humans: you don’t want to be in a situation where that “Angry Intern” deletes your production database. Keeping humans in the loop isn’t just a feel-good statement, but an actual necessity for reliability.
