Agentic Workflows Book

A living book about agentic workflows, agent orchestration, and agentic scaffolding

View the Project on GitHub arivero/agentbook

13 February 2026

The Agentic Coding Landscape in February 2026: A Snapshot

by

February 2026 may be the month when agentic coding stopped being a niche and became the default way serious software gets built. Here is what the landscape looks like right now, why it matters, and where the pressure points are.

The Speed Wars: Codex-Spark and the Hardware Turn

OpenAI shipped GPT-5.3-Codex-Spark on February 12, running on Cerebras’ Wafer Scale Engine 3—a single chip with 4 trillion transistors. The numbers are striking: 15x faster than the flagship Codex model, consistently over 1,000 tokens per second, with a persistent WebSocket connection that cuts round-trip overhead by 80%. “Real-Time Steering” lets you interrupt and redirect the model mid-generation.

This is not just an incremental improvement. It signals that model deployment is now co-designed with specialised hardware, and that speed is becoming a first-class model property alongside capability. The underlying Cerebras partnership—over $10 billion, OpenAI’s first major inference deployment beyond Nvidia—suggests this is a strategic direction, not a one-off experiment.

GitHub Agent HQ: The Multi-Agent Control Plane

On February 4, GitHub launched Agent HQ, a unified surface where developers can run Copilot, Claude, and Codex side by side within GitHub’s native interface. With over 20 million Copilot users and 90% of the Fortune 100 as customers, this instantly gives all three agent providers access to the largest developer distribution channel. Google (Jules) and Cognition (Devin) integrations are expected to follow.

The architectural significance: Agent HQ turns GitHub into an agent orchestration layer, not just a code host. Developers choose the right agent for each task—Copilot for quick edits, Claude for complex multi-file reasoning, Codex for high-throughput generation—without leaving their workflow. This is the multi-agent future arriving through a familiar interface.

Apple Joins: Xcode 26.3 Goes Agentic

Apple’s entry on February 3 with Xcode 26.3 caught many by surprise. The IDE now supports coding agents from Anthropic (Claude) and OpenAI (Codex) natively. Agents can create files, examine project structure, build and run tests, take image snapshots, and access Apple’s full developer documentation.

The critical detail: Xcode 26.3 exposes its capabilities through the Model Context Protocol (MCP), making it compatible with any MCP-capable agent. This is the first major platform-vendor IDE to ship native MCP support, and it signals that the protocol is becoming the standard integration layer industry-wide.

Cursor’s RL Breakthrough

Cursor shipped Composer 1.5 on February 9, scaling reinforcement learning 20x over Composer 1 on the same pretrained model. For the first time, post-training compute surpasses pretraining compute. The practical result is noticeably better multi-file editing, planning, and tool use—without a larger base model.

Cursor also led the creation of the Agent Trace specification (RFC v0.1.0), an open format for recording AI contributions alongside human authorship in version-controlled code. With support from Cognition, Cloudflare, Vercel, and Google Jules, Agent Trace addresses a growing need: as AI-generated code becomes a larger share of commits, teams need attribution metadata for debugging, compliance, and performance improvement.

MCP: From Protocol to Infrastructure

The Model Context Protocol has crossed from single-vendor project to industry infrastructure. Anthropic donated MCP governance to the Linux Foundation’s Agentic AI Foundation. The ecosystem reports over 97 million monthly SDK downloads and more than 10,000 active MCP servers. First-class client support spans Claude, ChatGPT, Cursor, Gemini, Microsoft Copilot, and VS Code.

The launch of MCP Apps—interactive UIs rendered inside MCP clients—shows the protocol expanding beyond tool calls into richer interaction surfaces. And the Agent Skills standard, adopted by Microsoft, OpenAI, Atlassian, Figma, Cursor, and GitHub, means skills written once can be discovered across platforms.

Domain-Specific Agents Emerge

Not every coding agent needs to be general-purpose. Snowflake Cortex Code (generally available February 2026) generates SQL and Python code that understands enterprise data schemas, RBAC policies, and governance rules. It outperforms general-purpose agents within its vertical because it has access to proprietary context that external models lack.

This pattern—domain-specific agents with deep vertical context—will likely expand to other data platforms, cloud providers, and enterprise tools. General-purpose agents handle the long tail; domain-specific agents win where context density matters.

The Security Reality Check

The most sobering developments of the month are on the security side.

ClawHavoc (February 2026): 341 malicious skills deployed through the OpenClaw ecosystem, installing Atomic Stealer (AMOS) across macOS and Windows. A single user (“hightower6eu”) was responsible for 314 of them. Censys counted over 30,000 exposed OpenClaw instances. Gartner’s recommendation: block OpenClaw in enterprise environments immediately.

MCP vulnerabilities in the wild: CVE-2025-6514 (CVSS 9.6) in the mcp-remote npm package affected 437,000+ AI development environments through an OS command injection in the authentication flow. Three flaws in Anthropic’s own Git MCP server (CVE-2025-68145/68143/68144) chained into a remote-code-execution path. The first malicious MCP server found in the wild (September 2025) secretly BCC’d every email through a Postmark impersonation.

Two new OWASP frameworks crystallise the threat landscape: the OWASP MCP Top 10 for protocol-level risks, and the OWASP Top 10 for Agentic Applications (2026) for broader agent attack surfaces, introducing the principle of “least agency.”

The lesson is clear: agent security is no longer theoretical. Supply-chain attacks on skills, protocol-level vulnerabilities in MCP, and exposed self-hosted instances are all active attack vectors. Teams deploying agents in production must treat security automation as a first-class feature.

Persistent Memory Arrives

GitHub’s Copilot Memory (public preview) builds repository-scoped memory that learns coding preferences from corrections over time, with 28-day auto-expiry and full user control. Copilot Spaces (GA since September 2025) provides team-curated knowledge bases spanning multiple repositories. Together they represent a shift toward persistent project context as a first-class development artifact.

The governance questions are real: which memories are retained, who can inspect them, and when they should expire. But the productivity gains from not re-explaining project conventions on every interaction are substantial.

What This Means

Several threads connect these developments:

The multi-agent future is here. GitHub Agent HQ, Claude Agent Teams, and Copilot’s parallel CLI agents all point the same direction: developers will routinely work with multiple agents, choosing the right one for each task. Architectures need to accommodate this.

Protocols are winning. MCP for agent-to-tool, A2A for agent-to-agent, Agent Skills for packaging, Agent Trace for attribution. The stack is crystallising. Invest in standards, not bespoke integrations.

Speed and capability are decoupling. Codex-Spark shows that speed can be a separate axis from raw capability, enabled by hardware co-design. Cursor’s Composer 1.5 shows that RL scaling can improve capability without larger models.

Security is the gating factor. The ClawHavoc campaign, MCP CVEs, and OWASP frameworks demonstrate that agent security incidents are no longer hypothetical. Governance, sandboxing, and supply-chain verification are prerequisites for production deployment.

Domain-specific agents will proliferate. As Snowflake Cortex Code demonstrates, vertical context beats general capability within a domain. Expect every major platform to ship its own coding agent.

The field is moving fast. The book tracks these developments as they happen—check the chapters for detailed coverage, and the bibliography for primary sources.


This post was generated as part of the book’s February 2026 news update. For the full technical details, see the updated chapters on Language Models, Agents for Coding, Future Developments, and Failure Modes.

tags: