OpenAI turned Codex into a full desktop agent today.
The update rolling out to macOS users adds computer use (operating desktop apps in the background without interrupting your work), an in-app browser with inline commenting, 111 new plugins spanning GitLab, Atlassian Rovo, and Microsoft Suite, and a memory feature that lets the agent remember context across sessions. Multiple agents can now run in parallel on the same machine.
This is the most aggressive Codex release since the tool launched. Head of Codex Thibault Sottiaux called it building “the super app out in the open,” though he framed the current scope narrowly: developers first, everyone else later.
The timing is not subtle. OpenAI launched a $100/month ChatGPT Pro plan on April 9, price-matched exactly to Anthropic’s Claude Max. That plan includes 5x the Codex usage of the $20 Plus tier, with a launch promo bumping it to 10x through May. One week later, this feature dump arrives.
The underlying model is GPT-5.3-Codex, released February 5. OpenAI claims it’s 25% faster than its predecessor and scores 77.3% on Terminal-Bench 2.0 versus Claude’s 65.4%. On SWE-Bench Pro, the gap narrows to almost nothing: 56.8% versus Claude’s comparable range.
“This is helpful for testing and iterating on frontend changes, testing apps, or working in apps that don’t expose an API.”, OpenAI, on computer use for developers
The computer use approach differs meaningfully from Anthropic’s Claude Cowork. Codex runs background agents that don’t take over your screen. OpenAI claims a “secret sauce” that lets agents operate apps without bogging down the system. Claude Cowork, by contrast, leans into a step-by-step collaborative model where you watch the agent work.
Thread Automations might be the sleeper feature here. Codex can now schedule future work for itself and wake up to continue long-term tasks automatically. That’s the same territory as Claude Code Routines, which lets coding agents run on cron schedules without a human present. Both companies clearly believe unattended agent work is the next frontier.
The memory feature (opt-in, still preview) stores personal preferences, corrections, and context from past sessions. Enterprise and EU rollout is listed as “soon,” which in OpenAI’s vocabulary means anywhere from next week to next quarter.
Why We’re Watching
The AI coding tool war just became a desktop agent war. A year ago, these tools autocompleted lines of code. Today they operate your entire computer. OpenAI and Anthropic are converging on the same product vision (background agents, scheduled tasks, persistent memory) from opposite architectural directions. OpenAI builds a centralized super-app. Anthropic builds a CLI-native local tool. Both cost $100/month at the top tier.
The benchmark race matters less than the UX bet. Developers will pick whichever tool disappears into their workflow. Right now that’s a coin flip.
Watch the plugin ecosystem. 111 integrations on day one is OpenAI flexing its platform gravity. If third-party MCP servers and plugins consolidate around one platform, the other loses the distribution game regardless of model quality.