- Verifiable agent workflows and software surfaces
Agents that can inspect state, verify a build is actually moving, and modify the software surface itself.
Developer productivity grounded in usefulness:
All of the token counting is great but really what we should be able to get now is a meaningful metric for developer productivity. Just run a nano model over every commit and ask the model to rate the code's usefulness against your team's intended goal, with some grounding in the codebase.
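A minimal sketch of that per-commit rating idea, under stated assumptions: `Commit`, `build_prompt`, `parse_score`, and `usefulness_by_author` are hypothetical names, the 1 to 5 scale is an arbitrary choice, and `ask_model` is a placeholder for whatever small model you would actually call (no real model API is used here).

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Iterable


@dataclass
class Commit:
    sha: str
    author: str
    message: str
    diff: str


def build_prompt(commit: Commit, team_goal: str, codebase_notes: str) -> str:
    """Assemble the grounding (goal + codebase notes) and the commit into one prompt."""
    return (
        f"Team goal: {team_goal}\n"
        f"Codebase context: {codebase_notes}\n"
        f"Commit {commit.sha} by {commit.author}: {commit.message}\n"
        f"Diff:\n{commit.diff}\n"
        "Rate how useful this commit is toward the goal, from 1 (not useful) "
        "to 5 (very useful). Reply with the number only."
    )


def parse_score(reply: str) -> int:
    """Accept only a bare integer from 1 to 5; anything else is an error."""
    score = int(reply.strip())
    if not 1 <= score <= 5:
        raise ValueError(f"score out of range: {score}")
    return score


def usefulness_by_author(
    commits: Iterable[Commit],
    team_goal: str,
    codebase_notes: str,
    ask_model: Callable[[str], str],  # assumption: wraps the model call of your choice
) -> dict[str, float]:
    """Average the per-commit scores for each author."""
    scores: dict[str, list[int]] = {}
    for commit in commits:
        reply = ask_model(build_prompt(commit, team_goal, codebase_notes))
        scores.setdefault(commit.author, []).append(parse_score(reply))
    return {author: mean(values) for author, values in scores.items()}
```

Passing the model in as a plain callable keeps the sketch testable with a stub; whether averaged scores are a fair productivity metric is a separate question the code does not answer.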
Always fun when you notice Codex being clever in a way you don't expect. In a session today, it was running a slow build process and got annoyed (don't we all). Before making a change it checked that progress was actually happening and did so not by checking the logs, but by checking CPU usage.
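The check described there can be sketched roughly as follows. This is an illustration, not what Codex actually ran: `cpu_seconds_linux` and `is_making_progress` are hypothetical helpers, the reader assumes a Linux `/proc` filesystem, and the thresholds are arbitrary.

```python
import os
import time
from typing import Callable


def cpu_seconds_linux(pid: int) -> float:
    """Total user + system CPU seconds used by `pid`, read from /proc (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        stat = f.read()
    # The command name sits in parentheses and may contain spaces,
    # so split on the last ')' before splitting the remaining fields.
    fields = stat.rsplit(")", 1)[1].split()
    # After the split, index 0 is the state field; utime and stime are at 11 and 12.
    utime_ticks, stime_ticks = int(fields[11]), int(fields[12])
    return (utime_ticks + stime_ticks) / os.sysconf("SC_CLK_TCK")


def is_making_progress(
    read_cpu_seconds: Callable[[], float],
    interval: float = 2.0,
    min_cpu_seconds: float = 0.05,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Sample CPU time twice; report progress if enough CPU was consumed in between."""
    before = read_cpu_seconds()
    sleep(interval)
    after = read_cpu_seconds()
    return (after - before) >= min_cpu_seconds
```

For a build with process id `pid`, the call would look like `is_making_progress(lambda: cpu_seconds_linux(pid))`. Two caveats: this counts only the one process, so a build that delegates to child processes would need the values summed across the tree, and an I/O-bound step can be progressing while using very little CPU.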
I put Codex into DOOM using Codex app server. Codex modified the actual game files and modified the game engine to render the terminal interface natively and have Codex engage with the game.
Friday, 3 April 2026
- Consequential OpenAI utility
In health:
I had a health scare this week (I’m ok) and after the lab result came back I put it in @ChatGPTapp and it immediately gave me a (personalized) analysis that was better than the specialist I saw (much) later. I’d have been freaked out for days without it. So grateful for AI.
ChatGPT voice mode now available via CarPlay
And science:
OpenAI is starting to give scientists superpowers and accelerate real scientific discovery.
We are excited to share a new paper solving three further problems due to Erdős; in each case the solution was found by an internal model at OpenAI.
- Codex deployability
Codex can write code, fix bugs, ship PRs. It doesn't get rolled out - it spreads. We are seeing unprecedented demand for Codex, but enterprise adoption is stuck in fragmented paths. No more upfront cost! Pay only for what enterprise customers actually use.
Thursday, 2 April 2026
- Codex workflow craft and product ergonomics
One of the biggest unlocks in Codex UX seems to be workflow craft: lightweight house rules in AGENTS.md, auto-cleanups, compaction, and background execution. Great agent products feel steerable.
Plan mode creates more slop than actually iterating through the design, data model and interfaces through a conversation with the model.
House rules as behavior shaping:
one of the best things about GPT 5.4 is that it'll just clean up random parts of your codebase as you work with it, purely because it saw a rule in the AGENTS.md file
I'm having Codex build something with doom in the background while I'm doing chores and just jumped when I suddenly heard shots and grunting. Turned out Codex was testing the game
You can truly build anything
Wednesday, 1 April 2026
- Subagent swarms
Orchestrated subagents and swarm-style workflow in Codex:
…fire 4 sub agents, and use main as an orchestrator in order to hunt for the precise root cause and hopefully resolution solution(s) for a given bug
Codex can run subagent workflows by spawning specialized agents in parallel and then collecting their results in one response. This can be particularly helpful for complex tasks that are highly parallel, such as codebase exploration or implementing a multi-step feature plan.
…create your own custom subagents ~/.codex/agents/bonsai.toml
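The fan-out-and-collect pattern described above can be sketched with plain `asyncio`. This is a generic illustration, not Codex's implementation or API: `Subagent`, `orchestrate`, and `make_stub` are hypothetical names, and the stub agents only echo the task.

```python
import asyncio
from typing import Awaitable, Callable

# Assumption: a subagent is any async function that takes a task and returns findings.
Subagent = Callable[[str], Awaitable[str]]


async def orchestrate(task: str, subagents: dict[str, Subagent]) -> str:
    """Run every subagent on the same task in parallel and merge results into one reply."""
    names = list(subagents)
    results = await asyncio.gather(
        *(subagents[name](task) for name in names),
        return_exceptions=True,  # one failing subagent should not sink the others
    )
    lines = []
    for name, result in zip(names, results):
        if isinstance(result, Exception):
            lines.append(f"[{name}] failed: {result}")
        else:
            lines.append(f"[{name}] {result}")
    return "\n".join(lines)


def make_stub(role: str) -> Subagent:
    """Build a stand-in subagent; a real one would call a model or a tool."""
    async def run(task: str) -> str:
        await asyncio.sleep(0)  # yield control, as a real I/O-bound call would
        return f"{role} findings for: {task}"
    return run
```

A run such as `asyncio.run(orchestrate("flaky login test", {"logs": make_stub("logs"), "git-history": make_stub("git-history")}))` returns one combined string with a line per subagent, which is the "collect their results in one response" step.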
The zeitgeist:
subagents maxxing
- Voice workflows
Appointments and intakes with gpt-realtime-1.5:
We built a clinic concierge demo for a Singapore health clinic with gpt-realtime-1.5. It speaks naturally with patients, collects the right details, and books appointments in real time.
Voice-to-frontend with Codex:
Using voice transcription with Codex Spark in the pop out window is wild for rapid front-end development! You don't even have time to use steering
- Codex as Chief of Staff
For knowledge work:
Plugins enable Codex to do most of my knowledge work. It reads and responds to emails and Slack. It builds models in Sheets and handles all of my note taking for me. Codex is quickly becoming my assistant. It understands me and what I’m doing enough to help get real work done.
For dinner planning:
Our incredible comms leader, @lindsmccallum, planned a closed door dinner for the Codex team in a fraction of the time it would normally take - thanks to Codex.
She used the Codex App to:
- compile the invite list
- send out invitations
- hourly scan of her emails to update RSVP status
- populate a doc with bios on every attendee
- create a mini app to plan the seating chart
Getting Codex's help to climb out from under my inbox mountain
I've been using plug-ins a ton internally. I have about 58 automations and 30 plug-ins and I've automated everything except the part where I have to come up with ideas and talk to people
Tuesday, 31 March 2026
- The Infinity Machine
From Sebastian Mallaby:
Even by the standard of a tech industry stacked with so-called geniuses, Demis Hassabis is a special case. Born poor in North London to immigrant parents, a chess prodigy by age five and wizard coder in his teens, he turned down a seven figure offer before turning 18 to feed his insatiable scientific curiosity at Cambridge. Later, he added a neuroscience PhD to his computer science skills to pursue the dream of artificial general intelligence, the ultimate goal being to unravel the mysteries of biology and theoretical physics and to usher in super-abundance. Alongside a small group of fellow travelers, that is the path he is still on, leading the AI research at Google, winning a Nobel Prize along the way, and imagining machines that will compound, or possibly supplant, the human understanding of the universe.
- AI compute in orbit
Space as a new compute layer. From Starcloud:
The round comes after the successful deployment of our first satellite, Starcloud-1, a few months ago, which had the first @NVIDIA H100 on board and was the first to train an LLM in space. The funds will be used to develop our third satellite, which aims to be cost-competitive with Earth-based data centers in terms of AI inference cost.
- Alignment priors wash out under RL
Reading moral books doesn’t matter much if the later training regime teaches different habits.
From Tomek Korbak and team at OpenAI:
Can midtraining on docs about aligned AI bake in alignment priors for agents? We report an experiment where those priors are quickly washed away by RL and fail to generalize to agentic settings. But that cuts both ways: priors that AIs are misaligned fade too!