Radar
Where I'm finding leverage.
πPinned
Proactive SRE with full codebase context
Agents on cron review every block of telemetry, not just the spikes humans noticed. Loaded with the full codebase, they correlate errors and latency outliers to specific code paths and propose fixes. The "unknown unknowns" tier of SRE is finally cheap enough to staff.
πPinned
Suspended persistent microVMs
Firecracker-class VMs you can pause in milliseconds and resume just as fast, full state intact. Cheaper than always-on, more durable than serverless cold-starts. The unit of agent compute is shifting from request to session.
πPinned
Terminal computer use
Gemini 3 topped terminal bench with a simple harness, not an agent framework. Container runtime + read/edit/bash + agent SDK as inner loop. More durable than MCP, more practical than GUI computer use.
πPinned
Trendlines held
3-4 more OOMs of training compute on the horizon with Vera Rubin and Stargate. Scaling trendlines held steady through Gemini 3. UPDATE: METR's latest long-horizon agent results only reinforce the trajectory.
π₯Clutch
Progressive docs
Same shape as Claude skills: index up top, drill-down references, load context only when needed. Beats a 50-page wiki dumped into the window. Docs structured for agents are also clearer for humans.
π₯Clutch
Agents make OTel a dream
Instrumenting everything used to be a tax β now agents emit the spans, draft the dashboards, and write the APL queries. Observability finally pays for itself when reading and writing telemetry is free.
π₯Clutch
Generator-verifier gap
LLMs are better at verifying than generating. Exploit this with external verification loops (retry on failure) or internal ones (code execution + self-correction). Either way, verify rigorously.
π₯Clutch
Resetting context window
Don't use more than 50% of your context window. Use subagents for focused work. Start over when things go off track - don't try to steer back. Do research, store it, then leverage in fresh context.
π₯Clutch
Everyday max effort
Default to deep-thinking modes for everything, not just competition math. Mini research partners that work through problems methodically. Now available via API across frontier providers.
πWatch
Claude agent skills
Progressive AGENTS.md with supporting resources. Easier to build than MCP, more flexible structure, avoids context bloat from heavy tool servers.
