April 20, 2026 · super-user

Super User Daily: April 19, 2026

The biggest signal today wasn't a new product. It was the ecosystem moving from playing with agents to running them as infrastructure. People are reverting Opus 4.7 to 4.6 because the new tokenizer eats 40% more tokens, a 66-year-old dentist is urging practitioners to write wills covering their Mac mini OpenClaw setups, and a blogger who lost 665 monetized posts dashed to Akihabara, dropped ¥337,800 on a new Mac, and burned 80% of his ¥30,000 Claude Code Max plan in a single day to recover them. The pattern: agents have moved past curiosity into the spend column of real businesses, and the operational lessons are sharpening fast.
@dannylivshits [OpenClaw]
OpenClaw#1
https://x.com/dannylivshits/status/2045921779605237978
A weekend project that's actually a stress test of agentic operations. Install OpenClaw on Ubuntu with local models, harden security, set up Codex as QA assistant, layer in an AutoResearch self-improvement loop, monitoring dashboards, and a Karpathy-style wiki. He had Claude write the 13-step install manual, then told it to execute the manual end-to-end. This is the new template for testing agents — give them documentation and let them stand up their own stack overnight.
@Barret_China [Claude Code]
Claude Code#2
https://x.com/Barret_China/status/2045787288618299542
The clearest field report on why Claude Code stops mid-job. He had it backfill ~1,000 unit tests, it died at 200. Walks through the full mechanism: 80k tokens triggers compact, history becomes summary, model forgets its own work, eventually exits without a tool call. Solution is the master/sub-agent pattern with progress.json on disk and explicit bounded prompts per child. Claude Code already has /coordinator mode for this. If you're trying to run long jobs in one session, this thread saves you a week of debugging.
@AYi_AInotes [OpenClaw]
OpenClaw#3
https://x.com/AYi_AInotes/status/2045825582600958461
Garry Tan got fed up with OpenClaw sub-agents timing out and losing all progress, so he built Minions — a Postgres-backed task queue inside GBrain. His production benchmarks: 30-day social ingest task went from 0% success at 10s timeout to 100% in 753ms, memory from 80MB to 2MB, token cost to zero. 19 cron jobs running in parallel, no failures. Restart-tolerant. The deeper point: the multi-agent bottleneck was never the model, it was the same backend plumbing engineers have been solving for 30 years — queues, state, retries, persistence.
@smaxor [Claude Code]
Claude Code#4
https://x.com/smaxor/status/2045971525409661307
Mass-reverted every Claude Code project from Opus 4.7 back to Opus 4.6. Tasks that took 10-15 minutes on 4.6 were taking 60-90 on 4.7. Same prompts, same project. More hallucinated paths, more circular refactoring, more unsolicited "let me restructure this." After revert, first task back to 12 minutes. He notes if you want the 1M context window you must use the model string `claude-opus-4-6-1m`, not the default. Real benchmark, real production code, not a vibe check.
@cyrilXBT [Claude Code]
Claude Code#5
https://x.com/cyrilXBT/status/2045791572764283272
Got laid off, built career-ops on Claude Code, evaluated 740+ job listings with it, and landed a Head of Applied AI role. The system grades each role A-F, generates ATS-optimized PDFs, runs salary research, prepares interview material, and tracks everything. 14 skill modes. 45+ company portals pre-loaded. Refuses to recommend applying below 4.0/5 — it's a filter, not a spray-and-pray tool. Open-sourced, MIT, 8.2k stars. The job he got wasn't from an application — a CEO saw the system and hired him. The agent was the portfolio.
@09pauai [Claude Code]
Claude Code#6
https://x.com/09pauai/status/2045822734756987339
Lost 665 monetized blog posts (¥10M+ lifetime revenue) when Claude Code wiped them. Sprinted to Akihabara, bought a ¥337,800 Mac on the spot, spent 80% of his ¥30,000 Claude Code Max plan in a single day rebuilding everything. Site now back making ¥5,384/day. The recovery story is the data point — when your business is built on agent throughput, your disaster recovery plan is also an agent throughput problem.
@goyalshaliniuk [Claude Code]
Claude Code#7
https://x.com/goyalshaliniuk/status/2045827250373705761
12 agents tackling 3,528 TypeScript errors in parallel burned through 5 hours of Opus credits in 32 minutes on the Claude Code Max 20x plan. Concrete data point on what fleet operation actually costs and how fast you exhaust quotas at scale. The math everyone obsessed with parallel sub-agents needs to see.
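The arithmetic behind that burn rate is worth making explicit. The numbers below are the ones from the thread; the per-agent split is a simple average, not measured data.

```python
# Back-of-envelope burn-rate math for the fleet run described above.
quota_minutes = 5 * 60   # Opus credit budget consumed (5 hours)
wall_minutes = 32        # wall-clock time it took
agents = 12

burn_multiplier = quota_minutes / wall_minutes  # quota minutes per wall minute
per_agent = burn_multiplier / agents            # average draw per agent

print(f"{burn_multiplier:.1f}x burn rate")  # ~9.4 quota-minutes per wall minute
print(f"{per_agent:.2f}x per agent")        # each agent averaging below 1x
```

At roughly 9.4x, a quota sized for a day of single-agent use disappears in about an hour of fleet operation.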
@MGMurray1 [Claude Code]
Claude Code#8
https://x.com/MGMurray1/status/2045837567539413342
62 days of running agent operations as eval-driven autoresearch. 37 daily tasks. 105+ deliverables. Every recurring task has an ideal trajectory. Every failure becomes a regression eval. The system proposes improvements, tests against historical outputs, promotes winners. Git history is the research log. Agents running eval loops produce measurably better output by week 4 than week 1 — not because the model improved, because the specs improved. This is what "production agent ops" looks like, not a thread of one-shot prompts.
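A promote-if-better loop like the one described can be sketched in a few lines. Everything below is illustrative: the phrase-matching scorer is a stand-in for real evals, and the names are invented, not the author's system.

```python
# Hypothetical sketch: a proposed spec revision is adopted only if it never
# regresses on historical evals and strictly improves somewhere.

def run_eval(spec: str, case: dict) -> float:
    """Score one historical case under a spec. Stand-in scorer: reward specs
    that mention the behavior a past failure exposed."""
    return 1.0 if case["required_phrase"] in spec else 0.0

def promote_if_better(current: str, proposed: str,
                      regression_evals: list[dict]) -> str:
    current_scores = [run_eval(current, c) for c in regression_evals]
    proposed_scores = [run_eval(proposed, c) for c in regression_evals]
    # Promote only on zero regressions plus a strict improvement somewhere.
    if all(p >= c for p, c in zip(proposed_scores, current_scores)) \
            and sum(proposed_scores) > sum(current_scores):
        return proposed
    return current

# Each past failure became a regression eval; the spec accretes fixes.
evals = [{"required_phrase": "cite sources"},
         {"required_phrase": "check dates"}]
spec = "Summarize the article."
candidate = "Summarize the article. Always cite sources and check dates."
spec = promote_if_better(spec, candidate, evals)
```

This is why week-4 output beats week-1 output with the same model: the eval set only grows, and the spec can only move forward against it.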
@nummanali [Claude Code]
Claude Code#9
https://x.com/nummanali/status/2045963036322886141
Working setup for Codex and Claude Code talking to each other through cmux. One prompt to bootstrap: ask the agent to identify both surface IDs and write a message protocol with XML tags into AGENTS.md. Now Codex can ask Claude for a code review and vice versa, with you orchestrating. Cmux extends to controlling tmux panes, browsers, sending keys, reading screens, live-reload markdown viewers, structured per-workspace logs, and progress bars for long agent runs.
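The message-protocol idea can be illustrated with a small sketch. The tag and attribute names below are invented for illustration; the real protocol is whatever the agent writes into AGENTS.md.

```python
import xml.etree.ElementTree as ET

# Illustrative shape of a cross-agent message: structured enough that either
# agent can parse it deterministically instead of guessing from prose.

def make_message(sender: str, recipient: str, intent: str, body: str) -> str:
    msg = ET.Element("agent_msg", attrib={"from": sender, "to": recipient})
    ET.SubElement(msg, "intent").text = intent
    ET.SubElement(msg, "body").text = body
    return ET.tostring(msg, encoding="unicode")

def parse_message(raw: str) -> dict:
    msg = ET.fromstring(raw)
    return {
        "from": msg.get("from"),
        "to": msg.get("to"),
        "intent": msg.findtext("intent"),
        "body": msg.findtext("body"),
    }

raw = make_message("codex", "claude-code", "code_review",
                   "Please review the diff in the current workspace.")
parsed = parse_message(raw)
```

The XML tags matter less than the contract: both sides agree on the envelope, so "Codex asks Claude for a review" becomes a routing problem instead of a prompting problem.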
@0xViviennn [Claude Code]
Claude Code#10
https://x.com/0xViviennn/status/2045776994131234981
Hermes ships delegate_task for spawning sub-agents but no peer-to-peer agent comms. Built hermes-a2a using Google's A2A protocol so messages enter the live agent session instead of spawning a new process. Now her Hermes agents review code with Claude Code over A2A, and her Hermes talks daily to a friend's OpenClaw agent about philosophy. Privacy isolation built in so private memory doesn't leak into A2A messages. Single command install.
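The privacy-isolation idea reduces to filtering memory before anything leaves the session. A minimal sketch, assuming a tagged-memory shape that hermes-a2a may not actually use:

```python
# Hypothetical memory entries: anything flagged private must never be
# referenced when composing an outbound A2A message.

def outbound_view(memory: list[dict]) -> list[dict]:
    """Return only the memory entries safe to surface to a peer agent."""
    return [m for m in memory if not m.get("private", False)]

memory = [
    {"text": "User prefers terse replies"},
    {"text": "User's home address is ...", "private": True},
]
safe = outbound_view(memory)  # only the non-private entry survives
```

The design point is that isolation happens at the boundary, not by trusting the model to self-censor mid-conversation.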
@Prince_Canuma [Claude Code]
Claude Code#11
https://x.com/Prince_Canuma/status/2045781748571681231
Left a Claude Code session running on his M3 Ultra at home, traveled ~300km away, and the M3 Ultra auto-updated, restarted, and killed both his session and Tailscale. He SSH'd into a separate Linux box, asked Claude Code there to scan the network, SSH into the M3 Ultra, and restart Tailscale. Worked. Session resumed. The "agent calling agent across machines to fix infrastructure" pattern is now a one-shot.
@jbarbier [Claude Code]
Claude Code#12
https://x.com/jbarbier/status/2045748791505305798
4-day project: 1 contributor, 137 commits, 14,664 net LOC/day, ~$27 prorated on Claude Max. Estimated pre-AI cost: $1,022,000 and 70 months solo. The /cost-estimate skill he built lets you run this on any of your projects to see your own multiplier. The data point is the comp — what software cost in 2024 vs what it costs you to ship the same thing today.
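The comp reduces to two divisions, using the thread's own numbers (the month-to-day conversion is an approximation):

```python
# Reproducing the thread's comparison as simple arithmetic.
pre_ai_cost = 1_022_000   # estimated pre-AI build cost, USD
pre_ai_months = 70        # estimated solo duration
actual_cost = 27          # prorated Claude Max spend, USD
actual_days = 4

cost_multiplier = pre_ai_cost / actual_cost
time_multiplier = (pre_ai_months * 30) / actual_days  # ~30 days/month

print(f"~{cost_multiplier:,.0f}x cheaper, ~{time_multiplier:,.0f}x faster")
```

Whatever you think of the $1M estimate, the shape of the ratio is the data point.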
@theabhimanyu [OpenClaw]
OpenClaw#13
https://x.com/theabhimanyu/status/2045931822069023119
The honest critique nobody else is making: OpenClaw has no awareness of what's happening on your computer. Changes you make in the terminal don't sync, and what happens outside the agent's session stays outside. Pointed criticism from someone who actually uses it, about a design choice that limits how far you can take the tool.
@lucas_flatwhite [Claude Code]
Claude Code#14
https://x.com/lucas_flatwhite/status/2045884392699199975
Korean breakdown of Garry Tan's "Thin Harness, Fat Skills" essay with concrete worked examples. The 100x productivity gap isn't model intelligence — it's the structure wrapping the model. Five concepts: skill files (markdown method calls), thin harness (just loop and context), resolver (routing table that loads EVALS.md when you touch a prompt), latent vs deterministic spaces (don't push deterministic problems into latent space), and diarization (structured profile compression that SQL/RAG cannot replace). Karpathy-grade synthesis applied to operations.
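The resolver concept is concrete enough to sketch. EVALS.md comes from the essay's own example; the matching rules below are invented for illustration.

```python
# A deterministic routing table: which context files get loaded depends on
# what the task touches, so the harness stays thin and knowledge lives in
# skill files rather than in the loop itself.
RESOLVER = [
    ("prompts/", ["EVALS.md"]),        # touching a prompt pulls in its evals
    ("src/",     ["ARCHITECTURE.md"]), # touching code pulls in architecture
    ("",         ["README.md"]),       # fallback for everything else
]

def resolve(path: str) -> list[str]:
    for prefix, context_files in RESOLVER:
        if path.startswith(prefix):
            return context_files
    return []

ctx = resolve("prompts/summarize.md")  # -> the eval file, deterministically
```

This is the "latent vs deterministic" split in miniature: routing is a lookup table, not a judgment call, so it never gets pushed into latent space.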
@PedroChermont [Claude Code]
Claude Code#15
https://x.com/PedroChermont/status/2045876513581519060
Two months ago his Leblon firm used only Claude chat. Today 12 people running Claude Code daily. Opus 4.7 just landed, +10 points on SWE-bench in 3 months. "In 30 years of market, I've never seen a tool evolve at this cadence." Single-firm cross-section of how fast adoption flips when the team sees one person ship something they couldn't.
@machidento [OpenClaw]
OpenClaw#16
https://x.com/machidento/status/2045773772939206963
A 66-year-old dentist writing publicly that practitioners with Mac minis running OpenClaw should write wills covering their setups. Specifically for those with kids in middle/high school. The signal here isn't the demographic — it's that solo practitioners building agent-driven ops now consider their setup a transferable estate asset. The Open agent stack has crossed the line from hobby into business continuity planning.
🗣 User Voice

Opus 4.7 verbosity and cost are pushing power users back to 4.6. Three independent reports today: more verbose responses, faster quota burn, sometimes worse output. Token cost going up while behavior gets less predictable. Sources: smaxor, marcioportes, Sattyamjjain.

The token-budget problem is taking over the day. Hitting limits in 32 minutes with 12 agents (goyalshaliniuk), Mintlify-style fine-tuning suggestions, and serious calls for a "Datadog for LLM cost." Teams are starting to budget agent operations as compute spend, not subscriptions.
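Budgeting agents as compute spend can start as something this small: a shared budget object that refuses further dispatch once exhausted. The class, rate, and numbers below are assumptions for illustration, not any provider's actual pricing.

```python
# Hypothetical shared budget: every agent charges its token draw against one
# daily pool, and the dispatcher stops when the pool runs dry.
DAILY_BUDGET_USD = 50.0
USD_PER_1K_TOKENS = 0.015  # assumed blended rate, not a real price

class TokenBudget:
    def __init__(self, budget_usd: float):
        self.budget_usd = budget_usd
        self.spent_usd = 0.0

    def charge(self, agent: str, tokens: int) -> bool:
        """Record usage for one agent call; False once the budget is spent."""
        cost = tokens / 1000 * USD_PER_1K_TOKENS
        if self.spent_usd + cost > self.budget_usd:
            return False
        self.spent_usd += cost
        return True

budget = TokenBudget(DAILY_BUDGET_USD)
dispatched = 0
# Round-robin 12 agents, each call drawing 200k tokens, until the pool is dry.
while budget.charge(f"agent-{dispatched % 12}", tokens=200_000):
    dispatched += 1
```

Even this toy version makes the quota-exhaustion math visible before the provider's rate limiter does.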

Multi-agent peer communication is the missing primitive. Hermes only does parent-child delegate_task. Custom A2A wrappers (hermes-a2a, cmux) are emerging because users want their agents to discuss work across frameworks. Sources: 0xViviennn, nummanali.

The OpenClaw plateau is real. Multiple users (theabhimanyu, galileowilson, juliarturc) say it lacks computer-state awareness, hits ban issues, and overcomplicates simple tasks. Hermes is taking mindshare with persistent memory and self-improving skills. Garry Tan is patching the queue layer with Minions.

Production-grade agent ops is becoming its own discipline. Eval frameworks, regression detection, sub-agent orchestration with progress files, fleet management. The people winning aren't writing better prompts — they're building better operating environments.
📡 Eco Products Radar

Claude Code: still the dominant terminal-first agent. Opus 4.6 with 1M context regaining favor over 4.7 for reliability and cost.

OpenClaw: heaviest day yet of mixed signals — viral install events in Bengaluru, 100K-star competitor pressure from Hermes, Garry Tan shipping Minions queue layer, and growing chorus of "underdelivers for non-technical users."

Hermes Agent: 100K GitHub stars in 53 days, persistent memory and self-improving skills. Now supported in Ollama. New A2A protocol layer for peer comms. Positioned as the first viable OpenClaw replacement.

Codex: paired with Claude Code increasingly via cmux for cross-review workflows. Strong stable performer for users running into Claude rate limits.

Cowork: Claude's middle child — desktop app sitting between chat and Code. Users keep asking why they'd use it over Claude Code directly.

Claude Design: research preview. Ships with one-click handoff to Claude Code. Burns tokens fast, has a separate quota. DTC and design teams trying it; designers split on whether it's a Figma threat or a useful prototype tool.

GBrain / Minions: Garry Tan's stack. Postgres-native task queue for OpenClaw sub-agents. Solves the timeout/state-loss problem.

Hyperframes: HeyGen's HTML-to-video skill. Run `npx skills add heygen-com/hyperframes` inside Claude Code and get a rendered video from a URL or HTML page.

claude-mem: persistent memory across Claude Code sessions, 62.5k stars. Auto-runs in background, hooks-based, stores compressed summaries.

cmux: terminal multiplexer for AI agents. Drives panes, browsers, surfaces, notifications. Becoming the orchestration layer for multi-agent setups.