Tokens & Signals · Monday, April 13, 2026

Anthropic’s Mythos: The New Agentic Ceiling

mythosspudgpt-5.5claude-codeminimax-m2.7gpt-5.4-xhighgpt-5.2-codexclaude-opus-4-6-thinking-autogemini-3.1-proclaude-sonnet-4-6-thinking-autogpt-5.4anthropicopenaiminimaxperplexitymetakaustswe-bench-procoding-agentsopen-sourceself-evolutionmodel-benchmarkingagentic-workflowsai-safetygui-cursor-controlsam-altmankarpathyscaling01petergyanggarrytan
Tokens & Signals for 4/13/2026. We scanned ~1,200 Twitter accounts (1399 tweets), 13 subreddits (70 posts), Hacker News (21 stories), 8 newsletter posts, 9 podcast episodes, 421 Discord messages, and leaderboard data for you. Estimated reading time saved: ~17 hours.

TLDR

* Anthropic's 'Mythos' Preview is tearing through benchmarks (77.8% on SWE-bench Pro), but it's locked behind 'Project Glasswing' security gates — regular devs can't touch it. x.com/chatgpt21/status/2043396662216061352

* OpenAI is rushing 'Spud' (GPT-5.5) to production. Internal memos describe it as a "foundation for the next generation of work" — basically their play to catch up with Anthropic's agentic lead. x.com/ns123abc/status/2043682835026923725

* Claude Code users are not happy: a silent cache TTL change (1 hour → 5 minutes) is hiking costs by 32% and eating through usage quotas. No announcement, no warning. x.com/petergyang/status/2043466663258140685

* MiniMax M2.7 is the new open-source heavyweight — a 230B MoE model that used autonomous "self-evolution" loops to literally write its own training code. huggingface.co/MiniMaxAI/MiniMax-M2.7

* Sam Altman's home was hit by two separate attacks (a Molotov cocktail and a shooting) within 48 hours. Two suspects are in custody. x.com/AndrewCurran_/status/2043461046745370631

* Perplexity rebranded its direction as "Perplexity Computer," an agent platform juggling 19 models, and watched ARR jump 50% in a single month to $500M. x.com/testingcatalog/status/2043746464656933275

* @karpathy on where things are headed: "We're watching the transition from 'apps you interact with' to 'agents that work for you in the background.' The engineering focus moved from prompts to systems." x.com/garrytan/status/2043566215927328955

* @scaling01 on the "AI Slop" debate: "The golden age of raw, unrestricted performance is hitting a wall as providers optimize compute for the masses." x.com/scaling01/status/2043338102983315839


Go deeper on what matters to you

Tap to expand

Best to Build With Today

* Codinggpt-5.4-xhigh for agentic workflows; gpt-5.2-codex for pure coding.

* Reasoningclaude-opus-4-6-thinking-auto is the current SOTA.

* Chatgemini-3.1-pro dominates the Arena rankings for conversational flair.

* Open-sourceMiniMax M2.7 is currently the best performer for agentic, locally-run software tasks.

* Value pickclaude-sonnet-4-6-thinking-auto gives you 90% of the top-tier power for half the cost.


Deeper Dives

🧠 Models & Research

Anthropic's Mythos Preview Benchmarks

Mythos is the most capable model we've seen right now — 77.8% on SWE-bench Pro versus GPT-5.4's 57.7%. The catch: it's gated behind "Project Glasswing" partners specifically because of its zero-day exploit capabilities. Anthropic isn't taking chances.

� Twitter� Reddit

MiniMax Releases M2.7 Agentic Model

MiniMax's new 230B MoE model hits 56.22% on SWE-Pro, which is impressive on its own. But the really interesting part is how it got there — a "self-evolution" training loop where it autonomously rewrote its own training code as it learned.

� Twitter� Discord

Meta's 'Neural Computer' Paradigm

Meta and KAUST researchers are done messing around with brittle OS wrappers. Their new approach bakes the entire OS environment directly into the model's latent states — and early testing shows 98.7% GUI cursor control accuracy.

� Twitter� Hacker News

💼 Industry & Business

OpenAI's 'Spud' Model Development

Leaked memos show OpenAI is moving fast to ship 'Spud' (expected to land as GPT-5.5). What's different this time: it's being built from the ground up as an autonomous agent foundation for enterprise work, not just a smarter chatbot.

� Twitter

Perplexity's Agentic Pivot

Perplexity has made the jump from search tool to full-blown multi-model agent platform. The "Perplexity Computer" pivot seems to be working — ARR hit $500M after a single month of that growth run.

� Twitter

Sam Altman's Security Escalation

Two physical attacks on his home in 48 hours. The industry is now having a very real conversation about the personal safety of AI leadership.

� Twitter� Reddit

🚀 Products & Launches

Claude Code Cache Regression

Anthropic quietly cut cache TTLs from 1 hour down to 5 minutes. Developers are calling it a "silent tax" — the model now re-processes tokens far more frequently, and budgets are taking a hit. No changelog, no heads-up.

� Reddit� Hacker News


Launches

* MiniMax M2.7 — Open-source MoE model with an autonomous self-evolution training loop baked in.

* Perplexity Computer — Enterprise agent platform that orchestrates 19+ models to get workflows done end-to-end.


Closing thought: The industry has moved on from chatbot hype. We're in the era of the background worker now — where the measure of a model isn't how clever its answers sound, but how reliably it can do your job while you're off doing something else.