Tokens & Signals for 3/16/2026. We scanned ~605 Twitter accounts, 13 subreddits (0 posts), Hacker News (15 stories), 10 newsletters, 10 podcasts, and leaderboard data for you. Estimated reading time saved: ~41 hours.
TLDR
* Moonshot AI's new "Attention Residuals" architecture cuts training compute by 20% by letting transformer layers attend to previous representations. x.com/teortaxesTex/status/2033387106337267868
* Perplexity's "Computer" agent landed on Android with deep Samsung Galaxy S26 integration, juggling ~20 different models to handle multi-step tasks. x.com/AravSrinivas/status/2033603347534713300
* Recursive Language Models (RLMs) using GPT-5-mini outperform vanilla GPT-5 by 49% on long-context benchmarks — just by thinking in smaller, repeated steps. x.com/dylan522p/status/2033568059643076873
* Mistral AI is teaming up with NVIDIA to co-develop frontier open-source models, combining their architecture chops with NVIDIA's compute muscle. x.com/MistralAI/status/2033642111984177247
* Ollama is now an official provider for OpenClaw, so you can run local, tool-capable agents without wrestling with custom adapters. x.com/ollama/status/2033339501872116169
* A fake Japanese metal band called "Neon Oni" racked up 80,000 monthly Spotify listeners using Suno AI before someone blew the cover. x.com/TheRundownAI/status/2033568236227244451
* Anthropic's Claude Code hit a $1B annualized revenue run rate within six months of launch. x.com/gdb/status/2033605419726483963
Best to Build With Today
* Coding — gpt-5.4-xhigh leads agentic coding tasks. Use this for complex multi-file edits.
* Reasoning — claude-opus-4-6-thinking-auto is the current leader for complex logic and math.
* Chat — gemini-3.1-pro-preview is currently the best general-purpose assistant.
* Open-source — Ollama + OpenClaw is the new standard for private, local agentic workflows.
Deeper Dives
🧠 Models & Research
Moonshot's 'Attention Residuals' (AttnRes)
Moonshot AI swapped out fixed residual connections in transformers for a learned depth-wise attention mechanism. By letting each layer selectively attend to the outputs of all previous layers, they squeezed out 1.25x better compute efficiency, which works out to the 20% training-compute cut in the headline (1/0.8 = 1.25).
* Why it matters: This might be the most meaningful change to transformer architecture in years — large-scale training could get a lot cheaper.
Source: Twitter
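The depth-wise attention idea can be sketched in a toy numpy loop: instead of the fixed `x = x + layer(x)` residual, each layer scores all earlier representations against a learned query and mixes them into the skip path. This is only an illustration of the concept, not Moonshot's code; the mean-pooled "keys" and per-layer query vectors are assumptions for the sketch.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
d, n_layers, seq = 16, 4, 8

# Toy "layer": a random linear map standing in for attention + MLP.
Ws = [rng.normal(scale=0.1, size=(d, d)) for _ in range(n_layers)]
# Hypothetical learned per-layer query vectors for depth-wise attention.
q = rng.normal(size=(n_layers, d))

x = rng.normal(size=(seq, d))
history = [x]  # representations from all earlier depths

for i, W in enumerate(Ws):
    h = np.tanh(history[-1] @ W)  # this layer's transform
    # Depth-wise attention: score each earlier representation against
    # this layer's query, then mix them instead of a fixed x + h residual.
    keys = np.stack([r.mean(axis=0) for r in history])   # (depth, d) summaries
    w = softmax(keys @ q[i])                             # weights over depths
    residual = sum(wi * r for wi, r in zip(w, history))  # learned skip path
    x = h + residual
    history.append(x)

print(x.shape)  # (8, 16)
```

The efficiency claim presumably comes from later layers being able to skip straight back to early representations instead of re-deriving them through every intermediate block.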
Recursive Language Models (RLMs)
RLMs treat context like a REPL, recursively querying the model to chew through large data snippets in chunks. This inference-time trick lets GPT-5-mini outperform vanilla GPT-5 on long-context tasks — no massive context window required.
* Why it matters: Bigger isn't always better. Sometimes you just need to let the model think in smaller, repeated steps.
Source: Twitter
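The recursive-chunking trick is easy to picture as a divide-and-summarize loop: if the context fits, answer directly; if not, split it, summarize each chunk with the small model, and recurse on the summaries. This is a minimal sketch of the idea, not the authors' implementation; `call_model` is a hypothetical stub standing in for a real API call.

```python
def call_model(prompt: str) -> str:
    # Stub: pretend the model "summarizes" by keeping the first 40 chars.
    return prompt[:40]

def recursive_answer(question: str, context: str, chunk_size: int = 200) -> str:
    # Base case: the context fits in the window, so answer directly.
    if len(context) <= chunk_size:
        return call_model(f"{question}\n---\n{context}")
    # Recursive case: split, summarize each chunk, recurse on the summaries.
    chunks = [context[i:i + chunk_size] for i in range(0, len(context), chunk_size)]
    summaries = [call_model(f"Summarize for: {question}\n---\n{c}") for c in chunks]
    return recursive_answer(question, "\n".join(summaries), chunk_size)

long_doc = "fact: the launch date is 2026-03-16. " * 50
print(recursive_answer("When is the launch?", long_doc))
```

Each recursion shrinks the context (chunks compress to fixed-size summaries), so a small window is enough to traverse an arbitrarily long document.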
Extracting Agent Skills from GitHub
A new framework from DAIR AI automatically spots procedural knowledge buried in repo structures, turning raw code into agent-executable skills.
* Why it matters: Agents can now bootstrap their own abilities by studying how humans organize code, instead of needing everything hand-wired upfront.
Source: Twitter
🚀 Products & Launches
Perplexity Computer on Android
Perplexity brought its agentic orchestration engine to Android. It taps ~20 models to handle tasks, bundles a local "Comet" browser for deep web research, and gives Samsung Galaxy S26 users extra OS-level hooks to play with.
* Why it matters: Agentic automation is jumping from the desktop to your pocket — and it's starting to touch the actual OS.
Source: Twitter
Ollama for OpenClaw
Ollama is now an official OpenClaw provider. Route any local model into an agentic workflow with one command — no custom API adapters needed.
* Why it matters: If privacy and zero-latency local agents matter to you, this is the easiest on-ramp there is.
Source: Twitter
💼 Industry & Business
Mistral AI & NVIDIA Partnership
Mistral and NVIDIA are co-developing frontier open-source foundation models, pairing Mistral's architecture with NVIDIA's massive training infrastructure.
* Why it matters: Open-source AI is finally getting the same hardware-plus-model optimization that's kept closed labs like OpenAI ahead for years.
Source: Twitter
Aman Gotchu Joins SpaceX & xAI
Aman Gotchu, founder of the Firebender coding agent, is heading to the Musk-led AI and space companies to build specialized coding agents.
* Why it matters: Coding agents are trending toward vertical integration — expect AI built specifically for hardware and rocket engineering stacks.
Source: Twitter
Claude Code's $1B Revenue Run Rate
There was some noise about GPT-5.4 hitting a $1B revenue milestone, but the actual story here is Anthropic's Claude Code hitting that number in just six months.
* Why it matters: People are paying real money for AI that can actually write and run code. The "agent" market isn't theoretical anymore.
Source: Twitter
Canada's Surveillance Bill (C-22)
Researchers are flagging that Bill C-22 carries backdoor surveillance risks that could force AI platforms to hand over user data to the government.
* Why it matters: If you're building privacy-focused tools in Canada, this is a serious legal risk worth tracking closely.
Source: Hacker News
🔥 Takes & Drama
The "Neon Oni" AI Band Scandal
"Neon Oni," a Japanese metal band with 80,000 monthly Spotify listeners, turned out to be a complete AI fabrication — music, bios, the whole thing.
* Why it matters: AI-generated personas can now pass the market test. Platforms are going to have a very hard time figuring out how to verify there's an actual human behind any of this.
Source: Twitter
Notable Quotes
* @ollama on OpenClaw: "Ollama is now an official provider for OpenClaw agents." x.com/ollama/status/2033339501872116169
* @MistralAI on NVIDIA deal: "We are partnering with NVIDIA to co-develop frontier open-source AI models." x.com/MistralAI/status/2033642111984177247
* @AravSrinivas on mobile agents: "Computer is now available on Android with deep OS integration." x.com/AravSrinivas/status/2033603347534713300
* @AmanGotchu on his move: "Excited to share I'm joining SpaceX and xAI to build coding agents." x.com/AmanGotchu/status/2033256922598830464
* @dylan522p on RLMs: "Recursive Language Models outperform vanilla GPT-5 by 15 points on long-context benchmarks." x.com/dylan522p/status/2033568059643076873
* @gdb on revenue: "Claude Code achieved a $1B annualized revenue run rate." x.com/gdb/status/2033605419726483963
Closing thought: Between recursive inference tricks that beat frontier models and architecture shifts like AttnRes trimming compute costs, the clearest pattern right now is that smarter software often punches harder than just throwing more chips at the problem. Oh, and your phone is about to get a whole lot more autonomous.