Tokens & Signals for 3/25/2026. We scanned ~1,200 Twitter accounts (1,287 tweets), 13 subreddits (66 posts), Hacker News (9 stories), 7 newsletter posts, 3 podcast episodes, 238 Discord messages, and leaderboard data for you. Estimated reading time saved: ~12 hours.
* Google's TurboQuant is a big deal for memory: it compresses LLM KV caches by 4-6x with almost no accuracy loss, which means much longer context windows on the same hardware you already have. research.google/blog/turboquant-redefining-ai-e...
* OpenAI is shutting down the Sora app to free up compute for core LLM research and robotics. The experiment is over, and a $207B funding gap has a way of focusing the mind. x.com/ClementDelangue/status/2036548842099781694
* Intel just dropped the Arc Pro B70/B65 GPUs — 32GB of VRAM for $949. If you want to run 70B+ models locally on a workstation without selling a kidney, this is worth paying attention to. reddit.com/r/LocalLLaMA/comments/1s3e8bd/intel_...
* Bernie Sanders and AOC introduced the "Data Center Environmental Sustainability Act," which would freeze new data center construction until strict energy and water standards are in place. reddit.com/r/agi/comments/1s3cjb3/bernie_sander...
* @fchollet on the new ARC-AGI-3 benchmark: "Frontier models are scoring under 1% on the private test set. We keep building benchmarks, they keep failing. The gap between AI and human-level general intelligence remains enormous." x.com/fchollet/status/2036863769981403497
* Sakana AI's "The AI Scientist" just landed in Nature, which is the scientific community's formal stamp of approval that AI agents can independently run, write up, and peer-review actual research. x.com/SakanaAILabs/status/2036840833690071450
* Cursor dropped their Composer 2 report, breaking down how reinforcement learning got their coding agent to a 92% success rate on code edits while cutting latency by 40%. Good read if you're building agents. x.com/cursor_ai/status/2036566134468542651
* Anthropic's Claude mobile app now lets you view and navigate files in Figma, Canva, and Amplitude directly inside the chat. It's starting to feel less like a chatbot and more like an actual workspace. x.com/claudeai/status/2036850783526719610
Best to Build With Today
* Coding — claude-opus-4-6-thinking-auto is the top performer for complex, multi-file engineering tasks.
* Reasoning — gemini-3.1-pro-preview-high leads current benchmarks for general math and difficult, multi-step logical reasoning.
* Chat — gemini-3.1-pro-preview is the best all-around assistant for general conversation.
* Open-source — NVIDIA Nemotron 3 Nano 30B is a strong, efficient choice for local inference.
* Value pick — gpt-oss-20B is the most budget-friendly option for high-volume tasks at $0.10/M tokens.
Deeper Dives
🧠 Models & Research
Google Research: TurboQuant for Efficient KV Cache
TurboQuant uses non-uniform quantization to shrink LLM key-value cache memory by 4-6x without losing accuracy. A smaller memory footprint means models can handle much longer context windows on the same GPU hardware.
Why it matters: Memory is the biggest bottleneck for long-context LLMs — this effectively multiplies your available VRAM.
Sources: Twitter, Hacker News
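TurboQuant's exact non-uniform scheme isn't spelled out in this digest, but the basic mechanics of KV-cache compression are easy to sketch. The snippet below is an illustrative per-channel uniform quantizer (NumPy, toy data), showing where the memory savings come from: float32 values become 4-bit codes plus a small scale/offset per channel, roughly an 8x reduction before packing overhead.

```python
import numpy as np

def quantize_kv(x, bits=4):
    """Per-channel affine quantization of a KV cache slice.

    x: float32 array of shape (tokens, head_dim). Returns uint8 codes
    (values in [0, 2**bits - 1]) plus the per-channel scale and offset
    needed to dequantize.
    """
    levels = 2 ** bits - 1
    lo = x.min(axis=0, keepdims=True)               # per-channel min
    hi = x.max(axis=0, keepdims=True)               # per-channel max
    scale = np.maximum(hi - lo, 1e-8) / levels
    q = np.clip(np.round((x - lo) / scale), 0, levels).astype(np.uint8)
    return q, scale, lo

def dequantize_kv(q, scale, lo):
    return q.astype(np.float32) * scale + lo

rng = np.random.default_rng(0)
kv = rng.standard_normal((128, 64)).astype(np.float32)  # toy KV slice
q, scale, lo = quantize_kv(kv, bits=4)
err = np.abs(dequantize_kv(q, scale, lo) - kv).mean()
print(f"mean abs reconstruction error: {err:.3f}")
```

A non-uniform scheme like TurboQuant's would place the quantization levels to match the value distribution instead of spacing them evenly, squeezing out more accuracy at the same bit width.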
ARC-AGI-3 Benchmark Released
The ARC Prize team released ARC-AGI-3, a benchmark focused on agentic efficiency — the ability to act, observe, and learn in novel, non-language environments. Top models are struggling badly, scoring under 1% on the hardest subsets.
Why it matters: It shifts the focus from "PhD-level" memorization to real-world, generalizable problem-solving — which is where the actual gap still lives.
Sources: Twitter, Reddit, Hacker News
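ARC-AGI-3's actual environment interface isn't reproduced here, but the act/observe/learn loop it scores can be sketched with a hypothetical toy environment. The point of an agentic-efficiency benchmark is that the agent must discover the rules through interaction, and fewer actions to solve means a higher score.

```python
class ToyEnv:
    """Hypothetical stand-in for an ARC-AGI-3-style interactive
    environment: the agent observes state, acts, and must infer the
    goal from feedback alone (no instructions given)."""
    def __init__(self, target=3):
        self.state, self.target = 0, target

    def observe(self):
        return self.state

    def act(self, action):
        self.state += action
        return self.state == self.target   # True once solved

def run_agent(env, max_actions=10):
    """Naive agent that just increments; returns the number of
    actions used, which is what efficiency-focused scoring rewards."""
    for n in range(1, max_actions + 1):
        if env.act(+1):
            return n
    return None   # failed within the action budget

print(run_agent(ToyEnv()))  # → 3
```

Real ARC-AGI-3 tasks are far richer, but this is the shape of the problem: frontier models currently solve under 1% of the private set under budgets like this.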
Cursor Releases Composer 2 Report
Cursor's report breaks down how reinforcement learning and self-summarization let their system handle complex, long-running coding tasks. The results: 92% success rate on generated edits and 40% lower agent latency.
Why it matters: It's essentially a blueprint for building agentic coding systems that stay reliable during long-horizon editing work.
Sources: Twitter
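Cursor hasn't published its implementation details beyond the report, but the self-summarization idea is straightforward to sketch: when the agent's transcript outgrows its context budget, older turns get folded into a running summary so long-horizon edits keep fitting. The `summarize` function below is a placeholder for an LLM call; the class and budgets are illustrative, not Cursor's.

```python
def summarize(turns):
    """Placeholder: a real system would ask the model to compress
    these turns into a short brief of decisions and file state."""
    return f"[summary of {len(turns)} earlier steps]"

class AgentContext:
    """Keeps a coding agent's transcript under a turn budget by
    folding older turns into a running summary (self-summarization)."""
    def __init__(self, budget=8, keep_recent=3):
        self.turns, self.summary = [], None
        self.budget, self.keep_recent = budget, keep_recent

    def add(self, turn):
        self.turns.append(turn)
        if len(self.turns) > self.budget:
            old = self.turns[:-self.keep_recent]
            self.turns = self.turns[-self.keep_recent:]
            prior = [self.summary] if self.summary else []
            self.summary = summarize(prior + old)   # fold old turns in

    def prompt(self):
        head = [self.summary] if self.summary else []
        return head + self.turns

ctx = AgentContext(budget=5, keep_recent=2)
for i in range(9):
    ctx.add(f"step {i}")
print(ctx.prompt())
```

The design trade-off is recency vs. fidelity: recent turns stay verbatim for precise edits, while the summary preserves just enough history to keep the agent consistent over long runs.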
The AI Scientist Published in Nature
Sakana AI's "The AI Scientist" is now formally published in Nature, confirming the framework can autonomously run the full scientific lifecycle — ideation, coding, experiments, and drafting the paper.
Why it matters: It's peer-reviewed validation that foundation models can produce peer-review-quality research on their own. A bit meta, honestly.
Sources: Twitter
🚀 Products & Launches
Anthropic Claude Mobile Update
Claude's mobile app now has direct, interactive integrations with Figma, Canva, and Amplitude — you can view and work with files from those tools right inside the chat.
Why it matters: It turns the mobile app from a passive chatbot into something that actually feels like a workspace.
Sources: Twitter
💼 Industry & Business
OpenAI Shutting Down Sora Application
OpenAI is killing the standalone Sora app to redirect compute toward core LLM research and robotics. It's a pretty clear signal about where they think the real value is — and how tight compute resources actually are right now.
Why it matters: It marks a deliberate pivot away from consumer video tools toward reasoning and productivity models.
Sources: Twitter, Reddit
Intel Launching 32GB VRAM Arc Pro B70/B65 GPUs
Intel's new workstation GPUs pack 32GB of VRAM starting at $949, optimized for INT8/FP8 compute and local LLM inference.
Why it matters: This meaningfully lowers the barrier for individuals and small teams who want to run high-parameter models locally on private data.
Sources: Reddit
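To see why 32GB of VRAM is the headline number, here's some back-of-envelope arithmetic (mine, not Intel's): weight memory is roughly parameters times bits per weight, plus headroom for KV cache and activations. A 30B model at 4-bit fits comfortably on one card, while a 70B model at 4-bit needs two of them, which is still a very affordable workstation.

```python
def model_vram_gb(params_b, bits, overhead=1.2):
    """Rough weight-memory estimate in GB: params * bits/8 bytes,
    with ~20% headroom for KV cache and activations. Illustrative
    rule of thumb, not a precise requirement."""
    return params_b * 1e9 * bits / 8 / 1e9 * overhead

for params in (30, 70):
    for bits in (16, 8, 4):
        print(f"{params}B @ {bits}-bit: ~{model_vram_gb(params, bits):.0f} GB")
```

By this estimate, 30B at 4-bit needs ~18 GB (one B70), while 70B at 4-bit needs ~42 GB, so "70B+ locally" realistically means a two-card build at this price point.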
Sanders and AOC Propose Data Center Bill
The proposed legislation would put a nationwide moratorium on new data centers over 50MW until strict environmental and labor standards are established.
Why it matters: This is the start of real legislative friction around the massive physical infrastructure AI actually requires to run. It won't be the last bill like this.
Sources: Reddit
Funding & Deals
* Plaid acquired Twi: The fintech infrastructure giant keeps consolidating, betting on AI-driven automated financial services.
Launches
* Ensu: A new privacy-focused app for running LLMs locally — no cloud required.
* LiteParse: A fast, free, non-VLM document parser built for feeding context to AI agents.
Closing thought: The AI industry is entering a genuinely physical phase — where compute, energy, and water costs are now real enough to kill products and attract politicians. We're done playing with toys. The obsession now is the infrastructure required to actually keep this thing running.