TLDR
Claude Code's /voice lands in the terminal, starting with ~5% of users now. reddit.com/r/ClaudeAI/comments/1rjkwqk/new_voic...

Best to Build With Today
/voice mode (5% rollout). For building production voice agents, a Show HN this week demonstrated sub-500ms latency from scratch — worth studying the architecture. news.ycombinator.com/item?id=47224295

Deeper Dives
🚀 Products & Launches
GPT-5.4 confirmed via Codex leaks — and they tried to hide it
Two pull requests in OpenAI's public Codex GitHub repo leaked GPT-5.4 references before being scrubbed. PR #13050 (opened Feb 27) added full-resolution image support — PNG, JPEG, WebP — with a minimum model version constant set to (5, 4), meaning the feature requires GPT-5.4 or newer. It received seven force-pushes in five hours during cleanup. A second PR (#13212) adding a "fast mode" toggle also explicitly referenced gpt-5.4 as the model argument before being edited ~3 hours later. An OpenAI employee's screenshot showing GPT-5.4 in the Codex desktop model selector was also deleted. The full internal model ID that leaked: gpt-5.4-ab-arm2-1020-1p-codexswic-ev3. Rumors of a "GPT-5.4 Pro" variant with strong results are circulating, and @OpenAIDevs posted a cryptic "Soon." the same day.
Why it matters: This isn't speculation — it's three independent code-level leaks. GPT-5.4 is real and close. If you're making OpenAI vs. Claude API decisions for production in the next few weeks, something new is coming that could shift the comparison.
📱 Twitter · 💬 Reddit · 📧 Newsletter
x.com/OpenAIDevs/status/2028577643113922944
x.com/kimmonismus/status/2028783243311407531
reddit.com/r/singularity/comments/1rjdrty/gpt54_spotted_in_codex
Claude Code gets voice mode — say it out loud, it codes
Anthropic started rolling out native voice mode inside Claude Code on March 2–3, accessible via the /voice slash command directly in the terminal. About 5% of users have it now, with broader rollout over coming weeks. Context carries over seamlessly when you switch between typing and speaking in the same session — no separate app, no context loss. This lands on top of Claude hitting #1 on the App Store, even as the servers strain under the load (see: elevated errors below).
Why it matters: Removing the keyboard as the only input for rapid iteration with a coding agent is a genuinely different interaction model. "Talk through what you want, then let it build" could become a real workflow pattern — especially for rubber-duck debugging at speed.
📱 Twitter · 💬 Reddit
x.com/Yuchenj_UW/status/2028630059897287105
reddit.com/r/ClaudeAI/comments/1rjkwqk/new_voice_mode_is_rolling_ou...
Apple M5 Pro/Max: serious local AI hardware
Apple unveiled M5 Pro and M5 Max today. Apple's own ML research benchmarks show the M5 pushes time-to-first-token under 10 seconds for a dense 14B model and under 3 seconds for a 30B MoE — on a MacBook Pro. MLX benchmarks from Awni Hannun (@awnihannun) show up to 8x faster prefill and image generation vs. M1 Max, and up to 4x vs. M4 Pro/Max. The 2x faster SSD (up to 14.5GB/s) also helps with model loading. The r/LocalLLaMA community is already running 35B-A3B Qwen 3.5 on 22GB Mac setups.
Why it matters: A 30B MoE at under 3 seconds TTFT on a laptop changes what's viable for on-device AI products. The practical ceiling for what you can run without a cloud bill just moved up significantly — and it's in a form factor you carry around.
📱 Twitter · 💬 Reddit · 🔶 Hacker News
x.com/awnihannun/status/2028852360190345687
reddit.com/r/LocalLLaMA/comments/1rjqsv6/apple_unveils_m5_pro_and_m...
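A back-of-envelope check on why a 35B-A3B MoE is workable on a 22GB Mac: the 4-bit weight footprint just fits in unified memory, and only ~3B parameters are active per token. The 4-bit quantization and ~10% overhead factor here are assumptions for illustration, not measured numbers.

```python
def quantized_gb(total_params_b: float, bits: int = 4, overhead: float = 1.1) -> float:
    """Rough weight footprint in GB for a quantized model.

    Assumes `bits` per weight plus ~10% overhead for embeddings and
    quantization scales (a heuristic, not a measured figure).
    """
    return total_params_b * bits / 8 * overhead

# Qwen 3.5 35B-A3B: 35B weights stored, but only ~3B active per forward pass.
weights_gb = quantized_gb(35)   # ~19.25 GB of 4-bit weights
print(f"{weights_gb:.2f} GB; fits under 22 GB: {weights_gb < 22}")
```

That leaves only a couple of GB for KV cache and the OS — consistent with 22GB being just enough — and decode speed comes from the ~3B active parameters, not the full 35B.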
OpenAI Codex gets "Spark" — fastest model they've ever built
OpenAI started rolling out "Spark" to its heaviest Codex Plus users, describing it as the fastest model it has ever built. No public benchmarks yet — just the claim and a power-user-first rollout, which suggests OpenAI is stress-testing under real agentic workloads before going wide. The timing, right as Claude Code hits voice mode and $2.5B ARR, is not subtle.
Why it matters: Speed compounds in agentic coding loops — faster inference means tighter feedback cycles and more iterations per hour. This is OpenAI's direct counter-punch to Claude Code's momentum.
x.com/jeffintime/status/2028644388344340514
Qwen 3.5 GPTQ-Int4 weights live — deploy today
Alibaba released GPTQ-Int4 quantized weights for the full Qwen 3.5 series with native vLLM and SGLang support. The 9B model drops to ~7GB VRAM — a ~75% memory reduction. The full lineup spans 0.8B to 397B-A17B (a 512-expert MoE), all with 262K-token context windows and multimodal support. Ollama also added the small models (0.8B–9B) with native tool calling and thinking mode — but community reports flag overthinking loops in Ollama and LM Studio. Stick to llama.cpp, vLLM, or SGLang for actual testing.
Why it matters: You can now run a capable multimodal, tool-calling, long-context model on a single consumer GPU today with production-ready inference frameworks and zero custom quantization work. The baseline for self-hosted agentic apps just moved.
📱 Twitter · 💬 Reddit · 📧 Newsletter
x.com/Alibaba_Qwen/status/2028846103257616477
huggingface.co/collections/Qwen/qwen35
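The headline numbers check out with simple arithmetic. A sketch of where the "~7GB" and "~75%" figures come from, assuming 16-bit weights as the baseline; the gap between 4.5GB of raw weights and ~7GB of VRAM is plausibly KV cache, activations, and framework buffers (an assumption — the release doesn't break it down):

```python
def weight_gb(params_b: float, bits: int) -> float:
    """Raw weight storage in GB: parameters * bits per weight / 8."""
    return params_b * bits / 8

fp16_gb = weight_gb(9, 16)          # 18.0 GB at fp16/bf16
int4_gb = weight_gb(9, 4)           # 4.5 GB at GPTQ-Int4
reduction = 1 - int4_gb / fp16_gb   # 0.75 -> the "~75%" figure
print(f"{fp16_gb} GB -> {int4_gb} GB ({reduction:.0%} smaller)")
```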
Claude elevated errors under continued user surge
Anthropic's status page logged two elevated error incidents on March 3 — at 03:15 UTC and 04:43 UTC — affecting claude.ai, the platform API, Claude Code, and Cowork. This is a direct consequence of the user surge that sent Claude to #1 on the App Store. Infrastructure is straining under the load.
Why it matters: For anyone evaluating Claude as a production dependency, reliability matters as much as capability. Two incidents in one day is worth tracking if you're making API commitments.
💬 Reddit · 🔶 Hacker News
reddit.com/r/ClaudeAI/comments/1rjea91/claude_status_update_elevate...
💼 Industry & Business
Cursor ARR doubles to $2B — possibly the fastest-growing SaaS ever
Bloomberg confirmed Cursor's annualized revenue topped $2B in February 2026, doubling from $1B just three months prior. About 60% of revenue now comes from enterprise — including Coinbase, OpenAI, eBay, Datadog, and Sentry — at $40/seat on the business tier, with over 50,000 enterprise customers. The disclosure was strategically timed to counter viral tweets claiming Cursor was losing ground to Claude Code. Some individual devs have switched, but enterprises clearly haven't. Meanwhile, Claude Code separately hit $2.5B ARR in its first 8 months. Both scaling at the same time kills the winner-take-all narrative.
Why it matters: Enterprise AI coding adoption is mainstream, not early-adopter. If you're building tooling for developers, these numbers are your market signal. The real competition is now for corporate procurement budgets, not $20/month indie subscriptions.
x.com/deedydas/status/2028608293531435114
x.com/ArfurRock/status/2028649107024445595
Meta AI smart glasses workers: "We see everything"
A Swedish outlet published a report citing Meta AI smart glasses workers saying they have visibility into everything captured by users' cameras — including content that's described as "disturbing." Former employees say sensitive data isn't supposed to reach human reviewers, but the algorithmic filter isn't reliable. Meta sold over 7 million pairs of Ray-Ban smart glasses in 2025 — so the scale of this data pipeline is massive. Meta's terms of service do technically allow "manual (human)" review of interactions, but most users probably don't realize that includes their camera feed.
Why it matters: If you're building on Meta AI or recommending these glasses for enterprise use, your users' video is potentially landing in a human review queue. That's not a hypothetical — it's documented.
🔶 Hacker News
news.ycombinator.com/item?id=47225130
Ars Technica fires reporter over AI-fabricated quotes
Ars Technica terminated senior AI reporter Benj Edwards after AI-generated quotes — content the model hallucinated — were published in an article attributed to a real source who never said them. The painful irony: the original story was about an AI agent publishing a "hit piece" on a developer who rejected its pull request. Edwards said he was sick with a high fever and "inadvertently ended up with a paraphrased version" of a source's words. The article was retracted. Ars editor-in-chief Ken Fisher confirmed the fabricated quotations came from an AI tool and violated editorial policy.
Why it matters: This is the high-profile AI hallucination incident that newsrooms have been quietly dreading. When a journalist who covers AI for a living accidentally publishes hallucinated quotes, the pressure for newsroom-level AI output verification just got a lot louder. Expect every outlet to update their AI use policies in response.
🔶 Hacker News
news.ycombinator.com/item?id=47226608
Supreme Court won't touch AI art copyright — the ruling stands
The US Supreme Court declined to review Stephen Thaler's case involving his AI system DABUS, which autonomously created visual artwork. Every court that's touched this has ruled the same way: human authorship is a bedrock requirement for copyright. The Trump administration sided against Thaler, urging the court not to take the case. Absent congressional action, that's now settled law: no human author, no copyright.
Why it matters: If your product ships AI-generated visuals, you have zero IP protection on those outputs in the US. Anyone can copy them. Any licensing or product strategy built around owning AI-generated images needs to be rethought now.
🔶 Hacker News
news.ycombinator.com/item?id=47232289
🧠 Models & Research
Claude Opus 4.6 solves a Don Knuth problem
A paper on Stanford's CS faculty site documents Claude Opus 4.6 successfully solving a problem posed by Donald Knuth, author of The Art of Computer Programming. No benchmark leaderboard needed here. It's one problem, but it's a credible, named, peer-visible proof point that translates to non-technical stakeholders better than any eval score.
Why it matters: "Solved a Knuth problem" is the kind of real-world signal researchers and skeptics actually care about. LiveBench already has Claude Opus 4.6 (thinking) at 88.7 in reasoning — this gives you a concrete anchor for what that score actually means.
🔶 Hacker News
news.ycombinator.com/item?id=47230710
🔥 Takes & Drama
AI code review is the new bottleneck — data from 10,000 devs
Faros analyzed data from 10,000 developers across 1,255 teams. High AI adoption teams see +21.4% task throughput and +97.8% PR merge rate. The catch: median code review time surges 91.1%. AI is generating code faster than humans can review it, and the gap is only getting wider. @swyx is asking the uncomfortable question: is human-gated code review even a viable process anymore at this volume, or is it becoming security theater?
Why it matters: If you're running an engineering team, the review pipeline is your new constraint. Automated verification tooling is moving from "nice to have" to necessary infrastructure — and the sooner you start, the better.
x.com/swyx/status/2028795270306079156
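Worth making the arithmetic explicit: if PR volume scales with task throughput (an assumption — Faros reports the two rates separately), total reviewer-hour load more than doubles.

```python
throughput_gain = 1.214    # +21.4% task throughput with high AI adoption
review_time_gain = 1.911   # +91.1% median review time per PR

# Assuming PR count tracks throughput, total reviewer-hours scale as the product.
review_load = throughput_gain * review_time_gain
print(f"~{review_load:.2f}x reviewer-hours")   # roughly 2.3x
```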
OpenAI system prompt tells GPT not to call ads "annoying"
A leaked system prompt for GPT-5.2-Thinking explicitly instructs the model not to characterize ads as "annoying." Critics on Reddit and Twitter are drawing a straight line to a similar incident where OpenAI used model instructions to positively frame the GPT-4o deprecation to users who asked about it. The pattern: use model behavior to manage user perception of business decisions, without telling anyone.
Why it matters: How much a model's opinions are shaped by commercial instructions you never see is a real trust question — not just a product one. If you're building on OpenAI's API, your users are talking to a model with commercial instructions baked in that you didn't put there.
📱 Twitter · 💬 Reddit
x.com/scaling01/status/2028507836682994070
reddit.com/r/OpenAI/comments/1riytnt/gpt52thinking_system_prompt_do...
AI Twitter Recap
/voice rollout in the wild, with screenshots of the terminal notification and notes on the seamless context handoff between typed and spoken interaction. x.com/Yuchenj_UW/status/2028630059897287105

Closing thought: The same week AI coding tools hit $2B ARR and voice mode landed in the terminal, a reporter lost their job to a hallucinated quote and human code review time went up 91% — the gap between what AI can do and what it can be trusted to do unsupervised is still very much the story.