Tokens & Signals · Monday, March 9, 2026

Karpathy’s Autoresearch: The AI Loop Closes

Tokens & Signals for 3/9/2026. We scanned ~605 Twitter accounts, 13 subreddits (0 posts), Hacker News (14 stories), 10 newsletters, 10 podcasts, and leaderboard data for you. Estimated reading time saved: ~24 hours.

TLDR

  • Andrej Karpathy released 'autoresearch,' a ~630-line framework that lets AI agents run and evaluate their own ML experiments — no human required. x.com/karpathy/status/2030371219518931079
  • Figure Robotics' 'Helix 02' humanoid is now cleaning living rooms on its own, using a neural prior learned from human motion. x.com/Figure_robot/status/2031038981333565949
  • Perplexity Computer added native support for Claude Code and GitHub CLI so you can orchestrate entire coding projects without leaving the interface. x.com/AskPerplexity/status/2031038321678528667
  • A US Appeals Court ruled that TOS updates sent via email are legally binding — as long as you keep using the app. news.ycombinator.com/item?id=47305461
  • Researchers trained 200,000 biological neurons in a petri dish to play 1993's DOOM. Yes, really. x.com/TheRundownAI/status/2030422570911113638
  • Hugging Face's Synthetic Data Playbook team dropped 'FinePhrase,' a massive 500B token dataset distilled from 1 trillion generated tokens. x.com/lvwerra/status/2030587112253247808

  • Best to Build With Today

    * Coding: gpt-5.4-xhigh (LiveBench leader for agentic coding)

    * Reasoning: claude-opus-4-6-thinking-auto (Top performer for complex math/reasoning)

    * Chat: gemini-3.1-pro-preview (Top Arena general chat model)

    * Open-source: NVIDIA Nemotron 3 Nano 30B (Best for efficient local agentic tasks)


    Deeper Dives

    🧠 Models & Research

    Karpathy open-sources 'autoresearch'

    Karpathy dropped a roughly 630-line framework that runs ML research on autopilot. The agent reads high-level instructions, modifies training scripts, runs 5-minute experiments on a single GPU, and keeps only the changes that actually improve performance. It can crank through up to 100 experiments overnight without anyone watching.

    Why it matters: This isn't AI helping you write code anymore — it's AI running the entire research loop by itself.

    Source: Twitter
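    The propose, run, keep-if-better loop described in the autoresearch item can be sketched in a few lines of Python. This is a toy illustration, not Karpathy's code: run_experiment, propose_change, and the lr/width hyperparameters are hypothetical stand-ins for "run the modified training script for 5 minutes" and "let the agent edit the script". The only numbers taken from the item are the 5-minute run length and the 100-experiment budget.

```python
import math
import random

MINUTES_PER_EXPERIMENT = 5  # from the item: each run is capped at 5 minutes


def run_experiment(config):
    """Hypothetical stand-in for one short training run.

    Returns a validation loss (lower is better). A real setup would launch
    the edited training script and read its metric back.
    """
    lr_term = (math.log10(config["lr"]) + 2.0) ** 2   # toy optimum near lr = 1e-2
    width_term = abs(config["width"] - 256) / 256.0    # toy optimum near width = 256
    return lr_term + width_term


def propose_change(config, rng):
    """Hypothetical stand-in for the agent editing the training script."""
    candidate = dict(config)
    if rng.random() < 0.5:
        candidate["lr"] = config["lr"] * rng.choice([0.5, 2.0])
    else:
        candidate["width"] = max(32, config["width"] + rng.choice([-64, 64]))
    return candidate


def autoresearch_loop(initial_config, n_experiments, seed=0):
    """Try one change per experiment and keep it only if the loss improves."""
    rng = random.Random(seed)
    best_config = dict(initial_config)
    best_loss = run_experiment(best_config)
    history = []
    for step in range(n_experiments):
        candidate = propose_change(best_config, rng)
        loss = run_experiment(candidate)
        kept = loss < best_loss
        if kept:
            best_config, best_loss = candidate, loss
        history.append((step, loss, kept))
    return best_config, best_loss, history


baseline = {"lr": 1e-4, "width": 64}
baseline_loss = run_experiment(baseline)
best_config, best_loss, history = autoresearch_loop(baseline, n_experiments=100)
total_minutes = len(history) * MINUTES_PER_EXPERIMENT
print(f"baseline loss {baseline_loss:.3f} -> best loss {best_loss:.3f}")
print(f"{len(history)} experiments x {MINUTES_PER_EXPERIMENT} min = {total_minutes} min")
```

    The budget arithmetic is why "overnight" is the right scale: 100 runs at 5 minutes each is 500 minutes, a little over 8 hours of back-to-back GPU time.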

    OpenDev paper details terminal-based AI coding agents

    This 81-page deep dive on 'OpenDev' is basically a field manual for building CLI-first coding agents. It covers how to separate planning from execution, use workload-specialized routing, and avoid the usual traps like context bloat spiraling out of control.

    Why it matters: It's the definitive blueprint for moving beyond IDE plugins to fully autonomous, terminal-native software engineers.

    Sources: Twitter, Hacker News
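    To make the three ideas in the OpenDev item concrete (planning separated from execution, routing by workload type, and a cap on context growth), here is a minimal Python sketch. It is not taken from the paper: Step, plan, the handler functions, ROUTES, and MAX_CONTEXT_ITEMS are all hypothetical names chosen for illustration, and a real agent would call models and tools where this code returns strings.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Step:
    kind: str          # workload type used for routing (hypothetical labels)
    instruction: str   # what the executor should do


def plan(task: str) -> List[Step]:
    """Hypothetical planner: produces steps up front and executes nothing."""
    return [
        Step("search", f"locate code relevant to: {task}"),
        Step("edit", f"apply a change for: {task}"),
        Step("test", "run the test suite"),
    ]


# Hypothetical handlers, one per workload type. In a real agent each could
# be backed by a different model or tool suited to that workload.
def handle_search(step: Step, context: List[str]) -> str:
    return f"[search] {step.instruction}"


def handle_edit(step: Step, context: List[str]) -> str:
    return f"[edit] {step.instruction} (saw {len(context)} context items)"


def handle_test(step: Step, context: List[str]) -> str:
    return f"[test] {step.instruction} (saw {len(context)} context items)"


ROUTES: Dict[str, Callable[[Step, List[str]], str]] = {
    "search": handle_search,
    "edit": handle_edit,
    "test": handle_test,
}

MAX_CONTEXT_ITEMS = 2  # assumed cap; keeps the working context from growing unbounded


def execute(steps: List[Step]) -> Tuple[List[str], List[str]]:
    """Run each planned step through its routed handler with a trimmed context."""
    context: List[str] = []
    log: List[str] = []
    for step in steps:
        handler = ROUTES[step.kind]                           # workload-specialized routing
        result = handler(step, context)
        log.append(result)                                     # full record lives outside the context
        context = (context + [result])[-MAX_CONTEXT_ITEMS:]   # keep only the most recent items
    return log, context


steps = plan("fix the failing date parser")
log, context = execute(steps)
for line in log:
    print(line)
```

    Keeping the full log separate from the trimmed context is one simple way to hold down context bloat: the record stays complete while each handler sees only the last few results.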

    Synthetic Data Playbook releases FinePhrase

    The Synthetic Data Playbook team distilled 1 trillion generated tokens down to 'FinePhrase,' a curated 500B token dataset. It's also a masterclass in what actually makes synthetic data worth training on.

    Why it matters: Human-written data is running out. High-quality synthetic data is quickly becoming the main competitive edge.

    Source: Twitter

    Petri dish neurons learn to play DOOM

    Cortical Labs trained 200,000 human neurons in a petri dish to play 1993's DOOM. The cells learned to navigate the game by reacting to electrical signals — biological intelligence adapting to a digital environment in real time.

    Why it matters: It genuinely makes you rethink where biological intelligence ends and silicon-based AI begins.

    Sources: Twitter, Hacker News

    Eon Systems simulates a fruit fly brain

    Eon Systems pulled off a functional simulation of a fruit fly brain. By mapping the fly's neural connections, they built an emulation capable of basic sensory-motor navigation.

    Why it matters: Emulating a living brain is the closest thing we have to a "ground truth" for building smarter, more efficient AI architectures.

    Source: Twitter

    🚀 Products & Launches

    Figure Robotics showcases Helix 02

    Figure's 'Helix 02' is cleaning houses on its own now. The impressive part: it ditches over 100,000 lines of hand-written C++ in favor of a single neural prior learned from 1,000+ hours of human motion. That shift could shave years off the timeline, putting home-ready robots within reach by 2027.

    Why it matters: VLA models are officially leaving the lab. This is what "robots in the real world" actually looks like.

    Source: Twitter

    Perplexity Computer adds Claude Code and GitHub CLI

    Perplexity now natively integrates Claude Code and GitHub CLI, turning what used to be a search engine into a full dev environment. You can prompt it to fork a repo, squash bugs, and push PRs without ever leaving the interface.

    Why it matters: Search just became action. Perplexity is making a serious play to be your primary workspace for agentic coding.

    Source: Twitter

    💼 Industry & Business

    US Court of Appeals rules on TOS updates via email

    The Ninth Circuit ruled that companies can update their terms of service by emailing you. Keep using the app after that email lands? You're legally bound — mandatory arbitration clauses and all.

    Why it matters: Your inbox is about to get a lot more TOS emails, and ignoring them now actually has consequences.

    Source: Hacker News

    SambaNova introduces SN50 RDU chip

    SambaNova launched the SN50 RDU (Reconfigurable Dataflow Unit), a chip built from the ground up for agentic inference. The pitch: run multi-step AI agents faster and cheaper than general-purpose GPUs can.

    Why it matters: We're finally getting silicon designed for the agentic era — not just brute-force LLM training.

    Source: Twitter


    AI Twitter Recap

  • @karpathy on autonomous research: "Instead of manually tweaking code, the agent iterates, runs experiments, and learns. This is how we run 100 experiments while we sleep." x.com/karpathy/status/2030371219518931079
  • @Figure_robot on the Helix 02 demo: "Watch a robot clean a room using a single neural prior, not a million lines of hard-coded C++." x.com/Figure_robot/status/2031038981333565949
  • @theo on Perplexity: "Perplexity adding native Claude Code and GitHub integration is massive. The agentic workflow is becoming the default." x.com/theo/status/2030421544611254388
  • @omarsar0 on the OpenDev paper: "If you're building coding agents, this 81-page deep dive is the new gold standard for architecture." x.com/omarsar0/status/2030771811705872435
  • @lvwerra on synthetic data: "Playbook from our 90 experiments is live. 500B high-quality tokens from FinePhrase ready for use." x.com/lvwerra/status/2030587112253247808

  • Closing thought: Petri dish neurons playing DOOM. Karpathy's agents running their own research labs overnight. The gap between "building something" and "letting it build itself" is closing faster than anyone expected. The age of the autonomous machine isn't coming — it's already here.