There’s Now an AI Research Agent That Does in Seconds What Takes PhDs Hours

Best for: Researchers, academics, developers, and anyone who reads papers regularly and wants AI to handle the grunt work of searching, cross-referencing, and verifying sources.

Not ideal for: Non-technical users who aren’t comfortable with a terminal. Feynman is a CLI tool, not a web app.

Every AI tool in 2026 wants to help you write faster. Build faster. Ship faster.

Almost none of them want to help you think better.

(Note: this is not Opennote’s Feynman-3 or MIT’s AI Feynman symbolic regression project. This is a completely different tool by Companion AI, built for a completely different purpose.)

Feynman is an open source AI research agent that runs in your terminal and does something genuinely different from everything else in the agent space right now. You give it a topic. It searches academic papers, synthesizes findings across multiple sources, verifies every claim against real citations, and hands you a structured research brief with working links to everything it referenced.

Not a chatbot. Not a summary tool. A multi-agent research system that dispatches four specialized agents in parallel, each handling a different part of the research process. The kind of work that takes a graduate student an afternoon takes Feynman about 30 seconds.

The project hit 2,300+ GitHub stars within days of launching. The announcement tweet got 1,708 likes and 2,768 bookmarks from an account with 1,400 followers. The bookmark count tells the real story: people aren’t just liking it. They’re saving it to come back to. And right now, nobody else has covered it.

What Feynman Actually Does

The core idea is simple: you type a research question into your terminal and get back a cited, structured answer built from real sources. But the way it gets there is where it gets interesting.

Feynman runs four subagents automatically for every query:

Researcher pulls evidence from academic papers, the open web, GitHub repositories, and documentation. It’s not searching Google and summarizing the first page of results. It’s querying alphaXiv (an academic paper search engine) and cross-referencing what it finds across multiple source types. The agent follows strict integrity constraints: it never fabricates a source, never claims a project exists without checking, and requires a verifiable URL for every citation.

Reviewer runs a simulated peer review on the findings. It grades feedback by severity and flags weak claims, missing context, or contradictions between sources. This is the part most AI tools skip entirely. They give you an answer. Feynman gives you an answer and then tells you what’s wrong with it.

Writer takes the research notes and produces structured, paper-style output: literature reviews, research briefs, and summaries with clear sections for consensus, disagreements, and open questions.

Verifier checks every citation in the output. Dead links get killed. Claims that don’t match their cited source get flagged. This is the difference between “AI research” that hallucinates citations (which is most of them) and research you can actually trust enough to use.

The system is built on Pi for the agent runtime and alphaXiv for paper search and analysis. Every output is source grounded. Claims link to papers, docs, or repos with direct URLs.
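The four-stage handoff is easier to picture as code. Here’s a minimal Python sketch of the pattern — the function names, data shapes, and stubbed findings are my own illustration of the architecture described above, not Feynman’s actual implementation:

```python
import re

def researcher(topic):
    # Gather findings; each claim must carry a source URL.
    # Stubbed with static data to illustrate the shape of the handoff.
    return [
        {"claim": f"Key result on {topic}", "url": "https://arxiv.org/abs/2001.08361"},
        {"claim": f"Open question about {topic}", "url": "https://arxiv.org/abs/2203.15556"},
    ]

def reviewer(findings):
    # Simulated peer review: grade each finding by severity.
    return [{**f, "severity": "low" if f["url"] else "high"} for f in findings]

def writer(reviewed):
    # Produce a structured brief from the reviewed notes.
    lines = ["# Research Brief"] + [f"- {f['claim']} ({f['url']})" for f in reviewed]
    return "\n".join(lines)

def verifier(brief):
    # Every cited URL must at least be well-formed; the real system
    # would also fetch it and compare the claim against the source.
    urls = re.findall(r"https?://\S+?(?=\))", brief)
    assert all(u.startswith("https://") for u in urls), "unverifiable citation"
    return brief

def feynman_pipeline(topic):
    return verifier(writer(reviewer(researcher(topic))))

print(feynman_pipeline("scaling laws"))
```

The point of the sketch is the shape, not the stubs: each stage consumes exactly what the previous stage produced, and verification is the last gate before anything reaches you.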

The Commands That Matter

You can talk to Feynman in plain English or use specific commands:

feynman "what do we know about scaling laws" gives you a cited research brief.

feynman deepresearch "mechanistic interpretability" triggers the full multi-agent investigation with parallel researchers, synthesis, and verification. This is the heavy mode.

feynman lit "RLHF alternatives" produces a literature review with consensus views, disagreements between researchers, and open questions nobody has answered yet.

feynman audit 2401.12345 takes an arXiv paper ID, pulls the paper’s claims, then compares them against the actual public codebase. This is genuinely unique. Most people reading papers have no way to quickly check whether the code matches what the paper says it does. Feynman automates that entire process.

feynman replicate "chain-of-thought improves math" goes even further: it attempts to replicate experiments on local or cloud GPUs via Modal or RunPod.

Each of these would be a meaningful tool on its own. Feynman bundles all of them into one CLI with a single install command.
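One small practical note on the audit command: its argument is an arXiv identifier, and modern arXiv IDs follow a predictable YYMM.number pattern, so they’re easy to sanity-check before kicking off a run. A quick helper (my own, not part of Feynman):

```python
import re

# Modern arXiv IDs: 4-digit YYMM, a dot, a 4- or 5-digit sequence
# number, and an optional version suffix like v2.
ARXIV_ID = re.compile(r"^\d{4}\.\d{4,5}(v\d+)?$")

def is_arxiv_id(s: str) -> bool:
    return bool(ARXIV_ID.match(s))

print(is_arxiv_id("2401.12345"))   # True
print(is_arxiv_id("not-a-paper"))  # False
```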

The Part Most People Will Miss

There’s one more command worth knowing about that doesn’t show up in the launch tweet.

feynman autoresearch "transformer efficiency" starts an autonomous research loop. You give it a topic, it runs continuous investigation cycles without you babysitting it. Each cycle refines the previous findings, adds new sources, and deepens the analysis. Walk away, come back, and the research brief has evolved.
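The loop itself is a simple pattern: run a cycle, fold the new sources into the running brief, repeat. A minimal sketch of that refinement loop — names, structure, and the stubbed cycle are my assumptions, not Feynman’s code:

```python
def run_cycle(topic, prior_sources):
    # One investigation cycle. In the real tool this would dispatch the
    # full agent pipeline; here it just yields one new fake source per pass.
    new = {f"https://example.org/{topic}/{len(prior_sources)}"}
    return new - prior_sources

def autoresearch(topic, cycles=3):
    sources = set()
    history = []
    for i in range(cycles):
        found = run_cycle(topic, sources)
        sources |= found       # each cycle builds on everything found so far
        history.append(f"cycle {i + 1}: {len(sources)} sources total")
    return history, sources

history, sources = autoresearch("transformer-efficiency")
print("\n".join(history))
```

The key design choice is that each cycle sees the accumulated state, so later passes refine rather than restart — which is what lets the brief evolve while you’re away.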

This is the feature that turns Feynman from a tool you use once into a tool that works while you’re not looking. It’s the same pattern that made OpenClaw and Hermes Agent compelling: agents that operate independently between sessions. Feynman applies it to research instead of task automation.

How to Install It

One command:

curl -fsSL https://feynman.is/install | bash

Requires Node.js 20.19 or newer. The installer handles everything else.

You’ll need an API key from an LLM provider (the agents need a brain to reason with) and, optionally, an alphaXiv account for deeper paper search capabilities. Multiple providers are supported, including Anthropic, OpenAI, and Gemini, with Perplexity available for web search.

If you don’t want the full terminal app and just want the research skills, you can install them directly into Claude Code or Codex:

# For Claude Code
npx feynman install --target claude-code

# For Codex
npx feynman install --target codex

That drops the skill library into your agent’s skills directory. Now your existing coding agent can also do research. The skills are just markdown files following the same pattern as OpenClaw skills and the broader agent skills ecosystem.

Who Built This

Feynman comes from Companion AI, built by @aigleeson. The project is MIT licensed with 87 commits and 2,300+ stars as of this writing. It’s actively shipping with a full documentation site at feynman.is and a changelog that shows consistent development momentum.

The team also built Pi, the underlying agent runtime that powers Feynman. This means the agent framework and the research tool were designed together, not bolted onto each other as an afterthought. That integration shows in how cleanly the four subagents coordinate: there’s no janky handoff between steps. The Researcher’s findings flow into the Reviewer’s analysis, which feeds the Writer’s output, which gets checked by the Verifier. One pipeline, four agents, zero manual intervention.

Why This Matters Right Now

The AI agent space has been dominated by tools that do things: send emails, manage calendars, write code, automate workflows. OpenClaw does things. Claude Cowork does things. Hermes Agent does things and remembers what it learned.

Feynman is the first serious open source agent built specifically for understanding things.

That distinction matters because the biggest problem with AI research tools right now isn’t capability. It’s trust. ChatGPT will confidently cite papers that don’t exist. Perplexity will summarize articles it didn’t fully read. Every AI search tool has the same fundamental weakness: you can’t verify the output without doing the research yourself, which defeats the entire point.

Feynman’s architecture addresses this directly. The Verifier agent exists for one reason: to catch the lies before they reach you. The paper audit command exists because the gap between what researchers claim in papers and what their code actually does is a known, widespread problem that nobody had automated a solution for.

The Honest Take

Feynman is still early. The community is growing fast (2,300+ stars in the first week is serious traction) but it’s not battle-tested at the scale of something like OpenClaw or Hermes Agent.

What it is: the most interesting research agent architecture released this year. The four-agent pipeline (search, review, write, verify) is how research should work when AI is involved. The paper audit against actual codebases is a feature that should have existed years ago. And the fact that it installs as skills into Claude Code or Codex means it doesn’t have to replace your existing workflow. It just makes it smarter.

The 2,768 bookmarks on a 1,400 follower account tell you what the market thinks. People want this. They’ve been waiting for an AI research tool they can actually trust. The star growth in the first week suggests they’re not just bookmarking. They’re installing.

The bet Feynman is making is that research agents need a verification layer baked into the architecture, not bolted on after the fact. Nobody else is making that bet yet. And that’s usually the bet worth paying attention to.