karpathy/autoresearch: Trending on GitHub

The Rise of Autonomous AI Research: A New Era for Science

The autoresearch repository frames AI research as something autonomous agents could eventually drive on their own. Its more dramatic imagery (agents running across compute cluster megastructures, a codebase in its 10,205th generation, a self-modifying binary that has grown beyond human comprehension) describes an imagined future, not the current state of the project. What the repository contains today is far smaller: a compact training setup that an agent can iterate on without human supervision.

The autoresearch project, led by @karpathy, gives an AI agent a small but real LLM training setup and lets it experiment autonomously overnight. The agent modifies the code, trains for 5 minutes, checks whether the result improved, keeps or discards the change, and repeats. The outcome is a log of experiments and, hopefully, a better model.
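The modify, train, check, keep-or-discard cycle described above can be sketched as a greedy search. This is a minimal illustration, not the repository's code: `run_search`, `propose`, `evaluate`, and `ExperimentLog` are hypothetical names, `propose` stands in for the agent editing train.py, and `evaluate` stands in for a 5-minute training run that returns a score where lower is assumed to be better.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ExperimentLog:
    """Running record of every attempt, whether kept or discarded."""
    entries: list = field(default_factory=list)

    def record(self, step, score, kept):
        self.entries.append({"step": step, "score": score, "kept": kept})


def run_search(propose, evaluate, baseline, steps, lower_is_better=True):
    """Greedy keep-or-discard search.

    propose(current) -> candidate  (hypothetical stand-in for an agent edit)
    evaluate(candidate) -> float   (hypothetical stand-in for a training run)
    """
    best = baseline
    best_score = evaluate(best)
    log = ExperimentLog()
    for step in range(steps):
        candidate = propose(best)
        score = evaluate(candidate)
        if lower_is_better:
            improved = score < best_score
        else:
            improved = score > best_score
        if improved:
            # Keep the change: it becomes the new starting point.
            best, best_score = candidate, score
        # Otherwise the candidate is discarded and the next attempt
        # starts again from the current best.
        log.record(step, score, improved)
    return best, best_score, log


if __name__ == "__main__":
    rng = random.Random(0)

    # Toy problem: nudge a single "lr" value toward an optimum of 0.3.
    def propose(cfg):
        return {"lr": cfg["lr"] + rng.uniform(-0.1, 0.1)}

    def evaluate(cfg):
        return (cfg["lr"] - 0.3) ** 2

    best, score, log = run_search(propose, evaluate, {"lr": 1.0}, steps=50)
    kept = sum(e["kept"] for e in log.entries)
    print(f"best lr={best['lr']:.3f} score={score:.5f} kept={kept}/50")
```

Because every attempt is logged, including the discarded ones, the log doubles as a record of what did not work, which is often as informative as the final result.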

How It Works

The autoresearch repository is deliberately kept small, with only three files that matter:

prepare.py: This file contains fixed constants, one-time data prep (downloads training data, trains a BPE tokenizer), and runtime utilities (dataloader, evaluation). It is not modified by the agent.
train.py: This file contains the full GPT model, optimizer (Muon + AdamW), and training loop. Everything is fair game: architecture, hyperparameters, optimizer, batch size, etc. This file is edited and iterated on by the agent.
program.md: This file contains baseline instructions for one agent. Point your agent here and let it go. This file is edited and iterated on by the human.

Quick Start

To get started with autoresearch, you'll need:

A single NVIDIA GPU (tested on H100)
Python 3.10+
uv project manager (install using curl -LsSf https://astral.sh/uv/install.sh | sh)

Then set up and run:

Install dependencies: uv sync
Download data and train the tokenizer: uv run prepare.py
Manually run a single training experiment: uv run train.py

Running the Agent

To run the agent, start Claude, Codex, or whichever coding agent you prefer in this repo (and disable all permissions), then prompt it with something like:

"Hi have a look at program.md and let's kick off a new experiment! let's do the setup first."

Project Structure

The autoresearch project structure is as follows:

prepare.py: constants, data prep + runtime utilities (do not modify)
train.py: model, optimizer, training loop (agent modifies this)
program.md: agent instructions
pyproject.toml: dependencies
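Since only train.py is meant to change during a run, a reviewer may want a quick check that an experiment stayed in scope. The sketch below is an assumption about how such a check could look, not something the repository ships: `out_of_scope` and `EDITABLE` are hypothetical names, and the list of changed paths is assumed to come from a command such as `git diff --name-only`.

```python
from pathlib import PurePosixPath

# Assumption for this sketch: the only file the agent is allowed to change.
EDITABLE = frozenset({"train.py"})


def out_of_scope(changed_paths, editable=EDITABLE):
    """Return the changed paths that fall outside the allowed set, sorted."""
    offenders = []
    for raw in changed_paths:
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines from command output
        # Normalise forms like "./train.py" to "train.py".
        normalised = PurePosixPath(raw).as_posix()
        if normalised not in editable:
            offenders.append(normalised)
    return sorted(offenders)


if __name__ == "__main__":
    changed = ["train.py", "prepare.py"]
    print(out_of_scope(changed))
```

An empty result means the run touched nothing except train.py; any other result lists the files that would need a closer look before the change is kept.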

Design Choices

A few design choices keep the experiment loop simple and its results comparable:

Single file to modify: The agent only touches train.py, keeping the scope manageable and diffs reviewable.
Fixed time budget: Training always runs for exactly 5 minutes, whatever platform you run it on. This makes experiments directly comparable regardless of what the agent changes.
Self-contained: No external dependencies beyond PyTorch and a few small packages. No distributed training, no complex configs. One GPU, one file, one metric.
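The fixed time budget above amounts to a training loop bounded by wall-clock time instead of step count. The sketch below shows one way to write such a loop under stated assumptions: `train_with_budget` and `step_fn` are hypothetical names, and this is not the implementation in train.py. The `clock` parameter exists only so the loop can be exercised without real waiting.

```python
import time


def train_with_budget(step_fn, budget_seconds, clock=time.monotonic):
    """Call step_fn(step_index) until budget_seconds of wall-clock time pass.

    Returns the number of completed steps. A step that starts before the
    deadline is allowed to finish, so the total runtime can overshoot the
    budget by up to one step.
    """
    deadline = clock() + budget_seconds
    steps = 0
    while clock() < deadline:
        step_fn(steps)
        steps += 1
    return steps


if __name__ == "__main__":
    # Stand-in for a real optimisation step: just sleep briefly.
    completed = train_with_budget(lambda i: time.sleep(0.01), budget_seconds=0.1)
    print(f"completed {completed} steps within the budget")
```

With a time budget, a change that makes each step faster gets more steps and a change that makes each step slower gets fewer, so every experiment is judged on what it achieves in the same amount of time.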

Platform Support

The autoresearch project currently requires a single NVIDIA GPU. Supporting other platforms, such as CPU and MPS, is possible in principle, but doing so would bloat the code. For now, some of these platforms are covered by the forks listed below.

Notable Forks

Several notable forks of the autoresearch project have been created, including:

miolini/autoresearch-macos: A fork for macOS
trevin-creator/autoresearch-mlx: A fork for macOS
jsegov/autoresearch-win-rtx: A fork for Windows

License

The autoresearch project is licensed under the MIT license.

Source: https://github.com/karpathy/autoresearch
