Low-Cost Autoresearch with HPC-AI.COM

<$2 (Total Cost)
Reproduced the full pipeline, including an H200 GPU instance and LLM API calls.
28 Attempts / 2 Hours
Rapid autonomous research loop with ~25% success rate in code modification.

Based on experiments by Tobias Lee — see original post: https://x.com/_TobiasLee/status/203846732751615264

Project: Andrej Karpathy's Autoresearch https://github.com/karpathy/autoresearch

Goal: Reproduce an autonomous AI research loop with minimal cost.

Author: Li Lei

Overview

Autoresearch is a novel experiment in which an AI agent autonomously modifies and retrains a GPT model in a real training environment:

  • Train for 5 minutes per round → evaluate model using val_bpb (bits per byte, lower is better) → keep or discard changes → repeat.
  • The agent can change almost anything: model architecture, hyperparameters, optimizer, batch size.
  • Core components: GPU machine, smart agent, and a robust training harness.
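The keep-or-discard loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: `propose_change` and `train_and_eval` are hypothetical placeholders for the agent's code edit and the short (~5 minute) training-plus-evaluation round.

```python
def autoresearch_loop(train_and_eval, propose_change, rounds=20, baseline_bpb=1.0):
    """Greedy keep-or-discard loop: apply a proposed change, retrain briefly,
    and keep the change only if validation bits-per-byte (val_bpb) improves."""
    best_bpb = baseline_bpb
    kept = []
    for _ in range(rounds):
        change = propose_change()          # e.g. tweak LR, batch size, attention
        val_bpb = train_and_eval(change)   # short training run, then evaluate
        if val_bpb < best_bpb:             # lower bits-per-byte is better
            best_bpb = val_bpb
            kept.append(change)
        # otherwise the change is discarded and the current best stands
    return best_bpb, kept
```

The key design choice is that evaluation is fully automatic: a single scalar (val_bpb) decides whether each of the agent's edits survives, so no human needs to be in the loop between rounds.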

The original setup could be resource-intensive, but using HPC-AI.com's Model APIs, we reproduced the full pipeline for under $2.

Implementation

Hardware & API Setup

  • GPU instance: H200 x 1.5 hours → ~$1.60 (20–25 rounds).
  • API calls: agent queries LLM for code analysis & decisions → ~$0.30.
  • Training harness: OpenCode.
  • Total cost: <$2.
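The breakdown above can be sanity-checked with quick arithmetic. Note the ~$1.07/h effective GPU rate is inferred from the figures given, not an official price:

```python
gpu_hours = 1.5
gpu_cost = 1.60   # H200 instance, from the breakdown above
api_cost = 0.30   # LLM API calls made by the agent
total = gpu_cost + api_cost

print(f"effective GPU rate: ${gpu_cost / gpu_hours:.2f}/h")  # ~$1.07/h
print(f"total: ${total:.2f}")                                # $1.90, under $2
```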

HPC-AI Advantages

  1. Pay-per-use with no subscription required; the K2.5 spot instance is cost-effective.
  2. Pre-installed CUDA and PyTorch, ready out of the box.
  3. A stable network ensures uninterrupted dataset downloads.

Step-by-Step Reproduction

  1. Register on HPC-AI, create an H200 instance, and obtain an API key.
  2. Connect to the instance and clone the autoresearch repo.
  3. Install dependencies and prepare data (~2 minutes).
  4. Run one manual round to verify the environment.
  5. Install OpenCode and connect directly to HPC-AI's Model API; no local deployment is needed. Use your API key to access K2.5 / M2.5, then configure the custom provider as described in the provider documentation.
  6. Input the agent prompt and start the autoresearch loop.
  7. Let it run (overnight recommended) and review the results the next day.

Results

Autoresearch Results
  • 2 hours of autonomous runs lowered val_bpb from 0.994 → 0.990.
  • Agent mainly tuned learning rate and experimented with attention patterns.
  • 28 attempts in total, 7 successful (~25% success rate).
  • Demonstrates the potential of AI-driven research loops at very low cost.
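The headline numbers above are consistent with each other, as a quick check shows:

```python
attempts, successes = 28, 7
success_rate = successes / attempts            # 7 of 28 attempts kept

bpb_before, bpb_after = 0.994, 0.990
improvement_pct = (bpb_before - bpb_after) / bpb_before * 100

print(f"success rate: {success_rate:.0%}")              # 25%
print(f"val_bpb improvement: {improvement_pct:.2f}%")   # 0.40%
```

A ~0.4% val_bpb reduction in two hours is modest, but the point of the experiment is the cost: each of the 28 attempts amounted to only a few cents of compute.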

Key Takeaways

  • Autoresearch represents a new AI research paradigm:
    • Human-defined boundaries (via program.md).
    • AI-driven exploration (code modification, experiment design, evaluation).
  • Future PhD-style workflow could involve supervising agents overnight and refining research prompts daily.

Tip for Beginners: Run for at least 2 hours (~20 rounds) to get an initial feel; run overnight (~100 rounds) to see the full potential.

Ready to Power your AI Workloads?

Join thousands of developers and researchers who are building the future with our HPC-AI platform.