Gödel Agent: A Self-Referential Agent Framework for Recursive Self-Improvement
ACL 2025 · Main Conference
We build an AI agent inspired by the Gödel machine that can read, evaluate, and rewrite its own source code at runtime, achieving recursive self-improvement without human-designed optimization routines -- improving from 4% to 78% accuracy on Game of 24 and outperforming all baselines on four reasoning benchmarks.
The rapid advancement of large language models (LLMs) has significantly enhanced the capabilities of AI-driven agents across various tasks. However, existing agentic systems, whether based on fixed pipeline algorithms or pre-defined meta-learning frameworks, cannot search the whole agent design space because their human-designed components restrict it, and thus may miss the globally optimal agent design. In this paper, we introduce Gödel Agent, a self-evolving framework inspired by the Gödel machine, enabling agents to recursively improve themselves without relying on predefined routines or fixed optimization algorithms. Gödel Agent leverages LLMs to dynamically modify its own logic and behavior, guided solely by high-level objectives through prompting. Experimental results on mathematical reasoning and complex agent tasks demonstrate that the Gödel Agent implementation can achieve continuous self-improvement, surpassing manually crafted agents in performance, efficiency, and generalizability.
The Vision
What if an AI agent could look at its own code, understand its limitations, and rewrite itself to become better? This seemingly science-fiction idea is at the heart of Gödel Agent -- a framework that brings the concept of recursive self-improvement from theoretical computer science into practical reality.
The name pays homage to the Gödel machine, a theoretical construct proposed by Jürgen Schmidhuber, which describes a self-referential universal problem solver capable of optimally improving itself. Our work demonstrates that with large language models, this vision is no longer just theoretical.
The Problem with Current Agents
Today's AI agents, despite their impressive capabilities, are fundamentally limited by their human designers. Whether based on fixed pipeline algorithms like ReAct or Chain-of-Thought, or more sophisticated meta-learning frameworks, they all share a common constraint: they can only search within a space that humans have predefined.
This means that even the most advanced agents might miss globally optimal designs simply because those designs fall outside the boundaries of human imagination. We asked ourselves: what if we removed these constraints entirely?
Key Insight
By giving an agent access to its own source code and the ability to modify it at runtime, we can create systems that discover novel strategies humans never conceived -- strategies that emerge from the agent's own experience and reasoning.
How It Works
Gödel Agent operates on a beautifully simple principle: the agent can read its own code, evaluate its performance, and propose modifications -- all guided by nothing more than a high-level objective like "solve more math problems correctly." The process unfolds in a clear recursive loop.
Self-Observation
The agent reads its own source code through its sensor module, gaining full awareness of its current logic, prompts, and decision-making process.
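Self-observation can be sketched with the standard-library `inspect` module. The class and method names below are illustrative, not the paper's actual identifiers; `inspect` also requires the defining module to exist on disk.

```python
# Hypothetical sketch: an agent reading its own source at runtime.
import inspect

class Agent:
    def solve(self, task):
        # Current (possibly weak) policy the agent may later rewrite.
        return f"answer to {task}"

    def observe_self(self):
        # Full source of the agent's own class, including this method.
        return inspect.getsource(type(self))

agent = Agent()
code = agent.observe_self()
print("def solve" in code)  # True: the policy's source is visible to the agent
```

With the source text in hand, the agent can feed it to the LLM alongside performance feedback when proposing modifications.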
Performance Evaluation
It executes tasks and collects feedback -- accuracy scores, error traces, and efficiency metrics -- to identify weaknesses in its current implementation.
Self-Modification via Monkey Patching
Using Python's monkey patching mechanism, the agent dynamically rewrites its own methods at runtime. No restart is needed -- changes take effect immediately.
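A minimal sketch of this mechanism follows. The class, method, and the improved source string are all illustrative; in Gödel Agent the new code would be proposed by the LLM rather than hard-coded.

```python
# Hypothetical sketch of runtime self-modification via monkey patching.
class Agent:
    def solve(self, x):
        return x  # initial, weak policy

agent = Agent()
print(agent.solve(3))  # 3

# Source the agent "wrote" for an improved policy, as a plain string.
new_src = """
def solve(self, x):
    return x * x  # improved policy
"""

# Compile the string and patch the class in place -- no restart needed.
namespace = {}
exec(new_src, namespace)
Agent.solve = namespace["solve"]

print(agent.solve(3))  # 9: the live instance picks up the new method
```

Because Python resolves methods on the class at call time, even existing instances immediately use the patched behavior.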
Validation and Iteration
The modified agent is tested. Improvements are kept, regressions are rolled back, and the cycle repeats -- leading to continuous, compounding self-improvement.
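The keep-or-rollback cycle above can be sketched as a loop over candidate policies. Here `evaluate`, the toy `tasks`, and the fixed `proposals` list are stand-ins for real benchmark feedback and LLM-generated modifications.

```python
# Minimal sketch of validate-then-keep-or-rollback, with illustrative names.
def evaluate(policy, tasks):
    # Fraction of toy tasks answered correctly (target: double the input).
    return sum(policy(t) == t * 2 for t in tasks) / len(tasks)

# Stand-ins for successive LLM proposals: some regressions, some gains.
proposals = [lambda t: t + 1, lambda t: t * 2, lambda t: t - 1]

tasks = [1, 2, 3, 4]
policy = lambda t: t  # weak initial policy
best = evaluate(policy, tasks)

for candidate in proposals:
    score = evaluate(candidate, tasks)
    if score > best:
        policy, best = candidate, score  # keep the improvement
    # otherwise roll back: the previous policy stays in place

print(best)  # 1.0: the loop kept the second proposal and discarded the third
```

The third proposal scores worse than the kept policy, so it is discarded, mirroring the rollback step in the real system.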
Results
We evaluated Gödel Agent on reasoning benchmarks spanning math, reading comprehension, and knowledge (Game of 24, DROP, MGSM, MMLU, GPQA) and on embodied agent tasks (ALFWorld). The results exceeded our expectations across the board.
Benchmark Comparison (GPT-3.5 Backbone)
| Method | DROP (F1) | MGSM (%) | MMLU (%) | GPQA (%) |
|---|---|---|---|---|
| Chain-of-Thought | 64.2 | 28.0 | 65.4 | 29.2 |
| COT-SC | 64.4 | 28.2 | 65.9 | 30.5 |
| Self-Refine | 59.2 | 27.5 | 63.5 | 31.6 |
| LLM Debate | 60.6 | 39.0 | 65.6 | 31.4 |
| Step-back Abstraction | 60.4 | 31.1 | 65.1 | 26.9 |
| Quality-Diversity | 61.8 | 23.8 | 65.1 | 30.2 |
| Role Assignment | 65.8 | 30.1 | 64.5 | 31.1 |
| Meta Agent Search | 79.4 | 53.4 | 69.6 | 34.6 |
| Gödel Agent (Ours) | 80.9 | 64.2 | 70.9 | 34.9 |
On the Game of 24 benchmark, starting from a mere 4% accuracy, Gödel Agent iteratively improved itself to achieve 78% accuracy -- a transformation that happened without any human guidance on how to solve these problems. On the broader reasoning benchmarks, Gödel Agent consistently outperformed all baselines including Meta Agent Search (MAS), achieving 80.9 F1 on DROP (+1.5 over MAS), 64.2% on MGSM (+10.8 over MAS), and 70.9% on MMLU (+1.3 over MAS).
Perhaps more remarkably, the agent didn't just get more accurate; it became more efficient, reducing the average number of actions needed per task. It discovered optimization strategies that emerged naturally from its self-improvement process.
What We Learned
- Emergence is real: Novel problem-solving strategies can emerge from self-referential improvement without explicit human design
- Efficiency follows accuracy: As the agent improves its logic, it naturally becomes more efficient
- Generalization happens: Improvements on one task often transfer to related tasks
- The ceiling is high: Unrestricted Gödel Agent (with code execution and LLM calls) reaches 90.5 F1 on DROP and 90.6% on MGSM
Ablation Insights
Removing the thinking module causes a 13.4-point drop on MGSM; removing error handling costs 14.8 points. This confirms that self-reflection and robustness mechanisms are critical to effective self-improvement -- the agent needs to reason about why it failed, not just that it failed.
Looking Forward
Gödel Agent represents a step toward AI systems that can genuinely improve themselves -- not through human engineering of better algorithms, but through their own recursive self-reflection. While we're still far from systems that can fundamentally redesign their own architectures, this work shows that meaningful self-improvement is possible today.
The implications extend beyond academic research. As AI systems become more autonomous, the ability to self-correct and self-improve becomes increasingly valuable. Gödel Agent offers a glimpse of what that future might look like.