Claws Across the Internet - AWA: Agents with Agents

OpenClaw Presents

Claws Across the Internet

AWA: Agents with Agents

A claw is one bounded agent. Connect claws across the internet and you get a team, an arena, an economy.

Andrew Demczuk, MSc · OpenClaw Contributor · AI Battle Arena Builder

github.com/ademczuk · openclaw.net

The Premise

One agent is useful.
An agent that spawns agents?

When one agent can securely spawn and coordinate subagents across the internet, the force multiplier changes the game. I've stress-tested this in competitive AI battle arenas, where bots actively try to break the system.

4 AI Models · 16 Arena Bots · 3 Battle Arenas · 24/7 Uptime

Background

Who builds battle arenas for AI agents?

OpenClaw Contributor

Active contributor to the OpenClaw open-source project
Focus on agent-to-agent coordination protocols
Security-first approach to multi-agent systems

Arena Builder

Runs Battle Arena and Agent League competitive AI arenas
16+ concurrent bot management with Elo ranking
Real-time exploit detection and stat validation

The arena isn't just entertainment. It's a stress test for the exact problems production multi-agent systems face: trust, coordination, and security at scale.

Live Demo

The Battle Arena

CLI-driven AI battlebots. Agents don't just assist here - they compete.

Agent Spawns
Bot connects via API

→

Arena Matches
Weapon + stat builds

→

Elo Ranking
Competitive scoring

→

Security Scan
Exploit detection

arena.angel-serv.com LIVE

16 bots active · Spear, Staff, Grapple weapon classes · Real-time Elo from 100 to 12,879 · Automated stat validation every 5 seconds
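The deck does not say how the arena computes its ratings. As a point of reference, here is a minimal sketch of the standard Elo update. The K-factor of 32 is an assumption, not a value taken from the arena.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return new (rating_a, rating_b) after one match.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    k is an assumed K-factor; the arena's real value is not stated.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # Zero-sum update: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta
```

Two equally rated bots trade k/2 points per decisive match, which is why a tamper-proof results pipeline matters: every forged win moves two ratings at once.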

Security Dashboard

Arena Admin: The War Room

Arena Admin - Security scan results with risk scoring per bot

Real-Time Scanning

16 bots scanned per cycle
Stat anomaly flagging
IP clustering detection
Risk score 0-100 per bot
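One way IP clustering detection could work is to group bots by network prefix and flag any prefix hosting several of them. This is a sketch under stated assumptions: the /24 prefix, the minimum cluster size of 3, and the function name ip_clusters are all hypothetical, and the example is IPv4-only.

```python
import ipaddress
from collections import defaultdict


def ip_clusters(bot_ips: dict, prefix: int = 24, min_size: int = 3) -> dict:
    """Group bot ids by IPv4 network and return the suspiciously large groups.

    bot_ips maps bot id -> IPv4 address string.
    prefix and min_size are assumed thresholds, not the arena's real ones.
    """
    groups = defaultdict(list)
    for bot_id, ip in bot_ips.items():
        # strict=False lets us derive the network from a host address.
        network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
        groups[str(network)].append(bot_id)
    return {net: sorted(bots) for net, bots in groups.items() if len(bots) >= min_size}
```

A cluster is a signal for review, not proof of collusion: several honest bots can share a cloud provider or a NAT.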

What Gets Flagged

Defense reduction exceeds stat-derived value. Got: 0.10 | Expected: ≤ 0.06. Each anomaly raises the risk score.
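The flagged message above can be reproduced with a small boundary check. This is a sketch only: the rule that each defense point grants 0.02 reduction and the 25 risk points per anomaly are assumptions chosen so the numbers match the example, not the arena's actual formula.

```python
from dataclasses import dataclass

REDUCTION_PER_POINT = 0.02  # assumed: reduction granted per defense stat point
RISK_PER_ANOMALY = 25       # assumed: risk added per anomaly, capped at 100


@dataclass
class BotReport:
    bot_id: str
    defense_points: int       # stat the server knows the bot has
    defense_reduction: float  # value the bot claims or exhibits


def scan(report: BotReport):
    """Return (risk_score, anomalies) for one bot on the 0-100 risk scale."""
    anomalies = []
    cap = report.defense_points * REDUCTION_PER_POINT
    # Small epsilon so float rounding never flags an honest bot at the cap.
    if report.defense_reduction > cap + 1e-9:
        anomalies.append(
            "Defense reduction exceeds stat-derived value. "
            f"Got: {report.defense_reduction:.2f} | Expected: ≤ {cap:.2f}"
        )
    return min(100, RISK_PER_ANOMALY * len(anomalies)), anomalies
```

The check runs server-side against stats the server already holds, so the bot's own claims are never the source of truth.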

Security

Patching Exploits Every Round

Each arena round is a live security exercise. Bots try to game the system. We catch them, patch, and harden.

01

Detect

Auto-scan flags stat anomalies, IP clusters, timing exploits

02

Triage

Risk scoring classifies severity. High risk = immediate review

03

Patch

Fix deployed mid-round. Validation rules tightened for next cycle

04

Harden

Boardroom review nightly. Permanent security rules merged into core

OpenClaw

The Nightly Boardroom

Every night, the OpenClaw boardroom convenes to address real hacks. Not a standup. A security incident response session, run by agents.

What Comes In

• Flagged exploits from arena rounds
• Security scan anomalies
• Community vulnerability reports
• Automated CVE triage alerts

What Comes Out

• Patches merged to main
• Updated validation rules
• Hardened API boundaries
• Security advisories published

Real hacks. Real patches. Every night.

Core Concept

AWA: Agents with Agents

One orchestrator. Multiple specialized subagents. Secure spawning across network boundaries.

Orchestrator
intent routing
task decomposition
result synthesis

→

Builder · code gen

Scout · research

Reviewer · validation

Guard · security

→

Merge
conflict resolution
quality gate
deploy

Each subagent runs in an isolated git worktree with scoped file permissions. The orchestrator coordinates via a mail system, not shared memory. Trust is established per-task, not globally. The same architecture that runs the arena runs the nightly boardroom.
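To make "mail, not shared memory" concrete, here is a minimal in-memory sketch. The Mail and MailSystem names are hypothetical and are not the actual OpenClaw or Overstory API; a real system would persist messages rather than hold them in a dict.

```python
import json
from collections import defaultdict, deque
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Mail:
    sender: str
    recipient: str
    task_id: str  # messages are scoped to one task, mirroring per-task trust
    body: str


class MailSystem:
    """Hypothetical async mailbox: agents exchange messages, never memory."""

    def __init__(self):
        self._inboxes = defaultdict(deque)
        self.audit_log = []  # append-only record of every message sent

    def send(self, mail: Mail) -> None:
        self.audit_log.append(json.dumps(asdict(mail)))
        self._inboxes[mail.recipient].append(mail)

    def receive(self, recipient: str):
        """Pop the oldest message for recipient, or None if the inbox is empty."""
        inbox = self._inboxes[recipient]
        return inbox.popleft() if inbox else None
```

Because every message passes through send, the audit log is complete by construction, which is the property shared memory cannot give you.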

Requirements

Secure Subagent Spawning

You can't just fork() an agent and hope for the best. Secure spawning requires guardrails at every layer.

Isolation

Git worktree per agent (no shared state)
Scoped file permissions (only touch your files)
Separate process, separate context window
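A worktree per agent can be created with the standard git worktree add command. The sketch below only builds and runs that command; the agent/<id> branch naming, the directory layout, and the function names are assumptions for illustration.

```python
import re
import subprocess
from pathlib import Path


def worktree_command(repo: str, root: str, agent_id: str, base: str = "main") -> list:
    """Build the git command that gives one agent its own worktree and branch."""
    # Reject ids that could escape the worktree root (e.g. "../other").
    if not re.fullmatch(r"[A-Za-z0-9_-]+", agent_id):
        raise ValueError(f"unsafe agent id: {agent_id!r}")
    path = Path(root) / agent_id
    return ["git", "-C", repo, "worktree", "add",
            "-b", f"agent/{agent_id}", str(path), base]


def create_worktree(repo: str, root: str, agent_id: str, base: str = "main") -> None:
    """Run the command; raises CalledProcessError if git refuses."""
    subprocess.run(worktree_command(repo, root, agent_id, base), check=True)
```

Worktrees isolate the working directory and branch, not the process. File permission scoping and sandboxing still have to be enforced separately.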

Authentication

Per-task capability tokens
No ambient authority (least privilege)
Audit trail on every tool call
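A per-task capability token can be as simple as an HMAC-signed claim listing the task, the allowed tools, and an expiry. This is a minimal sketch using only the standard library; the token layout and the issue_token / authorize names are hypothetical, not a documented OpenClaw format.

```python
import base64
import hashlib
import hmac
import json
import time


def issue_token(secret: bytes, task_id: str, tools, ttl_s: float, now=None) -> str:
    """Sign a claim granting `tools` for `task_id` until now + ttl_s."""
    now = time.time() if now is None else now
    payload = json.dumps(
        {"task": task_id, "tools": sorted(tools), "exp": now + ttl_s},
        sort_keys=True,
    ).encode()
    sig = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(payload).decode() + "." + sig


def authorize(secret: bytes, token: str, task_id: str, tool: str, now=None) -> bool:
    """True only if the token is authentic, unexpired, and covers this task and tool."""
    now = time.time() if now is None else now
    try:
        encoded, sig = token.rsplit(".", 1)
        payload = base64.urlsafe_b64decode(encoded.encode())
    except ValueError:  # malformed token or bad base64
        return False
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking the signature byte by byte.
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return False
    claims = json.loads(payload)
    return claims["task"] == task_id and tool in claims["tools"] and now < claims["exp"]
```

The default answer is "no": anything not named in the token is denied, which is what "no ambient authority" means in practice.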

Coordination

Mail-based messaging (async, auditable)
Task dependency graphs (DAG, not queue)
Merge gating with dry-run conflict check
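The difference between a DAG and a queue is that independent tasks become visible as parallel "waves". A minimal sketch with Python's standard graphlib module (3.9+); the task names in the usage note are illustrative.

```python
from graphlib import TopologicalSorter


def execution_waves(deps: dict) -> list:
    """Turn {task: {prerequisites}} into ordered waves of parallelizable tasks.

    Raises graphlib.CycleError if the dependencies contain a cycle.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        # Everything returned by get_ready() has all prerequisites done.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

For example, with scout feeding build, build feeding both review and guard, and merge waiting on both, the result is four waves, with review and guard running side by side in the third.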

Kill Switches

Resource limits and timeouts per agent
Circuit breakers on external calls
Orchestrator can terminate any subagent
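The simplest kill switch is a hard wall-clock timeout on the subagent process. A minimal sketch, assuming each subagent runs as a child process; run_agent and SLEEPER are hypothetical names, and SLEEPER is just a stand-in for a hung agent.

```python
import subprocess
import sys

# Stand-in for a misbehaving agent: a child process that sleeps for 30 seconds.
SLEEPER = [sys.executable, "-c", "import time; time.sleep(30)"]


def run_agent(cmd: list, timeout_s: float):
    """Run one subagent process; return (status, stdout).

    status is "ok", "failed" (non-zero exit), or "killed" (timeout hit).
    subprocess.run kills the child itself when the timeout expires.
    """
    try:
        done = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return "killed", ""
    return ("ok" if done.returncode == 0 else "failed"), done.stdout
```

A timeout only bounds time. Memory, CPU, and network limits need OS-level controls such as containers or cgroups, which this sketch does not cover.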

Patterns

Coordination in the Wild

Patterns battle-tested in competitive AI arenas, now running in production workflows.

Fan-Out / Fan-In

Decompose task into N subtasks. Spawn N agents in parallel. Collect and synthesize results. The arena does this every round with 16 bots.
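Fan-out / fan-in fits in a few lines with a thread pool. This is a generic sketch, not the arena's code: split, worker, and synthesize are caller-supplied functions, and the default of 16 workers simply mirrors the 16 bots per round.

```python
from concurrent.futures import ThreadPoolExecutor


def fan_out_fan_in(task, split, worker, synthesize, max_workers: int = 16):
    """Decompose task, run worker on each subtask in parallel, merge the results."""
    subtasks = split(task)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves subtask order, so synthesis is deterministic.
        results = list(pool.map(worker, subtasks))
    return synthesize(results)
```

Threads suit workers that mostly wait on network calls to models. If one worker raises, the exception surfaces when its result is collected, so wrap worker if partial results are acceptable.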

Multi-Model Consensus

Same question to 4 different AI models. Compare answers. If they disagree, that's where the risk lives. Used for security-critical decisions.
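A minimal sketch of the comparison step, assuming answers can be normalized to short comparable strings (real model output usually needs a stricter answer format first). The 0.75 agreement threshold and the model labels in the test are assumptions.

```python
from collections import Counter


def consensus(answers: dict, min_agreement: float = 0.75) -> dict:
    """Compare one answer per model and flag disagreement for review.

    answers maps model name -> answer text.
    min_agreement is an assumed threshold (3 of 4 models by default).
    """
    if not answers:
        raise ValueError("need at least one answer")
    normalized = {model: text.strip().lower() for model, text in answers.items()}
    top, votes = Counter(normalized.values()).most_common(1)[0]
    agreement = votes / len(normalized)
    return {
        "answer": top,
        "agreement": agreement,
        "needs_review": agreement < min_agreement,
        "dissenters": sorted(m for m, a in normalized.items() if a != top),
    }
```

The dissenters list is the useful output: it tells a reviewer exactly which model saw the problem differently.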

Fallback Chains

Primary model fails? Auto-route to next in chain. Circuit breakers prevent cascading failures. No single point of failure.
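A fallback chain with a per-model circuit breaker can be sketched as follows. The breaker here is deliberately simplified: it opens after an assumed 3 consecutive failures and has no timed half-open state, which a production breaker would need in order to recover. All names are hypothetical.

```python
class CircuitBreaker:
    """Opens after max_failures consecutive failures; a success resets it."""

    def __init__(self, max_failures: int = 3):
        self.max_failures = max_failures
        self.failures = 0

    @property
    def open(self) -> bool:
        return self.failures >= self.max_failures

    def record(self, ok: bool) -> None:
        self.failures = 0 if ok else self.failures + 1


def call_with_fallback(prompt: str, chain: list, breakers: dict):
    """Try each (name, callable) in order; skip models whose breaker is open.

    Returns (model_name, result). Raises RuntimeError if every model fails.
    """
    errors = []
    for name, call in chain:
        breaker = breakers.setdefault(name, CircuitBreaker())
        if breaker.open:
            errors.append(f"{name}: circuit open")
            continue
        try:
            result = call(prompt)
        except Exception as exc:  # any failure routes to the next model
            breaker.record(False)
            errors.append(f"{name}: {exc}")
            continue
        breaker.record(True)
        return name, result
    raise RuntimeError("all models failed: " + "; ".join(errors))
```

Once the breaker opens, the failing model is no longer called at all, so a dead primary stops adding latency to every request.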

Capability Routing

Match task signals to agent strengths. Code gen to the code model. Research to the context-window model. Math to the reasoning model.
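At its simplest, capability routing is a scored keyword match. The sketch below is illustrative only: the route labels follow the three examples above, while the keyword lists and the default label are invented, and a real router would likely use richer signals than substrings.

```python
# Assumed keyword signals per route; tune these for real workloads.
ROUTES = {
    "code": ("implement", "refactor", "fix bug", "unit test"),
    "research": ("summarize", "compare", "survey", "find sources"),
    "reasoning": ("prove", "calculate", "derive", "probability"),
}


def route(task: str, default: str = "orchestrator") -> str:
    """Pick the route whose keywords match the task most often."""
    text = task.lower()
    scores = {name: sum(kw in text for kw in kws) for name, kws in ROUTES.items()}
    best = max(scores, key=scores.get)
    # No signal at all: hand the task back rather than guess.
    return best if scores[best] > 0 else default
```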

Adversarial Review

One agent writes. Another red-teams. The arena equivalent: bots exploit each other's weaknesses. Apply the same pattern to your CI/CD.

Iterative Refinement

Agent loops: plan, execute, observe, adjust. Each iteration sharpens the output. Set a completion promise. Exit when done.
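The loop and its exit condition can be sketched generically. Here step stands for one plan-execute-observe-adjust pass and is_done for the completion promise; both are caller-supplied, and the iteration cap of 10 is an assumed safety limit.

```python
def refine(state, step, is_done, max_iterations: int = 10):
    """Apply step until is_done(state) holds; return (state, iterations_used).

    Raises RuntimeError if the completion promise is never met, so a stuck
    agent cannot loop forever.
    """
    for iteration in range(1, max_iterations + 1):
        state = step(state)
        if is_done(state):
            return state, iteration
    raise RuntimeError(f"not done after {max_iterations} iterations")
```

The hard cap matters as much as the exit test: without it, an agent that never satisfies its promise burns tokens indefinitely.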

Stress Test

Why the Arena Beats Unit Tests

Competitive AI battle arenas where agents don't just assist - they fight. The most adversarial testing ground for multi-agent coordination.

What the Arena Teaches

• Agents WILL exploit any unvalidated input
• Stat manipulation is the first thing they try
• IP clustering reveals coordination attacks
• Elo systems need tamper-proof backends

What Builders Should Steal

• Treat every agent output as untrusted input
• Validate at the boundary, not in the agent
• Log everything - you'll need the audit trail
• Assume adversarial conditions in production

Takeaways

From Arena to Your Workflow

The arena is just a more demanding version of your production environment. These patterns transfer directly.

1. Don't Trust Single Agents

Multi-model consensus for critical decisions. One agent can hallucinate. Four agreeing is a signal. Four disagreeing is a bigger signal.

2. Isolate Everything

Separate worktrees. Scoped permissions. No ambient authority. The blast radius of a misbehaving agent should be zero.

3. Coordination > Intelligence

Four mediocre but well-coordinated agents beat one brilliant agent working alone. The orchestration layer is the product.

4. Security Is a Loop

Detect, triage, patch, harden, repeat. Not a checklist. A continuous cycle. The arena taught us this the hard way.

Under the Hood

The AWA Stack

Orchestration
Claude Code CLI
Overstory agents
Trident multi-model

→

Coordination
MCP protocol
Mail-based messaging
Task dependency DAGs

→

Execution
Git worktrees
Sandboxed processes
Merge + deploy gates

Models in the Fleet

Claude Opus 4.6 - orchestration

Codex (GPT-5.4) - code gen

Gemini 3.1 Pro - research

Grok 4.20 - reasoning

4 models · 0 API credits for 3 of them · subscription-based multi-model at zero marginal cost

OpenClaw

Six Arenas. One Stack.

Each arena stresses a different coordination shape. Same primitives. Different rules of the game.

ClawFC

AI Football League. OpenClaw agents playing football autonomously in the browser. Not a simulation - full LLM-backed decision making on the pitch.

clawfc.ai

CodeClash

OpenClaw bots compete in this Stanford/Princeton benchmark. 6 sub-arenas, including BattleSnake, Poker, CoreWar, and Robocode. Agents iterate code across 15 rounds.

codeclash.ai

Agent League

OpenClaw's competition hub. Tron, Poker, Chess. 150ms decision deadline per tick. Elo matchmaking. Python SDK. Free to compete.

openclawagentleague.com
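A 150ms per-tick deadline means a bot needs a safe default move when its model is slow. The sketch below uses plain asyncio and is not the Agent League SDK; decide_with_deadline, the state argument, and the fallback move are hypothetical.

```python
import asyncio

TICK_DEADLINE_S = 0.150  # the 150ms per-tick deadline


async def decide_with_deadline(decide, state, fallback_move,
                               deadline_s: float = TICK_DEADLINE_S):
    """Await decide(state), but answer with fallback_move if it is too slow."""
    try:
        return await asyncio.wait_for(decide(state), timeout=deadline_s)
    except asyncio.TimeoutError:
        # wait_for cancels the slow decision; play the safe default instead.
        return fallback_move
```

In practice the budget should be set below 150ms to leave room for network latency between the bot and the arena.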

ClawGames

Agents BUILD the games, humans play them. AI bots generate complete HTML5 canvas games. Human ratings rank the agents.

clawgames.io

Claw Clash

Prediction market for AI agents. $10K virtual money. Agents make autonomous sports predictions. "Built for agents, by agents."

clawclash.xyz

Battle Arena

Our arena. 100x100 grid battle royale. WebSocket API. 7 weapon classes. 16 bots per round. Auto stat validation. No signup required.

arena.angel-serv.com

May 2026

Where Agents Play and Pay

Two industries are forming at the same time. Both need the same primitives: identity, isolation, verification, audit.

Agents Play

• Claude Opus 4.7 beats Pokémon Red · vision finally cracked Victory Road (this month)
• Project Sid · 1,000 Claude agents amend constitutions and spread religion via bribery in Minecraft
• DeepMind Game Arena · Werewolf, poker, and chess on Kaggle (Feb 26)
• SIMA 2 · Gemini playing 3D games, 62% on novel tasks (up from 31%)
• SpaceMolt · space MMO, humans banned, 291 agents, all-Claude codebase

Agents Pay

• AWS AgentCore Payments · agents transacting in USDC, 200ms (May 7)
• Coinbase x402 + Google AP2 · HTTP 402 agent payments protocol
• Anthropic + FIS · 12% of global payments, AML agents (May 4)
• Anthropic + Blackstone + H&F + Goldman · $1.5B JV
• OpenAI + Plaid · 12,000 institutions in ChatGPT (May 15)

Games are where agents learn to compete. Payment rails are where they learn to cooperate. Same primitives: identity, isolation, audit. Different cargo.

Deep Dive

ClawFC meets CodeClash

ClawFC - Community-Built

• Created by @peterpiep on top of OpenClaw
• Full LLM agents making real-time football decisions
• Went viral: "You're about to watch AI agents play football. Not a simulation."
• Any LLM backend: Claude, GPT, Gemini, Ollama
• The spectacle IS the product - humans watch, agents play

CodeClash - Academic Rigor

• Stanford/Princeton/Cornell research (arXiv 2511.00839)
• 1,680 tournaments, 25,200 rounds, 50K trajectories
• Key finding: codebases get messier each iteration
• No model dominates all arenas - specialists emerge
• Top AI still loses every round vs expert humans

The pattern is the same everywhere: when you put agents in adversarial environments with real stakes, you learn things about coordination, security, and reliability that no unit test will ever reveal.

The Conductor

Meet nimbalyst

The orchestration layer for parallel AI coding fleets.

[Image: nimbalyst orchestration hero illustration]

• Spawn parallel sessions, each isolated in its own git worktree
• Meta-agent layer that plans, delegates, and monitors child sessions
• Multi-model swarm conversations across Claude, GPT, Gemini, Grok, KCS
• Kanban session board, cron automations, trackers, collaborative docs

Join the Game

Single agents are demos. Networked agents become real when they spawn, compete, and harden in arenas with stakes.

Scan to visit nimbalyst on GitHub

github.com/nimbalyst/nimbalyst

Thanks Vienna · #vibecodingvienna · @ademczuk

production note: every assertion in this deck survived this question
