New: Try the cost calculator and see your savings →

Harness engineering for agentic AI

One harness. Three ways to use it.

The moat is the harness, not the model. AnyForge is the production harness every AI engineering team ends up building themselves — policy, portable memory, verification gates, hash-chained audit — built once, governed, provider-neutral. Keep your own agent behind Control, code in our browser Code Studio, or ship multi-agent initiatives with Crew. Same factory underneath.

AnyForge Control · Code Studio · AnyForge Crew
Install Control → · Try Code Studio free → · Apply for Crew pilot →
3–10× cheaper
200+ models, one endpoint
portable memory across providers
hash-chained audit
one-click escalation to Crew
human approval gates

One conversation

Discover. Escalate. Execute.

Start in a single-agent thread. Escalate to a governed Crew when the scope grows. Stay on the same conversation. AnyForge Control runs underneath every step, so cost, audit, and memory stay unified.

01 / DISCOVER

Start in Code Studio, or your IDE.

Explore a question, sketch an ADR, iterate on a fix. Use our browser Code Studio, or keep Claude Code / Cursor / Cline / Continue and let Control govern them. Every token is already on the ledger.

> How should I split checkout?
Proposing a Crew for that…

02 / ESCALATE

One click to a governed Crew.

When scope outgrows a single agent — cross-service, compliance-touching, architect-level — Code Studio proposes a Crew with a full task breakdown. Start it inline. Same thread.

CODE STUDIO → CREW
◻ ADR: service boundaries · ARCH
◻ PaymentService scaffold · ENG
◻ Integration tests · QA
Start Crew →

03 / EXECUTE

Specialists run. You approve in-thread.

Architect, Engineer, QA, and Compliance take over under governance. See every agent message live. Approve HIL gates from Code Studio, from the operator console, or (where the product supports it) from your IDE.

ARCHITECT · PLAN APPROVAL
5 tasks · 2 services · PCI scope
Approve · Reject · Refine

AnyForge Control runs underneath every step. One cost ledger. One audit chain. One operator console.

The Products

One governance layer. Three consumption surfaces.

Control is the layer underneath. Code Studio and Crew are two ways to consume it natively. Pick whichever surface matches your team. You get the same cost ledger, the same audit chain, and the same portable memory everywhere.

The governance layer

AnyForge Control

Govern the agents you already use.

A governed LLM proxy. Point Claude Code, Cursor, Cline, Continue, or your own SDK at our endpoint. Zero workflow change. Immediate visibility.

  • Unified proxy: OpenAI, Anthropic, Gemini, OpenRouter
  • Self-hosted Ollama for air-gapped or on-prem inference
  • Per-call cost and latency analytics
  • Intent-based routing (rules, regex, or embedding)
  • Portable memory that survives provider switches
  • Hash-chained, signed audit log
  • Optional OPA policy (redaction, caps, allow-lists)
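
The intent-based routing bullet above can be pictured as a small rule table. This is a minimal sketch of the regex flavour only. The patterns, tier names, and the `route` function are hypothetical illustrations, not AnyForge's actual configuration or API, and embedding-based routing is not shown.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    pattern: re.Pattern  # regex matched against the prompt text
    tier: str            # hypothetical tier label, e.g. "cheap" or "premium"


# Hypothetical rules. The first match wins, so order matters.
RULES = [
    Rule(re.compile(r"\b(rename|typo|format|docstring)\b", re.I), "cheap"),
    Rule(re.compile(r"\b(architecture|threat model|migration)\b", re.I), "premium"),
]

DEFAULT_TIER = "standard"


def route(prompt: str) -> str:
    """Return the tier of the first rule whose pattern matches the prompt."""
    for rule in RULES:
        if rule.pattern.search(prompt):
            return rule.tier
    return DEFAULT_TIER
```

With these rules, a prompt about fixing a typo lands on the cheap tier and a prompt about a threat model lands on the premium tier. Anything unmatched falls through to the default.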
Single-agent · Browser

AnyForge Code Studio

Turnkey AI coding. Control built in.

Browser-based single-agent coding workspace. 200+ models via OpenRouter. Native GitHub integration. Escalate to a governed Crew in-thread when scope grows.

Powered by AnyForge Control

Governance, cost, and audit are native. Nothing separate to install.

  • 200+ AI models via OpenRouter
  • Native GitHub integration, main branch locked
  • Repo-grounded plans, failing tests first
  • Reverse-engineer any repo — C4 diagrams, threat model, tech debt registry
  • Escalate to Crew in one click, stay on the thread
  • Approve Crew HIL gates from inside Code Studio
  • BYOK keys, $0.30 / 1M tokens platform fee
Multi-agent · Governed

AnyForge Crew

One Engineer. Specialists on demand. Operator-gated.

A single capable Engineer agent drives each crew, consulting Architect, Security, QA, Compliance, and Oracle specialists on demand. At PR time the specialists fan out and review the diff in parallel before one mandatory operator approval. Turn it on per project when a workflow deserves the full treatment.

Powered by AnyForge Control

Governance, cost, and audit are native. Nothing separate to install.

  • One Engineer agent + 5 specialist advisors (Architect, Security, QA, Compliance, Oracle)
  • Engineer reads the repo, applies patches with deterministic git apply, runs your real tests, opens a PR
  • Specialists run on-demand mid-run and fan out at PR time for parallel review
  • One mandatory HIL gate at PR review — approve, reject, or amend (re-enters Engineer)
  • Budget enforcement with auto-pause thresholds
  • Cryptographic audit trail of every specialist call, tool call, and operator decision
  • Runs on self-hosted Ollama for regulated or air-gapped customers
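
The PR-time fan-out described above can be sketched with `asyncio.gather`, assuming each specialist review is an independent async call. The `review_diff` stand-in, the `Review` record, and the toy "password" check are hypothetical. A real specialist would call a model through the governed proxy.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass


@dataclass
class Review:
    specialist: str
    approved: bool
    note: str


async def review_diff(specialist: str, diff: str) -> Review:
    """Stand-in for a specialist call; a real one would invoke a model."""
    await asyncio.sleep(0)  # yield control, simulating I/O
    flagged = specialist == "Security" and "password" in diff
    return Review(specialist, not flagged, "hard-coded secret?" if flagged else "ok")


async def fan_out(diff: str, specialists: list[str]) -> list[Review]:
    """Run every specialist review concurrently and collect results in order."""
    return await asyncio.gather(*(review_diff(s, diff) for s in specialists))


def operator_gate(reviews: list[Review], operator_decision: str) -> str:
    """The human decision is always required; reviews only inform it."""
    if operator_decision not in {"approve", "reject", "amend"}:
        raise ValueError("decision must be approve, reject, or amend")
    return operator_decision


SPECIALISTS = ["Architect", "Security", "QA", "Compliance", "Oracle"]
reviews = asyncio.run(fan_out('password = "hunter2"', SPECIALISTS))
```

The reviews never merge the PR on their own. They are inputs to the single operator decision, which mirrors the one mandatory gate in the list above.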

Code Studio · Codebase Intelligence

Reverse-engineer any codebase.

Connect a repository and Code Studio produces a full documentation suite in two passes — structural map, C4 architecture diagrams, security threat model, tech debt registry, and more. Every gap auto-seeds a specific DRAFT initiative you can launch into a governed Crew.

See how it works →

Phase 1 · 2–5 min

Structural Map

Gaps & Assumptions

Well-Architected Scan

Gap initiatives seeded

Phase 2 · 10–20 min

C4 Diagrams (L1/L2/L3)

Security Threat Model

Data Model + API Overview

Runbook + Tech Debt + Onboarding

One operator console. Every AI call cost-tracked, routed, and audited — whether it came from Code Studio, your IDE, or a Crew agent.

The Problem

🏥

They use the expensive AI for everything.

Like hiring a surgeon to put on a band-aid. Other AI tools send every task — simple or complex — to their most expensive model. You're overpaying for work a cheaper model handles just fine.

🚨

When they go down, you stop working.

Anthropic, OpenAI, and Google all have outages — regularly. If your team is locked to one vendor, everyone sits idle until it's back. One provider failing shouldn't stop your entire organisation.

🧨

Silent failures accumulate.

Agents write code that passes review and quietly breaks an invariant three sprints later. Without a verification layer — policy gates, compliance loops, hash-chained audit — you don't catch drift. You react when it explodes.

📜

Your instructions go stale, nobody notices.

AGENTS.md, .cursorrules, CLAUDE.md — every team writes them, nobody keeps them current. Memory resets every session and nothing pins context to policy. Agents follow yesterday's rules. You find out at review time.

📋

Nobody knows what the AI actually did.

Your engineers use AI every day, but there's no record of what it changed, why, or whether anyone approved it. When a regulator or client asks, you don't have an answer. You have a problem.

🔒

Once you're in, there's no way out.

Every major AI vendor bundles their models, memory, and tools together. Build on their stack and switching means rebuilding from scratch. It's the cloud lock-in playbook all over again — but moving faster.

“The AI providers don't sell you a model — they sell you a dependency. The token bill is just the interest payment.”

— Edwin Poot, Founder

The Harness

Your AI factory in seven layers.

Every AI engineering team ends up building these seven layers themselves — usually as duct tape, one incident at a time. AnyForge ships them as one product. Pick any entry point and you inherit the whole harness underneath.

Models commoditise. Harnesses compound.

01

Intent capture

What is the team actually trying to do?

Every request — from the chat box in Code Studio, from Claude Code in the terminal, from a raw SDK call — is logged as a typed goal. Who asked, what for, and what it cost. No more guessing what your team spent on AI last month.

Control · Code Studio
02

Issue framing

What is the real work hiding inside the request?

Crew's Architect turns an intent into an ADR, breaks it into tasks, and routes each task to the right role. The plan is the first thing a human approves — before any code is written.

Crew
03

Context & instruction

How does the model know what's true here?

Policy-scoped memory (Zep, bi-temporal) carries context across providers. Switch Claude for GPT and the memory comes with you. Instruction files live in the repo, stay version-controlled, and never silently drift.

Control
04

Execution

Which model should run this task, at what price?

The governed LLM proxy routes by intent, fails over across providers, and charges wholesale. Simple tasks go to cheap models. Hard tasks go to expensive ones. You see every call — what it cost, what it returned, where it went.

Control
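
The failover behaviour in the execution layer can be illustrated with an ordered list of providers. This is a sketch under assumptions: `ProviderError`, `call_with_failover`, and the two toy providers are hypothetical names, and a real proxy would also handle timeouts, retries, and health checks.

```python
class ProviderError(Exception):
    """Raised by a provider callable when the upstream is unavailable."""


def call_with_failover(prompt, providers):
    """Try each (name, callable) in order and return (name, reply) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            errors.append(f"{name}: {exc}")
    raise ProviderError("all providers failed: " + "; ".join(errors))


def flaky(prompt):
    """Toy provider that is always down."""
    raise ProviderError("503 from upstream")


def healthy(prompt):
    """Toy provider that echoes the prompt."""
    return f"echo: {prompt}"


used, reply = call_with_failover("ping", [("primary", flaky), ("secondary", healthy)])
```

Returning the provider name alongside the reply is a deliberate choice here: the caller can record which upstream actually served the request.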
05

Verification

How do we catch the silent failures before they ship?

Two mandatory HIL gates (plan approval + PR review), a compliance loop that runs before and after implementation, and a QA loop with a bounded retry budget. Agents cannot skip verification nodes — the graph topology forbids it.

Crew
06

Isolation & permissions

What can this agent actually touch?

Role-locked agents with hard-bound capabilities. OPA policy decisions on every tool call. Per-tool MCP servers for filesystem, shell, deploy, and db — each one gated. Your agents run with the permissions you grant them, not the permissions the vendor ships.

Crew · Control
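
Role-locked capabilities amount to a deny-by-default check before every tool call. The sketch below is a plain-Python stand-in for that idea, not OPA itself (OPA policies are written in Rego). The role names, tool identifiers, and helper functions are hypothetical.

```python
# Hypothetical role -> allowed tools mapping. Anything not listed is denied.
ROLE_TOOLS = {
    "architect": {"filesystem.read"},
    "engineer": {"filesystem.read", "filesystem.write", "shell.run"},
    "qa": {"filesystem.read", "shell.run"},
}


def is_allowed(role: str, tool: str) -> bool:
    """Deny by default: unknown roles and unlisted tools are rejected."""
    return tool in ROLE_TOOLS.get(role, set())


def call_tool(role: str, tool: str, action):
    """Run the action only if the role is permitted to use the tool."""
    if not is_allowed(role, tool):
        raise PermissionError(f"{role} may not call {tool}")
    return action()
```

For example, `call_tool("architect", "shell.run", ...)` raises `PermissionError`, while the same call with the `engineer` role runs the action.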
07

Feedback loops

What happened, and what should we change?

Hash-chained audit of every call, every approval, every policy decision. Per-agent and per-task spend analytics. Drift detection that surfaces stale context and silent regressions before your CFO — or your auditor — surfaces them.

Control
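
A hash-chained audit log, as named in layer 07, links each entry to the hash of the one before it, so editing any past entry breaks every later link. This is a minimal sketch of the general technique using SHA-256. The entry layout and function names are assumptions, and signing is left out.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def _digest(prev_hash: str, event: dict) -> str:
    """Hash the previous hash together with a canonical JSON form of the event."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(chain: list, event: dict) -> None:
    """Append an event, linking it to the hash of the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"event": event, "prev": prev_hash, "hash": _digest(prev_hash, event)})


def verify(chain: list) -> bool:
    """Recompute every link; any edited or reordered entry makes this False."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append(log, {"actor": "engineer", "action": "open_pr"})
append(log, {"actor": "operator", "action": "approve"})
```

Calling `verify(log)` returns `True` for the untouched log. Changing any field of an earlier event makes it return `False`.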

Vibe coding gets you to a prototype. A harness gets you to production.

Every AI coding tool on the market today ships layer 4 (execution) and calls it a product. Helicone, Portkey, Langfuse add layer 7 (feedback). LangGraph and CrewAI give you bits of 2 and 4 if you build the rest yourself. AnyForge is the only place all seven layers ship together — as one product, with one console, one audit chain, one cost ledger.

Why Not Just Use Their Agents?

Same code quality. Fraction of the cost.

GitHub Copilot, Cursor, Devin, Amazon Q, and Vertex AI collectively cover 0–1 of 9 enterprise governance capabilities. The gap is structural — not a feature they're planning to ship.

Claude Code and Codex are excellent single-agent tools. But they charge you for every token in their autonomous loop — and you have zero control over how many rounds they run. AnyForge runs the same multi-turn tool loop with your budget controls.

Feature | Claude Code | Codex | AnyForge
Agent architecture | Single agent | Single agent | Multi-agent crew (4 specialized roles)
Human approval gates | None | None | 2 mandatory gates (plan + PR)
Compliance enforcement | None | None | Automated (pre + post implementation)
Audit trail | Session logs | Session logs | Cryptographic, exportable, immutable
LLM provider | Anthropic only | OpenAI only | Any (BYOK: Anthropic, OpenAI, Google, etc.)
Cost per feature task | $5–15 | $5–15 | $1–2
Budget enforcement | Token limit only | Token limit only | Auto-pause at 80%/100% + incident queue
Governance docs in repo | No | No | Yes: ADR, diagrams, audit reports, approval records
Vendor outage impact | Team idle | Team idle | Auto-failover in milliseconds
Who controls the loop? | The provider | The provider | You

Other tools lock you to one vendor, charge premium prices for every task, and leave your team idle when that vendor goes down. AnyForge gives you every AI model, full control, and no strings attached.
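
The "Auto-pause at 80%/100%" budget row can be read as two thresholds on spend against a limit. The sketch below assumes the 80% mark triggers a soft pause that an operator can lift and the 100% mark triggers a hard stop. That split, the status names, and the `Budget` class are assumptions for illustration only.

```python
from dataclasses import dataclass

SOFT_THRESHOLD = 0.80  # assumed: pause and wait for an operator
HARD_THRESHOLD = 1.00  # assumed: stop until the budget is raised


@dataclass
class Budget:
    limit_usd: float
    spent_usd: float = 0.0

    def record(self, cost_usd: float) -> str:
        """Add a call's cost and return the resulting status."""
        self.spent_usd += cost_usd
        return self.status()

    def status(self) -> str:
        """Map the spent fraction of the limit onto a run status."""
        used = self.spent_usd / self.limit_usd
        if used >= HARD_THRESHOLD:
            return "paused_hard"
        if used >= SOFT_THRESHOLD:
            return "paused_soft"
        return "running"
```

A run with a $10 limit stays `running` at $5 spent, moves to `paused_soft` at $8.50, and to `paused_hard` once spend reaches $10 or more.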

9 governance capabilities. The nearest rival checks 2.

Enterprise AI governance requires more than session logs. Here is how every major platform stacks up across the nine capabilities that regulated industries actually need.

Capability | GH Copilot | Cursor | Devin | Amazon Q | Vertex AI | AnyForge
Multi-agent teams~~~
Governance & audit trail~
Role-locked agents
Human-in-Loop approvals~~
Cost visibility per agent~~
200+ model access~~
Smart Routing (auto fast/smart)~
Enterprise compliance~~
Enforced compliance memory~

The Platform

Three products. Four core capabilities.

Control
SDK-compatible proxy

Governed LLM Endpoint

One endpoint in front of OpenAI, Anthropic, Gemini, OpenRouter, and self-hosted (Ollama / vLLM). Sign up at anyforge.ai (free, 100M-token trial allowance on signup), add your provider keys at /settings/secrets, then run `npx anyforge init` — it detects Claude Code, Cursor, Cline, Continue, or a raw SDK client and rewrites the config. Your agents keep working. Per-call cost, routing, portable memory, and a hash-chained audit log show up at crew.anyforge.ai instantly. Also native inside Code Studio and Crew — nothing separate to install for those surfaces.

Code Studio
Browser-native · Any device

Single-agent AI coding

Browser-based single-agent coding workspace. 200+ AI models via OpenRouter, repo-grounded plans, quality-first testing, native GitHub integration with main-branch locking by default. Governance is native — every call flows through Control. Escalate to a governed Crew in-thread when scope grows.

Crew
LangGraph state machine

Governed Agent Teams

Architect, Engineer, QA, Compliance & Security — each with a mandate and hard constraints enforced at the orchestration layer. An AI-native task manager structures the backlog. Every workflow moves through a governance state machine: agents cannot skip nodes. The HIL approval node is non-negotiable.

Crew
Token intelligence + analytics

Visibility & Control

Full token spend visibility by agent, task, and team — with actionable recommendations. Approval flows with cryptographic proof at every node. Immutable audit trails for every decision. Analytics & reporting across all teams. Dynamic LLM routing picks the most cost-effective model for each task automatically.

The Architecture

Governance on top. Providers swappable underneath.

The layers we own stay constant. The execution layer is a pluggable contract — any adapter, same governance, same audit log, same cost ledger.

01

Governance Layer

Role-bounded access controls · Compliance-aware state machine · Cryptographic audit logging · Per-company / per-crew / per-role token spend · Approval workflows & deployment policy

AnyForge
02

Orchestration Layer

Portable crew definitions (stored in AnyForge, not in any provider) · Task routing & model selection · BYOK — bring your own API key for any supported provider; keys live in Google Secret Manager, never in our database · Cost rules & budget enforcement

AnyForge
03

Execution Adapter Interface

An AnyForge-defined contract. Each adapter implements the same interface — we route traffic to whichever one fits the task.

Pluggable
OpenRouter
200+ models — Claude, GPT-4o, Gemini, OSS
Anthropic
Claude models — direct API access
OpenAI
GPT models — direct API access
AWS Bedrock
AWS procurement · data residency
Google Vertex AI
Gemini models · EU data residency · GCP-aligned
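
The adapter contract can be pictured as a single interface that every provider integration implements, with governance wrapped around the call rather than inside it. This is a sketch only: `ExecutionAdapter`, `EchoAdapter`, and `governed_call` are hypothetical names, and the real contract is not published in this page.

```python
from typing import Protocol


class ExecutionAdapter(Protocol):
    """Hypothetical contract: every adapter exposes a name and a complete() call."""

    name: str

    def complete(self, model: str, prompt: str) -> str: ...


class EchoAdapter:
    """Toy adapter used in place of a real provider client."""

    def __init__(self, name: str) -> None:
        self.name = name

    def complete(self, model: str, prompt: str) -> str:
        return f"[{self.name}/{model}] {prompt}"


def governed_call(adapter: ExecutionAdapter, model: str, prompt: str, audit: list) -> str:
    """Call any adapter through the same path so audit entries look identical."""
    reply = adapter.complete(model, prompt)
    audit.append({"adapter": adapter.name, "model": model, "prompt_chars": len(prompt)})
    return reply
```

Swapping `EchoAdapter("bedrock")` for `EchoAdapter("vertex")` changes the reply prefix but not the shape of the audit entry, which is the point of keeping governance above the contract line.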

Why provider independence matters for a PE portfolio. A single portfolio rarely sits on a single cloud. One company is on AWS with data-residency rules, another is all-in on Google Cloud, a third standardised on Anthropic for cost reasons. A governance platform that hard-codes a provider forces the portfolio to re-platform, or forces the fund to run parallel toolchains with parallel audit trails. Neither is acceptable. The adapter contract means one governance spine serves every company on whatever execution stack they already pay for.

What AnyForge owns vs. what execution providers own. AnyForge owns the parts that determine whether a regulator, board, or LP can trust the output: access controls, the state machine, audit logs, token spend, approval policy. Execution providers own the parts that are commodities or specialisations: sandboxing, tool use, streaming, model access. The contract line is deliberate. Everything that could reasonably be swapped out within two years lives below it. Everything that would cost months of policy re-derivation to replace lives above it.

The Anthropic Managed Agents relationship. Anthropic’s Managed Agents product (launched April 2026) is a useful execution adapter for Claude-specific tasks: it handles sandboxing, session state, and tool routing well. AnyForge uses it where it fits. But Managed Agents is Claude-only by design, with no portfolio governance layer and no cross-provider audit trail. That is precisely the surface area AnyForge owns. When the question is “does the fund adopt Managed Agents or AnyForge,” the answer is both: Managed Agents underneath AnyForge, not instead of it.

How it works

From backlog to governed production.

From requirements to merged PR — governed at every step.

Work Queue — What Your Team Sees

BRD / Backlog

Task enters the queue

Refine

AI Architect + Operator iterate

Ready

Backlog approved, crew on call

In Crew

Engineer drives, specialists on demand, you gate the PR

Done

PR merged, CI/CD deploys

Inside Every “In Crew” Run — The Governed Agent Workflow

Crew
01

Requirements submitted

Engineer or PM describes the task with requirements, BRD, or user stories. Optionally links a GitHub issue. Sets budget limit.

Crew
02

Architect explores & plans

Software Architect agent explores your codebase using 15 tool rounds — reads files, searches code, understands architecture. Produces an ADR with Mermaid diagrams and implementation plan.

Crew
03

Compliance gate (automated)

Compliance & Security agent checks the plan against your enforced constraints. Violations loop back to the Architect. No human cost — fully automated.

Crew
Non-negotiable HIL
04

Plan Approval (human)

You review the ADR, architecture diagrams, and compliance report. Approve to start implementation, or reject with comments. Cryptographically signed.

Crew
05

Engineer builds & tests

Full Stack Engineer agent gets 40 tool rounds — reads code, writes files, runs tests, iterates until passing, commits, and creates a PR. Real code, not text output.

Crew
Non-negotiable HIL
06

PR Review & Merge (human)

You review the diff in AnyForge, chat with the Engineer agent about decisions, and approve. PR is merged via GitHub API. Your CI/CD pipeline handles deployment.

LangGraph state machine — agents cannot skip governance gates. Every decision cryptographically logged.
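
The "cannot skip governance gates" property comes from the transition table itself: if no edge leads from planning straight to building, no agent can take it. The sketch below shows that idea in plain Python rather than LangGraph. The state names, and the assumption that a rejected plan returns to the Architect, are illustrative.

```python
# Allowed transitions for the six-step workflow above (state names are hypothetical).
TRANSITIONS = {
    "requirements": {"plan"},
    "plan": {"compliance_check"},
    "compliance_check": {"plan", "plan_approval"},  # violations loop back to the Architect
    "plan_approval": {"plan", "build"},             # human gate: reject or approve
    "build": {"pr_review"},
    "pr_review": {"build", "done"},                 # human gate: amend or approve
    "done": set(),
}


class Workflow:
    def __init__(self) -> None:
        self.state = "requirements"
        self.history = [self.state]

    def advance(self, target: str) -> None:
        """Move to target only if the table has an edge from the current state."""
        if target not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {target}")
        self.state = target
        self.history.append(target)
```

Trying `advance("build")` while the workflow is still in `plan` raises `ValueError`, because the only way to reach `build` is through `compliance_check` and `plan_approval`.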

Pricing

Token-native, not token-taxed.

Two distinct revenue lines. Free to start with a 100M-token trial allowance on signup; pay $0.30 per 1M tokens after that — provider-agnostic, the same rate for cloud BYOK and self-hosted, covering the full harness: routing, portable memory, hash-chained audit, policy. Crew Managed Service is sold separately as a managed dev pod. The trial is the acquisition onramp. Crew contracts stack on top.

Self-serve

Platform Fee

100M tokens free. Then $0.30 / 1M tokens.

Signup trial allowance: 100M tokens
Platform fee on governed tokens: $0.30 / 1M
Code Studio + Control: Unlimited
First Crew (one per company): 20K-token trial
BYOK model keys: Your contract

The 100M-token trial covers roughly 1–2 weeks of active dev use before any AnyForge bill. After that, $0.30 per 1M tokens covers routing, portable memory, hash-chained audit, policy, and MCP tools — provider-agnostic, the same rate whether you use cloud LLMs or self-host. Smart routing and caching typically save customers more on wasted model spend than the platform fee costs.
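
The fee arithmetic above is easy to check by hand or in a few lines. This sketch assumes the 100M-token allowance is a one-time credit applied against cumulative governed tokens. The function name is hypothetical, and model spend under your own provider contracts is not included.

```python
FREE_TOKENS = 100_000_000   # signup trial allowance
FEE_PER_MILLION_USD = 0.30  # platform fee on governed tokens


def platform_fee_usd(total_tokens: int) -> float:
    """Platform fee for a cumulative token count, after the one-time free allowance."""
    billable = max(0, total_tokens - FREE_TOKENS)
    return round(billable / 1_000_000 * FEE_PER_MILLION_USD, 2)
```

For example, 250M cumulative tokens leaves 150M billable, so the platform fee is 150 × $0.30 = $45.00. Anything at or under 100M tokens costs $0.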

Enterprise

Annual prepay

Same $0.30 / 1M, procurement-ready.

Same platform fee: $0.30 / 1M
Billing: Annual prepay (USD)
Compliance: SOC 2 · SSO · SLA
Self-host / regulated deploy: Custom

For procurement gates that need an annual contract. Same unit economics as Self-serve — no second platform price.

Crew

Crew Managed Service

Sold separately. Priced like a managed dev pod.

Teams (5–20 engineers): $2K–$5K/mo
Enterprise (50+ engineers): $5K–$15K/mo
PE portfolio (fund-wide): Custom
Reseller / staffing partnerships: Rev-share

Priced on offshore-engineer equivalence: $3K/mo replaces 2–3 offshore developers (~$5–9K/mo fully loaded). Token flow generated by the contract still accrues the $0.30 / 1M platform fee — dual revenue per customer.

$0.30 per 1M tokens is a platform fee, not a markup on model access.

Your API keys, your provider contracts, your wholesale rates. Model spend and AnyForge platform fee show up as separate line items in every invoice. Most teams save materially more on wasted tokens through intent-based routing than the fee itself costs. AnyForge has no seats and no daily caps — as AI efficiency rises and the rest of the market's seat-based tools shrink, token flow through the harness keeps compounding. The revenue line is inverse to seat-decline risk.

For Every Scale

Solo engineer. Growing team. Global enterprise. PE portfolio.

AnyForge scales with you without changing platforms. Pick the surface that matches your team today — Control in front of existing IDEs, Code Studio for turnkey browser coding, or Crew for governed multi-agent delivery. Expand to enterprise-grade compliance and audit trails across your portfolio as you grow.

For PE funds: install Crew once and inherit a unified governance layer across your entire portfolio — one security posture, one approval workflow, one place to look when regulators come knocking.

Individual

Solo engineer

Pick a surface. Install Control and govern your existing Claude Code / Cursor / Cline. Or start in Code Studio in the browser — free for everyone, 200+ models, Control native. BYOK on both.

Try Code Studio →

Teams

Growing team

Add Crew when a workflow deserves it. AI-native task manager, governed agent teams, shared token visibility, and approval flows. Code Studio and your external IDEs all share one cost ledger and one audit chain.

Enterprise

Large organisation

Full governance at scale. Cryptographic audit trails, multi-step compliance workflows, analytics & reporting across all teams, and dynamic LLM routing with a human-in-loop at every critical node.

PE Portfolio

Fund & portfolio

One governed platform deployed across every portfolio company. One security posture. One approval workflow. One place to look when regulators come knocking — or when something goes wrong.

1

platform across every scale — solo to PE portfolio

immutable audit trail per team and per company

0

custom infrastructure to build or maintain

Founder — AnyForge Crew

Edwin Poot

Edwin has spent a decade operating at the intersection of financial services and engineering governance. As a CTO and technical advisor to PE-backed fintech companies, he watched the same problem emerge repeatedly: AI adoption without accountability structures — and the compliance failures that follow.

AnyForge Crew is his answer: a sovereign engineering platform built from first principles around governance, auditability, and the human oversight that regulators will eventually require from everyone shipping AI in regulated environments.

CTO Insights on Medium → · LinkedIn →

Bring your own IDE · 5-min install

Install AnyForge Control

Sign up, add your provider keys, then `npx anyforge init` — your existing Claude Code / Cursor / Cline keeps working. Free to start with a 100M-token trial allowance on signup.

Install Control →

Browser · Free to start

Try AnyForge Code Studio

Browser-based AI coding, 200+ models, native GitHub. Control is built in. Free for everyone, no card, no daily caps.

Try Code Studio →

Multi-agent · Pilot

Apply for AnyForge Crew

Role-based agents (Architect, Engineer, QA, SRE/Compliance) with HIL gates. Pilot contracts only — design-partner intake below.

Apply for pilot →
or go further

Apply to Be a Design Partner

We're looking for 1–3 Design Partners.

Design Partners shape the product alongside us. You get free Crew access, dedicated onboarding with the founding team, direct input on the product roadmap, and preferred pricing at GA launch. We get a co-builder operating in a real environment.

Ideal profile: engineering team of 3–50 people, active GitHub repository, compliance or audit requirements, PE-backed or regulated industry.