Harness engineering for agentic AI
One harness. Three ways to use it.
The moat is the harness, not the model. AnyForge is the production harness every AI engineering team ends up building themselves — policy, portable memory, verification gates, hash-chained audit — built once, governed, provider-neutral. Keep your own agent behind Control, code in our browser Code Studio, or ship multi-agent initiatives with Crew. Same factory underneath.
One conversation
Discover. Escalate. Execute.
Start in a single-agent thread. Escalate to a governed Crew when the scope grows. Stay on the same conversation. AnyForge Control runs underneath every step, so cost, audit, and memory stay unified.
01 / DISCOVER
Start in Code Studio, or your IDE.
Explore a question, sketch an ADR, iterate on a fix. Use our browser Code Studio, or keep Claude Code / Cursor / Cline / Continue and let Control govern them. Every token is already on the ledger.
02 / ESCALATE
One click to a governed Crew.
When scope outgrows a single agent — cross-service, compliance-touching, architect-level — Code Studio proposes a Crew with a full task breakdown. Start it inline. Same thread.
03 / EXECUTE
Specialists run. You approve in-thread.
Architect, Engineer, QA, and Compliance take over under governance. See every agent message live. Approve HIL gates from Code Studio, from the operator console, or (where the product supports it) from your IDE.
AnyForge Control runs underneath every step. One cost ledger. One audit chain. One operator console.
The Products
One governance layer. Three consumption surfaces.
Control is the layer underneath. Code Studio and Crew are two ways to consume it natively. Pick whichever surface matches your team. You get the same cost ledger, the same audit chain, and the same portable memory everywhere.
AnyForge Control
Govern the agents you already use.
A governed LLM proxy. Point Claude Code, Cursor, Cline, Continue, or your own SDK at our endpoint. Zero workflow change. Immediate visibility.
- Unified proxy: OpenAI, Anthropic, Gemini, OpenRouter
- Self-hosted Ollama for air-gapped or on-prem inference
- Per-call cost and latency analytics
- Intent-based routing (rules, regex, or embedding)
- Portable memory that survives provider switches
- Hash-chained, signed audit log
- Optional OPA policy (redaction, caps, allow-lists)
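To make the "hash-chained audit log" bullet concrete, here is a minimal sketch of the general technique, not AnyForge's actual implementation. The field names (`prev`, `payload`, `hash`), the all-zero genesis value, and the sample events are assumptions for illustration, and signing is left out entirely.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with this entry's payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain: list, payload: dict) -> None:
    """Append an entry that commits to everything before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"prev": prev_hash, "payload": payload,
                  "hash": entry_hash(prev_hash, payload)})


def verify(chain: list) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append(log, {"actor": "engineer-agent", "event": "tool_call", "tool": "git_apply"})
append(log, {"actor": "operator", "event": "approval", "decision": "approve"})
intact = verify(log)

log[0]["payload"]["tool"] = "shell"  # simulate after-the-fact tampering
tampered_ok = verify(log)
```

Because each entry's hash covers the previous hash, changing one old record invalidates every record after it, which is what makes the log tamper-evident.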
AnyForge Code Studio
Turnkey AI coding. Control built in.
Browser-based single-agent coding workspace. 200+ models via OpenRouter. Native GitHub integration. Escalate to a governed Crew in-thread when scope grows.
Powered by AnyForge Control
Governance, cost, and audit are native. Nothing separate to install.
- 200+ AI models via OpenRouter
- Native GitHub integration, main branch locked
- Repo-grounded plans, failing tests first
- Reverse-engineer any repo: C4 diagrams, threat model, tech debt registry
- Escalate to Crew in one click, stay on the thread
- Approve Crew HIL gates from inside Code Studio
- BYOK keys, $0.30 / 1M tokens platform fee
AnyForge Crew
One Engineer. Specialists on demand. Operator-gated.
A single capable Engineer agent drives each crew, consulting Architect, Security, QA, Compliance, and Oracle specialists on demand. At PR time the specialists fan out and review the diff in parallel before one mandatory operator approval. Turn on per project when a workflow deserves the full treatment.
Powered by AnyForge Control
Governance, cost, and audit are native. Nothing separate to install.
- One Engineer agent + 5 specialist advisors (Architect, Security, QA, Compliance, Oracle)
- Engineer reads the repo, applies patches with deterministic git apply, runs your real tests, opens a PR
- Specialists run on-demand mid-run and fan out at PR time for parallel review
- One mandatory HIL gate at PR review: approve, reject, or amend (re-enters Engineer)
- Budget enforcement with auto-pause thresholds
- Cryptographic audit trail of every specialist call, tool call, and operator decision
- Runs on self-hosted Ollama for regulated or air-gapped customers
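As a rough illustration of "budget enforcement with auto-pause thresholds", the sketch below tracks spend against a limit and reports when a run should pause. The `Budget` class, the 80% warning ratio, and the dollar figures are hypothetical, chosen only to show the idea.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    limit_usd: float
    warn_ratio: float = 0.8  # assumed warning threshold, not a documented default
    spent_usd: float = 0.0

    def record(self, cost_usd: float) -> str:
        """Add a call's cost and return the resulting run state."""
        self.spent_usd += cost_usd
        if self.spent_usd >= self.limit_usd:
            return "pause"  # stop issuing calls until an operator raises the limit
        if self.spent_usd >= self.limit_usd * self.warn_ratio:
            return "warn"
        return "ok"


budget = Budget(limit_usd=10.0)
states = [budget.record(cost) for cost in (3.0, 5.5, 2.0)]
```

The orchestrator would check the returned state after every model call and stop scheduling work once it sees `"pause"`.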
Code Studio · Codebase Intelligence
Reverse-engineer any codebase.
Connect a repository and Code Studio produces a full documentation suite in two passes — structural map, C4 architecture diagrams, security threat model, tech debt registry, and more. Every gap auto-seeds a specific DRAFT initiative you can launch into a governed Crew.
See how it works →
Phase 1 · 2–5 min
Structural Map
Gaps & Assumptions
Well-Architected Scan
Gap initiatives seeded
Phase 2 · 10–20 min
C4 Diagrams (L1/L2/L3)
Security Threat Model
Data Model + API Overview
Runbook + Tech Debt + Onboarding
One operator console. Every AI call cost-tracked, routed, and audited — whether it came from Code Studio, your IDE, or a Crew agent.
The Problem
They use the expensive AI for everything.
Like hiring a surgeon to put on a band-aid. Other AI tools send every task — simple or complex — to their most expensive model. You're overpaying for work a cheaper model handles just fine.
When they go down, you stop working.
Anthropic, OpenAI, and Google all have outages — regularly. If your team is locked to one vendor, everyone sits idle until it's back. One provider failing shouldn't stop your entire organisation.
Silent failures accumulate.
Agents write code that passes review and quietly breaks an invariant three sprints later. Without a verification layer — policy gates, compliance loops, hash-chained audit — you don't catch drift. You react when it explodes.
Your instructions go stale, nobody notices.
AGENTS.md, .cursorrules, CLAUDE.md — every team writes them, nobody keeps them current. Memory resets every session and nothing pins context to policy. Agents follow yesterday's rules. You find out at review time.
Nobody knows what the AI actually did.
Your engineers use AI every day, but there's no record of what it changed, why, or whether anyone approved it. When a regulator or client asks, you don't have an answer. You have a problem.
Once you're in, there's no way out.
Every major AI vendor bundles their models, memory, and tools together. Build on their stack and switching means rebuilding from scratch. It's the cloud lock-in playbook all over again — but moving faster.
“The AI providers don't sell you a model — they sell you a dependency. The token bill is just the interest payment.”
The Harness
Your AI factory in seven layers.
Every AI engineering team ends up building these seven layers themselves — usually as duct tape, one incident at a time. AnyForge ships them as one product. Pick any entry point and you inherit the whole harness underneath.
Models commoditise. Harnesses compound.
Intent capture
What is the team actually trying to do?
Every request — from the chat box in Code Studio, from Claude Code in the terminal, from a raw SDK call — is logged as a typed goal. Who asked, what for, and what it cost. No more guessing what your team spent on AI last month.
Issue framing
What is the real work hiding inside the request?
Crew's Architect turns an intent into an ADR, breaks it into tasks, and routes each task to the right role. The plan is the first thing a human approves — before any code is written.
Context & instruction
How does the model know what's true here?
Policy-scoped memory (Zep, bi-temporal) carries context across providers. Switch Claude for GPT and the memory comes with you. Instruction files live in the repo, stay version-controlled, and never silently drift.
Execution
Which model should run this task, at what price?
The governed LLM proxy routes by intent, fails over across providers, and charges wholesale. Simple tasks go to cheap models. Hard tasks go to expensive ones. You see every call — what it cost, what it returned, where it went.
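A minimal sketch of rule-based intent routing with provider failover, under stated assumptions: the regex rules, the model names, and the `call_model` callback are all hypothetical placeholders, and a real router could equally use embeddings instead of regexes.

```python
import re

# Hypothetical rule table: each pattern maps to an ordered list of candidate models.
RULES = [
    (re.compile(r"\b(rename|typo|format|lint)\b", re.I),
     ["small-model-a", "small-model-b"]),
    (re.compile(r"\b(architecture|migrate|threat model|refactor)\b", re.I),
     ["large-model-a", "large-model-b"]),
]
DEFAULT_MODELS = ["mid-model-a", "large-model-a"]


def candidates(prompt: str) -> list:
    """Return the ordered model list for the first rule that matches."""
    for pattern, models in RULES:
        if pattern.search(prompt):
            return models
    return DEFAULT_MODELS


def route(prompt: str, call_model):
    """Try each candidate in order, failing over on connection errors."""
    failures = []
    for model in candidates(prompt):
        try:
            return model, call_model(model, prompt)
        except ConnectionError as exc:
            failures.append(f"{model}: {exc}")
    raise RuntimeError("all candidate models failed: " + "; ".join(failures))


def fake_call(model: str, prompt: str) -> str:
    """Stand-in for a provider client; simulates one provider being down."""
    if model == "small-model-a":
        raise ConnectionError("provider outage")
    return f"{model} handled: {prompt}"


chosen, reply = route("fix typo in README", fake_call)
```

Here the cheap first choice is unavailable, so the request falls through to the next cheap model rather than escalating to an expensive one.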
Verification
How do we catch the silent failures before they ship?
Two mandatory HIL gates (plan approval + PR review), a compliance loop that runs before and after implementation, and a QA loop with a bounded retry budget. Agents cannot skip verification nodes — the graph topology forbids it.
Isolation & permissions
What can this agent actually touch?
Role-locked agents with hard-bound capabilities. OPA policy decisions on every tool call. Per-tool MCP servers for filesystem, shell, deploy, and db — each one gated. Your agents run with the permissions you grant them, not the permissions the vendor ships.
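The idea of a policy decision on every tool call can be sketched as a simple allow-list check. This is plain Python rather than OPA or Rego, and the role names, tool names, and `call_tool` wrapper are assumptions made for the example.

```python
# Hypothetical allow-list: which tools each role may invoke.
ROLE_TOOLS = {
    "architect": {"filesystem.read"},
    "engineer": {"filesystem.read", "filesystem.write", "shell.run_tests"},
    "qa": {"filesystem.read", "shell.run_tests"},
}


def authorize(role: str, tool: str) -> bool:
    """Deny by default: unknown roles and unlisted tools are rejected."""
    return tool in ROLE_TOOLS.get(role, set())


def call_tool(role: str, tool: str, handler, *args):
    """Run the tool handler only after the policy check passes."""
    if not authorize(role, tool):
        raise PermissionError(f"{role} may not call {tool}")
    return handler(*args)


result = call_tool("engineer", "shell.run_tests", lambda: "tests passed")

try:
    call_tool("architect", "filesystem.write", lambda: "wrote file")
    blocked = False
except PermissionError:
    blocked = True
```

Putting the check inside the only code path that reaches the handler is what keeps an agent from bypassing it.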
Feedback loops
What happened, and what should we change?
Hash-chained audit of every call, every approval, every policy decision. Per-agent and per-task spend analytics. Drift detection that surfaces stale context and silent regressions before your CFO — or your auditor — surfaces them.
Vibe coding gets you to a prototype. A harness gets you to production.
Every AI coding tool on the market today ships layer 4 (execution) and calls it a product. Helicone, Portkey, Langfuse add layer 7 (feedback). LangGraph and CrewAI give you bits of 2 and 4 if you build the rest yourself. AnyForge is the only place all seven layers ship together — as one product, with one console, one audit chain, one cost ledger.
Why Not Just Use Their Agents?
Same code quality. Fraction of the cost.
GitHub Copilot, Cursor, Devin, Amazon Q, and Vertex AI collectively cover 0–1 of 9 enterprise governance capabilities. The gap is structural — not a feature they're planning to ship.
Claude Code and Codex are excellent single-agent tools. But they charge you for every token in their autonomous loop — and you have zero control over how many rounds they run. AnyForge runs the same multi-turn tool loop with your budget controls.
Other tools lock you to one vendor, charge premium prices for every task, and leave your team idle when that vendor goes down. AnyForge gives you every AI model, full control, and no strings attached.
9 governance capabilities. The nearest rival checks 2.
Enterprise AI governance requires more than session logs. Here is how every major platform stacks up across the nine capabilities that regulated industries actually need.
The Platform
Two products. Four core capabilities.
Governed LLM Endpoint
One endpoint in front of OpenAI, Anthropic, Gemini, OpenRouter, and self-hosted (Ollama / vLLM). Sign up at anyforge.ai (free, 100M-token trial allowance on signup), add your provider keys at /settings/secrets, then run `npx anyforge init` — it detects Claude Code, Cursor, Cline, Continue, or a raw SDK client and rewrites the config. Your agents keep working. Per-call cost, routing, portable memory, and a hash-chained audit log show up at crew.anyforge.ai instantly. Also native inside Code Studio and Crew — nothing separate to install for those surfaces.
Single-agent AI coding
Browser-based single-agent coding workspace. 200+ AI models via OpenRouter, repo-grounded plans, quality-first testing, native GitHub integration with main-branch locking by default. Governance is native — every call flows through Control. Escalate to a governed Crew in-thread when scope grows.
Governed Agent Teams
Architect, Engineer, QA, Compliance & Security — each with a mandate and hard constraints enforced at the orchestration layer. An AI-native task manager structures the backlog. Every workflow moves through a governance state machine: agents cannot skip nodes. The HIL approval node is non-negotiable.
Visibility & Control
Full token spend visibility by agent, task, and team — with actionable recommendations. Approval flows with cryptographic proof at every node. Immutable audit trails for every decision. Analytics & reporting across all teams. Dynamic LLM routing picks the most cost-effective model for each task automatically.
The Architecture
Governance on top. Providers swappable underneath.
The layers we own stay constant. The execution layer is a pluggable contract — any adapter, same governance, same audit log, same cost ledger.
Governance Layer
Role-bounded access controls · Compliance-aware state machine · Cryptographic audit logging · Per-company / per-crew / per-role token spend · Approval workflows & deployment policy
Orchestration Layer
Portable crew definitions (stored in AnyForge, not in any provider) · Task routing & model selection · BYOK — bring your own API key for any supported provider; keys live in Google Secret Manager, never in our database · Cost rules & budget enforcement
Execution Adapter Interface
An AnyForge-defined contract. Each adapter implements the same interface — we route traffic to whichever one fits the task.
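One way to picture such a contract is a small interface that every adapter implements, plus a dispatcher that picks the first adapter able to handle the task. The `ExecutionAdapter` protocol, its method names, and the `EchoAdapter` stand-in below are illustrative assumptions, not the published AnyForge interface.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class TaskResult:
    adapter: str
    output: str
    tokens_used: int


class ExecutionAdapter(Protocol):
    """Hypothetical contract that every execution adapter satisfies."""
    name: str

    def supports(self, task_kind: str) -> bool: ...
    def run(self, task_kind: str, prompt: str) -> TaskResult: ...


class EchoAdapter:
    """Stand-in adapter that echoes the prompt instead of calling a provider."""

    def __init__(self, name: str, kinds):
        self.name = name
        self.kinds = set(kinds)

    def supports(self, task_kind: str) -> bool:
        return task_kind in self.kinds

    def run(self, task_kind: str, prompt: str) -> TaskResult:
        return TaskResult(self.name, f"[{self.name}] {prompt}", len(prompt.split()))


def dispatch(adapters, task_kind: str, prompt: str) -> TaskResult:
    """Route the task to the first adapter that supports it."""
    for adapter in adapters:
        if adapter.supports(task_kind):
            return adapter.run(task_kind, prompt)
    raise LookupError(f"no adapter supports task kind {task_kind!r}")


adapters = [EchoAdapter("cloud", {"plan"}), EchoAdapter("local", {"plan", "code"})]
result = dispatch(adapters, "code", "add retry logic")
```

Because governance code only ever sees `TaskResult`, swapping one adapter for another does not change what gets audited or billed.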
Why provider independence matters for a PE portfolio.
A single portfolio rarely sits on a single cloud. One company is on AWS with data-residency rules, another is all-in on Google Cloud, a third standardised on Anthropic for cost reasons. A governance platform that hard-codes a provider forces the portfolio to re-platform, or forces the fund to run parallel toolchains with parallel audit trails. Neither is acceptable. The adapter contract means one governance spine serves every company on whatever execution stack they already pay for.
What AnyForge owns vs. what execution providers own.
AnyForge owns the parts that determine whether a regulator, board, or LP can trust the output: access controls, the state machine, audit logs, token spend, approval policy. Execution providers own the parts that are commodities or specialisations: sandboxing, tool use, streaming, model access. The contract line is deliberate. Everything a provider could reasonably be swapped for in two years lives below it. Everything that would cost months of policy re-derivation to replace lives above it.
The Anthropic Managed Agents relationship.
Anthropic’s Managed Agents product (launched April 2026) is a useful execution adapter for Claude-specific tasks: it handles sandboxing, session state, and tool routing well. AnyForge uses it where it fits. But Managed Agents is Claude-only by design, with no portfolio governance layer and no cross-provider audit trail. That is precisely the surface area AnyForge owns. When the question is “does the fund adopt Managed Agents or AnyForge,” the answer is both: Managed Agents underneath AnyForge, not instead of it.
How it works
From backlog to governed production.
From requirements to merged PR — governed at every step.
Work Queue — What Your Team Sees
BRD / Backlog
Task enters the queue
Refine↻
AI Architect + Operator iterate
Ready
Backlog approved, crew on call
In Crew
Engineer drives, specialists on demand, you gate the PR
Done
PR merged, CI/CD deploys
Inside Every “In Crew” Run — The Governed Agent Workflow
Requirements submitted
Engineer or PM describes the task with requirements, BRD, or user stories. Optionally links a GitHub issue. Sets budget limit.
Architect explores & plans
Software Architect agent explores your codebase using 15 tool rounds — reads files, searches code, understands architecture. Produces an ADR with Mermaid diagrams and implementation plan.
Compliance gate (automated)
Compliance & Security agent checks the plan against your enforced constraints. Violations loop back to the Architect. No human cost — fully automated.
Plan Approval (human)
You review the ADR, architecture diagrams, and compliance report. Approve to start implementation, or reject with comments. Cryptographically signed.
Engineer builds & tests
Full Stack Engineer agent gets 40 tool rounds — reads code, writes files, runs tests, iterates until passing, commits, and creates a PR. Real code, not text output.
PR Review & Merge (human)
You review the diff in AnyForge, chat with the Engineer agent about decisions, and approve. PR is merged via GitHub API. Your CI/CD pipeline handles deployment.
LangGraph state machine — agents cannot skip governance gates. Every decision cryptographically logged.
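The "cannot skip governance gates" property can be sketched as a transition table in which the only edges out of each step lead to the next gate or back for rework. This is plain Python, not the LangGraph API. The state names are invented for the example, and the edge from plan approval back to the Architect on rejection is an assumption.

```python
# Allowed transitions for the workflow described above (state names are hypothetical).
TRANSITIONS = {
    "requirements": {"architect_plan"},
    "architect_plan": {"compliance_gate"},
    "compliance_gate": {"architect_plan", "plan_approval"},  # violations loop back
    "plan_approval": {"architect_plan", "engineer_build"},   # reject edge is assumed
    "engineer_build": {"pr_review"},
    "pr_review": {"engineer_build", "merged"},                # amend re-enters Engineer
    "merged": set(),
}


class Workflow:
    def __init__(self):
        self.state = "requirements"
        self.history = ["requirements"]

    def advance(self, target: str) -> None:
        """Move to target only if the graph has an edge from the current state."""
        if target not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {target}")
        self.state = target
        self.history.append(target)


run = Workflow()
run.advance("architect_plan")

try:
    run.advance("engineer_build")  # tries to skip compliance and plan approval
    skipped = True
except ValueError:
    skipped = False

for step in ("compliance_gate", "plan_approval", "engineer_build", "pr_review", "merged"):
    run.advance(step)
```

Since there is no edge from planning straight to building, skipping a gate is a structural error rather than a rule an agent could choose to ignore.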
Pricing
Token-native, not token-taxed.
Two distinct revenue lines. Free to start with a 100M-token trial allowance on signup; pay $0.30 per 1M tokens after that — provider-agnostic, the same rate for cloud BYOK and self-hosted, covering the full harness: routing, portable memory, hash-chained audit, policy. Crew Managed Service is sold separately as a managed dev pod. The trial is the acquisition onramp. Crew contracts stack on top.
Platform Fee
100M tokens free. Then $0.30 / 1M tokens.
The 100M-token trial covers roughly 1–2 weeks of active dev use before any AnyForge bill. After that, $0.30 per 1M tokens covers routing, portable memory, hash-chained audit, policy, and MCP tools — provider-agnostic, the same rate whether you use cloud LLMs or self-host. Smart routing and caching typically save customers more on wasted model spend than the platform fee costs.
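A quick worked example of the fee arithmetic, assuming the 100M-token allowance is a one-time cumulative allowance and that billing is linear above it. The function name and rounding to cents are choices made for this sketch.

```python
FREE_TOKENS = 100_000_000     # trial allowance on signup
RATE_PER_MILLION_USD = 0.30   # platform fee after the allowance


def platform_fee(cumulative_tokens: int) -> float:
    """Fee in USD for total tokens routed so far, rounded to cents."""
    billable = max(0, cumulative_tokens - FREE_TOKENS)
    return round(billable / 1_000_000 * RATE_PER_MILLION_USD, 2)


fee_in_trial = platform_fee(80_000_000)     # still inside the allowance
fee_250m = platform_fee(250_000_000)        # 150M billable tokens
fee_1_1b = platform_fee(1_100_000_000)      # 1,000M billable tokens
```

Under these assumptions, 250M tokens works out to 150 × $0.30 = $45, and 1.1B tokens to $300, with model spend billed separately by the provider.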
Annual prepay
Same $0.30 / 1M, procurement-ready.
For procurement gates that need an annual contract. Same unit economics as Self-serve — no second platform price.
Crew Managed Service
Sold separately. Priced like a managed dev pod.
Priced on offshore-engineer equivalence: $3K/mo replaces 2–3 offshore developers (~$5–9K/mo fully loaded). Token flow generated by the contract still accrues the $0.30 / 1M platform fee — dual revenue per customer.
$0.30 per 1M tokens is a platform fee, not a markup on model access.
Your API keys, your provider contracts, your wholesale rates. Model spend and AnyForge platform fee show up as separate line items in every invoice. Most teams save materially more on wasted tokens through intent-based routing than the fee itself costs. AnyForge has no seats and no daily caps — as AI efficiency rises and the rest of the market's seat-based tools shrink, token flow through the harness keeps compounding. The revenue line is inverse to seat-decline risk.
For Every Scale
Solo engineer. Growing team. Global enterprise. PE portfolio.
AnyForge scales with you without changing platforms. Pick the surface that matches your team today — Control in front of existing IDEs, Code Studio for turnkey browser coding, or Crew for governed multi-agent delivery. Expand to enterprise-grade compliance and audit trails across your portfolio as you grow.
For PE funds: install Crew once and inherit a unified governance layer across your entire portfolio — one security posture, one approval workflow, one place to look when regulators come knocking.
Individual
Solo engineer
Pick a surface. Install Control and govern your existing Claude Code / Cursor / Cline. Or start in Code Studio in the browser — free for everyone, 200+ models, Control native. BYOK on both.
Try Code Studio →
Teams
Growing team
Add Crew when a workflow deserves it. AI-native task manager, governed agent teams, shared token visibility, and approval flows. Code Studio and your external IDEs all share one cost ledger and one audit chain.
Enterprise
Large organisation
Full governance at scale. Cryptographic audit trails, multi-step compliance workflows, analytics & reporting across all teams, and dynamic LLM routing with a human in the loop at every critical node.
PE Portfolio
Fund & portfolio
One governed platform deployed across every portfolio company. One security posture. One approval workflow. One place to look when regulators come knocking — or when something goes wrong.
1
platform across every scale — solo to PE portfolio
∞
immutable audit trail per team and per company
0
custom infrastructure to build or maintain
Founder — AnyForge Crew
Edwin Poot
Edwin has spent a decade operating at the intersection of financial services and engineering governance. As a CTO and technical advisor to PE-backed fintech companies, he watched the same problem emerge repeatedly: AI adoption without accountability structures — and the compliance failures that follow.
AnyForge Crew is his answer: a sovereign engineering platform built from first principles around governance, auditability, and the human oversight that regulators will eventually require from everyone shipping AI in regulated environments.
Bring your own IDE · 5-min install
Install AnyForge Control
Sign up, add your provider keys, then `npx anyforge init` — your existing Claude Code / Cursor / Cline keeps working. Free to start with a 100M-token trial allowance on signup.
Install Control →
Browser · Free to start
Try AnyForge Code Studio
Browser-based AI coding, 200+ models, native GitHub. Control is built in. Free for everyone, no card, no daily caps.
Try Code Studio →
Multi-agent · Pilot
Apply for AnyForge Crew
Role-based agents (Architect, Engineer, QA, SRE/Compliance) with HIL gates. Pilot contracts only — design-partner intake below.
Apply for pilot →
Apply to Be a Design Partner
We're looking for 1–3 Design Partners.
Design Partners shape the product alongside us. You get free Crew access, dedicated onboarding with the founding team, direct input on the product roadmap, and preferred pricing at GA launch. We get a co-builder operating in a real environment.
Ideal profile: engineering team of 3–50 people, active GitHub repository, compliance or audit requirements, PE-backed or regulated industry.
