Evolve — Evolve your OpenClaw pod

The problem

Running OpenClaw isn't the hard part — running it well is.

These are the issues we hear about most often from people running OpenClaw bots. They are what Evolve is built to solve.

Silent failure

Things stop working with no explanation — an app, an integration, a scheduled job. You only find out when something downstream breaks.

Phantom success

Worse: the bot reports green when nothing actually ran. Trust erodes silently — and you don't know it's gone until something matters.

Security

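Your bot holds API keys, OAuth tokens, and access to your accounts. Without regular audits, config drift or a risky plugin can go unnoticed until it causes damage.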

Cost

Tokens add up fast if nobody's watching.

Over-autonomy

You asked the bot to summarize emails. It also archived three, drafted a reply, and put something on your calendar. AI agents don't know when to stop.

Privacy

Your bot sees email, calendar, files, conversations. Where does that data live? Most assistants pipe it through someone else's cloud.

Statelessness

Bots are prompt/response by nature. They often promise to do something later and then never do, because nothing carries the commitment into the next session.

Integrations

Gmail, Slack, Calendar, GitHub — every one a little tricky to set up and keep working.

Sysadmin

Keeping bots current, rotating keys, watching liveness — a job by itself.

Applications

Making an "app" is easy. Making a good one, improving it over time, coordinating many — much harder.

Self-improvement

A bot can update its own notes. Real adaptation doesn't happen by itself.

Fleet management

The above is plenty for one bot. With several, it's overwhelming.

The shape of the experience

Evo at the center.

Evo is an OpenClaw bot that knows your pod end-to-end. Talk to evo and it resolves things — finds the suggestion, applies it, verifies, reports. The dashboard is still real and substantial underneath: usage graphs, credentials, applications, security audits, self-improvement suggestions. Chat for the common path; dig into the dashboard when you want detail.

The chat is the home page

Open the admin UI and the first thing you see is evo's short report on the state of things, plus a conversation thread to address any of it. Say "snooze every team_bot_a alert until tomorrow" or "fix the cron caps issue" — evo finds the matching change, applies it, verifies, and reports done.

A panel on every page

Wherever you are in the dashboard, evo is on the right with the context of that page. Open Alerts and ask "why is this firing?" — evo sees the same signals you see. Switch pages and the conversation context switches with you; each page keeps its own thread.

Same brain everywhere

Chat in the dashboard, DM evo on Telegram, or use the evo keyword from any bot's thread on Signal, iMessage, Slack, or Discord. One bot, one set of tools, one long-term memory. Improvements to evo's instructions land everywhere at once.

It actually resolves things.

The old workflow was: an engine generates suggestions, they queue up on a Recommendations page, you click through them. The new workflow is: you describe the problem to evo, evo finds the matching suggestion (or stages one), applies it end-to-end, verifies, and reports. The Recommendations page is still there (useful for audit and triage), but most of the time you don't need it.

What Evolve does

The Evolve hierarchy.

Evolve helps you run your OpenClaw pod from the ground up.

Layer 1 Build them

Take a bare Mac — or an Ubuntu VPS — to a running pod in one pass. The setup wizard creates the service user, installs OpenClaw, deploys the plugin, wires the scheduled jobs, and walks you through your first bot.

One-command setup

evolve-admin setup --fresh handles everything — service user, OpenClaw install, plugin, channel tokens, launchd jobs. About 30 minutes when you have your keys ready.

Add a new bot

Spin up a fresh bot for a project, a family member, or a client engagement in five minutes. Compartmentalized by default — each bot gets its own OS user account, its own workspace, its own credentials.

Safe OpenClaw upgrades

When OpenClaw ships a new release, the safe-upgrade preflight checks compatibility, captures rollback state, and applies the update across the pod, without you having to babysit it.

Users & pairing

Pod admins, self-claim passphrases, per-bot owners, and paired-user management per channel — all in the admin UI. Approve, reject, or disconnect users with a click. Pod-admin /start requests auto-approve so you skip the code round-trip. Single-user ↔ multi-user toggle per bot.

Layer 2 Keep them alive

Health monitoring, auto-healing, and version management — so a downed gateway, a stale plugin, or an OpenClaw upgrade doesn't quietly take a bot offline.

Health monitoring

Every gateway, every channel, every API key, every OAuth token — health at a glance. Surface what's running, what's degraded, what needs rotation.

Maintenance

Active alerts, drift detection, the heal daemon's restart history, daemon status. The reactive verb: remediate. Rollback now lives on the Backup page.

Backup & recovery

Cloud: private GitHub per bot, with size estimate, classification audit, and auto-prune.
Local: Time Machine with exclusion sync.
Data: per-bot default tier plus per-app overrides for what's eligible.
Recovery: restore from the latest backup or run a host-swap flow.

The vault key for stored credentials is pod-wide. Admin and evo can both read it after a one-time migration on first read, so a separated-user upgrade doesn't strand secrets.

Alerts & Reports

System-driven alerts live on the Maintenance page; user-configured digests (daily cost, weekly review, integration health) live on the Reports page. Proposals get their own subtab under Reports so improvements and incidents don't share a queue. Subscribe to what matters.

Layer 3 Keep them safe and affordable

Vigilant by default, friendly by design. Security audits and cost controls run in the background, surface plain-language findings, and gate every behavior change through a signed approval pipeline.

Security audit

Every 15 minutes, eight categories of pod drift get checked — config posture, integrity hashes, plugin posture (now with install-provenance scoring), content scans, and more. An Intentional Deviations page lets you mark a setting as deliberate so generators stop fighting your upstream choices. Findings surface as plain-language signals.
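One of the categories named above, integrity hashes, can be illustrated with a short sketch. This is not Evolve's audit code: the function names (`hash_file`, `find_drift`) and the baseline format (a mapping of relative path to SHA-256 hex digest) are assumptions made for the example.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def hash_file(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_drift(baseline: Dict[str, str], root: Path) -> List[str]:
    """Compare files under root against a recorded baseline.

    Returns plain-language findings for files that changed or went missing.
    The baseline format is hypothetical: {relative_path: sha256_hex}.
    """
    findings = []
    for rel_path, expected in sorted(baseline.items()):
        target = root / rel_path
        if not target.is_file():
            findings.append(f"{rel_path}: file is missing")
        elif hash_file(target) != expected:
            findings.append(f"{rel_path}: contents changed since the baseline")
    return findings
```

A scheduler could call `find_drift` on each run and hand any findings to whatever surfaces signals in the UI.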

Usage tracking

Daily and monthly spend per bot, per model, per session class. Forecasts, anomaly detection, per-bot drill-downs — so you can see why a number moved.

Cost Optimization

One canonical Cost & Caps matrix per bot — and a Pod tab for defaults. A graduated remediation ladder picks the right response by severity: warn alerts you, downgrade drops the session tier and pauses non-critical crons, hard trips an L1 breaker that refuses gateway calls. Caps roll at midnight in the pod's local timezone. A behavioral Cost Efficiency Score grades the routing choices each bot is making.
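The ladder can be pictured as a function from spend to a rung. The sketch below is illustrative only: the threshold fractions (`WARN_AT`, `DOWNGRADE_AT`, `HARD_AT`) and the names `Rung`, `pick_rung`, and `actions_for` are assumptions, and the real ladder may weigh severity differently.

```python
from enum import Enum
from typing import List


class Rung(Enum):
    OK = "ok"
    WARN = "warn"
    DOWNGRADE = "downgrade"
    HARD = "hard"


# Hypothetical thresholds, as fractions of the cap (not Evolve's real values).
WARN_AT = 0.75
DOWNGRADE_AT = 0.90
HARD_AT = 1.00


def pick_rung(spend_usd: float, cap_usd: float) -> Rung:
    """Map current spend against a cap to the most severe rung that applies."""
    if cap_usd <= 0:
        raise ValueError("cap_usd must be positive")
    used = spend_usd / cap_usd
    if used >= HARD_AT:
        return Rung.HARD
    if used >= DOWNGRADE_AT:
        return Rung.DOWNGRADE
    if used >= WARN_AT:
        return Rung.WARN
    return Rung.OK


def actions_for(rung: Rung) -> List[str]:
    """Responses per rung, paraphrasing the description above."""
    return {
        Rung.OK: [],
        Rung.WARN: ["send alert"],
        Rung.DOWNGRADE: ["drop session tier", "pause non-critical crons"],
        Rung.HARD: ["trip breaker", "refuse gateway calls"],
    }[rung]
```

For example, `pick_rung(9.5, 10.0)` lands on the downgrade rung under these assumed thresholds.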

Model Economics

A pod-wide, model-centric cost leaderboard — one bar per model across every bot, sortable by $/turn, effective cost per 1k tokens (cache-aware), spend, or share. Filter by provider, pricing band, bot, or audience. The transpose of per-bot usage — built for tuning model choice by comparing within a band or across providers.
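To make "cache-aware" concrete, here is one plausible way to compute the leaderboard's per-model numbers. It assumes cached input tokens are billed at their own, usually lower, rate. The dataclasses, field names, and formula are illustrative assumptions, not Evolve's schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelUsage:
    input_tokens: int         # uncached input tokens
    cached_input_tokens: int  # input tokens served from a prompt cache
    output_tokens: int
    turns: int


@dataclass(frozen=True)
class Pricing:
    # Hypothetical per-1k-token prices in USD.
    input_per_1k: float
    cached_input_per_1k: float
    output_per_1k: float


def total_cost(usage: ModelUsage, pricing: Pricing) -> float:
    """Sum spend across the three token classes."""
    return (
        usage.input_tokens * pricing.input_per_1k
        + usage.cached_input_tokens * pricing.cached_input_per_1k
        + usage.output_tokens * pricing.output_per_1k
    ) / 1000.0


def effective_cost_per_1k(usage: ModelUsage, pricing: Pricing) -> float:
    """Blended cost per 1k tokens, counting cached tokens at their own rate."""
    tokens = usage.input_tokens + usage.cached_input_tokens + usage.output_tokens
    if tokens == 0:
        return 0.0
    return total_cost(usage, pricing) / tokens * 1000.0


def cost_per_turn(usage: ModelUsage, pricing: Pricing) -> float:
    """Average spend per turn; zero when there were no turns."""
    if usage.turns == 0:
        return 0.0
    return total_cost(usage, pricing) / usage.turns
```

Sorting models by `effective_cost_per_1k` within a pricing band would then favor the ones that make better use of the cache.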

AI Optimization

Per-bot default-tier picker (fast / standard / power) feeds a hierarchical routing cascade: classifier → operator default → per-user override. Maintenance sessions go to Haiku; productive sessions stay on Sonnet. Users can set their own default per bot via evo tier in any thread — it persists.
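A minimal sketch of the cascade, assuming each later stage overrides the earlier one whenever it has a value. That precedence is a guess from the order given above, and `resolve_tier` is a hypothetical name rather than Evolve's API.

```python
from typing import Optional

TIERS = ("fast", "standard", "power")


def resolve_tier(
    classifier_tier: str,
    operator_default: Optional[str] = None,
    user_override: Optional[str] = None,
) -> str:
    """Walk classifier -> operator default -> per-user override.

    Assumption: the last stage that supplies a value wins.
    """
    stages = (
        ("classifier", classifier_tier),
        ("operator default", operator_default),
        ("user override", user_override),
    )
    chosen = classifier_tier
    for stage_name, value in stages:
        if value is None:
            continue  # this stage has no opinion
        if value not in TIERS:
            raise ValueError(f"{stage_name}: unknown tier {value!r}")
        chosen = value
    return chosen
```

Under this reading, a user who sets `power` for a bot gets it even when the classifier suggests `fast`.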

Session analysis

Every turn classified — productive, maintenance, or ambiguous. Daily metrics aggregated per bot. The productive-vs-maintenance ratio feeds the cost-per-useful-turn metric, and maintenance sessions auto-route to cheap models.
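As a rough illustration of how those metrics might be derived, the sketch below aggregates labelled turns. It assumes cost per useful turn means total spend divided by productive turns, and that ambiguous turns are left out of the ratio. Both definitions, and the name `summarize_turns`, are assumptions.

```python
from typing import Dict, Iterable, Optional, Tuple


def summarize_turns(
    turns: Iterable[Tuple[str, float]]
) -> Dict[str, Optional[float]]:
    """Aggregate (label, cost_usd) pairs into daily metrics.

    Labels are "productive", "maintenance", or "ambiguous".
    """
    counts = {"productive": 0, "maintenance": 0, "ambiguous": 0}
    total_cost = 0.0
    for label, cost_usd in turns:
        if label not in counts:
            raise ValueError(f"unknown label: {label!r}")
        counts[label] += 1
        total_cost += cost_usd

    classified = counts["productive"] + counts["maintenance"]
    ratio = counts["productive"] / classified if classified else None
    per_useful = (
        total_cost / counts["productive"] if counts["productive"] else None
    )
    return {
        "productive_ratio": ratio,
        "cost_per_useful_turn": per_useful,
        "total_cost": total_cost,
    }
```

Charging maintenance spend against productive turns is a deliberate choice in this sketch: it makes overhead visible in the number you actually care about.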

Layer 4 Make them useful

A bot that can't reach your calendar, email, or notes is a toy. Evolve gives you the capability surface — installable skills, configurable plugins, and the applications that combine them into something actually useful.

Skills catalog

Installable skills spanning messaging, productivity, dev, home, and creative work — Gmail, Calendar, Google Workspace (suite via vetted MCP — read + write), Slack, Discord, Telegram, iMessage, Obsidian, Notion, Linear, Home Assistant, AutoCAD, Runway, Zoom, and more. Install once; any bot can use it. Filesystem-shape skills (iMessage, Obsidian) run entirely on your machine — no cloud roundtrip. New MCP-backed catalog entries land via a bundled-plugin pattern so adding a hosted service stays a single registry file.

Installed applications

Each app is a contract: the goal, the skills it depends on, the tests that prove it works, and a satisfaction score. The same app on two bots produces two different implementations — because the contract is the goal, not the code.

App Coherence & Reconciliation

Manifests are typed and provenance-tracked: each field is either authored (a contract you wrote) or observed (something the scanner found). Three checks run before a manifest deploys — reconciliation (does what's on disk match the contract?), coherence (do the manifest's own claims hang together?), and the existing Tier 3 audit. A pre-deploy gate blocks anything that fails the contract; observational drift updates silently. Ask evo to app-changes or app-scan and it does the rest.
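The authored/observed split suggests a simple rule for reconciliation, sketched below. The `Field` type, the `reconcile` function, and the flat dictionary layout are hypothetical; real manifests are likely richer than this.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass(frozen=True)
class Field:
    value: Any
    provenance: str  # "authored" (a contract) or "observed" (scanner output)


def reconcile(
    manifest: Dict[str, Field], on_disk: Dict[str, Any]
) -> Tuple[Dict[str, Field], List[str]]:
    """Compare a manifest with what the scanner found on disk.

    Authored mismatches become violations; observed mismatches are
    absorbed into the returned manifest without raising anything.
    """
    updated = dict(manifest)
    violations = []
    for name, field in manifest.items():
        actual = on_disk.get(name)
        if actual == field.value:
            continue
        if field.provenance == "authored":
            violations.append(
                f"{name}: contract says {field.value!r}, disk has {actual!r}"
            )
        else:
            updated[name] = Field(actual, "observed")
    return updated, violations


def may_deploy(violations: List[str]) -> bool:
    """Pre-deploy gate: any contract violation blocks the deploy."""
    return not violations
```

In this sketch a changed cron schedule that was only ever observed updates quietly, while a changed skill list that you authored blocks the deploy.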

Pod Conduct

A shared behavioral contract — honesty about state, no empty commitments, privacy and data handling, scope awareness — injected into every bot's session context. Universal floor; per-bot personality on top. Amendments go through human approval.

Layer 5 Make it better

The self-improvement layer. A portfolio of specialized coaches watches your pod and proposes specific, falsifiable improvements; evo resolves them in chat end-to-end. The Continuity Engine closes the statelessness gap. The Gallery and Forge let you grow the pod's capabilities deliberately. The approval pipeline is what makes all of this safe: recursive self-improvement (RSI) applied to applications, not raw prompts, with a human in the loop on every change.

Resolves things, doesn't just suggest them.

Coaches watch how your pod is doing — substrate health, cost, security, application quality, voice fit. When something can be better, a coach proposes a specific change with a falsifiable claim (e.g. "this reduces gateway restarts ≥30% over 7 days") and a revert plan. A verify check grades the outcome at the claim's horizon and adjusts each coach's authority by results. New cross-bot coaches (engagement_amplifier, pod_capability_lift) surface patterns that only show up across the fleet; an anti-domain layer lets you mark a topic out of scope so coaches stop pitching it. The suggestion queue is inventory; evo is the resolver. Describe the problem in chat — evo finds the matching suggestion, applies it, verifies, reports. You approve every change before it ships.
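A falsifiable claim like the one quoted above can be checked mechanically once the horizon passes. The sketch below shows the idea. The `Claim` fields, the fixed authority step, and the 0-to-1 authority scale are assumptions for illustration, not how Evolve necessarily scores coaches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    metric: str           # e.g. "gateway_restarts"
    min_reduction: float  # 0.30 means "at least 30% lower"
    horizon_days: int     # when the claim should be graded


def claim_holds(claim: Claim, baseline: float, observed: float) -> bool:
    """Grade the claim: did the metric drop by at least the promised share?"""
    if baseline <= 0:
        return False  # nothing to reduce, so the claim cannot be shown
    reduction = (baseline - observed) / baseline
    return reduction >= claim.min_reduction


def adjust_authority(authority: float, passed: bool, step: float = 0.1) -> float:
    """Nudge a coach's authority up or down, clamped to [0, 1].

    The step size is a hypothetical default.
    """
    delta = step if passed else -step
    return min(1.0, max(0.0, authority + delta))
```

With `Claim("gateway_restarts", 0.30, 7)`, a drop from 20 restarts to 13 passes and a drop to 15 does not.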

Chat with evo

"fix the cron caps issue" → evo finds, applies, verifies, reports.

Dashboard, when you want to dig in

The Recommendations page lists every suggestion with full audit trail.

Issues

Say evo improve "this could be better" from any thread; evo captures it, drafts a body, and stages it as a Draft. Promote with one click — it lands as a real GitHub issue on the repo you pick. Inbound issues filed by other people on repos you maintain get LLM-triaged (category, urgency, draft reply) and queued for review; an opt-in auto-response policy with a 24-hour undo handles the obvious cases.

Continuity Engine

Extracts pending commitments from session transcripts and surfaces them at the next session start. Runs pre-approved recurring tasks (weekly review, cost summary, project digest) on a schedule. Approval gates before any task executes.

App Gallery

A growing Gallery of installable applications: Morning Briefing, Email Triage, Note-taker, Calendar Summary, EA Pack, Journal, Task Manager, Contacts, Workspace Backup, GitHub Integration, and more. One-click install with inline OAuth.

Forge

Generate new applications from a spec. LLM-enriched manifests, test cases written alongside, an interactive feedback loop while you build. For the cases where the Gallery doesn't have what you need.

Approval pipeline

What makes RSI safe. Every Better-Engine suggestion travels through a signed pipeline, a security review, and a human approval gate. Eight hard auto-reject rules. The gate exists because self-improvement without it is just self-modification.
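To show why the signature and the human gate are separate checks, here is a minimal sketch that uses an HMAC over a canonical JSON payload. The page does not say which signing scheme Evolve uses, so HMAC-SHA256, the function names, and the suggestion fields are all assumptions. The eight auto-reject rules are not modeled here.

```python
import hashlib
import hmac
import json


def sign(suggestion: dict, key: bytes) -> str:
    """Sign a suggestion with HMAC-SHA256 over a canonical JSON encoding."""
    payload = json.dumps(
        suggestion, sort_keys=True, separators=(",", ":")
    ).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def may_apply(
    suggestion: dict,
    signature: str,
    key: bytes,
    review_passed: bool,
    human_approved: bool,
) -> bool:
    """All three gates must pass: signature, security review, human approval."""
    # compare_digest avoids leaking information through comparison timing.
    signature_ok = hmac.compare_digest(sign(suggestion, key), signature)
    return signature_ok and review_passed and human_approved
```

A suggestion edited after signing fails the first check, and a validly signed one still waits for a person to approve it.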

Layer 6 Make it easy

The usage layer. Evo is the front door and the panel that follows you. Coaches do the proactive watching so you don't have to. The Claude Desktop bridge lets you bring pod context into a deep-work session when you want it.

Chat with evo, anywhere

The chat page is your home: evo's report at the top, conversation below. DM evo on Telegram, or use the evo keyword from any bot's thread on Signal, iMessage, Slack, or Discord. All routes hit one OpenClaw bot — same SOUL, same tools, same long-term memory.

Evo panel on every page

Wherever you are in the dashboard, evo sits on the right with the context of that page. Ask "why is this firing?" on Alerts; evo sees the same signals you see. Each page keeps its own conversation thread; long-term memory is shared across them. Heavy turns stream so you can watch evo work; a per-conversation tier selector (Auto / Fast / Standard / Power) trades cost for depth on the fly. Evo also verifies the capability it's about to use before it promises to take the action.

Coaches

A portfolio of specialized coaches watches your pod in the background — substrate health, cost, security, app quality, voice fit. Some flag trouble (Sysadmin Watchdog, Budget Hawk, Security Warden); others suggest improvements. They catch the silent failures and surface them as actionable suggestions, not raw alerts.

Claude Desktop bridge

An MCP server connects Claude Desktop to the live pod over Tailscale. Deep-work sessions start with full pod context (workspace memory, pending tasks, active suggestions) without you having to brief the model.

Layer 7 Make it fun

The most important layer is the one that doesn't show up in a feature matrix. Evolve is built to feel like it was made by people who care about your experience, not by a committee writing a roadmap.

A little personality goes a long way.

Running OpenClaw shouldn't feel like a chore. We've threaded a little whimsy throughout — in the voice, the status messages, and a few surprises you'll find as you use it. We won't spoil them.

Works with

The integrations Evolve manages for you.

A growing catalog of installable skills covering messaging, productivity, home, dev, and creative work. OAuth flows, key rotation, health monitoring, and credential handling — all handled.

Who it's for

Overkill for one bot. Underkill for enterprise. Built for the people in between.

Evolve sits in a specific spot in the market. If you recognize yourself in the first three cards below, it's probably for you. The last three describe who it isn't for, and that's useful information too.

Households running a pod

One bot per family member, plus a household coordinator. Domain-compartmentalized — health doesn't see ventures, ventures doesn't see giving.

Project-driven service businesses

Designers, contractors, planners, real estate agents. A studio bot plus a bot per active client engagement. Existing tools (Studio Designer, QuickBooks, Asana) keep working.

Solo professionals taking themselves seriously

Lawyers, accountants, consultants. One or two bots. Client data stays on a Mac in your office. No cloud roundtrip on local-system integrations like iMessage and Obsidian.

Enterprise platform teams

If you have a dedicated AI ops function, look at Preloop. Evolve doesn't try to be your platform.

Single-bot solo users

If one OpenClaw instance is all you need, you'll find Evolve heavy. The wins compound when you have several.

People who want hosted SaaS

Evolve runs on hardware you own, on purpose. There's no cloud control plane to sign up for.

Want more detail? The product vision doc goes deeper on positioning, architecture, and what's explicitly out of scope.

Architecture

Three-layer pod design

Bots cannot influence their own management layer. Every production change requires human approval.

Layer 3

Admin user — you

The only human in the loop

Has sudo access. Approves proposals, manages keys, deploys updates.
Does not run a bot — sysadmin and assistant roles stay separate.
Drives Evolve from the admin UI, the CLI, or the evo conversational interface.

Layer 2

evolve user — dedicated OS user account

Manages and improves the pod

Runs the admin server, scheduled jobs, the analyzer, and security audits.
Generates and validates improvement suggestions before surfacing them.
Cannot be influenced by the bots it manages.

Layer 1

Bot users — your OpenClaw bots

One OS user per bot, fully compartmentalized

Do the actual work. Each runs an OpenClaw gateway on its own user account.
Each bot's LLM inference runs with that bot's own credentials. No centralized inference service inside Evolve sees user data.
Monitored by Evolve. Cannot read each other's workspaces.
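With one OS user per bot, the claim that bots cannot read each other's workspaces comes down to file permissions. The sketch below is one way to spot-check that on a POSIX system by looking at mode bits. It assumes plain owner-only permissions (ACLs are not inspected), and the function names are hypothetical.

```python
import stat
from pathlib import Path
from typing import Iterable, List


def is_private(path: Path) -> bool:
    """True when neither group nor other has any permission bits set."""
    mode = path.stat().st_mode
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


def exposed_workspaces(workspaces: Iterable[Path]) -> List[Path]:
    """Return the workspaces that some other OS user could enter or read."""
    return [path for path in workspaces if not is_private(path)]
```

A workspace directory with mode `0o700` passes; one with `0o755` is reported as exposed.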

Requirements

What you need before installing

Full checklist: pre-install-checklist.md

Hardware

Any Mac running macOS 14 (Sonoma) or later, with 16 GB of memory minimum (24 GB recommended)
Mac mini M4 is the recommended always-on box
Linux host / Ubuntu 24.04 VPS — supported
Admin account with sudo access
Wired ethernet recommended (on-prem)

Software

Python 3.9+
Node.js 20+
OpenClaw (already installed, or let the setup wizard install it with --fresh)
Evolve repo cloned

Required accounts

One LLM provider (Anthropic, OpenAI, Ollama, etc.)
One messaging channel (Telegram, Slack, Discord, iMessage)

Strongly recommended

Brave Search API key
Google OAuth credentials
Tailscale (remote access + MCP Bridge)
Private GitHub repo (security backups)

Install

From zero to running a pod

⚠ Alpha software. Evolve will likely break an existing OpenClaw setup. It may introduce security issues — the security features are not fully audited. Built for macOS 14+ (tested on Mac mini M4); Linux (Ubuntu) is supported. Real API keys required — set spend limits before installing. Don't install on a machine you can't afford to wipe.

You can install Evolve on an existing OpenClaw instance or create a new pod from scratch using the Evolve setup wizard.

Check prerequisites

Python 3.9+ and Node.js 20+ must be installed.

python3 --version
node --version   # brew (macOS) / apt (Linux) install node if needed
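If you would rather script the check than read version strings by eye, a small helper along these lines can work. It is a convenience sketch, not part of Evolve: the function names are made up, and it only compares against the Python 3.9 and Node.js 20 minimums stated above.

```python
import re
import shutil
import subprocess
import sys
from typing import List, Optional, Tuple

Version = Tuple[int, ...]


def parse_version(text: str) -> Optional[Version]:
    """Pull the first dotted version number out of a string like 'v20.11.1'."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups() if part is not None)


def meets(version: Optional[Version], minimum: Version) -> bool:
    """True when a version was found and is at least the minimum."""
    return version is not None and version >= minimum


def node_version() -> Optional[Version]:
    """Ask the node binary on PATH for its version; None if it is missing."""
    node = shutil.which("node")
    if node is None:
        return None
    result = subprocess.run(
        [node, "--version"], capture_output=True, text=True, check=False
    )
    return parse_version(result.stdout)


def check_prerequisites() -> List[str]:
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    if not meets(tuple(sys.version_info[:3]), (3, 9)):
        problems.append("Python 3.9+ is required")
    if not meets(node_version(), (20,)):
        problems.append("Node.js 20+ is required (or node is not on PATH)")
    return problems
```

Calling `check_prerequisites()` before cloning gives you a list of anything that still needs installing.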

Clone and set up

sudo git clone https://github.com/evolve-ops/evolve /Users/Shared/evolve-repo
sudo python3 -m venv /Users/Shared/evolve-venv
sudo /Users/Shared/evolve-venv/bin/pip install -e /Users/Shared/evolve-repo/packages/admin/

Run the wizard

The wizard creates the evolve service user, installs OpenClaw, deploys the plugin, and wires the scheduled jobs. Have your API keys, channel tokens, and a private GitHub repo URL ready.

sudo evolve-admin setup --fresh

Open the dashboard

The admin UI runs locally. No cloud.

open http://localhost:5050   # macOS; on a Linux desktop, try xdg-open instead

Ran into something? File a bug report or start a discussion.

Evolve your OpenClaw pod.

Get early access to the code