
How we gave our AI agent platform an autonomous SEO engineer that ships its own pull requests

We put moleculesai.app under continuous SEO ownership by an AI agent — a Molecule workspace that opens reviewed pull requests on a thirty-minute schedule.

By Molecules AI Engineering, June 2, 2026

Recently we did something the team had been talking about for a while: we put our own marketing site under continuous SEO ownership by an AI agent.

Not “an LLM that suggests edits a human pastes in.” A workspace on Molecule AI — the same primitives every tenant deploys — that wakes up on a thirty-minute schedule, scans the live site, and opens scoped pull requests against the Astro repository this site is built from. Every change goes through a two-approval review gate before it can merge. One has now shipped.

This post is what we built, how it fits the platform, and what the first merged change actually was.

The shape of the workspace

In Molecule AI the unit of agency is a workspace — one AI agent plus its configuration, its memory scope, its runtime container, and a position in your org chart. We ship adapters for several agent runtimes (including Hermes Agent); for this one we picked Claude Code, because it already does the disciplined “make a small diff, run the local toolchain, push a branch” loop we wanted.

The SEO workspace has three things every Molecule workspace has, configured for the marketing-site job:

A least-privilege git identity. The agent commits as molecule-seo-bot, under a dedicated bot email address. That account has push rights to the landing-page repository and read access to its own source repository, and nothing else across the org’s git host: no access to any other repository, no admin scope, and, critically, no ability to approve a pull request. Branch protection on main requires two independent approvals, so a compromised identity here can only open a branch that still has to clear a review it cannot sign off on itself.
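As a rough illustration of what that protection rule enforces, here is a minimal sketch of the merge decision. It is not our git host's implementation: the Review type, the state strings, and can_merge are hypothetical names for illustration, and the only facts taken from the setup above are the two-approval requirement and the rule that the author cannot approve its own work.

```python
from __future__ import annotations

from dataclasses import dataclass

# The only number taken from the post: main requires two independent approvals.
REQUIRED_APPROVALS = 2


@dataclass(frozen=True)
class Review:
    reviewer: str
    state: str  # hypothetical values: "approved", "changes_requested", "commented"


def can_merge(author: str, reviews: list[Review]) -> bool:
    """Return True only when enough distinct non-author reviewers approved.

    Simplification (assumption): every review counts as-is; a real host would
    also track each reviewer's latest review and dismiss stale approvals.
    """
    approvers = {
        r.reviewer
        for r in reviews
        if r.state == "approved" and r.reviewer != author
    }
    return len(approvers) >= REQUIRED_APPROVALS
```

Under this sketch, an approval submitted by molecule-seo-bot on its own pull request never counts, and two approvals from the same reviewer count once.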

A persistent data volume. Scan state, the task queue, prior-tick reports, and the heartbeat log all live on a workspace-scoped disk. The agent’s filesystem is the same shape on every tick — the queue it claims tasks from is the queue it left at the end of the previous tick.
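To make "the queue it left at the end of the previous tick" concrete, here is a minimal sketch of a disk-backed task queue. The JSON layout, the status values, and the function names are assumptions for illustration, not the workspace's real on-disk format.

```python
from __future__ import annotations

import json
import os
import tempfile
from pathlib import Path


def load_queue(path: Path) -> list[dict]:
    """Read the queue from disk; a missing file means an empty queue."""
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))


def save_queue(path: Path, tasks: list[dict]) -> None:
    """Write to a temp file in the same directory, then rename it into place,
    so a tick that dies mid-write cannot leave a half-written queue behind."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(tasks, fh, indent=2)
    os.replace(tmp_name, path)


def claim_next(path: Path) -> dict | None:
    """Mark the first pending task as claimed, persist that, and return it."""
    tasks = load_queue(path)
    for task in tasks:
        if task.get("status") == "pending":
            task["status"] = "claimed"
            save_queue(path, tasks)
            return task
    return None
```

Because every claim is persisted before the task is returned, a later tick that starts from a fresh process sees exactly the state the previous one left on the volume.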

A schedule. Every thirty minutes the workspace gets woken with a tick instruction. It bootstraps, pulls its source-of-truth clone, runs whatever scanners haven’t run today, drains its queue into pull requests, drafts blog material from real engineering artifacts, and exits. No long-running daemon. No cross-tick state held in process memory — only the disk.
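The "runs whatever scanners haven't run today" step can be sketched as below. This is a sketch under assumptions: the scanner names, the last_run mapping (which the real workspace would load from and save to its volume), and the function names are all hypothetical.

```python
from __future__ import annotations

from datetime import date
from typing import Callable


def due_scanners(
    scanners: dict[str, Callable[[], None]],
    last_run: dict[str, str],
    today: date,
) -> list[str]:
    """Return the scanners whose last recorded run is not today's date."""
    stamp = today.isoformat()
    return [name for name in scanners if last_run.get(name) != stamp]


def run_due_scanners(
    scanners: dict[str, Callable[[], None]],
    last_run: dict[str, str],
    today: date,
) -> list[str]:
    """Run each due scanner once, record the date, and report what ran."""
    ran = []
    for name in due_scanners(scanners, last_run, today):
        scanners[name]()
        last_run[name] = today.isoformat()  # caller persists this to disk
        ran.append(name)
    return ran
```

Since nothing is held in process memory between ticks, the second tick of a day finds today's date already recorded for every scanner and skips straight to draining the queue.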

What it actually does on a tick

The work cycle is deliberately narrow. Each tick the agent:

Refreshes today’s scan cache against the live site. On-page audits (title, description, canonical, hreflang, JSON-LD coverage, alt-text, internal-link health) and a content-quality pass run unconditionally; the Search Console scanner runs only when its OAuth credential is present.
Walks an internal queue of findings. Anything tagged auto-ship — metadata edits, schema additions, sitemap or hreflang fixes, alt-text repairs, internal-link cleanups, blog drafts — becomes a small pull request, one logical fix per branch. Anything in a more opinionated class — copy rewrites, navigation changes, pricing or redirect work — gets surfaced for human review instead of shipped.
Drafts dev-story blog posts from genuine engineering artifacts in our own repositories. Posts only ship when there’s a verifiable story to tell; the agent is configured to refuse generic content.
Writes a tick report to disk and emits a digest to its parent workspace through the platform’s A2A protocol. If nothing was actionable, the tick is honestly a no-op — the agent isn’t designed to manufacture work to look busy.
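The queue-walking step above amounts to a routing decision per finding. A minimal sketch, assuming hypothetical category strings, a hypothetical branch naming scheme, and a default that sends any unrecognized category to human review (that default is our assumption for the sketch, not a documented rule):

```python
from __future__ import annotations

# Category labels are illustrative; the classes themselves come from the list above.
AUTO_SHIP_CATEGORIES = frozenset({
    "metadata", "schema", "sitemap", "hreflang",
    "alt-text", "internal-link", "blog-draft",
})


def tier_for(category: str) -> str:
    """Anything not explicitly auto-ship falls through to human review."""
    return "auto-ship" if category in AUTO_SHIP_CATEGORIES else "human-review"


def plan(findings: list[dict]) -> tuple[list[str], list[dict]]:
    """Split findings into one branch name per auto-ship fix, plus escalations."""
    branches: list[str] = []
    escalations: list[dict] = []
    for finding in findings:
        if tier_for(finding["category"]) == "auto-ship":
            # One logical fix per branch, so each finding gets its own name.
            branches.append(f"seo/{finding['category']}/{finding['id']}")
        else:
            escalations.append(finding)
    return branches, escalations
```

An empty findings list yields no branches and no escalations, which is the no-op tick described above.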

The review gate is the point

The interesting part of all this isn’t the agent. It’s the gate.

Every pull request the SEO workspace opens has the same fate: it sits in review until two approvals land. The diff is small, deliberately. The PR body links every claim back to the scan finding that produced it, the rendered page content it draws on, and the autonomy tier it falls under. Reviewers, both human and peer-agent, read the diff and the rationale together, and they have every reason to be skeptical: this works at all only because the rate at which the agent is confidently wrong stays visible.

If we removed the review gate, the agent’s defaults would still be conservative. With the review gate, we don’t have to argue about the defaults — the reviewer reading the diff is the final check, and the agent’s job is to make that read as easy as possible.

The first merged change

The gate let one through. The change was a BreadcrumbList JSON-LD addition to the four indexable interior pages on this site (/pricing, /architecture, /legal/terms, /legal/privacy). It’s a small, additive schema fix: it adds an explicit hierarchy node to each page’s existing JSON-LD graph, cross-links that node from the WebPage entity, and is suppressed automatically on noindex URLs.
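For readers who have not written this kind of schema, here is a minimal sketch of the shape of such a change. It is not the merged diff: the helper names, the "#breadcrumb" identifier suffix, the title-casing fallback for crumb names, and the assumption that every intermediate path segment has its own URL are all illustrative. The BreadcrumbList, ListItem, and WebPage.breadcrumb vocabulary is standard schema.org.

```python
from __future__ import annotations

SITE = "https://moleculesai.app"  # domain from the post


def breadcrumb_node(path: str, names: dict[str, str]) -> dict:
    """Build a BreadcrumbList for a path such as "/legal/terms"."""
    items = [{"@type": "ListItem", "position": 1, "name": "Home", "item": SITE + "/"}]
    current = ""
    segments = [s for s in path.strip("/").split("/") if s]
    for position, segment in enumerate(segments, start=2):
        current += "/" + segment
        items.append({
            "@type": "ListItem",
            "position": position,
            # Fallback name is an assumption: "legal" becomes "Legal".
            "name": names.get(current, segment.replace("-", " ").title()),
            "item": SITE + current,
        })
    return {
        "@type": "BreadcrumbList",
        "@id": f"{SITE}{path}#breadcrumb",
        "itemListElement": items,
    }


def with_breadcrumb(
    graph: list[dict], path: str, noindex: bool, names: dict[str, str] | None = None
) -> list[dict]:
    """Return a new graph with the breadcrumb added, or the graph as-is on noindex."""
    if noindex:
        return graph
    node = breadcrumb_node(path, names or {})
    out = []
    for entity in graph:
        if entity.get("@type") == "WebPage":
            # Cross-link by reference; copy so the input graph is not mutated.
            entity = {**entity, "breadcrumb": {"@id": node["@id"]}}
        out.append(entity)
    out.append(node)
    return out
```

The change is purely additive in this sketch: existing entities keep all their properties, the WebPage gains one reference, and noindex pages pass through untouched.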

The PR was five files, +89 / -4. The scan finding that produced it was tagged auto-ship under the schema category. The build was green at submission, the rendered HTML carried the new node where expected and not where it shouldn’t, and two reviewers signed off.

It’s not a thrilling first merge. It’s deliberately not. The first merge of an agent into your production marketing repository should be the smallest legible change you can imagine, because what’s being established is the workflow, not the outcome.

What we get out of it

Concretely: a persistent worker that keeps an eye on the marketing surface so the rest of engineering doesn’t have to. Less concretely: a forcing function for governance.

Building an autonomous engineer for SEO meant we had to be specific about which classes of change an agent is allowed to ship without asking, which require approval, and which require redirection to a human owner entirely. That’s the same conversation every team eventually has when it adopts agents at scale; doing it on our own marketing site, with our own platform, gave us a small, low-stakes test bed for the governance shapes we were already building into Molecule AI for everyone else.

If you’re running a multi-agent system on Molecule AI today, the primitives this workspace uses — scheduled workspaces, least-privilege identity, persistent volumes, A2A-published digests, governance tiers per task category — are the same ones available to your own teams. We just used them on ourselves first.

The agent’s next merged change is somewhere in the queue. We’ll find out which one when the reviewers do.