Blog

Harness Engineering as Categorical Architecture

Paper 5 is on arXiv. The triple (G, Know, Φ) is the formalism behind every gate, certificate, and adapter Operon has shipped — here is what it claims, what it does not, and what it unlocks for the rest of the work.

Operon Paper 5 Categorical Architecture May 13, 2026

Pinecones and the Portable Certificate

A materials-engineering preprint from MIT just made the substrate-independence move explicit. The Operon papers have been making the same move silently for five iterations — here is what we can borrow from how they did it.

Operon Certificates Compositionality April 27, 2026

SLAM Already Solved Stagnation

operon-langgraph-gates v0.1 is a discrete-state port of factor-graph fixed-lag smoothing (Kaess 2012). The port is trivial — the scope discipline that falls out is the upgrade. Written in reply to Dellaert’s April 21 GTSAM post on factor graphs as world models.

operon-langgraph-gates Factor Graphs Structural Guarantees April 23, 2026

Score-Rejection Isn’t Cert-Firing (and n=10 Isn’t Enough)

I ran the same structural-critic experiment twice on the same 10 SWE-bench-lite instances: +0 pp pass@1 on one run, +10 pp on the next. Zero certificate fires in either run. An honest update to an earlier post that conflated score-based retry with sustained-stagnation attestation.

operon-openhands-gates SWE-bench Structural Critics April 21, 2026 (updated)

Phase C: The 8B Ceiling Holds Cross-Model, and Retry Is the Wrong Lever

22 hours of Ollama on a second 8B model with format-correction retry active. 0/30 evaluated, 0 retry-recovered patches. The v0.34.5 “single survivor” was gemma4-specific. Retry helps competence-with-lapses, not capability-ceiling models — a sharper negative than v0.34.5 predicted.

Operon SWE-bench Cross-Model April 18, 2026

When Your Diagnostic Beats Your Result

A patch sanitizer + repo grounding pipeline disambiguated SWE-bench Phase 2’s “model or harness?” failure mode. The score didn’t move — but the failure became attributable, and the bottleneck localized to 8B diff-format discipline rather than file selection.

Operon SWE-bench Diagnostic Infrastructure April 17, 2026

Where Structural Guarantees Actually Help (and Where They Don’t)

End-to-end evaluation with real LLM agents reveals that Operon’s value depends on which layer you’re protecting. Updated for v0.33.1: interactive HF Spaces, per-stage LangGraph, Paper 5 citations.

Operon E2E Evaluation Structural Guarantees April 10, 2026

Proving Your Agent Guarantees Survive Deployment

Operon now has self-verifiable certificates, empirical topology validation, and proof that structural guarantees survive compilation to four frameworks

Operon Certificates Validation April 8, 2026

What Biological Agent Design Actually Buys You

After benchmarking three Operon subsystems against naive alternatives, the honest answer is: structural guarantees, not algorithmic sophistication

Operon Benchmarks Structural Guarantees April 4, 2026

Operon v0.26: What Happens When You Try to Evolve Your Agent Architectures

We built biological evolution for AI organisms. Here’s why random mutation won.

Operon Meta-Evolution April 3, 2026

Operon v0.25: The Compile–Decompile Loop

Why comparing agent frameworks requires a round-trip through structure, and what Scion taught us about isolation

Operon Evaluation Scion Compilation Multi-Agent March 2026

Operon v0.24: Why Your Agent Framework Needs a Structural Linter

Five-layer convergence architecture, TLA+ verification, and what it means for Swarms, DeerFlow, and the rest

Operon Convergence TLA+ Verification Multi-Agent March 2026

Operon: Roadmap Complete

Eight Phases, Six Layers, 1130 Tests, and What It Means to Build Agents That Grow Up

Operon Roadmap Release March 2026

Operon v0.23: Developmental Staging

Critical Periods, Capability Gating, and What Happens When Agents Grow Up

Operon Development Critical Periods Scaffolding March 2026

Operon v0.22: The Cognitive Architecture

Cognitive Modes, Sleep Consolidation, Social Learning, Curiosity, and What Happens When Agents Start Dreaming

Operon Cognition Social Curiosity March 2026

Operon v0.21: Adaptive Foundations

Pattern Repository, Watcher Component, and the Static Scaffolding for Dynamic Assembly

Operon Adaptation Watcher Patterns March 2026

Operon v0.20: Substrate Integration

Three-Layer Context, Auditable Workflows, and the Question Every Multi-Stage System Should Answer

Operon Substrate Bi-Temporal Workflows March 2026

Operon v0.19: Bi-Temporal Memory

Append-Only Facts, Dual Time Axes, and Belief-State Reconstruction for Auditable Agent Systems

Operon Memory Temporal Compliance March 2026

Operon v0.18: Pattern-First Skills

Skill Organisms, Provider-Bound Agents, and a Thinner Front Door for Multi-Agent Workflows

Operon API Design AI Agents Skills March 2026

Operon v0.17: Epistemic Topology

Observation Profiles, Topology Classification, and Structural Predictions for Multi-Agent Systems

Operon Epistemics AI Agents Topology March 2026

Operon v0.15: Diagram Optimization

Cost-Annotated Wiring Diagrams, Categorical Rewriting, and Resource-Aware Execution

Operon Optimization Category Theory Metabolism March 2026

Operon v0.14: Coalgebra, Diffusion, Optics

Formal State Machines, Spatially Varying Gradients, and Conditional Wire Routing

Operon Coalgebra Diffusion Optics February 2026

Operon v0.13: Multicellular

Plasmid Registry, Denaturation Layers, and Tissue Architecture

Operon HGT Denaturation Tissue February 2026

Operon v0.12: Verification

Eval Harness, Mathematical Corrections, and Reproducible Benchmarks

Operon Eval BFCL AgentDojo February 2026

Operon v0.11: Homeostasis

Epiplexity, Innate Immunity, and the Metabolic Motherboard

Operon MIPS Homeostasis January 2026

Biological Motifs for Agentic Control

A Categorical Isomorphism between Gene Regulatory Networks and Autonomous Software Architectures

Category Theory Systems Biology AI Agents Preprint