Operon: Roadmap Complete

Eight Phases, Six Layers, 1130 Tests, and What It Means to Build Agents That Grow Up

Bogdan Banu · March 2026 · github.com/coredipper/operon

Release: v0.23.1
Abstract

The eight-phase roadmap that began with bi-temporal memory in v0.19 is now complete. Operon has grown from a library of typed wiring patterns into a cognitive architecture where agents have auditable memory, adaptive structure selection, sleep consolidation, social learning, curiosity, and developmental staging with critical periods. This post describes what the capstone integration actually proves, how the enriched capabilities compose into something greater than the sum, and what that means for convergence with operational runtimes.

1. The Arc

Phase | Layer | Version | Key Deliverable
----- | ----- | ------- | ---------------
1 | Memory | v0.19 | Bi-temporal facts with dual time axes
2 | Memory | v0.20 | Substrate integration, three-layer context model
3 | Adaptation | v0.21.0 | PatternLibrary + WatcherComponent
4 | Adaptation | v0.21.1 | Adaptive assembly + experience pool
5 | Cognition | v0.22.0 | Cognitive modes + sleep consolidation
6 | Cognition | v0.22.1 | Social learning + curiosity signals
7 | Development | v0.23.0 | Developmental staging + critical periods
8 | Integration | v0.23.1 | Memory adapters, integration tests, paper polish

The progression — structure → memory → adaptation → cognition → development → integration — mirrors the biological sequence from genome (fixed structure) through epigenetics (learned bias) to neural development (plastic then crystallizing). Each layer assumes the previous one is stable. The capstone proves they compose.

2. The Capstone: What Integration Actually Proves

Building subsystems in isolation is the easy part. The hard part is proving they compose — that a watcher can observe a substrate-equipped organism, that consolidation can distill adaptive runs, that scaffolding respects developmental gates. Phase 8 exists to answer that question with five end-to-end integration tests that exercise the full stack.

Substrate + Watcher

An organism runs with a bi-temporal memory substrate and a watcher simultaneously attached. Stages emit facts into the substrate via emit_output_fact; the watcher collects signals and monitors convergence. After the run, the bi-temporal store contains the auditable fact trail and the watcher’s signal history records what it observed at each stage. The two systems do not interfere — they share shared_state but operate on orthogonal keys.
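The non-interference property is simple to picture in code. The sketch below is a toy, not the Operon API: emit_output_fact and shared_state are names from the post, while the dict shapes, key names, and run loop are illustrative assumptions.

```python
import time

def run_stages(stages, shared_state):
    """Toy organism run: each stage emits a fact and a watcher signal.

    Facts live under shared_state["facts"]; watcher signals under
    shared_state["watch"]. The keys are disjoint, so neither system
    can clobber the other even though they share one state dict.
    """
    shared_state.setdefault("facts", [])
    shared_state.setdefault("watch", [])
    for name, fn in stages:
        result = fn()
        # Substrate side: append-only bi-temporal fact trail.
        shared_state["facts"].append({
            "stage": name,
            "value": result,
            "valid_time": time.time(),        # when the fact was true
            "transaction_time": time.time(),  # when it was recorded
        })
        # Watcher side: signal history for convergence monitoring.
        shared_state["watch"].append({"stage": name, "ok": result is not None})
    return shared_state

state = run_stages([("parse", lambda: "tokens"), ("plan", lambda: "steps")], {})
assert set(state) == {"facts", "watch"}  # orthogonal key namespaces
assert [f["stage"] for f in state["facts"]] == ["parse", "plan"]
```

The design choice worth noting is that "do not interfere" is enforced by key discipline, not by locking: each subsystem owns its namespace and only appends.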

Adaptive Assembly + Consolidation

An AdaptiveSkillOrganism fingerprints a task, retrieves the best template from the PatternLibrary, assembles it, runs it, and records the outcome. Then SleepConsolidation takes over: it replays the successful run record into EpisodicMemory, promotes it from WORKING to EPISODIC tier, and compresses recurring patterns into consolidated templates. The experience pool accumulates intervention outcomes across both phases. The loop closes: run → record → consolidate → better template selection next time.
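The closed loop can be sketched in a few lines. This is a minimal stand-in, not the real PatternLibrary: the success-rate scoring, the optimistic prior, and the method names best/record are assumptions made for illustration.

```python
from collections import defaultdict

class ToyPatternLibrary:
    """Toy score-tracked template store (not the real PatternLibrary API)."""

    def __init__(self):
        self.outcomes = defaultdict(list)  # template -> [True/False, ...]

    def best(self, candidates):
        # Pick the candidate with the highest observed success rate;
        # unseen templates get 0.5, an optimistic exploration prior.
        def score(t):
            runs = self.outcomes[t]
            return sum(runs) / len(runs) if runs else 0.5
        return max(candidates, key=score)

    def record(self, template, success):
        self.outcomes[template].append(success)

lib = ToyPatternLibrary()
# One failed "pipeline" run and one successful "map_reduce" run recorded:
# the next selection for a similar fingerprint flips to "map_reduce".
lib.record("pipeline", False)
lib.record("map_reduce", True)
assert lib.best(["pipeline", "map_reduce"]) == "map_reduce"
```

Consolidation in this picture is just a batch of record() calls replayed from the run history, which is why selection quality improves after sleep.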

Social Learning + Development

A mature organism (MATURE stage, 80% telomere consumed) exports its successful templates. A young organism (EMBRYONIC, with an open “rapid adoption” critical period) imports them via scaffold_learner(). Templates with min_stage="adolescent" are filtered out — the learner isn’t ready yet. After ticking the learner forward to ADOLESCENT, the same scaffolding call succeeds for the advanced templates. The critical period closes permanently. Trust scores update based on whether adopted templates actually work for the learner.
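The readiness filter is the heart of that test. Below is a hedged sketch: scaffold_learner and min_stage are names from the post, but the stage ordering, the template dicts, and the function signature are assumptions.

```python
# Assumed developmental ordering, youngest to oldest.
STAGES = ["embryonic", "juvenile", "adolescent", "mature"]

def scaffold_learner(templates, learner_stage):
    """Return only the templates the learner is developmentally ready for."""
    rank = STAGES.index(learner_stage)
    return [t for t in templates if STAGES.index(t["min_stage"]) <= rank]

exported = [
    {"name": "basic_retry", "min_stage": "embryonic"},
    {"name": "deep_planning", "min_stage": "adolescent"},
]

# An embryonic learner only receives the basic template...
assert [t["name"] for t in scaffold_learner(exported, "embryonic")] == ["basic_retry"]
# ...and after ticking forward to ADOLESCENT, the advanced one passes too.
assert len(scaffold_learner(exported, "adolescent")) == 2
```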

Memory Adapters

histone_to_bitemporal() bridges HistoneStore markers into bi-temporal facts; episodic_to_bitemporal() does the same for EpisodicMemory entries. These one-way, non-destructive adapters mean that the three memory systems — epigenetic bias, episodic recall, and auditable facts — can flow into a single temporal store for unified querying. A consolidation cycle that promotes histone marks can now also create bi-temporal facts, making the promotion auditable.
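A one-way, non-destructive bridge of this kind is easy to state precisely. The sketch below assumes simple dict shapes for markers and facts; only the adapter's name comes from the post.

```python
def histone_to_bitemporal(markers, now):
    """Toy one-way adapter: epigenetic markers become append-only facts.

    Non-destructive: the source markers are not modified, and the output
    is a fresh list that a bi-temporal store could ingest and query.
    """
    return [
        {
            "subject": key,
            "value": mark,
            "valid_time": now,        # when the mark is believed true
            "transaction_time": now,  # when the bridge recorded it
            "source": "histone",      # provenance for unified querying
        }
        for key, mark in markers.items()
    ]

markers = {"tool:search": "ACETYLATION", "tool:plan": "METHYLATION"}
facts = histone_to_bitemporal(markers, now=100.0)
assert len(facts) == 2 and all(f["source"] == "histone" for f in facts)
# The source store is untouched -- the flow is strictly one-way.
assert markers == {"tool:search": "ACETYLATION", "tool:plan": "METHYLATION"}
```

Tagging each fact with its source system is what makes a later promotion auditable: the unified store records not just what was believed, but which memory system the belief came from.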

Why This Matters

Each phase was tested in isolation. The integration tests prove something different: that the subsystems were designed to compose from the start. The three-layer context model (topology, ephemeral, bi-temporal) carries through from Phase 2 to Phase 8 without modification. The SkillRuntimeComponent protocol from Phase 3 is the same protocol the developmental signals use in Phase 7. The PatternLibrary scoring function from Phase 3 is the same one that consolidation compresses into and social learning shares across organisms. Coherence was not bolted on at the end — it was the constraint from the beginning.

3. The Enriched Capabilities

What does an Operon organism actually do now, with all eight phases composed?

It starts as an embryonic organism with maximum learning plasticity and minimal capabilities. During a critical period, it rapidly adopts templates from mature peers via trust-weighted social learning. As it ticks through its telomere lifecycle, it transitions through JUVENILE and ADOLESCENT stages, unlocking progressively more complex tools (gated by Plasmid.min_stage). Its watcher monitors three signal categories — epistemic (epiplexity, curiosity), somatic (ATP budget, developmental maturity), and species-specific (immune threats) — and intervenes with retry, escalate, or halt when signals cross thresholds. When curiosity is high on a fast model, it escalates to the deep model for more thorough investigation.

After a batch of runs, it sleeps. The SleepConsolidation cycle prunes stale context, replays successful patterns into episodic memory with tier promotion, compresses recurring high-success patterns into consolidated templates, runs counterfactual replay over bi-temporal corrections (“what if we had known this fact earlier?”), and promotes frequently-accessed histone marks from temporary ACETYLATION to permanent METHYLATION. The experience pool accumulates intervention outcomes across runs, so the watcher’s decisions improve with operational history.
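The histone-promotion step of that cycle reduces to a small rule. This is an illustrative sketch under assumed data shapes; the mark names ACETYLATION and METHYLATION come from the post, the threshold and dict layout do not.

```python
def promote_marks(marks, access_counts, threshold=3):
    """Toy consolidation step: frequently accessed temporary ACETYLATION
    marks become permanent METHYLATION marks; everything else is untouched."""
    return {
        key: ("METHYLATION"
              if mark == "ACETYLATION" and access_counts.get(key, 0) >= threshold
              else mark)
        for key, mark in marks.items()
    }

marks = {"search": "ACETYLATION", "scratch": "ACETYLATION", "plan": "METHYLATION"}
promoted = promote_marks(marks, {"search": 5, "scratch": 1})
# "search" was hot enough to crystallize; "scratch" stays temporary.
assert promoted == {"search": "METHYLATION",
                    "scratch": "ACETYLATION",
                    "plan": "METHYLATION"}
```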

At MATURE stage, it becomes a teacher. It exports proven templates to younger organisms, filtered by the learner’s developmental readiness. The trust registry tracks whether its exports actually help the learner, creating a calibrated reputation. Critical periods close: the rapid-adoption window shuts, and the organism settles into its mature pattern repertoire. The bi-temporal memory preserves the full history — every fact, every correction, every belief state at every point in time — so that any past decision can be reconstructed and explained.
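A calibrated reputation of this kind is commonly tracked with an exponential moving average. The sketch below is an assumption, not Operon's implementation: only the method name record_outcome appears in the post; the EMA form, the alpha value, and the neutral prior are illustrative.

```python
class ToyTrustRegistry:
    """Toy reputation tracker: record_outcome() is the only mutator."""

    def __init__(self, alpha=0.3):
        self.alpha = alpha
        self.trust = {}

    def record_outcome(self, teacher, worked):
        prev = self.trust.get(teacher, 0.5)  # neutral prior for new teachers
        obs = 1.0 if worked else 0.0
        # Exponential moving average: responsive to recent outcomes,
        # but smoothed so one bad export does not erase a good record.
        self.trust[teacher] = (1 - self.alpha) * prev + self.alpha * obs

reg = ToyTrustRegistry()
reg.record_outcome("elder", True)
reg.record_outcome("elder", True)
reg.record_outcome("elder", False)
assert 0.0 < reg.trust["elder"] < 1.0  # calibrated, never saturated
```

Routing all mutation through one method is what makes the trust-monotonicity invariant in Section 5 checkable at all.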

4. What This Means for Convergence

Operon is a library. It does not run agents. It describes how they should be wired, what they should remember, how they should adapt, and when they should grow. The next question is: what happens when this structural layer meets an operational runtime?

Two projects have independently converged on complementary ideas. AnimaWorks builds persistent agent organizations that run 24/7 with identity, memory consolidation, and heartbeat-driven autonomy. It has the cognitive runtime that Operon lacks — but no structural guarantees, no bi-temporal auditability, no convergence detection. Swarms provides enterprise-grade multi-agent orchestration with 10+ pattern types and production scaling. It has the deployment infrastructure that Operon lacks — but no formal topology analysis, no experience pool, no developmental gating.

The completed roadmap makes a three-layer architecture possible:

Operon designs and validates the structure. Swarms executes it at scale. AnimaWorks gives it persistent cognition. Each layer is independently useful; the combination is greater than the sum.

Concretely, the enriched capabilities enable specific integration points that did not exist before the roadmap.

The Deeper Point

The roadmap was never about building everything in one library. It was about building the structural guarantees that make convergence with operational runtimes safe and auditable. An AnimaWorks agent that consolidates memories without bi-temporal auditability cannot explain its past decisions. A Swarms workflow that selects patterns without scored templates cannot learn from failure. A newly deployed agent without developmental gating gets full capabilities before it has proven it can handle them. Operon provides the missing layers. The roadmap proves they compose.

5. Proving It with TLA+

The convergence of three independently developed systems creates coordination problems that benefit from formal specification. Murat Demirbas’s “TLA+ Mental Models” (March 2026) describes seven mental models for effective formal specification. They map directly onto the convergence architecture.

The Illegal Knowledge Problem

Demirbas identifies a critical pitfall: guards that read global state atomically represent “illegal knowledge” no real process could possess. Operon’s WatcherComponent currently reads shared_state atomically — fine in a single process, but in a distributed deployment via Swarms, no single watcher can atomically observe all stages across nodes.

TLA+ refinement distinguishes guards that are locally stable, meaning they can only be changed by the local organism's own actions, from guards that require coordination across processes.

Demirbas’s insight: “If you can make your guards locally stable, the protocol requires less coordination and tolerates communication delays gracefully.” The locally stable guards are the operations that can be implemented without distributed locking. The convergence adapters should be designed around this distinction.
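The distinction is concrete enough to sketch. The two guards below are hypothetical shapes, not Operon code: the first reads only the organism's own counters, so no other process can invalidate it between check and act; the second reads a snapshot of peer state that another node may change concurrently.

```python
def can_promote_mark(local):
    """Locally stable guard: depends only on this organism's own state,
    so it stays true until the organism itself acts. No coordination needed."""
    return local["access_count"] >= 3 and local["mark"] == "ACETYLATION"

def can_adopt_template(local, registry_snapshot):
    """Coordination-requiring guard: reads peer trust and template metadata
    that another node may update concurrently -- in a distributed run this
    snapshot can be stale by the time the adoption executes."""
    return (local["stage_rank"] >= registry_snapshot["template_min_rank"]
            and registry_snapshot["teacher_trust"] >= 0.6)

assert can_promote_mark({"access_count": 4, "mark": "ACETYLATION"})
assert not can_adopt_template({"stage_rank": 1},
                              {"template_min_rank": 2, "teacher_trust": 0.9})
```

The first guard can ship as-is into a distributed deployment; the second is exactly the kind of step a TLA+ spec would force you to either protect with coordination or redesign around local state.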

Safety Invariants

Six safety invariants that the convergence must preserve:

  1. Template adoption safety: an organism never acquires a template with min_stage > its current developmental stage
  2. Trust monotonicity: trust scores only change via record_outcome(), never by direct mutation
  3. Bi-temporal append-only: facts are never deleted, only superseded
  4. Critical period irreversibility: once a critical period closes, it never reopens
  5. Convergence budget: intervention rate never exceeds threshold without triggering HALT
  6. Developmental monotonicity: stages never regress, even if telomeres are renewed
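Before reaching for TLA+, two of these invariants can already be phrased as runtime checks over a run history. This is a toy checker under an assumed snapshot shape, shown only to make the invariants concrete; model checking would verify them over all interleavings rather than one recorded trace.

```python
def check_invariants(history):
    """Toy checker for invariants 3 and 6 over a recorded run history.

    history: list of snapshots, each {"facts": [...], "stage_rank": int}.
    """
    for prev, curr in zip(history, history[1:]):
        # Invariant 3 (bi-temporal append-only): earlier facts must be
        # an unmodified prefix of later ones -- superseded, never deleted.
        assert curr["facts"][:len(prev["facts"])] == prev["facts"], \
            "fact deleted or rewritten"
        # Invariant 6 (developmental monotonicity): stages never regress.
        assert curr["stage_rank"] >= prev["stage_rank"], "stage regressed"
    return True

good = [
    {"facts": ["f1"], "stage_rank": 0},
    {"facts": ["f1", "f2"], "stage_rank": 1},
]
assert check_invariants(good)
```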

These are not just test assertions. They are the contracts that inter-layer adapters must preserve. A Swarms adapter that imports an Operon template must preserve template adoption safety. An AnimaWorks consolidation bridge must preserve bi-temporal append-only semantics. TLA+ model checking can verify these contracts hold across all interleavings — something unit tests cannot do.

Atomicity That Needs Refinement

The current implementation assumes single-process atomicity everywhere. For distributed deployment, these assumptions break:

Operation | Current | Distributed
--------- | ------- | -----------
register_template() | Atomic dict write | Replicated write with lag
consolidate() | Single batch | Concurrent organisms
on_stage_result() | Atomic state write | Async heartbeat signal
record_outcome() | Atomic EMA update | Concurrent trust updates
import_from_peer() | Atomic mutation | Network partition mid-adoption

The TLA+ methodology: start with coarse-grained atomicity (current implementation), systematically split into finer steps, reverify safety at each step. The payoff is that fine-grained actions give Swarms maximum scheduling freedom and AnimaWorks maximum heartbeat concurrency — while the model checker proves the invariants still hold.

Three Candidate Specs

  1. TemplateExchangeProtocol.tla — models peer exchange with trust scoring; proves adoption preserves the safety invariants.
  2. DevelopmentalGating.tla — models the lifecycle with critical periods; proves capability gating is never violated under concurrent adoption.
  3. ConvergenceDetection.tla — models the intervention-rate signal across distributed organisms; proves non-convergence is always detected within bounded steps.

6. What Comes Next

The roadmap is complete. The code is stable. The paper is publication-ready. What remains is the convergence work: building the adapters that connect Operon’s structural layer to AnimaWorks’ cognitive runtime and Swarms’ enterprise orchestration, with TLA+ specifications proving the inter-layer contracts hold under distributed interleavings.

Agents that cannot explain their past decisions are agents you cannot trust with consequential ones. Agents that cannot learn from their operational history are agents that require constant human supervision. Agents that get full capabilities before they have matured are agents that fail in preventable ways. The roadmap addressed all three. The convergence investigation — with formal verification via TLA+ — is about making those guarantees available to agents that actually run in production, with mathematical proof that the guarantees survive the transition from single-process library to distributed deployment.

Code and release: github.com/coredipper/operon, operon-ai on PyPI, documentation