How to structure DAGs, SOPs, and procedural rules to guide planning agents.

Modern AI systems increasingly rely on multi-agent architectures, where specialized agents (planning, search, action, review, and response) collaborate to achieve complex goals. To make that collaboration reliable and efficient, we have found that data must flow cleanly across three key layers.
Each layer needs intentional data modeling. Done right, it eliminates duplication, reduces operational cost, and keeps your agents aligned with the real circumstances of your business, avoiding drift into disconnected silos.
👉 This series expands on the framework introduced in our main article, Data Modeling for Multi-Agent Architectures, diving deeper into the third layer and how to design it effectively.
The Orchestration Layer defines procedural rules for planning agents, specifying who does what, when, and under which conditions.
In practice, this often takes the form of DAGs (Directed Acyclic Graphs) and SOPs (standard operating procedures). These structures ensure that agents execute tasks reliably, consistently, and in alignment with business processes, without hardcoding logic into prompts or ad-hoc scripts.
Each process (e.g., “Customer Query Resolution”) should be modeled as a DAG, where nodes are tasks assigned to specific agents and edges encode the dependencies between them.
Why DAGs? They make the orchestration inspectable, testable, and replayable, which is critical for debugging multi-agent behavior. Store them as a combination of structured metadata and declarative configuration, rather than a single JSON blob.
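As a minimal sketch of this idea (the task names, agent roles, and field layout are illustrative, not a prescribed schema), a process like “Customer Query Resolution” can be expressed as plain task metadata plus a cycle check:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class TaskNode:
    """One step in the process, assigned to a specialized agent."""
    name: str
    agent: str            # e.g. "planning", "search", "review"
    depends_on: tuple = ()  # names of upstream tasks

@dataclass
class WorkflowDAG:
    name: str
    version: str
    tasks: list = field(default_factory=list)

    def validate_acyclic(self):
        """Reject cycles so the graph is a true DAG (simple DFS check)."""
        deps = {t.name: set(t.depends_on) for t in self.tasks}
        visited, in_stack = set(), set()

        def visit(node):
            if node in in_stack:
                raise ValueError(f"cycle detected at {node!r}")
            if node in visited:
                return
            in_stack.add(node)
            for upstream in deps.get(node, ()):
                visit(upstream)
            in_stack.discard(node)
            visited.add(node)

        for t in self.tasks:
            visit(t.name)

# Hypothetical "Customer Query Resolution" process
dag = WorkflowDAG(
    name="customer_query_dag",
    version="v2",
    tasks=[
        TaskNode("classify_query", agent="planning"),
        TaskNode("retrieve_context", agent="search", depends_on=("classify_query",)),
        TaskNode("draft_answer", agent="response", depends_on=("retrieve_context",)),
        TaskNode("review_answer", agent="review", depends_on=("draft_answer",)),
    ],
)
dag.validate_acyclic()
```

Keeping the definition as plain data like this is what makes it easy to serialize into declarative configuration and validate independently of any execution engine.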
Workflow definitions and SOPs can be persisted in several ways depending on maturity:
| Stage | Approach | Pros | Cons |
|---|---|---|---|
| Prototype / Small-scale | YAML/JSON configs in Git (with schema validation) | Versioned, human-readable, simple deployment | No real-time updates, limited runtime introspection |
| Operational / Mid-scale | Workflow orchestration tools like Prefect, Temporal, or Airflow, with metadata stored in Postgres/SQLite | Built-in state, retries, monitoring, DAG UI | Needs integration with agent framework |
| Advanced / Large-scale | Hybrid: workflows as code (DSL) + execution metadata in orchestration DB | Fine-grained observability, replayability, high resilience | More infrastructure overhead, requires DevOps discipline |
Best practice for early to mid-stage startups: Use Prefect or Temporal. Workflows are versioned as code, executions are tracked in a metadata DB, and failed tasks can be rerun or rolled back with minimal friction.
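A minimal Prefect 2.x sketch of what this looks like in practice (the task bodies and retry settings are illustrative, not from a real deployment):

```python
from prefect import flow, task

@task(retries=2, retry_delay_seconds=10)
def retrieve_context(query: str) -> str:
    # Search agent would run here; failures are retried by Prefect
    return f"context for {query!r}"

@task
def draft_answer(context: str) -> str:
    # Response agent would run here
    return f"answer based on {context}"

@flow(name="customer_query_dag")
def customer_query_flow(query: str):
    # Prefect records each task run in its metadata DB,
    # giving state tracking, retries, and a DAG UI for free
    context = retrieve_context(query)
    return draft_answer(context)

if __name__ == "__main__":
    customer_query_flow("Where is my order?")
```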
Consider storing SOP files in Git or object storage (e.g., S3) while keeping execution metadata in Postgres/SQL. This separation ensures traceability, observability, and flexibility for branching or rollback.
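A sketch of that separation, assuming PyYAML, boto3, and psycopg2 are available; the bucket, key, and `workflow_runs` table here are hypothetical:

```python
import boto3
import psycopg2
import yaml

# Load the versioned SOP/workflow definition from object storage
s3 = boto3.client("s3")
obj = s3.get_object(Bucket="workflows", Key="customer_v2.yaml")
workflow_config = yaml.safe_load(obj["Body"].read())

# Record the execution in the metadata DB, pointing back at the config
conn = psycopg2.connect("dbname=orchestration")
with conn, conn.cursor() as cur:
    cur.execute(
        """
        INSERT INTO workflow_runs (workflow_name, config_path, status)
        VALUES (%s, %s, %s)
        """,
        (workflow_config["name"], "s3://workflows/customer_v2.yaml", "started"),
    )
```

Because the database only holds pointers and run state, rolling back to an earlier workflow version is a metadata update, not a data migration.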
Workflows should explicitly reference CKMS entities, with tasks pointing at entity identifiers rather than embedding copies of the underlying data.
This guarantees context lineage, allowing agents to reason consistently and providing auditability. Feedback loops can also be established: orchestration logs → CKMS updates → inform future agent decisions.
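What such a reference might look like inside a task definition (the `ckms://` URI scheme and `ckms_client` interface are assumptions for illustration; the real schema belongs to the CKMS layer):

```python
# A task definition that references CKMS entities by ID instead of
# duplicating their content; this is what preserves context lineage
task_definition = {
    "name": "draft_answer",
    "agent": "response",
    "ckms_refs": {
        "customer": "ckms://entity/customer/42",      # hypothetical URI scheme
        "policy": "ckms://entity/policy/refunds-v3",
    },
}

def resolve_context(refs: dict, ckms_client) -> dict:
    """Fetch referenced entities at runtime so every agent reasons
    over the same, current version of the facts (assumed client API)."""
    return {key: ckms_client.get(uri) for key, uri in refs.items()}
```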
A pragmatic orchestration stack may include a workflow engine such as Prefect or Temporal, Postgres for execution metadata, and Git or object storage (e.g., S3) for versioned workflow and SOP definitions.
Agents should never directly manipulate workflow definitions; the orchestrator enforces structure and consistency.
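One way to enforce that boundary is to expose only an execution surface to agents and keep definition management on a separate, human-gated path. A sketch, not a prescribed API:

```python
class WorkflowRegistry:
    """Read-only view over versioned workflow definitions (e.g. S3/Git)."""
    def get(self, name: str, version: str) -> dict:
        ...  # hypothetical: fetch and parse the versioned YAML config

class Orchestrator:
    def __init__(self, registry: WorkflowRegistry):
        self._registry = registry

    # The only surface exposed to agents: run, never edit.
    def run(self, workflow_name: str, version: str, params: dict):
        dag = self._registry.get(workflow_name, version)
        return self._execute(dag, params)

    def _execute(self, dag: dict, params: dict):
        ...  # walk the DAG, dispatch tasks to agents, log state
```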
Instead of storing the full workflow config in Postgres, store metadata and version pointers in an `orchestration_workflows` table:

| id | name | version | config_path | created_at |
|---|---|---|---|---|
| 1 | customer_query_dag | v2 | s3://workflows/customer_v2.yaml | 2025-10-15 |
At runtime, the orchestrator loads the referenced workflow, executes its tasks, and logs state in the metadata DB. This setup provides versioned, human-readable definitions, replayable and auditable executions, and a clean separation between configuration and runtime state.
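A runnable sketch of the version-pointer pattern, using SQLite instead of Postgres for brevity (column names follow the table above):

```python
import sqlite3

conn = sqlite3.connect("orchestration.db")
conn.execute(
    """
    CREATE TABLE IF NOT EXISTS orchestration_workflows (
        id          INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        version     TEXT NOT NULL,
        config_path TEXT NOT NULL,   -- pointer to the YAML in S3/Git
        created_at  TEXT DEFAULT CURRENT_TIMESTAMP,
        UNIQUE (name, version)
    )
    """
)

# Register a new version: the config itself stays in object storage,
# only the pointer lands in the database
with conn:
    conn.execute(
        "INSERT INTO orchestration_workflows (name, version, config_path) "
        "VALUES (?, ?, ?)",
        ("customer_query_dag", "v2", "s3://workflows/customer_v2.yaml"),
    )

# The orchestrator resolves the latest version at runtime
row = conn.execute(
    "SELECT config_path FROM orchestration_workflows "
    "WHERE name = ? ORDER BY version DESC LIMIT 1",
    ("customer_query_dag",),
).fetchone()
print(row[0])  # -> s3://workflows/customer_v2.yaml
```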
Proper orchestration ensures multi-agent AI systems act reliably, contextually, and predictably, even as workflows evolve.



