Orchestrate coding agents like a long-running daemon.
Symphony continuously reads work from an issue tracker, creates an isolated workspace for each issue, and runs a coding-agent session inside it. Each run is repeatable, bounded, and observable.
Four operational problems that turn into a single daemon.
Instead of manual scripts and ad-hoc shells, Symphony treats issue execution as a repeatable, bounded, in-repo workflow.
Repeatable workflow
Issue execution becomes a daemon loop, not a manual script run.
Isolated execution
Each issue gets its own workspace, and agent commands run only inside that directory.
In-repo policy
WORKFLOW.md versions the prompt and runtime settings alongside the code.
Observability
Enough visibility to operate and debug many concurrent agent runs.
Eight components, six layers, one authoritative orchestrator.
The orchestrator owns the runtime state; every other component either feeds it work or executes work on its behalf.
Orchestrator
Owns the poll tick and the in-memory runtime state, and decides which issues to dispatch, retry, stop, or release.
- Serializes state mutations through one authority
- Tracks session metrics and retry queue state
- Reconciles active runs every tick before dispatching
Repo-defined WORKFLOW.md prompt body and team rules for ticket handling, validation, and handoff.
Five claim states. Distinct from tracker states.
The tracker has its own states (Todo, In Progress, and so on). Symphony tracks a separate claim state so that it never dispatches the same issue twice.
Transition triggers
- Poll Tick: Reconcile, validate config, fetch candidates, and dispatch until slots are exhausted.
- Worker Exit (normal): Remove the running entry, update totals, and schedule a continuation retry (attempt 1).
- Worker Exit (abnormal): Remove the running entry, update totals, and schedule an exponential-backoff retry.
- Codex Update Event: Update the live session, token counters, and rate limits.
- Retry Timer Fired: Re-fetch active candidates and attempt re-dispatch, or release the claim if the issue is ineligible.
- Reconciliation Refresh: Stop runs whose issue states are terminal or no longer active.
- Stall Timeout: Kill the worker and schedule a retry.
Run attempt lifecycle
Every tick is six steps, in a strict order.
On each tick the orchestrator reconciles, validates, fetches, sorts, dispatches, and notifies, in that order.
The tracker query is limited to active_states. Dispatch is subject to global and per-state concurrency limits and to the blocker rule for Todo issues.
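The six-step order can be sketched as a single function. This is a minimal illustration, not Symphony's implementation: `run_tick` and its six callables are hypothetical names, and two behaviors are assumptions made for the sketch (a failed validation skips fetch, sort, and dispatch but still notifies; `dispatch` returns `False` once no slot is free).

```python
from typing import Callable, Dict, List

Issue = Dict[str, object]


def run_tick(
    reconcile: Callable[[], None],
    validate: Callable[[], bool],
    fetch: Callable[[], List[Issue]],
    sort: Callable[[List[Issue]], List[Issue]],
    dispatch: Callable[[Issue], bool],
    notify: Callable[[], None],
) -> List[str]:
    """Run one tick and return the names of the steps that ran, in order."""
    ran: List[str] = []

    reconcile()  # 1. reconcile active runs before anything is dispatched
    ran.append("reconcile")

    ran.append("validate")
    if validate():  # 2. assumption: invalid config skips new dispatches
        candidates = fetch()  # 3. fetch candidate issues from the tracker
        ran.append("fetch")
        ordered = sort(candidates)  # 4. order the candidates
        ran.append("sort")
        ran.append("dispatch")
        for issue in ordered:  # 5. dispatch until no slot is free
            if not dispatch(issue):
                break

    notify()  # 6. notify observers
    ran.append("notify")
    return ran
```

Passing the steps in as callables keeps the ordering logic in one place and makes it easy to exercise with stubs.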
Exponential, but capped.
Failure-driven retry delays grow as 10000 · 2^(attempt − 1) ms, clamped to agent.max_retry_backoff_ms (default 5 min).
Two different retry kinds
A clean worker exit triggers a continuation retry with a fixed 1000 ms delay. The worker may have just finished a turn loop, and the orchestrator wants to check whether the issue is still active.
An abnormal exit triggers a failure retry with exponential backoff and a per-issue cap.
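The two retry kinds can be expressed as one small delay function. This is a sketch under the numbers stated in the text (1000 ms fixed delay, 10000 ms base, 5 minute default cap); the function name, constant names, and the `abnormal_exit` flag are hypothetical.

```python
CONTINUATION_DELAY_MS = 1_000  # fixed delay after a clean worker exit
FAILURE_BASE_DELAY_MS = 10_000  # base of the failure backoff
DEFAULT_MAX_RETRY_BACKOFF_MS = 5 * 60 * 1_000  # default cap: 5 minutes


def retry_delay_ms(
    attempt: int,
    abnormal_exit: bool,
    max_retry_backoff_ms: int = DEFAULT_MAX_RETRY_BACKOFF_MS,
) -> int:
    """Return the delay before the next retry, in milliseconds."""
    if attempt < 1:
        raise ValueError("attempt numbers start at 1")
    if not abnormal_exit:
        # Continuation retry: the delay does not depend on the attempt number.
        return CONTINUATION_DELAY_MS
    # Failure retry: 10000 * 2^(attempt - 1), clamped to the configured cap.
    return min(FAILURE_BASE_DELAY_MS * 2 ** (attempt - 1), max_retry_backoff_ms)
```

With the default cap, failure delays run 10 s, 20 s, 40 s, 80 s, 160 s, and then stay at 300 s from attempt 6 onward.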
Three rules the runtime must uphold.
These are the most important portability constraints. Each is enforced before the agent subprocess is launched.
Agent runs only in its workspace
Before launching, the runtime verifies that the subprocess cwd equals the issue's workspace path. Any other cwd aborts the attempt.
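A pre-launch check along these lines might look as follows. This is a sketch only: `WorkspaceViolation` and `check_cwd` are hypothetical names, and comparing the paths after `os.path.abspath` normalization is an assumption about what "equals" means here.

```python
import os


class WorkspaceViolation(Exception):
    """Raised when a launch would run outside the issue's workspace."""


def check_cwd(cwd: str, workspace_path: str) -> None:
    """Abort the attempt unless cwd is exactly the workspace directory."""
    # abspath normalizes both sides, so a trailing slash or a "." segment
    # does not cause a false mismatch.
    if os.path.abspath(cwd) != os.path.abspath(workspace_path):
        raise WorkspaceViolation(
            f"cwd {cwd!r} is not the workspace {workspace_path!r}"
        )
```

Calling `check_cwd` immediately before spawning the subprocess keeps the check and the launch next to each other in the code.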
Workspace stays inside workspace root
Both paths are normalized to absolute paths, and the workspace path must have the root as a prefix directory. Any escape is rejected.
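The containment rule is easy to get wrong with plain string prefixes (`/srv/root-other` starts with `/srv/root`), so a sketch compares path components instead. The function name is hypothetical, and treating a workspace equal to the root itself as invalid is an assumption.

```python
import os
from pathlib import PurePath


def is_inside_root(workspace: str, root: str) -> bool:
    """Return True if workspace is a strict descendant of root."""
    # Normalize both paths to absolute form; this also collapses "..".
    ws = PurePath(os.path.abspath(workspace))
    rt = PurePath(os.path.abspath(root))
    # Component-wise check: root must be one of the workspace's ancestors.
    return rt in ws.parents
```

Note that `os.path.abspath` does not resolve symlinks; a stricter variant could use `os.path.realpath` on both sides instead.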
Workspace key is sanitized
Only [A-Za-z0-9._-] survive in workspace directory names. Every other character is replaced with _.
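The sanitization rule maps directly onto a single regular-expression substitution. A minimal sketch, with a hypothetical function name:

```python
import re

# Anything outside the allowed set [A-Za-z0-9._-] is considered unsafe.
_UNSAFE_CHARS = re.compile(r"[^A-Za-z0-9._-]")


def sanitize_workspace_key(identifier: str) -> str:
    """Replace every disallowed character with an underscore."""
    return _UNSAFE_CHARS.sub("_", identifier)
```

Because `.` is an allowed character, a key such as `..` passes through unchanged, which is one reason the separate root-containment rule is still needed.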
Five failure classes, one principle: stay alive.
The orchestrator never crashes on transient errors. Dashboard, log sink, and tracker fetch failures all degrade gracefully. Worker failures become retries.
Workflow / Config
- Missing WORKFLOW.md
- Invalid YAML front matter
- Unsupported tracker kind
- Missing agent executable
Workspace
- Directory creation failed
- Population / sync failed
- Invalid path config
- Hook timeout / failure
Agent Session
- Startup handshake failure
- Turn failed / cancelled
- Turn timeout
- Subprocess exit
- Stalled session
Tracker
- API transport errors
- Non-200 status
- GraphQL errors
- Malformed payloads
Observability
- Snapshot timeout
- Dashboard render errors
- Log sink config failure
Recovery behavior
- dispatch validation: Skip new dispatches, keep the service alive, and continue reconciliation.
- worker failure: Convert to an exponential-backoff retry.
- tracker fetch failure: Skip this tick and try again on the next one.
- state-refresh failure: Keep current workers and retry on the next tick.
- dashboard / log failure: Never crash the orchestrator.
- process restart: No in-memory retry timer or live session survives. Recovery is tracker- and filesystem-driven.
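The "skip this tick" rule for tracker fetch failures can be sketched as a poll loop that catches the error and moves on. Everything named here is hypothetical (`TrackerFetchError`, `poll`, and the callables); a real daemon would loop indefinitely on a timer rather than for a fixed number of ticks.

```python
from typing import Callable, Iterable


class TrackerFetchError(Exception):
    """Stands in for transport errors, non-200 responses, and bad payloads."""


def poll(
    fetch_candidates: Callable[[], Iterable[str]],
    dispatch: Callable[[str], None],
    ticks: int,
    log: Callable[[str], None] = print,
) -> int:
    """Run a fixed number of ticks and return how many were skipped."""
    skipped = 0
    for tick in range(ticks):
        try:
            candidates = list(fetch_candidates())
        except TrackerFetchError as exc:
            # Stay alive: record the failure and wait for the next tick.
            log(f"tick={tick} tracker fetch failed, skipping: {exc}")
            skipped += 1
            continue
        for issue_id in candidates:
            dispatch(issue_id)
    return skipped
```

Catching only the tracker-specific exception keeps programming errors visible while transient tracker problems cost a single tick.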