06 — Agentic swarms

Multi-agent orchestration via Temporal. Specialist agents with scoped tool sets, coordinated by a workflow that handles parallelism, timeouts, retries, and human-in-the-loop escalation.

Single-loop vs multi-agent

The insertion points doc (01-insertion-points.md) defines each layer as a single tool-use loop — one system prompt, one user turn, zero-to-few tool calls, one final answer. Correct for Layers A, B, and D.

Layer C, and the broader vision — underwriting triage, cross-domain deal analysis, portfolio risk — demands more. A single agent with fifteen tools cannot reason as well as three specialists with five tools each and tight system prompts.

Orchestration patterns

| Pattern | Shape | When to use |
| --- | --- | --- |
| Supervisor / worker | One orchestrator dispatches to specialists; synthesises results | Default. "Investigate and decide" workflows |
| Pipeline | Agents process in sequence; each adds to growing context | Document processing, staged enrichment |
| Debate / review | Multiple agents independently assess; synthesiser reconciles | High-stakes decisions needing diverse perspectives |
| Router | Lightweight classifier routes to the correct specialist | High-volume triage; most requests need only one specialist |
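The router is the cheapest pattern to sketch: a lightweight classification step maps each request to exactly one specialist, and only that specialist runs. A minimal illustration (the specialist names match the worked examples below, but the keyword rules stand in for what would really be a cheap model call):

```go
package main

import (
	"fmt"
	"strings"
)

// routeRequest maps a free-text request to a single specialist name.
// In production the classifier would be a small, fast model; keyword
// matching here only illustrates the control flow.
func routeRequest(request string) string {
	r := strings.ToLower(request)
	switch {
	case strings.Contains(r, "fraud"):
		return "fraud-screener"
	case strings.Contains(r, "margin"), strings.Contains(r, "allocation"):
		return "margin-analyst"
	case strings.Contains(r, "partner"):
		return "partner-analyst"
	default:
		return "generalist" // fall through to a broad, many-tool agent
	}
}

func main() {
	fmt.Println(routeRequest("possible fraud on this application"))
	fmt.Println(routeRequest("why was the margin clamped?"))
}
```

The payoff is cost: most requests touch one specialist with five tools rather than one generalist with fifteen.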

Temporal as swarm orchestrator

The stack already runs Temporal for BespokeReviewWorkflow. Temporal is a natural swarm backbone:

  • Activities = agent calls. Each specialist is a Temporal activity. The workflow dispatches, handles timeouts, retries failures.
  • Signals = human-in-the-loop. When a specialist escalates, the workflow waits for a human signal — exactly as SignalHumanDecision already works.
  • Durability. If an LLM call times out at minute 3, Temporal retries the activity. The workflow does not restart from scratch.
  • Observability. Every activity execution is in the Temporal UI with input, output, and duration.

Worked example: bespoke review swarm

BespokeReviewSwarmWorkflow
  ├── Activity: IntakeParser (Layer A — Haiku, no tools)
  │     → advisory tags
  ├── Activity: MarginAnalyst (Sonnet, tools: simulate, explain_clamps)
  │     → margin assessment, alternative allocation proposals
  ├── Activity: PartnerAnalyst (Sonnet, tools: get_partner_history, query_analytics)
  │     → partner risk profile, historical pattern
  ├── Activity: FraudScreener (Sonnet, tools: score_fraud_risk, query_analytics)
  │     → fraud risk assessment
  ├── Activity: Synthesiser (Sonnet, no tools — reads specialist outputs)
  │     → structured decision: approve / counter / decline / escalate
  ├── [if escalate] Signal: wait for human decision
  ├── Activity: ApplyDecision (deterministic — no LLM)
  └── Activity: NarrativeWriter (Layer D — Haiku, no tools)
        → human-readable summary

Specialists run in parallel (MarginAnalyst, PartnerAnalyst, FraudScreener). The synthesiser sees structured specialist outputs, not raw tool traces.

Worked example: underwriting triage swarm

UnderwritingTriageSwarmWorkflow
  ├── Activity: DocumentParser (Haiku, no tools)
  │     → structured fields from free-text application
  ├── Activity: FraudAnalyst (Sonnet, tools: score_fraud_risk, query_analytics)
  │     → fraud assessment + supporting evidence from ClickHouse
  ├── Activity: CreditAnalyst (Sonnet, tools: query_analytics, get_credit_data)
  │     → credit risk assessment from analytical layer
  ├── Activity: ComplianceChecker (Haiku, tools: check_sanctions, check_pep)
  │     → compliance flags
  ├── Activity: Synthesiser (Sonnet, reads specialist outputs)
  │     → approve / refer / decline with rationale
  ├── [if refer] Signal: wait for human underwriter
  └── Activity: DecisionWriter (Haiku, no tools)
        → audit-ready decision narrative

The FraudAnalyst calls score_fraud_risk (existing ML model at ../../ml/underwriting-fraud-model/) and query_analytics (ClickHouse for historical fraud patterns). The ML model does the hard mathematics; the LLM reasons about what the score means in context.
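That division of labour — model scores, LLM interprets — amounts to framing the raw score before it reaches the prompt. A sketch of that framing step (the band boundaries and wording are invented for illustration):

```go
package main

import "fmt"

// fraudContext turns a raw score_fraud_risk output into the contextual
// line the FraudAnalyst's prompt receives. Band thresholds are
// illustrative, not calibrated values from the real model.
func fraudContext(score, historicalMean float64) string {
	var band string
	switch {
	case score < 0.3:
		band = "low"
	case score < 0.7:
		band = "elevated"
	default:
		band = "high"
	}
	return fmt.Sprintf(
		"score_fraud_risk returned %.2f (%s band; partner's historical mean is %.2f)",
		score, band, historicalMean)
}

func main() {
	fmt.Println(fraudContext(0.82, 0.15))
}
```

The LLM then reasons over the framed score plus the ClickHouse evidence, rather than re-deriving statistics it cannot compute reliably.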

Swarm configuration

// internal/temporal/swarm/config.go
package swarm

type AgentConfig struct {
    Name           string
    Model          string            // LiteLLM model alias
    SystemPrompt   string            // path to .md template
    Tools          []anthropic.Tool
    MaxToolRounds  int
    TimeoutSeconds int
}

type SwarmConfig struct {
    Name           string
    Agents         []AgentConfig
    Synthesiser    AgentConfig
    MaxBudgetCents int // per-execution cost cap
}

var BespokeReviewSwarm = SwarmConfig{
    Name: "bespoke-review",
    Agents: []AgentConfig{
        {Name: "intake-parser", Model: "claude-haiku",
         MaxToolRounds: 0, TimeoutSeconds: 10},
        {Name: "margin-analyst", Model: "claude-sonnet",
         Tools: []anthropic.Tool{SimulateAllocationToolSchema(), ExplainClampsToolSchema()},
         MaxToolRounds: 3, TimeoutSeconds: 30},
        {Name: "partner-analyst", Model: "claude-sonnet",
         Tools: []anthropic.Tool{PartnerHistoryToolSchema(), QueryAnalyticsToolSchema()},
         MaxToolRounds: 3, TimeoutSeconds: 30},
        {Name: "fraud-screener", Model: "claude-sonnet",
         Tools: []anthropic.Tool{FraudScoringToolSchema(), QueryAnalyticsToolSchema()},
         MaxToolRounds: 2, TimeoutSeconds: 30},
    },
    Synthesiser: AgentConfig{Name: "synthesiser", Model: "claude-sonnet",
        MaxToolRounds: 0, TimeoutSeconds: 15},
    MaxBudgetCents: 50, // $0.50 per review execution
}

Parallel specialist execution

// internal/temporal/swarm/workflow.go (sketch)
func BespokeReviewSwarmWorkflow(ctx workflow.Context, input SwarmInput) (SwarmOutput, error) {
    // Activity options must be set before any ExecuteActivity call;
    // per-agent timeouts come from the swarm config.
    ctx = workflow.WithActivityOptions(ctx, workflow.ActivityOptions{
        StartToCloseTimeout: time.Minute,
        RetryPolicy:         &temporal.RetryPolicy{MaximumAttempts: 3},
    })

    // Phase 1: intake parsing (sequential — feeds all specialists)
    var tags IntakeTags
    if err := workflow.ExecuteActivity(ctx, RunAgent, intake, input).Get(ctx, &tags); err != nil {
        return SwarmOutput{}, err
    }

    // Phase 2: specialists in parallel — start all three futures, then block
    var margin, partner, fraud AgentOutput
    f1 := workflow.ExecuteActivity(ctx, RunAgent, marginAnalyst, input, tags)
    f2 := workflow.ExecuteActivity(ctx, RunAgent, partnerAnalyst, input, tags)
    f3 := workflow.ExecuteActivity(ctx, RunAgent, fraudScreener, input, tags)
    if err := f1.Get(ctx, &margin); err != nil {
        return SwarmOutput{}, err
    }
    if err := f2.Get(ctx, &partner); err != nil {
        return SwarmOutput{}, err
    }
    if err := f3.Get(ctx, &fraud); err != nil {
        return SwarmOutput{}, err
    }

    // Phase 3: synthesis
    var decision SynthesisOutput
    if err := workflow.ExecuteActivity(ctx, RunSynthesiser, margin, partner, fraud, tags).Get(ctx, &decision); err != nil {
        return SwarmOutput{}, err
    }

    if decision.Action == "escalate" {
        // Phase 4: human-in-the-loop — block until the signal arrives
        var humanDecision HumanDecision
        ch := workflow.GetSignalChannel(ctx, "human_decision")
        ch.Receive(ctx, &humanDecision)
        decision = humanDecision.ToSynthesis()
    }

    // Phase 5: apply + narrate
    if err := workflow.ExecuteActivity(ctx, ApplyDecision, decision).Get(ctx, nil); err != nil {
        return SwarmOutput{}, err
    }

    var narrative string
    if err := workflow.ExecuteActivity(ctx, RunAgent, narrativeWriter, input, decision).Get(ctx, &narrative); err != nil {
        return SwarmOutput{}, err
    }

    return SwarmOutput{Decision: decision, Narrative: narrative}, nil
}
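Inside each activity, RunAgent would drive a tool-use loop bounded by the config's MaxToolRounds. A dependency-free sketch of that control flow — the `turn`/`model` types stand in for the Anthropic SDK's message types, which are omitted here:

```go
package main

import "fmt"

// turn is a stand-in for one model response: either tool calls to
// execute, or a final answer.
type turn struct {
	toolCalls []string
	final     string
}

// model is a stand-in for one LLM call over the conversation so far.
type model func(history []string) turn

// runAgent drives the bounded tool loop: at most maxToolRounds rounds
// of tool execution, after which the model must answer or the agent fails.
func runAgent(m model, maxToolRounds int) (string, error) {
	var history []string
	for round := 0; round <= maxToolRounds; round++ {
		t := m(history)
		if len(t.toolCalls) == 0 {
			return t.final, nil // model answered without (more) tools
		}
		if round == maxToolRounds {
			break // rounds exhausted but the model still wants tools
		}
		for _, call := range t.toolCalls {
			// Execute the tool and feed its result back into context.
			history = append(history, "result of "+call)
		}
	}
	return "", fmt.Errorf("agent exceeded %d tool rounds", maxToolRounds)
}

func main() {
	// Fake model: one tool round, then a final answer.
	m := func(history []string) turn {
		if len(history) == 0 {
			return turn{toolCalls: []string{"query_analytics"}}
		}
		return turn{final: "assessment complete"}
	}
	out, err := runAgent(m, 3)
	fmt.Println(out, err)
}
```

Failing hard on round exhaustion (rather than silently truncating) is deliberate: Temporal surfaces the error, and the workflow can retry or escalate.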

Open questions

  1. Swarm vs single agent for Layer C. Single agent is simpler; swarm is more capable. Ship single for Phase 3a (shadow), swarm for Phase 3b (autonomous)?
  2. UW triage swarm — real or aspirational? 6-month or 12-month horizon? Design with UW in mind, ship bespoke first?
  3. Swarm depth. Can specialists spawn sub-specialists, or is one level sufficient?
  4. Cross-swarm tool sharing. Centralised tool package or duplicated per swarm?
  5. Per-execution cost cap. $0.50/review — right order of magnitude?