# 07 — Cost controls, safety, and audit
Per-execution budgets, spend tracking, observability, API key management, prompt injection defence, and audit trail.
## Cost envelope
At 100 reviews/day running all four layers:
| Layer | Volume | Tokens/call | Model | Daily cost |
|---|---|---|---|---|
| A (intake parse) | 100 | ~1.5k | Haiku | ~$0.30 |
| D (narrative) | 100 | ~2.5k | Haiku | ~$0.50 (cached) |
| B (advisory) | ~10 (bursty) | ~4k | Sonnet | ~$0.40 |
| C (shadow) | 100 | ~8k | Sonnet | ~$8.00 |
| **Total (shadow)** | | | | **~$10/day** |
| **Full swarm** | 100 × 5 agents | | | **~$50/day** |
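The shadow-mode total can be sanity-checked directly from the per-layer figures in the table:

```python
# Per-layer daily costs (USD) from the table above, shadow mode.
layer_daily_cost_usd = {
    "A (intake parse)": 0.30,
    "D (narrative)": 0.50,
    "B (advisory)": 0.40,
    "C (shadow)": 8.00,
}

shadow_total = sum(layer_daily_cost_usd.values())
print(f"shadow-mode total: ~${shadow_total:.2f}/day")  # ≈ $9.20, i.e. ~$10/day
```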
Prompt caching is load-bearing. System prompts and tool schemas are stable across calls, so a >80% cache hit rate is realistic for Layers A/D.
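The caching contract can be made explicit in the request. A minimal sketch, assuming the Anthropic Messages API's `cache_control` marker on system content blocks (the model id and prompt text here are placeholders, not values from this design):

```python
# The stable prefix (system prompt, and in practice tool schemas) is marked
# cacheable with an ephemeral cache_control block; the per-review user text
# sits after it and is never cached.
STABLE_SYSTEM_PROMPT = "You are the intake parser. Extract structured fields."  # placeholder
intake_text = "Client requests bespoke review of allocation X."                 # varies per call

request = {
    "model": "claude-3-5-haiku-latest",  # assumption: a Haiku model id
    "max_tokens": 1024,
    "system": [
        {
            "type": "text",
            "text": STABLE_SYSTEM_PROMPT,          # identical across calls -> cache hit
            "cache_control": {"type": "ephemeral"},
        }
    ],
    "messages": [
        {"role": "user", "content": intake_text},  # per-review content, not cached
    ],
}
```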
## Per-execution budgets
Each swarm execution has a MaxBudgetCents cap. The tool dispatcher
tracks cumulative token usage. If budget is exhausted mid-swarm, the
synthesiser receives a "budget exhausted" signal and must decide with
available information.
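A minimal sketch of this mechanism, with hypothetical names (the document specifies only the `MaxBudgetCents` cap and the "budget exhausted" signal, not this interface):

```python
# The dispatcher accumulates cost per swarm execution and checks the cap
# before each call; on exhaustion it surfaces a signal to the synthesiser
# rather than aborting the execution.
from dataclasses import dataclass


@dataclass
class ExecutionBudget:
    max_budget_cents: float  # the MaxBudgetCents cap for this execution
    spent_cents: float = 0.0

    def record(self, cost_cents: float) -> None:
        self.spent_cents += cost_cents

    @property
    def exhausted(self) -> bool:
        return self.spent_cents >= self.max_budget_cents


budget = ExecutionBudget(max_budget_cents=50.0)
budget.record(4.2)   # one specialist call
budget.record(47.0)  # a later call tips it over the cap

if budget.exhausted:
    # The synthesiser receives this and must decide with available information.
    signal = {"type": "budget_exhausted", "spent_cents": budget.spent_cents}
```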
## LiteLLM spend tracking
LiteLLM's Postgres database records every call:

- model, provider, tokens (in/out/cached), cost
- API key used, request tags (review_id, workflow_id)
- full request/response payloads (optional; toggle per model)
Exposed via admin API:
- `GET /spend/logs` — per-call log
- `GET /spend/tags` — spend by tag
- `GET /budget/info` — current budget consumption
## Langfuse integration (optional)
LiteLLM supports Langfuse as a callback:

- Trace visualisation (full tool-loop chains)
- Cost dashboards by model, user, feature
- Prompt versioning and A/B testing
- Evaluation pipelines (concordance scoring for shadow mode)
Not required for v1; valuable for Phase 3a when concordance measurement becomes load-bearing.
## Structured logging
Every LLM call is logged with business context:
```json
{
  "event": "llm_call",
  "agent": "margin-analyst",
  "workflow_id": "bespoke-review-abc123",
  "review_id": "rev_01JXYZ",
  "model": "claude-sonnet",
  "input_tokens": 3200,
  "output_tokens": 450,
  "cache_read_tokens": 1800,
  "tool_calls": ["simulate_allocation"],
  "latency_ms": 2340,
  "cost_cents": 4.2,
  "stop_reason": "end_turn"
}
```
Emitted from the tool dispatcher, not the HTTP client — captures which agent and which review alongside the technical metrics.
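A sketch of the emit path using stdlib logging (the helper name and `usage` split are assumptions; field names follow the event above):

```python
# The dispatcher, not the HTTP client, knows which agent and review a call
# belongs to, so it merges that business context with the technical metrics
# before emitting one JSON line per call.
import json
import logging

logger = logging.getLogger("llm_audit")


def log_llm_call(agent: str, workflow_id: str, review_id: str, usage: dict) -> dict:
    event = {
        "event": "llm_call",
        "agent": agent,
        "workflow_id": workflow_id,
        "review_id": review_id,
        **usage,  # token counts, cost, latency, stop_reason from the client
    }
    logger.info(json.dumps(event))
    return event


event = log_llm_call(
    "margin-analyst", "bespoke-review-abc123", "rev_01JXYZ",
    {"model": "claude-sonnet", "input_tokens": 3200, "output_tokens": 450,
     "cost_cents": 4.2, "stop_reason": "end_turn"},
)
```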
## API key management
- `ANTHROPIC_API_KEY` — stored in secrets manager (Vault / Azure Key Vault). Never in compose env files for production.
- LiteLLM master key — rotated independently; scoped to internal traffic only.
- Per-service virtual keys (LiteLLM feature) — each service gets its own key with its own budget. The BFF cannot exhaust rev-sci's budget.
## Prompt injection defence
LLM agents that process user-supplied text (Layer A intake parsing, Layer C reviewing partner justifications) are prompt injection surfaces. Mitigations:
- User-supplied text is always in the `user` message, never interpolated into the `system` prompt.
- Tool results are returned as `tool_result` content blocks, not concatenated into message text.
- Agents that process untrusted text have no write-effect tools. Layer A has no tools at all; Layer C's write-effect tools (`approve_review`, `decline_review`) require the synthesiser to act — a specialist cannot approve directly.
## Audit trail
Every LLM-influenced decision captures:

- Full prompt (system + messages + tool schemas)
- All tool calls and results
- Final response
- Token counts and cost
- Workflow ID + review ID for correlation
Stored in `llm_audit_log` (ClickHouse, reading parquet on NFS / ADLS) for long-term retention, not in the transactional database. The audit parquet is a Dagster asset — materialised from the LiteLLM Postgres spend log, partitioned by date. Retention: 2 years minimum for compliance.
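A sketch of the date-partitioned layout and the retention cutoff; the root prefix and file name are assumptions, only "partitioned by date" and "2 years minimum" come from this design:

```python
# Hive-style date partitions make ClickHouse/Dagster partition pruning cheap,
# and retention becomes a per-partition date comparison.
from datetime import date, timedelta

AUDIT_ROOT = "llm_audit_log"  # assumption: root prefix on NFS / ADLS


def partition_path(day: date) -> str:
    return f"{AUDIT_ROOT}/date={day.isoformat()}/calls.parquet"


def is_expired(day: date, today: date, retention_days: int = 2 * 365) -> bool:
    # Partitions older than the 2-year compliance floor are candidates for pruning.
    return (today - day) > timedelta(days=retention_days)


print(partition_path(date(2025, 1, 15)))
# llm_audit_log/date=2025-01-15/calls.parquet
```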
## Open questions
- Monthly spend budget and approval authority. Who approves?
- Prompt caching validation. Prototype Phase 1 and measure actual cache hit rates before projecting Phase 3 costs?
- Concordance threshold. What % agreement graduates shadow → autonomous? Who sets it?
- Prompt governance. Who can edit prompt templates? PR review process?
- Rate limiting LLM calls. Per-day or per-month cost cap? Does the cap interact with the narrow-band threshold?