# 07 — Cost controls, safety, and audit
Per-execution budgets, spend tracking, observability, API key management, prompt injection defence, and audit trail.
## Cost envelope
At 100 reviews/day running all four layers:
| Layer | Volume | Tokens/call | Model | Daily cost |
|---|---|---|---|---|
| A (intake parse) | 100 | ~1.5k | Haiku | ~$0.30 |
| D (narrative) | 100 | ~2.5k | Haiku | ~$0.50 (cached) |
| B (advisory) | ~10 (bursty) | ~4k | Sonnet | ~$0.40 |
| C (shadow) | 100 | ~8k | Sonnet | ~$8.00 |
| **Total (shadow)** | | | | **~$10/day** |
| **Full swarm** | 100 × 5 agents | | | **~$50/day** |
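The shadow-mode total can be sanity-checked directly from the per-layer figures in the table:

```python
# Per-layer daily costs (USD) from the table above, shadow mode.
layer_daily_cost_usd = {
    "A (intake parse)": 0.30,
    "D (narrative)": 0.50,
    "B (advisory)": 0.40,
    "C (shadow)": 8.00,
}

shadow_total = sum(layer_daily_cost_usd.values())
print(f"shadow-mode total: ~${shadow_total:.2f}/day")  # ≈ $9.20, i.e. ~$10/day
```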
Prompt caching is load-bearing. System prompts and tool schemas are stable across calls, so a >80% cache hit rate is realistic for Layers A/D.
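The caching contract can be made explicit in the request. A minimal sketch, assuming the Anthropic Messages API's `cache_control` marker on system content blocks (the model id and prompt text here are placeholders, not values from this design):

```python
# The stable prefix (system prompt, and in practice tool schemas) is marked
# cacheable with an ephemeral cache_control block; the per-review user text
# sits after it and is never cached.
STABLE_SYSTEM_PROMPT = "You are the intake parser. Extract structured fields."  # placeholder
intake_text = "Client requests bespoke review of allocation X."                 # varies per call

request = {
    "model": "claude-3-5-haiku-latest",  # assumption: a Haiku model id
    "max_tokens": 1024,
    "system": [
        {
            "type": "text",
            "text": STABLE_SYSTEM_PROMPT,          # identical across calls -> cache hit
            "cache_control": {"type": "ephemeral"},
        }
    ],
    "messages": [
        {"role": "user", "content": intake_text},  # per-review content, not cached
    ],
}
```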
## Per-execution budgets
Each swarm execution has a MaxBudgetCents cap. The tool dispatcher
tracks cumulative token usage. If budget is exhausted mid-swarm, the
synthesiser receives a "budget exhausted" signal and must decide with
available information.
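A minimal sketch of this mechanism, with hypothetical names (the document specifies only the `MaxBudgetCents` cap and the "budget exhausted" signal, not this interface):

```python
# The dispatcher accumulates cost per swarm execution and checks the cap
# before each call; on exhaustion it surfaces a signal to the synthesiser
# rather than aborting the execution.
from dataclasses import dataclass


@dataclass
class ExecutionBudget:
    max_budget_cents: float  # the MaxBudgetCents cap for this execution
    spent_cents: float = 0.0

    def record(self, cost_cents: float) -> None:
        self.spent_cents += cost_cents

    @property
    def exhausted(self) -> bool:
        return self.spent_cents >= self.max_budget_cents


budget = ExecutionBudget(max_budget_cents=50.0)
budget.record(4.2)   # one specialist call
budget.record(47.0)  # a later call tips it over the cap

if budget.exhausted:
    # The synthesiser receives this and must decide with available information.
    signal = {"type": "budget_exhausted", "spent_cents": budget.spent_cents}
```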
## LiteLLM spend tracking
LiteLLM's Postgres database records every call:

- model, provider, tokens (in/out/cached), cost
- API key used, request tags (review_id, workflow_id)
- full request/response payloads (optional; toggle per model)
Exposed via admin API:
- `GET /spend/logs` — per-call log
- `GET /spend/tags` — spend by tag
- `GET /budget/info` — current budget consumption
## Langfuse integration (optional)
LiteLLM supports Langfuse as a callback:

- Trace visualisation (full tool-loop chains)
- Cost dashboards by model, user, feature
- Prompt versioning and A/B testing
- Evaluation pipelines (concordance scoring for shadow mode)
Not required for v1; valuable for Phase 3a when concordance measurement becomes load-bearing.
## Structured logging
Every LLM call is logged with business context:
```json
{
  "event": "llm_call",
  "agent": "margin-analyst",
  "workflow_id": "bespoke-review-abc123",
  "review_id": "rev_01JXYZ",
  "model": "claude-sonnet",
  "input_tokens": 3200,
  "output_tokens": 450,
  "cache_read_tokens": 1800,
  "tool_calls": ["simulate_allocation"],
  "latency_ms": 2340,
  "cost_cents": 4.2,
  "stop_reason": "end_turn"
}
```
Emitted from the tool dispatcher, not the HTTP client — captures which agent and which review alongside the technical metrics.
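A sketch of the emit path using stdlib logging (the helper name and `usage` split are assumptions; field names follow the event above):

```python
# The dispatcher, not the HTTP client, knows which agent and review a call
# belongs to, so it merges that business context with the technical metrics
# before emitting one JSON line per call.
import json
import logging

logger = logging.getLogger("llm_audit")


def log_llm_call(agent: str, workflow_id: str, review_id: str, usage: dict) -> dict:
    event = {
        "event": "llm_call",
        "agent": agent,
        "workflow_id": workflow_id,
        "review_id": review_id,
        **usage,  # token counts, cost, latency, stop_reason from the client
    }
    logger.info(json.dumps(event))
    return event


event = log_llm_call(
    "margin-analyst", "bespoke-review-abc123", "rev_01JXYZ",
    {"model": "claude-sonnet", "input_tokens": 3200, "output_tokens": 450,
     "cost_cents": 4.2, "stop_reason": "end_turn"},
)
```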
## API key management
- `ANTHROPIC_API_KEY` — stored in secrets manager (Vault / Azure Key Vault). Never in compose env files for production.
- LiteLLM master key — rotated independently; scoped to internal traffic only.
- Per-service virtual keys (LiteLLM feature) — each service gets its own key with its own budget. The BFF cannot exhaust rev-sci's budget.
## Prompt injection defence
LLM agents that process user-supplied text (Layer A intake parsing, Layer C reviewing partner justifications) are prompt injection surfaces. Mitigations:
- User-supplied text is always in the `user` message, never interpolated into the `system` prompt.
- Tool results are returned as `tool_result` content blocks, not concatenated into message text.
- Agents that process untrusted text have no write-effect tools. Layer A has no tools at all; Layer C's write-effect tools (`approve_review`, `decline_review`) require the synthesiser to act — a specialist cannot approve directly.
## Audit trail
Every LLM-influenced decision captures:

- Full prompt (system + messages + tool schemas)
- All tool calls and results
- Final response
- Token counts and cost
- Workflow ID + review ID for correlation
Stored in `llm_audit_log` (ClickHouse, reading parquet on NFS / ADLS) for long-term retention, not in the transactional database. The audit parquet is a Dagster asset — materialised from the LiteLLM Postgres spend log, partitioned by date. Retention: 2 years minimum for compliance.
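A sketch of the date-partitioned layout and the retention cutoff; the root prefix and file name are assumptions, only "partitioned by date" and "2 years minimum" come from this design:

```python
# Hive-style date partitions make ClickHouse/Dagster partition pruning cheap,
# and retention becomes a per-partition date comparison.
from datetime import date, timedelta

AUDIT_ROOT = "llm_audit_log"  # assumption: root prefix on NFS / ADLS


def partition_path(day: date) -> str:
    return f"{AUDIT_ROOT}/date={day.isoformat()}/calls.parquet"


def is_expired(day: date, today: date, retention_days: int = 2 * 365) -> bool:
    # Partitions older than the 2-year compliance floor are candidates for pruning.
    return (today - day) > timedelta(days=retention_days)


print(partition_path(date(2025, 1, 15)))
# llm_audit_log/date=2025-01-15/calls.parquet
```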
## Open questions
- Monthly spend budget and approval authority. Who approves?
- Prompt caching validation. Prototype Phase 1 and measure actual cache hit rates before projecting Phase 3 costs?
- Concordance threshold. What % agreement graduates shadow → autonomous? Who sets it?
- Prompt governance. Who can edit prompt templates? PR review process?
- Rate limiting LLM calls. Per-day or per-month cost cap? Does the cap interact with the narrow-band threshold?