Prompt Canary

Prompt Canary rolls a candidate prompt version out beside the current stable version, ramps traffic in deterministic steps, and auto-rolls back when the constitutional violation rate crosses a configured threshold. The pattern is inspired by Argo Rollouts' progressive delivery model.

Visit /prompts/{your-prompt-key}/canary after demo-login to drive a canary by hand.

State machine

            propose
                │
                ▼
          proposed ───── rollback ──► rolled_back
                │
                ▼
           ramping ◄──┐ ramp (10/25/50)
                │     │
                ├──── pause ────► analyzing ──── promote ──► stable
                │                       │
                └─────── rollback ──────┴────► rolled_back

Transition         Effect
propose            Create a deployment row. Requires both canary_version_id and stable_version_id of the same prompt_key.
start              proposed → ramping. Sets current_weight = 10 per PRD §9.4.
ramp(weight)       Clamped to [0, max_weight]. Stays in ramping.
pause              ramping → analyzing. Stops further sampling traffic.
promote            analyzing → stable. Atomically swaps is_production on the underlying PromptVersion rows (implicit 100%).
rollback(reason)   Any non-terminal → rolled_back. is_production on the stable version is untouched.

Terminal states (stable, rolled_back) cannot transition further. To try again after promotion or rollback, propose a fresh canary against the same prompt_key.
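The transition rules can be sketched as a small lookup table. This is an illustration only, not the service's actual code: the names TRANSITIONS, TERMINAL, next_state, and clamp_weight are hypothetical, and the states and actions are transcribed from the table above.

```python
# Allowed (state, action) -> next state pairs, transcribed from the table above.
# rollback is handled separately because it applies to every non-terminal state.
TRANSITIONS = {
    ("proposed", "start"): "ramping",
    ("ramping", "ramp"): "ramping",
    ("ramping", "pause"): "analyzing",
    ("analyzing", "promote"): "stable",
}
TERMINAL = {"stable", "rolled_back"}


def next_state(state: str, action: str) -> str:
    """Return the state reached by applying `action`, or raise ValueError."""
    if state in TERMINAL:
        raise ValueError(f"{state} is terminal and cannot transition")
    if action == "rollback":
        return "rolled_back"
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"{action} is not allowed from {state}") from None


def clamp_weight(weight: int, max_weight: int) -> int:
    """Clamp a requested ramp weight into [0, max_weight]."""
    return max(0, min(weight, max_weight))
```

Keeping rollback outside the table mirrors the "any non-terminal" rule, so a new intermediate state would become rollback-capable without an extra entry.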

How traffic-splitting works

The gateway pops audittrail_prompt_key from incoming chat completion requests and consults the active ramping deployment for that (user_id, prompt_key). A deterministic SHA-256 dice roll seeded by the request id picks canary or stable per the current weight. The result is tagged on the span as:

  • audittrail.prompt.key
  • audittrail.prompt.canary_deployment_id
  • audittrail.prompt.canary_variant_id
  • audittrail.prompt.canary_variant: "canary" or "stable"
  • audittrail.prompt.weight_applied

In v2.9.5 these tags are observability-only: the dice roll records which variant would have served the request, but the template string returned is whatever the caller supplied. Full traffic-splitting via SDK-side template selection ships in a follow-up release.
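A minimal sketch of a deterministic SHA-256 dice roll follows. The source only states that the roll is seeded by the request id; the function name pick_variant, the use of the first 8 digest bytes, and the 0-99 bucket range are assumptions made for illustration.

```python
import hashlib


def pick_variant(request_id: str, weight: int) -> str:
    """Map a request id to "canary" or "stable" for a weight in [0, 100].

    Assumption: the bucket is the first 8 bytes of the SHA-256 digest,
    read as a big-endian integer and reduced modulo 100.
    """
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # value in 0..99
    return "canary" if bucket < weight else "stable"
```

Because the bucket depends only on the request id, the same request always lands on the same variant at a given weight, and raising the weight only moves requests from stable to canary, never the other way.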

Auto-rollback

A worker runs on the same online_eval_loop that drives drift detection and evaluates every active ramping deployment once per pass. For each deployment, it counts the ConstitutionalEvaluation rows attached to canary-variant spans within the last cooldown_minutes minutes. When both of the following hold:

  • the sample size is ≥ canary_min_sample_size (default 200) and
  • the violation rate (amber + red) exceeds judge_threshold

the worker transitions the deployment to rolled_back, sets rollback_reason to the observed rate, and emits canary.rolled_back on the live WebSocket plus the standard webhook fan-out.
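The two conditions can be sketched as a pure function. This is a hedged illustration: should_roll_back and its parameter names are hypothetical, the sample size is assumed to be the total number of evaluations in the window, and judge_threshold is assumed to be a fraction between 0 and 1.

```python
from typing import Tuple


def should_roll_back(
    total: int,
    amber: int,
    red: int,
    min_sample_size: int,
    judge_threshold: float,
) -> Tuple[bool, float]:
    """Return (roll_back, violation_rate) for one canary window.

    Assumptions: `total` counts every evaluation in the window, and the
    violation rate is (amber + red) / total.
    """
    rate = (amber + red) / total if total else 0.0
    # Both gates must pass: enough samples, and a rate strictly above the threshold.
    roll_back = total >= min_sample_size and rate > judge_threshold
    return roll_back, rate
```

Returning the rate alongside the decision lets the caller record it as the rollback_reason without recomputing it.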

Anti-thrash

A hard cap of 3 auto-rollbacks per (user_id, prompt_key) per 24h prevents runaway flapping. When the cap is hit the worker emits canary.rollback_capped and leaves the deployment in ramping so a human can investigate.
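One way to express the cap is a sliding-window count over past auto-rollback timestamps. The sketch below is an assumption about the mechanics, not the worker's real code: rollback_cap_reached is a hypothetical name, and the source does not say whether the 24h window is sliding or calendar-based.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable


def rollback_cap_reached(
    rollback_times: Iterable[datetime],
    now: datetime,
    cap: int = 3,
    window: timedelta = timedelta(hours=24),
) -> bool:
    """True when `cap` or more auto-rollbacks fall inside the trailing window.

    `rollback_times` is assumed to hold the auto-rollback timestamps for a
    single (user_id, prompt_key) pair.
    """
    recent = [t for t in rollback_times if timedelta(0) <= now - t < window]
    return len(recent) >= cap


# Example: two rollbacks inside the window and one older than 24h.
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
history = [now - timedelta(hours=h) for h in (1, 5, 30)]
capped = rollback_cap_reached(history, now)  # False: only two are recent
```

When the function returns True, the worker described above would emit canary.rollback_capped instead of rolling back.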

API

All routes are user-scoped — cross-tenant access returns 404 (not 403) to avoid leaking membership.

Method   Path                                            Purpose
GET      /api/v1/prompts/canary/active                   All active deployments for the caller.
GET      /api/v1/prompts/{prompt_key}/canary/active      Active deployment for one key, or 404.
POST     /api/v1/prompts/{prompt_key}/canary/propose     Create. Rejects 409 if another is active.
POST     /api/v1/prompts/{prompt_key}/canary/start       proposed → ramping.
POST     /api/v1/prompts/{prompt_key}/canary/ramp        Body {weight}.
POST     /api/v1/prompts/{prompt_key}/canary/pause       ramping → analyzing.
POST     /api/v1/prompts/{prompt_key}/canary/promote     analyzing → stable.
POST     /api/v1/prompts/{prompt_key}/canary/rollback    Body {reason}.

Mutating routes share a rate limit of 10 requests per minute per user.
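As a rough illustration of calling the mutating routes, the sketch below builds (but does not send) POST requests with the standard library. BASE_URL and canary_request are hypothetical, authentication is omitted because this page does not describe it, and JSON is assumed as the body encoding.

```python
import json
from typing import Optional
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # hypothetical host, adjust for your deployment


def canary_request(prompt_key: str, action: str, body: Optional[dict] = None) -> Request:
    """Build a POST request for /api/v1/prompts/{prompt_key}/canary/{action}."""
    url = f"{BASE_URL}/api/v1/prompts/{quote(prompt_key, safe='')}/canary/{action}"
    data = json.dumps(body or {}).encode("utf-8")
    return Request(
        url,
        data=data,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Ramp a canary to 25%, then prepare a manual rollback with a reason.
ramp_req = canary_request("support-bot", "ramp", {"weight": 25})
rollback_req = canary_request("support-bot", "rollback", {"reason": "manual check failed"})
```

Passing either object to urllib.request.urlopen would send it. With the shared limit of 10 requests per minute, a script that drives several ramps in a row should space its calls out.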

WebSocket events

The live channel emits:

  • canary.ramped — on start and explicit ramp
  • canary.paused — on pause
  • canary.promoted — on promote
  • canary.rolled_back — on manual rollback or auto-rollback worker
  • canary.rollback_capped — anti-thrash cap reached (rare)

Every event payload includes the full CanaryDeploymentOut row so the dashboard can refresh without an extra fetch.
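Because each event carries the full row, a client can keep a local cache current without refetching. The sketch below assumes a message envelope that this page does not specify: the "event" and "deployment" keys and the "id" field are hypothetical, as is the apply_event helper.

```python
import json

# Event names taken from the list above.
CANARY_EVENTS = {
    "canary.ramped",
    "canary.paused",
    "canary.promoted",
    "canary.rolled_back",
    "canary.rollback_capped",
}


def apply_event(cache: dict, raw_message: str) -> bool:
    """Update `cache` (deployment id -> row) from one WebSocket message.

    Assumed envelope: {"event": "<name>", "deployment": {"id": ..., ...}}.
    Returns True if the message was a canary event and the cache changed.
    """
    message = json.loads(raw_message)
    if message.get("event") not in CANARY_EVENTS:
        return False
    row = message["deployment"]
    cache[row["id"]] = row  # replace the whole row, no extra fetch needed
    return True
```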

Settings

Tune via environment variables:

Variable                                      Default   Effect
AUDITTRAIL_CANARY_AUTO_ROLLBACK_ENABLED       true      Master switch for the worker.
AUDITTRAIL_CANARY_MIN_SAMPLE_SIZE             200       Floor before the violation-rate check fires.
AUDITTRAIL_CANARY_MAX_AUTOROLLBACKS_PER_24H   3         Anti-thrash cap.
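For reference, reading these variables with their documented defaults might look like the sketch below. CanarySettings and load_canary_settings are hypothetical names, and the accepted boolean spellings are an assumption; the real service may parse them differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class CanarySettings:
    auto_rollback_enabled: bool = True
    min_sample_size: int = 200
    max_autorollbacks_per_24h: int = 3


def load_canary_settings(env: Optional[Mapping[str, str]] = None) -> CanarySettings:
    """Read the canary settings from `env` (defaults to os.environ)."""
    env = os.environ if env is None else env
    defaults = CanarySettings()
    enabled_raw = env.get("AUDITTRAIL_CANARY_AUTO_ROLLBACK_ENABLED")
    # Assumption: "1", "true", "yes", "on" (any case) mean enabled.
    enabled = (
        defaults.auto_rollback_enabled
        if enabled_raw is None
        else enabled_raw.strip().lower() in {"1", "true", "yes", "on"}
    )
    return CanarySettings(
        auto_rollback_enabled=enabled,
        min_sample_size=int(
            env.get("AUDITTRAIL_CANARY_MIN_SAMPLE_SIZE", defaults.min_sample_size)
        ),
        max_autorollbacks_per_24h=int(
            env.get(
                "AUDITTRAIL_CANARY_MAX_AUTOROLLBACKS_PER_24H",
                defaults.max_autorollbacks_per_24h,
            )
        ),
    )
```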