Operations Assistant

The Operations Assistant is AuditTrail's BYOK chatbot grounded on your fleet. It lives at /assistant in the dashboard and runs entirely on your configured LLM provider key — we never proxy your data through our servers.

What it does

Answers ops questions ("what's the error rate on research-agent?")
Emits generative UI tiles inline: trace links, SHAP attribution bars, fleet status, rule violations, anomaly alerts, compliance score, deployment action proposals, pause-control tiles. See Gen-UI tile reference below.
Proposes Deployment actions on request. Asking "throttle the sales-assistant agent" creates a real DeploymentAction row (mode: supervised, never autonomous even for Tier 1 — the assistant is not a root user). The tile shows up inline with Approve / Reject buttons that talk to the real /deployments/actions endpoints.

BYOK providers

Supported at /settings?tab=ai-assistant:

OpenAI — any model exposed by api.openai.com/v1
Anthropic — any Claude model on api.anthropic.com
Gateway — route through AuditTrail's own gateway proxy (/api/v1/gateway/proxy/v1), useful when you already centralise provider credentials

Keys are Fernet-encrypted at rest with HKDF-SHA256 derived keys scoped per provider domain. Plaintext is never written to disk.

Gen-UI tiles

The assistant's SSE stream interleaves structured tile frames with text deltas. Nine tile kinds are emitted today:

Kind	When it fires	What it shows
`trace_link`	"show last trace", "recent trace"	Card linking to the most recent trace + quick stats
`span_list`	"show spans", "list the steps"	Top 8 spans of the most recent trace
`cost_chart`	"cost", "spend", "how much"	24-hour cost histogram + delta vs prior window
`deployment_action`	"throttle", "kill run", "rollback" etc.	Real proposal with Approve/Reject buttons
`rule_violation`	"violation", "amber", "constitutional"	Most recent amber/red evaluation
`fleet_status`	"fleet", "how are my agents"	15-min rolling counts + p95 + violation rate
`anomaly_alert`	"alert", "anomaly", "paged"	Latest `AlertEvent` from the 24h window
`compliance_score`	"compliance", "EU AI Act", "SOC 2"	30-day compliance pass rate + framework badges
`agent_template`	"spawn an agent", "deep search", template keywords	Agent-template picker — render a runnable snippet or dispatch to a connected local runner

Tiles pull real data from the caller's own DB rows (user-scoped). Two further kinds (model_switch, pause_control) are implemented in the dashboard renderer but have no backend trigger yet — they are planned surfaces, not live ones. A tenth view, the live run_output stdout tile, mounts client-side when an agent_template tile dispatches to a connected local runner (see Local runner).

F4 — Assistant-proposed DeploymentActions

When the user's message contains an action keyword (throttle, kill, rollback, swap model, disable flag, scale down, tag trace, send alert), the backend:

Creates a DeploymentAction row with mode="supervised" (never autonomous — safety invariant; the assistant can't auto-execute even for Tier 1)
Uses the user's message as the audit reason (truncated to 2000 chars)
Links the most recent trace as target_ref (operator can re-point from the queue)
Emits a deployment_action tile with the real action id

The operator can then Approve or Reject directly from the chat, or open /deployments for the full queue view.

Chat history (V2.9.6)

Phase 6a moved chat history off localStorage onto a server-backed session list:

Sidebar at /assistant shows newest-first sessions. Each session carries a server-generated title (NLG one-shot when an AUDITTRAIL_NLG_* key is set, else a truncation of the first user message capped at 60 chars).
?session=… deep-link — the active session id rides in the URL so a chat is shareable to teammates with the same access scope. They see only their own sessions (Hard Rule 8 — user-scoped); the URL is inert across tenants.
Rename / archive / delete — inline rename input in the row, dropdown to archive/unarchive, typed-confirm dialog (tier 2) for the irreversible delete. Delete cascades all child messages.
Persistence boundary — the assistant SSE endpoint accepts an optional session_id in its request body. When supplied the backend appends the user message before streaming and persists the full assistant reply (with tile metadata) after the stream completes.
Migration — pre-V2.9.6 localStorage history is imported once per browser on first mount of /assistant, into a fresh session, then the legacy key is cleared.

REST surface (all user-scoped, all under /api/v1/chat/sessions):

Method	Path	What
`GET`	`/`	List (filters: `archived`, cursor)
`POST`	`/`	Create empty session (title null)
`GET`	`/{id}`	Detail + paginated messages
`PATCH`	`/{id}`	Rename or archive
`DELETE`	`/{id}`	Hard delete (cascade)
`POST`	`/{id}/messages`	Append (idempotent on consecutive `(role, content)`)

Mutating routes share a 20/minute rate limit. Cross-tenant access returns 404 (not 403) so existence isn't leaked.

Privacy

The Assistant's SSE request body is sent directly to your configured provider. AuditTrail does not proxy it, does not log it, and does not train on it. Tiles are generated server-side from your own DB and return structured JSON — no LLM inference for the tile payload itself.

Rate limits

/assistant/chat — unlimited (bounded by your provider)
/xai/* — 20/min (covers the NL-explanation card's judge calls)
/deployments/actions/* — 30/min per client for state transitions

When it doesn't work

"No provider key configured" (HTTP 412) — add one at /settings?tab=ai-assistant. The UI shows an explicit CTA — we never fall back to a mock reply.
Assistant emits no tiles — no keyword matched. Type a more specific ops question (see the table above).
Approve button does nothing — the underlying DeploymentAction row isn't in the proposed state (maybe already approved / rejected / executed). Refresh the page.