Don't just trace what happened.
Understand why.

Open-source explainable observability for AI agent systems. Causal attribution + constitutional governance + visual debugging.

57%

↑

of orgs have AI agents in production

45 → 5 min

↓

debugging time reduction

<5 ms

→

middleware overhead

Aug 2026

↑

EU AI Act enforcement deadline

Integrated with the frameworks you already use

LangChainLangGraphOpenAIAnthropicOpenTelemetryLangfuseAutoGenCrewAIDSPyLlamaIndexHaystackSemantic KernelLangChainLangGraphOpenAIAnthropicOpenTelemetryLangfuseAutoGenCrewAIDSPyLlamaIndexHaystackSemantic Kernel

OpenTelemetryLangfuseAutoGenCrewAIDSPyLlamaIndexHaystackSemantic KernelLangChainLangGraphOpenAIAnthropicOpenTelemetryLangfuseAutoGenCrewAIDSPyLlamaIndexHaystackSemantic KernelLangChainLangGraphOpenAIAnthropic

DSPyLlamaIndexHaystackSemantic KernelLangChainLangGraphOpenAIAnthropicOpenTelemetryLangfuseAutoGenCrewAIDSPyLlamaIndexHaystackSemantic KernelLangChainLangGraphOpenAIAnthropicOpenTelemetryLangfuseAutoGenCrewAI

The gap · Two lenses · One platform

“What happened” is easy. “Why” is the bar.

Same trace data, two perspectives. Hover either lens to focus it; the components below are the real /traces and /sankey views, not screenshots.

WHAT

Traces tell you what the agent did.

Ordered chain of LLM + tool calls with timing, cost, and tokens. The diary of the run — the same story every observability tool is happy to tell.

WHY

AuditTrail explains why it did it.

Causal attribution over the prompt. Every decision has a trail back to the words that drove it, the rule that caught it, and the edit that would have flipped it.

Side-by-side

Debugging, before and after.

Same failure mode. Same agent. Drag to compare the investigation loop without AuditTrail (left) vs. with it (right).

[ERROR] 2026-04-19T14:22:08Z customer-support.agent step=synth

[INFO ] parent_run_id=a3f2… span_id=b1… tokens_in=1284

[INFO ] child_run a3f2…b2 type=tool name=search

[INFO ] child_run a3f2…b3 type=tool name=calculator

[WARN ] confidence=0.42 below_threshold=0.6 skip_chain

[INFO ] child_run a3f2…b4 type=llm name=synth model=gpt-4o

[INFO ] retries: 0 1 2 3 …

Exception: confidence_gate_failed — agent bailed after 3 retries

[DEBUG] full trace 6.2 MB — grep if you can

3 terminals open · you're diffing JSON by eye

without AuditTrail

Trace b1c8… · customer-support · sankeywarning · 1 rule breach

Prompt PhrasesReasoningTool Calls

Hover a phrase or tool to highlight its causal flow

with AuditTrail · 5 min

Drag the slider to compare. Mobile: swipe horizontally.

Compare platforms.

Feature-by-feature against the OSS + SaaS LLM observability pack. Uncertain cells are shown as partial; every claim is cross-checked against published docs. Hover any cell icon for footnotes.

AuditTrail 0/20

LangSmith 0/20

Langfuse 0/20

OPIK 0/20

Helicone 0/20

Feature	AuditTrail	LangSmith	Langfuse	OPIK	Helicone
Observability4 features· pinned
Trace capture
Interactive DAG viewer
Sankey flow attribution
Real-time streaming
Explainability (XAI)4 features· pinned
Causal attribution (SHAP + ablation)
Counterfactual explanations
Mechanistic XAI (SAE features)
Natural language explanations
Governance4 features· pinned
Constitutional rule engine
EU AI Act compliance mode
REGO / OPA policy engine
PDF audit export
Operations4 features· pinned
Live Fleet dashboard
Operations Assistant chatbot
3-tier deployment actions
OpenAI-compatible gateway
Infrastructure4 features· pinned
Self-hostable
Open-source license
6-language first-party SDK
OTel OTLP ingest

supported partial TBD — research pending not supported· hover a group to peek · click to pin · hover any row or footnote icon for details

The full tour.

Every surface the product ships — observe, explain, operate, govern, integrate. Screenshots captured against the live dashboard.

See every agent, every span, live.

auditrail.imaginaerium.in/overview

01/04 Dashboard Overview

Everything you need to understand your agents —
observe, explain, and operate.

Causal attribution, constitutional governance, live fleet ops, and a BYOK assistant chatbot — all on one canvas.

Every agent, live

Watch your fleet tick in real time — per-agent error rate, p95 latency, violation rate, and one-click jump to the most recent trace.

Chat with your fleet

BYOK chatbot grounded on your traces. Bring your own OpenAI or Anthropic key — your data stays put.

Right-click to operate

Throttle nodes, swap models, file incidents — 3-tier safety baked in. Every action audited.

EU AI Act ready

Article 12/13/50 logs, 180-day retention floor, PDF regulator export, incident reporting.

openai

Drop-in proxy

Point any OpenAI-compatible client at our gateway. Zero SDK changes. Full traces.

Paged in 60s

Cost, latency, violations, errors, traffic — fan out to Slack, email, or PagerDuty.

Mechanistic XAI · v2.0+

See the concepts the model was actually thinking about.

When your agent runs on a supported open-source model we attach a sparse autoencoder trained by SAELens and surface the top-activated features per span. Behavioural features, safety features, chain-of-thought cues — the interpretable units modern mech-interp research has learned to find.

Works with Llama 3.x, Gemma 2/3, Mistral Small.
API models (GPT, Claude) fall back to ablation + counterfactuals.
Feature labels come from Neuronpedia dictionaries when available.

SAE · layer 15 · top-7

Preview

f_15_4109 · refusal / policy-adjacent0.92
Active near SYSTEM block discussing allowed topics.
f_15_2774 · user wants calculation0.78
Fires on the literal numeric tokens in the user turn.
f_15_9210 · recency / time-bounded query0.63
Fires on phrases like "today" and "latest".
f_15_1044 · chain-of-thought cue0.55
Rises inside `<thinking>` style wrappers.
f_15_7702 · JSON output scaffolding0.47
Precedes tool-argument emission.
f_15_3388 · cite / attribute source0.34
Fires after retrieved-context block.
f_15_5821 · instructions override0.22
Partial — the model is weighing a user-prompt nudge.

Pricing

Run it where it lives best.

Apache 2.0 self-host has every feature we ship. Cloud saves your ops team the DB + retention + scaling work. Enterprise layers SSO, SCIM, and private-deploy ceremony on top.

Self-host

Your infra, your rules. Apache 2.0.

Free

Forever. No seat limit, no feature gates.

Docker Compose or Helm install
Unlimited traces + spans, ad-hoc retention
All evaluators, governor rules, Sankey/DAG views
Community support via GitHub Discussions

View on GitHub

Most teams start here

Cloud

Hosted by us. You bring the agents.

$49

per org / month · billed annually · 100k traces / mo

Everything in self-host
Managed Postgres + retention up to 180 days
Gateway proxy + BYOK provider key pool
Email + Slack support, 1-business-day SLO

Start 14-day trial

Enterprise

SSO, SCIM, SOC 2 docs, private deploy.

Talk to us

Custom terms · multi-year available

Everything in Cloud
SAML 2.0 SSO + SCIM v2 provisioning
Private VPC deploy, region pinning, DPA on file
Dedicated CSM, 1-hour SLO, security review package

Contact sales

Full feature matrix + annual-prepay discount on /pricing.

Ship agents that explain themselves.Self-host or cloud. Your call.

agent.py

import audittrail

# Initialize — one line, zero config
audittrail.init(frameworks=["langgraph"])

@audittrail.traceable
async def run_agent(prompt: str):
    result = await graph.ainvoke({"input": prompt})
    return result

# Full traces, DAG, Sankey — automatic

Terminal

$ docker compose up -d

$ open http://localhost:3000

# Dashboard ready. Start tracing.

1. Install the SDK

pip install audittrail — works with LangGraph, LangChain, AutoGen, OpenAI Agents SDK, and the raw OpenAI / Anthropic SDKs.

2. Launch the dashboard

One Docker command. Full observability UI on localhost:3000. No cloud, no config, no call-home.

install-get-started.sh

Instrument

# 2. Instrument — one decorator, zero config
from audittrail import traceable

@traceable(name="research-agent")
async def run(query: str) -> str:
    ...

Instrument. Spans flow automatically: LLM calls, tool invocations, outputs, token counts, costs.

python-quickstart.py

$ pip install audittrail

from audittrail import traceable

@traceable(name="research-agent")
async def deep_search(query: str) -> str:
    plan = await llm.complete(f"Plan for: {query}")
    docs = await search_tool(plan)
    return await llm.complete(f"Synthesize: {docs}")

One contract. Every language.

Three lines of code. Full explainability. No configuration.

Drop the SDK in, point traces at AuditTrail, and start observing — and operating — your fleet in minutes. Helm chart ships with the repo. Docker Compose for dev. Cloud tier coming soon. Either way — same data model, same SDKs, same dashboard.