Gateway Proxy
AuditTrail ships an OpenAI-compatible gateway proxy at
/api/v1/gateway/proxy/v1. Point the base URL of your existing OpenAI /
Anthropic / Azure SDK at it and get full-fidelity traces with zero
code changes. No re-instrumentation; no @traceable decorators.
Why it's useful
- Zero-code integration — any OpenAI-compatible client (SDK or cURL) routes through the proxy and is automatically traced.
- Virtual keys per user — issue scoped at-gw-… keys from /settings/gateway. The real provider key stays on the server.
- Rate / spend limits — bound per-user call rate and monthly spend without asking every SDK caller to implement quotas.
- Routing — swap providers (OpenAI ↔ Anthropic ↔ local vLLM) without touching client code.
Quickstart
# 1. Configure a provider key on the server (once, via /settings/gateway)
# AuditTrail encrypts the key and stores it under a tenant virtual key.
# 2. Point your OpenAI SDK at the proxy
export OPENAI_BASE_URL="https://auditrail.yourco.com/api/v1/gateway/proxy/v1"
export OPENAI_API_KEY="at-gw-YOUR-VIRTUAL-KEY"
# 3. Call as usual — traces flow into AuditTrail automatically
python -c "from openai import OpenAI; c = OpenAI(); \
print(c.chat.completions.create(model='gpt-4o', messages=[{'role':'user','content':'hi'}]).choices[0].message.content)"
How it works under the hood
- Client sends POST /api/v1/gateway/proxy/v1/chat/completions with an at-gw-* virtual key in the Authorization header.
- Gateway resolves the virtual key → tenant → provider key (Fernet-encrypted at rest, same storage machinery as the Assistant).
- Streams the request to the real provider endpoint.
- Observes the stream, buffers span events, writes a Trace + child Span rows tagged with the tenant's user id.
- Returns the provider's response bytes untouched.
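The first two steps (read the virtual key, resolve it to a tenant and provider key, swap credentials before forwarding) can be sketched roughly as follows. This is only an illustration: `VirtualKeyRecord`, `KEY_STORE`, and the function names are hypothetical, the store is an in-memory dict, and the Fernet decryption step is left out. The `Bearer` scheme is what the OpenAI SDK sends by default.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualKeyRecord:
    """Hypothetical row shape; the real schema is not documented here."""
    tenant_id: str
    user_id: str
    provider: str
    provider_key: str  # stored encrypted in the real service; plaintext only for this sketch


# Hypothetical in-memory stand-in for the gateway's key storage.
KEY_STORE = {
    "at-gw-demo123": VirtualKeyRecord("tenant-a", "user-1", "openai", "sk-real-provider-key"),
}


def extract_virtual_key(headers: dict) -> str:
    """Pull the at-gw-* key out of an 'Authorization: Bearer ...' header."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token.startswith("at-gw-"):
        raise PermissionError("missing or malformed virtual key")
    return token


def resolve(headers: dict) -> VirtualKeyRecord:
    """Map the virtual key to its tenant, user, and provider key."""
    record = KEY_STORE.get(extract_virtual_key(headers))
    if record is None:
        raise PermissionError("unknown or revoked virtual key")
    return record


def upstream_headers(headers: dict) -> dict:
    """Replace the virtual key with the real provider key before forwarding."""
    record = resolve(headers)
    forwarded = {k: v for k, v in headers.items() if k != "Authorization"}
    forwarded["Authorization"] = f"Bearer {record.provider_key}"
    return forwarded
```

The point of the sketch is that the client only ever holds the at-gw-* key; the provider key is attached on the server side and never travels back to the caller.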
Spans carry the standard OTel gen_ai.* attribute contract so the
exact same trace shape lands regardless of whether you use the
gateway, one of the first-party SDKs, or an external OTel exporter.
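As a rough illustration of what a gen_ai.* span might carry, the sketch below maps an OpenAI-style chat completion request and response onto a handful of attribute names from the OpenTelemetry GenAI semantic conventions. Which attributes the gateway actually emits is an assumption here, and `gen_ai_attributes` is a hypothetical helper, not part of AuditTrail.

```python
def gen_ai_attributes(request: dict, response: dict) -> dict:
    """Build a flat attribute dict from an OpenAI-style request/response pair.

    Attribute names follow the OTel GenAI semantic conventions; the selection
    of attributes is illustrative only.
    """
    usage = response.get("usage") or {}
    attrs = {
        "gen_ai.operation.name": "chat",
        "gen_ai.request.model": request.get("model"),
        "gen_ai.response.model": response.get("model"),
        "gen_ai.response.id": response.get("id"),
        # OpenAI reports prompt/completion tokens; OTel calls them input/output.
        "gen_ai.usage.input_tokens": usage.get("prompt_tokens"),
        "gen_ai.usage.output_tokens": usage.get("completion_tokens"),
    }
    # Drop anything the provider did not return.
    return {k: v for k, v in attrs.items() if v is not None}
```

Because the attribute names are the same whichever path produced the span, downstream queries can filter on, say, gen_ai.request.model without caring where the trace came from.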
Supported endpoints
Initial v1.1 coverage:
- POST /api/v1/gateway/proxy/v1/chat/completions (streaming + non-streaming)
- GET /api/v1/gateway/proxy/v1/models — proxied from the real provider
Not yet supported: embeddings, audio, images, fine-tuning — tracked on the roadmap.
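The two supported endpoints can be reached with any HTTP client, not only the OpenAI SDK. The sketch below uses only the Python standard library and builds the requests without sending them. The host and key are the placeholders from the Quickstart, and `chat_request` / `models_request` are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "https://auditrail.yourco.com/api/v1/gateway/proxy/v1"  # placeholder host from the Quickstart
VIRTUAL_KEY = "at-gw-YOUR-VIRTUAL-KEY"  # placeholder virtual key


def chat_request(messages, model="gpt-4o", stream=False):
    """Build (but do not send) a chat completions request aimed at the proxy."""
    body = json.dumps({"model": model, "messages": messages, "stream": stream})
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {VIRTUAL_KEY}",
            "Content-Type": "application/json",
        },
    )


def models_request():
    """Build (but do not send) a model listing request aimed at the proxy."""
    return urllib.request.Request(
        f"{BASE_URL}/models",
        method="GET",
        headers={"Authorization": f"Bearer {VIRTUAL_KEY}"},
    )
```

Passing either request to `urllib.request.urlopen` would send it; with `stream=True` the response body would arrive incrementally rather than as a single JSON document.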
Virtual key management
| Method | Path | Purpose |
|---|---|---|
| GET | /api/v1/gateway/keys | List virtual keys (last4 only) |
| POST | /api/v1/gateway/keys | Create a new at-gw-* key with optional rate / spend caps |
| DELETE | /api/v1/gateway/keys/{id} | Revoke immediately |
Each key has its own rate limit and monthly spend cap so you can hand one to a noisy consumer without risking the bill.
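A minimal sketch of calling the management endpoints from a script might look like the following. The paths come from the table above; everything else is an assumption made for illustration: the JSON field names (`rate_limit_per_min`, `monthly_spend_cap_usd`), the bearer-token auth on the management API, and the helper names. Check the actual request schema before relying on any of it.

```python
import json
import urllib.request

API_ROOT = "https://auditrail.yourco.com/api/v1/gateway"  # placeholder host from the Quickstart


def _auth(session_token):
    # Assumption: the management API accepts a bearer token.
    return {"Authorization": f"Bearer {session_token}"}


def create_key_request(session_token, rate_limit_per_min=60, monthly_spend_cap_usd=None):
    """Build a POST /keys request. Body field names are hypothetical."""
    payload = {"rate_limit_per_min": rate_limit_per_min}
    if monthly_spend_cap_usd is not None:
        payload["monthly_spend_cap_usd"] = monthly_spend_cap_usd
    headers = {**_auth(session_token), "Content-Type": "application/json"}
    return urllib.request.Request(
        f"{API_ROOT}/keys",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers=headers,
    )


def revoke_key_request(session_token, key_id):
    """Build a DELETE /keys/{id} request."""
    return urllib.request.Request(
        f"{API_ROOT}/keys/{key_id}",
        method="DELETE",
        headers=_auth(session_token),
    )
```

The requests are only constructed here; sending them (for example with `urllib.request.urlopen`) is left to the caller.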
Tenant isolation
Every trace and span is tagged with the virtual key's owning user. Cross-tenant data access is impossible by construction: the proxy never looks up rows in another user's scope.
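One way to picture "by construction" is a storage handle that is bound to a single user when it is created and exposes no method that accepts a different user id. The sketch below is a toy in-memory version of that idea; `TraceRow` and `ScopedTraceStore` are hypothetical names, not the gateway's actual classes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TraceRow:
    trace_id: str
    owner_user_id: str
    model: str


class ScopedTraceStore:
    """View over shared rows that can only see and write one user's data."""

    def __init__(self, rows: List[TraceRow], owner_user_id: str):
        self._rows = rows
        self._owner = owner_user_id

    def add(self, trace_id: str, model: str) -> TraceRow:
        # The owner is taken from the store, never from the caller.
        row = TraceRow(trace_id, self._owner, model)
        self._rows.append(row)
        return row

    def list_traces(self) -> List[TraceRow]:
        return [r for r in self._rows if r.owner_user_id == self._owner]

    def get(self, trace_id: str) -> Optional[TraceRow]:
        for row in self.list_traces():
            if row.trace_id == trace_id:
                return row
        return None
```

Since every read goes through `list_traces`, a handle created for one user returns `None` for another user's trace id even though both share the same underlying rows.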
Rate limits
Requests through the proxy are subject to the provider's own rate limits plus the virtual key's configured cap (default: 60 calls/min, configurable at creation). Exceeding either returns HTTP 429.
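The per-key cap can be pictured as a small limiter keyed by virtual key. The sketch below uses a sliding window, which is an assumption: the source states only the default of 60 per minute and the 429 response, not the algorithm. `SlidingWindowLimiter` is a hypothetical name, and the injectable clock exists only to make the example testable.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_calls per window_seconds for each virtual key."""

    def __init__(self, max_calls=60, window_seconds=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.window = window_seconds
        self.clock = clock
        self._calls = {}  # virtual key -> deque of accepted call timestamps

    def check(self, virtual_key: str) -> int:
        """Return 200 if the call may proceed, 429 if the key is over its cap."""
        now = self.clock()
        calls = self._calls.setdefault(virtual_key, deque())
        # Forget calls that have aged out of the window.
        while calls and now - calls[0] >= self.window:
            calls.popleft()
        if len(calls) >= self.max_calls:
            return 429
        calls.append(now)
        return 200
```

On the client side, a 429 from the proxy can come from either limit, so callers should back off and retry the same way they would against the provider directly.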