Gateway Proxy

AuditTrail ships an OpenAI-compatible gateway proxy at /api/v1/gateway/proxy/v1. Drop your existing OpenAI / Anthropic / Azure SDK base URL onto it and get full-fidelity traces with zero code changes. No re-instrumentation; no @traceable decorators.

Why it's useful

  • Zero-code integration — any OpenAI-compatible client (SDK or cURL) routes through the proxy and is automatically traced.
  • Virtual keys per user — issue scoped at-gw-… keys from /settings/gateway. The real provider key stays on the server.
  • Rate / spend limits — bound per-user call rate and monthly spend without asking every SDK caller to implement quotas.
  • Routing — swap providers (OpenAI ↔ Anthropic ↔ local vLLM) without touching client code.

Quickstart

bash
# 1. Configure a provider key on the server (once, via /settings/gateway)
#    AuditTrail encrypts + stores under a tenant virtual key.
 
# 2. Point your OpenAI SDK at the proxy
export OPENAI_BASE_URL="https://auditrail.yourco.com/api/v1/gateway/proxy/v1"
export OPENAI_API_KEY="at-gw-YOUR-VIRTUAL-KEY"
 
# 3. Call as usual — traces flow into AuditTrail automatically
python -c "from openai import OpenAI; c = OpenAI(); \
  print(c.chat.completions.create(model='gpt-4o', messages=[{'role':'user','content':'hi'}]).choices[0].message.content)"

How it works under the hood

  1. Client sends POST /api/v1/gateway/proxy/v1/chat/completions with an at-gw-* virtual key in the Authorization header.
  2. Gateway resolves the virtual key → tenant → provider key (Fernet- encrypted at rest, same storage machinery as the Assistant).
  3. Streams the request to the real provider endpoint.
  4. Observes the stream, buffers span events, writes a Trace + child Span rows tagged with the tenant's user id.
  5. Returns the provider's response bytes untouched.

Spans carry the standard OTel gen_ai.* attribute contract so the exact same trace shape lands regardless of whether you use the gateway, one of the first-party SDKs, or an external OTel exporter.

Supported endpoints

Initial v1.1 coverage:

  • POST /api/v1/gateway/proxy/v1/chat/completions (streaming + non-streaming)
  • GET /api/v1/gateway/proxy/v1/models — proxied from the real provider

Not yet supported: embeddings, audio, images, fine-tuning — tracked on the roadmap.

Virtual key management

MethodPathPurpose
GET/api/v1/gateway/keysList virtual keys (last4 only)
POST/api/v1/gateway/keysCreate a new at-gw-* key with optional rate / spend caps
DELETE/api/v1/gateway/keys/{id}Revoke immediately

Each key has its own rate limit and monthly spend cap so you can hand one to a noisy consumer without risking the bill.

Tenant isolation

Every trace and span is tagged with the virtual key's owning user. Cross-tenant data is impossible by construction — the proxy never looks up rows on another user's scope.

Rate limits

The proxy inherits the downstream provider's rate limits plus the virtual key's own configured cap (default: 60/min, configurable at creation). Exceeding either returns 429.