Core resources

LLM Proxies

An LLM Proxy is a managed gateway in front of a single LLM provider (OpenAI, Anthropic, Google, or Mistral). Provision proxies via this API, and your app traffic flows through AIronClaw with rules, budgets, logs, and audit applied centrally.

The LLM Proxy object#

A proxy ties together a provider, an inbound auth method, an (optional) provider API key, an allow-list of models, an optional budget and per-key permissions. The backing record is stored in Redis and the routing/policy enforcement happens in the gateway + the aifw Lua plugin.

Fields

id
string
UUID. Stable across the proxy lifetime; used in host routing and permission tags.
name
string
Human-readable label. Free-form.
provider
enum
One of openai, anthropic, google, mistral.
upstreamUrl
string (read-only)
Derived from provider server-side. Returned for visibility but not user-settable.
allowedModels
string[]
Allow-list of model identifiers. Empty array = all provider models allowed.
defaultModel
string?
Model used when the caller does not specify one.
logConversations
boolean
If true, full prompt + completion are AES-256-GCM encrypted to Redis with a 7-day TTL.
auth
object
Inbound auth: { "mode": "aifw_api_key" } or { "mode": "jwt", "jwksJson": "..." }.
budget
object?
Optional spend cap. { "period": "monthly", "capUsd": 200, "hardBlock": false }. Period is one of fixed, daily, weekly, monthly.
proxyHost
string (read-only)
The host your application sends LLM traffic to. Returned after proxy wiring completes.
createdAt
number
Unix timestamp (seconds).
What we never return

providerKey ciphertext, internal IDs, and any other server-side machinery are stripped from every response by the toSafeLlmProxy filter. You can never read a provider key back through the API after writing it.
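
For orientation, a returned proxy object looks roughly like this (the id, upstreamUrl and proxyHost values are placeholders; the authoritative shape is the field list above):

{
  "id": "00000000-0000-4000-8000-000000000000",
  "name": "production-openai",
  "provider": "openai",
  "upstreamUrl": "https://api.openai.com",
  "allowedModels": ["gpt-4o", "gpt-4o-mini"],
  "defaultModel": "gpt-4o-mini",
  "logConversations": true,
  "auth": { "mode": "aifw_api_key" },
  "budget": { "period": "monthly", "capUsd": 200, "hardBlock": false },
  "proxyHost": "<assigned-proxy-host>",
  "createdAt": 1767225600
}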

Manage proxies#

List proxies#

GET/api/llm

Returns every LLM proxy owned by the caller, in creation order.

curl https://app.aironclaw.com/api/llm \
  -H "Authorization: Bearer $AIFW_PAT"

Create a proxy#

POST/api/llm

Creates the Redis record and immediately wires the matching gateway Service + Routes + aifw plugin. The proxy goes live on its proxyHost the moment the gateway returns. Until you attach an inbound credential (API key or JWT), every request gets a 401.

Body

name*
string
Human-readable label.
provider*
enum
One of openai, anthropic, google, mistral.
providerKey
string
Plaintext provider API key. Encrypted before persistence; never returned again.
allowedModels
string[]
Allow-list of model identifiers. Empty / omitted = all models allowed.
defaultModel
string
Used when the caller does not pass one.
logConversations
boolean
Enable encrypted conversation logging (7-day TTL). Default false.
auth
object
Inbound auth config. Defaults to { "mode": "aifw_api_key" }.
budget
object
Spend cap. See Budgets.
curl -X POST https://app.aironclaw.com/api/llm \
  -H "Authorization: Bearer $AIFW_PAT" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "production-openai",
    "provider": "openai",
    "providerKey": "sk-...",
    "allowedModels": ["gpt-4o", "gpt-4o-mini"],
    "defaultModel": "gpt-4o-mini",
    "logConversations": true,
    "budget": { "period": "monthly", "capUsd": 200, "hardBlock": false }
  }'
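
A variant using JWT inbound auth instead of the default API-key mode (the jwksJson value is a stand-in for your real JWKS document):

curl -X POST https://app.aironclaw.com/api/llm \
  -H "Authorization: Bearer $AIFW_PAT" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "staging-anthropic",
    "provider": "anthropic",
    "providerKey": "sk-ant-...",
    "auth": { "mode": "jwt", "jwksJson": "{\"keys\":[...]}" }
  }'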

Retrieve a proxy#

GET/api/llm/:id

Fetches a single proxy by its UUID. Includes the current upstream DNS pin (when the gateway has resolved one) so you can verify routing.

curl https://app.aironclaw.com/api/llm/$ID \
  -H "Authorization: Bearer $AIFW_PAT"

Update a proxy#

PATCH/api/llm/:id

Partial update. Send only the fields you want to change. Setting providerKey to null or empty string removes the stored key. Changing provider swaps the upstream URL automatically and updates the gateway service.

curl -X PATCH https://app.aironclaw.com/api/llm/$ID \
  -H "Authorization: Bearer $AIFW_PAT" \
  -H "Content-Type: application/json" \
  -d '{ "allowedModels": ["gpt-4o"], "logConversations": false }'

Delete a proxy#

DELETE/api/llm/:id

Tears down the gateway resources, removes the Redis record, and strips any llm:<id>:* permission tags from your API keys. Idempotent: returns 404 if the proxy does not exist.

curl -X DELETE https://app.aironclaw.com/api/llm/$ID \
  -H "Authorization: Bearer $AIFW_PAT"

Re-resolve upstream IP#

POST/api/llm/:id/re-resolve

AIronClaw pins the upstream provider IP at first contact for SSRF protection. Call this endpoint after the provider rotates DNS, or whenever you see upstream unreachable errors, to refresh the pin.

curl -X POST https://app.aironclaw.com/api/llm/$ID/re-resolve \
  -H "Authorization: Bearer $AIFW_PAT"

Rules#

Rules attach inline policy to the proxy: rate limits, IP ACLs, prompt-replace (DLP), model routing, prompt guards and Lua lambdas (phase="access" only). The full ruleset is replaced atomically on every PUT; there is no per-rule add/remove endpoint.

List rules#

GET/api/llm/:id/rules
curl https://app.aironclaw.com/api/llm/$ID/rules \
  -H "Authorization: Bearer $AIFW_PAT"

Replace rules#

PUT/api/llm/:id/rules

Replaces the full rule set in one shot. Allowed rule_type values on LLM proxies: ip_acl, rate_limit, prompt_replace, model_route, prompt_guard, lambda (with phase="access"). Every rule must include a tools array — use ["*"] to apply globally.

curl -X PUT https://app.aironclaw.com/api/llm/$ID/rules \
  -H "Authorization: Bearer $AIFW_PAT" \
  -H "Content-Type: application/json" \
  -d '{
    "rules": [
      {
        "rule_type": "ip_acl",
        "tools": ["*"],
        "action": "allow",
        "cidrs": ["10.0.0.0/8", "203.0.113.0/24"]
      },
      {
        "rule_type": "model_route",
        "tools": ["*"],
        "pattern": "^gpt-3\\.5",
        "target_model": "gpt-4o-mini"
      }
    ]
  }'

Prompt Guard#

The prompt_guard rule type runs detection on the inbound prompt before it reaches the LLM (phase="request"), or on the model's completion before it leaves the gateway (phase="response" — alert-only on LLM proxies). Three layered detection modes are available: regex (fast deterministic detectors curated on public taxonomies), judge (an LLM classifier with semantic understanding), and both (regex first as a cheap pre-filter, judge on misses for defense-in-depth).

Modes

regex
deterministic
Runs the detectors listed in detectors against the input. Sub-millisecond, no external calls. Patterns are curated on OWASP LLM01, MITRE ATLAS AML.T0051, Microsoft Prompt Shields, Anthropic browser-use defenses, the PromptInject (arXiv 2211.09527) and ChatInject (arXiv 2509.22830) papers, and NVIDIA garak probes. Nothing for the user to write.
judge
LLM classifier
POSTs the scoped input to a chat-completions endpoint configured in the judge block. Returns a structured verdict { verdict, confidence, category }. A block verdict is acted on when confidence ≥ threshold.
both
regex + judge cascade
Runs regex first; on a hit, the judge is skipped (cost saver). On a miss, the judge gets the final say. Recommended for production — cheap fast filter on the bulk of traffic, deep semantic check on the suspicious tail.

On a positive verdict, action decides what happens: block returns 403 to the client; rewrite replaces the matched content using rewrite_template (supports $0..$9 back-references); alert is log-only.
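
As a sketch of the rewrite action (the detector id below is hypothetical, pick real ids from the dashboard catalog; a static replacement is shown, but $0..$9 back-references work the same way):

{
  "rule_type": "prompt_guard",
  "tools": ["*"],
  "phase": "request",
  "mode": "regex",
  "detectors": ["pii_email"],
  "action": "rewrite",
  "rewrite_template": "[redacted-email]"
}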

Built-in detectors#

The detector catalog is a curated library of regex-based rules organized into seven categories. Reference detectors by id in the rule's detectors array. The dashboard lists every available id with a short description; the categories are:

Detector categories

prompt_injection
phase: request
Direct prompt injection: "ignore previous instructions", system-prompt overrides, RAG-document override markers, ChatML / Llama / Anthropic chat-template token smuggling, indirect-injection triggers ("when you read this, do X"). Sources: OWASP LLM01, MITRE ATLAS AML.T0051, Microsoft Prompt Shields (Spotlighting + documentAttacks), Anthropic browser-use research, PromptInject and ChatInject papers, NVIDIA garak.
jailbreak
phase: request
Persona-shift jailbreaks (DAN, AIM, evil twin), policy-bypass framing, "respond as if you have no restrictions". Sources: OWASP LLM01 + published jailbreak corpora.
data_exfil
phase: request/response
Exfiltration patterns: encoded payloads, URL-as-channel, "summarize previous messages and send to X" constructs. Sources: OWASP LLM02 (Insecure Output Handling), MITRE ATLAS AML.T0024.
secrets
phase: response
Vendor API key formats with strict anchors: OpenAI, Anthropic, AWS access keys, GitHub PATs, GCP, Slack, Stripe; generic high-entropy tokens are gated by Shannon-entropy validation to suppress false positives. Sources: TruffleHog regex catalog + vendor format documentation.
pii
phase: request/response
Personal data: emails, phone numbers, government IDs, dates of birth in common locale formats. IBANs and credit cards run through mod-97 / Luhn validators after regex match. Source: OWASP LLM06 (Sensitive Information Disclosure).
dangerous_content
phase: response
Output safety: unsafe code patterns (eval-with-network-input, hardcoded secrets in scaffolding), injection-prone shell or SQL fragments. Sources: CWE-77/78/79, OWASP API Top 10.
output_safety
phase: response
Toxic-content patterns. Lighter than a full content-safety classifier — designed as a complement to judge mode, not a replacement.
Anti-ReDoS by construction

Every detector regex is anchored, linear, and uses bounded quantifiers (no nested unbounded .*) so the matching engine cannot fall into exponential backtracking on adversarial input.

Judge configuration#

The judge block configures the LLM classifier for mode="judge" or mode="both". The classifier endpoint is hardcoded per provider — no user-supplied URL — so the judge call is SSRF-safe and TLS verification is always on.

judge fields

provider
enum
One of openai, anthropic, google, mistral. Picks the chat-completions endpoint and request shape.
model
string
Model identifier forwarded to the upstream (e.g. gpt-4o-mini, claude-haiku-4-5, gemini-2.0-flash, mistral-small-latest).
api_key_enc
object
Envelope-encrypted API key (AES-256-GCM). The dashboard handles encryption — pass the plaintext key in the dashboard form, never persist plaintext anywhere downstream.
categories
string[]
Built-in classifier presets, one or more of prompt_injection_semantic, jailbreak_intent, toxicity, bias, confabulation, off_topic. Each adds a labelled bullet to the system prompt.
custom_criteria
string?
Free-form criterion appended to the system prompt. Anti-injection wrapping (input is HTML-escaped and tagged) is enforced regardless.
scope
enum
user (default — direct prompt injection on the user turn), user+system, or all (entire conversation, needed for indirect / RAG-poisoning detection — more expensive on long contexts).
threshold
number 0–1
Confidence floor at which a 'block' verdict is acted on. Default 0.7.
on_error
enum
pass (default, fail-open on infra glitches) or block (fail-closed). Pick the side you'd rather err on.
timeout_ms
integer 100–15000
Hard latency cap on a single judge call. Default 800ms — enough for warm cross-region calls, bounded for the request hot path.
cache_ttl_s
integer 0–3600
Verdict cache TTL. Cache key includes a hash of categories + custom_criteria, so editing the rule invalidates lazily without manual purge.
max_input_chars
integer 500–32000
Cap on input size sent to the judge. Longer inputs are truncated per truncate_strategy.
truncate_strategy
enum
head or head_tail (default) — head_tail keeps both initial and appended-injection windows without doubling token cost.
Multimodal coverage via the judge

Native OCR / PDF-attachment scanning is on the roadmap. In the meantime, when scope="all" the judge receives exactly the same payload the upstream model would receive — including image URLs and base64 attachments. Pick a multimodal model as the judge (e.g. gpt-4o, claude-haiku-4-5, gemini-2.0-flash) and you get multimodal injection detection at zero extra integration cost: the judge sees what the protected model would see.

Cross-call abuse detection

prompt_guard evaluates a single request at a time. To defend against cross-call abuse — the same identity issuing many individually-legitimate requests that cumulatively constitute an attack — pair it with a rate_limit rule scoped on match_key="consumer" or "api_key" with ban_after_n_exceeded + ban_timespan set. The combination of per-request semantic detection and per-identity threshold-with-ban handles the bulk of the abuse model. True trajectory-based detection — sequence-level analysis of a series of calls — is on the roadmap.
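
A minimal sketch of such a companion rate_limit rule. The match_key, ban_after_n_exceeded and ban_timespan fields are the ones named above; the limit and window_s names, and all numeric values, are illustrative assumptions rather than the exact schema:

{
  "rule_type": "rate_limit",
  "tools": ["*"],
  "match_key": "api_key",
  "limit": 100,
  "window_s": 60,
  "ban_after_n_exceeded": 5,
  "ban_timespan": 3600
}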

Examples#

{
  "rule_type": "prompt_guard",
  "tools": ["*"],
  "phase": "request",
  "mode": "regex",
  "detectors": ["pi_ignore_previous", "pi_chatml_smuggle", "jb_dan", "jb_aim"],
  "action": "block"
}
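
And a sketch of a both-mode rule with its judge block. The detector ids reuse the example above, the api_key_enc value is a placeholder for the envelope the dashboard produces, and threshold, on_error, timeout_ms and truncate_strategy mirror the documented defaults; cache_ttl_s and max_input_chars are arbitrary in-range picks:

{
  "rule_type": "prompt_guard",
  "tools": ["*"],
  "phase": "request",
  "mode": "both",
  "detectors": ["pi_ignore_previous", "jb_dan"],
  "action": "block",
  "judge": {
    "provider": "openai",
    "model": "gpt-4o-mini",
    "api_key_enc": { "...": "..." },
    "categories": ["prompt_injection_semantic", "jailbreak_intent"],
    "scope": "all",
    "threshold": 0.7,
    "on_error": "pass",
    "timeout_ms": 800,
    "cache_ttl_s": 300,
    "max_input_chars": 8000,
    "truncate_strategy": "head_tail"
  }
}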

Budgets#

Two budget layers exist per proxy: a proxy-level cap (set on the proxy object via budget) and a per-(key, proxy) cap for fine-grained per-tenant control. Both share the same period semantics: fixed (no rollover), daily, weekly or monthly. Set hardBlock: true to refuse requests once the cap is hit; otherwise the budget is informational and only flags an alert.

Reset proxy window#

POST/api/llm/:id/budget/reset

Zeros the current-window spend counter for the proxy. Daily history and monthly totals are preserved — only the enforcement counter is cleared. Returns 204 No Content.

curl -X POST https://app.aironclaw.com/api/llm/$ID/budget/reset \
  -H "Authorization: Bearer $AIFW_PAT"

List proxy keys#

GET/api/llm/:id/keys

Lists the API keys that have been granted access to this proxy (their credential carries an llm:<id>:* tag), with their per-key budget and current-window spend. Keys are shown masked (aifw_p…_xyz4).

curl https://app.aironclaw.com/api/llm/$ID/keys \
  -H "Authorization: Bearer $AIFW_PAT"

Per-key budget (get / set / delete)#

PUT/api/llm/:id/keys/:credId/budget

Read or upsert the budget for a specific (proxy, key) pair. The same path supports GET (read), PUT (upsert) and DELETE (remove).

PUT body

period*
enum
One of fixed, daily, weekly, monthly.
capUsd*
number
Cap in US dollars. Non-negative.
hardBlock
boolean
If true, the firewall returns 402/429 once the cap is reached. Default false.
curl -X PUT https://app.aironclaw.com/api/llm/$ID/keys/$CRED_ID/budget \
  -H "Authorization: Bearer $AIFW_PAT" \
  -H "Content-Type: application/json" \
  -d '{ "period": "monthly", "capUsd": 50, "hardBlock": true }'

Reset key window#

POST/api/llm/:id/keys/:credId/budget/reset

Zeros the current-window spend counter for the (key, proxy) pair. Returns 204 No Content. Returns 400 if no budget is configured for that pair.

curl -X POST https://app.aironclaw.com/api/llm/$ID/keys/$CRED_ID/budget/reset \
  -H "Authorization: Bearer $AIFW_PAT"

Usage#

Three flavors of usage data are exposed: a monthly summary with per-key breakdown, a daily series with per-model split, and a per-key daily series. Costs are computed at request time using the AIronClaw pricing table (the response includes the pricingVersion so you can detect price-list changes).

Monthly summary + per-key (current month)#

GET/api/llm/:id/usage

Query

months
number
Trailing months to return (1–36, default 12).
curl "https://app.aironclaw.com/api/llm/$ID/usage?months=6" \
  -H "Authorization: Bearer $AIFW_PAT"

Daily history + budget window#

GET/api/llm/:id/usage/daily

Query

days
number
Trailing days to return (1–30, default 14).

Returns daily totals with per-model breakdown plus a snapshot of the running budget window: spentCents, tag (e.g. 2026-04 for monthly), and the rollsOverAt timestamp.

curl "https://app.aironclaw.com/api/llm/$ID/usage/daily?days=7" \
  -H "Authorization: Bearer $AIFW_PAT"

Per-key daily history + budget window#

GET/api/llm/:id/keys/:credId/usage/daily
Scope of the per-key daily series

Per-key daily counters are not scoped per proxy — a single key's daily hash covers every proxy it has touched, so the top-line numbers reflect the key's entire activity. The per-model split lets you tell which proxies' models contributed.

curl "https://app.aironclaw.com/api/llm/$ID/keys/$CRED_ID/usage/daily?days=14" \
  -H "Authorization: Bearer $AIFW_PAT"

Delete history#

DELETE/api/llm/:id/usage/history

Removes daily usage hashes. Exactly one of the two query parameters must be provided. Budget-window enforcement counters are not touched — call budget/reset separately for that.

Query (one of)

all
boolean
If true, removes every daily hash in the 30-day window.
before
YYYYMMDD
Removes entries on or before this day (inclusive).

The same shape exists scoped to a single key: DELETE /api/llm/:id/keys/:credId/usage/history.

curl -X DELETE "https://app.aironclaw.com/api/llm/$ID/usage/history?before=20260301" \
  -H "Authorization: Bearer $AIFW_PAT"

Logs#

When logConversations is enabled on a proxy, every request and response is encrypted with AES-256-GCM and written to Redis with a 7-day TTL. The list endpoint returns metadata only; full plaintext is fetched on demand from the detail endpoint.

List logs (metadata only)#

GET/api/llm/:id/logs

Query

from
epoch seconds
Lower bound (inclusive).
to
epoch seconds
Upper bound (inclusive).
limit
number
1–200 (default 50). Newest first.
curl "https://app.aironclaw.com/api/llm/$ID/logs?limit=50" \
  -H "Authorization: Bearer $AIFW_PAT"

Log detail (decrypts plaintext)#

GET/api/llm/:id/logs/:logId

Decrypts the conversation log on the server with the AIronClaw master key and returns the plaintext request/response. The response is never cached (Cache-Control: no-store).

curl https://app.aironclaw.com/api/llm/$ID/logs/$LOG_ID \
  -H "Authorization: Bearer $AIFW_PAT"

Purge logs#

DELETE/api/llm/:id/logs

Removes every log blob, metadata hash and the index for this proxy. Returns the number of removed entries.

curl -X DELETE https://app.aironclaw.com/api/llm/$ID/logs \
  -H "Authorization: Bearer $AIFW_PAT"