The Revenue Agents API enforces per-token rate limits in two pools — one for reads, one for writes. Every response includes headers describing the current state of the pool that handled the request, so you can adapt to limits without guessing. Rate limits are separate from credits. Credits cap monthly agent usage; rate limits cap requests per minute.

Pools

| Pool  | Methods                  | Limit                 |
|-------|--------------------------|-----------------------|
| Read  | GET, HEAD                | 600 requests / minute |
| Write | POST, PUT, PATCH, DELETE | 60 requests / minute  |

Each request consumes one point from the matching pool. Pools refill on a 60-second rolling window and are scoped per API token. Multiple tokens in the same workspace each get their own budget. Polling an async message job is a GET and counts against the read pool. The POST that created the job counts against the write pool.
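
A client can mirror this accounting locally to avoid sending requests that are likely to be rejected. The sketch below is an illustration only: `pool_for_method` and `LocalPoolBudget` are hypothetical names, and the sliding-window bookkeeping is an assumption about how a 60-second rolling window behaves, not a description of the server's implementation. The response headers remain the authoritative source.

```python
from collections import deque

# Limits and method mapping taken from the pools table.
POOL_LIMITS = {"read": 600, "write": 60}
POOL_BY_METHOD = {
    "GET": "read", "HEAD": "read",
    "POST": "write", "PUT": "write", "PATCH": "write", "DELETE": "write",
}


def pool_for_method(method: str) -> str:
    """Return the pool an HTTP method draws from ("read" or "write")."""
    return POOL_BY_METHOD[method.upper()]


class LocalPoolBudget:
    """Client-side estimate of one pool's budget for one API token."""

    def __init__(self, limit: int, window_seconds: float = 60.0) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._sent = deque()  # timestamps of requests still inside the window

    def try_consume(self, now: float) -> bool:
        """Record a request at time `now`; return False if the pool looks full."""
        # Drop requests that have aged out of the rolling window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) >= self.limit:
            return False
        self._sent.append(now)
        return True
```

Keep one `LocalPoolBudget` per token and per pool, for example `LocalPoolBudget(POOL_LIMITS["write"])`, and pass `time.monotonic()` as `now`.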

Response headers

Every authenticated response — both 2xx and 429 — includes:

| Header                | Meaning                                              |
|-----------------------|------------------------------------------------------|
| X-RateLimit-Pool      | Pool that served this request: read or write.        |
| X-RateLimit-Limit     | Maximum requests in the current window for this pool. |
| X-RateLimit-Remaining | Requests remaining before the next refill.           |
| X-RateLimit-Reset     | Unix timestamp (seconds) when the bucket refills.    |

429 responses additionally include Retry-After — the number of seconds until the next request will succeed.
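
These headers can be parsed into a small typed value so the rest of a client does not handle raw strings. This is a minimal sketch: `RateLimitState` and `parse_rate_limit_headers` are hypothetical names, and it assumes the headers arrive as a plain string-to-string mapping with integer values.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RateLimitState:
    pool: str                          # "read" or "write"
    limit: int                         # X-RateLimit-Limit
    remaining: int                     # X-RateLimit-Remaining
    reset_at: int                      # X-RateLimit-Reset, Unix seconds
    retry_after: Optional[int] = None  # Retry-After, present on 429 only


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitState:
    """Build a RateLimitState from response headers (names are case-insensitive)."""
    lowered = {name.lower(): value for name, value in headers.items()}
    retry_after = lowered.get("retry-after")
    return RateLimitState(
        pool=lowered["x-ratelimit-pool"],
        limit=int(lowered["x-ratelimit-limit"]),
        remaining=int(lowered["x-ratelimit-remaining"]),
        reset_at=int(lowered["x-ratelimit-reset"]),
        retry_after=int(retry_after) if retry_after is not None else None,
    )
```

A missing `X-RateLimit-*` header raises `KeyError` here. That is a deliberate choice for the sketch, since the headers are documented as present on every authenticated response.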

Handling 429

A 429 response looks like:
HTTP/1.1 429 Too Many Requests
Retry-After: 23
X-RateLimit-Pool: write
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1747920000
Content-Type: application/json

{
  "error": {
    "code": "rate_limit.exceeded",
    "category": "rate_limited",
    "message": "Rate limit exceeded."
  }
}
Recommended client behavior:
  1. Sleep for Retry-After seconds. This is the shortest interval that guarantees the next request lands after a refill.
  2. Retry the same request. The original was rejected before doing any work — it is safe to repeat, including POST and DELETE.
  3. Back off if you hit 429 repeatedly. Multiple consecutive limit hits usually mean you are exceeding the pool’s steady-state capacity, not just bursting. Add a backoff multiplier on top of Retry-After.
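
The three steps above can be sketched as a retry wrapper. Everything named here is an assumption made for illustration: `send` is a hypothetical callable that performs one request and returns `(status, headers)`, and the defaults of five attempts and a 1.5 multiplier are arbitrary starting points, not values recommended by the API.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str]]  # (HTTP status, response headers)


def _header(headers: Mapping[str, str], name: str, default: str) -> str:
    """Case-insensitive header lookup."""
    for key, value in headers.items():
        if key.lower() == name.lower():
            return value
    return default


def send_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    multiplier: float = 1.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Retry `send` on 429, waiting Retry-After scaled by a growing multiplier."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers = send()
        # Return on any non-429 response, or when attempts are exhausted.
        if status != 429 or attempt == max_attempts - 1:
            return status, headers
        retry_after = float(_header(headers, "Retry-After", "1"))
        # First 429 waits exactly Retry-After; each consecutive 429 waits longer.
        sleep(retry_after * multiplier ** attempt)
    raise AssertionError("unreachable")
```

If every attempt is rate limited, the wrapper returns the last 429 to the caller rather than raising, so the caller decides whether to queue or drop the work.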

Staying under the limit

  • Watch X-RateLimit-Remaining. When it dips below 10% of the limit, slow down before you hit zero.
  • Cache reads. GET /me, agent state, and conversation lists rarely change second-to-second. Caching for even a few seconds cuts read traffic significantly.
  • Poll async jobs on a backoff. Start at 1 second, double up to 5 seconds. A tight poll loop burns read points for no benefit — the agent is not faster because you asked sooner.
  • Use streaming for long turns. Accept: text/event-stream on the message POST returns incremental tokens over one connection instead of N poll requests. See the changelog entry on async message jobs for the trade-offs.
  • Parallelize cautiously. Ten concurrent writers using the same token share one 60-per-minute write pool. Coordinate at the worker level, not the request level.
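
Two of the suggestions above reduce to small helpers: the polling schedule (start at 1 second, double, cap at 5 seconds) and the 10% low-water check on X-RateLimit-Remaining. The function names `poll_delays` and `should_slow_down` are hypothetical, and what a client does once the check fires (pause, queue, shed load) is left open.

```python
from typing import Iterator


def poll_delays(start: float = 1.0, cap: float = 5.0) -> Iterator[float]:
    """Yield poll intervals in seconds: start, then doubling, never above cap."""
    delay = start
    while True:
        yield delay
        delay = min(delay * 2, cap)


def should_slow_down(remaining: int, limit: int, threshold_percent: int = 10) -> bool:
    """True when remaining points fall below threshold_percent of the pool limit."""
    # Integer arithmetic avoids float rounding at the boundary.
    return remaining * 100 < limit * threshold_percent
```

With the defaults, the poll schedule is 1, 2, 4, 5, 5, ... seconds, and the slow-down check fires below 6 remaining points on the write pool and below 60 on the read pool.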

Rate limiter unavailable

In the rare event the rate limiter store itself is down, the API returns 503 with system.rate_limit_unavailable. Treat this as a transient error and retry with backoff — the limiter is fail-closed by design.
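
One way to handle this case is exponential backoff with jitter. The sketch below makes several assumptions: that the 503 body carries its code in the same `error.code` field as the 429 example, that a 0.5-second base and 30-second cap are reasonable, and that full jitter is acceptable. None of these values come from the API itself, and the function names are hypothetical.

```python
import random
from typing import Callable


def is_limiter_outage(status: int, error_code: str) -> bool:
    """True for the documented 503 + system.rate_limit_unavailable combination."""
    return status == 503 and error_code == "system.rate_limit_unavailable"


def backoff_delay(
    attempt: int,
    base: float = 0.5,
    cap: float = 30.0,
    rng: Callable[[], float] = random.random,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based), with full jitter."""
    # The ceiling doubles with each attempt until it reaches the cap.
    ceiling = min(cap, base * (2 ** attempt))
    # Picking a random point under the ceiling spreads retries from many clients.
    return rng() * ceiling
```

Jitter matters here because every client sees the outage at the same moment. Without it, they would all retry in lockstep when the limiter store recovers.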

Unauthenticated routes

GET /health, GET /openapi.json, and GET /openapi.yaml do not consume token points — they use a separate per-IP limiter intended for monitoring. Do not rely on these endpoints for application traffic.