Errors and Limits
Error format
All errors are returned inside an error object with the following fields:
- `message`
- `type`
- `code`
- `request_id`
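As an illustration, the four fields can be pulled out of a response body with a small helper. This is a minimal sketch under stated assumptions: the error object is assumed to sit under a top-level `error` key in a JSON body, and `ApiError` and `parse_error` are hypothetical names that are not part of the API.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ApiError:
    """Hypothetical container for the four documented error fields."""
    message: str
    type: str
    code: str
    request_id: str


def parse_error(body: str) -> Optional[ApiError]:
    """Return an ApiError if the body carries an error object, else None.

    Assumption: the error object lives under a top-level "error" key.
    """
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return None
    err = data.get("error") if isinstance(data, dict) else None
    if not isinstance(err, dict):
        return None
    # Missing fields become empty strings so logging never fails on them.
    return ApiError(
        message=str(err.get("message", "")),
        type=str(err.get("type", "")),
        code=str(err.get("code", "")),
        request_id=str(err.get("request_id", "")),
    )
```

Keeping `request_id` on the parsed object makes it easy to include in every log line and support ticket.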
Important HTTP status codes
| Code | Typical meaning |
|---|---|
| 400 | invalid request |
| 401 | missing or invalid bearer token |
| 403 | access denied |
| 404 | resource or identifier not found |
| 410 | resource is no longer available |
| 413 | payload too large |
| 429 | rate or concurrency limit exceeded |
| 502 | upstream error |
| 503 | service unavailable |
| 504 | upstream timeout |
Common validation errors
Typical examples include:
- `model` does not use the `group:<uuid>` format
- `messages` is empty
- no `user` message is present
- `temperature` is outside the allowed range
- `approval_id` is missing during a resume request
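These checks can be run on the client before a request is sent. The sketch below is illustrative only: `validate_payload` is a hypothetical helper, each message is assumed to carry a `role` field, the UUID is assumed to be in the canonical 8-4-4-4-12 hex form, and the allowed `temperature` range is passed in by the caller because this page does not state it.

```python
import re
from typing import Any

# Assumption: the <uuid> part is a canonical hyphenated hex UUID.
GROUP_MODEL = re.compile(
    r"^group:[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
    r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
)


def validate_payload(
    payload: dict[str, Any],
    temp_range: tuple[float, float],
    resuming: bool = False,
) -> list[str]:
    """Return a list of problems; an empty list means no problem was found."""
    problems: list[str] = []

    model = payload.get("model")
    if not isinstance(model, str) or not GROUP_MODEL.match(model):
        problems.append("model does not use the group:<uuid> format")

    messages = payload.get("messages")
    if not isinstance(messages, list) or not messages:
        problems.append("messages is empty")
    elif not any(isinstance(m, dict) and m.get("role") == "user" for m in messages):
        problems.append("no user message is present")

    temperature = payload.get("temperature")
    if temperature is not None:
        low, high = temp_range  # caller-supplied; not documented on this page
        if not isinstance(temperature, (int, float)) or not low <= temperature <= high:
            problems.append("temperature is outside the allowed range")

    # Assumption: the caller knows when it is building a resume request.
    if resuming and not payload.get("approval_id"):
        problems.append("approval_id is missing during a resume request")

    return problems
```

Returning all problems at once, rather than raising on the first, lets the caller report everything in a single pass.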
Rate and concurrency limits
The API can reject requests with 429 when:
- too many calls are sent during a given time window
- too many simultaneous requests are open
A client should handle these cases with:
- controlled retries
- backoff
- client-side throttling
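The retry and backoff points above could be sketched as follows. This is one possible approach, not the API's prescribed policy: `call_with_backoff` is a hypothetical helper, `send` stands for any function that performs the request and returns a `(status, body)` pair, and the delay values are arbitrary defaults. The backoff is exponential with random jitter so that many clients do not retry in lockstep.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    send: Callable[[], tuple[int, T]],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[int, T]:
    """Call send() and retry on HTTP 429 with exponential backoff and jitter.

    The last response is returned as-is once max_attempts is reached.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 0
    while True:
        status, body = send()
        attempt += 1
        if status != 429 or attempt >= max_attempts:
            return status, body
        # The cap doubles after each failed attempt, bounded by max_delay.
        cap = min(max_delay, base_delay * 2 ** (attempt - 1))
        # Full jitter: wait a random time between 0 and the current cap.
        sleep(random.uniform(0, cap))
```

The `sleep` parameter exists only so the wait can be replaced in tests. Client-side throttling, for example a semaphore limiting open requests, would sit in front of this helper.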
SSE error handling
In streaming mode, an error can happen after the stream has already started. The client should therefore:
- inspect incoming events continuously
- detect an `error` payload
- log `request_id`
- close the stream cleanly
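The steps above can be sketched with a small line-based SSE reader. The event shape is an assumption: each event's `data:` field is taken to be JSON, and a mid-stream error is taken to appear as an object with a top-level `error` key that carries `request_id`. `StreamError`, `iter_events`, and `consume` are hypothetical names.

```python
import json
import logging
from typing import Any, Callable, Iterable, Iterator

log = logging.getLogger("sse_client")


class StreamError(Exception):
    """Raised when an error payload arrives in the middle of a stream."""

    def __init__(self, error: dict):
        super().__init__(error.get("message", "stream error"))
        self.error = error


def iter_events(lines: Iterable[str]) -> Iterator[Any]:
    """Yield decoded JSON events; raise StreamError on an error payload."""
    data_parts: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[5:]
            # SSE strips a single leading space after the colon.
            data_parts.append(value[1:] if value.startswith(" ") else value)
            continue
        if line != "" or not data_parts:
            continue  # skip comments, other fields, and empty keep-alives
        payload, data_parts = "\n".join(data_parts), []
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue  # non-JSON data is ignored in this sketch
        if isinstance(event, dict) and isinstance(event.get("error"), dict):
            log.error("stream failed, request_id=%s", event["error"].get("request_id"))
            raise StreamError(event["error"])
        yield event


def consume(stream: Any, handle: Callable[[Any], None]) -> None:
    """Process every event and always close the stream, even on error."""
    try:
        for event in iter_events(stream):
            handle(event)
    finally:
        stream.close()
```

Here `stream` is any object that iterates over text lines and has a `close()` method, such as a decoded HTTP response body.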
Good practices
- Always log `request_id`.
- Validate payloads before sending them.
- Avoid opening too many parallel streams with the same token.
- Treat `tool_calls` as a dedicated state, not as a regular text response.