Conventions

Base format

Chat endpoints consume and return JSON. In streaming mode, the response is sent as text/event-stream.

The model field

The model field is required and must follow this format:

group:<uuid>

Example:

group:123e4567-e89b-12d3-a456-426614174000
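A client may want to check the model value before sending a request. The following is a minimal sketch: the helper name `parse_model_field` is hypothetical, and it assumes the UUID must be in canonical 8-4-4-4-12 hexadecimal form, which the API may or may not enforce.

```python
import re
import uuid

# Assumption: the UUID part uses the canonical 8-4-4-4-12 hexadecimal layout.
_MODEL_RE = re.compile(
    r"group:([0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
    r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12})"
)


def parse_model_field(model: str) -> uuid.UUID:
    """Return the UUID from a 'group:<uuid>' string, or raise ValueError."""
    match = _MODEL_RE.fullmatch(model)
    if match is None:
        raise ValueError(f"model must look like 'group:<uuid>', got {model!r}")
    return uuid.UUID(match.group(1))
```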

Messages

The request expects a messages array.

Accepted roles:

  • system
  • user
  • assistant
  • tool

Important rules:

  • the array must contain at least one user message
  • the last user message is treated as the current prompt
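The rules above can be sketched as a client-side check. The function name `current_prompt_index` is hypothetical, and the message shape (a dict with a `role` key) is an assumption based on the roles listed in this section.

```python
ACCEPTED_ROLES = {"system", "user", "assistant", "tool"}


def current_prompt_index(messages: list[dict]) -> int:
    """Validate roles and return the index of the last user message."""
    last_user = -1
    for i, message in enumerate(messages):
        role = message.get("role")
        if role not in ACCEPTED_ROLES:
            raise ValueError(f"messages[{i}] has unsupported role {role!r}")
        if role == "user":
            last_user = i  # keep overwriting so the final value is the last one
    if last_user < 0:
        raise ValueError("messages must contain at least one user message")
    return last_user
```

The returned index points at the message that the API treats as the current prompt, even when assistant or tool messages follow it.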

Content formats

The content field of a message can be:

  • a string
  • null
  • an array of content parts with properties such as type and text
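Code that reads messages has to handle all three shapes. Below is a minimal sketch of a normalizer. The name `content_to_text` is hypothetical, and it assumes text parts look like `{"type": "text", "text": "..."}`; parts of any other type are skipped.

```python
from typing import Any, Optional


def content_to_text(content: Any) -> Optional[str]:
    """Flatten a message's content field to plain text (None stays None)."""
    if content is None:
        return None
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        texts = []
        for part in content:
            # Assumption: only parts with type == "text" carry a text property.
            if isinstance(part, dict) and part.get("type") == "text":
                texts.append(str(part.get("text", "")))
        return "".join(texts)
    raise TypeError(f"unsupported content type: {type(content).__name__}")
```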

Streaming

If stream is true, the API returns the response as Server-Sent Events (SSE).

The flow generally looks like this:

  1. one initial chunk opens the assistant response
  2. additional chunks append text in choices[0].delta.content
  3. one final chunk closes the response
  4. the stream ends with data: [DONE]
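The flow above can be consumed with a small line parser. This is a sketch under stated assumptions: the function name `iter_deltas` is hypothetical, each event is assumed to arrive as a single `data:` line holding a JSON chunk, and only `choices[0].delta.content` is taken from this section. Any other chunk fields are not relied on.

```python
import json
from typing import Iterable, Iterator


def iter_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from SSE lines until 'data: [DONE]' is seen."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # end-of-stream sentinel
        chunk = json.loads(payload)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        delta = choices[0].get("delta") or {}
        text = delta.get("content")
        if text:  # opening and closing chunks may carry no content
            yield text
```

Joining the yielded fragments, for example with `"".join(iter_deltas(lines))`, rebuilds the full assistant text.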

metadata

The metadata field is optional. It is especially useful for:

  • passing a request_id
  • indicating a mode through metadata.mode

Accepted metadata.mode values:

  • auto
  • normal
  • orch
  • synth
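As an illustration, a request builder could assemble the metadata object and reject unknown modes early. The helper name `build_metadata` is hypothetical, and placing the identifier under the key `request_id` inside metadata is an assumption based on the list above.

```python
from typing import Optional

ACCEPTED_MODES = ("auto", "normal", "orch", "synth")


def build_metadata(request_id: Optional[str] = None,
                   mode: Optional[str] = None) -> dict:
    """Build the optional metadata object, validating mode if given."""
    metadata = {}
    if request_id is not None:
        metadata["request_id"] = request_id  # assumed key name
    if mode is not None:
        if mode not in ACCEPTED_MODES:
            raise ValueError(
                f"metadata.mode must be one of {ACCEPTED_MODES}, got {mode!r}"
            )
        metadata["mode"] = mode
    return metadata
```

Because the field is optional, an empty result can simply be left out of the request body.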

Useful input limits

  • messages: between 1 and 500 items
  • temperature: between 0 and 2
  • max_tokens: between 1 and 200000
  • max_completion_tokens: between 1 and 200000

If a request would otherwise carry both max_tokens and max_completion_tokens, prefer max_completion_tokens: it is the clearer field for controlling output length.
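The limits listed in this section can be checked before a request is sent. The sketch below uses a hypothetical helper, `check_limits`, and assumes that the bounds are inclusive and that temperature and the two token fields are optional.

```python
def check_limits(payload: dict) -> list[str]:
    """Return a list of limit violations; an empty list means none found."""
    errors = []

    messages = payload.get("messages")
    if not isinstance(messages, list) or not 1 <= len(messages) <= 500:
        errors.append("messages must contain between 1 and 500 items")

    temperature = payload.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        errors.append("temperature must be between 0 and 2")

    # Both token fields share the same range.
    for field in ("max_tokens", "max_completion_tokens"):
        value = payload.get(field)
        if value is not None and not 1 <= value <= 200000:
            errors.append(f"{field} must be between 1 and 200000")

    return errors
```

Returning all violations at once, rather than raising on the first, lets a caller report every problem in a single pass.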