Generate responses from a large language model based on a conversation.
Authorization
Every REST API request must include your AccessToken in the Authorization header, along with the Content-Type header. Use the following format for authorization:
--header 'Authorization: Bearer <your_token_here>'
--header 'Content-Type: application/json'
Note: Replace your_token_here with your actual AccessToken. It contains information that allows the server to verify your identity and permissions.
You can create your API key here.
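The two headers above can be assembled once and reused for every request. A minimal Python sketch (the ACCESS_TOKEN environment variable name is an assumption; use whatever mechanism you prefer to store the token):

```python
import os

def build_headers(access_token: str) -> dict:
    """Assemble the two headers required for every REST API request."""
    return {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json",
    }

# Read the token from an environment variable rather than hard-coding it.
headers = build_headers(os.environ.get("ACCESS_TOKEN", "your_token_here"))
```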
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Identifier of the model to use for generation. Must match one of the models served by the backend. |
| max_tokens | integer | No | The maximum number of tokens to generate in the response. (minimum: 1) |
| messages | array | Yes | An array of message objects that make up the conversation history. Each message object must have a role (e.g., "user", "assistant") and content. |
| metadata | object | No | An object describing metadata about the request. |
| stop_sequences | array | No | One or more sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence. |
| stream | boolean | No | Whether to stream the response back as it is generated. (default: false) |
| system | string, object | No | Optional system-level instruction that guides the model's behavior. Can be provided as a plain string or a structured object. |
| temperature | number, null | No | Controls randomness in generation. Higher values (e.g., 0.8) produce more diverse outputs, while lower values (e.g., 0.2) make outputs more deterministic. Required range: 0 <= x <= 2 |
| tool_choice | object | No | Specifies how the model should select tools. Can enforce a specific tool call or allow the model to decide automatically. |
| tools | array | No | Definitions of available tools/functions that the model may call. Each tool includes a name, a description, and a JSON Schema for its parameters. |
| top_k | integer, null | No | The number of highest-probability tokens to keep for top-k sampling when generating a response, helping to control the randomness of the output. Required range: 0 <= x <= 100 |
| top_p | number, null | No | The nucleus sampling probability to use when generating a response, helping balance randomness and coherence. Required range: 0 <= x <= 1 |
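The sampling parameters above carry documented ranges, so it can be convenient to catch out-of-range values before sending a request. A hypothetical client-side helper (the server performs its own validation regardless):

```python
def validate_sampling(temperature=None, top_k=None, top_p=None):
    """Check optional sampling parameters against the documented ranges.

    Raises ValueError if any supplied value falls outside its range.
    """
    if temperature is not None and not (0 <= temperature <= 2):
        raise ValueError("temperature must be in [0, 2]")
    if top_k is not None and not (0 <= top_k <= 100):
        raise ValueError("top_k must be in [0, 100]")
    if top_p is not None and not (0 <= top_p <= 1):
        raise ValueError("top_p must be in [0, 1]")
```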
Message Configuration
| Field | Type | Required | Description |
|---|---|---|---|
| role | string | Yes | The role of the message sender (e.g., "user", "assistant"). |
| content | string | Yes | The content of the message. |
Tool
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Name of the tool. |
| description | string | No | Description of the tool (strongly recommended). |
| input_schema | object | Yes | JSON Schema for the tool input shape that the model will produce in tool_use output content blocks. |
Example for tools
[
{
"name": "get_weather",
"description": "Get the current weather information for a given city.",
"input_schema": {
"type": "object",
"properties": {
"city": {
"type": "string",
"description": "The name of the city, e.g. Singapore."
},
"unit": {
"type": "string",
"enum": ["celsius", "fahrenheit"],
"description": "The temperature unit to use."
},
"include_forecast": {
"type": "boolean",
"description": "Whether to include a short-term weather forecast."
}
},
"required": ["city"]
}
}
]
Tool Choice Configuration
| Field | Type | Required | Description |
|---|---|---|---|
| type | string | Yes | How the model should use the provided tools: "auto" (model decides), "any" (use any available tool), "tool" (use a specific tool), or "none" (do not use tools). |
| name | string | No | The name of the tool to use. Required only when type is "tool". |
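Example for tool_choice: to force the model to call a particular tool (here the get_weather tool defined above), set type to "tool" and supply its name:

{
  "tool_choice": {
    "type": "tool",
    "name": "get_weather"
  }
}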
Example Request
{
"model": "minimax/minimax-m2.5",
"max_tokens": 128,
"messages": [
{
"role": "user",
"content": "What is the weather in Singapore today?"
}
],
"metadata": {
"request_id": "test-123"
},
"stop_sequences": ["\n\n"],
"system": "You are a helpful assistant that can use tools when needed.",
"stream": false,
"temperature": 0.7,
"top_k": 50,
"top_p": 0.9,
"tools": [
{
"name": "get_weather",
"description": "Get the current weather information for a given city.",
"input_schema": {
"type": "object",
"properties": {
"city": {
"type": "string",
"description": "The name of the city, e.g. Singapore."
},
"unit": {
"type": "string",
"enum": ["celsius", "fahrenheit"]
}
},
"required": ["city"]
}
}
],
"tool_choice": {
"type": "auto"
}
}
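The request body above can be sent with any HTTP client. A minimal Python sketch using the standard library; note that API_URL is a placeholder assumption, so substitute your deployment's actual messages endpoint:

```python
import json
import urllib.request

# Hypothetical endpoint URL -- replace with your deployment's messages endpoint.
API_URL = "https://api.example.com/v1/messages"

def create_message(payload: dict, access_token: str, url: str = API_URL) -> dict:
    """POST a request body (as documented above) and return the parsed JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# A minimal request body using only the required fields plus max_tokens.
payload = {
    "model": "minimax/minimax-m2.5",
    "max_tokens": 128,
    "messages": [
        {"role": "user", "content": "What is the weather in Singapore today?"}
    ],
}
```

Calling create_message(payload, access_token) would then return the parsed success response described below.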
Response
Success Response
| Field | Type | Description |
|---|---|---|
| id | string | A unique identifier for the message response. Typically prefixed with msg_ and generated by the server. |
| type | string | The object type. Always "message" for Anthropic Messages API responses. |
| role | string | The role of the message author. Always "assistant" for model-generated responses. |
| model | string | The name of the model used to generate the response. |
| content | array | An array of content blocks generated by the model (e.g., text blocks or tool_use blocks). |
| stop_reason | string | The reason why the model stopped generating (e.g., "end_turn", "max_tokens", "stop_sequence", "tool_use"). |
| usage | object | An object containing information about token usage for the request and response, including the number of input tokens and the number of output tokens. |
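An illustrative success response for the example request above might look like the following (all field values are made up for illustration):

{
  "id": "msg_01AbCdEfGh",
  "type": "message",
  "role": "assistant",
  "model": "minimax/minimax-m2.5",
  "content": [
    {
      "type": "text",
      "text": "It is currently 31°C and partly cloudy in Singapore."
    }
  ],
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 42,
    "output_tokens": 18
  }
}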