Tool Calling
Tool calling, also known as function calling, lets a model decide when to use external tools and return structured arguments for those tools. This is useful when the model needs to fetch live data, query internal systems, perform calculations, or take actions outside the model itself.
Tool calling is supported through the OpenAI-compatible /chat/completions API by passing tool definitions in the tools field.
Why Use Tool Calling
Tool calling helps when the model needs capabilities it does not natively have, such as:
- Retrieving real-time information
- Calling internal APIs or databases
- Running business logic or workflows
- Performing precise calculations
- Triggering actions in external systems
Instead of asking the model to guess, you let it choose a tool and provide structured arguments that your application executes safely.
How It Works
A typical tool-calling flow has four steps:
- Define one or more tools with names, descriptions, and JSON Schema parameters.
- Send a chat completion request with tools.
- If the model decides to use a tool, it returns tool_calls instead of a final answer.
- Execute the tool in your application, append the tool result to messages, and send another request to get the final response.
Define Tools
Each tool must be defined as a function with:
- name: A short, stable function name
- description: A clear explanation of when the tool should be used
- parameters: A JSON Schema object describing the allowed arguments
Example:
[
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get the current weather for a city.",
"parameters": {
"type": "object",
"properties": {
"city": {
"type": "string",
"description": "The city to look up, such as San Francisco."
},
"unit": {
"type": "string",
"enum": ["celsius", "fahrenheit"],
"description": "The temperature unit to return."
}
},
"required": ["city"]
}
}
}
]
Good tool definitions are important. The model relies on your descriptions and schema to choose the right tool and generate valid arguments.
Basic Example
The following request gives the model access to one tool:
curl 'https://api.hpc-ai.com/inference/v1/chat/completions' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer <your_token_here>' \
--data '{
"model": "minimax/minimax-m2.5",
"messages": [
{
"role": "user",
"content": "What is the weather in Singapore right now?"
}
],
"tools": [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get the current weather for a city.",
"parameters": {
"type": "object",
"properties": {
"city": {
"type": "string",
"description": "The city to look up."
}
},
"required": ["city"]
}
}
}
],
"tool_choice": "auto",
"temperature": 0.1
}'
If the model decides a tool is needed, the response will usually include tool_calls in the assistant message:
{
"choices": [
{
"message": {
"role": "assistant",
"content": null,
"tool_calls": [
{
"id": "call_123",
"type": "function",
"function": {
"name": "get_weather",
"arguments": "{\"city\":\"Singapore\"}"
}
}
]
},
"finish_reason": "tool_calls"
}
]
}
Complete Workflow
The full tool-calling loop looks like this:
import json
from openai import OpenAI
client = OpenAI(
api_key="YOUR_API_KEY",
base_url="https://api.hpc-ai.com/inference/v1",
)
tools = [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get the current weather for a city.",
"parameters": {
"type": "object",
"properties": {
"city": {
"type": "string",
"description": "The city to look up."
},
"unit": {
"type": "string",
"enum": ["celsius", "fahrenheit"],
"description": "Temperature unit."
}
},
"required": ["city"]
}
}
}
]
def get_weather(city: str, unit: str = "celsius"):
return {
"city": city,
"temperature": 29,
"condition": "Partly cloudy",
"unit": unit
}
messages = [
{"role": "user", "content": "What's the weather in Singapore?"}
]
response = client.chat.completions.create(
model="minimax/minimax-m2.5",
messages=messages,
tools=tools,
tool_choice="auto",
temperature=0.1,
)
assistant_message = response.choices[0].message
if assistant_message.tool_calls:
tool_call = assistant_message.tool_calls[0]
tool_args = json.loads(tool_call.function.arguments)
tool_result = get_weather(**tool_args)
messages.append(assistant_message)
messages.append(
{
"role": "tool",
"tool_call_id": tool_call.id,
"content": json.dumps(tool_result),
}
)
final_response = client.chat.completions.create(
model="minimax/minimax-m2.5",
messages=messages,
tools=tools,
temperature=0.1,
)
print(final_response.choices[0].message.content)
Important: execute tools through an explicit function map in your application. Do not use eval on model-generated arguments or function names.
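One way to implement this is a registry that maps tool names to handler functions. This is a minimal sketch; the registry and the execute_tool_call helper are illustrative names, and the get_weather handler mirrors the stub used in the workflow above.

```python
import json

# Explicit allowlist: only functions registered here can ever run.
# Any tool name the model invents that is not in this map is rejected.
TOOL_REGISTRY = {
    "get_weather": lambda city, unit="celsius": {
        "city": city,
        "temperature": 29,
        "condition": "Partly cloudy",
        "unit": unit,
    },
}

def execute_tool_call(name: str, arguments_json: str):
    """Look up a tool by name in the allowlist and call it with parsed arguments."""
    handler = TOOL_REGISTRY.get(name)
    if handler is None:
        raise ValueError(f"Unknown tool: {name}")
    args = json.loads(arguments_json)
    return handler(**args)
```

In the workflow above, the tool_result line would become execute_tool_call(tool_call.function.name, tool_call.function.arguments), so the model never selects code paths outside the registry.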
Message Pattern for Tool Calls
After the model returns a tool call, the conversation history usually looks like this:
[
{
"role": "user",
"content": "What's the weather in Singapore?"
},
{
"role": "assistant",
"content": null,
"tool_calls": [
{
"id": "call_123",
"type": "function",
"function": {
"name": "get_weather",
"arguments": "{\"city\":\"Singapore\"}"
}
}
]
},
{
"role": "tool",
"tool_call_id": "call_123",
"content": "{\"city\":\"Singapore\",\"temperature\":29,\"condition\":\"Partly cloudy\",\"unit\":\"celsius\"}"
}
]
You then send that updated messages array back to the model so it can produce a final natural-language response.
tool_choice
The tool_choice field controls how the model uses tools.
- auto: The model decides whether to call a tool or answer directly
- none: The model is not allowed to call tools
- required: The model must call at least one tool
Some OpenAI-compatible implementations also support forcing a specific function:
{
"tool_choice": {
"type": "function",
"function": {
"name": "get_weather"
}
}
}
Use forced tool selection only when your workflow requires it.
Multiple Tools
You can provide more than one tool in the same request. For example, a travel assistant might expose:
- get_weather
- search_hotels
- search_restaurants
- get_exchange_rate
The model chooses the tool that best matches the user request, based on your tool descriptions and parameter schemas.
When multiple tools overlap, write the descriptions carefully so the model can tell them apart.
Streaming Tool Calls
Tool calls also work with stream=true. In streaming mode, tool call arguments may arrive incrementally and need to be assembled before execution.
import json
from openai import OpenAI
client = OpenAI(
api_key="YOUR_API_KEY",
base_url="https://api.hpc-ai.com/inference/v1",
)
tools = [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get the current weather for a city.",
"parameters": {
"type": "object",
"properties": {
"city": {"type": "string", "description": "The city to look up."}
},
"required": ["city"]
}
}
}
]
tool_calls = {}
with client.chat.completions.create(
model="minimax/minimax-m2.5",
messages=[{"role": "user", "content": "What's the weather in Singapore?"}],
tools=tools,
stream=True,
temperature=0.1,
) as stream:
for chunk in stream:
choices = getattr(chunk, "choices", None)
if not choices:
continue
choice = choices[0]
delta = getattr(choice, "delta", None)
if delta and getattr(delta, "tool_calls", None):
for tc in delta.tool_calls:
index = tc.index
if index not in tool_calls:
tool_calls[index] = {"id": "", "name": "", "arguments": ""}
if getattr(tc, "id", None):
tool_calls[index]["id"] = tc.id
fn = getattr(tc, "function", None)
if fn and getattr(fn, "name", None):
tool_calls[index]["name"] = fn.name
if fn and getattr(fn, "arguments", None):
tool_calls[index]["arguments"] += fn.arguments
finish_reason = getattr(choice, "finish_reason", None)
if finish_reason == "tool_calls":
break
for tc in tool_calls.values():
try:
args = json.loads(tc["arguments"])
print(f"Call {tc['name']} with {args}")
except json.JSONDecodeError:
print(f"Incomplete tool args for {tc['name']}: {tc['arguments']}")
Streaming is helpful when the model may generate large tool arguments or when you want lower perceived latency.
Tool Schema Best Practices
Well-designed schemas improve tool accuracy.
- Use specific function names
- Write clear descriptions for the tool and every parameter
- Mark required fields correctly
- Use enum when values come from a fixed set
- Keep schemas as simple as possible
- Avoid overlapping tools with ambiguous responsibilities
Example:
{
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "City name, for example San Francisco."
},
"unit": {
"type": "string",
"enum": ["celsius", "fahrenheit"],
"description": "Temperature unit."
}
},
"required": ["location"]
}
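Because the model can still emit arguments that violate the schema, it is worth checking them before execution. The following sketch covers only required fields, unknown keys, and enum membership, not full JSON Schema validation; validate_args is a hypothetical helper name.

```python
def validate_args(schema: dict, args: dict) -> list:
    """Check model-generated arguments against a tool's parameter schema.
    Returns a list of error strings; an empty list means the args passed.
    Covers required fields, unexpected keys, and enum values only."""
    errors = []
    props = schema.get("properties", {})
    # Every required field must be present.
    for field in schema.get("required", []):
        if field not in args:
            errors.append(f"missing required field: {field}")
    for key, value in args.items():
        # Reject keys the schema does not declare.
        if key not in props:
            errors.append(f"unexpected field: {key}")
            continue
        # Enforce enum membership where the schema declares one.
        allowed = props[key].get("enum")
        if allowed is not None and value not in allowed:
            errors.append(f"invalid value for {key}: {value!r}")
    return errors
```

For production use, a full JSON Schema validator library gives stricter guarantees (type checks, nested objects), but even this level of checking catches most malformed tool calls before they reach your systems.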
Parameter Recommendations
Tool calling usually works best with low randomness:
{
"temperature": 0.0,
"top_p": 1.0
}
Lower temperature reduces hallucinated argument values and makes tool selection more deterministic.
Safety Recommendations
Tool calling adds real-world side effects, so treat tool execution as part of your application security boundary.
- Validate tool arguments before execution
- Restrict tools to the minimum permissions they need
- Never directly execute arbitrary shell commands from model output
- Use an allowlist of supported tool names
- Sanitize external inputs before passing them to downstream systems
- Log tool requests and tool outputs for debugging and audits
Common Issues
The Model Does Not Call a Tool
Try:
- Improving the tool description
- Making the user request more explicit
- Setting tool_choice to required
- Lowering temperature
- Confirming that the selected model supports tool calling
Tool Arguments Are Missing or Incorrect
Try:
- Adding better parameter descriptions
- Using enum for constrained values
- Making required fields explicit
- Lowering temperature
The Model Answers Directly Instead of Waiting for Tool Results
This usually happens when:
- The tool description is too vague
- The model thinks it can answer from prior knowledge
- The workflow did not append the tool role message correctly
Make sure the assistant tool call and the corresponding tool message are both included before requesting the final answer.
Tool Calling in Multi-Turn Flows Becomes Confusing
Keep the conversation state clean:
- Preserve the assistant message that contains tool_calls
- Append one tool message per tool result
- Send the full updated message history in the follow-up request
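That bookkeeping can be wrapped in a small helper. This sketch works on plain message dicts using the field names shown in the message pattern above; append_tool_results is an illustrative name, not part of any SDK.

```python
import json

def append_tool_results(messages, assistant_message, results_by_call_id):
    """Keep conversation state clean after a round of tool calls:
    preserve the assistant message that contains tool_calls, then
    append exactly one tool message per result, matched by tool_call_id."""
    messages.append(assistant_message)
    for tool_call in assistant_message["tool_calls"]:
        call_id = tool_call["id"]
        messages.append({
            "role": "tool",
            "tool_call_id": call_id,
            "content": json.dumps(results_by_call_id[call_id]),
        })
    return messages
```

Sending the returned list in the follow-up request gives the model the complete tool-call round, which is what lets it produce a grounded final answer.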