# Quickstart

Make your first API call in minutes.
This quickstart is designed for an OpenAI-compatible inference service. Replace the placeholder values below with your API key and model name.
## Prerequisites
Before you begin, make sure you have:
- An API key for your inference service
- `curl` installed, or Python / Node.js if you want to use an SDK
## Step 1: Create and export an API key
Store your API key in an environment variable instead of hardcoding it in source files.
```bash
export INFERENCE_BASE_URL="https://api.hpc-ai.com/inference/v1"
export INFERENCE_API_KEY="your_api_key_here"
export INFERENCE_MODEL="minimax/minimax-m2.5"
```
## Step 2: Make your first API call
### curl
```bash
curl --request POST \
  --url "$INFERENCE_BASE_URL/chat/completions" \
  --header "Authorization: Bearer $INFERENCE_API_KEY" \
  --header "Content-Type: application/json" \
  --data @- <<EOF
{
  "model": "${INFERENCE_MODEL}",
  "messages": [
    {
      "role": "user",
      "content": "How are you?"
    }
  ],
  "temperature": 0.7,
  "max_tokens": 128
}
EOF
```
A successful response looks similar to this:
```json
{
  "id": "e63095aef9bc4d7292b769edb2cb6583",
  "object": "chat.completion",
  "created": 1773651537,
  "model": "minimax/minimax-m2.5",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hi there! I'm doing well, thank you for asking. How about you? How's your day going so far? Is there anything I can help you with today?",
        "reasoning_content": null,
        "tool_calls": null
      },
      "logprobs": null,
      "finish_reason": "stop",
      "matched_stop": 248046
    }
  ],
  "usage": {
    "prompt_tokens": 15,
    "total_tokens": 692,
    "completion_tokens": 677,
    "prompt_tokens_details": null,
    "reasoning_tokens": 0
  },
  "metadata": {
    "weight_version": "default"
  }
}
```
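The fields you will use most often are `choices[0].message.content` (the generated text) and `usage` (token accounting). As a minimal sketch, here is how to pull them out of the raw JSON — the `raw` string below is an abbreviated copy of the example response above:

```python
import json

# Abbreviated copy of the example response shown above.
raw = """
{
  "id": "e63095aef9bc4d7292b769edb2cb6583",
  "object": "chat.completion",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Hi there!"},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 15, "completion_tokens": 677, "total_tokens": 692}
}
"""

data = json.loads(raw)

# The generated text lives on the first choice's message.
reply = data["choices"][0]["message"]["content"]

# Token counts, useful for billing and context-window budgeting.
usage = data["usage"]

print(reply)                  # -> Hi there!
print(usage["total_tokens"])  # -> 692
```

`finish_reason` is also worth checking: `"stop"` means the model finished naturally, while `"length"` means the reply was cut off by `max_tokens`.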
### Python (OpenAI SDK)
Install the SDK:
```bash
pip install openai
```
Then call your service:
```python
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["INFERENCE_API_KEY"],
    base_url=os.environ["INFERENCE_BASE_URL"],
)

response = client.chat.completions.create(
    model=os.environ["INFERENCE_MODEL"],
    messages=[
        {"role": "user", "content": "Say hello in Spanish."}
    ],
    temperature=0.7,
    max_tokens=128,
)

print(response.choices[0].message.content)
```
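Real calls can fail transiently (rate limits, dropped connections), so it is worth wrapping them in a retry loop. The helper below is a hypothetical sketch, not part of the OpenAI SDK; in real use you would pass SDK exception types such as `openai.RateLimitError` via `retry_on`. The demo uses a local stub so it runs without a network connection:

```python
import time


def with_retries(fn, retries=3, base_delay=1.0, retry_on=(Exception,)):
    """Hypothetical helper: call fn(), retrying on transient errors
    with exponential backoff (base_delay, 2*base_delay, 4*base_delay, ...)."""
    for attempt in range(retries):
        try:
            return fn()
        except retry_on:
            if attempt == retries - 1:
                raise  # out of attempts; surface the error
            time.sleep(base_delay * (2 ** attempt))


# Demo with a stub that fails twice, then succeeds (no network needed).
calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("transient failure")
    return "ok"

print(with_retries(flaky, base_delay=0.01, retry_on=(TimeoutError,)))  # -> ok
```

Keeping the backoff exponential rather than fixed avoids hammering a rate-limited endpoint while it recovers.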
### JavaScript / TypeScript (OpenAI SDK)
Install the SDK:
```bash
npm install openai
```
Then call your service:
```typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.INFERENCE_API_KEY,
  baseURL: process.env.INFERENCE_BASE_URL,
});

const response = await client.chat.completions.create({
  model: process.env.INFERENCE_MODEL,
  messages: [
    {
      role: "user",
      content: "Say hello in Spanish.",
    },
  ],
  temperature: 0.7,
  max_tokens: 128,
});

console.log(response.choices[0].message.content);
```