Quickstart

Make your first API call in minutes.

This quickstart targets any OpenAI-compatible inference service. Replace the placeholder values below with your service's base URL, your API key, and the model name you want to use.


Prerequisites

Before you begin, make sure you have:

  • An API key for your inference service
  • curl installed, or Python / Node.js if you want to use an SDK

Step 1: Create and export an API key

Store your API key in an environment variable instead of hardcoding it in source files.

export INFERENCE_BASE_URL="https://api.hpc-ai.com/inference/v1"
export INFERENCE_API_KEY="your_api_key_here"
export INFERENCE_MODEL="minimax/minimax-m2.5"
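If a request fails with an authentication or routing error, a missing environment variable is a common cause. A minimal Python sketch that checks for them up front (the missing_vars helper is illustrative, not part of any SDK):

```python
import os

REQUIRED = ("INFERENCE_BASE_URL", "INFERENCE_API_KEY", "INFERENCE_MODEL")

def missing_vars(env=os.environ, required=REQUIRED):
    """Return the names of required variables that are absent or empty in env."""
    return [name for name in required if not env.get(name)]
```

Calling missing_vars() at startup and exiting with a clear message when it returns a non-empty list is easier to debug than a 401 response later.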

Step 2: Make your first API call

curl

curl --request POST \
  --url "$INFERENCE_BASE_URL/chat/completions" \
  --header "Authorization: Bearer $INFERENCE_API_KEY" \
  --header "Content-Type: application/json" \
  --data @- <<EOF
{
  "model": "${INFERENCE_MODEL}",
  "messages": [
    {
      "role": "user",
      "content": "How are you?"
    }
  ],
  "temperature": 0.7,
  "max_tokens": 128
}
EOF

A successful response looks similar to the following; your id, created timestamp, and token counts will differ:

{
  "id": "e63095aef9bc4d7292b769edb2cb6583",
  "object": "chat.completion",
  "created": 1773651537,
  "model": "minimax/minimax-m2.5",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hi there! I'm doing well, thank you for asking. How about you? How's your day going so far? Is there anything I can help you with today?",
        "reasoning_content": null,
        "tool_calls": null
      },
      "logprobs": null,
      "finish_reason": "stop",
      "matched_stop": 248046
    }
  ],
  "usage": {
    "prompt_tokens": 15,
    "total_tokens": 692,
    "completion_tokens": 677,
    "prompt_tokens_details": null,
    "reasoning_tokens": 0
  },
  "metadata": {
    "weight_version": "default"
  }
}
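In a script you usually only need the reply text and the token counts. A short sketch of pulling those fields out of the parsed JSON (extract_reply is an illustrative helper, not part of the API):

```python
def extract_reply(response):
    """Return (assistant_text, total_tokens) from a parsed chat.completions response."""
    choice = response["choices"][0]
    return choice["message"]["content"], response["usage"]["total_tokens"]
```

It is also worth checking choices[0]["finish_reason"]: "stop" means the model finished naturally, while "length" means the reply was cut off by max_tokens.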

Python (OpenAI SDK)

Install the SDK:

pip install openai

Then call your service:

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["INFERENCE_API_KEY"],
    base_url=os.environ["INFERENCE_BASE_URL"],
)

response = client.chat.completions.create(
    model=os.environ["INFERENCE_MODEL"],
    messages=[
        {"role": "user", "content": "Say hello in Spanish."}
    ],
    temperature=0.7,
    max_tokens=128,
)

print(response.choices[0].message.content)
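For longer replies you may prefer streaming, assuming your service supports OpenAI-style streaming (stream=True), which not every compatible backend does. A sketch using the same SDK; collect_stream is an illustrative helper that buffers the full reply, so for live output you would print each delta as it arrives instead:

```python
import os

def collect_stream(chunks):
    """Join the incremental text deltas from a streamed chat completion."""
    parts = []
    for chunk in chunks:
        delta = chunk.choices[0].delta.content
        if delta:  # some chunks carry no text (e.g. role or finish markers)
            parts.append(delta)
    return "".join(parts)

def stream_hello():
    """Make a streaming request and print the assembled reply."""
    from openai import OpenAI  # pip install openai

    client = OpenAI(
        api_key=os.environ["INFERENCE_API_KEY"],
        base_url=os.environ["INFERENCE_BASE_URL"],
    )
    stream = client.chat.completions.create(
        model=os.environ["INFERENCE_MODEL"],
        messages=[{"role": "user", "content": "Say hello in Spanish."}],
        stream=True,  # yields chunks as tokens are generated
    )
    print(collect_stream(stream))
```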

JavaScript / TypeScript (OpenAI SDK)

Install the SDK:

npm install openai

Then call your service:

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.INFERENCE_API_KEY,
  baseURL: process.env.INFERENCE_BASE_URL,
});

const response = await client.chat.completions.create({
  model: process.env.INFERENCE_MODEL,
  messages: [
    {
      role: "user",
      content: "Say hello in Spanish.",
    },
  ],
  temperature: 0.7,
  max_tokens: 128,
});

console.log(response.choices[0].message.content);