Reasoning Tokens
For models that support them, the OpenRoute API can return reasoning tokens, also known as thinking tokens. OpenRoute normalizes the different ways of customizing the amount of reasoning tokens a model will use, providing a unified interface across providers.
Reasoning tokens offer a transparent view into the reasoning steps a model takes. Reasoning tokens are counted as output tokens and billed accordingly.
If a model decides to output reasoning tokens, they are included in the response by default. Reasoning tokens appear in the reasoning field of each message, unless you choose to exclude them.
Some reasoning models do not return their reasoning tokens
While most models and providers make reasoning tokens available in the response, some (like the OpenAI o-series and Gemini Flash Thinking) do not.
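Because the reasoning field may be absent for these models, reading it defensively is a safe pattern. A minimal sketch, using a hypothetical parsed message dict rather than a live API response:

```python
# Hypothetical parsed message from a chat completion response; the
# "reasoning" key may be missing entirely for some models.
message = {"role": "assistant", "content": "Here is the plan..."}

# .get() returns None instead of raising KeyError when the field is absent.
reasoning = message.get("reasoning")
if reasoning is not None:
    print("REASONING:", reasoning)
else:
    print("No reasoning tokens returned for this model.")
```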
Controlling Reasoning Tokens
You can control reasoning tokens in your requests using the reasoning parameter:
import requests
import json

url = "https://api.openroute.cn/v1/chat/completions"
headers = {
    "Authorization": f"Bearer {API_KEY_REF}",
    "Content-Type": "application/json"
}
payload = {
    "model": "your-model",
    "messages": [],
    "reasoning": {
        # One of the following (not both):
        "effort": "high",  # Can be "high", "medium", or "low" (OpenAI-style)
        # "max_tokens": 2000,  # Specific token limit (Anthropic-style)

        # Optional: Default is false. All models support this.
        "exclude": False,  # Set to true to exclude reasoning tokens from response

        # Or enable reasoning with the default parameters:
        "enabled": True  # Default: inferred from `effort` or `max_tokens`
    }
}
response = requests.post(url, headers=headers, data=json.dumps(payload))
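For example, to request high reasoning effort while keeping the reasoning tokens out of the response, set the exclude flag. A minimal sketch — the payload is only constructed here, not sent:

```python
# Sketch: high reasoning effort, but omit reasoning tokens from the response.
# Note: excluded reasoning tokens may still be billed as output tokens,
# since the model still generates them internally.
payload = {
    "model": "your-model",
    "messages": [{"role": "user", "content": "Explain quicksort briefly."}],
    "reasoning": {
        "effort": "high",
        "exclude": True,  # reasoning happens, but is not returned
    },
}
print(payload["reasoning"])
```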
interface ReasoningConfig {
  // One of the following (not both):
  effort?: "high" | "medium" | "low"; // OpenAI-style
  max_tokens?: number; // Specific token limit (Anthropic-style)

  // Optional: Default is false. All models support this.
  exclude?: boolean; // Set to true to exclude reasoning tokens from response

  // Or enable reasoning with the default parameters:
  enabled?: boolean; // Default: inferred from `effort` or `max_tokens`
}

const payload = {
  model: "your-model",
  messages: [],
  reasoning: {
    // One of the following (not both):
    effort: "high" as const,
    // max_tokens: 2000,
    exclude: false,
    enabled: true
  } as ReasoningConfig
};

const response = await fetch("https://api.openroute.cn/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${API_KEY_REF}`,
    "Content-Type": "application/json"
  },
  body: JSON.stringify(payload)
});
Basic Usage with Reasoning Tokens
import requests
import json

url = "https://api.openroute.cn/v1/chat/completions"
headers = {
    "Authorization": f"Bearer {API_KEY_REF}",
    "Content-Type": "application/json"
}
payload = {
    "model": "openai/o3-mini",
    "messages": [
        {"role": "user", "content": "How would you build the world's tallest skyscraper?"}
    ],
    "reasoning": {
        "effort": "high"  # Use high reasoning effort
    }
}
response = requests.post(url, headers=headers, data=json.dumps(payload))
print(response.json()['choices'][0]['message']['reasoning'])
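For models that take an explicit token budget instead of an effort level, the same request can use max_tokens (Anthropic-style). A sketch under assumptions: the model slug below is illustrative, and the payload is only built, not sent:

```python
# Sketch: Anthropic-style reasoning budget instead of an effort level.
# "anthropic/claude-3.7-sonnet" is an illustrative model slug.
payload = {
    "model": "anthropic/claude-3.7-sonnet",
    "messages": [
        {"role": "user", "content": "How would you build the world's tallest skyscraper?"}
    ],
    "reasoning": {
        "max_tokens": 2000  # token budget for thinking (Anthropic-style)
    },
}
print(payload["reasoning"]["max_tokens"])
```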
import OpenAI from 'openai';

const openai = new OpenAI({
  baseURL: 'https://api.openroute.cn/v1',
  apiKey: API_KEY_REF,
});

async function getResponseWithReasoning() {
  const response = await openai.chat.completions.create({
    model: 'openai/o3-mini',
    messages: [
      {
        role: 'user',
        content: "How would you build the world's tallest skyscraper?",
      },
    ],
    reasoning: {
      effort: 'high', // Use high reasoning effort
    },
  });

  console.log('REASONING:', response.choices[0].message.reasoning);
  console.log('CONTENT:', response.choices[0].message.content);
}

getResponseWithReasoning();