Creates a model response for the given input. Designed for agentic workflows with structured output items (messages, reasoning, function calls). Only type: "function" tools are supported. This is a stateless API: supply the full conversation history via input[] on each request.
Infercom API Key
Response creation parameters
Request body for creating a model response.
The model ID to use (e.g., MiniMax-M2.5, gpt-oss-120b).
"MiniMax-M2.5"
Plain text input (equivalent to a user message).
System message prepended to input.
If true, stream response as Server-Sent Events.
Maximum tokens to generate.
Randomness control (0-2).
0 <= x <= 2
Nucleus sampling cutoff (0-1).
0 <= x <= 1
Top-K sampling (1-100).
1 <= x <= 100
Function tools available to the model (max 128).
x <= 128
Controls how the model uses tools.
none, auto, required
Allow multiple tool calls in parallel.
Response format configuration.
Reasoning configuration for supported models.
User identifier (echoed in response).
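A sketch of a full request body exercising the parameters above. The exact JSON field names (e.g. instructions, max_output_tokens, top_k) are assumptions inferred from the descriptions, and the tool schema shown is illustrative:

```python
# Hypothetical request body; field names are assumptions based on the
# parameter descriptions above, and values are illustrative.
request_body = {
    "model": "MiniMax-M2.5",
    "instructions": "You are a helpful assistant.",  # system message
    "input": "What is the weather in Paris?",        # plain-text input
    "temperature": 0.7,        # randomness, 0 <= x <= 2
    "top_p": 0.9,              # nucleus cutoff, 0 <= x <= 1
    "top_k": 40,               # top-K sampling, 1 <= x <= 100
    "max_output_tokens": 512,
    "tools": [{                # only type: "function" tools are supported
        "type": "function",
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }],
    "tool_choice": "auto",            # none | auto | required
    "parallel_tool_calls": True,
    "user": "user-1234",              # echoed in the response
}
```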
Successful response. Returns a ResponseResponse object (non-streaming), or a stream of Server-Sent Events (when stream: true).
Response object from POST /responses.
Unique response identifier.
Object type, always "response".
response
Response lifecycle status.
completed, failed, in_progress, incomplete
Unix timestamp when created.
Model ID used.
Output items (messages, reasoning, function calls).
An output item in the response (message, reasoning, or function_call).
Unix timestamp when completed.
Token usage statistics for the response.
Error details when status is "failed".
Echoed system instructions.
Controls how the model uses tools.
none, auto, required
Always false (stateless API).
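A sketch of consuming the output[] array of a non-streaming response, separating function calls from messages and reasoning. The sample payload and item shapes are assumptions based on the field list above:

```python
import json

# Assumed sample of a completed response object; item shapes are
# illustrative, not confirmed by the reference.
sample_response = {
    "id": "resp_123",
    "object": "response",
    "status": "completed",
    "model": "MiniMax-M2.5",
    "output": [
        {"type": "reasoning", "content": "The user asked about weather."},
        {"type": "function_call", "name": "get_weather",
         "arguments": '{"city": "Paris"}'},
        {"type": "message", "role": "assistant",
         "content": "Let me check the weather."},
    ],
}

def collect_function_calls(response):
    """Return (name, parsed_arguments) pairs from a completed response."""
    if response["status"] != "completed":
        return []
    return [(item["name"], json.loads(item["arguments"]))
            for item in response["output"]
            if item["type"] == "function_call"]

calls = collect_function_calls(sample_response)
```

Checking status before reading output[] matters because a "failed" or "incomplete" response carries error details instead of usable items.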