Build agentic AI applications with the Infercom Responses API. Structured outputs, function calling, reasoning, and streaming for coding agents and tool-capable integrations.
The Responses API (POST /v1/responses) is designed for agentic workflows and tool-capable integrations. It structures model output as typed items—messages, function calls, and reasoning—rather than a single text field, enabling sophisticated multi-step agent interactions.
The Responses API complements the Chat Completions API and does not replace it. Use the Responses API for agentic workflows and tool calling; use Chat Completions for simpler conversational needs.
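Because output arrives as a list of typed items rather than one text field, client code typically dispatches on each item's type. A minimal sketch of that dispatch, using plain dicts to stand in for the SDK's item objects (the dict shapes here are illustrative, not the SDK's exact classes):

```python
# Sketch: dispatching on typed output items. The dicts below stand in
# for the SDK's item objects, which expose the same fields as attributes.
def collect_text(output_items):
    """Concatenate text from message items, skipping other item types."""
    parts = []
    for item in output_items:
        if item["type"] == "message":
            parts.append(item["text"])
        elif item["type"] == "function_call":
            pass  # handled by the tool-execution loop, not shown here
        elif item["type"] == "reasoning":
            pass  # internal reasoning; usually not shown to end users
    return "".join(parts)

# Simulated response.output
items = [
    {"type": "reasoning", "text": "thinking..."},
    {"type": "message", "text": "Hello!"},
]
print(collect_text(items))  # Hello!
```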
The simplest usage passes a string input and receives a structured response.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.infercom.ai/v1",
    api_key="your-infercom-api-key"
)

response = client.responses.create(
    model="MiniMax-M2.5",
    input="Explain the difference between supervised and unsupervised learning."
)

# Access the text output
print(response.output_text)
Use the instructions parameter to provide system-level guidance:
response = client.responses.create(
    model="MiniMax-M2.5",
    instructions="You are a helpful assistant that speaks like a pirate.",
    input="How are you today?"
)
Since the API is stateless, include the full conversation history in the input array:
# Turn 1
response_1 = client.responses.create(
    model="MiniMax-M2.5",
    input=[{"role": "user", "content": "My name is Thomas."}]
)

# Turn 2 - include prior messages
response_2 = client.responses.create(
    model="MiniMax-M2.5",
    input=[
        {"role": "user", "content": "My name is Thomas."},
        response_1.output[0],  # Include assistant's response
        {"role": "user", "content": "What is my name?"}
    ]
)
print(response_2.output_text)  # "Your name is Thomas..."
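Since every request must carry the full history, it is convenient to keep a running list and append each turn to it. A hedged sketch of that pattern, with a stand-in `send` function replacing `client.responses.create` so the example runs without network access:

```python
# Sketch: maintaining conversation history for a stateless API.
# `send` is a stand-in for client.responses.create; it returns a fake
# assistant message so the example is self-contained.
def send(history):
    return {"role": "assistant", "content": f"echo: {history[-1]['content']}"}

history = []

def ask(text):
    history.append({"role": "user", "content": text})
    reply = send(history)
    history.append(reply)  # keep the assistant turn for the next request
    return reply["content"]

ask("My name is Thomas.")
answer = ask("What is my name?")
print(len(history))  # 4 items: two user turns, two assistant turns
```

With the real client, `send` would pass `history` as `input` and append the items from `response.output` instead of the fake reply.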
import json

# Execute the function locally
def get_weather(city: str) -> dict:
    # Your actual weather API call here
    return {"city": city, "temperature": "18°C", "condition": "Cloudy"}

# Find the function call in the response
# (`response` and `tools` come from the initial tool-calling request)
tool_call = next(item for item in response.output if item.type == "function_call")
args = json.loads(tool_call.arguments)
result = get_weather(args["city"])

# Send the result back to the model
follow_up = client.responses.create(
    model="MiniMax-M2.5",
    input=[
        {"role": "user", "content": "What's the weather in Berlin?"},
        tool_call,  # Include the function call
        {
            "type": "function_call_output",
            "call_id": tool_call.call_id,
            "output": json.dumps(result)
        }
    ],
    tools=tools
)
print(follow_up.output_text)  # "The weather in Berlin is 18°C and cloudy."
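The round trip above generalizes to a dispatch table mapping function names to local callables. A minimal, self-contained sketch (the tool-call item is simulated as a dict; real items expose the same fields as attributes):

```python
import json

# Sketch: generic tool dispatch. Looks up a local function by the
# model-supplied name, parses its JSON arguments, and builds the
# function_call_output item to send back.
def get_weather(city: str) -> dict:
    return {"city": city, "temperature": "18°C", "condition": "Cloudy"}

TOOLS = {"get_weather": get_weather}

def run_tool_call(tool_call: dict) -> dict:
    fn = TOOLS[tool_call["name"]]
    args = json.loads(tool_call["arguments"])
    result = fn(**args)
    return {
        "type": "function_call_output",
        "call_id": tool_call["call_id"],
        "output": json.dumps(result),
    }

# Simulated function_call item
call = {
    "name": "get_weather",
    "call_id": "call_1",
    "arguments": json.dumps({"city": "Berlin"}),
}
print(run_tool_call(call)["output"])
```

Keeping the dispatch in one table makes it easy to register additional tools without changing the loop.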
Enable streaming for real-time output by setting `stream` to `true`. The API emits Server-Sent Events:
stream = client.responses.create(
    model="MiniMax-M2.5",
    input="Write a short poem about speed.",
    stream=True
)

for event in stream:
    if event.type == "response.output_text.delta":
        print(event.delta, end="", flush=True)
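The full text is simply the concatenation of the delta events. A sketch of buffering those deltas, with the stream simulated as a list of simple objects (the real SDK yields typed event objects with the same `.type` and `.delta` attributes):

```python
from types import SimpleNamespace

# Sketch: accumulating streamed text deltas into the final output.
# The events below are simulated stand-ins for the SDK's event stream.
events = [
    SimpleNamespace(type="response.output_text.delta", delta="Speed "),
    SimpleNamespace(type="response.output_text.delta", delta="wins."),
    SimpleNamespace(type="response.completed", delta=None),
]

chunks = []
for event in events:
    if event.type == "response.output_text.delta":
        chunks.append(event.delta)

full_text = "".join(chunks)
print(full_text)  # Speed wins.
```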