API endpoints
Text generation
| Endpoint | Description | Best for |
|---|---|---|
/v1/chat/completions | OpenAI-compatible chat API | Conversational applications, OpenAI SDK users |
/v1/messages | Anthropic-compatible Messages API | Claude Code, LangChain Anthropic, Anthropic SDK users |
/v1/responses | Structured output API for agentic workflows | Coding agents, tool calling, multi-step reasoning |
Embeddings
| Endpoint | Description | Best for |
|---|---|---|
/v1/embeddings | Generate vector embeddings | RAG, semantic search, classification |
Audio
| Endpoint | Description | Best for |
|---|---|---|
/v1/audio/transcriptions | Transcribe audio to text | Speech-to-text, meeting transcription |
/v1/audio/translations | Translate audio to English | Multilingual audio processing |
Infercom Inference Service
Code examples in this API reference include the specific URLs for developing on Infercom Inference Service.Model sovereignty metadata
The/v1/models endpoint supports a ?verbose=true query parameter that returns detailed metadata for each model, including sovereignty information. Use the sn_metadata.region field to determine where a model is hosted:
"EU"— the model runs on Infercom’s EU infrastructure in Germany with full data sovereignty- Empty or absent — the model is available via SambaNova’s global infrastructure