Skip to main content
The Infercom Developer guide and API reference provide the tools you need to build applications using Infercom as an inference service.

API endpoints

Text generation

EndpointDescriptionBest for
/v1/chat/completionsOpenAI-compatible chat APIConversational applications, OpenAI SDK users
/v1/messagesAnthropic-compatible Messages APIClaude Code, LangChain Anthropic, Anthropic SDK users
/v1/responsesStructured output API for agentic workflowsCoding agents, tool calling, multi-step reasoning
See OpenAI compatibility for the Chat Completions API, Anthropic compatibility for the Messages API, or Responses API for agentic workflows.

Embeddings

EndpointDescriptionBest for
/v1/embeddingsGenerate vector embeddingsRAG, semantic search, classification
See Embeddings for usage examples.

Audio

EndpointDescriptionBest for
/v1/audio/transcriptionsTranscribe audio to textSpeech-to-text, meeting transcription
/v1/audio/translationsTranslate audio to EnglishMultilingual audio processing
See Audio for usage examples.

Infercom Inference Service

Code examples in this API reference include the specific URLs for developing on Infercom Inference Service.

Model sovereignty metadata

The /v1/models endpoint supports a ?verbose=true query parameter that returns detailed metadata for each model, including sovereignty information. Use the sn_metadata.region field to determine where a model is hosted:
  • "EU" — the model runs on Infercom’s EU infrastructure in Germany with full data sovereignty
  • Empty or absent — the model is available via SambaNova’s global infrastructure
See Supported models for details and code examples.