Inference

inference

Methods

Chat Completion -> { completion_message, logprobs } | { event }
post/alpha/inference/chat-completion
Parameters
X-LlamaStack-Client-Version: string
Optional
X-LlamaStack-Provider-Data: string
Optional
Response fields
ChatCompletionResponse = { completion_message, logprobs }
ChatCompletionResponseStreamChunk = { event }
Request example
Completion -> | { delta, logprobs, stop_reason }
post/alpha/inference/completion
Embeddings ->
post/alpha/inference/embeddings

Domain types

CompletionResponse = { content, stop_reason, logprobs }
EmbeddingsResponse = { embeddings }
TokenLogProbs = { logprobs_by_token }