Parameters - Anannas

Overview

This document describes all parameters available in the /v1/chat/completions endpoint. Parameters are validated server-side; unsupported parameters for a given provider are ignored.

Required Parameters

`model`

Type: string
Required: Yes
Description: Model identifier in format provider/model-name Examples:

openai/gpt-5-mini
anthropic/claude-3-sonnet
openai/gpt-3.5-turbo

`messages`

Type: Message[]
Required: Yes
Minimum: 1 message
Description: Array of message objects with role and content Message roles:

system: System instructions (typically first message)
user: User input
assistant: Model responses (for conversation history)
tool: Tool execution results

Sampling Parameters

`temperature`

Type: number
Range: 0.0 - 2.0
Default: 1.0
Description: Controls randomness. Lower values make output more deterministic.

0.0: Most deterministic
1.0: Balanced
2.0: Most random

`top_p`

Type: number
Range: 0.0 - 1.0
Default: 1.0
Description: Nucleus sampling - considers tokens with cumulative probability mass.

`max_tokens`

Type: integer
Minimum: 1
Description: Maximum tokens to generate. Model-specific limits apply.

`max_completion_tokens`

Type: integer
Description: Alternative to max_tokens (provider-specific).

`stop`

Type: string | string[]
Description: Stop sequences that halt generation. Can be a single string or array.

{
  "stop": ["\n\n", "Human:"]
}

`seed`

Type: integer
Description: Random seed for deterministic outputs. Only supported by some models.

Penalties

`frequency_penalty`

Type: number
Range: -2.0 - 2.0
Description: Reduces likelihood of repeating tokens. Positive values decrease repetition.

`presence_penalty`

Type: number
Range: -2.0 - 2.0
Description: Reduces likelihood of discussing new topics. Positive values encourage new topics.

`repetition_penalty`

Type: number
Range: (0, 2]
Description: Provider-specific repetition control.

Advanced Sampling

`top_k`

Type: integer
Minimum: 0
Description: Limits sampling to top K tokens by probability.

Not allowed with reasoning: When reasoning is enabled, top_k is not supported and will result in an error.

`top_a`

Type: number
Range: [0, 1]
Description: Provider-specific sampling parameter.

`min_p`

Type: number
Range: [0, 1]
Description: Minimum probability threshold for token selection.

`logit_bias`

Type: { [token_id: number]: number }
Description: Bias specific tokens by ID. Values typically range from -100 to 100.

{
  "logit_bias": {
    "1234": 10,   // Increase likelihood
    "5678": -10   // Decrease likelihood
  }
}

Tool Calling

`tools`

Type: Tool[]
Description: Array of function definitions for tool calling.

{
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_weather",
      "description": "Get weather for a location",
      "parameters": {
        "type": "object",
        "properties": {
          "location": {"type": "string"}
        },
        "required": ["location"]
      }
    }
  }]
}

`tool_choice`

Type: "none" | "auto" | "required" | { type: "function", function: { name: string } }
Default: "auto"
Description: Controls tool usage behavior.

"none": No tools called
"auto": Model decides
"required": Model must call a tool
Object: Force specific function

Restrictions with reasoning: When reasoning is enabled, only "auto" or "none" are allowed. Using "required" or forcing a specific tool will result in an error. This restriction enables interleaved thinking, which allows reasoning between tool calls.

`parallel_tool_calls`

Type: boolean
Default: true
Description: Allow multiple tool calls in a single response.

Structured Outputs

`response_format`

Type: { type: "json_object" } | { type: "json_schema", json_schema: object }
Description: Enforce JSON output format. JSON Object:

{
  "response_format": {
    "type": "json_object"
  }
}

JSON Schema:

{
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "response",
      "strict": true,
      "schema": {
        "type": "object",
        "properties": {
          "answer": {"type": "string"}
        }
      }
    }
  }
}

Streaming

`stream`

Type: boolean
Default: false
Description: Enable Server-Sent Events streaming. See Streaming documentation.

Reasoning

`reasoning`

Type: object
Description: Configure reasoning for models that support it.

Check Reasoning Support

For models that support reasoning and their configuration options, visit anannas.ai/models.

{
  "reasoning": {
    "effort": "high",      // "low" | "medium" | "high"
    "max_tokens": 10000,   // Maximum reasoning tokens
    "enabled": true,       // Enable reasoning
    "exclude": false       // Exclude from response
  }
}

Interleaved Thinking with ToolsWhen using reasoning with tools on Claude 4 models (Sonnet 4.5, Opus 4.5, Haiku 4.5), interleaved thinking is automatically enabled. This allows the model to reason between tool calls. With interleaved thinking, max_tokens in the reasoning config can exceed the request’s max_tokens parameter, as it represents the total budget across all thinking blocks within one assistant turn.Interleaved thinking requires tool_choice: "auto" (or no tool_choice specified). Using tool_choice: "required" or forcing a specific tool will result in an error when reasoning is enabled.

`thinking_config`

Type: object
Description: External API alias for reasoning. Maps to reasoning internally.

{
  "thinking_config": {
    "include_thoughts": true,
    "thinking_budget": 10000,
    "thinking_level": "high"
  }
}

Multimodal

`modalities`

Type: string[]
Description: Requested output modalities: ["text", "audio", "image"]

`audio`

Type: object
Description: Audio output configuration.

{
  "audio": {
    "voice": "alloy",
    "format": "mp3"
  }
}

Prompt Caching

`prompt_cache_key`

Type: string
Description: Cache key for OpenAI prompt caching. See Overview.

Routing

`models`

Type: string[]
Description: Fallback model list for routing.

`route`

Type: "fallback"
Description: Enable smart fallback routing.

`provider`

Type: object
Description: Provider selection preferences.

Check Provider Pricing

For current pricing to set max_price limits, visit anannas.ai/models.

{
  "provider": {
    "order": ["openai", "anthropic"],
    "allow_fallbacks": true,
    "require_parameters": false,
    "data_collection": "deny",
    "zdr": true,
    "only": ["openai"],
    "ignore": ["anthropic"],
    "quantizations": ["q4", "q8"],
    "sort": "price",
    "max_price": {
      "prompt": 0.001,
      "completion": 0.002
    }
  }
}

`fallbacks`

Type: Array<string | object>
Description: Explicit fallback chain.

{
  "fallbacks": [
    "anthropic/claude-3-sonnet",
    {
      "model": "openai/gpt-3.5-turbo",
      "provider": {"only": ["openai"]}
    }
  ]
}

User Tracking

`user`

Type: string
Description: Stable identifier for end-users (abuse prevention).

Metadata

`metadata`

Type: { [key: string]: string }
Description: Custom metadata for request tracking.

{
  "metadata": {
    "request_id": "req-123",
    "environment": "production"
  }
}

Provider-Specific Parameters

Check Parameter Support

For detailed parameter support by model and provider, visit anannas.ai/models.

Some parameters are provider-specific and may be ignored by others:

Mistral: safe_prompt
Hyperbolic: raw_mode
Grok: search_parameters, deferred
Anthropic: cache_control in message content

Parameter Validation

Invalid parameter values return 400 Bad Request
Unsupported parameters are silently ignored
Provider-specific parameters are passed through when supported

Getting Started

Features

API

Models

Use Cases

Community

​Overview

​Required Parameters

​model

​messages

​Sampling Parameters

​temperature

​top_p

​max_tokens

​max_completion_tokens

​stop

​seed

​Penalties

​frequency_penalty

​presence_penalty

​repetition_penalty

​Advanced Sampling

​top_k

​top_a

​min_p

​logit_bias

​Tool Calling

​tools

​tool_choice

​parallel_tool_calls

​Structured Outputs

​response_format

​Streaming

​stream

​Reasoning

​reasoning

Check Reasoning Support

​thinking_config

​Multimodal

​modalities

​audio

​Prompt Caching

​prompt_cache_key

​Routing

​models

​route

​provider

Check Provider Pricing

​fallbacks

​User Tracking

​user

​Metadata

​metadata

​Provider-Specific Parameters

Check Parameter Support

​Parameter Validation

​See Also

Overview

Required Parameters

`model`

`messages`

Sampling Parameters

`temperature`

`top_p`

`max_tokens`

`max_completion_tokens`

`stop`

`seed`

Penalties

`frequency_penalty`

`presence_penalty`

`repetition_penalty`

Advanced Sampling

`top_k`

`top_a`

`min_p`

`logit_bias`

Tool Calling

`tools`

`tool_choice`

`parallel_tool_calls`

Structured Outputs

`response_format`

Streaming

`stream`

Reasoning

`reasoning`

`thinking_config`

Multimodal

`modalities`

`audio`

Prompt Caching

`prompt_cache_key`

Routing

`models`

`route`

`provider`

`fallbacks`

User Tracking

`user`

Metadata

`metadata`

Provider-Specific Parameters

Parameter Validation

See Also