API Reference
Client Classes
- class venice_ai.AsyncVeniceClient(*, api_key: str | None = None, base_url: str | httpx.URL | None = None, timeout: float | httpx.Timeout | None = Timeout(connect=5.0, read=60.0, write=60.0, pool=60.0), default_timeout: httpx.Timeout | None = None, http_client: httpx.AsyncClient | None = None, http_transport_options: Dict[str, Any] | None = None, proxy: ProxyTypes | NotGiven = NOT_GIVEN, transport: httpx.BaseTransport | NotGiven = NOT_GIVEN, async_transport: httpx.AsyncBaseTransport | NotGiven = NOT_GIVEN, limits: httpx.Limits | NotGiven = NOT_GIVEN, cert: CertTypes | NotGiven = NOT_GIVEN, verify: bool | str | ssl.SSLContext | NotGiven = NOT_GIVEN, trust_env: bool | NotGiven = NOT_GIVEN, http1: bool | NotGiven = NOT_GIVEN, http2: bool | NotGiven = NOT_GIVEN, follow_redirects: bool | NotGiven = NOT_GIVEN, max_redirects: int | NotGiven = NOT_GIVEN, default_encoding: str | Callable[[bytes], str] | NotGiven = NOT_GIVEN, event_hooks: Mapping[str, List[Callable[..., Any]]] | NotGiven = NOT_GIVEN)
Provides an asynchronous client for interacting with the Venice.ai API.
This client provides a complete interface for making asynchronous requests to all Venice AI API endpoints. It handles authentication, request formation, response parsing, and error management through a clean, resource-oriented design.
The client architecture follows a namespaced resource pattern, where different API capabilities are organized into dedicated resource objects (e.g., chat, models, image). This design creates a clean separation of concerns and makes the API more discoverable and easily navigable.
- Parameters:
api_key (Optional[str]) – Your Venice.ai API key, required for authentication. If no key is passed explicitly, the api_key property falls back to the VENICE_API_KEY environment variable.
base_url (Optional[Union[str, httpx.URL]]) – Overrides the default base URL. Defaults to the Venice AI production API URL. Useful for testing against different environments.
timeout (Optional[Union[float, httpx.Timeout]]) – Request timeout in seconds, or an httpx.Timeout object for more granular control. Defaults to Timeout(connect=5.0, read=60.0, write=60.0, pool=60.0).
default_timeout (Optional[httpx.Timeout]) – Global default timeout for all API calls made by this client instance. If provided, it applies to every request unless overridden on a per-request basis. Takes precedence over the timeout parameter.
max_retries (int) – Maximum number of retries for connection errors or transient failures. This parameter controls the total number of retries for the httpx-retries mechanism. Defaults to 2.
retry_backoff_factor (float) – Backoff factor for retry delays. Defaults to 0.5.
retry_status_forcelist (Optional[List[int]]) – List of HTTP status codes to retry on. Defaults to [429, 500, 502, 503, 504].
retry_respect_retry_after_header (bool) – Whether to respect Retry-After headers. Defaults to True.
http_client (Optional[httpx.AsyncClient]) – An optional pre-configured httpx.AsyncClient instance to use for HTTP requests. If provided:
- The SDK will use this custom client directly.
- The SDK will still configure base_url (from the base_url parameter or default), timeout (from default_timeout or timeout parameter), and Authorization headers on this provided client instance.
- All other HTTP-related parameters passed to this constructor (e.g., max_retries, retry_backoff_factor, proxy, transport, limits, verify, etc.) will be ignored. It is assumed that the provided http_client is already configured with these aspects.
- You are responsible for managing the lifecycle of the provided http_client (e.g., closing it via await http_client.aclose()).
If not provided, the SDK will create and manage its own internal httpx.AsyncClient.
proxy (Optional[Union[str, httpx.URL, httpx.Proxy]]) – Proxy configuration for HTTP requests. Only used when http_client is not provided.
transport (Optional[httpx.BaseTransport]) – Custom transport for HTTP requests. Only used when http_client is not provided.
limits (Optional[httpx.Limits]) – Connection limits configuration. Only used when http_client is not provided.
cert (Optional[Union[str, Tuple[str, str]]]) – Client certificate configuration. Only used when http_client is not provided.
verify (Optional[Union[bool, str, ssl.SSLContext]]) – SSL certificate verification. Only used when http_client is not provided.
trust_env (Optional[bool]) – Whether to trust environment variables for proxy configuration. Only used when http_client is not provided.
http1 (Optional[bool]) – Whether to enable HTTP/1.1. Only used when http_client is not provided.
http2 (Optional[bool]) – Whether to enable HTTP/2. Only used when http_client is not provided.
default_encoding (Optional[Union[str, Callable[[bytes], str]]]) – Default encoding for response content. Only used when http_client is not provided.
event_hooks (Optional[Mapping[str, List[Callable[..., Any]]]]) – Event hooks for request/response lifecycle. Only used when http_client is not provided.
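The retry parameters above (max_retries, retry_backoff_factor, retry_status_forcelist, retry_respect_retry_after_header) can be pictured with a small stand-alone sketch. It is not the SDK's or httpx-retries' implementation: it assumes the common exponential schedule backoff_factor * 2 ** attempt, and the helper names should_retry and backoff_delays are hypothetical.

```python
from typing import List, Optional, Sequence

# Defaults taken from the parameter descriptions above.
DEFAULT_STATUS_FORCELIST = (429, 500, 502, 503, 504)


def should_retry(status_code: int, attempt: int, max_retries: int = 2,
                 status_forcelist: Sequence[int] = DEFAULT_STATUS_FORCELIST) -> bool:
    """Return True if a response with this status may be retried.

    `attempt` counts the retries already performed (0 before the first retry).
    """
    return attempt < max_retries and status_code in status_forcelist


def backoff_delays(max_retries: int = 2, backoff_factor: float = 0.5,
                   retry_after: Optional[float] = None) -> List[float]:
    """Delays in seconds before each retry.

    ASSUMPTION: exponential schedule backoff_factor * 2 ** attempt, and a
    respected Retry-After value simply replaces the computed delay.
    """
    delays = []
    for attempt in range(max_retries):
        computed = backoff_factor * (2 ** attempt)
        delays.append(retry_after if retry_after is not None else computed)
    return delays


print(backoff_delays())               # [0.5, 1.0]
print(should_retry(503, attempt=0))   # True
print(should_retry(404, attempt=0))   # False
```

With the documented defaults, this schedule would wait 0.5 s and then 1.0 s before giving up. The real delays may differ (for example through jitter), so treat the numbers as illustrative.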
- chat
Access to chat-related endpoints.
- Type:
AsyncChatResource
- models
Access to model listing and information endpoints.
- Type:
AsyncModels
- image
Access to image generation and manipulation endpoints.
- Type:
AsyncImage
- audio
Access to speech synthesis and audio processing endpoints.
- Type:
AsyncAudio
- billing
Access to billing and usage information endpoints.
- Type:
AsyncBilling
- embeddings
Access to embedding generation endpoints.
- Type:
AsyncEmbeddings
- api_keys
Access to API key management endpoints.
- Type:
AsyncApiKeys
- characters
Access to character management endpoints.
- Type:
AsyncCharacters
Examples
Basic usage:
    from venice_ai import AsyncVeniceClient

    async with AsyncVeniceClient(api_key="your-api-key") as client:
        response = await client.chat.completions.create(
            model="venice-1",
            messages=[{"role": "user", "content": "Hello, world!"}]
        )
        print(response["choices"][0]["message"]["content"])
Streaming example:
    from venice_ai import AsyncVeniceClient

    async with AsyncVeniceClient(api_key="your-api-key") as client:
        async for chunk in client.chat.completions.create(
            model="venice-1",
            messages=[{"role": "user", "content": "Count to 5"}],
            stream=True
        ):
            content = chunk["choices"][0]["delta"].get("content", "")
            if content:
                print(content, end="", flush=True)
Using with a custom httpx client:
    import httpx
    from venice_ai import AsyncVeniceClient

    # Create a custom client with specific configurations.
    # httpx.Timeout needs either a default or all four values, so pool is set too.
    custom_client = httpx.AsyncClient(
        timeout=httpx.Timeout(connect=5.0, read=30.0, write=10.0, pool=5.0),
        follow_redirects=True,
        http2=True
    )

    # Use the custom client with AsyncVeniceClient
    async with AsyncVeniceClient(
        api_key="your-api-key",
        http_client=custom_client
    ) as client:
        # Your API operations here
        pass

    # The SDK does not close a user-provided client; close it yourself.
    await custom_client.aclose()
- Raises:
ValueError – If api_key is empty or None.
Note
When used as an async context manager (in an async with block), the client will automatically close the underlying HTTP client upon exit, freeing any resources. For manual resource management, use the close() method.
- Parameters:
http_transport_options (typing.Optional[typing.Dict[str, typing.Any]])
async_transport (typing.Union[httpx.AsyncBaseTransport, venice_ai.utils.NotGivenType, None])
follow_redirects (typing.Union[bool, venice_ai.utils.NotGivenType, None])
max_redirects (typing.Union[int, venice_ai.utils.NotGivenType, None])
- async aclose()
Close the underlying asynchronous HTTP client and free all associated resources.
This is an alias for the close() method, following the conventional async naming pattern where async methods are prefixed with ‘a’. This method performs the same cleanup as close() - it closes the internal httpx.AsyncClient and any associated resources.
- Returns:
None
- Return type:
None
- property api_key: str
Get the API key for authentication.
Returns the explicitly set API key, or falls back to the VENICE_API_KEY environment variable if no key was explicitly provided.
- Returns:
The API key to use for authentication.
- Return type:
str
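The fallback described for this property can be sketched as follows. This is an illustration only: resolve_api_key is a hypothetical helper, and raising ValueError for a missing key is an assumption modelled on the constructor's documented behaviour.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(explicit: Optional[str] = None,
                    env: Optional[Mapping[str, str]] = None) -> str:
    """Prefer an explicitly passed key, else fall back to VENICE_API_KEY.

    ASSUMPTION: an empty or missing key raises ValueError; the real
    property may behave differently.
    """
    env = os.environ if env is None else env
    key = explicit or env.get("VENICE_API_KEY")
    if not key:
        raise ValueError("No API key: pass api_key or set VENICE_API_KEY")
    return key


print(resolve_api_key("explicit-key", env={"VENICE_API_KEY": "env-key"}))  # explicit-key
print(resolve_api_key(None, env={"VENICE_API_KEY": "env-key"}))            # env-key
```

Passing the environment as a mapping keeps the sketch testable without touching the real process environment.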
- build_request(method: str, path: str, *, json_data: Mapping[str, Any] | None = None, headers: Mapping[str, str] | None = None, params: Mapping[str, Any] | None = None)
Build a request with proper headers including authentication.
This method constructs the headers for a request, merging authentication headers with any provided headers. It supports default token retention by using the current api_key value.
- Parameters:
method (str) – HTTP method for the request.
path (str) – API endpoint path relative to the base URL.
json_data (Optional[Mapping[str, Any]]) – JSON-serializable request body.
headers (Optional[Mapping[str, str]]) – Additional HTTP headers to include.
params (Optional[Mapping[str, Any]]) – URL query parameters.
- Returns:
Dictionary containing the built request information.
- Return type:
Dict[str, Any]
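A rough picture of the header merging described for build_request, written as a free function. Everything specific here is an assumption: the Bearer scheme, the keys of the returned dictionary, and the rule that caller-supplied headers override the defaults are not confirmed by this reference.

```python
from typing import Any, Dict, Mapping, Optional


def build_request_sketch(api_key: str, method: str, path: str, *,
                         json_data: Optional[Mapping[str, Any]] = None,
                         headers: Optional[Mapping[str, str]] = None,
                         params: Optional[Mapping[str, Any]] = None) -> Dict[str, Any]:
    """Hypothetical stand-in for build_request; not the SDK's code."""
    # Start from the authentication header built from the current key
    # (ASSUMPTION: Bearer scheme), then layer caller headers on top.
    merged = {"Authorization": f"Bearer {api_key}"}
    merged.update(headers or {})
    return {
        "method": method.upper(),
        "path": path,
        "headers": merged,
        "json": dict(json_data) if json_data is not None else None,
        "params": dict(params) if params is not None else None,
    }


request = build_request_sketch(
    "your-api-key", "get", "models",
    headers={"X-Request-Id": "abc123"},
    params={"type": "text"},
)
print(request["headers"])
# {'Authorization': 'Bearer your-api-key', 'X-Request-Id': 'abc123'}
```

Because the key is read at call time, a key changed after construction would be picked up by the next request in this sketch.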
- async close()
Close the underlying asynchronous HTTP client and free all associated resources.
This method performs cleanup of the internal httpx.AsyncClient and any associated resources such as connection pools, SSL contexts, and background tasks. It should be called when the Venice AI client is no longer needed to ensure proper resource cleanup and prevent resource leaks.
When using the client as an async context manager (in an async with block), this method is called automatically upon exiting the context, so manual cleanup is not required. For manual resource management, this method should be called explicitly.
The method is designed to be idempotent: it can be called multiple times safely. Only the first call actually performs the cleanup; subsequent calls are no-ops. This prevents errors if cleanup is attempted multiple times.
Note
If a user-provided httpx.AsyncClient was passed to the constructor, this method will not close it, as the user is responsible for managing the lifecycle of their own client.
After calling this method, the client should not be used for making further API requests. Attempting to use a closed client may result in errors or undefined behavior.
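The two guarantees described for close(), idempotence and leaving a user-provided client open, can be modelled with stand-in classes. FakeHTTPClient and ClientSketch are hypothetical and only mimic the documented behaviour; they are not the SDK's classes.

```python
import asyncio
from typing import Optional


class FakeHTTPClient:
    """Stand-in for httpx.AsyncClient that counts aclose() calls."""

    def __init__(self) -> None:
        self.close_calls = 0

    async def aclose(self) -> None:
        self.close_calls += 1


class ClientSketch:
    def __init__(self, http_client: Optional[FakeHTTPClient] = None) -> None:
        # Only a client created here is ours to close.
        self._owns_client = http_client is None
        self._http = FakeHTTPClient() if http_client is None else http_client
        self._closed = False

    async def close(self) -> None:
        if self._closed:
            return  # later calls are no-ops
        self._closed = True
        if self._owns_client:
            await self._http.aclose()

    aclose = close  # alias, mirroring the documented aclose()

    async def __aenter__(self) -> "ClientSketch":
        return self

    async def __aexit__(self, *exc_info) -> None:
        await self.close()


async def demo():
    owned = ClientSketch()
    async with owned:
        pass
    await owned.close()  # second call has no effect

    user_http = FakeHTTPClient()
    async with ClientSketch(http_client=user_http):
        pass
    return owned._http.close_calls, user_http.close_calls


print(asyncio.run(demo()))  # (1, 0)
```

The internally created client is closed exactly once, while the user-provided one is never closed by the sketch.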
- Return type:
None
- async delete(path: str, *, cast_to: Type[venice_ai._async_client.T] | None = None, **kwargs)
Make an asynchronous DELETE request to the specified API endpoint.
This is a convenience method that wraps the lower-level _request method specifically for DELETE requests. It handles proper header configuration for DELETE requests and provides a clean interface for resource deletion operations.
- Parameters:
path (str) – API endpoint path relative to the client’s base URL. Should not include a leading slash as it will be properly joined with the base URL.
cast_to (Optional[Type[T]]) – Optional Pydantic model to cast the response to.
kwargs – Additional keyword arguments to pass to the underlying _request method. This can include options like headers, params, timeout, or raw_response.
- Returns:
Parsed JSON response data as Python objects (typically dict or list). Many DELETE endpoints return confirmation data or the deleted resource details.
- Return type:
Any
- Raises:
venice_ai.exceptions.InvalidRequestError – For HTTP 400 errors indicating invalid request parameters.
venice_ai.exceptions.AuthenticationError – For HTTP 401 errors indicating invalid or missing API key.
venice_ai.exceptions.PermissionDeniedError – For HTTP 403 errors indicating insufficient permissions.
venice_ai.exceptions.NotFoundError – For HTTP 404 errors indicating the requested resource was not found.
venice_ai.exceptions.RateLimitError – For HTTP 429 errors indicating rate limit exceeded.
venice_ai.exceptions.InternalServerError – For HTTP 5xx errors indicating server-side problems.
venice_ai.exceptions.APITimeoutError – If the request times out before completion.
venice_ai.exceptions.APIConnectionError – For network connectivity issues or connection failures.
venice_ai.exceptions.APIError – For other API-related errors not covered by specific exceptions.
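The status-code mapping in the Raises list can be summarised in a small lookup. The exception classes below are local stand-ins that reuse the documented names; their inheritance from a common APIError base is an assumption, and error_for_status is a hypothetical helper.

```python
from typing import Dict, Type


# Local stand-ins named after the documented venice_ai.exceptions classes.
class APIError(Exception): ...
class InvalidRequestError(APIError): ...
class AuthenticationError(APIError): ...
class PermissionDeniedError(APIError): ...
class NotFoundError(APIError): ...
class RateLimitError(APIError): ...
class InternalServerError(APIError): ...


_STATUS_TO_ERROR: Dict[int, Type[APIError]] = {
    400: InvalidRequestError,
    401: AuthenticationError,
    403: PermissionDeniedError,
    404: NotFoundError,
    429: RateLimitError,
}


def error_for_status(status_code: int) -> Type[APIError]:
    """Pick the exception class documented for an HTTP error status."""
    if status_code in _STATUS_TO_ERROR:
        return _STATUS_TO_ERROR[status_code]
    if 500 <= status_code <= 599:
        return InternalServerError
    return APIError  # anything not covered by a specific class


print(error_for_status(404).__name__)  # NotFoundError
print(error_for_status(503).__name__)  # InternalServerError
print(error_for_status(418).__name__)  # APIError
```

APITimeoutError and APIConnectionError are left out because they describe transport failures rather than HTTP status codes.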
- async get(path: str, *, params: Mapping[str, Any] | None = None, cast_to: Type[venice_ai._async_client.T] | None = None, **kwargs)
Make an asynchronous GET request to the specified API endpoint.
This is a convenience method that wraps the lower-level _request method specifically for GET requests. It automatically handles proper header configuration for GET requests (removing Content-Type headers) and provides a clean interface for retrieving data from the API.
- Parameters:
path (str) – API endpoint path relative to the client’s base URL. Should not include a leading slash as it will be properly joined with the base URL.
params (Optional[Mapping[str, Any]]) – URL query parameters to include in the request. These will be properly URL-encoded and appended to the request URL.
cast_to (Optional[Type[T]]) – Optional Pydantic model to cast the response to.
kwargs – Additional keyword arguments to pass to the underlying _request method. This can include options like headers, timeout, or raw_response.
- Returns:
Parsed JSON response data as Python objects (typically dict or list).
- Return type:
Any
- Raises:
venice_ai.exceptions.InvalidRequestError – For HTTP 400 errors indicating invalid request parameters.
venice_ai.exceptions.AuthenticationError – For HTTP 401 errors indicating invalid or missing API key.
venice_ai.exceptions.PermissionDeniedError – For HTTP 403 errors indicating insufficient permissions.
venice_ai.exceptions.NotFoundError – For HTTP 404 errors indicating the requested resource was not found.
venice_ai.exceptions.RateLimitError – For HTTP 429 errors indicating rate limit exceeded.
venice_ai.exceptions.InternalServerError – For HTTP 5xx errors indicating server-side problems.
venice_ai.exceptions.APITimeoutError – If the request times out before completion.
venice_ai.exceptions.APIConnectionError – For network connectivity issues or connection failures.
venice_ai.exceptions.APIError – For other API-related errors not covered by specific exceptions.
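One reason a leading slash in path can matter: under standard RFC 3986 reference resolution, as implemented by urllib.parse.urljoin, a path starting with "/" replaces the whole path of the base URL. The sketch below only demonstrates that general rule. The base URL is a placeholder, join_path is a hypothetical helper, and the SDK may join paths differently.

```python
from urllib.parse import urljoin

# Placeholder base URL for illustration; not the real API endpoint.
BASE_URL = "https://api.example.invalid/api/v1/"

# A relative path is appended to the base path.
print(urljoin(BASE_URL, "models"))   # https://api.example.invalid/api/v1/models

# A leading slash discards the "/api/v1/" prefix.
print(urljoin(BASE_URL, "/models"))  # https://api.example.invalid/models


def join_path(base_url: str, path: str) -> str:
    """Join defensively: normalise the slash on both sides first."""
    return urljoin(base_url.rstrip("/") + "/", path.lstrip("/"))


print(join_path("https://api.example.invalid/api/v1", "/models"))
# https://api.example.invalid/api/v1/models
```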
- async get_model_pricing(model_id: str)
Get pricing information for a specific model.
Retrieves the pricing structure for a given model ID, including both USD and VCU (Venice Compute Units) costs for input and output tokens.
- Parameters:
model_id (str) – The ID of the model to get pricing for
- Returns:
Pricing information for the model
- Return type:
ModelPricing
- Raises:
ValueError – If the model is not found
Example
>>> async with AsyncVeniceClient(api_key="your-api-key") as client:
...     pricing = await client.get_model_pricing("llama-3.3-70b")
...     print(f"Input: ${pricing['input']['usd']}/1k tokens")
...     print(f"Output: ${pricing['output']['usd']}/1k tokens")
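A pricing mapping of the shape used in the example can be turned into a cost estimate. This sketch makes two assumptions: that prices are quoted per 1,000 tokens, as the example's output format suggests, and that the mapping has input and output entries each carrying a usd value. The prices below are made-up placeholders, and estimate_cost_usd is a hypothetical helper.

```python
from typing import Any, Mapping


def estimate_cost_usd(pricing: Mapping[str, Any], input_tokens: int,
                      output_tokens: int, per_tokens: int = 1000) -> float:
    """Estimate the USD cost of a request from token counts.

    ASSUMPTION: prices are per `per_tokens` tokens (1,000 by default).
    """
    input_cost = (input_tokens / per_tokens) * pricing["input"]["usd"]
    output_cost = (output_tokens / per_tokens) * pricing["output"]["usd"]
    return input_cost + output_cost


# Placeholder prices for illustration only; not real Venice pricing.
sample_pricing = {"input": {"usd": 0.5}, "output": {"usd": 1.5}}

print(estimate_cost_usd(sample_pricing, input_tokens=2000, output_tokens=500))  # 1.75
```

If the API quotes prices per a different unit, adjust per_tokens accordingly.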
- async post(path: str, *, json_data: Mapping[str, Any] | None = None, timeout: float | httpx.Timeout | None = None, cast_to: Type[venice_ai._async_client.T] | None = None, **kwargs)
Make an asynchronous POST request to the specified API endpoint.
This is a convenience method that wraps the lower-level _request method specifically for POST requests. It handles JSON serialization of the request body and ensures proper Content-Type headers are set for JSON requests.
- Parameters:
path (str) – API endpoint path relative to the client’s base URL. Should not include a leading slash as it will be properly joined with the base URL.
json_data (Optional[Mapping[str, Any]]) – JSON-serializable data to send in the request body. This will be automatically serialized to JSON and sent with Content-Type: application/json headers. Can include any data structure that is JSON-serializable (dict, list, primitives).
timeout (Optional[Union[float, httpx.Timeout]]) – Request timeout configuration. Can be a float specifying timeout in seconds, or an httpx.Timeout object for granular timeout control. If not provided, uses the client’s default timeout setting.
cast_to (Optional[Type[T]]) – Optional Pydantic model to cast the response to.
kwargs – Additional keyword arguments to pass to the underlying _request method. This can include options like headers, params, or raw_response.
- Returns:
Parsed JSON response data as Python objects (typically dict or list).
- Return type:
Any
- Raises:
venice_ai.exceptions.InvalidRequestError – For HTTP 400 errors indicating invalid request parameters.
venice_ai.exceptions.AuthenticationError – For HTTP 401 errors indicating invalid or missing API key.
venice_ai.exceptions.PermissionDeniedError – For HTTP 403 errors indicating insufficient permissions.
venice_ai.exceptions.NotFoundError – For HTTP 404 errors indicating the requested resource was not found.
venice_ai.exceptions.RateLimitError – For HTTP 429 errors indicating rate limit exceeded.
venice_ai.exceptions.InternalServerError – For HTTP 5xx errors indicating server-side problems.
venice_ai.exceptions.APITimeoutError – If the request times out before completion.
venice_ai.exceptions.APIConnectionError – For network connectivity issues or connection failures.
venice_ai.exceptions.APIError – For other API-related errors not covered by specific exceptions.
- class venice_ai.VeniceClient(*, api_key: str | None = None, base_url: str | httpx.URL | None = None, timeout: float | httpx.Timeout | None = Timeout(connect=5.0, read=60.0, write=60.0, pool=60.0), default_timeout: httpx.Timeout | None = None, http_client: httpx.Client | None = None, http_transport_options: Dict[str, Any] | None = None, proxy: ProxyTypes | NotGiven = NOT_GIVEN, transport: httpx.BaseTransport | NotGiven = NOT_GIVEN, limits: httpx.Limits | NotGiven = NOT_GIVEN, cert: CertTypes | NotGiven = NOT_GIVEN, verify: bool | str | ssl.SSLContext | NotGiven = NOT_GIVEN, trust_env: bool | NotGiven = NOT_GIVEN, http1: bool | NotGiven = NOT_GIVEN, http2: bool | NotGiven = NOT_GIVEN, follow_redirects: bool | NotGiven = NOT_GIVEN, max_redirects: int | NotGiven = NOT_GIVEN, default_encoding: str | Callable[[bytes], str] | NotGiven = NOT_GIVEN, event_hooks: Mapping[str, List[Callable[..., Any]]] | NotGiven = NOT_GIVEN)
Provides a synchronous client for interacting with the Venice.ai API.
This client provides a complete interface for making synchronous requests to all Venice AI API endpoints. It handles authentication, request formation, response parsing, and error management through a clean, resource-oriented design.
The client architecture follows a namespaced resource pattern, where different API capabilities are organized into dedicated resource objects (e.g., chat, models, image). This design creates a clean separation of concerns and makes the API more discoverable and easily navigable.
- Parameters:
api_key (Optional[str]) – Your Venice.ai API key, required for authentication. If no key is passed explicitly, the api_key property falls back to the VENICE_API_KEY environment variable.
base_url (Optional[Union[str, httpx.URL]]) – Overrides the default base URL. Defaults to the Venice AI production API URL. Useful for testing against different environments.
timeout (Optional[Union[float, httpx.Timeout]]) – Request timeout in seconds, or an httpx.Timeout object for more granular control. Defaults to Timeout(connect=5.0, read=60.0, write=60.0, pool=60.0).
default_timeout (Optional[httpx.Timeout]) – Global default timeout for all API calls made by this client instance. If provided, it applies to every request unless overridden on a per-request basis. Takes precedence over the timeout parameter.
max_retries (int) – Maximum number of retries for connection errors or transient failures. This parameter controls the total number of retries for the httpx-retries mechanism. Defaults to 2.
retry_backoff_factor (float) – Backoff factor for retry delays. Defaults to 0.5.
retry_status_forcelist (Optional[List[int]]) – List of HTTP status codes to retry on. Defaults to [429, 500, 502, 503, 504].
retry_respect_retry_after_header (bool) – Whether to respect Retry-After headers. Defaults to True.
http_client (Optional[httpx.Client]) – An optional pre-configured httpx.Client instance to use for HTTP requests. If provided:
- The SDK will use this custom client directly.
- The SDK will still configure base_url (from the base_url parameter or default), timeout (from default_timeout or timeout parameter), and Authorization headers on this provided client instance.
- All other HTTP-related parameters passed to this constructor (e.g., max_retries, retry_backoff_factor, proxy, transport, limits, verify, etc.) will be ignored. It is assumed that the provided http_client is already configured with these aspects.
- You are responsible for managing the lifecycle of the provided http_client (e.g., closing it).
If not provided, the SDK will create and manage its own internal httpx.Client.
proxy (Optional[Union[str, httpx.URL, httpx.Proxy]]) – Proxy configuration for HTTP requests. Only used when http_client is not provided.
transport (Optional[httpx.BaseTransport]) – Custom transport for HTTP requests. Only used when http_client is not provided.
limits (Optional[httpx.Limits]) – Connection limits configuration. Only used when http_client is not provided.
cert (Optional[Union[str, Tuple[str, str]]]) – Client certificate configuration. Only used when http_client is not provided.
verify (Optional[Union[bool, str, ssl.SSLContext]]) – SSL certificate verification. Only used when http_client is not provided.
trust_env (Optional[bool]) – Whether to trust environment variables for proxy configuration. Only used when http_client is not provided.
http1 (Optional[bool]) – Whether to enable HTTP/1.1. Only used when http_client is not provided.
http2 (Optional[bool]) – Whether to enable HTTP/2. Only used when http_client is not provided.
default_encoding (Optional[Union[str, Callable[[bytes], str]]]) – Default encoding for response content. Only used when http_client is not provided.
event_hooks (Optional[Mapping[str, List[Callable[..., Any]]]]) – Event hooks for request/response lifecycle. Only used when http_client is not provided.
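The precedence among timeout, default_timeout, and a per-request timeout described above can be written out as a tiny resolver. This is a sketch of the documented ordering only: effective_timeout is a hypothetical helper, and plain floats stand in for httpx.Timeout objects.

```python
from typing import Optional


def effective_timeout(timeout: Optional[float] = 60.0,
                      default_timeout: Optional[float] = None,
                      per_request: Optional[float] = None) -> Optional[float]:
    """Resolve which timeout applies to a single request.

    Order, as documented: a per-request override wins, then
    default_timeout, then the constructor's timeout.
    """
    if per_request is not None:
        return per_request
    if default_timeout is not None:
        return default_timeout
    return timeout


print(effective_timeout())                                        # 60.0
print(effective_timeout(timeout=60.0, default_timeout=30.0))      # 30.0
print(effective_timeout(default_timeout=30.0, per_request=5.0))   # 5.0
```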
- chat
Access to chat-related endpoints.
- Type:
ChatResource
- models
Access to model listing and information endpoints.
- Type:
Models
- image
Access to image generation and manipulation endpoints.
- Type:
Image
- audio
Access to speech synthesis and audio processing endpoints.
- Type:
Audio
- billing
Access to billing and usage information endpoints.
- Type:
Billing
- embeddings
Access to embedding generation endpoints.
- Type:
Embeddings
- api_keys
Access to API key management endpoints.
- Type:
ApiKeys
- characters
Access to character management endpoints.
- Type:
Characters
Examples
Basic usage:
    from venice_ai import VeniceClient

    client = VeniceClient(api_key="your-api-key")
    response = client.chat.completions.create(
        model="venice-1",
        messages=[{"role": "user", "content": "Hello, world!"}]
    )
    print(response["choices"][0]["message"]["content"])
    client.close()  # Important to close the client when done
Using as a context manager (recommended):
    from venice_ai import VeniceClient

    with VeniceClient(api_key="your-api-key") as client:
        response = client.chat.completions.create(
            model="venice-1",
            messages=[{"role": "user", "content": "Hello, world!"}]
        )
        print(response["choices"][0]["message"]["content"])
    # Client is automatically closed here
Streaming example:
    from venice_ai import VeniceClient

    with VeniceClient(api_key="your-api-key") as client:
        for chunk in client.chat.completions.create(
            model="venice-1",
            messages=[{"role": "user", "content": "Count to 5"}],
            stream=True
        ):
            content = chunk["choices"][0]["delta"].get("content", "")
            if content:
                print(content, end="", flush=True)
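When the full reply is needed rather than incremental printing, the same chunk handling can collect the pieces into one string. The sketch below runs without network access: the chunk dictionaries are hand-written stand-ins that follow the shape used in the streaming example, and collect_stream is a hypothetical helper.

```python
from typing import Any, Dict, Iterable


def collect_stream(chunks: Iterable[Dict[str, Any]]) -> str:
    """Concatenate the content deltas of a streamed chat completion."""
    parts = []
    for chunk in chunks:
        # Some chunks carry no content (e.g. a role-only or final delta).
        content = chunk["choices"][0]["delta"].get("content", "")
        if content:
            parts.append(content)
    return "".join(parts)


# Hand-written chunks for illustration; real chunks contain more fields.
fake_chunks = [
    {"choices": [{"delta": {"role": "assistant"}}]},
    {"choices": [{"delta": {"content": "1, 2, "}}]},
    {"choices": [{"delta": {"content": "3"}}]},
    {"choices": [{"delta": {}}]},
]

print(collect_stream(fake_chunks))  # 1, 2, 3
```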
- Raises:
ValueError – If api_key is empty or None.
Note
When used as a context manager (in a with block), the client will automatically close the underlying HTTP client upon exit, freeing any resources. For manual resource management, always call the close() method when done.
- Parameters:
http_transport_options (typing.Optional[typing.Dict[str, typing.Any]])
follow_redirects (typing.Union[bool, venice_ai.utils.NotGivenType, None])
max_redirects (typing.Union[int, venice_ai.utils.NotGivenType, None])
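The NotGivenType in these annotations suggests a sentinel that separates "argument omitted" from an explicit None. The sketch below re-creates that general pattern locally. It is not the SDK's class, and collect_http_options is a hypothetical helper showing how omitted options could be dropped before configuring an HTTP client.

```python
from typing import Any, Dict


class NotGivenType:
    """Local re-creation of a 'not given' sentinel (illustrative only)."""

    def __bool__(self) -> bool:
        return False

    def __repr__(self) -> str:
        return "NOT_GIVEN"


NOT_GIVEN = NotGivenType()


def collect_http_options(**options: Any) -> Dict[str, Any]:
    """Keep only options the caller actually passed.

    An explicit None or False is preserved; only NOT_GIVEN is dropped.
    """
    return {name: value for name, value in options.items()
            if value is not NOT_GIVEN}


print(collect_http_options(proxy=NOT_GIVEN, verify=False, cert=None))
# {'verify': False, 'cert': None}
```

The identity check (is not NOT_GIVEN) is what keeps falsy but meaningful values such as verify=False from being discarded.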
- property api_key: str
Get the API key for authentication.
Returns the explicitly set API key, or falls back to the VENICE_API_KEY environment variable if no key was explicitly provided.
- Returns:
The API key to use for authentication.
- Return type:
str
- build_request(method: str, path: str, *, json_data: Mapping[str, Any] | None = None, headers: Mapping[str, str] | None = None, params: Mapping[str, Any] | None = None)
Build a request with proper headers including authentication.
This method constructs the headers for a request, merging authentication headers with any provided headers. It supports default token retention by using the current api_key value.
- Parameters:
method (str) – HTTP method for the request.
path (str) – API endpoint path relative to the base URL.
json_data (Optional[Mapping[str, Any]]) – JSON-serializable request body.
headers (Optional[Mapping[str, str]]) – Additional HTTP headers to include.
params (Optional[Mapping[str, Any]]) – URL query parameters.
- Returns:
Dictionary containing the built request information.
- Return type:
Dict[str, Any]
- close()
Close the underlying HTTP client and free resources.
This method should be called when the client is no longer needed to ensure proper cleanup of resources. If using the client as a context manager, this is called automatically on exit.
It is safe to call this method multiple times.
Note
If a user-provided httpx.Client was passed to the constructor, this method will not close it, as the user is responsible for managing the lifecycle of their own client.
- Return type:
None
- delete(path: str, *, cast_to: Type[venice_ai._client.T] | None = None, **kwargs)
Make a DELETE request to the specified API endpoint.
This is a convenience method for making DELETE requests. It automatically handles header configuration appropriate for DELETE requests.
- Parameters:
path (str) – API endpoint path relative to the base URL.
cast_to (Optional[Type[T]]) – Optional Pydantic model to cast the response to.
kwargs – Additional arguments to pass to _request().
- Returns:
Parsed JSON response body.
- Return type:
Any
- Raises:
venice_ai.exceptions.APIError – If the request fails.
- get(path: str, *, params: Mapping[str, Any] | None = None, cast_to: Type[venice_ai._client.T] | None = None, **kwargs)
Make a GET request to the specified API endpoint.
This is a convenience method for making GET requests. It automatically handles header configuration appropriate for GET requests.
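- Parameters:
path (str) – API endpoint path relative to the base URL.
params (Optional[Mapping[str, Any]]) – URL query parameters to include in the request.
cast_to (Optional[Type[T]]) – Optional Pydantic model to cast the response to.
kwargs – Additional arguments to pass to _request().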
- Returns:
Parsed JSON response body.
- Return type:
Any
- Raises:
venice_ai.exceptions.APIError – If the request fails.
- get_model_pricing(model_id: str)
Get pricing information for a specific model.
Retrieves the pricing structure for a given model ID, including both USD and VCU (Venice Compute Units) costs for input and output tokens.
- Parameters:
model_id (str) – The ID of the model to get pricing for
- Returns:
Pricing information for the model
- Return type:
ModelPricing
- Raises:
ValueError – If the model is not found
Example
>>> client = VeniceClient(api_key="your-api-key")
>>> pricing = client.get_model_pricing("llama-3.3-70b")
>>> print(f"Input: ${pricing['input']['usd']}/1k tokens")
>>> print(f"Output: ${pricing['output']['usd']}/1k tokens")
- post(path: str, *, json_data: Mapping[str, Any] | None = None, timeout: float | httpx.Timeout | None = None, cast_to: Type[venice_ai._client.T] | None = None, **kwargs)
Make a POST request to the specified API endpoint.
This is a convenience method for making POST requests with JSON data. It automatically sets appropriate headers for JSON content.
- Parameters:
path (str) – API endpoint path relative to the base URL.
json_data (Optional[Mapping[str, Any]]) – JSON-serializable request body to send with the request.
timeout (Optional[Union[float, httpx.Timeout]]) – Request timeout in seconds or an httpx.Timeout object. If not provided, uses the client’s default timeout.
cast_to (Optional[Type[T]]) – Optional Pydantic model to cast the response to.
kwargs – Additional arguments to pass to _request().
- Returns:
Parsed JSON response body.
- Return type:
Any
- Raises:
venice_ai.exceptions.APIError – If the request fails.
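The cast_to parameter accepted by get, post, and delete can be pictured as an optional conversion step applied to the parsed JSON. The sketch below uses a dataclass as a stand-in for a Pydantic model so that it runs with the standard library alone. ModelInfo, its fields, and cast_response are hypothetical, and the SDK's actual casting logic may differ.

```python
from dataclasses import dataclass
from typing import Any, Optional, Type, TypeVar

T = TypeVar("T")


@dataclass
class ModelInfo:
    """Hypothetical response model; a dataclass stands in for Pydantic."""
    id: str
    owned_by: str


def cast_response(data: Any, cast_to: Optional[Type[T]] = None) -> Any:
    """Return raw parsed JSON, or build `cast_to` from it when given."""
    if cast_to is None:
        return data
    return cast_to(**data)


raw = {"id": "example-model", "owned_by": "example-org"}

print(cast_response(raw))
# {'id': 'example-model', 'owned_by': 'example-org'}
print(cast_response(raw, ModelInfo))
# ModelInfo(id='example-model', owned_by='example-org')
```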
- class venice_ai._client.VeniceClient(*, api_key: str | None = None, base_url: str | httpx.URL | None = None, timeout: float | httpx.Timeout | None = Timeout(connect=5.0, read=60.0, write=60.0, pool=60.0), default_timeout: httpx.Timeout | None = None, http_client: httpx.Client | None = None, http_transport_options: Dict[str, Any] | None = None, proxy: httpx.URL | str | httpx.Proxy | venice_ai.utils.NotGivenType | None = NOT_GIVEN, transport: httpx.BaseTransport | venice_ai.utils.NotGivenType | None = NOT_GIVEN, limits: httpx.Limits | venice_ai.utils.NotGivenType | None = NOT_GIVEN, cert: str | Tuple[str, str] | Tuple[str, str, str] | venice_ai.utils.NotGivenType | None = NOT_GIVEN, verify: bool | str | ssl.SSLContext | venice_ai.utils.NotGivenType | None = NOT_GIVEN, trust_env: bool | venice_ai.utils.NotGivenType | None = NOT_GIVEN, http1: bool | venice_ai.utils.NotGivenType | None = NOT_GIVEN, http2: bool | venice_ai.utils.NotGivenType | None = NOT_GIVEN, follow_redirects: bool | venice_ai.utils.NotGivenType | None = NOT_GIVEN, max_redirects: int | venice_ai.utils.NotGivenType | None = NOT_GIVEN, default_encoding: str | Callable[[bytes], str] | venice_ai.utils.NotGivenType | None = NOT_GIVEN, event_hooks: Mapping[str, List[Callable[[...], Any]]] | venice_ai.utils.NotGivenType | None = NOT_GIVEN)
Bases: BaseClient
Provides a synchronous client for interacting with the Venice.ai API.
This client provides a complete interface for making synchronous requests to all Venice AI API endpoints. It handles authentication, request formation, response parsing, and error management through a clean, resource-oriented design.
The client architecture follows a namespaced resource pattern, where different API capabilities are organized into dedicated resource objects (e.g., chat, models, image). This design creates a clean separation of concerns and makes the API more discoverable and easily navigable.
- Parameters:
api_key (str) – Your Venice.ai API key. This is required for authentication.
base_url (Optional[Union[str, httpx.URL]]) – Overrides the default base URL. Defaults to the Venice AI production API URL. Useful for testing against different environments.
timeout (Optional[Union[float, httpx.Timeout]]) – Request timeout in seconds or as a detailed httpx.Timeout object for more granular control. Defaults to 60.0 seconds.
default_timeout (Optional[httpx.Timeout]) – Global default timeout for all API calls made by this client instance. If provided, this will be used as the default timeout for all requests unless overridden on a per-request basis. Takes precedence over the timeout parameter.
max_retries (int) – Maximum number of retries for connection errors or transient failures. This parameter controls the total number of retries for the httpx-retries mechanism. Defaults to 2.
retry_backoff_factor (float) – Backoff factor for retry delays. Defaults to 0.5.
retry_status_forcelist (Optional[List[int]]) – List of HTTP status codes to retry on. Defaults to [429, 500, 502, 503, 504].
retry_respect_retry_after_header (bool) – Whether to respect Retry-After headers. Defaults to True.
http_client (Optional[httpx.Client]) – An optional pre-configured httpx.Client instance to use for HTTP requests. If provided:
The SDK will use this custom client directly.
The SDK will still configure base_url (from the base_url parameter or default), timeout (from default_timeout or timeout parameter), and Authorization headers on this provided client instance.
All other HTTP-related parameters passed to this constructor (e.g., max_retries, retry_backoff_factor, proxy, transport, limits, verify, etc.) will be ignored. It is assumed that the provided http_client is already configured with these aspects.
You are responsible for managing the lifecycle of the provided http_client (e.g., closing it).
If not provided, the SDK will create and manage its own internal httpx.Client.
proxy (Optional[Union[str, httpx.URL, httpx.Proxy]]) – Proxy configuration for HTTP requests. Only used when http_client is not provided.
transport (Optional[httpx.BaseTransport]) – Custom transport for HTTP requests. Only used when http_client is not provided.
limits (Optional[httpx.Limits]) – Connection limits configuration. Only used when http_client is not provided.
cert (Optional[Union[str, Tuple[str, str]]]) – Client certificate configuration. Only used when http_client is not provided.
verify (Optional[Union[bool, str, ssl.SSLContext]]) – SSL certificate verification. Only used when http_client is not provided.
trust_env (Optional[bool]) – Whether to trust environment variables for proxy configuration. Only used when http_client is not provided.
http1 (Optional[bool]) – Whether to enable HTTP/1.1. Only used when http_client is not provided.
http2 (Optional[bool]) – Whether to enable HTTP/2. Only used when http_client is not provided.
default_encoding (Optional[Union[str, Callable[[bytes], str]]]) – Default encoding for response content. Only used when http_client is not provided.
event_hooks (Optional[Mapping[str, List[Callable[..., Any]]]]) – Event hooks for request/response lifecycle. Only used when http_client is not provided.
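Several of the parameters above default to NOT_GIVEN rather than None in the constructor signature. The following is a minimal sketch of how such a sentinel can distinguish "not supplied" from an explicit None. The NotGivenType class here is a hypothetical stand-in for venice_ai.utils.NotGivenType, and explicit_options is an illustrative helper, not part of the SDK.

```python
from typing import Any, Dict


class NotGivenType:
    """Hypothetical stand-in for venice_ai.utils.NotGivenType."""

    def __bool__(self) -> bool:
        return False

    def __repr__(self) -> str:
        return "NOT_GIVEN"


NOT_GIVEN = NotGivenType()


def explicit_options(**options: Any) -> Dict[str, Any]:
    """Keep only options the caller actually supplied (None counts as supplied)."""
    return {name: value for name, value in options.items() if value is not NOT_GIVEN}


# proxy and limits were never supplied; verify was explicitly set to None.
kwargs = explicit_options(proxy=NOT_GIVEN, http2=True, verify=None, limits=NOT_GIVEN)
print(kwargs)
```

Under this pattern, only the surviving keys would be forwarded to the underlying HTTP client, so library defaults apply to everything left out.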
- chat
Access to chat-related endpoints.
- Type:
ChatResource
- models
Access to model listing and information endpoints.
- Type:
Models
- image
Access to image generation and manipulation endpoints.
- Type:
Image
- audio
Access to speech synthesis and audio processing endpoints.
- Type:
Audio
- billing
Access to billing and usage information endpoints.
- Type:
Billing
- embeddings
Access to embedding generation endpoints.
- Type:
Embeddings
- api_keys
Access to API key management endpoints.
- Type:
ApiKeys
- characters
Access to character management endpoints.
- Type:
Characters
Examples
Basic usage:
from venice_ai import VeniceClient

client = VeniceClient(api_key="your-api-key")
response = client.chat.completions.create(
    model="venice-1",
    messages=[{"role": "user", "content": "Hello, world!"}]
)
print(response["choices"][0]["message"]["content"])
client.close()  # Important to close the client when done
Using as a context manager (recommended):
from venice_ai import VeniceClient

with VeniceClient(api_key="your-api-key") as client:
    response = client.chat.completions.create(
        model="venice-1",
        messages=[{"role": "user", "content": "Hello, world!"}]
    )
    print(response["choices"][0]["message"]["content"])
# Client is automatically closed here
Streaming example:
from venice_ai import VeniceClient

with VeniceClient(api_key="your-api-key") as client:
    for chunk in client.chat.completions.create(
        model="venice-1",
        messages=[{"role": "user", "content": "Count to 5"}],
        stream=True
    ):
        content = chunk["choices"][0]["delta"].get("content", "")
        if content:
            print(content, end="", flush=True)
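When the full text of a streamed reply is needed rather than incremental printing, the deltas can be accumulated. The sketch below assumes chunks shaped like those in the streaming example (choices[0]["delta"] with an optional "content" key). The chunk dictionaries are hand-written stand-ins, not real API output, and collect_stream_text is a hypothetical helper.

```python
from typing import Any, Dict, Iterable


def collect_stream_text(chunks: Iterable[Dict[str, Any]]) -> str:
    """Concatenate the text deltas from an iterable of streamed chunks."""
    parts = []
    for chunk in chunks:
        choices = chunk.get("choices") or []
        if not choices:
            continue  # skip chunks that carry no choices
        content = choices[0].get("delta", {}).get("content")
        if content:
            parts.append(content)
    return "".join(parts)


# Hand-written stand-ins for streamed chunks (assumed shape).
fake_chunks = [
    {"choices": [{"delta": {"role": "assistant"}}]},
    {"choices": [{"delta": {"content": "1, 2, "}}]},
    {"choices": [{"delta": {"content": "3"}}]},
    {"choices": []},
]
print(collect_stream_text(fake_chunks))
```

With a real client, the iterable returned by create(..., stream=True) would take the place of fake_chunks.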
- Raises:
ValueError – If api_key is empty or None.
Note
When used as a context manager (in a with block), the client will automatically close the underlying HTTP client upon exit, freeing any resources. For manual resource management, always call the close() method when done.
- Parameters:
http_transport_options (typing.Optional[typing.Dict[str, typing.Any]])
follow_redirects (typing.Union[bool, venice_ai.utils.NotGivenType, None])
max_redirects (typing.Union[int, venice_ai.utils.NotGivenType, None])
- __init__(*, api_key: str | None = None, base_url: str | httpx.URL | None = None, timeout: float | httpx.Timeout | None = Timeout(connect=5.0, read=60.0, write=60.0, pool=60.0), default_timeout: httpx.Timeout | None = None, http_client: httpx.Client | None = None, http_transport_options: Dict[str, Any] | None = None, proxy: httpx.URL | str | httpx.Proxy | venice_ai.utils.NotGivenType | None = NOT_GIVEN, transport: httpx.BaseTransport | venice_ai.utils.NotGivenType | None = NOT_GIVEN, limits: httpx.Limits | venice_ai.utils.NotGivenType | None = NOT_GIVEN, cert: str | Tuple[str, str] | Tuple[str, str, str] | venice_ai.utils.NotGivenType | None = NOT_GIVEN, verify: bool | str | ssl.SSLContext | venice_ai.utils.NotGivenType | None = NOT_GIVEN, trust_env: bool | venice_ai.utils.NotGivenType | None = NOT_GIVEN, http1: bool | venice_ai.utils.NotGivenType | None = NOT_GIVEN, http2: bool | venice_ai.utils.NotGivenType | None = NOT_GIVEN, follow_redirects: bool | venice_ai.utils.NotGivenType | None = NOT_GIVEN, max_redirects: int | venice_ai.utils.NotGivenType | None = NOT_GIVEN, default_encoding: str | Callable[[bytes], str] | venice_ai.utils.NotGivenType | None = NOT_GIVEN, event_hooks: Mapping[str, List[Callable[[...], Any]]] | venice_ai.utils.NotGivenType | None = NOT_GIVEN)[source]
Initialize the VeniceClient.
This constructor sets up the client for making API requests. It configures authentication, base URL, timeout settings, and retry mechanisms. It also initializes all the resource namespaces (e.g., chat, models).
- Parameters:
api_key (str) – The API key for authentication. Must not be empty or None.
base_url (Optional[Union[str, httpx.URL]]) – Optional base URL to override the default Venice AI API URL. If not provided, uses the default production API URL.
timeout (Optional[Union[float, httpx.Timeout]]) – Request timeout in seconds or as an httpx.Timeout object for more granular control. Defaults to 60.0 seconds.
default_timeout (Optional[httpx.Timeout]) – Global default timeout for all API calls made by this client instance. If provided, this will be used as the default timeout for all requests unless overridden on a per-request basis. Takes precedence over the timeout parameter.
max_retries (int) – Maximum number of retries for connection errors or transient failures. This parameter controls the total number of retries for the httpx-retries mechanism. Defaults to 2.
retry_backoff_factor (float) – Backoff factor for retry delays. Defaults to 0.5.
retry_status_forcelist (Optional[List[int]]) – List of HTTP status codes to retry on. Defaults to [429, 500, 502, 503, 504].
retry_respect_retry_after_header (bool) – Whether to respect Retry-After headers. Defaults to True.
http_client (Optional[httpx.Client]) – An optional pre-configured httpx.Client instance to use for HTTP requests. If provided:
The SDK will use this custom client directly.
The SDK will still configure base_url (from the base_url parameter or default), timeout (from default_timeout or timeout parameter), and Authorization headers on this provided client instance.
All other HTTP-related parameters passed to this constructor (e.g., max_retries, retry_backoff_factor, proxy, transport, limits, verify, etc.) will be ignored. It is assumed that the provided http_client is already configured with these aspects.
You are responsible for managing the lifecycle of the provided http_client (e.g., closing it).
If not provided, the SDK will create and manage its own internal httpx.Client.
http_transport_options (typing.Optional[typing.Dict[str, typing.Any]])
proxy (typing.Union[httpx.URL, str, httpx.Proxy, venice_ai.utils.NotGivenType, None])
transport (typing.Union[httpx.BaseTransport, venice_ai.utils.NotGivenType, None])
limits (typing.Union[httpx.Limits, venice_ai.utils.NotGivenType, None])
cert (typing.Union[str, typing.Tuple[str, str], typing.Tuple[str, str, str], venice_ai.utils.NotGivenType, None])
verify (typing.Union[bool, str, ssl.SSLContext, venice_ai.utils.NotGivenType, None])
trust_env (typing.Union[bool, venice_ai.utils.NotGivenType, None])
http1 (typing.Union[bool, venice_ai.utils.NotGivenType, None])
http2 (typing.Union[bool, venice_ai.utils.NotGivenType, None])
follow_redirects (typing.Union[bool, venice_ai.utils.NotGivenType, None])
max_redirects (typing.Union[int, venice_ai.utils.NotGivenType, None])
default_encoding (typing.Union[str, typing.Callable[[bytes], str], venice_ai.utils.NotGivenType, None])
event_hooks (typing.Union[typing.Mapping[str, typing.List[typing.Callable[..., typing.Any]]], venice_ai.utils.NotGivenType, None])
- Raises:
ValueError – If api_key is empty or None and the VENICE_API_KEY environment variable is not set.
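The lookup order implied by this error condition can be sketched as follows. This is an illustration under stated assumptions, not the SDK's actual code: resolve_api_key is a hypothetical helper, and the whitespace stripping is assumed to mirror what the asynchronous client documents for its api_key parameter.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(api_key: Optional[str] = None,
                    env: Optional[Mapping[str, str]] = None) -> str:
    """Return the explicit key if usable, else VENICE_API_KEY, else raise."""
    env = os.environ if env is None else env
    # An empty or whitespace-only explicit key falls through to the environment.
    candidate = (api_key or "").strip() or env.get("VENICE_API_KEY", "").strip()
    if not candidate:
        raise ValueError("api_key is empty or None and VENICE_API_KEY is not set")
    return candidate


print(resolve_api_key("  my-key  ", env={}))
print(resolve_api_key(None, env={"VENICE_API_KEY": "from-env"}))
```

Passing env explicitly keeps the helper easy to test without touching the real process environment.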
- api_keys: venice_ai.resources.api_keys.ApiKeys
- audio: venice_ai.resources.audio.Audio
- billing: venice_ai.resources.billing.Billing
- characters: venice_ai.resources.characters.Characters
- chat: venice_ai.resources.chat.ChatResource
- close()[source]
Close the underlying HTTP client and free resources.
This method should be called when the client is no longer needed to ensure proper cleanup of resources. If using the client as a context manager, this is called automatically on exit.
It is safe to call this method multiple times.
Note
If a user-provided httpx.Client was passed to the constructor, this method will not close it, as the user is responsible for managing the lifecycle of their own client.
- Return type:
None
- delete(path: str, *, cast_to: Type[venice_ai._client.T] | None = None, **kwargs)[source]
Make a DELETE request to the specified API endpoint.
This is a convenience method for making DELETE requests. It automatically handles header configuration appropriate for DELETE requests.
- Parameters:
path (str) – API endpoint path relative to the base URL.
cast_to (Optional[Type[T]]) – Optional Pydantic model to cast the response to.
kwargs – Additional arguments to pass to _request().
- Returns:
Parsed JSON response body.
- Return type:
Any
- Raises:
venice_ai.exceptions.APIError – If the request fails.
- embeddings: venice_ai.resources.embeddings.Embeddings
- get(path: str, *, params: Mapping[str, Any] | None = None, cast_to: Type[venice_ai._client.T] | None = None, **kwargs)[source]
Make a GET request to the specified API endpoint.
This is a convenience method for making GET requests. It automatically handles header configuration appropriate for GET requests.
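- Parameters:
path (str) – API endpoint path relative to the base URL.
params (Optional[Mapping[str, Any]]) – URL query parameters to include in the request.
cast_to (Optional[Type[T]]) – Optional Pydantic model to cast the response to.
kwargs – Additional arguments to pass to _request().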
- Returns:
Parsed JSON response body.
- Return type:
Any
- Raises:
venice_ai.exceptions.APIError – If the request fails.
- image: venice_ai.resources.image.Image
- models: venice_ai.resources.models.Models
- post(path: str, *, json_data: Mapping[str, Any] | None = None, timeout: float | httpx.Timeout | None = None, cast_to: Type[venice_ai._client.T] | None = None, **kwargs)[source]
Make a POST request to the specified API endpoint.
This is a convenience method for making POST requests with JSON data. It automatically sets appropriate headers for JSON content.
- Parameters:
path (str) – API endpoint path relative to the base URL.
json_data (Optional[Mapping[str, Any]]) – JSON-serializable request body to send with the request.
timeout (Optional[Union[float, httpx.Timeout]]) – Request timeout in seconds or an httpx.Timeout object. If not provided, uses the client’s default timeout.
cast_to (Optional[Type[T]]) – Optional Pydantic model to cast the response to.
kwargs – Additional arguments to pass to _request().
- Returns:
Parsed JSON response body.
- Return type:
Any
- Raises:
venice_ai.exceptions.APIError – If the request fails.
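The timeout precedence described in this reference (a per-request timeout first, then default_timeout, then timeout) can be sketched as a small resolver. Plain numbers stand in for httpx.Timeout objects, and effective_timeout is a hypothetical helper written for illustration only.

```python
from typing import Optional, Union

Number = Union[int, float]


def effective_timeout(per_request: Optional[Number],
                      default_timeout: Optional[Number],
                      timeout: Optional[Number] = 60.0) -> Optional[Number]:
    """Pick the first timeout that was actually provided, in precedence order."""
    if per_request is not None:
        return per_request      # e.g. post(..., timeout=5.0)
    if default_timeout is not None:
        return default_timeout  # constructor default_timeout
    return timeout              # constructor timeout (documented default 60.0)


print(effective_timeout(5.0, 30.0))
print(effective_timeout(None, 30.0))
print(effective_timeout(None, None))
```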
- class venice_ai._async_client.AsyncVeniceClient(*, api_key: str | None = None, base_url: str | httpx.URL | None = None, timeout: float | httpx.Timeout | None = Timeout(connect=5.0, read=60.0, write=60.0, pool=60.0), default_timeout: httpx.Timeout | None = None, http_client: httpx.AsyncClient | None = None, http_transport_options: Dict[str, Any] | None = None, proxy: httpx.URL | str | httpx.Proxy | venice_ai.utils.NotGivenType | None = NOT_GIVEN, transport: httpx.BaseTransport | venice_ai.utils.NotGivenType | None = NOT_GIVEN, async_transport: httpx.AsyncBaseTransport | venice_ai.utils.NotGivenType | None = NOT_GIVEN, limits: httpx.Limits | venice_ai.utils.NotGivenType | None = NOT_GIVEN, cert: str | Tuple[str, str] | Tuple[str, str, str] | venice_ai.utils.NotGivenType | None = NOT_GIVEN, verify: bool | str | ssl.SSLContext | venice_ai.utils.NotGivenType | None = NOT_GIVEN, trust_env: bool | venice_ai.utils.NotGivenType | None = NOT_GIVEN, http1: bool | venice_ai.utils.NotGivenType | None = NOT_GIVEN, http2: bool | venice_ai.utils.NotGivenType | None = NOT_GIVEN, follow_redirects: bool | venice_ai.utils.NotGivenType | None = NOT_GIVEN, max_redirects: int | venice_ai.utils.NotGivenType | None = NOT_GIVEN, default_encoding: str | Callable[[bytes], str] | venice_ai.utils.NotGivenType | None = NOT_GIVEN, event_hooks: Mapping[str, List[Callable[[...], Any]]] | venice_ai.utils.NotGivenType | None = NOT_GIVEN)[source]
Bases: BaseClient
Provides an asynchronous client for interacting with the Venice.ai API.
This client provides a complete interface for making asynchronous requests to all Venice AI API endpoints. It handles authentication, request formation, response parsing, and error management through a clean, resource-oriented design.
The client architecture follows a namespaced resource pattern, where different API capabilities are organized into dedicated resource objects (e.g., chat, models, image). This design creates a clean separation of concerns and makes the API more discoverable and easily navigable.
- Parameters:
api_key (str) – Your Venice.ai API key. This is required for authentication.
base_url (Optional[Union[str, httpx.URL]]) – Overrides the default base URL. Defaults to the Venice AI production API URL. Useful for testing against different environments.
timeout (Optional[Union[float, httpx.Timeout]]) – Request timeout in seconds or as a detailed httpx.Timeout object for more granular control. Defaults to 60.0 seconds.
default_timeout (Optional[httpx.Timeout]) – Global default timeout for all API calls made by this client instance. If provided, this will be used as the default timeout for all requests unless overridden on a per-request basis. Takes precedence over the timeout parameter.
max_retries (int) – Maximum number of retries for connection errors or transient failures. This parameter controls the total number of retries for the httpx-retries mechanism. Defaults to 2.
retry_backoff_factor (float) – Backoff factor for retry delays. Defaults to 0.5.
retry_status_forcelist (Optional[List[int]]) – List of HTTP status codes to retry on. Defaults to [429, 500, 502, 503, 504].
retry_respect_retry_after_header (bool) – Whether to respect Retry-After headers. Defaults to True.
http_client (Optional[httpx.AsyncClient]) – An optional pre-configured httpx.AsyncClient instance to use for HTTP requests. If provided:
The SDK will use this custom client directly.
The SDK will still configure base_url (from the base_url parameter or default), timeout (from default_timeout or timeout parameter), and Authorization headers on this provided client instance.
All other HTTP-related parameters passed to this constructor (e.g., max_retries, retry_backoff_factor, proxy, transport, limits, verify, etc.) will be ignored. It is assumed that the provided http_client is already configured with these aspects.
You are responsible for managing the lifecycle of the provided http_client (e.g., closing it via await http_client.aclose()).
If not provided, the SDK will create and manage its own internal httpx.AsyncClient.
proxy (Optional[Union[str, httpx.URL, httpx.Proxy]]) – Proxy configuration for HTTP requests. Only used when http_client is not provided.
transport (Optional[httpx.BaseTransport]) – Custom transport for HTTP requests. Only used when http_client is not provided.
limits (Optional[httpx.Limits]) – Connection limits configuration. Only used when http_client is not provided.
cert (Optional[Union[str, Tuple[str, str]]]) – Client certificate configuration. Only used when http_client is not provided.
verify (Optional[Union[bool, str, ssl.SSLContext]]) – SSL certificate verification. Only used when http_client is not provided.
trust_env (Optional[bool]) – Whether to trust environment variables for proxy configuration. Only used when http_client is not provided.
http1 (Optional[bool]) – Whether to enable HTTP/1.1. Only used when http_client is not provided.
http2 (Optional[bool]) – Whether to enable HTTP/2. Only used when http_client is not provided.
default_encoding (Optional[Union[str, Callable[[bytes], str]]]) – Default encoding for response content. Only used when http_client is not provided.
event_hooks (Optional[Mapping[str, List[Callable[..., Any]]]]) – Event hooks for request/response lifecycle. Only used when http_client is not provided.
- chat
Access to chat-related endpoints.
- Type:
AsyncChatResource
- models
Access to model listing and information endpoints.
- Type:
AsyncModels
- image
Access to image generation and manipulation endpoints.
- Type:
AsyncImage
- audio
Access to speech synthesis and audio processing endpoints.
- Type:
AsyncAudio
- billing
Access to billing and usage information endpoints.
- Type:
AsyncBilling
- embeddings
Access to embedding generation endpoints.
- Type:
AsyncEmbeddings
- api_keys
Access to API key management endpoints.
- Type:
AsyncApiKeys
- characters
Access to character management endpoints.
- Type:
AsyncCharacters
Examples
Basic usage:
import asyncio
from venice_ai import AsyncVeniceClient

async def main():
    async with AsyncVeniceClient(api_key="your-api-key") as client:
        response = await client.chat.completions.create(
            model="venice-1",
            messages=[{"role": "user", "content": "Hello, world!"}]
        )
        print(response["choices"][0]["message"]["content"])

asyncio.run(main())
Streaming example:
import asyncio
from venice_ai import AsyncVeniceClient

async def main():
    async with AsyncVeniceClient(api_key="your-api-key") as client:
        async for chunk in client.chat.completions.create(
            model="venice-1",
            messages=[{"role": "user", "content": "Count to 5"}],
            stream=True
        ):
            content = chunk["choices"][0]["delta"].get("content", "")
            if content:
                print(content, end="", flush=True)

asyncio.run(main())
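Using with a custom httpx client:
import asyncio
import httpx
from venice_ai import AsyncVeniceClient

async def main():
    # Create a custom client with specific configurations.
    # httpx.Timeout needs either a default or all four values, so pool is set too.
    custom_client = httpx.AsyncClient(
        timeout=httpx.Timeout(connect=5.0, read=30.0, write=10.0, pool=5.0),
        follow_redirects=True,
        http2=True  # requires the optional httpx[http2] extra
    )
    try:
        # Use the custom client with AsyncVeniceClient
        async with AsyncVeniceClient(
            api_key="your-api-key",
            http_client=custom_client
        ) as client:
            # Your API operations here
            pass
    finally:
        # The SDK does not close a user-provided client
        await custom_client.aclose()

asyncio.run(main())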
- Raises:
ValueError – If api_key is empty or None.
Note
When used as an async context manager (in an async with block), the client will automatically close the underlying HTTP client upon exit, freeing any resources. For manual resource management, use the close() method.
- Parameters:
http_transport_options (typing.Optional[typing.Dict[str, typing.Any]])
async_transport (typing.Union[httpx.AsyncBaseTransport, venice_ai.utils.NotGivenType, None])
follow_redirects (typing.Union[bool, venice_ai.utils.NotGivenType, None])
max_redirects (typing.Union[int, venice_ai.utils.NotGivenType, None])
- __init__(*, api_key: str | None = None, base_url: str | httpx.URL | None = None, timeout: float | httpx.Timeout | None = Timeout(connect=5.0, read=60.0, write=60.0, pool=60.0), default_timeout: httpx.Timeout | None = None, http_client: httpx.AsyncClient | None = None, http_transport_options: Dict[str, Any] | None = None, proxy: httpx.URL | str | httpx.Proxy | venice_ai.utils.NotGivenType | None = NOT_GIVEN, transport: httpx.BaseTransport | venice_ai.utils.NotGivenType | None = NOT_GIVEN, async_transport: httpx.AsyncBaseTransport | venice_ai.utils.NotGivenType | None = NOT_GIVEN, limits: httpx.Limits | venice_ai.utils.NotGivenType | None = NOT_GIVEN, cert: str | Tuple[str, str] | Tuple[str, str, str] | venice_ai.utils.NotGivenType | None = NOT_GIVEN, verify: bool | str | ssl.SSLContext | venice_ai.utils.NotGivenType | None = NOT_GIVEN, trust_env: bool | venice_ai.utils.NotGivenType | None = NOT_GIVEN, http1: bool | venice_ai.utils.NotGivenType | None = NOT_GIVEN, http2: bool | venice_ai.utils.NotGivenType | None = NOT_GIVEN, follow_redirects: bool | venice_ai.utils.NotGivenType | None = NOT_GIVEN, max_redirects: int | venice_ai.utils.NotGivenType | None = NOT_GIVEN, default_encoding: str | Callable[[bytes], str] | venice_ai.utils.NotGivenType | None = NOT_GIVEN, event_hooks: Mapping[str, List[Callable[[...], Any]]] | venice_ai.utils.NotGivenType | None = NOT_GIVEN)[source]
Initialize the AsyncVeniceClient for asynchronous API interactions.
This constructor sets up the client for making asynchronous API requests to the Venice AI API. It configures authentication, base URL, timeout settings, retry mechanisms, and initializes all the asynchronous resource namespaces (e.g., chat, models, image, audio).
The client can be configured with custom HTTP settings through the http_client parameter, or it will create its own httpx.AsyncClient with appropriate defaults. When providing a custom client, essential headers like Authorization will be automatically set or updated.
- Parameters:
api_key (str) – Your Venice.ai API key for authentication. This is required and cannot be empty. The key will be automatically stripped of whitespace to prevent authentication issues.
base_url (Optional[Union[str, httpx.URL]]) – Base URL for the Venice AI API. If not provided, defaults to the production Venice AI API URL. Can be a string or httpx.URL object. Useful for testing against different environments or API versions.
timeout (Optional[Union[float, httpx.Timeout]]) – Request timeout configuration. Can be a float (seconds) for simple timeout, or an httpx.Timeout object for granular control over connect, read, write, and pool timeouts. Defaults to 60.0 seconds if not specified.
default_timeout (Optional[httpx.Timeout]) – Global default timeout for all API calls made by this client instance. If provided, this will be used as the default timeout for all requests unless overridden on a per-request basis. Takes precedence over the timeout parameter.
max_retries (int) – Maximum number of automatic retries for failed requests due to connection errors or transient failures. This parameter controls the total number of retries for the httpx-retries mechanism. Defaults to 2.
retry_backoff_factor (float) – Backoff factor for retry delays. Defaults to 0.5.
retry_status_forcelist (Optional[List[int]]) – List of HTTP status codes to retry on. Defaults to [429, 500, 502, 503, 504].
retry_respect_retry_after_header (bool) – Whether to respect Retry-After headers. Defaults to True.
http_client (Optional[httpx.AsyncClient]) – An optional pre-configured httpx.AsyncClient instance to use for HTTP requests. If provided:
The SDK will use this custom client directly.
The SDK will still configure base_url (from the base_url parameter or default), timeout (from default_timeout or timeout parameter), and Authorization headers on this provided client instance.
All other HTTP-related parameters passed to this constructor (e.g., max_retries, retry_backoff_factor, proxy, transport, limits, verify, etc.) will be ignored. It is assumed that the provided http_client is already configured with these aspects.
You are responsible for managing the lifecycle of the provided http_client (e.g., closing it via await http_client.aclose()).
If not provided, the SDK will create and manage its own internal httpx.AsyncClient.
- Raises:
ValueError – If api_key is empty, None, or consists only of whitespace.
Note
When using a custom http_client, ensure it’s configured appropriately for your use case. The Venice AI client will modify headers but will not change other client settings like timeouts, proxies, or SSL configuration.
- Parameters:
http_transport_options (typing.Optional[typing.Dict[str, typing.Any]])
proxy (typing.Union[httpx.URL, str, httpx.Proxy, venice_ai.utils.NotGivenType, None])
transport (typing.Union[httpx.BaseTransport, venice_ai.utils.NotGivenType, None])
async_transport (typing.Union[httpx.AsyncBaseTransport, venice_ai.utils.NotGivenType, None])
limits (typing.Union[httpx.Limits, venice_ai.utils.NotGivenType, None])
cert (typing.Union[str, typing.Tuple[str, str], typing.Tuple[str, str, str], venice_ai.utils.NotGivenType, None])
verify (typing.Union[bool, str, ssl.SSLContext, venice_ai.utils.NotGivenType, None])
trust_env (typing.Union[bool, venice_ai.utils.NotGivenType, None])
http1 (typing.Union[bool, venice_ai.utils.NotGivenType, None])
http2 (typing.Union[bool, venice_ai.utils.NotGivenType, None])
follow_redirects (typing.Union[bool, venice_ai.utils.NotGivenType, None])
max_redirects (typing.Union[int, venice_ai.utils.NotGivenType, None])
default_encoding (typing.Union[str, typing.Callable[[bytes], str], venice_ai.utils.NotGivenType, None])
event_hooks (typing.Union[typing.Mapping[str, typing.List[typing.Callable[..., typing.Any]]], venice_ai.utils.NotGivenType, None])
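The retry parameters documented for this constructor (max_retries, retry_backoff_factor, retry_status_forcelist, retry_respect_retry_after_header) interact roughly as sketched below. This assumes a plain exponential schedule of backoff_factor * 2 ** attempt with no jitter. The exact formula used by the httpx-retries mechanism is not stated in this reference, and all three helper functions are hypothetical.

```python
from typing import List, Optional, Sequence

DEFAULT_STATUS_FORCELIST = (429, 500, 502, 503, 504)  # documented default


def backoff_delays(max_retries: int = 2, backoff_factor: float = 0.5) -> List[float]:
    """Delays before each retry under the assumed exponential schedule."""
    return [backoff_factor * (2 ** attempt) for attempt in range(max_retries)]


def should_retry(status_code: int, attempts_made: int, max_retries: int = 2,
                 status_forcelist: Sequence[int] = DEFAULT_STATUS_FORCELIST) -> bool:
    """Retry only listed status codes, and only while attempts remain."""
    return attempts_made < max_retries and status_code in status_forcelist


def next_delay(attempts_made: int, backoff_factor: float = 0.5,
               retry_after: Optional[str] = None,
               respect_retry_after_header: bool = True) -> float:
    """Prefer a numeric Retry-After header when it is respected and parseable."""
    if respect_retry_after_header and retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # the HTTP-date form of Retry-After is not handled in this sketch
    return backoff_factor * (2 ** attempts_made)


print(backoff_delays())
print(should_retry(503, attempts_made=0), should_retry(404, attempts_made=0))
print(next_delay(1, retry_after="3"))
```

With the documented defaults this schedule would wait 0.5 s and then 1.0 s before giving up, unless a Retry-After header asks for a different delay.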
- api_keys: venice_ai.resources.api_keys.AsyncApiKeys
- audio: venice_ai.resources.audio.AsyncAudio
- billing: venice_ai.resources.billing.AsyncBilling
- characters: venice_ai.resources.characters.AsyncCharacters
- chat: venice_ai.resources.chat.AsyncChatResource
- async close()[source]
Close the underlying asynchronous HTTP client and free all associated resources.
This method performs cleanup of the internal httpx.AsyncClient and any associated resources such as connection pools, SSL contexts, and background tasks. It should be called when the Venice AI client is no longer needed to ensure proper resource cleanup and prevent resource leaks.
When using the client as an async context manager (in an async with block), this method is called automatically upon exiting the context, so manual cleanup is not required. For manual resource management, this method should be called explicitly.
The method is designed to be idempotent: it can be called multiple times safely. Only the first call will actually perform the cleanup; subsequent calls will be no-ops. This prevents errors if cleanup is attempted multiple times.
Note
If a user-provided httpx.AsyncClient was passed to the constructor, this method will not close it, as the user is responsible for managing the lifecycle of their own client.
After calling this method, the client should not be used for making further API requests. Attempting to use a closed client may result in errors or undefined behavior.
- Return type:
None
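The two behaviors described for close() (idempotence, and leaving a user-provided client open) can be illustrated with a small stand-alone sketch. FakeTransport and SketchClient are hypothetical stand-ins written for this example. They do not reproduce the SDK's internals.

```python
import asyncio


class FakeTransport:
    """Stand-in for an httpx.AsyncClient; counts how often it is closed."""

    def __init__(self) -> None:
        self.close_calls = 0

    async def aclose(self) -> None:
        self.close_calls += 1


class SketchClient:
    """Closes its HTTP client only if it created that client itself."""

    def __init__(self, http_client=None) -> None:
        self._owns_client = http_client is None
        self._http = http_client if http_client is not None else FakeTransport()
        self._closed = False

    async def close(self) -> None:
        if self._closed:
            return  # later calls are no-ops
        self._closed = True
        if self._owns_client:
            await self._http.aclose()


async def demo():
    owned = SketchClient()
    await owned.close()
    await owned.close()  # safe to call again

    shared = FakeTransport()
    borrowed = SketchClient(http_client=shared)
    await borrowed.close()  # does not close the caller's transport
    return owned._http.close_calls, shared.close_calls


result = asyncio.run(demo())
print(result)
```

The first number counts closes of the internally created transport, the second counts closes of the caller-supplied one.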
- async delete(path: str, *, cast_to: Type[venice_ai._async_client.T] | None = None, **kwargs)[source]
Make an asynchronous DELETE request to the specified API endpoint.
This is a convenience method that wraps the lower-level _request method specifically for DELETE requests. It handles proper header configuration for DELETE requests and provides a clean interface for resource deletion operations.
- Parameters:
path (str) – API endpoint path relative to the client’s base URL. Should not include a leading slash as it will be properly joined with the base URL.
cast_to (Optional[Type[T]]) – Optional Pydantic model to cast the response to.
kwargs – Additional keyword arguments to pass to the underlying _request method. This can include options like headers, params, timeout, or raw_response.
- Returns:
Parsed JSON response data as Python objects (typically dict or list). Many DELETE endpoints return confirmation data or the deleted resource details.
- Return type:
Any
- Raises:
venice_ai.exceptions.InvalidRequestError – For HTTP 400 errors indicating invalid request parameters.
venice_ai.exceptions.AuthenticationError – For HTTP 401 errors indicating invalid or missing API key.
venice_ai.exceptions.PermissionDeniedError – For HTTP 403 errors indicating insufficient permissions.
venice_ai.exceptions.NotFoundError – For HTTP 404 errors indicating the requested resource was not found.
venice_ai.exceptions.RateLimitError – For HTTP 429 errors indicating rate limit exceeded.
venice_ai.exceptions.InternalServerError – For HTTP 5xx errors indicating server-side problems.
venice_ai.exceptions.APITimeoutError – If the request times out before completion.
venice_ai.exceptions.APIConnectionError – For network connectivity issues or connection failures.
venice_ai.exceptions.APIError – For other API-related errors not covered by specific exceptions.
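The status-code rows in the list above can be summarized as a lookup. The sketch returns exception class names as strings so it runs without the SDK installed. The mapping is read off this list only (timeouts and connection failures have no status code and are left out), and exception_name_for_status is a hypothetical helper.

```python
STATUS_TO_EXCEPTION = {
    400: "InvalidRequestError",
    401: "AuthenticationError",
    403: "PermissionDeniedError",
    404: "NotFoundError",
    429: "RateLimitError",
}


def exception_name_for_status(status_code: int) -> str:
    """Name of the documented exception for an HTTP error status code."""
    if status_code in STATUS_TO_EXCEPTION:
        return STATUS_TO_EXCEPTION[status_code]
    if 500 <= status_code <= 599:
        return "InternalServerError"
    return "APIError"  # documented catch-all for other API errors


for code in (401, 404, 429, 503, 418):
    print(code, exception_name_for_status(code))
```

In calling code, catching the more specific exceptions before APIError follows the same ordering.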
- embeddings: venice_ai.resources.embeddings.AsyncEmbeddings
- async get(path: str, *, params: Mapping[str, Any] | None = None, cast_to: Type[venice_ai._async_client.T] | None = None, **kwargs)[source]
Make an asynchronous GET request to the specified API endpoint.
This is a convenience method that wraps the lower-level _request method specifically for GET requests. It automatically handles proper header configuration for GET requests (removing Content-Type headers) and provides a clean interface for retrieving data from the API.
- Parameters:
path (str) – API endpoint path relative to the client’s base URL. Should not include a leading slash as it will be properly joined with the base URL.
params (Optional[Mapping[str, Any]]) – URL query parameters to include in the request. These will be properly URL-encoded and appended to the request URL.
cast_to (Optional[Type[T]]) – Optional Pydantic model to cast the response to.
kwargs – Additional keyword arguments to pass to the underlying _request method. This can include options like headers, timeout, or raw_response.
- Returns:
Parsed JSON response data as Python objects (typically dict or list).
- Return type:
Any
- Raises:
venice_ai.exceptions.InvalidRequestError – For HTTP 400 errors indicating invalid request parameters.
venice_ai.exceptions.AuthenticationError – For HTTP 401 errors indicating invalid or missing API key.
venice_ai.exceptions.PermissionDeniedError – For HTTP 403 errors indicating insufficient permissions.
venice_ai.exceptions.NotFoundError – For HTTP 404 errors indicating the requested resource was not found.
venice_ai.exceptions.RateLimitError – For HTTP 429 errors indicating rate limit exceeded.
venice_ai.exceptions.InternalServerError – For HTTP 5xx errors indicating server-side problems.
venice_ai.exceptions.APITimeoutError – If the request times out before completion.
venice_ai.exceptions.APIConnectionError – For network connectivity issues or connection failures.
venice_ai.exceptions.APIError – For other API-related errors not covered by specific exceptions.
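To make the path and params descriptions for get() concrete, here is a rough sketch of the resulting URL built with the standard library. The real client delegates joining and encoding to httpx. The base URL below is a placeholder, the query parameter names are invented, and build_url is a hypothetical helper.

```python
from typing import Any, Mapping, Optional
from urllib.parse import urlencode


def build_url(base_url: str, path: str,
              params: Optional[Mapping[str, Any]] = None) -> str:
    """Join a relative path onto a base URL and append encoded query parameters."""
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    if params:
        url += "?" + urlencode(params)
    return url


# Placeholder base URL and parameter names, for illustration only.
print(build_url("https://api.example.invalid/v1", "models", {"type": "text", "q": "a b"}))
```

Note how the space in the query value is encoded rather than passed through verbatim.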
- image: venice_ai.resources.image.AsyncImage
- models: venice_ai.resources.models.AsyncModels
- async post(path: str, *, json_data: Mapping[str, Any] | None = None, timeout: float | httpx.Timeout | None = None, cast_to: Type[venice_ai._async_client.T] | None = None, **kwargs)[source]
Make an asynchronous POST request to the specified API endpoint.
This is a convenience method that wraps the lower-level _request method specifically for POST requests. It handles JSON serialization of the request body and ensures proper Content-Type headers are set for JSON requests.
- Parameters:
path (str) – API endpoint path relative to the client’s base URL. Should not include a leading slash as it will be properly joined with the base URL.
json_data (Optional[Mapping[str, Any]]) – JSON-serializable data to send in the request body. This will be automatically serialized to JSON and sent with Content-Type: application/json headers. Can include any data structure that is JSON-serializable (dict, list, primitives).
timeout (Optional[Union[float, httpx.Timeout]]) – Request timeout configuration. Can be a float specifying timeout in seconds, or an httpx.Timeout object for granular timeout control. If not provided, uses the client’s default timeout setting.
cast_to (Optional[Type[T]]) – Optional Pydantic model to cast the response to.
kwargs – Additional keyword arguments to pass to the underlying _request method. This can include options like headers, params, or raw_response.
- Returns:
Parsed JSON response data as Python objects (typically dict or list).
- Return type:
Any
- Raises:
venice_ai.exceptions.InvalidRequestError – For HTTP 400 errors indicating invalid request parameters.
venice_ai.exceptions.AuthenticationError – For HTTP 401 errors indicating invalid or missing API key.
venice_ai.exceptions.PermissionDeniedError – For HTTP 403 errors indicating insufficient permissions.
venice_ai.exceptions.NotFoundError – For HTTP 404 errors indicating the requested resource was not found.
venice_ai.exceptions.RateLimitError – For HTTP 429 errors indicating rate limit exceeded.
venice_ai.exceptions.InternalServerError – For HTTP 5xx errors indicating server-side problems.
venice_ai.exceptions.APITimeoutError – If the request times out before completion.
venice_ai.exceptions.APIConnectionError – For network connectivity issues or connection failures.
venice_ai.exceptions.APIError – For other API-related errors not covered by specific exceptions.
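The JSON handling described for post() amounts to serializing json_data and attaching a Content-Type header. A minimal standard-library sketch of that step follows. prepare_json_request is a hypothetical helper, and the behavior for a missing body (no payload, no header) is an assumption rather than documented SDK behavior.

```python
import json
from typing import Any, Dict, Mapping, Optional, Tuple


def prepare_json_request(
    json_data: Optional[Mapping[str, Any]],
) -> Tuple[Optional[bytes], Dict[str, str]]:
    """Serialize a JSON body and return it with the matching header."""
    if json_data is None:
        return None, {}  # assumed: nothing to send, no Content-Type needed
    body = json.dumps(json_data).encode("utf-8")
    return body, {"Content-Type": "application/json"}


body, headers = prepare_json_request(
    {"model": "venice-1", "messages": [{"role": "user", "content": "Hello"}]}
)
print(headers, len(body))
```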
Advanced HTTP Client Configuration¶
The VeniceClient and AsyncVeniceClient offer flexible ways to configure the underlying HTTP client (httpx.Client and httpx.AsyncClient respectively). This allows for advanced scenarios such as custom mTLS, specific proxy setups, detailed transport logging, or fine-tuning HTTP/2 behavior.
There are two primary methods for this:
Passing a Pre-configured httpx.Client / httpx.AsyncClient
Passing Common httpx Settings Directly to the SDK Client Constructor
Passing a Pre-configured httpx.Client / httpx.AsyncClient¶
You can instantiate the SDK client with an http_client parameter, providing your own fully configured httpx.Client or httpx.AsyncClient instance. The SDK will use this instance directly for making API calls.
Key Behaviors:
SDK Management: The SDK will still manage the base_url and authentication headers for the requests. It will also apply its default timeout if a more specific timeout (e.g., per-request timeout) is not already configured on your provided client.
Lifecycle Management: You are responsible for the lifecycle (e.g., closing) of the httpx.Client or httpx.AsyncClient instance you provide. The SDK will not close a user-provided client, even when the SDK client is closed or used as a context manager.
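The lifecycle rule above amounts to an ownership flag: close only what you created. The sketch below illustrates that idea with stand-in classes; FakeHTTPClient and OwnershipAwareClient are hypothetical names, and this is not how the SDK is necessarily written.

```python
class FakeHTTPClient:
    """Stand-in for httpx.Client; it only records whether close() was called."""

    def __init__(self) -> None:
        self.is_closed = False

    def close(self) -> None:
        self.is_closed = True


class OwnershipAwareClient:
    """Hypothetical wrapper that closes the HTTP client only if it created it."""

    def __init__(self, http_client=None) -> None:
        # Remember whether the caller supplied the client.
        self._owns_http_client = http_client is None
        self._http_client = FakeHTTPClient() if http_client is None else http_client

    def close(self) -> None:
        if self._owns_http_client:
            self._http_client.close()

    def __enter__(self):
        return self

    def __exit__(self, *exc_info) -> None:
        self.close()


# A caller-provided client survives the wrapper's context manager.
shared = FakeHTTPClient()
with OwnershipAwareClient(http_client=shared):
    pass

# An internally created client is closed together with the wrapper.
internal = OwnershipAwareClient()
internal.close()

print(shared.is_closed, internal._http_client.is_closed)
```

This is why a shared httpx client can safely back several SDK clients: none of them will close it on exit.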
Use Cases:
Implementing custom mutual TLS (mTLS) authentication.
Using a very specific proxy configuration not easily achieved with simple string/dict proxies.
Integrating advanced logging or monitoring at the HTTP transport layer.
Reusing an existing httpx.Client instance that is shared across different parts of your application.
Example: VeniceClient with a custom httpx.Client
import httpx
from venice_ai import VeniceClient

# User creates and configures their own httpx.Client
custom_transport = httpx.HTTPTransport(retries=5)
my_httpx_client = httpx.Client(
    transport=custom_transport,
    # Newer httpx releases take a single `proxy` argument;
    # the older `proxies` mapping is deprecated.
    proxy="http://localhost:8080",
)

# Pass it to VeniceClient
# The user is responsible for closing my_httpx_client when done.
client = VeniceClient(api_key="YOUR_API_KEY", http_client=my_httpx_client)

# Use the client as usual
try:
    models = client.models.list()
    print(models)
finally:
    # User must close their client.
    # VeniceClient's close() or context manager __exit__ will NOT close my_httpx_client.
    if not my_httpx_client.is_closed:
        my_httpx_client.close()
Example: AsyncVeniceClient with a custom httpx.AsyncClient
import asyncio
import httpx
from venice_ai import AsyncVeniceClient

async def main():
    # User creates and configures their own httpx.AsyncClient
    custom_transport = httpx.AsyncHTTPTransport(retries=5)
    my_async_httpx_client = httpx.AsyncClient(
        transport=custom_transport,
        # Newer httpx releases take a single `proxy` argument;
        # the older `proxies` mapping is deprecated.
        proxy="http://localhost:8080",
    )

    # Pass it to AsyncVeniceClient
    # The user is responsible for closing my_async_httpx_client when done.
    async_client = AsyncVeniceClient(api_key="YOUR_API_KEY", http_client=my_async_httpx_client)

    # Use the client as usual
    try:
        models = await async_client.models.list()
        print(models)
    finally:
        # User must close their client.
        # AsyncVeniceClient's aclose() or context manager __aexit__ will NOT close my_async_httpx_client.
        if not my_async_httpx_client.is_closed:
            await my_async_httpx_client.aclose()

if __name__ == "__main__":
    asyncio.run(main())
Passing httpx Settings Directly¶
If you do not provide your own http_client instance, you can pass common httpx.Client or httpx.AsyncClient constructor arguments directly to the VeniceClient or AsyncVeniceClient constructor. The SDK will use these arguments to create and manage its internal httpx client.
Key Behaviors:
SDK Management: The SDK creates, configures, and manages the lifecycle of the internal httpx.Client or httpx.AsyncClient. It will be closed when the SDK client’s close() (or aclose()) method is called, or when the SDK client exits its context manager.
Supported Parameters: You can pass the following httpx constructor arguments:
* proxy: A proxy URL or a dictionary mapping URL schemes to proxy URLs.
* proxies: (Alternative to proxy) A dictionary mapping URL schemes or specific domain/host patterns to proxy URLs.
* transport: An httpx.HTTPTransport or httpx.AsyncHTTPTransport instance for advanced transport layer customization (e.g., connection pooling, retries, UNIX domain sockets).
* limits: An httpx.Limits instance to configure connection limits (e.g., max_connections, max_keepalive_connections).
* cert: An SSL certificate, either a path to a PEM file or a 2-tuple of (cert, key) file paths.
* verify: SSL verification. Can be a boolean (True/False) or a path to a CA bundle file. Defaults to True. Set to False with caution.
* trust_env: A boolean indicating whether to trust environment variables for proxy configuration, SSL certificates, etc. Defaults to True.
* http1: A boolean indicating whether to allow HTTP/1.1 requests. Defaults to True.
* http2: A boolean indicating whether to enable HTTP/2 support. Defaults to False (httpx default).
* follow_redirects: A boolean indicating whether to automatically follow redirects. Defaults to False for the SDK client.
* max_redirects: The maximum number of redirects to follow if follow_redirects is True.
* default_encoding: A callable or string to determine the default encoding for response text.
* event_hooks: A dictionary of event hooks (e.g., for request, response).
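In the constructor signature these settings default to a NOT_GIVEN sentinel rather than None, which lets "argument omitted" be told apart from an explicit False or None. A minimal sketch of that pattern follows; the NotGiven class here is a local stand-in and build_httpx_kwargs is a hypothetical helper, so how the SDK actually forwards its arguments is an assumption.

```python
from typing import Any, Dict


class NotGiven:
    """Stand-in sentinel type meaning "the caller did not pass this argument"."""

    def __bool__(self) -> bool:
        return False

    def __repr__(self) -> str:
        return "NOT_GIVEN"


NOT_GIVEN = NotGiven()


def build_httpx_kwargs(**settings: Any) -> Dict[str, Any]:
    """Hypothetical helper: keep only the settings the caller actually supplied."""
    return {
        name: value
        for name, value in settings.items()
        if not isinstance(value, NotGiven)
    }


kwargs = build_httpx_kwargs(
    verify=False,                # explicitly given, even though it is falsy
    http2=True,
    limits=NOT_GIVEN,            # omitted: the HTTP library's default applies
    follow_redirects=NOT_GIVEN,  # omitted as well
)
print(kwargs)  # {'verify': False, 'http2': True}
```

A plain None default could not express this, because None is itself a meaningful value for some settings such as timeouts.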
Use Cases:
Easily configuring a standard HTTP/S proxy.
Setting up custom SSL/TLS verification (e.g., using a corporate CA bundle).
Adjusting connection pool limits.
Enabling HTTP/2.
Customizing retry behavior via a custom transport.
Example: VeniceClient with direct httpx settings
from venice_ai import VeniceClient
import httpx  # For httpx.Limits and httpx.HTTPTransport

# Pass httpx settings directly to the VeniceClient constructor.
# The SDK will create and manage its internal httpx.Client with these settings.
client = VeniceClient(
    api_key="YOUR_API_KEY",
    proxy="http://localhost:8080",  # Example proxy
    transport=httpx.HTTPTransport(retries=3),  # Example custom transport
    limits=httpx.Limits(max_connections=100, max_keepalive_connections=20),  # Example limits
    verify=False,  # Example: disable SSL verification (use with caution)
)

# Use the client as usual (SDK manages httpx.Client lifecycle)
with client:  # Or call client.close() when done
    models = client.models.list()
    print(models)
Example: AsyncVeniceClient with direct httpx settings
from venice_ai import AsyncVeniceClient
import httpx  # For httpx.Limits and httpx.AsyncHTTPTransport
import asyncio

async def main():
    # Pass httpx settings directly to the AsyncVeniceClient constructor.
    # The SDK will create and manage its internal httpx.AsyncClient with these settings.
    async_client = AsyncVeniceClient(
        api_key="YOUR_API_KEY",
        proxy="http://localhost:8080",  # Example proxy
        transport=httpx.AsyncHTTPTransport(retries=3),  # Example custom transport
        limits=httpx.Limits(max_connections=100, max_keepalive_connections=20),  # Example limits
        verify=False,  # Example: disable SSL verification (use with caution)
    )

    # Use the client as usual (SDK manages httpx.AsyncClient lifecycle)
    async with async_client:  # Or await async_client.aclose() when done
        models = await async_client.models.list()
        print(models)

if __name__ == "__main__":
    asyncio.run(main())
Chat Resources¶
- class venice_ai.resources.chat.AsyncChatResource(client: AsyncVeniceClient)[source]
Bases: AsyncAPIResource
Provides asynchronous access to chat-related API operations.
This class acts as a namespace for asynchronous chat functionalities and is accessed via async_client.chat. It serves as a container for chat-related operations, primarily providing access to asynchronous chat completion functionality through the completions property.
- Parameters:
client (venice_ai._async_client.AsyncVeniceClient) – The asynchronous AsyncVeniceClient instance.
- completions: venice_ai.resources.chat.completions.AsyncChatCompletions
Access to asynchronous chat completion creation operations.
- class venice_ai.resources.chat.ChatResource(client: venice_ai._client.VeniceClient)[source]
Bases: APIResource
Provides access to chat-related API operations.
This class acts as a namespace for chat functionalities and is accessed via client.chat. It serves as a container for chat-related operations, primarily providing access to chat completion functionality through the completions property.
- Parameters:
client (venice_ai._client.VeniceClient) – The synchronous VeniceClient instance.
- completions: venice_ai.resources.chat.completions.ChatCompletions
Access to chat completion creation operations.
- class venice_ai.resources.chat.completions.AsyncChatCompletions(client: venice_ai._resource.AsyncClientT)[source]
Provides access to asynchronous chat completion operations.
This class manages asynchronous chat completion operations with Venice AI models, supporting both standard (non-streaming) and streaming response formats. It serves as the primary interface for chat-based interactions with Venice AI language models in asynchronous contexts.
The class handles parameter validation, request formation, and response parsing for asynchronous chat completion requests.
- Parameters:
_client (venice_ai._async_client.AsyncVeniceClient) – The client instance used to make API requests.
Example
from venice_ai import AsyncVeniceClient
import asyncio

async def main():
    # Initialize the async client
    client = AsyncVeniceClient(api_key="your-api-key")

    # Create a chat completion asynchronously
    response = await client.chat.completions.create(
        model="venice-1",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Tell me about Venice AI."}
        ]
    )

    # Access the response content
    print(response["choices"][0]["message"]["content"])

# Run the async function
asyncio.run(main())
- Parameters:
client (typing.TypeVar(AsyncClientT, bound=AsyncVeniceClient))
- async create(*, model: str, messages: Sequence[venice_ai.types.chat.MessageParam], stream: bool = False, stream_cls: Type[venice_ai.types.chat.ChunkModelFactory[venice_ai.types.chat.ChatCompletionChunk]] | None = None, **kwargs: Any)[source]
Create a model response for the given chat conversation asynchronously.
This method handles the core functionality of the chat completions API, supporting both standard (non-streaming) and streaming responses in async contexts. It sends the provided messages and parameters to the Venice AI API and returns either a complete response or a stream of partial responses.
The method automatically formats the request body, applies appropriate defaults, and routes the request to either the standard or streaming endpoint based on the stream parameter.
- Parameters:
model (str) – ID of the model to use (e.g., "venice-1", "llama-3.3-70b").
messages (Sequence[venice_ai.types.chat.MessageParam]) – Sequence of messages forming the conversation.
stream (bool) – If True, stream back partial progress. Defaults to False. Returns an AsyncIterator[ChatCompletionChunk] if True, otherwise ChatCompletion.
stream_cls (Optional[Type[venice_ai.types.chat.ChunkModelFactory[venice_ai.types.chat.ChatCompletionChunk]]]) – Optional stream wrapper class for streaming responses. Must conform to the ChunkModelFactory protocol.
frequency_penalty (Optional[float]) – Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far.
max_tokens (Optional[int]) – Deprecated. Please use max_completion_tokens instead. The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model’s context length.
max_completion_tokens (Optional[int]) – Maximum number of tokens that can be generated in the chat completion.
n (Optional[int]) – Number of chat completion choices to generate for each input message.
presence_penalty (Optional[float]) – Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far.
response_format (Optional[venice_ai.types.chat.ResponseFormat]) – Specifies the format that the model must output (e.g., for JSON mode).
seed (Optional[int]) – Random seed for reproducible outputs.
stop (Optional[Union[str, Sequence[str]]]) – Up to 4 sequences where the API will stop generating further tokens.
temperature (Optional[float]) – Sampling temperature between 0.0 and 2.0. Higher values make output more random, lower values more focused and deterministic. Defaults to 0.7.
top_p (Optional[float]) – Nucleus sampling parameter between 0.0 and 1.0. Defaults to 1.0.
tools (Optional[Sequence[venice_ai.types.chat.Tool]]) – List of tools the model may call.
tool_choice (Optional[Union[Literal["none", "auto"], venice_ai.types.chat.ToolChoiceObject]]) – Controls which (if any) tool is called by the model. Can be "none", "auto", or a specific tool.
user (Optional[str]) – Unique identifier representing your end-user (discarded by API but supported for OpenAI compatibility).
venice_parameters (Optional[venice_ai.types.chat.VeniceParameters]) – Venice-specific parameters for fine-tuning model behavior.
logprobs (Optional[bool]) – Whether to return log probabilities of the output tokens.
top_logprobs (Optional[int]) – Number of most likely tokens to return at each token position if logprobs is True.
parallel_tool_calls (Optional[bool]) – Whether to enable parallel function calling during tool use.
repetition_penalty (Optional[float]) – Penalty for token repetition.
stop_token_ids (Optional[Sequence[int]]) – List of token IDs at which to stop generation.
top_k (Optional[int]) – Number of highest probability vocabulary tokens to keep for top-k-filtering.
stream_options (Optional[venice_ai.types.chat.StreamOptions]) – Additional options for controlling streaming behavior.
logit_bias (Optional[Dict[str, int]]) – Modify the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100.
kwargs (typing.Any) – Additional keyword arguments.
- Returns:
A ChatCompletion if stream is False, otherwise an AsyncIterator of ChatCompletionChunk.
- Return type:
Union[venice_ai.types.chat.ChatCompletion, AsyncIterator[venice_ai.types.chat.ChatCompletionChunk]]
- Raises:
venice_ai.exceptions.InvalidRequestError – If parameters are invalid or malformed.
venice_ai.exceptions.AuthenticationError – If the API key is invalid or missing.
venice_ai.exceptions.PermissionDeniedError – If access is denied to the requested model or feature.
venice_ai.exceptions.NotFoundError – If the model or resource is not found.
venice_ai.exceptions.RateLimitError – If rate limits are exceeded for the account.
venice_ai.exceptions.APIError – For other API-related errors not covered by specific exceptions.
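Of the errors listed above, a rate limit is the one callers most often retry. The sketch below shows one common approach, exponential backoff, under stated assumptions: RateLimitError is a local stand-in for the SDK exception, call_with_backoff and flaky_call are hypothetical, and whether the SDK already retries internally is not stated in this reference.

```python
import asyncio


class RateLimitError(Exception):
    """Local stand-in for venice_ai.exceptions.RateLimitError."""


async def call_with_backoff(make_call, max_attempts=4, base_delay=0.01):
    """Retry make_call() on RateLimitError, doubling the delay after each failure."""
    for attempt in range(max_attempts):
        try:
            return await make_call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            await asyncio.sleep(base_delay * (2 ** attempt))


attempts = {"count": 0}


async def flaky_call():
    """Hypothetical request that is rate limited twice before succeeding."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimitError("429 Too Many Requests")
    return {"choices": [{"message": {"content": "ok"}}]}


result = asyncio.run(call_with_backoff(flaky_call))
print(attempts["count"], result["choices"][0]["message"]["content"])  # 3 ok
```

In real use, make_call would wrap the awaited create() call, and base_delay would be on the order of seconds rather than milliseconds.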
Example
# Non-streaming async usage
import asyncio
from venice_ai import AsyncVeniceClient

async def main():
    client = AsyncVeniceClient(api_key="your-api-key")
    response = await client.chat.completions.create(
        model="llama-3.3-70b",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Explain async programming in Python."}
        ],
        temperature=0.3
    )
    print(response["choices"][0]["message"]["content"])

asyncio.run(main())

# Async streaming usage
async def stream_example():
    client = AsyncVeniceClient(api_key="your-api-key")
    async for chunk in await client.chat.completions.create(
        model="venice-1",
        messages=[{"role": "user", "content": "Tell me a story."}],
        stream=True,
        max_completion_tokens=200
    ):
        content = chunk["choices"][0]["delta"].get("content", "")
        if content:
            print(content, end="", flush=True)

asyncio.run(stream_example())
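When the full text is needed rather than incremental printing, the streamed deltas can be collected and joined. The sketch below runs without the SDK: fake_stream is a hypothetical stand-in that yields dictionaries shaped like the chunks accessed in the example above, and the content-free final chunk is an assumption about what a stream may contain.

```python
import asyncio
from typing import Any, AsyncIterator, Dict, List


async def fake_stream() -> AsyncIterator[Dict[str, Any]]:
    """Stand-in for a streaming response: yields chunk-shaped dictionaries."""
    for piece in ["Once ", "upon ", "a time."]:
        yield {"choices": [{"delta": {"content": piece}}]}
    # Assumption: a stream may end with a chunk whose delta has no content.
    yield {"choices": [{"delta": {}}]}


async def collect_text(stream: AsyncIterator[Dict[str, Any]]) -> str:
    """Concatenate the delta content of every chunk into one string."""
    parts: List[str] = []
    async for chunk in stream:
        # Treat a missing or None content field as an empty string.
        parts.append(chunk["choices"][0]["delta"].get("content") or "")
    return "".join(parts)


full_text = asyncio.run(collect_text(fake_stream()))
print(full_text)  # Once upon a time.
```

Appending to a list and joining once at the end avoids repeated string concatenation on long responses.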
- class venice_ai.resources.chat.completions.ChatCompletions(client: venice_ai._resource.SyncClientT)[source]
Bases: APIResource
Provides access to chat completion operations.
This class manages synchronous chat completion operations with Venice AI models, supporting both standard (non-streaming) and streaming response formats. It serves as the primary interface for chat-based interactions with Venice AI language models.
The class handles parameter validation, request formation, and response parsing for chat completion requests.
- Parameters:
_client (venice_ai._client.VeniceClient) – The client instance used to make API requests.
Example
from venice_ai import VeniceClient

# Initialize the client
client = VeniceClient(api_key="your-api-key")

# Create a chat completion
response = client.chat.completions.create(
    model="venice-1",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me about Venice AI."}
    ]
)

# Access the response content
print(response["choices"][0]["message"]["content"])
- Parameters:
client (typing.TypeVar(SyncClientT, bound=VeniceClient))
- create(*, model: str, messages: Sequence[venice_ai.types.chat.MessageParam], stream: bool = False, stream_cls: Type[venice_ai.types.chat.ChunkModelFactory[venice_ai.types.chat.ChatCompletionChunk]] | None = None, **kwargs: Any)[source]
Create a model response for the given chat conversation.
This method handles the core functionality of the chat completions API, supporting both standard (non-streaming) and streaming responses. It sends the provided messages and parameters to the Venice AI API and returns either a complete response or a stream of partial responses.
The method automatically formats the request body, applies appropriate defaults, and routes the request to either the standard or streaming endpoint based on the stream parameter.
- Parameters:
model (str) – ID of the model to use (e.g., "venice-1", "llama-3.3-70b").
messages (Sequence[venice_ai.types.chat.MessageParam]) – Sequence of messages forming the conversation.
stream (bool) – If True, stream back partial progress. Defaults to False. Returns an Iterator[ChatCompletionChunk] if True, otherwise ChatCompletion.
stream_cls (Optional[Type[venice_ai.types.chat.ChunkModelFactory[venice_ai.types.chat.ChatCompletionChunk]]]) – Optional stream wrapper class for streaming responses. Must conform to the ChunkModelFactory protocol.
frequency_penalty (Optional[float]) – Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far.
max_tokens (Optional[int]) – Deprecated. Please use max_completion_tokens instead. The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model’s context length.
max_completion_tokens (Optional[int]) – Maximum number of tokens that can be generated in the chat completion.
n (Optional[int]) – Number of chat completion choices to generate for each input message.
presence_penalty (Optional[float]) – Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far.
response_format (Optional[venice_ai.types.chat.ResponseFormat]) – Specifies the format that the model must output (e.g., for JSON mode).
seed (Optional[int]) – Random seed for reproducible outputs.
stop (Optional[Union[str, Sequence[str]]]) – Up to 4 sequences where the API will stop generating further tokens.
temperature (Optional[float]) – Sampling temperature between 0.0 and 2.0. Higher values make output more random, lower values more focused and deterministic. Defaults to 0.7.
top_p (Optional[float]) – Nucleus sampling parameter between 0.0 and 1.0. Defaults to 1.0.
tools (Optional[Sequence[venice_ai.types.chat.Tool]]) – List of tools the model may call.
tool_choice (Optional[Union[Literal["none", "auto"], venice_ai.types.chat.ToolChoiceObject]]) – Controls which (if any) tool is called by the model. Can be "none", "auto", or a specific tool.
user (Optional[str]) – Unique identifier representing your end-user (discarded by API but supported for OpenAI compatibility).
venice_parameters (Optional[venice_ai.types.chat.VeniceParameters]) – Venice-specific parameters for fine-tuning model behavior.
logprobs (Optional[bool]) – Whether to return log probabilities of the output tokens.
top_logprobs (Optional[int]) – Number of most likely tokens to return at each token position if logprobs is True.
parallel_tool_calls (Optional[bool]) – Whether to enable parallel function calling during tool use.
repetition_penalty (Optional[float]) – Penalty for token repetition.
stop_token_ids (Optional[Sequence[int]]) – List of token IDs at which to stop generation.
top_k (Optional[int]) – Number of highest probability vocabulary tokens to keep for top-k-filtering.
stream_options (Optional[venice_ai.types.chat.StreamOptions]) – Additional options for controlling streaming behavior.
logit_bias (Optional[Dict[str, int]]) – Modify the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100.
kwargs (typing.Any) – Additional keyword arguments.
- Returns:
A ChatCompletion if stream is False, otherwise an Iterator of ChatCompletionChunk.
- Return type:
Union[venice_ai.types.chat.ChatCompletion, Iterator[venice_ai.types.chat.ChatCompletionChunk]]
- Raises:
venice_ai.exceptions.InvalidRequestError – If parameters are invalid or malformed.
venice_ai.exceptions.AuthenticationError – If the API key is invalid or missing.
venice_ai.exceptions.PermissionDeniedError – If access is denied to the requested model or feature.
venice_ai.exceptions.NotFoundError – If the model or resource is not found.
venice_ai.exceptions.RateLimitError – If rate limits are exceeded for the account.
venice_ai.exceptions.APIError – For other API-related errors not covered by specific exceptions.
Example
# Non-streaming usage with system and user messages
from venice_ai import VeniceClient

client = VeniceClient(api_key="your-api-key")
response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {"role": "system", "content": "You are a helpful assistant specializing in Python."},
        {"role": "user", "content": "Write a function to calculate the Fibonacci sequence."}
    ],
    temperature=0.3  # More deterministic/focused response
)
print(response["choices"][0]["message"]["content"])

# Streaming usage with progress display
for chunk in client.chat.completions.create(
    model="venice-1",
    messages=[{"role": "user", "content": "Explain quantum computing briefly."}],
    stream=True,
    max_completion_tokens=250  # Limit response length
):
    content = chunk["choices"][0]["delta"].get("content", "")
    if content:
        print(content, end="", flush=True)

# Using tools/function calling
response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[{"role": "user", "content": "What's the weather in New York?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "City name"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
                },
                "required": ["location"]
            }
        }
    }]
)
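The numeric ranges documented for create() can be checked locally before a request is sent, which turns some InvalidRequestError round trips into immediate feedback. The sketch below is a hypothetical helper, check_sampling_params, that encodes only the limits stated in the parameter list above; it is not part of the SDK, and the API remains the authority on what is valid.

```python
from typing import Any, Dict, List


def check_sampling_params(params: Dict[str, Any]) -> List[str]:
    """Hypothetical client-side check against the documented parameter ranges."""
    problems: List[str] = []
    ranges = {
        "temperature": (0.0, 2.0),
        "top_p": (0.0, 1.0),
        "frequency_penalty": (-2.0, 2.0),
        "presence_penalty": (-2.0, 2.0),
    }
    for name, (low, high) in ranges.items():
        value = params.get(name)
        if value is not None and not (low <= value <= high):
            problems.append(f"{name} must be between {low} and {high}")

    # The stop parameter accepts a single string or up to 4 sequences.
    stop = params.get("stop")
    if isinstance(stop, (list, tuple)) and len(stop) > 4:
        problems.append("stop accepts at most 4 sequences")

    # Bias values are documented as ranging from -100 to 100.
    for token_id, bias in (params.get("logit_bias") or {}).items():
        if not (-100 <= bias <= 100):
            problems.append(f"logit_bias[{token_id}] must be between -100 and 100")

    if "max_tokens" in params:
        problems.append("max_tokens is deprecated; use max_completion_tokens")
    return problems


print(check_sampling_params({"temperature": 0.3, "top_p": 1.0}))  # []
print(check_sampling_params({"temperature": 2.5, "stop": ["a", "b", "c", "d", "e"]}))
```

Returning a list of messages instead of raising on the first problem lets a caller report every issue at once.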
- class venice_ai.resources.chat.completions.AsyncChatCompletions(client: venice_ai._resource.AsyncClientT)[source]
Bases: AsyncAPIResource
Provides access to asynchronous chat completion operations.
This class manages asynchronous chat completion operations with Venice AI models, supporting both standard (non-streaming) and streaming response formats. It serves as the primary interface for chat-based interactions with Venice AI language models in asynchronous contexts.
The class handles parameter validation, request formation, and response parsing for asynchronous chat completion requests.
- Parameters:
_client (venice_ai._async_client.AsyncVeniceClient) – The client instance used to make API requests.
Example
from venice_ai import AsyncVeniceClient import asyncio async def main(): # Initialize the async client client = AsyncVeniceClient(api_key="your-api-key") # Create a chat completion asynchronously response = await client.chat.completions.create( model="venice-1", messages=[ {"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Tell me about Venice AI."} ] ) # Access the response content print(response["choices"][0]["message"]["content"]) # Run the async function asyncio.run(main())
- Parameters:
client (typing.TypeVar(AsyncClientT, bound=AsyncVeniceClient))
- async create(*, model: str, messages: Sequence[venice_ai.types.chat.MessageParam], stream: bool = False, stream_cls: Type[venice_ai.types.chat.ChunkModelFactory[venice_ai.types.chat.ChatCompletionChunk]] | None = None, **kwargs: Any)[source]
Create a model response for the given chat conversation asynchronously.
This method handles the core functionality of the chat completions API, supporting both complete (non-streaming) and streaming responses in async contexts. It sends the provided messages and parameters to the Venice AI API and returns either a complete response or a stream of partial responses.
The method automatically formats the request body, applies appropriate defaults, and routes the request to either the standard or streaming endpoint based on the stream parameter.
- Parameters:
model (str) – ID of the model to use (e.g., "venice-1", "llama-3.3-70b").
messages (Sequence[venice_ai.types.chat.MessageParam]) – Sequence of messages forming the conversation.
stream (bool) – If True, stream back partial progress. Defaults to False. Returns an AsyncIterator[ChatCompletionChunk] if True, otherwise ChatCompletion.
stream_cls (Optional[Type[venice_ai.types.chat.ChunkModelFactory[venice_ai.types.chat.ChatCompletionChunk]]]) – Optional stream wrapper class for streaming responses. Must conform to the ChunkModelFactory protocol.
frequency_penalty (Optional[float]) – Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far.
max_tokens (Optional[int]) – Deprecated. Please use max_completion_tokens instead. The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model’s context length.
max_completion_tokens (Optional[int]) – Maximum number of tokens that can be generated in the chat completion.
n (Optional[int]) – Number of chat completion choices to generate for each input message.
presence_penalty (Optional[float]) – Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far.
response_format (Optional[venice_ai.types.chat.ResponseFormat]) – Specifies the format that the model must output (e.g., for JSON mode).
seed (Optional[int]) – Random seed for reproducible outputs.
stop (Optional[Union[str, Sequence[str]]]) – Up to 4 sequences where the API will stop generating further tokens.
temperature (Optional[float]) – Sampling temperature between 0.0 and 2.0. Higher values make output more random, lower values more focused and deterministic. Defaults to 0.7.
top_p (Optional[float]) – Nucleus sampling parameter between 0.0 and 1.0. Defaults to 1.0.
tools (Optional[Sequence[venice_ai.types.chat.Tool]]) – List of tools the model may call.
tool_choice (Optional[Union[Literal["none", "auto"], venice_ai.types.chat.ToolChoiceObject]]) – Controls which (if any) tool is called by the model. Can be "none", "auto", or a specific tool.
user (Optional[str]) – Unique identifier representing your end-user (discarded by API but supported for OpenAI compatibility).
venice_parameters (Optional[venice_ai.types.chat.VeniceParameters]) – Venice-specific parameters for fine-tuning model behavior.
logprobs (Optional[bool]) – Whether to return log probabilities of the output tokens.
top_logprobs (Optional[int]) – Number of most likely tokens to return at each token position if logprobs is True.
parallel_tool_calls (Optional[bool]) – Whether to enable parallel function calling during tool use.
repetition_penalty (Optional[float]) – Penalty for token repetition.
stop_token_ids (Optional[Sequence[int]]) – List of token IDs at which to stop generation.
top_k (Optional[int]) – Number of highest probability vocabulary tokens to keep for top-k-filtering.
stream_options (Optional[venice_ai.types.chat.StreamOptions]) – Additional options for controlling streaming behavior.
logit_bias (Optional[Dict[str, int]]) – Modify the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100.
kwargs (typing.Any) – Additional keyword arguments.
- Returns:
A ChatCompletion if stream is False, otherwise an AsyncIterator of ChatCompletionChunk.
- Return type:
Union[venice_ai.types.chat.ChatCompletion, AsyncIterator[venice_ai.types.chat.ChatCompletionChunk]]
- Raises:
venice_ai.exceptions.InvalidRequestError – If parameters are invalid or malformed.
venice_ai.exceptions.AuthenticationError – If the API key is invalid or missing.
venice_ai.exceptions.PermissionDeniedError – If access is denied to the requested model or feature.
venice_ai.exceptions.NotFoundError – If the model or resource is not found.
venice_ai.exceptions.RateLimitError – If rate limits are exceeded for the account.
venice_ai.exceptions.APIError – For other API-related errors not covered by specific exceptions.
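Of the errors listed above, rate limiting is the one that is usually worth retrying. The following is a minimal sketch of exponential backoff around an awaitable call, not the library's own retry behaviour: call_with_retry is a hypothetical helper, and the exception types to retry on are passed in by the caller so the block stays independent of venice_ai.

```python
import asyncio
from typing import Awaitable, Callable, Tuple, Type, TypeVar

T = TypeVar("T")


async def call_with_retry(
    make_call: Callable[[], Awaitable[T]],
    retry_on: Tuple[Type[BaseException], ...],
    max_attempts: int = 3,
    base_delay: float = 1.0,
) -> T:
    """Await make_call(), retrying with exponential backoff on retry_on.

    Hypothetical helper for illustration; it is not part of venice_ai.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            # make_call must create a fresh awaitable on every attempt.
            return await make_call()
        except retry_on:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error.
            # Delays grow as base_delay, 2*base_delay, 4*base_delay, ...
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

With a real client this might be used as await call_with_retry(lambda: client.chat.completions.create(...), retry_on=(RateLimitError,)), assuming RateLimitError is imported from venice_ai.exceptions as listed above. Errors such as AuthenticationError or InvalidRequestError are deliberately left out of retry_on, since repeating the same request would not be expected to change the outcome.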
Example
# Non-streaming async usage
import asyncio
from venice_ai import AsyncVeniceClient

async def main():
    client = AsyncVeniceClient(api_key="your-api-key")
    response = await client.chat.completions.create(
        model="llama-3.3-70b",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Explain async programming in Python."}
        ],
        temperature=0.3
    )
    print(response["choices"][0]["message"]["content"])

asyncio.run(main())

# Async streaming usage
async def stream_example():
    client = AsyncVeniceClient(api_key="your-api-key")
    async for chunk in await client.chat.completions.create(
        model="venice-1",
        messages=[{"role": "user", "content": "Tell me a story."}],
        stream=True,
        max_completion_tokens=200
    ):
        content = chunk["choices"][0]["delta"].get("content", "")
        if content:
            print(content, end="", flush=True)

asyncio.run(stream_example())
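When the full text is needed rather than incremental printing, the streamed chunks can be accumulated into one string. The sketch below assumes chunks are dictionary-like with the choices[0]["delta"]["content"] layout used in the streaming example above; collect_stream_text and fake_stream are hypothetical names, and fake_stream only stands in for a real streaming response so the block runs without network access.

```python
import asyncio
from typing import Any, AsyncIterator, Dict, List


async def collect_stream_text(chunks: AsyncIterator[Dict[str, Any]]) -> str:
    """Join the delta content pieces of a chunk stream into one string."""
    parts: List[str] = []
    async for chunk in chunks:
        choices = chunk.get("choices") or []
        if not choices:
            continue  # Assumption: some chunks may carry no choices.
        # A delta may omit "content" (for example on the final chunk).
        parts.append(choices[0].get("delta", {}).get("content") or "")
    return "".join(parts)


async def fake_stream() -> AsyncIterator[Dict[str, Any]]:
    """Stand-in for a streaming response; yields chunk-shaped dictionaries."""
    for piece in ["Once ", "upon ", "a time."]:
        yield {"choices": [{"delta": {"content": piece}}]}
    yield {"choices": [{"delta": {}}]}  # Final chunk without content.


story = asyncio.run(collect_stream_text(fake_stream()))
print(story)
```

With a real client, the argument would be the iterator obtained from await client.chat.completions.create(..., stream=True).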
Models Resources¶
- class venice_ai.resources.models.Models(client: venice_ai._resource.SyncClientT)[source]
Bases: APIResource
Provides access to model listing and capability information.
This class manages synchronous operations for retrieving information about available AI models, their traits, and compatibility mappings. It provides methods to list models, query model traits (semantic shortcuts), and get compatibility mappings between external model names and Venice model IDs.
- Parameters:
client (VeniceClient) – The Venice client instance used for API requests.
Example
Basic usage through a Venice client:
from venice_ai import VeniceClient

client = VeniceClient()
models = client.models.list()
for model in models.data:
    print(f"Model: {model.name} (ID: {model.id})")
- list(*, type: Literal['embedding', 'image', 'text', 'tts', 'upscale'] | None = None)[source]
Lists available models.
Retrieves a list of AI models available through the Venice API. Models can optionally be filtered by type to narrow down results to specific categories such as text generation, image generation, or embedding models.
- Parameters:
type (Optional[venice_ai.types.models.ModelType]) – Optional filter for model type. Valid values include "text", "image", "embedding", "tts", and "upscale". If not provided, all available models are returned.
- Returns:
A list of available models with their metadata, capabilities, and pricing information.
- Return type:
venice_ai.types.models.ModelList
- Raises:
venice_ai.exceptions.APIError – If an API error occurs during the request.
Example
List all available models:
models = client.models.list()
for model in models.data:
    print(f"Model ID: {model.id}, Name: {model.name}")

Filter models by type:

chat_models = client.models.list(type="text")
image_models = client.models.list(type="image")
- list_compatibility(*, type: Literal['embedding', 'image', 'text', 'tts', 'upscale'] | None = None)[source]
Lists model compatibility mapping between external model names and Venice model IDs.
Retrieves a mapping that allows applications to reference external model identifiers (e.g., from other AI platforms like OpenAI) and have them automatically mapped to equivalent Venice models. This compatibility layer facilitates smoother transitions when migrating applications from other AI platforms to Venice.
- Parameters:
type (Optional[venice_ai.types.models.ModelType]) – Optional filter for model type. Only compatibility mappings for models of the specified type will be returned. Valid values include "text", "image", "embedding", "tts", and "upscale".
- Returns:
A mapping of external model names to their equivalent Venice model IDs.
- Return type:
venice_ai.types.models.ModelCompatibilityList
- Raises:
venice_ai.exceptions.APIError – If an API error occurs during the request.
Example
Get all compatibility mappings:
compatibility = client.models.list_compatibility()
venice_model = compatibility.data.get("gpt-4")
print(f"GPT-4 maps to Venice model: {venice_model}")

Get compatibility for specific model type:

text_compat = client.models.list_compatibility(type="text")
for external_name, venice_id in text_compat.data.items():
    print(f"{external_name} -> {venice_id}")
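One way an application might use the compatibility mapping is to translate a model name taken from existing configuration before making a request. The sketch below works on a plain mapping, on the assumption (suggested by the .get() and .items() calls in the examples above) that the data attribute behaves like a dictionary. resolve_model_id is a hypothetical helper and the sample mapping values are placeholders, not real Venice model IDs.

```python
from typing import Mapping, Optional


def resolve_model_id(
    requested: str,
    compatibility: Mapping[str, str],
    fallback: Optional[str] = None,
) -> str:
    """Map an external model name to a Venice model ID.

    Names absent from the mapping are assumed to already be Venice IDs
    and are returned unchanged, unless a fallback is supplied.
    """
    if requested in compatibility:
        return compatibility[requested]
    return fallback if fallback is not None else requested


# Placeholder mapping; real values would come from list_compatibility().data.
sample_mapping = {"gpt-4": "example-venice-model"}

print(resolve_model_id("gpt-4", sample_mapping))
print(resolve_model_id("llama-3.3-70b", sample_mapping))
```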
- list_traits(*, type: Literal['embedding', 'image', 'text', 'tts', 'upscale'] | None = None)[source]
Lists model traits and their associated model IDs.
Retrieves a mapping of semantic trait names (e.g., “default”, “fastest”, “best”) to their corresponding model IDs. Traits provide convenient shortcuts for selecting models based on desired characteristics rather than specific model identifiers, making it easier to choose appropriate models without needing to know exact model versions or IDs.
- Parameters:
type (Optional[venice_ai.types.models.ModelType]) – Optional filter for model type. Only traits for models of the specified type will be returned. Valid values include "text", "image", "embedding", "tts", and "upscale".
- Returns:
A mapping of trait names to their corresponding model IDs.
- Return type:
venice_ai.types.models.ModelTraitList
- Raises:
venice_ai.exceptions.APIError – If an API error occurs during the request.
Example
Get all model traits:
traits = client.models.list_traits()
default_model = traits.data.get("default")
fastest_model = traits.data.get("fastest")

Get traits for specific model type:

text_traits = client.models.list_traits(type="text")
print(f"Default text model: {text_traits.data['default']}")
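Because a given trait may not be defined for every model type, indexing with data['default'] can raise KeyError. A more defensive pattern is to try traits in order of preference. This is a sketch over a plain mapping, assuming data is dictionary-like as in the examples above; pick_model_by_trait is a hypothetical helper and the sample IDs are placeholders.

```python
from typing import Mapping, Optional, Sequence


def pick_model_by_trait(
    traits: Mapping[str, str],
    preferred: Sequence[str] = ("fastest", "default"),
) -> Optional[str]:
    """Return the model ID of the first preferred trait that is present."""
    for trait in preferred:
        model_id = traits.get(trait)
        if model_id:
            return model_id
    return None  # None of the preferred traits is defined.


# Placeholder trait mapping; real values would come from list_traits().data.
sample_traits = {"default": "example-default-model"}

print(pick_model_by_trait(sample_traits))
```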
- class venice_ai.resources.models.AsyncModels(client: venice_ai._resource.AsyncClientT)[source]
Bases: AsyncAPIResource
Provides access to model listing and capability information (asynchronous).
This class manages asynchronous operations for retrieving information about available AI models, their traits, and compatibility mappings. It provides async methods to list models, query model traits (semantic shortcuts), and get compatibility mappings between external model names and Venice model IDs.
- Parameters:
client (AsyncVeniceClient) – The async Venice client instance used for API requests.
Example
Basic usage through an async Venice client:
import asyncio
from venice_ai import AsyncVeniceClient

async def main():
    client = AsyncVeniceClient()
    models = await client.models.list()
    for model in models.data:
        print(f"Model: {model.name} (ID: {model.id})")

# Run the coroutine
asyncio.run(main())
- async list(*, type: Literal['embedding', 'image', 'text', 'tts', 'upscale'] | None = None)[source]
Lists available models asynchronously.
Asynchronously retrieves a list of AI models available through the Venice API. Models can optionally be filtered by type to narrow down results to specific categories such as text generation, image generation, or embedding models.
- Parameters:
type (Optional[venice_ai.types.models.ModelType]) – Optional filter for model type. Valid values include "text", "image", "embedding", "tts", and "upscale". If not provided, all available models are returned.
- Returns:
A list of available models with their metadata, capabilities, and pricing information.
- Return type:
venice_ai.types.models.ModelList
- Raises:
venice_ai.exceptions.APIError – If an API error occurs during the request.
Example
List all available models:
models = await client.models.list()
for model in models.data:
    print(f"Model ID: {model.id}, Name: {model.name}")

Filter models by type:

chat_models = await client.models.list(type="text")
image_models = await client.models.list(type="image")
- async list_compatibility(*, type: Literal['embedding', 'image', 'text', 'tts', 'upscale'] | None = None)[source]
Lists model compatibility mapping between external model names and Venice model IDs asynchronously.
Asynchronously retrieves a mapping that allows applications to reference external model identifiers (e.g., from other AI platforms like OpenAI) and have them automatically mapped to equivalent Venice models. This compatibility layer facilitates smoother transitions when migrating applications from other AI platforms to Venice.
- Parameters:
type (Optional[venice_ai.types.models.ModelType]) – Optional filter for model type. Only compatibility mappings for models of the specified type will be returned. Valid values include "text", "image", "embedding", "tts", and "upscale".
- Returns:
A mapping of external model names to their equivalent Venice model IDs.
- Return type:
venice_ai.types.models.ModelCompatibilityList
- Raises:
venice_ai.exceptions.APIError – If an API error occurs during the request.
Example
Get all compatibility mappings:
compatibility = await client.models.list_compatibility()
venice_model = compatibility.data.get("gpt-4")
print(f"GPT-4 maps to Venice model: {venice_model}")

Get compatibility for specific model type:

text_compat = await client.models.list_compatibility(type="text")
for external_name, venice_id in text_compat.data.items():
    print(f"{external_name} -> {venice_id}")
- async list_traits(*, type: Literal['embedding', 'image', 'text', 'tts', 'upscale'] | None = None)[source]
Lists model traits and their associated model IDs asynchronously.
Asynchronously retrieves a mapping of semantic trait names (e.g., “default”, “fastest”, “best”) to their corresponding model IDs. Traits provide convenient shortcuts for selecting models based on desired characteristics rather than specific model identifiers, making it easier to choose appropriate models without needing to know exact model versions or IDs.
- Parameters:
type (Optional[venice_ai.types.models.ModelType]) – Optional filter for model type. Only traits for models of the specified type will be returned. Valid values include "text", "image", "embedding", "tts", and "upscale".
- Returns:
A mapping of trait names to their corresponding model IDs.
- Return type:
venice_ai.types.models.ModelTraitList
- Raises:
venice_ai.exceptions.APIError – If an API error occurs during the request.
Example
Get all model traits:
traits = await client.models.list_traits()
default_model = traits.data.get("default")
fastest_model = traits.data.get("fastest")

Get traits for specific model type:

text_traits = await client.models.list_traits(type="text")
print(f"Default text model: {text_traits.data['default']}")
Image Resources¶
Resource for interacting with the Venice AI image-related API endpoints.
This module provides both synchronous and asynchronous classes for generating images, upscaling images, and listing available image styles. It implements the core functionality for the Venice AI image generation services through a clean, typed interface matching the API specification.
- class venice_ai.resources.image.AsyncImage(client: venice_ai._resource.AsyncClientT)[source]
Provides access to asynchronous image generation, upscaling, and style listing operations.
This class manages asynchronous image operations using Venice AI’s image API. It mirrors the Image class functionality but uses non-blocking async/await operations for use in asyncio applications.
All methods return awaitable coroutines. For synchronous calls, use the Image class.
- Parameters:
client (venice_ai._async_client.AsyncVeniceClient) – The async Venice AI client instance used for making API requests.
- async generate(*, model: str, prompt: str, cfg_scale: float | None = None, embed_exif_metadata: bool | None = None, format: Literal['jpeg', 'png', 'webp'] | None = None, height: int | None = None, hide_watermark: bool | None = None, lora_strength: int | None = None, num_images: int | None = None, negative_prompt: str | None = None, return_binary: bool | None = None, safe_mode: bool | None = None, seed: int | None = None, steps: int | None = None, style_preset: str | None = None, width: int | None = None)[source]
Generate an image using Venice AI’s image generation API asynchronously.
This method creates a new image based on a text prompt using the specified model, executing the request asynchronously for use in async/await contexts. It provides comprehensive control over the image generation process with multiple parameters to customize the output.
- Parameters:
model (str) – Model ID for image generation (e.g., "venice-sd35").
prompt (str) – Text prompt describing the image to generate.
cfg_scale (Optional[float]) – Optional. Classifier Free Guidance scale (1.0-30.0). Higher values adhere more strictly to the prompt.
embed_exif_metadata (Optional[bool]) – Optional. If True, embed generation metadata in EXIF data.
format (Optional[Literal["jpeg", "png", "webp"]]) – Optional. Output image format.
height (Optional[int]) – Optional. Height of the generated image in pixels.
hide_watermark (Optional[bool]) – Optional. If True, hide Venice AI watermark from the generated image.
lora_strength (Optional[int]) – Optional. Strength of LoRA model adaptation (0-100).
num_images (Optional[int]) – Optional. Number of images to generate (typically 1-10).
negative_prompt (Optional[str]) – Optional. Text describing what to avoid in the generated image.
return_binary (Optional[bool]) – Optional. If True, return raw image bytes instead of JSON response with base64 data.
safe_mode (Optional[bool]) – Optional. If True, enable content filtering for safer outputs.
seed (Optional[int]) – Optional. Random seed for reproducible image generation results.
steps (Optional[int]) – Optional. Number of diffusion steps. Higher values generally improve quality but increase generation time.
style_preset (Optional[str]) – Optional. Style preset ID from list_styles() to apply to the generated image.
width (Optional[int]) – Optional. Width of the generated image in pixels.
- Returns:
Response containing generated image data as base64 string, or raw image bytes if return_binary is True.
- Return type:
Union[ImageResponse, bytes]
- Raises:
venice_ai.exceptions.APIError – If an API error occurs during image generation.
Example:
# client = AsyncVeniceClient()
response = await client.image.generate(
    model="venice-sd35",
    prompt="A serene landscape with mountains and a lake",
    width=1024,
    height=768,
    steps=30
)
# Process response.images[0] (base64 string)
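The example above leaves the base64 string unprocessed. A minimal sketch of turning such a string into an image file follows, using only the standard library. save_base64_image is a hypothetical helper, not part of venice_ai, and the tolerance for a data-URI prefix is a defensive assumption rather than documented API behaviour.

```python
import base64
from pathlib import Path


def save_base64_image(b64_data: str, destination: str) -> int:
    """Decode a base64 image string and write it to disk.

    Returns the number of bytes written.
    """
    # Tolerate an optional data-URI prefix such as "data:image/png;base64,".
    if b64_data.startswith("data:") and "," in b64_data:
        b64_data = b64_data.split(",", 1)[1]
    # validate=True rejects characters outside the base64 alphabet.
    raw = base64.b64decode(b64_data, validate=True)
    return Path(destination).write_bytes(raw)
```

With the response from the example above, this might be called as save_base64_image(response.images[0], "landscape.png"). When return_binary is True the method already returns raw bytes, so no decoding step is needed.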
- async get_available_styles()[source]
Retrieve the list of available image generation styles from the API asynchronously.
This method fetches the most up-to-date list of styles that can be used for image generation, such as ‘cinematic’, ‘photorealistic’, etc. It performs this operation asynchronously for use in async/await contexts.
- Returns:
An object containing a list of available image styles.
- Return type:
ImageStyleList
- Raises:
venice_ai.exceptions.APIError – If an API error occurs during the request.
Example:
# client = AsyncVeniceClient()
styles_response = await client.image.get_available_styles()
for style_name in styles_response.data:
    print(f"Available style: {style_name}")
- async list_styles()[source]
List available image style presets asynchronously for use with image generation.
This method retrieves all available style presets that can be used with the style_preset parameter in the generate() method to influence the aesthetic and artistic style of generated images. It performs this operation asynchronously.
- Returns:
A list of available image style presets with their identifiers.
- Return type:
ImageStyleList
- Raises:
venice_ai.exceptions.APIError – If an API error occurs while retrieving styles.
- async simple_generate(*, model: str, prompt: str, background: Literal['transparent', 'opaque', 'auto'] | None = None, moderation: Literal['low', 'auto'] | None = None, n: int | None = None, output_compression: int | None = None, output_format: Literal['jpeg', 'png', 'webp'] | None = None, quality: Literal['auto', 'high', 'medium', 'low', 'hd', 'standard'] | None = None, response_format: Literal['b64_json', 'url'] | None = None, size: Literal['auto', '256x256', '512x512', '1024x1024', '1536x1024', '1024x1536', '1792x1024', '1024x1792'] | None = None, style: Literal['vivid', 'natural'] | None = None, user: str | None = None)[source]
Generate an image using Venice AI’s simple image generation API asynchronously (OpenAI-compatible).
This method provides a simplified interface for image generation that’s compatible with OpenAI’s DALL-E API format, executed asynchronously for use in async/await contexts. It’s designed to be easier to use than the full generate() method while still providing essential customization options.
- Parameters:
model (str) – Model ID for image generation.
prompt (str) – Text prompt describing the image to generate.
background (Optional[Literal["transparent", "opaque", "auto"]]) – Optional. Background style for the generated image.
moderation (Optional[Literal["low", "auto"]]) – Optional. Content moderation level to apply.
n (Optional[int]) – Optional. Number of images to generate (typically 1-10).
output_compression (Optional[int]) – Optional. Output image compression level (0-100, where 100 is highest quality).
output_format (Optional[Literal["jpeg", "png", "webp"]]) – Optional. Output image format.
quality (Optional[Literal["auto", "high", "medium", "low", "hd", "standard"]]) – Optional. Image quality setting.
response_format (Optional[Literal["b64_json", "url"]]) – Optional. Format of the response data.
size (Optional[Literal["auto", "256x256", "512x512", "1024x1024", "1536x1024", "1024x1536", "1792x1024", "1024x1792"]]) – Optional. Dimensions of the generated image.
style (Optional[Literal["vivid", "natural"]]) – Optional. Style of the generated image.
user (Optional[str]) – Optional. User identifier for tracking and analytics purposes.
- Returns:
API response containing generated images as base64 data or URLs.
- Return type:
SimpleImageResponse
- Raises:
venice_ai.exceptions.APIError – If an API error occurs during image generation.
Example:
# client = AsyncVeniceClient()
response = await client.image.simple_generate(
    model="venice-sd35",
    prompt="A cute cat sitting on a windowsill",
    size="1024x1024",
    style="natural"
)
# Process response.data[0] (ImageDataItem)
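The size parameter takes a fixed set of 'WIDTHxHEIGHT' literals plus "auto". Application code that needs the numeric dimensions, for instance to lay out a preview, could split the literal as sketched below. parse_size is a hypothetical helper, and treating "auto" as "dimensions not known in advance" is an assumption.

```python
from typing import Optional, Tuple


def parse_size(size: str) -> Optional[Tuple[int, int]]:
    """Split a 'WIDTHxHEIGHT' size literal into (width, height).

    Returns None for 'auto', where the dimensions are not known up front.
    """
    if size == "auto":
        return None
    width_text, sep, height_text = size.partition("x")
    if not sep or not width_text.isdigit() or not height_text.isdigit():
        raise ValueError(f"not a WIDTHxHEIGHT size: {size!r}")
    return int(width_text), int(height_text)


for literal in ("1024x1024", "1536x1024", "auto"):
    print(literal, "->", parse_size(literal))
```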
- async upscale(*, image: str | bytes | BinaryIO, enhance: bool | None = None, enhance_creativity: float | None = None, enhance_prompt: bool | None = None, replication: float | None = None, scale: float | None = None, timeout: float | httpx.Timeout | None = None)[source]
Upscale an image using Venice AI’s image upscaling API asynchronously.
This method allows for increasing the resolution of an image while maintaining or enhancing its quality using Venice AI’s upscaling technology, in an asynchronous manner compatible with asyncio applications.
- Parameters:
image (Union[str, bytes, BinaryIO]) – Image to upscale. Can be a file path (string), raw image bytes, or a file-like object.
enhance (Optional[bool]) – Optional. Whether to enhance image quality during upscaling.
enhance_creativity (Optional[float]) – Optional. Creativity level for enhancement (0.0-1.0, where 1.0 is most creative).
enhance_prompt (Optional[bool]) – Optional. Whether to use text prompt guidance for enhancement.
replication (Optional[float]) – Optional. Replication factor for matching the original image (0.0-1.0, where 1.0 matches exactly).
scale (Optional[float]) – Optional. Scaling factor for upscaling (e.g., 2.0 for 2x upscaling).
timeout (Optional[Union[float, httpx.Timeout]]) – Optional. Request timeout configuration.
- Returns:
Raw bytes of the upscaled image.
- Return type:
bytes
- Raises:
ValueError – If image path is invalid or image type is unsupported.
TypeError – If image content type is unsupported.
venice_ai.exceptions.APIError – If an API error occurs during upscaling.
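The image parameter accepts a file path, raw bytes, or a file-like object. The reference does not show how the library normalizes these inputs internally, so the following is only one plausible sketch that mirrors the documented ValueError and TypeError cases. read_image_bytes is a hypothetical helper, not part of venice_ai.

```python
import io
from pathlib import Path
from typing import BinaryIO, Union


def read_image_bytes(image: Union[str, bytes, BinaryIO]) -> bytes:
    """Return raw bytes for a path, a bytes object, or a binary file object."""
    if isinstance(image, (bytes, bytearray)):
        return bytes(image)
    if isinstance(image, str):
        # Strings are treated as file paths, as described for the parameter.
        path = Path(image)
        if not path.is_file():
            raise ValueError(f"image path does not exist: {image}")
        return path.read_bytes()
    if hasattr(image, "read"):
        data = image.read()
        if not isinstance(data, bytes):
            raise TypeError("file object must be opened in binary mode")
        return data
    raise TypeError(f"unsupported image type: {type(image).__name__}")


# An in-memory buffer stands in for an opened image file.
data = read_image_bytes(io.BytesIO(b"fake image bytes"))
print(len(data))
```

Normalizing early in application code keeps path and type errors separate from API errors raised later by the upscaling request.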
- class venice_ai.resources.image.Image(client: venice_ai._resource.SyncClientT)[source]
Provides access to image generation, upscaling, and style listing operations.
This class manages synchronous image operations using Venice AI’s image API. It encapsulates functionality for image generation, upscaling, and style listing through a clean, typed interface that makes synchronous HTTP requests.
All methods in this class make synchronous HTTP requests. For non-blocking calls, use the AsyncImage class.
- Parameters:
client (venice_ai._client.VeniceClient) – The Venice AI client instance used for making API requests.
- generate(*, model: str, prompt: str, cfg_scale: float | None = None, embed_exif_metadata: bool | None = None, format: Literal['jpeg', 'png', 'webp'] | None = None, height: int | None = None, hide_watermark: bool | None = None, lora_strength: int | None = None, num_images: int | None = None, negative_prompt: str | None = None, return_binary: bool | None = None, safe_mode: bool | None = None, seed: int | None = None, steps: int | None = None, style_preset: str | None = None, width: int | None = None)[source]
Generate an image using Venice AI’s image generation API.
This method creates a new image based on a text prompt using the specified model. It provides comprehensive control over the image generation process with multiple parameters to customize the output.
- Parameters:
model (str) – Model ID for image generation (e.g., "venice-sd35").
prompt (str) – Text prompt describing the image to generate.
cfg_scale (Optional[float]) – Optional. Classifier Free Guidance scale (1.0-30.0). Higher values adhere more strictly to the prompt.
embed_exif_metadata (Optional[bool]) – Optional. If True, embed generation metadata in EXIF data.
format (Optional[Literal["jpeg", "png", "webp"]]) – Optional. Output image format.
height (Optional[int]) – Optional. Height of the generated image in pixels.
hide_watermark (Optional[bool]) – Optional. If True, hide Venice AI watermark from the generated image.
lora_strength (Optional[int]) – Optional. Strength of LoRA model adaptation (0-100).
num_images (Optional[int]) – Optional. Number of images to generate (typically 1-10).
negative_prompt (Optional[str]) – Optional. Text describing what to avoid in the generated image.
return_binary (Optional[bool]) – Optional. If True, return raw image bytes instead of JSON response with base64 data.
safe_mode (Optional[bool]) – Optional. If True, enable content filtering for safer outputs.
seed (Optional[int]) – Optional. Random seed for reproducible image generation results.
steps (Optional[int]) – Optional. Number of diffusion steps. Higher values generally improve quality but increase generation time.
style_preset (Optional[str]) – Optional. Style preset ID from list_styles() to apply to the generated image.
width (Optional[int]) – Optional. Width of the generated image in pixels.
- Returns:
Response containing generated image data as base64 string, or raw image bytes if return_binary is True.
- Return type:
Union[ImageResponse, bytes]
- Raises:
venice_ai.exceptions.APIError – If an API error occurs during image generation.
Example:
# client = VeniceClient()
response = client.image.generate(
    model="venice-sd35",
    prompt="A serene landscape with mountains and a lake",
    width=1024,
    height=768,
    steps=30
)
# Process response.images[0] (base64 string)
- get_available_styles()[source]
Retrieve the list of available image generation styles from the API.
This method fetches the most up-to-date list of styles that can be used for image generation, such as ‘cinematic’, ‘photorealistic’, etc.
- Returns:
An object containing a list of available image styles.
- Return type:
ImageStyleList
- Raises:
venice_ai.exceptions.APIError – If an API error occurs during the request.
Example:
# client = VeniceClient()
styles_response = client.image.get_available_styles()
for style_name in styles_response.data:
    print(f"Available style: {style_name}")
- list_styles()[source]
List available image style presets for use with image generation.
This method retrieves all available style presets that can be used with the style_preset parameter in the generate() method to influence the aesthetic and artistic style of generated images.
- Returns:
A list of available image style presets with their identifiers.
- Return type:
ImageStyleList
- Raises:
venice_ai.exceptions.APIError – If an API error occurs while retrieving styles.
- simple_generate(*, model: str, prompt: str, background: Literal['transparent', 'opaque', 'auto'] | None = None, moderation: Literal['low', 'auto'] | None = None, n: int | None = None, output_compression: int | None = None, output_format: Literal['jpeg', 'png', 'webp'] | None = None, quality: Literal['auto', 'high', 'medium', 'low', 'hd', 'standard'] | None = None, response_format: Literal['b64_json', 'url'] | None = None, size: Literal['auto', '256x256', '512x512', '1024x1024', '1536x1024', '1024x1536', '1792x1024', '1024x1792'] | None = None, style: Literal['vivid', 'natural'] | None = None, user: str | None = None)[source]
Generate an image using Venice AI’s simple image generation API (OpenAI-compatible).
This method provides a simplified interface for image generation that’s compatible with OpenAI’s DALL-E API format. It’s designed to be easier to use than the full generate() method while still providing essential customization options.
- Parameters:
model (str) – Model ID for image generation.
prompt (str) – Text prompt describing the image to generate.
background (Optional[Literal["transparent", "opaque", "auto"]]) – Optional. Background style for the generated image.
moderation (Optional[Literal["low", "auto"]]) – Optional. Content moderation level to apply.
n (Optional[int]) – Optional. Number of images to generate (typically 1-10).
output_compression (Optional[int]) – Optional. Output image compression level (0-100, where 100 is highest quality).
output_format (Optional[Literal["jpeg", "png", "webp"]]) – Optional. Output image format.
quality (Optional[Literal["auto", "high", "medium", "low", "hd", "standard"]]) – Optional. Image quality setting.
response_format (Optional[Literal["b64_json", "url"]]) – Optional. Format of the response data.
size (Optional[Literal["auto", "256x256", "512x512", "1024x1024", "1536x1024", "1024x1536", "1792x1024", "1024x1792"]]) – Optional. Dimensions of the generated image.
style (Optional[Literal["vivid", "natural"]]) – Optional. Style of the generated image.
user (Optional[str]) – Optional. User identifier for tracking and analytics purposes.
- Returns:
API response containing generated images as base64 data or URLs.
- Return type:
SimpleImageResponse
- Raises:
venice_ai.exceptions.APIError – If an API error occurs during image generation.
Example:
# client = VeniceClient()
response = client.image.simple_generate(
    model="venice-sd35",
    prompt="A cute cat sitting on a windowsill",
    size="1024x1024",
    style="natural"
)
# Process response.data[0] (ImageDataItem)
- upscale(*, image: str | bytes | BinaryIO, enhance: bool | None = None, enhance_creativity: float | None = None, enhance_prompt: bool | None = None, replication: float | None = None, scale: float | None = None, timeout: float | httpx.Timeout | None = None)[source]
Upscale an image using Venice AI’s image upscaling API.
This method allows for increasing the resolution of an image while maintaining or enhancing its quality using Venice AI’s upscaling technology.
- Parameters:
image (Union[str, bytes, BinaryIO]) – Image to upscale. Can be a file path (string), raw image bytes, or a file-like object.
enhance (Optional[bool]) – Optional. Whether to enhance image quality during upscaling.
enhance_creativity (Optional[float]) – Optional. Creativity level for enhancement (0.0-1.0, where 1.0 is most creative).
enhance_prompt (Optional[bool]) – Optional. Whether to use text prompt guidance for enhancement.
replication (Optional[float]) – Optional. Replication factor for matching the original image (0.0-1.0, where 1.0 matches exactly).
scale (Optional[float]) – Optional. Scaling factor for upscaling (e.g., 2.0 for 2x upscaling).
timeout (Optional[Union[float, httpx.Timeout]]) – Optional. Request timeout configuration.
- Returns:
Raw bytes of the upscaled image.
- Return type:
bytes
- Raises:
ValueError – If image path is invalid or image type is unsupported.
TypeError – If image content type is unsupported.
venice_ai.exceptions.APIError – If an API error occurs during upscaling.
- class venice_ai.resources.image.Image(client: venice_ai._resource.SyncClientT)[source]
Bases: APIResource
Provides access to image generation, upscaling, and style listing operations.
This class manages synchronous image operations using Venice AI’s image API. It encapsulates functionality for image generation, upscaling, and style listing through a clean, typed interface that makes synchronous HTTP requests.
All methods in this class make synchronous HTTP requests. For non-blocking calls, use the AsyncImage class.
- Parameters:
client (venice_ai._client.VeniceClient) – The Venice AI client instance used for making API requests.
- generate(*, model: str, prompt: str, cfg_scale: float | None = None, embed_exif_metadata: bool | None = None, format: Literal['jpeg', 'png', 'webp'] | None = None, height: int | None = None, hide_watermark: bool | None = None, lora_strength: int | None = None, num_images: int | None = None, negative_prompt: str | None = None, return_binary: bool | None = None, safe_mode: bool | None = None, seed: int | None = None, steps: int | None = None, style_preset: str | None = None, width: int | None = None)[source]
Generate an image using Venice AI’s image generation API.
This method creates a new image based on a text prompt using the specified model. It provides comprehensive control over the image generation process with multiple parameters to customize the output.
- Parameters:
model (str) – Model ID for image generation (e.g., "venice-sd35").
prompt (str) – Text prompt describing the image to generate.
cfg_scale (Optional[float]) – Optional. Classifier Free Guidance scale (1.0-30.0). Higher values adhere more strictly to the prompt.
embed_exif_metadata (Optional[bool]) – Optional. If True, embed generation metadata in EXIF data.
format (Optional[Literal["jpeg", "png", "webp"]]) – Optional. Output image format.
height (Optional[int]) – Optional. Height of the generated image in pixels.
hide_watermark (Optional[bool]) – Optional. If True, hide the Venice AI watermark from the generated image.
lora_strength (Optional[int]) – Optional. Strength of LoRA model adaptation (0-100).
num_images (Optional[int]) – Optional. Number of images to generate (typically 1-10).
negative_prompt (Optional[str]) – Optional. Text describing what to avoid in the generated image.
return_binary (Optional[bool]) – Optional. If True, return raw image bytes instead of a JSON response with base64 data.
safe_mode (Optional[bool]) – Optional. If True, enable content filtering for safer outputs.
seed (Optional[int]) – Optional. Random seed for reproducible image generation results.
steps (Optional[int]) – Optional. Number of diffusion steps. Higher values generally improve quality but increase generation time.
style_preset (Optional[str]) – Optional. Style preset ID from list_styles() to apply to the generated image.
width (Optional[int]) – Optional. Width of the generated image in pixels.
- Returns:
Response containing generated image data as a base64 string, or raw image bytes if return_binary is True.
- Return type:
Union[ImageResponse, bytes]
- Raises:
venice_ai.exceptions.APIError – If an API error occurs during image generation.
Example:
# client = VeniceClient()
response = client.image.generate(
    model="venice-sd35",
    prompt="A serene landscape with mountains and a lake",
    width=1024,
    height=768,
    steps=30
)
# Process response.images[0] (base64 string)
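The generate() example leaves the processing of response.images[0] open. The following is a minimal sketch of one way to turn that base64 string into a file on disk, using only the standard library. The helper name save_base64_image and the output path are illustrative assumptions and are not part of venice_ai.

```python
import base64
from pathlib import Path


def save_base64_image(b64_data: str, dest: str) -> int:
    """Decode a base64-encoded image string and write it to dest.

    Hypothetical helper (not part of venice_ai). Returns the number of
    bytes written.
    """
    raw = base64.b64decode(b64_data)
    Path(dest).write_bytes(raw)
    return len(raw)
```

With a real response this might be called as save_base64_image(response.images[0], "landscape.png"). If return_binary is True, the method already returns bytes and no decoding step is needed.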
- list_styles()[source]
List available image style presets for use with image generation.
This method retrieves all available style presets that can be used with the style_preset parameter of the generate() method to influence the aesthetic and artistic style of generated images.
- Returns:
A list of available image style presets with their identifiers.
- Return type:
ImageStyleList
- Raises:
venice_ai.exceptions.APIError – If an API error occurs while retrieving styles.
- simple_generate(*, model: str, prompt: str, background: Literal['transparent', 'opaque', 'auto'] | None = None, moderation: Literal['low', 'auto'] | None = None, n: int | None = None, output_compression: int | None = None, output_format: Literal['jpeg', 'png', 'webp'] | None = None, quality: Literal['auto', 'high', 'medium', 'low', 'hd', 'standard'] | None = None, response_format: Literal['b64_json', 'url'] | None = None, size: Literal['auto', '256x256', '512x512', '1024x1024', '1536x1024', '1024x1536', '1792x1024', '1024x1792'] | None = None, style: Literal['vivid', 'natural'] | None = None, user: str | None = None)[source]
Generate an image using Venice AI’s simple image generation API (OpenAI-compatible).
This method provides a simplified interface for image generation that’s compatible with OpenAI’s DALL-E API format. It’s designed to be easier to use than the full generate() method while still providing essential customization options.
- Parameters:
model (str) – Model ID for image generation.
prompt (str) – Text prompt describing the image to generate.
background (Optional[Literal["transparent", "opaque", "auto"]]) – Optional. Background style for the generated image.
moderation (Optional[Literal["low", "auto"]]) – Optional. Content moderation level to apply.
n (Optional[int]) – Optional. Number of images to generate (typically 1-10).
output_compression (Optional[int]) – Optional. Output image compression level (0-100, where 100 is highest quality).
output_format (Optional[Literal["jpeg", "png", "webp"]]) – Optional. Output image format.
quality (Optional[Literal["auto", "high", "medium", "low", "hd", "standard"]]) – Optional. Image quality setting.
response_format (Optional[Literal["b64_json", "url"]]) – Optional. Format of the response data.
size (Optional[Literal["auto", "256x256", "512x512", "1024x1024", "1536x1024", "1024x1536", "1792x1024", "1024x1792"]]) – Optional. Dimensions of the generated image.
style (Optional[Literal["vivid", "natural"]]) – Optional. Style of the generated image.
user (Optional[str]) – Optional. User identifier for tracking and analytics purposes.
- Returns:
API response containing generated images as base64 data or URLs.
- Return type:
SimpleImageResponse
- Raises:
venice_ai.exceptions.APIError – If an API error occurs during image generation.
Example:
# client = VeniceClient()
response = client.image.simple_generate(
    model="venice-sd35",
    prompt="A cute cat sitting on a windowsill",
    size="1024x1024",
    style="natural"
)
# Process response.data[0] (ImageDataItem)
- upscale(*, image: str | bytes | BinaryIO, enhance: bool | None = None, enhance_creativity: float | None = None, enhance_prompt: bool | None = None, replication: float | None = None, scale: float | None = None, timeout: float | httpx.Timeout | None = None)[source]
Upscale an image using Venice AI’s image upscaling API.
This method allows for increasing the resolution of an image while maintaining or enhancing its quality using Venice AI’s upscaling technology.
- Parameters:
image (Union[str, bytes, BinaryIO]) – Image to upscale. Can be a file path (string), raw image bytes, or a file-like object.
enhance (Optional[bool]) – Optional. Whether to enhance image quality during upscaling.
enhance_creativity (Optional[float]) – Optional. Creativity level for enhancement (0.0-1.0, where 1.0 is most creative).
enhance_prompt (Optional[bool]) – Optional. Whether to use text prompt guidance for enhancement.
replication (Optional[float]) – Optional. Replication factor for matching the original image (0.0-1.0, where 1.0 matches exactly).
scale (Optional[float]) – Optional. Scaling factor for upscaling (e.g., 2.0 for 2x upscaling).
timeout (Optional[Union[float, httpx.Timeout]]) – Optional. Request timeout configuration.
- Returns:
Raw bytes of the upscaled image.
- Return type:
bytes
- Raises:
ValueError – If image path is invalid or image type is unsupported.
TypeError – If image content type is unsupported.
venice_ai.exceptions.APIError – If an API error occurs during upscaling.
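Since upscale() is documented as returning raw bytes, a common follow-up is writing them to disk. The sketch below assumes an object that behaves like client.image (it only needs an upscale(image=..., scale=...) method returning bytes); the helper name upscale_to_file is hypothetical and not part of venice_ai.

```python
from pathlib import Path
from typing import Any


def upscale_to_file(image_resource: Any, src: str, dest: str, scale: float = 2.0) -> int:
    """Upscale the image at src and write the returned bytes to dest.

    image_resource is expected to behave like client.image, i.e. expose
    upscale(image=..., scale=...) returning raw bytes. Hypothetical helper;
    returns the number of bytes written.
    """
    if not Path(src).is_file():
        # Fail early, using the exception type documented for invalid paths
        raise ValueError(f"image path does not exist: {src}")
    data = image_resource.upscale(image=src, scale=scale)
    return Path(dest).write_bytes(data)
```

A call such as upscale_to_file(client.image, "photo.png", "photo_2x.png") keeps the path check and the write in one place. The early check is a convenience of this sketch, not a description of how the library validates paths.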
- class venice_ai.resources.image.AsyncImage(client: venice_ai._resource.AsyncClientT)[source]
Bases: AsyncAPIResource
Provides access to asynchronous image generation, upscaling, and style listing operations.
This class manages asynchronous image operations using Venice AI’s image API. It mirrors the Image class functionality but uses non-blocking async/await operations for use in asyncio applications.
All methods return awaitable coroutines. For synchronous calls, use the Image class.
- Parameters:
client (venice_ai._async_client.AsyncVeniceClient) – The async Venice AI client instance used for making API requests.
- async generate(*, model: str, prompt: str, cfg_scale: float | None = None, embed_exif_metadata: bool | None = None, format: Literal['jpeg', 'png', 'webp'] | None = None, height: int | None = None, hide_watermark: bool | None = None, lora_strength: int | None = None, num_images: int | None = None, negative_prompt: str | None = None, return_binary: bool | None = None, safe_mode: bool | None = None, seed: int | None = None, steps: int | None = None, style_preset: str | None = None, width: int | None = None)[source]
Generate an image using Venice AI’s image generation API asynchronously.
This method creates a new image based on a text prompt using the specified model, executing the request asynchronously for use in async/await contexts. It provides comprehensive control over the image generation process with multiple parameters to customize the output.
- Parameters:
model (str) – Model ID for image generation (e.g., "venice-sd35").
prompt (str) – Text prompt describing the image to generate.
cfg_scale (Optional[float]) – Optional. Classifier Free Guidance scale (1.0-30.0). Higher values adhere more strictly to the prompt.
embed_exif_metadata (Optional[bool]) – Optional. If True, embed generation metadata in EXIF data.
format (Optional[Literal["jpeg", "png", "webp"]]) – Optional. Output image format.
height (Optional[int]) – Optional. Height of the generated image in pixels.
hide_watermark (Optional[bool]) – Optional. If True, hide the Venice AI watermark from the generated image.
lora_strength (Optional[int]) – Optional. Strength of LoRA model adaptation (0-100).
num_images (Optional[int]) – Optional. Number of images to generate (typically 1-10).
negative_prompt (Optional[str]) – Optional. Text describing what to avoid in the generated image.
return_binary (Optional[bool]) – Optional. If True, return raw image bytes instead of a JSON response with base64 data.
safe_mode (Optional[bool]) – Optional. If True, enable content filtering for safer outputs.
seed (Optional[int]) – Optional. Random seed for reproducible image generation results.
steps (Optional[int]) – Optional. Number of diffusion steps. Higher values generally improve quality but increase generation time.
style_preset (Optional[str]) – Optional. Style preset ID from list_styles() to apply to the generated image.
width (Optional[int]) – Optional. Width of the generated image in pixels.
- Returns:
Response containing generated image data as a base64 string, or raw image bytes if return_binary is True.
- Return type:
Union[ImageResponse, bytes]
- Raises:
venice_ai.exceptions.APIError – If an API error occurs during image generation.
Example:
# client = AsyncVeniceClient()
response = await client.image.generate(
    model="venice-sd35",
    prompt="A serene landscape with mountains and a lake",
    width=1024,
    height=768,
    steps=30
)
# Process response.images[0] (base64 string)
- async list_styles()[source]
List available image style presets asynchronously for use with image generation.
This method retrieves all available style presets that can be used with the style_preset parameter of the generate() method to influence the aesthetic and artistic style of generated images. It performs this operation asynchronously.
- Returns:
A list of available image style presets with their identifiers.
- Return type:
ImageStyleList
- Raises:
venice_ai.exceptions.APIError – If an API error occurs while retrieving styles.
- async simple_generate(*, model: str, prompt: str, background: Literal['transparent', 'opaque', 'auto'] | None = None, moderation: Literal['low', 'auto'] | None = None, n: int | None = None, output_compression: int | None = None, output_format: Literal['jpeg', 'png', 'webp'] | None = None, quality: Literal['auto', 'high', 'medium', 'low', 'hd', 'standard'] | None = None, response_format: Literal['b64_json', 'url'] | None = None, size: Literal['auto', '256x256', '512x512', '1024x1024', '1536x1024', '1024x1536', '1792x1024', '1024x1792'] | None = None, style: Literal['vivid', 'natural'] | None = None, user: str | None = None)[source]
Generate an image using Venice AI’s simple image generation API asynchronously (OpenAI-compatible).
This method provides a simplified interface for image generation that’s compatible with OpenAI’s DALL-E API format, executed asynchronously for use in async/await contexts. It’s designed to be easier to use than the full generate() method while still providing essential customization options.
- Parameters:
model (str) – Model ID for image generation.
prompt (str) – Text prompt describing the image to generate.
background (Optional[Literal["transparent", "opaque", "auto"]]) – Optional. Background style for the generated image.
moderation (Optional[Literal["low", "auto"]]) – Optional. Content moderation level to apply.
n (Optional[int]) – Optional. Number of images to generate (typically 1-10).
output_compression (Optional[int]) – Optional. Output image compression level (0-100, where 100 is highest quality).
output_format (Optional[Literal["jpeg", "png", "webp"]]) – Optional. Output image format.
quality (Optional[Literal["auto", "high", "medium", "low", "hd", "standard"]]) – Optional. Image quality setting.
response_format (Optional[Literal["b64_json", "url"]]) – Optional. Format of the response data.
size (Optional[Literal["auto", "256x256", "512x512", "1024x1024", "1536x1024", "1024x1536", "1792x1024", "1024x1792"]]) – Optional. Dimensions of the generated image.
style (Optional[Literal["vivid", "natural"]]) – Optional. Style of the generated image.
user (Optional[str]) – Optional. User identifier for tracking and analytics purposes.
- Returns:
API response containing generated images as base64 data or URLs.
- Return type:
SimpleImageResponse
- Raises:
venice_ai.exceptions.APIError – If an API error occurs during image generation.
Example:
# client = AsyncVeniceClient()
response = await client.image.simple_generate(
    model="venice-sd35",
    prompt="A cute cat sitting on a windowsill",
    size="1024x1024",
    style="natural"
)
# Process response.data[0] (ImageDataItem)
- async upscale(*, image: str | bytes | BinaryIO, enhance: bool | None = None, enhance_creativity: float | None = None, enhance_prompt: bool | None = None, replication: float | None = None, scale: float | None = None, timeout: float | httpx.Timeout | None = None)[source]
Upscale an image using Venice AI’s image upscaling API asynchronously.
This method allows for increasing the resolution of an image while maintaining or enhancing its quality using Venice AI’s upscaling technology, in an asynchronous manner compatible with asyncio applications.
- Parameters:
image (Union[str, bytes, BinaryIO]) – Image to upscale. Can be a file path (string), raw image bytes, or a file-like object.
enhance (Optional[bool]) – Optional. Whether to enhance image quality during upscaling.
enhance_creativity (Optional[float]) – Optional. Creativity level for enhancement (0.0-1.0, where 1.0 is most creative).
enhance_prompt (Optional[bool]) – Optional. Whether to use text prompt guidance for enhancement.
replication (Optional[float]) – Optional. Replication factor for matching the original image (0.0-1.0, where 1.0 matches exactly).
scale (Optional[float]) – Optional. Scaling factor for upscaling (e.g., 2.0 for 2x upscaling).
timeout (Optional[Union[float, httpx.Timeout]]) – Optional. Request timeout configuration.
- Returns:
Raw bytes of the upscaled image.
- Return type:
bytes
- Raises:
ValueError – If image path is invalid or image type is unsupported.
TypeError – If image content type is unsupported.
venice_ai.exceptions.APIError – If an API error occurs during upscaling.
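Because every AsyncImage method returns an awaitable, several requests can be issued concurrently. The sketch below shows this with asyncio.gather; it assumes an object that behaves like AsyncVeniceClient().image, and the helper name upscale_many is hypothetical rather than part of venice_ai.

```python
import asyncio
from typing import Any, List, Sequence


async def upscale_many(
    image_resource: Any, images: Sequence[bytes], scale: float = 2.0
) -> List[bytes]:
    """Upscale several images concurrently and return results in input order.

    image_resource is expected to expose an async upscale(image=..., scale=...)
    method returning bytes. Hypothetical helper, not part of venice_ai.
    """
    tasks = [image_resource.upscale(image=img, scale=scale) for img in images]
    # asyncio.gather returns results in the same order as its arguments
    return await asyncio.gather(*tasks)
```

Concurrent requests still count against the API key's rate limits, so a large batch may need to be split into smaller groups.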
API Keys Resources¶
Resource module for interacting with the Venice API keys endpoints.
This module provides both synchronous and asynchronous resource classes for managing API keys. API keys are used for authentication and authorization when making requests to the Venice API. They control access to various Venice API features and endpoints, and are subject to rate limits that govern the number of requests that can be made within a specific time period.
- Classes:
ApiKeys: Synchronous client for API key management
AsyncApiKeys: Asynchronous client for API key management
- class venice_ai.resources.api_keys.ApiKeys(client: venice_ai._resource.SyncClientT)[source]
Provides access to API key management operations.
This class implements the synchronous interface for API key management, including creating, listing, and deleting API keys, and retrieving rate limit information. It inherits from APIResource, which handles the underlying HTTP requests.
- Parameters:
_client (VeniceClient) – The Venice client instance used for making API requests.
Example
from venice_ai import VeniceClient
from venice_ai.types.api_keys import ApiKeyCreateRequest

client = VeniceClient()

# List existing API keys
keys = client.api_keys.list(limit=10)
for key in keys:
    print(f"Key ID: {key.id}, Description: {key.description}")

# Create a new API key
create_request = ApiKeyCreateRequest(
    description="My Test Key",
    apiKeyType="INFERENCE"
)
new_key = client.api_keys.create(api_key_request=create_request)
print(f"Created key: {new_key.apiKey}")  # Only shown on creation
- Parameters:
client (typing.TypeVar(SyncClientT, bound=VeniceClient))
- create(*, api_key_request: venice_ai.types.api_keys.ApiKeyCreateRequest)[source]
Creates a new API key.
Creates a new API key with the specified parameters. The created API key will be returned only once in the response and cannot be retrieved later, so it should be securely stored immediately.
- Parameters:
api_key_request (ApiKeyCreateRequest) – Request object containing API key configuration. Must include at minimum a description and apiKeyType. The request can contain:
  - description (str): Human-readable description of the API key
  - apiKeyType (str): Type of API key (e.g., “INFERENCE”, “ADMIN”)
  - expiresAt (Optional[str]): ISO 8601 timestamp when key expires
  - consumptionLimit (Optional[int]): Maximum usage limit for the key
- Returns:
Response containing the newly created API key details, including the secret key value (only returned once), key ID, creation timestamp, and other metadata.
- Return type:
ApiKey
- Raises:
venice_ai.exceptions.AuthenticationError – If authentication fails.
venice_ai.exceptions.APIError – If the API returns an error, such as when maximum API key limit is reached or invalid parameters are provided.
venice_ai.exceptions.APIConnectionError – If there’s an issue connecting to the API.
Example
from venice_ai.types.api_keys import ApiKeyCreateRequest

# Create a basic API key
create_request = ApiKeyCreateRequest(
    description="My Test Key",
    apiKeyType="INFERENCE"
)
new_key = client.api_keys.create(api_key_request=create_request)
print(f"Created key ID: {new_key.id}")
print(f"API Key: {new_key.apiKey}")  # Store this securely!

# Create a key with expiration and limits
advanced_request = ApiKeyCreateRequest(
    description="Limited Production Key",
    apiKeyType="INFERENCE",
    expiresAt="2024-12-31T23:59:59Z",
    consumptionLimit=10000
)
limited_key = client.api_keys.create(api_key_request=advanced_request)
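The expiresAt field takes an ISO 8601 timestamp, and hard-coded dates go stale quickly. A minimal sketch of computing a relative expiry in the same "Z"-suffixed format used in the create() example follows; the helper name expires_in is hypothetical and not part of venice_ai.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def expires_in(days: int, now: Optional[datetime] = None) -> str:
    """Return an ISO 8601 UTC timestamp `days` after `now`.

    Hypothetical helper. `now` must be timezone-aware if given; it defaults
    to the current UTC time.
    """
    base = now if now is not None else datetime.now(timezone.utc)
    expiry = base.astimezone(timezone.utc) + timedelta(days=days)
    # Format with a literal "Z" suffix, matching the documented example
    return expiry.strftime("%Y-%m-%dT%H:%M:%SZ")
```

The result can be passed directly, for example expiresAt=expires_in(90) when building an ApiKeyCreateRequest.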
- create_web3_key(*, web3_key_request: venice_ai.types.api_keys.ApiKeyGenerateWeb3KeyCreateRequest)[source]
Creates a new Web3 API key.
Creates a new API key authenticated via a Web3 signature.
- Parameters:
web3_key_request (ApiKeyGenerateWeb3KeyCreateRequest) – Request body containing Web3 authentication details (such as web3_network_id, web3_address, and signature) and API key parameters.
- Returns:
Response containing the newly created API key details.
- Return type:
ApiKeyGenerateWeb3KeyCreateResponse
- Raises:
venice_ai.exceptions.APIError – If the API returns an error.
venice_ai.exceptions.APIConnectionError – If there’s an issue connecting to the API.
- delete(*, api_key_id: str)[source]
Deletes an API key.
Permanently deletes the specified API key. Once deleted, the API key can no longer be used to authenticate requests and this action cannot be undone. Use with caution in production environments.
- Parameters:
api_key_id (str) – Unique identifier of the API key to delete. This is the key’s ID (not the secret key value) as returned by the create or list operations.
- Returns:
Response indicating the result of the deletion operation, typically containing a success flag and deletion confirmation message.
- Return type:
Dict[str, Any]
- Raises:
venice_ai.exceptions.AuthenticationError – If authentication fails.
venice_ai.exceptions.APIError – If the API returns an error, such as when the API key ID does not exist or belongs to another account.
venice_ai.exceptions.APIConnectionError – If there’s an issue connecting to the API.
Example
# Delete an API key (use with caution)
result = client.api_keys.delete(api_key_id="key_123456789")
print(f"Deletion result: {result}")

# Safe deletion pattern
keys = client.api_keys.list()
test_keys = [k for k in keys if "test" in k.description.lower()]
for test_key in test_keys:
    client.api_keys.delete(api_key_id=test_key.id)
    print(f"Deleted test key: {test_key.id}")
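Because deletion cannot be undone, it can help to separate choosing the keys from deleting them, so the selection can be reviewed first. The sketch below is one way to do that; the helper name select_keys is hypothetical, and it assumes only that each key object has a description attribute, which may be empty or missing.

```python
from typing import Any, Iterable, List


def select_keys(keys: Iterable[Any], marker: str) -> List[Any]:
    """Return the keys whose description contains marker (case-insensitive).

    Hypothetical helper. Keys with a missing or empty description are
    skipped instead of raising an AttributeError.
    """
    needle = marker.lower()
    return [
        key
        for key in keys
        if needle in (getattr(key, "description", None) or "").lower()
    ]
```

The returned list can be printed and checked before passing each key's id to delete().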
- get_rate_limit_logs(*, api_key_id: str | None = None, start_date: str | None = None, end_date: str | None = None, limit: int | None = None, page: int | None = None)[source]
Retrieves rate limit logs for API keys.
Returns a history of rate limit events, such as when rate limits were reset, exceeded, or modified. This can be useful for understanding API usage patterns, diagnosing rate limit issues, and optimizing request timing.
- Parameters:
api_key_id (Optional[str]) – Specific API key ID to get logs for. If not provided, returns logs for the current API key.
start_date (Optional[str]) – Start date for log retrieval in ISO 8601 format (e.g., “2024-01-01T00:00:00Z”). If not provided, uses a default lookback period.
end_date (Optional[str]) – End date for log retrieval in ISO 8601 format (e.g., “2024-01-31T23:59:59Z”). If not provided, uses current time.
limit (Optional[int]) – Maximum number of log entries to return per page.
page (Optional[int]) – Page number for pagination (1-based indexing).
- Returns:
A list of rate limit log entries with timestamps, event types, and related metadata.
- Return type:
RateLimitLogList
- Raises:
venice_ai.exceptions.AuthenticationError – If authentication fails.
venice_ai.exceptions.APIError – If the API returns an error.
venice_ai.exceptions.APIConnectionError – If there’s an issue connecting to the API.
Example
# Get recent rate limit logs
logs = client.api_keys.get_rate_limit_logs(limit=10)
for log_entry in logs:
    print(f"Event: {log_entry.event_type} at {log_entry.timestamp}")

# Get logs for a specific date range
logs = client.api_keys.get_rate_limit_logs(
    start_date="2024-01-01T00:00:00Z",
    end_date="2024-01-31T23:59:59Z",
    limit=50
)
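The page and limit parameters imply a loop when more than one page of logs is needed. The following is a rough sketch of such a loop. It assumes that each call returns a sized, iterable page and that a page shorter than limit is the last one; neither assumption is confirmed by this reference, and the helper name iter_pages is hypothetical.

```python
from typing import Any, Callable, Iterator, Sequence


def iter_pages(
    fetch: Callable[..., Sequence[Any]], limit: int = 50, max_pages: int = 100
) -> Iterator[Any]:
    """Yield entries from a paginated call until a short or empty page is seen.

    fetch is called as fetch(page=..., limit=...). Assumes a page shorter
    than `limit` is the last one; max_pages guards against endless loops.
    """
    for page in range(1, max_pages + 1):  # pages use 1-based indexing
        entries = fetch(page=page, limit=limit)
        yield from entries
        if len(entries) < limit:
            break
```

It might be used as: for entry in iter_pages(client.api_keys.get_rate_limit_logs, limit=50). The same pattern applies to any method that accepts page and limit keyword arguments.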
- get_rate_limits()[source]
Retrieves rate limit information for the current API key.
Returns information about the rate limits applied to the current API key, including the limits per minute, hour, day, and month, as well as the current usage against those limits.
- Returns:
Rate limit information, including limits and current usage.
- Return type:
RateLimitInfo
- Raises:
venice_ai.exceptions.AuthenticationError – If authentication fails.
venice_ai.exceptions.APIError – If the API returns an error.
venice_ai.exceptions.APIConnectionError – If there’s an issue connecting to the API.
- get_web3_token()[source]
Retrieves a token for Web3 API key generation.
This token is required for the subsequent POST request to create a Web3 API key.
- Returns:
Response containing the token required for Web3 key generation.
- Return type:
ApiKeyGenerateWeb3KeyGetResponse
- Raises:
venice_ai.exceptions.APIError – If the API returns an error.
venice_ai.exceptions.APIConnectionError – If there’s an issue connecting to the API.
- list(*, page: int | None = None, limit: int | None = None)[source]
Lists API keys for the authenticated account, with optional pagination.
Retrieves a list of API keys associated with the current account. This includes active and inactive API keys. Supports pagination for managing large numbers of API keys.
- Parameters:
- Returns:
A list of API key objects containing metadata such as ID, description, creation date, and status. Note that the actual API key values are not included in the response for security reasons.
- Return type:
List[ApiKey]
- Raises:
venice_ai.exceptions.AuthenticationError – If authentication fails.
venice_ai.exceptions.APIError – If the API returns an error.
venice_ai.exceptions.APIConnectionError – If there’s an issue connecting to the API.
Example
# List all API keys
all_keys = client.api_keys.list()

# List with pagination
page_keys = client.api_keys.list(page=1, limit=5)
for key in page_keys:
    print(f"Key ID: {key.id}, Description: {key.description}")
- retrieve(*, api_key_id: str)[source]
Retrieves a specific API key by ID.
Fetches the details of a specific API key using its unique identifier. Note that the actual API key value is not included in the response for security reasons.
- Parameters:
api_key_id (str) – Unique identifier of the API key to retrieve. This is the key’s ID (not the secret key value) as returned by the create or list operations.
- Returns:
API key details including metadata such as description, creation date, expiration, usage statistics, and other configuration information.
- Return type:
Dict[str, Any]
- Raises:
venice_ai.exceptions.AuthenticationError – If authentication fails.
venice_ai.exceptions.NotFoundError – If the API key ID does not exist.
venice_ai.exceptions.APIError – If the API returns an error.
venice_ai.exceptions.APIConnectionError – If there’s an issue connecting to the API.
Example
# Retrieve a specific API key
key_details = client.api_keys.retrieve(api_key_id="key_123456789")
print(f"Key description: {key_details['description']}")
print(f"Created at: {key_details['createdAt']}")
- class venice_ai.resources.api_keys.AsyncApiKeys(client: venice_ai._resource.AsyncClientT)[source]
Provides access to API key management operations asynchronously.
This class implements the asynchronous interface for API key management, including creating, listing, and deleting API keys, and retrieving rate limit information. It inherits from AsyncAPIResource, which handles the underlying HTTP requests. All methods return coroutines that the caller must await.
- Parameters:
_client (AsyncVeniceClient) – The AsyncVeniceClient instance used for making asynchronous API requests.
Example
import asyncio
from venice_ai import AsyncVeniceClient
from venice_ai.types.api_keys import ApiKeyCreateRequest

async def manage_api_keys():
    client = AsyncVeniceClient()

    # List existing API keys
    keys = await client.api_keys.list(limit=10)
    for key in keys:
        print(f"Key ID: {key.id}, Description: {key.description}")

    # Create a new API key
    create_request = ApiKeyCreateRequest(
        description="My Async Test Key",
        apiKeyType="INFERENCE"
    )
    new_key = await client.api_keys.create(api_key_request=create_request)
    print(f"Created key: {new_key.apiKey}")  # Only shown on creation

asyncio.run(manage_api_keys())
- Parameters:
client (typing.TypeVar(AsyncClientT, bound=AsyncVeniceClient))
- async create(*, api_key_request: venice_ai.types.api_keys.ApiKeyCreateRequest)[source]
Creates a new API key asynchronously.
Creates a new API key with the specified parameters. The created API key will be returned only once in the response and cannot be retrieved later, so it should be securely stored immediately.
- Parameters:
api_key_request (ApiKeyCreateRequest) – Request object containing API key configuration. Must include at minimum a description and apiKeyType. The request can contain:
  - description (str): Human-readable description of the API key
  - apiKeyType (str): Type of API key (e.g., “INFERENCE”, “ADMIN”)
  - expiresAt (Optional[str]): ISO 8601 timestamp when key expires
  - consumptionLimit (Optional[ConsumptionLimit]): Usage limits for the key
- Returns:
Response containing the newly created API key details, including the secret key value (only returned once), key ID, creation timestamp, and other metadata.
- Return type:
ApiKey
- Raises:
venice_ai.exceptions.AuthenticationError – If authentication fails.
venice_ai.exceptions.APIError – If the API returns an error, such as when maximum API key limit is reached or invalid parameters are provided.
venice_ai.exceptions.APIConnectionError – If there’s an issue connecting to the API.
Example
from venice_ai.types.api_keys import ApiKeyCreateRequest

# Create a basic API key asynchronously
create_request = ApiKeyCreateRequest(
    description="My Async Test Key",
    apiKeyType="INFERENCE"
)
new_key = await client.api_keys.create(api_key_request=create_request)
print(f"Created key ID: {new_key.id}")
print(f"API Key: {new_key.apiKey}")  # Store this securely!
- async create_web3_key(*, web3_key_request: venice_ai.types.api_keys.ApiKeyGenerateWeb3KeyCreateRequest)[source]
Creates a new Web3 API key asynchronously.
Creates a new API key authenticated via a Web3 signature.
- Parameters:
web3_key_request (ApiKeyGenerateWeb3KeyCreateRequest) – Request body containing Web3 authentication details (such as web3_network_id, web3_address, and signature) and API key parameters.
- Returns:
Response containing the newly created API key details.
- Return type:
ApiKeyGenerateWeb3KeyCreateResponse
- Raises:
venice_ai.exceptions.APIError – If the API returns an error.
venice_ai.exceptions.APIConnectionError – If there’s an issue connecting to the API.
- async delete(*, api_key_id: str)[source]
Deletes an API key asynchronously.
Permanently deletes the specified API key. Once deleted, the API key can no longer be used to authenticate requests and this action cannot be undone.
- Parameters:
api_key_id (str) – ID of the API key to delete. This is the key’s unique identifier, not the secret key value.
- Returns:
Response indicating the result of the operation, typically containing a success flag and deletion confirmation.
- Return type:
Dict[str, Any]
- Raises:
venice_ai.exceptions.AuthenticationError – If authentication fails.
venice_ai.exceptions.APIError – If the API returns an error, such as when the API key ID does not exist or belongs to another account.
venice_ai.exceptions.APIConnectionError – If there’s an issue connecting to the API.
Example
# Delete an API key asynchronously (use with caution)
result = await client.api_keys.delete(api_key_id="key_123456789")
print(f"Deletion result: {result}")

# Safe deletion pattern
keys = await client.api_keys.list()
test_keys = [k for k in keys if "test" in k.description.lower()]
for test_key in test_keys:
    await client.api_keys.delete(api_key_id=test_key.id)
    print(f"Deleted test key: {test_key.id}")
- async get_rate_limit_logs(*, api_key_id: str | None = None, start_date: str | None = None, end_date: str | None = None, limit: int | None = None, page: int | None = None)[source]
Retrieves rate limit logs for API keys asynchronously.
Returns a history of rate limit events, such as when rate limits were reset, exceeded, or modified. This can be useful for understanding API usage patterns, diagnosing rate limit issues, and optimizing request timing.
- Parameters:
api_key_id (Optional[str]) – Specific API key ID to get logs for. If not provided, returns logs for the current API key.
start_date (Optional[str]) – Start date for log retrieval in ISO 8601 format (e.g., “2024-01-01T00:00:00Z”). If not provided, uses a default lookback period.
end_date (Optional[str]) – End date for log retrieval in ISO 8601 format (e.g., “2024-01-31T23:59:59Z”). If not provided, uses current time.
limit (Optional[int]) – Maximum number of log entries to return per page.
page (Optional[int]) – Page number for pagination (1-based indexing).
- Returns:
A list of rate limit log entries with timestamps, event types, and related metadata.
- Return type:
RateLimitLogList
- Raises:
venice_ai.exceptions.AuthenticationError – If authentication fails.
venice_ai.exceptions.APIError – If the API returns an error.
venice_ai.exceptions.APIConnectionError – If there’s an issue connecting to the API.
Example
# Get recent rate limit logs asynchronously
logs = await client.api_keys.get_rate_limit_logs(limit=10)
for log_entry in logs:
    print(f"Event: {log_entry.event_type} at {log_entry.timestamp}")

# Get logs for a specific date range
logs = await client.api_keys.get_rate_limit_logs(
    start_date="2024-01-01T00:00:00Z",
    end_date="2024-01-31T23:59:59Z",
    limit=50
)
- async get_rate_limits()[source]
Retrieves rate limit information for the current API key asynchronously.
Returns information about the rate limits applied to the current API key, including the limits per minute, hour, day, and month, as well as the current usage against those limits.
- Returns:
Rate limit information, including limits and current usage.
- Return type:
RateLimitInfo
- Raises:
venice_ai.exceptions.AuthenticationError – If authentication fails.
venice_ai.exceptions.APIError – If the API returns an error.
venice_ai.exceptions.APIConnectionError – If there’s an issue connecting to the API.
- async get_web3_token()[source]
Retrieves a token for Web3 API key generation asynchronously.
This token is required for the subsequent POST request to create a Web3 API key.
- Returns:
Response containing the token required for Web3 key generation.
- Return type:
ApiKeyGenerateWeb3KeyGetResponse
- Raises:
venice_ai.exceptions.APIError – If the API returns an error.
venice_ai.exceptions.APIConnectionError – If there’s an issue connecting to the API.
- async list(*, page: int | None = None, limit: int | None = None)[source]
Lists API keys for the authenticated account asynchronously, with optional pagination.
Retrieves a list of API keys associated with the current account. This includes active and inactive API keys. Supports pagination.
- Parameters:
- Returns:
A list of API key objects containing metadata such as ID, description, creation date, and status. Note that the actual API key values are not included in the response for security reasons.
- Return type:
List[ApiKey]
- Raises:
venice_ai.exceptions.AuthenticationError – If authentication fails.
venice_ai.exceptions.APIError – If the API returns an error.
venice_ai.exceptions.APIConnectionError – If there’s an issue connecting to the API.
Example
# List all API keys asynchronously
all_keys = await client.api_keys.list()

# List with pagination
page_keys = await client.api_keys.list(page=1, limit=5)
for key in page_keys:
    print(f"Key ID: {key.id}, Description: {key.description}")
- async retrieve(*, api_key_id: str)[source]
Retrieves a specific API key by ID asynchronously.
Fetches the details of a specific API key using its unique identifier. Note that the actual API key value is not included in the response for security reasons.
- Parameters:
api_key_id (str) – Unique identifier of the API key to retrieve. This is the key’s ID (not the secret key value) as returned by the create or list operations.
- Returns:
API key details including metadata such as description, creation date, expiration, usage statistics, and other configuration information.
- Return type:
Dict[str, Any]
- Raises:
venice_ai.exceptions.AuthenticationError – If authentication fails.
venice_ai.exceptions.NotFoundError – If the API key ID does not exist.
venice_ai.exceptions.APIError – If the API returns an error.
venice_ai.exceptions.APIConnectionError – If there’s an issue connecting to the API.
Example
# Retrieve a specific API key asynchronously
key_details = await client.api_keys.retrieve(api_key_id="key_123456789")
print(f"Key description: {key_details['description']}")
print(f"Created at: {key_details['createdAt']}")
- class venice_ai.resources.api_keys.ApiKeys(client: venice_ai._resource.SyncClientT)[source]
Bases: APIResource
Provides access to API key management operations.
This class implements the synchronous interface for API key management, including creating, listing, deleting API keys, and managing rate limits. It inherits from APIResource, which handles the underlying HTTP requests.
- Parameters:
_client (VeniceClient) – The Venice client instance used for making API requests.
Example
from venice_ai import VeniceClient
from venice_ai.types.api_keys import ApiKeyCreateRequest

client = VeniceClient()

# List existing API keys
keys = client.api_keys.list(limit=10)
for key in keys:
    print(f"Key ID: {key.id}, Description: {key.description}")

# Create a new API key
create_request = ApiKeyCreateRequest(
    description="My Test Key",
    apiKeyType="INFERENCE"
)
new_key = client.api_keys.create(api_key_request=create_request)
print(f"Created key: {new_key.apiKey}")  # Only shown on creation
- Parameters:
client (typing.TypeVar(SyncClientT, bound=VeniceClient))
- create(*, api_key_request: venice_ai.types.api_keys.ApiKeyCreateRequest)[source]
Creates a new API key.
Creates a new API key with the specified parameters. The created API key will be returned only once in the response and cannot be retrieved later, so it should be securely stored immediately.
- Parameters:
api_key_request (ApiKeyCreateRequest) – Request object containing API key configuration. Must include at minimum a description and apiKeyType. The request can contain:
- description (str): Human-readable description of the API key
- apiKeyType (str): Type of API key (e.g., “INFERENCE”, “ADMIN”)
- expiresAt (Optional[str]): ISO 8601 timestamp when key expires
- consumptionLimit (Optional[int]): Maximum usage limit for the key
- Returns:
Response containing the newly created API key details, including the secret key value (only returned once), key ID, creation timestamp, and other metadata.
- Return type:
ApiKey
- Raises:
venice_ai.exceptions.AuthenticationError – If authentication fails.
venice_ai.exceptions.APIError – If the API returns an error, such as when maximum API key limit is reached or invalid parameters are provided.
venice_ai.exceptions.APIConnectionError – If there’s an issue connecting to the API.
Example
from venice_ai.types.api_keys import ApiKeyCreateRequest

# Create a basic API key
create_request = ApiKeyCreateRequest(
    description="My Test Key",
    apiKeyType="INFERENCE"
)
new_key = client.api_keys.create(api_key_request=create_request)
print(f"Created key ID: {new_key.id}")
print(f"API Key: {new_key.apiKey}")  # Store this securely!

# Create a key with expiration and limits
advanced_request = ApiKeyCreateRequest(
    description="Limited Production Key",
    apiKeyType="INFERENCE",
    expiresAt="2024-12-31T23:59:59Z",
    consumptionLimit=10000
)
limited_key = client.api_keys.create(api_key_request=advanced_request)
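Because the secret value is returned only once, one option is to persist it as soon as create returns. The following is a minimal sketch, not part of venice_ai: save_secret and the file path are hypothetical names, and it assumes a POSIX-style filesystem where mode 0o600 restricts access to the file owner.

```python
import os


def save_secret(secret: str, path: str) -> None:
    """Write a secret to `path`, creating the file with owner-only permissions.

    Hypothetical helper for illustration; not provided by venice_ai.
    """
    # O_TRUNC overwrites an existing file; the 0o600 mode is applied only
    # when the file is newly created.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(secret)
```

With the example above, this could be called as save_secret(new_key.apiKey, "venice_api_key.txt"). A dedicated secrets manager is usually a better fit for production use.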
- create_web3_key(*, web3_key_request: venice_ai.types.api_keys.ApiKeyGenerateWeb3KeyCreateRequest)[source]
Creates a new Web3 API key.
Creates a new API key authenticated via a Web3 signature.
- Parameters:
web3_key_request (ApiKeyGenerateWeb3KeyCreateRequest) – Request body containing Web3 authentication details (such as web3_network_id, web3_address, and signature) and API key parameters.
- Returns:
Response containing the newly created API key details.
- Return type:
ApiKeyGenerateWeb3KeyCreateResponse
- Raises:
venice_ai.exceptions.APIError – If the API returns an error.
venice_ai.exceptions.APIConnectionError – If there’s an issue connecting to the API.
- delete(*, api_key_id: str)[source]
Deletes an API key.
Permanently deletes the specified API key. Once deleted, the API key can no longer be used to authenticate requests and this action cannot be undone. Use with caution in production environments.
- Parameters:
api_key_id (str) – Unique identifier of the API key to delete. This is the key’s ID (not the secret key value) as returned by the create or list operations.
- Returns:
Response indicating the result of the deletion operation, typically containing a success flag and deletion confirmation message.
- Return type:
Dict[str, Any]
- Raises:
venice_ai.exceptions.AuthenticationError – If authentication fails.
venice_ai.exceptions.APIError – If the API returns an error, such as when the API key ID does not exist or belongs to another account.
venice_ai.exceptions.APIConnectionError – If there’s an issue connecting to the API.
Example
# Delete an API key (use with caution)
result = client.api_keys.delete(api_key_id="key_123456789")
print(f"Deletion result: {result}")

# Safe deletion pattern
keys = client.api_keys.list()
test_keys = [k for k in keys if "test" in k.description.lower()]
for test_key in test_keys:
    client.api_keys.delete(api_key_id=test_key.id)
    print(f"Deleted test key: {test_key.id}")
- get_rate_limit_logs(*, api_key_id: str | None = None, start_date: str | None = None, end_date: str | None = None, limit: int | None = None, page: int | None = None)[source]
Retrieves rate limit logs for API keys.
Returns a history of rate limit events, such as when rate limits were reset, exceeded, or modified. This can be useful for understanding API usage patterns, diagnosing rate limit issues, and optimizing request timing.
- Parameters:
api_key_id (Optional[str]) – Specific API key ID to get logs for. If not provided, returns logs for the current API key.
start_date (Optional[str]) – Start date for log retrieval in ISO 8601 format (e.g., “2024-01-01T00:00:00Z”). If not provided, uses a default lookback period.
end_date (Optional[str]) – End date for log retrieval in ISO 8601 format (e.g., “2024-01-31T23:59:59Z”). If not provided, uses current time.
limit (Optional[int]) – Maximum number of log entries to return per page.
page (Optional[int]) – Page number for pagination (1-based indexing).
- Returns:
A list of rate limit log entries with timestamps, event types, and related metadata.
- Return type:
RateLimitLogList
- Raises:
venice_ai.exceptions.AuthenticationError – If authentication fails.
venice_ai.exceptions.APIError – If the API returns an error.
venice_ai.exceptions.APIConnectionError – If there’s an issue connecting to the API.
Example
# Get recent rate limit logs
logs = client.api_keys.get_rate_limit_logs(limit=10)
for log_entry in logs:
    print(f"Event: {log_entry.event_type} at {log_entry.timestamp}")

# Get logs for a specific date range
logs = client.api_keys.get_rate_limit_logs(
    start_date="2024-01-01T00:00:00Z",
    end_date="2024-01-31T23:59:59Z",
    limit=50
)
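The start_date and end_date arguments are ISO 8601 strings, so a rolling window can be computed rather than hard-coded. The sketch below uses only the standard library. iso_window is a hypothetical helper name, and the trailing "Z" format is chosen to match the examples in the parameter descriptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple


def iso_window(days: int, now: Optional[datetime] = None) -> Tuple[str, str]:
    """Return (start_date, end_date) covering the last `days` days.

    Both values are UTC ISO 8601 strings such as "2024-01-31T23:59:59Z".
    `now` should be timezone-aware; it defaults to the current UTC time.
    """
    end = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    start = end - timedelta(days=days)
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return start.strftime(fmt), end.strftime(fmt)
```

The two strings can then be passed as start_date and end_date to get_rate_limit_logs.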
- get_rate_limits()[source]
Retrieves rate limit information for the current API key.
Returns information about the rate limits applied to the current API key, including the limits per minute, hour, day, and month, as well as the current usage against those limits.
- Returns:
Rate limit information, including limits and current usage.
- Return type:
RateLimitInfo
- Raises:
venice_ai.exceptions.AuthenticationError – If authentication fails.
venice_ai.exceptions.APIError – If the API returns an error.
venice_ai.exceptions.APIConnectionError – If there’s an issue connecting to the API.
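Connection failures such as APIConnectionError are often transient, so callers may want to retry read-only calls like this one. The following is a generic sketch rather than library behaviour: call_with_retry is a hypothetical helper, the exception types to retry on are passed in by the caller, and the backoff values are arbitrary.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 3,
    base_delay: float = 0.5,
) -> T:
    """Call `fn`, retrying on the given exception types with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            # Re-raise once the final attempt has failed.
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))
    raise ValueError("attempts must be at least 1")
```

For example, call_with_retry(client.api_keys.get_rate_limits, (APIConnectionError,)) would retry only connection failures, with APIConnectionError imported from venice_ai.exceptions as listed under Raises.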
- get_web3_token()[source]
Retrieves a token for Web3 API key generation.
This token is required for the subsequent POST request to create a Web3 API key.
- Returns:
Response containing the token required for Web3 key generation.
- Return type:
ApiKeyGenerateWeb3KeyGetResponse
- Raises:
venice_ai.exceptions.APIError – If the API returns an error.
venice_ai.exceptions.APIConnectionError – If there’s an issue connecting to the API.
- list(*, page: int | None = None, limit: int | None = None)[source]
Lists API keys for the authenticated account, with optional pagination.
Retrieves a list of API keys associated with the current account. This includes active and inactive API keys. Supports pagination for managing large numbers of API keys.
- Parameters:
- Returns:
A list of API key objects containing metadata such as ID, description, creation date, and status. Note that the actual API key values are not included in the response for security reasons.
- Return type:
List[ApiKey]
- Raises:
venice_ai.exceptions.AuthenticationError – If authentication fails.
venice_ai.exceptions.APIError – If the API returns an error.
venice_ai.exceptions.APIConnectionError – If there’s an issue connecting to the API.
Example
# List all API keys
all_keys = client.api_keys.list()

# List with pagination
page_keys = client.api_keys.list(page=1, limit=5)
for key in page_keys:
    print(f"Key ID: {key.id}, Description: {key.description}")
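When an account holds more keys than fit on one page, the page and limit parameters can be driven from a loop. The sketch below is an illustration under two assumptions not stated in this reference: pages are numbered from 1 (as in the example above), and an empty or short page marks the end of the results. iter_api_keys is a hypothetical helper that accepts any client exposing api_keys.list.

```python
from typing import Any, Iterator


def iter_api_keys(client: Any, page_size: int = 50) -> Iterator[Any]:
    """Yield every API key by requesting successive pages.

    Assumes 1-based page numbers and that a page shorter than
    `page_size` (or empty) is the last one.
    """
    page = 1
    while True:
        batch = client.api_keys.list(page=page, limit=page_size)
        if not batch:
            return
        yield from batch
        if len(batch) < page_size:
            return
        page += 1
```

Usage: for key in iter_api_keys(client, page_size=20): print(key.id).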
- class venice_ai.resources.api_keys.AsyncApiKeys(client: venice_ai._resource.AsyncClientT)[source]
Bases: AsyncAPIResource
Provides access to API key management operations asynchronously.
This class implements the asynchronous interface for API key management, including creating, listing, deleting API keys, and managing rate limits. It inherits from AsyncAPIResource, which handles the underlying HTTP requests. All methods return awaitable coroutines that should be awaited by the caller.
- Parameters:
_client (AsyncVeniceClient) – The AsyncVeniceClient instance used for making asynchronous API requests.
Example
import asyncio
from venice_ai import AsyncVeniceClient
from venice_ai.types.api_keys import ApiKeyCreateRequest

async def manage_api_keys():
    client = AsyncVeniceClient()

    # List existing API keys
    keys = await client.api_keys.list(limit=10)
    for key in keys:
        print(f"Key ID: {key.id}, Description: {key.description}")

    # Create a new API key
    create_request = ApiKeyCreateRequest(
        description="My Async Test Key",
        apiKeyType="INFERENCE"
    )
    new_key = await client.api_keys.create(api_key_request=create_request)
    print(f"Created key: {new_key.apiKey}")  # Only shown on creation

asyncio.run(manage_api_keys())
- Parameters:
client (typing.TypeVar(AsyncClientT, bound=AsyncVeniceClient))
- async create(*, api_key_request: venice_ai.types.api_keys.ApiKeyCreateRequest)[source]
Creates a new API key asynchronously.
Creates a new API key with the specified parameters. The created API key will be returned only once in the response and cannot be retrieved later, so it should be securely stored immediately.
- Parameters:
api_key_request (ApiKeyCreateRequest) – Request object containing API key configuration. Must include at minimum a description and apiKeyType. The request can contain:
- description (str): Human-readable description of the API key
- apiKeyType (str): Type of API key (e.g., “INFERENCE”, “ADMIN”)
- expiresAt (Optional[str]): ISO 8601 timestamp when key expires
- consumptionLimit (Optional[ConsumptionLimit]): Usage limits for the key
- Returns:
Response containing the newly created API key details, including the secret key value (only returned once), key ID, creation timestamp, and other metadata.
- Return type:
ApiKey
- Raises:
venice_ai.exceptions.AuthenticationError – If authentication fails.
venice_ai.exceptions.APIError – If the API returns an error, such as when maximum API key limit is reached or invalid parameters are provided.
venice_ai.exceptions.APIConnectionError – If there’s an issue connecting to the API.
Example
from venice_ai.types.api_keys import ApiKeyCreateRequest

# Create a basic API key asynchronously
create_request = ApiKeyCreateRequest(
    description="My Async Test Key",
    apiKeyType="INFERENCE"
)
new_key = await client.api_keys.create(api_key_request=create_request)
print(f"Created key ID: {new_key.id}")
print(f"API Key: {new_key.apiKey}")  # Store this securely!
- async create_web3_key(*, web3_key_request: venice_ai.types.api_keys.ApiKeyGenerateWeb3KeyCreateRequest)[source]
Creates a new Web3 API key asynchronously.
Creates a new API key authenticated via a Web3 signature.
- Parameters:
web3_key_request (ApiKeyGenerateWeb3KeyCreateRequest) – Request body containing Web3 authentication details (such as web3_network_id, web3_address, and signature) and API key parameters.
- Returns:
Response containing the newly created API key details.
- Return type:
ApiKeyGenerateWeb3KeyCreateResponse
- Raises:
venice_ai.exceptions.APIError – If the API returns an error.
venice_ai.exceptions.APIConnectionError – If there’s an issue connecting to the API.
- async delete(*, api_key_id: str)[source]
Deletes an API key asynchronously.
Permanently deletes the specified API key. Once deleted, the API key can no longer be used to authenticate requests and this action cannot be undone.
- Parameters:
api_key_id (str) – ID of the API key to delete. This is the key’s unique identifier, not the secret key value.
- Returns:
Response indicating the result of the operation, typically containing a success flag and deletion confirmation.
- Return type:
Dict[str, Any]
- Raises:
venice_ai.exceptions.AuthenticationError – If authentication fails.
venice_ai.exceptions.APIError – If the API returns an error, such as when the API key ID does not exist or belongs to another account.
venice_ai.exceptions.APIConnectionError – If there’s an issue connecting to the API.
Example
# Delete an API key asynchronously (use with caution)
result = await client.api_keys.delete(api_key_id="key_123456789")
print(f"Deletion result: {result}")

# Safe deletion pattern
keys = await client.api_keys.list()
test_keys = [k for k in keys if "test" in k.description.lower()]
for test_key in test_keys:
    await client.api_keys.delete(api_key_id=test_key.id)
    print(f"Deleted test key: {test_key.id}")
- async get_rate_limit_logs(*, api_key_id: str | None = None, start_date: str | None = None, end_date: str | None = None, limit: int | None = None, page: int | None = None)[source]
Retrieves rate limit logs for API keys asynchronously.
Returns a history of rate limit events, such as when rate limits were reset, exceeded, or modified. This can be useful for understanding API usage patterns, diagnosing rate limit issues, and optimizing request timing.
- Parameters:
api_key_id (Optional[str]) – Specific API key ID to get logs for. If not provided, returns logs for the current API key.
start_date (Optional[str]) – Start date for log retrieval in ISO 8601 format (e.g., “2024-01-01T00:00:00Z”). If not provided, uses a default lookback period.
end_date (Optional[str]) – End date for log retrieval in ISO 8601 format (e.g., “2024-01-31T23:59:59Z”). If not provided, uses current time.
limit (Optional[int]) – Maximum number of log entries to return per page.
page (Optional[int]) – Page number for pagination (1-based indexing).
- Returns:
A list of rate limit log entries with timestamps, event types, and related metadata.
- Return type:
RateLimitLogList
- Raises:
venice_ai.exceptions.AuthenticationError – If authentication fails.
venice_ai.exceptions.APIError – If the API returns an error.
venice_ai.exceptions.APIConnectionError – If there’s an issue connecting to the API.
Example
# Get recent rate limit logs asynchronously
logs = await client.api_keys.get_rate_limit_logs(limit=10)
for log_entry in logs:
    print(f"Event: {log_entry.event_type} at {log_entry.timestamp}")

# Get logs for a specific date range
logs = await client.api_keys.get_rate_limit_logs(
    start_date="2024-01-01T00:00:00Z",
    end_date="2024-01-31T23:59:59Z",
    limit=50
)
- async get_rate_limits()[source]
Retrieves rate limit information for the current API key asynchronously.
Returns information about the rate limits applied to the current API key, including the limits per minute, hour, day, and month, as well as the current usage against those limits.
- Returns:
Rate limit information, including limits and current usage.
- Return type:
RateLimitInfo
- Raises:
venice_ai.exceptions.AuthenticationError – If authentication fails.
venice_ai.exceptions.APIError – If the API returns an error.
venice_ai.exceptions.APIConnectionError – If there’s an issue connecting to the API.
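Since the asynchronous methods return coroutines, independent calls can be issued concurrently instead of one after another. The following sketch shows this with asyncio.gather. fetch_account_overview is a hypothetical helper that works with any client exposing awaitable api_keys.list and api_keys.get_rate_limits methods.

```python
import asyncio
from typing import Any, Tuple


async def fetch_account_overview(client: Any) -> Tuple[Any, Any]:
    """Fetch the API key list and rate limit information concurrently."""
    # gather schedules both coroutines at once and returns the results
    # in the same order as the arguments.
    keys, limits = await asyncio.gather(
        client.api_keys.list(),
        client.api_keys.get_rate_limits(),
    )
    return keys, limits
```

If either call raises, asyncio.gather propagates the first exception to the caller by default.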
- async get_web3_token()[source]
Retrieves a token for Web3 API key generation asynchronously.
This token is required for the subsequent POST request to create a Web3 API key.
- Returns:
Response containing the token required for Web3 key generation.
- Return type:
ApiKeyGenerateWeb3KeyGetResponse
- Raises:
venice_ai.exceptions.APIError – If the API returns an error.
venice_ai.exceptions.APIConnectionError – If there’s an issue connecting to the API.
- async list(*, page: int | None = None, limit: int | None = None)[source]
Lists API keys for the authenticated account asynchronously, with optional pagination.
Retrieves a list of API keys associated with the current account. This includes active and inactive API keys. Supports pagination.
- Parameters:
- Returns:
A list of API key objects containing metadata such as ID, description, creation date, and status. Note that the actual API key values are not included in the response for security reasons.
- Return type:
List[ApiKey]
- Raises:
venice_ai.exceptions.AuthenticationError – If authentication fails.
venice_ai.exceptions.APIError – If the API returns an error.
venice_ai.exceptions.APIConnectionError – If there’s an issue connecting to the API.
Example
# List all API keys asynchronously
all_keys = await client.api_keys.list()

# List with pagination
page_keys = await client.api_keys.list(page=1, limit=5)
for key in page_keys:
    print(f"Key ID: {key.id}, Description: {key.description}")
Audio Resources¶
Venice AI Audio API resources.
This module provides classes for interacting with the Venice AI Audio API, supporting speech synthesis operations. The module includes both synchronous and asynchronous interfaces for audio generation with various voice options and output formats.
The audio API allows for:
- Converting text to natural-sounding speech (text-to-speech)
- Selecting from multiple voice options for speech synthesis
- Controlling speech speed and output format
- Both full and streaming response modes
- class venice_ai.resources.audio.AsyncAudio(client: venice_ai._resource.AsyncClientT)[source]
Provides access to text-to-speech (TTS) audio generation operations asynchronously.
This class handles asynchronous audio generation requests, supporting both streaming and non-streaming modes. It allows conversion of text to natural-sounding speech using various voice models and output formats in async applications.
- Parameters:
client (AsyncVeniceClient) – The async Venice AI client instance used for making API requests.
Note
This class is typically accessed through the AsyncVeniceClient.audio property rather than being instantiated directly.
- async create_speech(*, input: str, model: str, voice: str | venice_ai.types.audio.Voice, response_format: str | venice_ai.types.audio.ResponseFormat | None = 'mp3', speed: float | None = 1.0, stream: bool = False, timeout: float | httpx.Timeout | None = None)[source]
Generates audio from input text asynchronously.
Converts the provided text to speech using the specified model and voice using asynchronous requests. The audio can be returned either as complete binary data or as an async stream of audio chunks for real-time processing.
- Parameters:
model (str) – ID of the model to use for speech generation (e.g., “tts-kokoro”).
input (str) – The text to convert to speech. Maximum length varies by model.
voice (Union[str, venice_ai.types.audio.Voice]) – The voice to use for the generated audio. Can be a string literal or a Voice enum value (e.g., Voice.KOKORO_DEFAULT or “kokoro-default”).
response_format (Optional[Union[str, venice_ai.types.audio.ResponseFormat]]) – The format to return the audio in. Can be a string literal or a ResponseFormat enum value. Defaults to “mp3”.
speed (Optional[float]) – The speed of the generated audio. Select a value from 0.25 to 4.0. Defaults to 1.0.
stream (Optional[bool]) – Whether to stream the audio data. If True, returns an AsyncIterator of audio chunks. If False, returns the complete audio data. Defaults to False.
timeout (Optional[Union[float, httpx.Timeout]]) – Request timeout in seconds or an httpx.Timeout object. If not provided, uses the client’s default timeout.
- Returns:
If stream is False, returns the audio data as bytes (awaitable). If stream is True, returns an AsyncIterator yielding chunks of audio data as bytes.
- Return type:
- Raises:
venice_ai.exceptions.APIError – If the API request fails.
ValueError – If the input text is empty or invalid parameters are provided.
Example
Basic non-streaming text-to-speech:
import asyncio
from venice_ai import AsyncVeniceClient
from venice_ai.types.audio import Voice, ResponseFormat

async def generate_speech():
    client = AsyncVeniceClient()

    # Generate speech with enum values
    audio_bytes = await client.audio.create_speech(
        model="tts-kokoro",
        input="Hello, this is a test.",
        voice=Voice.KOKORO_DEFAULT
    )

    # Save to file
    with open("speech.mp3", "wb") as f:
        f.write(audio_bytes)

    # Using string literals and different format
    audio_bytes = await client.audio.create_speech(
        model="tts-kokoro",
        input="Hello with different settings.",
        voice="kokoro-default",
        response_format="wav",
        speed=1.2
    )

asyncio.run(generate_speech())
Streaming text-to-speech:
import asyncio
from venice_ai import AsyncVeniceClient

async def stream_speech():
    client = AsyncVeniceClient()

    # Stream audio data. create_speech is documented as async, so the call
    # is awaited to obtain the async iterator of chunks.
    stream = await client.audio.create_speech(
        model="tts-kokoro",
        input="This is a streamed audio example.",
        voice="kokoro-default",
        stream=True
    )

    # Write streamed chunks to file
    with open("streamed_speech.mp3", "wb") as f:
        async for chunk in stream:
            f.write(chunk)

asyncio.run(stream_speech())
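The chunk-writing loop can be factored into a reusable coroutine that also reports how many bytes were received. This is a minimal sketch that assumes only that the stream is an async iterator of bytes, as described under Returns. save_async_stream is a hypothetical name, not a library function.

```python
from typing import AsyncIterator


async def save_async_stream(chunks: AsyncIterator[bytes], path: str) -> int:
    """Write each chunk to `path` as it arrives and return the total byte count."""
    total = 0
    with open(path, "wb") as handle:
        async for chunk in chunks:
            handle.write(chunk)
            total += len(chunk)
    return total
```

The file writes here are ordinary blocking calls, which is usually acceptable for small audio chunks. An application that is sensitive to event-loop latency may prefer to hand the writes to a thread.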
- async get_voices(*, model_id: str | None = None, gender: Literal['male', 'female', 'unknown'] | None = None, region_code: str | None = None)[source]
Lists available text-to-speech (TTS) voices asynchronously, with optional filtering.
This method retrieves information about available voices for TTS models, allowing filtering by model ID, gender, and region code.
- Parameters:
model_id (typing.Optional[str]) – Optional. If provided, only voices for this specific TTS model ID will be returned.
gender (typing.Optional[typing.Literal['male', 'female', 'unknown']]) – Optional. Filter voices by gender (“male”, “female”, “unknown”). Gender is inferred from the voice ID prefix.
region_code (typing.Optional[str]) – Optional. Filter voices by the raw two-letter region/language prefix from the voice ID (e.g., “af” for American Female-sounding, “zm” for Chinese Male-sounding).
- Return type:
venice_ai.types.audio.VoiceList
- Returns:
A VoiceList object containing a list of VoiceDetail objects that match the filter criteria, along with information about the applied filters.
- Raises:
venice_ai.exceptions.APIError – If an API error occurs during the request to the underlying models endpoint.
- class venice_ai.resources.audio.Audio(client: venice_ai._resource.SyncClientT)[source]
Provides access to text-to-speech (TTS) audio generation operations.
This class handles synchronous audio generation requests, supporting both streaming and non-streaming modes. It allows conversion of text to natural-sounding speech using various voice models and output formats.
- Parameters:
client (VeniceClient) – The Venice AI client instance used for making API requests.
Note
This class is typically accessed through the VeniceClient.audio property rather than being instantiated directly.
- create_speech(*, input: str, model: str, voice: str | venice_ai.types.audio.Voice, response_format: str | venice_ai.types.audio.ResponseFormat | None = 'mp3', speed: float | None = 1.0, stream: bool = False, timeout: float | httpx.Timeout | None = None)[source]
Generates audio from input text.
Converts the provided text to speech using the specified model and voice. The audio can be returned either as complete binary data or as a stream of audio chunks for real-time processing.
- Parameters:
model (str) – ID of the model to use for speech generation (e.g., “tts-kokoro”).
input (str) – The text to convert to speech. Maximum length varies by model.
voice (Union[str, venice_ai.types.audio.Voice]) – The voice to use for the generated audio. Can be a string literal or a Voice enum value (e.g., Voice.KOKORO_DEFAULT or “kokoro-default”).
response_format (Optional[Union[str, venice_ai.types.audio.ResponseFormat]]) – The format to return the audio in. Can be a string literal or a ResponseFormat enum value. Defaults to “mp3”.
speed (Optional[float]) – The speed of the generated audio. Select a value from 0.25 to 4.0. Defaults to 1.0.
stream (Optional[bool]) – Whether to stream the audio data. If True, returns an Iterator of audio chunks. If False, returns the complete audio data. Defaults to False.
timeout (Optional[Union[float, httpx.Timeout]]) – Request timeout in seconds or an httpx.Timeout object. If not provided, uses the client’s default timeout.
- Returns:
If stream is False, returns the audio data as bytes. If stream is True, returns an Iterator yielding chunks of audio data as bytes.
- Return type:
- Raises:
venice_ai.exceptions.APIError – If the API request fails.
ValueError – If the input text is empty or invalid parameters are provided.
Example
Basic non-streaming text-to-speech:
from venice_ai import VeniceClient
from venice_ai.types.audio import Voice, ResponseFormat

client = VeniceClient()

# Generate speech with enum values
audio_bytes = client.audio.create_speech(
    model="tts-kokoro",
    input="Hello, this is a test.",
    voice=Voice.KOKORO_DEFAULT
)

# Save to file
with open("speech.mp3", "wb") as f:
    f.write(audio_bytes)

# Using string literals and different format
audio_bytes = client.audio.create_speech(
    model="tts-kokoro",
    input="Hello with different settings.",
    voice="kokoro-default",
    response_format="wav",
    speed=1.2
)
Streaming text-to-speech:
# Stream audio data
stream = client.audio.create_speech(
    model="tts-kokoro",
    input="This is a streamed audio example.",
    voice="kokoro-default",
    stream=True
)

# Write streamed chunks to file
with open("streamed_speech.mp3", "wb") as f:
    for chunk in stream:
        f.write(chunk)
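The speed parameter is documented as accepting values from 0.25 to 4.0. This reference does not say whether the library checks that range before sending the request, so a caller that builds the value from user input may want to validate it first. A minimal sketch follows. check_speed is a hypothetical helper, and the bounds are taken from the parameter description above.

```python
def check_speed(speed: float, low: float = 0.25, high: float = 4.0) -> float:
    """Return `speed` unchanged if it lies within [low, high], else raise ValueError."""
    # The negated chained comparison also rejects NaN.
    if not (low <= speed <= high):
        raise ValueError(f"speed must be between {low} and {high}, got {speed}")
    return speed
```

It can be used inline, for example speed=check_speed(user_speed), when calling create_speech.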
- get_voices(*, model_id: str | None = None, gender: Literal['male', 'female', 'unknown'] | None = None, region_code: str | None = None)[source]
Lists available text-to-speech (TTS) voices, with optional filtering.
This method retrieves information about available voices for TTS models, allowing filtering by model ID, gender, and region code.
- Parameters:
model_id (typing.Optional[str]) – Optional. If provided, only voices for this specific TTS model ID will be returned.
gender (typing.Optional[typing.Literal['male', 'female', 'unknown']]) – Optional. Filter voices by gender (“male”, “female”, “unknown”). Gender is inferred from the voice ID prefix.
region_code (typing.Optional[str]) – Optional. Filter voices by the raw two-letter region/language prefix from the voice ID (e.g., “af” for American Female-sounding, “zm” for Chinese Male-sounding).
- Return type:
venice_ai.types.audio.VoiceList
- Returns:
A VoiceList object containing a list of VoiceDetail objects that match the filter criteria, along with information about the applied filters.
- Raises:
venice_ai.exceptions.APIError – If an API error occurs during the request to the underlying models endpoint.
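The gender and region_code filters are both described as derived from the voice ID prefix. The sketch below shows one way such a prefix might be interpreted. It is an illustration, not the library's implementation: it assumes voice IDs look like "af_sky" and that the second letter of the prefix encodes gender ("f" or "m"), which matches the “af” and “zm” examples above but is not confirmed by this reference. split_voice_prefix is a hypothetical name.

```python
from typing import Tuple


def split_voice_prefix(voice_id: str) -> Tuple[str, str]:
    """Return (region_code, gender) inferred from the first two characters.

    Assumption: the second character is "f" or "m"; anything else
    maps to "unknown".
    """
    region_code = voice_id[:2].lower()
    gender_letter = region_code[1:2]
    gender = {"f": "female", "m": "male"}.get(gender_letter, "unknown")
    return region_code, gender
```

The returned values use the same vocabulary as the gender and region_code arguments of get_voices.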
- class venice_ai.resources.audio.Audio(client: venice_ai._resource.SyncClientT)[source]
Bases: APIResource
Provides access to text-to-speech (TTS) audio generation operations.
This class handles synchronous audio generation requests, supporting both streaming and non-streaming modes. It allows conversion of text to natural-sounding speech using various voice models and output formats.
- Parameters:
client (VeniceClient) – The Venice AI client instance used for making API requests.
Note
This class is typically accessed through the VeniceClient.audio property rather than being instantiated directly.
- create_speech(*, input: str, model: str, voice: str | venice_ai.types.audio.Voice, response_format: str | venice_ai.types.audio.ResponseFormat | None = 'mp3', speed: float | None = 1.0, stream: bool = False, timeout: float | httpx.Timeout | None = None)[source]
Generates audio from input text.
Converts the provided text to speech using the specified model and voice. The audio can be returned either as complete binary data or as a stream of audio chunks for real-time processing.
- Parameters:
model (str) – ID of the model to use for speech generation (e.g., “tts-kokoro”).
input (str) – The text to convert to speech. Maximum length varies by model.
voice (Union[str, venice_ai.types.audio.Voice]) – The voice to use for the generated audio. Can be a string literal or a Voice enum value (e.g., Voice.KOKORO_DEFAULT or “kokoro-default”).
response_format (Optional[Union[str, venice_ai.types.audio.ResponseFormat]]) – The format to return the audio in. Can be a string literal or a ResponseFormat enum value. Defaults to “mp3”.
speed (Optional[float]) – The speed of the generated audio. Select a value from 0.25 to 4.0. Defaults to 1.0.
stream (bool) – Whether to stream the audio data. If True, returns an Iterator of audio chunks. If False, returns the complete audio data. Defaults to False.
timeout (Optional[Union[float, httpx.Timeout]]) – Request timeout in seconds or an httpx.Timeout object. If not provided, uses the client’s default timeout.
- Returns:
If stream is False, returns the audio data as bytes. If stream is True, returns an Iterator yielding chunks of audio data as bytes.
- Return type:
Union[bytes, Iterator[bytes]]
- Raises:
venice_ai.exceptions.APIError – If the API request fails.
ValueError – If the input text is empty or invalid parameters are provided.
Example
Basic non-streaming text-to-speech:
    from venice_ai import VeniceClient
    from venice_ai.types.audio import Voice, ResponseFormat

    client = VeniceClient()

    # Generate speech with enum values
    audio_bytes = client.audio.create_speech(
        model="tts-kokoro",
        input="Hello, this is a test.",
        voice=Voice.KOKORO_DEFAULT
    )

    # Save to file
    with open("speech.mp3", "wb") as f:
        f.write(audio_bytes)

    # Using string literals and different format
    audio_bytes = client.audio.create_speech(
        model="tts-kokoro",
        input="Hello with different settings.",
        voice="kokoro-default",
        response_format="wav",
        speed=1.2
    )
Streaming text-to-speech:
    # Stream audio data
    stream = client.audio.create_speech(
        model="tts-kokoro",
        input="This is a streamed audio example.",
        voice="kokoro-default",
        stream=True
    )

    # Write streamed chunks to file
    with open("streamed_speech.mp3", "wb") as f:
        for chunk in stream:
            f.write(chunk)
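Because `create_speech` returns either `bytes` or an iterator of `bytes` depending on `stream`, calling code sometimes wants a single path that handles both. The following is a minimal sketch of such a helper; `save_audio` is a hypothetical name, and the demo uses stand-in byte strings rather than real API output.

```python
import os
import tempfile
from typing import Iterator, Union


def save_audio(result: Union[bytes, Iterator[bytes]], path: str) -> int:
    """Write audio to `path` and return the number of bytes written.

    Hypothetical helper: accepts either the complete payload (stream=False)
    or an iterator of chunks (stream=True).
    """
    written = 0
    with open(path, "wb") as f:
        if isinstance(result, (bytes, bytearray)):
            written += f.write(result)
        else:
            for chunk in result:
                written += f.write(chunk)
    return written


# Stand-in data; real values would come from client.audio.create_speech(...).
tmp_dir = tempfile.mkdtemp()
whole_path = os.path.join(tmp_dir, "whole.mp3")
chunked_path = os.path.join(tmp_dir, "chunked.mp3")
whole_size = save_audio(b"abcdef", whole_path)
chunked_size = save_audio(iter([b"abc", b"def"]), chunked_path)
```

Writing chunk by chunk keeps memory use bounded for long inputs, which is the main reason to prefer `stream=True` for large texts.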
- class venice_ai.resources.audio.AsyncAudio(client: venice_ai._resource.AsyncClientT)[source]
Bases: AsyncAPIResource
Provides access to text-to-speech (TTS) audio generation operations asynchronously.
This class handles asynchronous audio generation requests, supporting both streaming and non-streaming modes. It allows conversion of text to natural-sounding speech using various voice models and output formats in async applications.
- Parameters:
client (AsyncVeniceClient) – The async Venice AI client instance used for making API requests.
Note
This class is typically accessed through the AsyncVeniceClient.audio property rather than being instantiated directly.
- async create_speech(*, input: str, model: str, voice: str | venice_ai.types.audio.Voice, response_format: str | venice_ai.types.audio.ResponseFormat | None = 'mp3', speed: float | None = 1.0, stream: bool = False, timeout: float | httpx.Timeout | None = None)[source]
Generates audio from input text asynchronously.
Converts the provided text to speech using the specified model and voice using asynchronous requests. The audio can be returned either as complete binary data or as an async stream of audio chunks for real-time processing.
- Parameters:
model (str) – ID of the model to use for speech generation (e.g., “tts-kokoro”).
input (str) – The text to convert to speech. Maximum length varies by model.
voice (Union[str, venice_ai.types.audio.Voice]) – The voice to use for the generated audio. Can be a string literal or a Voice enum value (e.g., Voice.KOKORO_DEFAULT or “kokoro-default”).
response_format (Optional[Union[str, venice_ai.types.audio.ResponseFormat]]) – The format to return the audio in. Can be a string literal or a ResponseFormat enum value. Defaults to “mp3”.
speed (Optional[float]) – The speed of the generated audio. Select a value from 0.25 to 4.0. Defaults to 1.0.
stream (bool) – Whether to stream the audio data. If True, returns an AsyncIterator of audio chunks. If False, returns the complete audio data. Defaults to False.
timeout (Optional[Union[float, httpx.Timeout]]) – Request timeout in seconds or an httpx.Timeout object. If not provided, uses the client’s default timeout.
- Returns:
If stream is False, returns the audio data as bytes (awaitable). If stream is True, returns an AsyncIterator yielding chunks of audio data as bytes.
- Return type:
Union[bytes, AsyncIterator[bytes]]
- Raises:
venice_ai.exceptions.APIError – If the API request fails.
ValueError – If the input text is empty or invalid parameters are provided.
Example
Basic non-streaming text-to-speech:
    import asyncio
    from venice_ai import AsyncVeniceClient
    from venice_ai.types.audio import Voice, ResponseFormat

    async def generate_speech():
        client = AsyncVeniceClient()

        # Generate speech with enum values
        audio_bytes = await client.audio.create_speech(
            model="tts-kokoro",
            input="Hello, this is a test.",
            voice=Voice.KOKORO_DEFAULT
        )

        # Save to file
        with open("speech.mp3", "wb") as f:
            f.write(audio_bytes)

        # Using string literals and different format
        audio_bytes = await client.audio.create_speech(
            model="tts-kokoro",
            input="Hello with different settings.",
            voice="kokoro-default",
            response_format="wav",
            speed=1.2
        )

    asyncio.run(generate_speech())
Streaming text-to-speech:
    import asyncio
    from venice_ai import AsyncVeniceClient

    async def stream_speech():
        client = AsyncVeniceClient()

        # create_speech is a coroutine: await it to obtain the async iterator
        stream = await client.audio.create_speech(
            model="tts-kokoro",
            input="This is a streamed audio example.",
            voice="kokoro-default",
            stream=True
        )

        # Write streamed chunks to file
        with open("streamed_speech.mp3", "wb") as f:
            async for chunk in stream:
                f.write(chunk)

    asyncio.run(stream_speech())
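When the streamed audio is needed in memory rather than on disk, the chunks from the async iterator can be joined into a single `bytes` object. This is a minimal sketch; `collect_audio` and `fake_stream` are hypothetical names, and the fake generator stands in for the iterator returned by `create_speech(..., stream=True)`.

```python
import asyncio
from typing import AsyncIterator, List


async def collect_audio(stream: AsyncIterator[bytes]) -> bytes:
    """Drain an async iterator of audio chunks into one bytes object."""
    chunks: List[bytes] = []
    async for chunk in stream:
        chunks.append(chunk)
    return b"".join(chunks)


async def fake_stream() -> AsyncIterator[bytes]:
    """Stand-in for the async iterator returned when stream=True."""
    for part in (b"RIFF", b"data", b"!"):
        await asyncio.sleep(0)  # yield control, as a network read would
        yield part


audio = asyncio.run(collect_audio(fake_stream()))
```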
Embeddings Resources¶
Venice AI Embeddings API resources.
This module provides classes for interacting with the Venice AI Embeddings API, allowing clients to generate embeddings from text or token inputs. These embeddings are vector representations of text that capture semantic meaning and can be used for tasks such as semantic search, clustering, and classification.
- class venice_ai.resources.embeddings.Embeddings(client: venice_ai._resource.SyncClientT)[source]
Bases: APIResource
Provides access to text embedding generation operations.
This class manages synchronous embedding operations through the Venice AI API. Embeddings are vector representations of text that capture semantic meaning and can be used for various natural language processing tasks such as semantic search, clustering, classification, and similarity analysis.
- Parameters:
client (venice_ai._client.VeniceClient) – The Venice AI client instance used to make API requests.
- create(*, model: str, input: str | List[str] | List[int] | List[List[int]], dimensions: int | None = None, encoding_format: Literal['float', 'base64'] | None = None, user: str | None = None)[source]
Generates embeddings for input text(s).
This method sends a request to the Venice AI API to generate vector embeddings for the provided text or token inputs using the specified model. The embeddings can be used for semantic search, clustering, classification, and other NLP tasks.
- Parameters:
model (str) – The ID of the embedding model to use. Available models can be retrieved using the models API. Example: 'text-embedding-bge-m3'.
input (Union[str, List[str], List[int], List[List[int]]]) – The input text(s) to generate embeddings for. Can be a single string, a list of strings for batch processing, a list of token integers, or a list of token lists. For batch processing, all inputs will be processed together in a single API call.
dimensions (Optional[int]) – The number of dimensions for the output embeddings. If not specified, uses the model’s default dimensionality. Some models support reducing dimensions for efficiency.
encoding_format (Optional[Literal["float", "base64"]]) – The format for the returned embeddings. Defaults to 'float' for numerical arrays. Use 'base64' for base64-encoded string representation.
user (Optional[str]) – A unique identifier representing your end-user. This parameter is supported for compatibility with OpenAI clients but is discarded by the Venice API and does not affect the response.
- Returns:
A response object containing the generated embeddings and usage data. The response includes an array of embedding objects, each containing the vector representation and associated metadata.
- Return type:
EmbeddingList
- Raises:
venice_ai.exceptions.InvalidRequestError – If parameter values are invalid (e.g., empty model or input, unsupported encoding format).
venice_ai.exceptions.AuthenticationError – If the API key is invalid or missing.
venice_ai.exceptions.PermissionDeniedError – If access to the specified model is denied.
venice_ai.exceptions.NotFoundError – If the specified model is not found.
venice_ai.exceptions.RateLimitError – If rate limits are exceeded.
venice_ai.exceptions.APIError – For other API-related errors.
Examples:
Generate an embedding for a single string:
    from venice_ai import VeniceClient

    client = VeniceClient(api_key="your-api-key")

    response = client.embeddings.create(
        model="text-embedding-bge-m3",
        input="The quick brown fox jumps over the lazy dog."
    )

    embedding = response.data[0].embedding
    print(f"Embedding dimensions: {len(embedding)}")
    print(f"First 5 dimensions: {embedding[:5]}")
Generate embeddings for multiple strings (batch processing):
    inputs = [
        "First sentence for embedding.",
        "Second sentence for embedding.",
        "Third sentence for embedding."
    ]

    batch_response = client.embeddings.create(
        model="text-embedding-bge-m3",
        input=inputs
    )

    for i, data_item in enumerate(batch_response.data):
        print(f"Embedding for '{inputs[i]}' (first 3 dims): {data_item.embedding[:3]}")

    print(f"Total tokens used: {batch_response.usage.total_tokens}")
Using optional parameters:
    response = client.embeddings.create(
        model="text-embedding-bge-m3",
        input="Sample text for embedding",
        dimensions=512,            # Reduce dimensions if supported
        encoding_format="base64",  # Get base64-encoded embeddings
        user="user-123"            # Accepted for OpenAI compatibility; discarded by Venice
    )
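The semantic-search use case mentioned above usually comes down to comparing embedding vectors by cosine similarity. The following is a minimal, dependency-free sketch; the function names and the tiny two-dimensional vectors are illustrative assumptions, and real vectors would come from `response.data[i].embedding`.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: Sequence[float], candidates: Sequence[Sequence[float]]
) -> List[Tuple[int, float]]:
    """Return (candidate index, score) pairs, most similar first."""
    scored = [(i, cosine_similarity(query, vec)) for i, vec in enumerate(candidates)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for real embeddings.
query_vec = [1.0, 0.0]
doc_vecs = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
ranking = rank_by_similarity(query_vec, doc_vecs)
```

For large collections, a vectorized library or a vector index would normally replace the pure-Python loops; the sketch only shows the comparison itself.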
- class venice_ai.resources.embeddings.AsyncEmbeddings(client: venice_ai._resource.AsyncClientT)[source]
Bases: AsyncAPIResource
Provides access to text embedding generation operations (asynchronous).
This class manages asynchronous embedding operations through the Venice AI API. It provides the same functionality as the synchronous Embeddings class but uses async/await patterns for non-blocking operations. Embeddings are vector representations of text that capture semantic meaning and can be used for various natural language processing tasks.
- Parameters:
client (venice_ai._async_client.AsyncVeniceClient) – The async Venice AI client instance used to make API requests.
- async create(*, model: str, input: str | List[str] | List[int] | List[List[int]], dimensions: int | None = None, encoding_format: Literal['float', 'base64'] | None = None, user: str | None = None)[source]
Generates embeddings for input text(s) asynchronously.
This method sends an asynchronous request to the Venice AI API to generate vector embeddings for the provided text or token inputs using the specified model. The embeddings can be used for semantic search, clustering, classification, and other NLP tasks.
- Parameters:
model (str) – The ID of the embedding model to use. Available models can be retrieved using the models API. Example: 'text-embedding-bge-m3'.
input (Union[str, List[str], List[int], List[List[int]]]) – The input text(s) to generate embeddings for. Can be a single string, a list of strings for batch processing, a list of token integers, or a list of token lists. For batch processing, all inputs will be processed together in a single API call.
dimensions (Optional[int]) – The number of dimensions for the output embeddings. If not specified, uses the model’s default dimensionality. Some models support reducing dimensions for efficiency.
encoding_format (Optional[Literal["float", "base64"]]) – The format for the returned embeddings. Defaults to 'float' for numerical arrays. Use 'base64' for base64-encoded string representation.
user (Optional[str]) – A unique identifier representing your end-user. This parameter is supported for compatibility with OpenAI clients but is discarded by the Venice API and does not affect the response.
- Returns:
A response object containing the generated embeddings and usage data. The response includes an array of embedding objects, each containing the vector representation and associated metadata.
- Return type:
EmbeddingList
- Raises:
venice_ai.exceptions.InvalidRequestError – If parameter values are invalid (e.g., empty model or input, unsupported encoding format).
venice_ai.exceptions.AuthenticationError – If the API key is invalid or missing.
venice_ai.exceptions.PermissionDeniedError – If access to the specified model is denied.
venice_ai.exceptions.NotFoundError – If the specified model is not found.
venice_ai.exceptions.RateLimitError – If rate limits are exceeded.
venice_ai.exceptions.APIError – For other API-related errors.
Examples:
Generate an embedding for a single string:
    import asyncio
    from venice_ai import AsyncVeniceClient

    async def create_embedding():
        async with AsyncVeniceClient(api_key="your-api-key") as client:
            response = await client.embeddings.create(
                model="text-embedding-bge-m3",
                input="The quick brown fox jumps over the lazy dog."
            )

        embedding = response.data[0].embedding
        print(f"Embedding dimensions: {len(embedding)}")
        print(f"First 5 dimensions: {embedding[:5]}")

    asyncio.run(create_embedding())
Generate embeddings for multiple strings (batch processing):
    async def create_batch_embeddings():
        inputs = [
            "First sentence for embedding.",
            "Second sentence for embedding.",
            "Third sentence for embedding."
        ]

        async with AsyncVeniceClient(api_key="your-api-key") as client:
            batch_response = await client.embeddings.create(
                model="text-embedding-bge-m3",
                input=inputs
            )

        for i, data_item in enumerate(batch_response.data):
            print(f"Embedding for '{inputs[i]}' (first 3 dims): {data_item.embedding[:3]}")

        print(f"Total tokens used: {batch_response.usage.total_tokens}")

    asyncio.run(create_batch_embeddings())
Using optional parameters:
    async def create_custom_embedding():
        async with AsyncVeniceClient(api_key="your-api-key") as client:
            response = await client.embeddings.create(
                model="text-embedding-bge-m3",
                input="Sample text for embedding",
                dimensions=512,            # Reduce dimensions if supported
                encoding_format="base64",  # Get base64-encoded embeddings
                user="user-123"            # Accepted for OpenAI compatibility; discarded by Venice
            )

    asyncio.run(create_custom_embedding())
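With `encoding_format="base64"`, each embedding arrives as a string that has to be decoded before use. The sketch below assumes the payload is a packed array of little-endian 32-bit floats, which is the convention used by OpenAI-compatible APIs; this page does not confirm the byte layout, so treat both the layout and the helper name `decode_base64_embedding` as assumptions to verify against a real response.

```python
import base64
import struct
from typing import List


def decode_base64_embedding(payload: str) -> List[float]:
    """Decode a base64 string into floats.

    Assumes little-endian float32 packing (an unverified assumption).
    """
    raw = base64.b64decode(payload)
    if len(raw) % 4 != 0:
        raise ValueError("payload length is not a multiple of 4 bytes")
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}f", raw))


# Round-trip demo with locally packed values instead of a real API response.
sample_payload = base64.b64encode(struct.pack("<3f", 0.5, -1.0, 2.0)).decode("ascii")
decoded = decode_base64_embedding(sample_payload)
```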
Billing Resources¶
Venice AI Billing API resources.
This module provides classes for interacting with the Venice AI Billing API, allowing clients to retrieve usage information in different formats (JSON or CSV). The module offers both synchronous and asynchronous interfaces for retrieving billing data, designed to integrate smoothly with the respective client types.
- class venice_ai.resources.billing.AsyncBilling(client: venice_ai._resource.AsyncClientT)[source]
Provides access to billing and usage data operations using asynchronous requests.
Manages asynchronous billing operations, providing methods to retrieve billing usage data in either JSON or CSV format using asynchronous requests. It’s designed to work with AsyncVeniceClient, allowing for non-blocking API calls in asynchronous applications. The class handles request formatting, response parsing, and proper type conversions based on the requested format.
- Parameters:
client (venice_ai.AsyncVeniceClient) – The asynchronous Venice AI client instance used for making API requests.
- async get_usage(*, format: venice_ai.types.billing.BillingFormatEnum = BillingFormatEnum.JSON, currency: str | None = None, startDate: str | None = None, endDate: str | None = None, limit: int | None = None, page: int | None = None, sortOrder: str | None = None)[source]
Retrieves billing usage information asynchronously.
Fetches usage data from the Venice AI Billing API with various filtering options, using asynchronous HTTP requests. The method sets the appropriate ‘Accept’ header ('application/json' or 'text/csv') based on the requested format, which determines how the API processes and returns the data.
- Parameters:
format (venice_ai.types.billing.BillingFormatEnum) – Response format (JSON or CSV). Defaults to JSON.
currency (Optional[str]) – Optional currency filter (USD or VCU).
startDate (Optional[str]) – Optional start date (ISO 8601 format, e.g., "2025-01-01T00:00:00Z").
endDate (Optional[str]) – Optional end date (ISO 8601 format, e.g., "2025-05-01T00:00:00Z").
limit (Optional[int]) – Optional number of items per page (1-500, default 200).
page (Optional[int]) – Optional page number for pagination (default 1).
sortOrder (Optional[str]) – Optional sort order for timestamp (asc/desc, default 'desc').
- Returns:
Billing usage data as BillingUsageResponse for JSON, or bytes for CSV.
- Return type:
Union[venice_ai.types.billing.BillingUsageResponse, bytes]
- Raises:
venice_ai.exceptions.InvalidRequestError – If parameter values are invalid.
venice_ai.exceptions.AuthenticationError – If the API key is invalid.
venice_ai.exceptions.PermissionDeniedError – If access is denied.
venice_ai.exceptions.RateLimitError – If rate limits are exceeded.
venice_ai.exceptions.APIError – For other API-related errors.
Example
    import asyncio
    from venice_ai import AsyncVeniceClient
    from venice_ai.types.billing import BillingFormatEnum

    async def get_usage_example():
        async with AsyncVeniceClient(api_key="your-api-key") as client:
            # Get JSON usage data
            usage_response = await client.billing.get_usage(
                startDate="2025-01-01T00:00:00Z",
                endDate="2025-05-01T00:00:00Z",
                limit=10,
                page=1
            )

            # Access usage records
            for usage_record in usage_response['data']:
                print(f"Date: {usage_record['timestamp']}, Cost: {usage_record['amount']}")

            # Get CSV usage data
            usage_csv = await client.billing.get_usage(
                format=BillingFormatEnum.CSV,
                startDate="2025-01-01T00:00:00Z",
                endDate="2025-05-01T00:00:00Z"
            )

            # Write CSV to file
            with open("usage.csv", "wb") as f:
                f.write(usage_csv)

    asyncio.run(get_usage_example())
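Since the CSV format is returned as raw `bytes`, it can also be parsed in memory instead of being written to a file. The following sketch uses only the standard library; the helper name `parse_usage_csv` and the column names in the sample (`timestamp`, `amount`, `currency`) are assumptions for illustration, and the real header row may differ.

```python
import csv
import io
from typing import Dict, List


def parse_usage_csv(csv_bytes: bytes) -> List[Dict[str, str]]:
    """Decode CSV bytes and return one dict per row, keyed by header name."""
    text = csv_bytes.decode("utf-8-sig")  # tolerate an optional UTF-8 BOM
    return list(csv.DictReader(io.StringIO(text)))


# Hypothetical sample; real bytes would come from get_usage(format=BillingFormatEnum.CSV).
sample_csv = (
    b"timestamp,amount,currency\n"
    b"2025-01-02T00:00:00Z,0.25,USD\n"
    b"2025-01-03T00:00:00Z,0.75,USD\n"
)
rows = parse_usage_csv(sample_csv)
total = sum(float(row["amount"]) for row in rows)
```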
- class venice_ai.resources.billing.Billing(client: venice_ai._resource.SyncClientT)[source]
Bases: APIResource
Provides access to billing and usage data operations.
Manages synchronous billing operations, providing methods to retrieve billing usage data in either JSON or CSV format. It handles API requests to the Venice AI Billing API endpoints, managing request parameters, headers, and response formats. When initialized with a VeniceClient instance, it inherits the client’s configuration including API key authentication.
- Parameters:
client (venice_ai.VeniceClient) – The Venice AI client instance used for making API requests.
- get_usage(*, format: venice_ai.types.billing.BillingFormatEnum = BillingFormatEnum.JSON, currency: str | None = None, startDate: str | None = None, endDate: str | None = None, limit: int | None = None, page: int | None = None, sortOrder: str | None = None)[source]
Retrieves billing usage information.
Fetches usage data from the Venice AI Billing API with various filtering options. The response format is determined by the ‘format’ parameter and corresponding ‘Accept’ header: ‘application/json’ for JSON format or ‘text/csv’ for CSV format.
- Parameters:
format (venice_ai.types.billing.BillingFormatEnum) – Response format (JSON or CSV). Defaults to JSON.
currency (Optional[str]) – Optional currency filter (USD or VCU).
startDate (Optional[str]) – Optional start date (ISO 8601 format, e.g., "2025-01-01T00:00:00Z").
endDate (Optional[str]) – Optional end date (ISO 8601 format, e.g., "2025-05-01T00:00:00Z").
limit (Optional[int]) – Optional number of items per page (1-500, default 200).
page (Optional[int]) – Optional page number for pagination (default 1).
sortOrder (Optional[str]) – Optional sort order for timestamp (asc/desc, default 'desc').
- Returns:
Billing usage data as BillingUsageResponse for JSON, or bytes for CSV.
- Return type:
Union[venice_ai.types.billing.BillingUsageResponse, bytes]
- Raises:
venice_ai.exceptions.InvalidRequestError – If parameter values are invalid.
venice_ai.exceptions.AuthenticationError – If the API key is invalid.
venice_ai.exceptions.PermissionDeniedError – If access is denied.
venice_ai.exceptions.RateLimitError – If rate limits are exceeded.
venice_ai.exceptions.APIError – For other API-related errors.
Example
from venice_ai import VeniceClient
from venice_ai.types.billing import BillingFormatEnum

client = VeniceClient(api_key="your-api-key")

# Get JSON usage data
usage_response = client.billing.get_usage(
    startDate="2025-01-01T00:00:00Z",
    endDate="2025-05-01T00:00:00Z",
    limit=10,
    page=1
)

# Access usage records
for usage_record in usage_response['data']:
    print(f"Date: {usage_record['timestamp']}, Cost: {usage_record['amount']}")

# Get CSV usage data
usage_csv = client.billing.get_usage(
    format=BillingFormatEnum.CSV,
    startDate="2025-01-01T00:00:00Z",
    endDate="2025-05-01T00:00:00Z"
)

# Write CSV to file
with open("usage.csv", "wb") as f:
    f.write(usage_csv)
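The limit and page parameters described above suggest a simple paging loop. The following is a minimal sketch, not part of the library: `iter_usage_records` is a hypothetical helper, it takes the `get_usage` callable as an argument so it can be exercised without network access, and it assumes that a page shorter than `limit` marks the end of the data (the reference does not say how the last page is signalled).

```python
from typing import Any, Callable, Dict, Iterator


def iter_usage_records(
    get_usage: Callable[..., Dict[str, Any]],
    *,
    limit: int = 200,
    **filters: Any,
) -> Iterator[Dict[str, Any]]:
    """Yield usage records page by page (hypothetical helper)."""
    # The documented range for `limit` is 1-500.
    if not 1 <= limit <= 500:
        raise ValueError("limit must be between 1 and 500")
    page = 1
    while True:
        response = get_usage(limit=limit, page=page, **filters)
        records = response["data"]
        yield from records
        # Assumption: a page shorter than `limit` is the last one.
        if len(records) < limit:
            break
        page += 1
```

With a real client this could be called as `iter_usage_records(client.billing.get_usage, startDate="2025-01-01T00:00:00Z")`, provided the JSON response supports `response["data"]` as in the example above.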
- class venice_ai.resources.billing.AsyncBilling(client: venice_ai._resource.AsyncClientT)[source]
Bases: AsyncAPIResource
Provides access to billing and usage data operations using asynchronous requests.
Manages asynchronous billing operations, providing methods to retrieve billing usage data in either JSON or CSV format. It is designed to work with AsyncVeniceClient, allowing for non-blocking API calls in asynchronous applications. The class handles request formatting, response parsing, and proper type conversions based on the requested format.
- Parameters:
client (venice_ai.AsyncVeniceClient) – The asynchronous Venice AI client instance used for making API requests.
- async get_usage(*, format: venice_ai.types.billing.BillingFormatEnum = BillingFormatEnum.JSON, currency: str | None = None, startDate: str | None = None, endDate: str | None = None, limit: int | None = None, page: int | None = None, sortOrder: str | None = None)[source]
Retrieves billing usage information asynchronously.
Fetches usage data from the Venice AI Billing API with various filtering options, using asynchronous HTTP requests. The method sets the appropriate ‘Accept’ header ('application/json' or 'text/csv') based on the requested format, which determines how the API processes and returns the data.
- Parameters:
format (venice_ai.types.billing.BillingFormatEnum) – Response format (JSON or CSV). Defaults to JSON.
currency (Optional[str]) – Optional currency filter (USD or VCU).
startDate (Optional[str]) – Optional start date (ISO 8601 format, e.g., "2025-01-01T00:00:00Z").
endDate (Optional[str]) – Optional end date (ISO 8601 format, e.g., "2025-05-01T00:00:00Z").
limit (Optional[int]) – Optional number of items per page (1-500, default 200).
page (Optional[int]) – Optional page number for pagination (default 1).
sortOrder (Optional[str]) – Optional sort order for timestamp (asc/desc, default 'desc').
- Returns:
Billing usage data as BillingUsageResponse for JSON, or bytes for CSV.
- Return type:
Union[venice_ai.types.billing.BillingUsageResponse, bytes]
- Raises:
venice_ai.exceptions.InvalidRequestError – If parameter values are invalid.
venice_ai.exceptions.AuthenticationError – If the API key is invalid.
venice_ai.exceptions.PermissionDeniedError – If access is denied.
venice_ai.exceptions.RateLimitError – If rate limits are exceeded.
venice_ai.exceptions.APIError – For other API-related errors.
Example
import asyncio
from venice_ai import AsyncVeniceClient
from venice_ai.types.billing import BillingFormatEnum

async def get_usage_example():
    async with AsyncVeniceClient(api_key="your-api-key") as client:
        # Get JSON usage data
        usage_response = await client.billing.get_usage(
            startDate="2025-01-01T00:00:00Z",
            endDate="2025-05-01T00:00:00Z",
            limit=10,
            page=1
        )

        # Access usage records
        for usage_record in usage_response['data']:
            print(f"Date: {usage_record['timestamp']}, Cost: {usage_record['amount']}")

        # Get CSV usage data
        usage_csv = await client.billing.get_usage(
            format=BillingFormatEnum.CSV,
            startDate="2025-01-01T00:00:00Z",
            endDate="2025-05-01T00:00:00Z"
        )

        # Write CSV to file
        with open("usage.csv", "wb") as f:
            f.write(usage_csv)

asyncio.run(get_usage_example())
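The startDate and endDate parameters take ISO 8601 strings such as "2025-01-01T00:00:00Z". As a sketch of one way to produce them from datetime objects, the helpers below (`to_iso_utc` and `usage_window`, both hypothetical names that are not part of venice_ai) normalise to UTC and emit the same `Z`-suffixed form used in the examples.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict


def to_iso_utc(moment: datetime) -> str:
    """Format a timezone-aware datetime as an ISO 8601 UTC string."""
    if moment.tzinfo is None:
        # Naive datetimes are rejected to avoid guessing the time zone.
        raise ValueError("expected a timezone-aware datetime")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def usage_window(end: datetime, days: int) -> Dict[str, str]:
    """Build startDate/endDate keyword arguments covering `days` days."""
    start = end - timedelta(days=days)
    return {"startDate": to_iso_utc(start), "endDate": to_iso_utc(end)}
```

The returned dictionary can be unpacked into a call, for example `get_usage(**usage_window(end, 30))`.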
Characters Resources¶
- class venice_ai.resources.characters.Characters(client: venice_ai._resource.SyncClientT)[source]
Bases: APIResource
Provides methods for managing AI character definitions.
Characters represent pre-defined personalities or specialized AI assistants that can be referenced in chat completions requests. This resource provides methods to list available characters.
- Parameters:
client (venice_ai._client.VeniceClient) – The Venice AI client instance used for API requests.
Warning
The Characters API is currently in Preview and may change in future releases.
- list(*, extra_headers: httpx.Headers | None = None, extra_query: Dict[str, Any] | None = None, extra_body: Dict[str, Any] | None = None, timeout: float | None = None)[source]
List all characters.
Retrieves a list of all characters usable with the Venice AI API. Each character includes details such as ID, name, and description.
- Parameters:
extra_headers (Optional[httpx.Headers]) – Additional HTTP headers to include in the request.
extra_query (Optional[Dict[str, Any]]) – Additional query parameters to include in the request.
extra_body (Optional[Dict[str, Any]]) – Additional body parameters to include in the request.
timeout (Optional[float]) – Request timeout in seconds.
- Returns:
A list of available characters.
- Return type:
CharacterList
- Raises:
venice_ai.exceptions.APIError – If the API request fails.
Example
from venice_ai import VeniceClient

client = VeniceClient(api_key="your-api-key")

characters_response = client.characters.list()
for character in characters_response.data:
    print(f"Character ID: {character.slug}, Name: {character.name}")
- class venice_ai.resources.characters.AsyncCharacters(client: venice_ai._resource.AsyncClientT)[source]
Bases: AsyncAPIResource
Provides methods for managing AI character definitions asynchronously.
Provides asynchronous methods to list available characters. This class mirrors the functionality of the synchronous Characters resource but operates in an asynchronous context.
- Parameters:
client (venice_ai._async_client.AsyncVeniceClient) – The async Venice AI client instance used for API requests.
Warning
The Characters API is currently in Preview and may change in future releases.
- async list(*, extra_headers: httpx.Headers | None = None, extra_query: Dict[str, Any] | None = None, extra_body: Dict[str, Any] | None = None, timeout: float | None = None)[source]
List all characters asynchronously.
Retrieves a list of all characters usable with the Venice AI API asynchronously. Each character includes details such as ID, name, and description.
- Parameters:
extra_headers (Optional[httpx.Headers]) – Additional HTTP headers to include in the request.
extra_query (Optional[Dict[str, Any]]) – Additional query parameters to include in the request.
extra_body (Optional[Dict[str, Any]]) – Additional body parameters to include in the request.
timeout (Optional[float]) – Request timeout in seconds.
- Returns:
A list of available characters.
- Return type:
CharacterList
- Raises:
venice_ai.exceptions.APIError – If the API request fails.
Example
import asyncio
from venice_ai import AsyncVeniceClient

async def main():
    client = AsyncVeniceClient(api_key="your-api-key")

    characters_response = await client.characters.list()
    for character in characters_response.data:
        print(f"Character ID: {character.slug}, Name: {character.name}")

    await client.close()

asyncio.run(main())
Type Definitions¶
Type definitions for Venice AI Chat Completions API.
This module contains Pydantic models for response objects and TypedDict definitions for request objects in the Venice AI Chat Completions API, including support for tools, tool calls, log probabilities, and streaming.
- class venice_ai.types.chat.ChatCompletion(**data: Any)[source]
Represents the complete response from a chat completion request.
- Parameters:
data (typing.Any)
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class venice_ai.types.chat.ChatCompletionChoice(**data: Any)[source]
Represents a single completion choice generated by the model.
- Parameters:
data (typing.Any)
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class venice_ai.types.chat.ChatCompletionChoiceLogprobs(**data: Any)[source]
Aggregates log probability information for all tokens in a completion choice.
- Parameters:
data (typing.Any)
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class venice_ai.types.chat.ChatCompletionChunk(**data: Any)[source]
Represents a single chunk in a streaming chat completion response.
- Parameters:
data (typing.Any)
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class venice_ai.types.chat.ChatCompletionChunkChoice(**data: Any)[source]
Represents a single choice within a streaming chat completion chunk.
- Parameters:
data (typing.Any)
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class venice_ai.types.chat.ChatCompletionChunkChoiceDelta(**data: Any)[source]
Contains the incremental changes for a choice in a streaming chat completion.
- Parameters:
data (typing.Any)
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class venice_ai.types.chat.ChatCompletionChunkToolCall(**data: Any)[source]
Represents an incremental tool call within a streaming chat completion chunk. Fields are optional as they arrive incrementally.
- Parameters:
data (typing.Any)
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class venice_ai.types.chat.ChatCompletionChunkToolCallFunction(**data: Any)[source]
Represents function call details within a streaming chat completion chunk. Fields are optional as they arrive incrementally.
- Parameters:
data (typing.Any)
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class venice_ai.types.chat.ChatCompletionMessage(**data: Any)[source]
Represents a message returned by the model in a chat completion response.
- Parameters:
data (typing.Any)
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class venice_ai.types.chat.ChatCompletionTokenLogprob(**data: Any)[source]
Contains comprehensive log probability information for a single token.
- Parameters:
data (typing.Any)
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class venice_ai.types.chat.ChatCompletionTopLogprob(**data: Any)[source]
Represents log probability information for alternative tokens at a specific position.
- Parameters:
data (typing.Any)
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class venice_ai.types.chat.ChunkModelFactory(**data: Any)[source]
A protocol for classes that can be instantiated from keyword arguments. Used to define the expected interface for stream_cls in chat completions, where the class’s __init__ method should accept **data.
- Parameters:
data (typing.Any)
- class venice_ai.types.chat.CreateChatCompletionRequest[source]
Defines the complete request structure for creating a chat completion.
This class encapsulates all parameters and options available for chat completion requests, including conversation messages, model selection, generation parameters, tool specifications, and Venice-specific features.
Used as the primary input type for chat completion endpoints, providing comprehensive control over model behavior, output format, tool usage, and specialized features. Supports both streaming and non-streaming completions with extensive customization options.
-
frequency_penalty:
typing.NotRequired[float] Optional. Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model’s likelihood to repeat the same line verbatim.
-
logit_bias:
typing.NotRequired[typing.Dict[str,int]] Optional. Modify the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100.
-
logprobs:
typing.NotRequired[bool] Optional. Whether to return log probabilities of the output tokens, which appear in the logprobs property of the choice object. Defaults to false.
-
max_completion_tokens:
typing.NotRequired[int] Optional. The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model’s context length.
-
max_temp:
typing.NotRequired[float] Optional. Maximum temperature value for dynamic temperature scaling. Range: 0 <= x <= 2.
-
max_tokens:
typing.NotRequired[int] Optional. Deprecated. Please use max_completion_tokens instead. The maximum number of tokens to generate in the chat completion. The total length of input tokens and generated tokens is limited by the model’s context length.
-
messages:
typing.Sequence[venice_ai.types.chat.MessageParam] A list of messages comprising the conversation so far.
-
min_p:
typing.NotRequired[float] Optional. Sets a minimum probability threshold for token selection. Tokens with probabilities below this value are filtered out. Range: 0 <= x <= 1.
-
min_temp:
typing.NotRequired[float] Optional. Minimum temperature value for dynamic temperature scaling. Range: 0 <= x <= 2.
-
model:
str ID of the model to use. See the model endpoint compatibility table for details on which models support this endpoint.
-
n:
typing.NotRequired[int] Optional. How many chat completion choices to generate for each input message. Note that you will be charged for the number of generated tokens across all of the choices. Defaults to 1.
-
parallel_tool_calls:
typing.NotRequired[bool] Optional. Whether to enable parallel function calling during tool use.
-
presence_penalty:
typing.NotRequired[float] Optional. Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model’s likelihood to talk about new topics.
-
repetition_penalty:
typing.NotRequired[float] Optional. Penalty for token repetition.
-
response_format:
typing.NotRequired[venice_ai.types.chat.ResponseFormat] Optional. An object specifying the format that the model must output. Setting to { "type": "json_object" } enables JSON mode, which guarantees the message the model generates is valid JSON.
-
seed:
typing.NotRequired[int] Optional. This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
-
stop:
typing.NotRequired[typing.Union[str,typing.List[str]]] Optional. Up to 4 sequences where the API will stop generating further tokens.
-
stop_token_ids:
typing.NotRequired[typing.List[int]] Optional. List of token IDs at which to stop generation.
-
stream:
typing.NotRequired[bool] Optional. If set, partial message deltas will be sent, like in ChatGPT. Tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message. Defaults to false.
-
stream_options:
typing.NotRequired[venice_ai.types.chat.StreamOptions] Optional. Options for streaming response. Only used if stream is true.
-
temperature:
typing.NotRequired[float] Optional. What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. Defaults to 0.7.
-
tool_choice:
typing.NotRequired[typing.Union[typing.Literal['none','auto'],venice_ai.types.chat.ToolChoiceObject]] Optional. Controls which (if any) function is called by the model. none means the model will not call a function and instead generates a message. auto means the model can pick between generating a message or calling a function. Specifying a particular function via {"type": "function", "function": {"name": "my_function"}} forces the model to call that function.
-
tools:
typing.NotRequired[typing.List[venice_ai.types.chat.Tool]] Optional. A list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for.
-
top_k:
typing.NotRequired[int] Optional. Number of highest probability vocabulary tokens to keep for top-k-filtering.
-
top_logprobs:
typing.NotRequired[int] Optional. An integer between 0 and 5 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to true if this parameter is used.
-
top_p:
typing.NotRequired[float] Optional. An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. Defaults to 1.
-
user:
typing.NotRequired[str] Optional. A unique identifier representing your end-user, which can help Venice monitor and detect abuse.
-
venice_parameters:
typing.NotRequired[venice_ai.types.chat.VeniceParameters] Optional. Venice-specific parameters to extend or modify API behavior.
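Several of the fields above carry documented ranges and dependencies (penalties between -2.0 and 2.0, temperature between 0 and 2, top_logprobs between 0 and 5 with logprobs enabled, up to 4 stop sequences). The sketch below checks a plain request dictionary against those documented limits before sending it. `check_chat_request` is a hypothetical helper, not a library function, and the library or API may apply its own validation.

```python
from typing import Any, Dict, List


def check_chat_request(request: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a chat completion request dict."""
    problems: List[str] = []
    # `model` and `messages` are the only fields not marked NotRequired.
    for key in ("model", "messages"):
        if key not in request:
            problems.append(f"missing required field: {key}")
    for key in ("frequency_penalty", "presence_penalty"):
        value = request.get(key)
        if value is not None and not -2.0 <= value <= 2.0:
            problems.append(f"{key} must be between -2.0 and 2.0")
    temperature = request.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        problems.append("temperature must be between 0 and 2")
    top_logprobs = request.get("top_logprobs")
    if top_logprobs is not None:
        if not 0 <= top_logprobs <= 5:
            problems.append("top_logprobs must be between 0 and 5")
        if not request.get("logprobs"):
            problems.append("top_logprobs requires logprobs to be true")
    stop = request.get("stop")
    if isinstance(stop, list) and len(stop) > 4:
        problems.append("stop accepts at most 4 sequences")
    return problems
```

An empty list means no documented limit was violated; it does not guarantee that the API will accept the request.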
- class venice_ai.types.chat.FunctionDefinition[source]
Defines the structure and parameters of a function that can be called by the model.
- class venice_ai.types.chat.MessageParam[source]
Defines the structure of a message in a chat conversation for requests.
- class venice_ai.types.chat.ResponseFormat[source]
Specifies the desired output format for the model’s response.
This class enables structured output generation by constraining the model to produce responses in specific formats, particularly JSON. Supports both general JSON mode and schema-constrained JSON generation for applications requiring structured data output.
Used in chat completion requests to ensure the model’s response conforms to expected formats, enabling reliable parsing and processing of model output in structured applications.
-
json_schema:
typing.NotRequired[typing.Dict[str,typing.Any]] Optional. A JSON schema object that the model’s output must adhere to. Only used if type is json_schema.
-
type:
typing.Literal['json_object','json_schema'] Must be one of json_object or json_schema. Setting to json_object enables JSON mode, directing the model to generate a valid JSON object. Setting to json_schema also enables JSON mode and additionally requires the model to generate a JSON object that conforms to the provided JSON schema.
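As an illustration of the two modes, the sketch below builds one ResponseFormat dictionary of each type and parses a JSON reply. It assumes that json_schema accepts a plain JSON Schema object; the exact shape the API expects is not specified in this reference. `parse_json_reply` is a hypothetical helper.

```python
import json
from typing import Any, Dict

# General JSON mode: the model is directed to emit a valid JSON object.
json_mode: Dict[str, Any] = {"type": "json_object"}

# Schema-constrained mode. Assumption: a plain JSON Schema is accepted here.
schema_mode: Dict[str, Any] = {
    "type": "json_schema",
    "json_schema": {
        "type": "object",
        "properties": {
            "city": {"type": "string"},
            "population": {"type": "integer"},
        },
        "required": ["city", "population"],
    },
}


def parse_json_reply(content: str) -> Dict[str, Any]:
    """Parse a message's content and insist on a JSON object."""
    parsed = json.loads(content)
    if not isinstance(parsed, dict):
        raise ValueError("expected a JSON object")
    return parsed
```

Either dictionary would be passed as the response_format field of the request; the reply's message content can then be handed to `parse_json_reply`.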
- class venice_ai.types.chat.StreamOptions[source]
Configures the behavior and features of streaming chat completion responses.
This class provides options for controlling how streaming responses are delivered, including whether to include usage statistics in the final chunk. Used in chat completion requests when streaming is enabled to customize the streaming behavior according to client needs.
Enables fine-grained control over streaming features, allowing clients to optimize for their specific use cases and processing requirements.
-
include_usage:
bool If set, an additional chunk will be streamed before the data: [DONE] message. This chunk will contain a usage field, providing token usage information for the entire request.
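With include_usage set, the usage information arrives in an extra chunk near the end of the stream. The sketch below shows one way a consumer might accumulate text and pick up that usage chunk. It works on plain dictionaries and assumes the common `choices[].delta.content` chunk layout; the library itself yields ChatCompletionChunk model objects, whose fields are not listed in this section, so treat the field names as assumptions. `collect_stream` is a hypothetical helper.

```python
from typing import Any, Dict, Iterable, List, Optional, Tuple


def collect_stream(
    chunks: Iterable[Dict[str, Any]],
) -> Tuple[str, Optional[Dict[str, Any]]]:
    """Join streamed content deltas and return the last usage block seen."""
    parts: List[str] = []
    usage: Optional[Dict[str, Any]] = None
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
        # The usage chunk may carry an empty `choices` list.
        if chunk.get("usage") is not None:
            usage = chunk["usage"]
    return "".join(parts), usage
```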
- class venice_ai.types.chat.Tool[source]
Represents a tool that the model can invoke during chat completion.
- class venice_ai.types.chat.ToolCall(**data: Any)[source]
Represents a complete tool call made by the model during chat completion. (Response DTO)
- Parameters:
data (typing.Any)
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class venice_ai.types.chat.ToolCallFunction(**data: Any)[source]
Contains the details of a function call made by the model. (Response DTO)
- Parameters:
data (typing.Any)
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class venice_ai.types.chat.ToolChoiceFunction[source]
Specifies a particular function to be called when using structured tool choice.
- class venice_ai.types.chat.ToolChoiceObject[source]
Defines the object form of tool choice specification for forcing specific tool usage.
- class venice_ai.types.chat.UsageData(**data: Any)[source]
Provides token usage statistics for a chat completion request.
- Parameters:
data (typing.Any)
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class venice_ai.types.chat.VeniceParameters[source]
Contains Venice-specific parameters for customizing chat completion behavior.
This class provides access to Venice AI’s unique features and capabilities, including character personas, web search integration, and system prompt customization. These parameters extend the standard chat completion API with Venice-specific functionality.
Used in chat completion requests to leverage Venice AI’s distinctive features, enabling enhanced conversational experiences and specialized capabilities not available in standard chat completion APIs.
-
character_slug:
str Optional. The slug of a specific character to use for the completion. This will influence the model’s persona, response style, and behavior patterns.
-
disable_thinking:
bool Optional. On supported reasoning models, will disable thinking and strip the <think></think> blocks from the response.
-
enable_web_citations:
bool Optional. When web search is enabled, this will request that the LLM cite its sources using a [REF]0[/REF] format.
-
enable_web_search:
typing.Literal['on','off','auto'] Optional. Controls whether the model can perform web searches to enhance responses. on always enables search, off disables it completely, auto (default) lets the model decide based on context.
-
include_search_results_in_stream:
bool Optional. Experimental feature. When set to true, the LLM will include search results in the first emitted chunk.
-
include_venice_system_prompt:
bool Optional. If true (default), the default Venice system prompt will be included. Set to false to exclude it and use only the provided messages.
-
strip_thinking_response:
bool Optional. Strip <think></think> blocks from the response. Applicable only to reasoning/thinking models.
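A small sketch of assembling a venice_parameters dictionary from the fields documented above. `build_venice_parameters` is a hypothetical helper; the check that citations need web search reflects the "when web search is enabled" wording and is this sketch's own choice, not documented library behaviour.

```python
from typing import Any, Dict, Optional


def build_venice_parameters(
    character_slug: Optional[str] = None,
    *,
    web_search: str = "auto",
    citations: bool = False,
    include_system_prompt: bool = True,
) -> Dict[str, Any]:
    """Assemble a venice_parameters dict for a chat completion request."""
    if web_search not in ("on", "off", "auto"):
        raise ValueError("web_search must be 'on', 'off' or 'auto'")
    if citations and web_search == "off":
        # Citations are only requested when web search is enabled.
        raise ValueError("citations require web search to be enabled")
    params: Dict[str, Any] = {
        "enable_web_search": web_search,
        "enable_web_citations": citations,
        "include_venice_system_prompt": include_system_prompt,
    }
    if character_slug is not None:
        params["character_slug"] = character_slug
    return params
```

The slug passed here would typically come from an entry returned by the Characters resource's list() method.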
- class venice_ai.types.chat.VeniceParametersResponse(**data: Any)[source]
Venice-specific parameters included in the chat completion response.
Contains information about Venice-specific features that were used or configured for the request, including web search settings, character information, and thinking/reasoning controls.
- Parameters:
data (typing.Any)
-
character_slug:
typing.Optional[str] The character slug used for this request, if any.
-
disable_thinking:
bool Whether thinking was disabled for this request.
-
enable_web_citations:
bool Whether web citations were enabled for this request.
-
enable_web_search:
typing.Literal['auto','off','on'] The web search setting that was used for this request.
-
include_search_results_in_stream:
bool Whether search results were included in the stream.
-
include_venice_system_prompt:
bool Whether the Venice system prompt was included.
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
-
strip_thinking_response:
bool Whether thinking responses were stripped from the output.
-
web_search_citations:
typing.List[venice_ai.types.chat.WebSearchCitation] List of web search citations if web search was performed.
- class venice_ai.types.chat.WebSearchCitation(**data: Any)[source]
Represents a web search citation in the Venice parameters response.
Contains information about web sources cited by the model when web search is enabled, including the source URL, title, content snippet, and date.
- Parameters:
data (typing.Any)
-
content:
typing.Optional[str] A snippet of content from the web source.
-
date:
typing.Optional[str] The date of the web source in ISO format.
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
-
title:
str The title of the web page or source.
-
url:
str The URL of the web source.
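When enable_web_citations is on, the reply text contains markers in the [REF]0[/REF] format. The sketch below resolves those markers against a list of citations. It assumes the number inside the marker is a zero-based index into web_search_citations, which this reference does not confirm, and it uses plain dictionaries in place of WebSearchCitation objects. `cited_sources` is a hypothetical helper.

```python
import re
from typing import Any, Dict, List

# Matches markers such as [REF]0[/REF] and captures the number.
_REF_PATTERN = re.compile(r"\[REF\](\d+)\[/REF\]")


def cited_sources(text: str, citations: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return the citations referenced in `text`, in order of first mention."""
    seen: List[int] = []
    for match in _REF_PATTERN.finditer(text):
        index = int(match.group(1))
        # Assumption: the marker number indexes into the citations list.
        if 0 <= index < len(citations) and index not in seen:
            seen.append(index)
    return [citations[i] for i in seen]
```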
- class venice_ai.types.models.Model[source]
Represents a single AI model available through the Venice.ai API.
Contains comprehensive information about an AI model including its identification and specifications. The model_spec field contains all the detailed information about pricing, capabilities, and constraints.
- Parameters:
id (str) – Unique identifier for the model.
object (Literal["model"]) – Object type, always "model".
created (int) – Unix timestamp (seconds) of when the model was created.
owned_by (str) – Organization or user that owns the model.
type (ModelType) – Type of the model (e.g., "text", "image").
model_spec (ModelSpec) – Detailed specifications including pricing, capabilities, and constraints.
- class venice_ai.types.models.ModelCapabilities[source]
Defines the functional capabilities and limitations of an AI model.
Specifies what features a model supports, including code optimization, quantization method, reasoning, vision, and various other capabilities. Uses camelCase field names to match the API response format.
- Parameters:
optimizedForCode (bool) – Indicates if the model is optimized for code generation.
quantization (str) – The quantization method used (e.g., “fp16”, “fp8”).
supportsFunctionCalling (bool) – Indicates if the model supports function calling.
supportsReasoning (bool) – Indicates if the model supports reasoning capabilities.
supportsResponseSchema (bool) – Indicates if the model supports structured response schemas.
supportsVision (bool) – Indicates if the model supports vision/image understanding.
supportsWebSearch (bool) – Indicates if the model supports web search integration.
supportsLogProbs (bool) – Indicates if the model supports log probability output.
streaming (bool) – Legacy field - Indicates if the model supports streaming responses.
async (bool) – Legacy field - Indicates if the model supports asynchronous operations.
max_tokens (int) – Legacy field - Maximum number of tokens the model can process.
supports_functions (bool) – Legacy field - Use supportsFunctionCalling instead.
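The capability flags above can be used to filter a model list. The sketch below works on a plain dictionary shaped like the ModelList and Model types described in this section (`data`, `id`, `model_spec`, `capabilities`); `models_with` is a hypothetical helper, and a missing flag is treated as unsupported.

```python
from typing import Any, Dict, List


def models_with(model_list: Dict[str, Any], capability: str) -> List[str]:
    """Return the IDs of models whose capabilities set `capability` to true."""
    matching: List[str] = []
    for model in model_list["data"]:
        capabilities = model.get("model_spec", {}).get("capabilities", {})
        # A flag that is absent from the response counts as not supported.
        if capabilities.get(capability, False):
            matching.append(model["id"])
    return matching
```

For example, `models_with(model_list, "supportsFunctionCalling")` would select candidates for requests that use the tools field.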
- class venice_ai.types.models.ModelCompatibilityList[source]
Represents a mapping of external model names to Venice.ai model IDs.
Provides compatibility mappings that allow users to reference models using external naming conventions (e.g., OpenAI model names) while automatically resolving to the corresponding Venice.ai model IDs.
- class venice_ai.types.models.ModelConstraints[source]
Defines parameter constraints and valid ranges for a model.
Contains the allowable ranges and default values for various model parameters like temperature and top_p. Used within the Model class to specify parameter limits.
- Parameters:
temperature (ModelConstraintsTemperature) – Constraints for the temperature parameter.
top_p (ModelConstraintsTopP) – Constraints for the top_p parameter.
- class venice_ai.types.models.ModelConstraintsTemperature[source]
Defines valid range and default value for the temperature parameter.
Specifies the constraints for the temperature parameter that controls randomness in model outputs. Used within ModelConstraints. Note: The API may only return ‘default’ without min/max values.
- class venice_ai.types.models.ModelConstraintsTopP[source]
Defines valid range and default value for the top_p parameter.
Specifies the constraints for the top_p parameter that controls nucleus sampling in model outputs. Used within ModelConstraints. Note: The API may only return ‘default’ without min/max values.
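Because the API may return only ‘default’ without min/max values, code that applies these constraints has to tolerate missing bounds. The sketch below shows one way to do that with a plain dictionary; `resolve_sampling_value` is a hypothetical helper, and it assumes the keys are named `default`, `min`, and `max` as the note suggests.

```python
from typing import Dict, Optional


def resolve_sampling_value(
    requested: Optional[float],
    constraint: Dict[str, float],
) -> float:
    """Pick a temperature or top_p value that respects a model's constraint."""
    if requested is None:
        # Fall back to the model's default when the caller gives no value.
        return constraint["default"]
    # Missing bounds leave the requested value unchanged on that side.
    low = constraint.get("min", requested)
    high = constraint.get("max", requested)
    return max(low, min(high, requested))
```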
- class venice_ai.types.models.ModelList[source]
Represents a collection of AI models returned by the list models endpoint.
Contains a list of available Model objects along with metadata about the collection. Typically returned when querying for available models through the API.
- Parameters:
object (Literal["list"]) – Object type, always "list".
data (List[Model]) – A list of available Model objects.
type (Optional[ModelType]) – Optional. The type of models in the list, if filtered.
- class venice_ai.types.models.ModelPricing[source]
Represents pricing information for an AI model.
Defines the cost structure for using a model, including costs per token, image, or time unit depending on the model type. Used within the Model class to provide billing information.
The pricing structure now supports both USD and VCU (Venice Compute Units) for accurate cost tracking and billing.
- Parameters:
input (PricingUnit) – Pricing for input operations.
output (PricingUnit) – Pricing for output operations.
input_cost_per_mtok (float) – Legacy: Cost for input per 1000 tokens (USD only).
output_cost_per_mtok (float) – Legacy: Cost for output per 1000 tokens (USD only).
input_cost_per_image (float) – Cost for input per image.
output_cost_per_image (float) – Cost for output per image.
input_cost_per_second (float) – Cost for input per second (e.g., audio).
output_cost_per_second (float) – Cost for output per second (e.g., audio).
- class venice_ai.types.models.ModelSpec[source]
Defines the specifications for a model including pricing and capabilities.
Contains detailed information about a model’s pricing structure, capabilities, constraints, and other specifications. This is the main container for model metadata in the API response.
- Parameters:
pricing (ModelPricing) – Pricing information for the model with USD and VCU costs.
availableContextTokens (int) – Maximum context window size in tokens.
capabilities (ModelCapabilities) – Model capabilities and feature support.
constraints (ModelConstraints) – Parameter constraints for the model.
name (str) – Human-readable name of the model.
modelSource (str) – URL or reference to the model source.
offline (bool) – Whether the model is currently offline.
traits (List[str]) – List of model traits (e.g., “default”, “fastest”).
beta (bool) – Indicates if this is a beta model (optional).
- class venice_ai.types.models.ModelTraitList[source]
Represents a mapping of model traits to their corresponding model IDs.
Provides a way to map semantic model traits (like “default”, “fastest”, “most_accurate”) to specific model IDs. Used for trait-based model selection through the API.
- venice_ai.types.models.ModelType
Type alias for valid model types in the Venice.ai API.
Defines the available categories of AI models that can be filtered when listing models, traits, or compatibility mappings. Each type represents a different class of AI functionality:
"embedding": Models that generate vector embeddings from text
"image": Models for image generation and manipulation
"text": Models for text generation and chat completions
"tts": Text-to-speech models for audio generation
"upscale": Models for image upscaling and enhancement
alias of Literal['embedding', 'image', 'text', 'tts', 'upscale']
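When a model type arrives as free-form input (for example from a CLI flag), it can be checked against the alias before being sent to the API. This sketch redefines the alias locally so that it runs on its own; `parse_model_type` is a hypothetical helper, not part of the library.

```python
from typing import Literal, get_args

# Local copy of the documented alias, so the example is self-contained.
ModelType = Literal["embedding", "image", "text", "tts", "upscale"]
VALID_MODEL_TYPES = frozenset(get_args(ModelType))


def parse_model_type(value: str) -> str:
    """Normalize user input and reject anything outside the documented set."""
    normalized = value.strip().lower()
    if normalized not in VALID_MODEL_TYPES:
        allowed = ", ".join(sorted(VALID_MODEL_TYPES))
        raise ValueError(f"unknown model type {value!r}; expected one of: {allowed}")
    return normalized
```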
- class venice_ai.types.models.PricingDetail[source]
Represents pricing details for input and output.
- Parameters:
input (PricingUnit) – Pricing for input (per 1000 tokens for text models).
output (PricingUnit) – Pricing for output (per 1000 tokens for text models).
- class venice_ai.types.models.PricingUnit[source]
Represents a pricing unit with both USD and VCU values.
Type definitions for Venice AI image-related API endpoints.
- class venice_ai.types.image.GenerateImageRequest[source]
Represents parameters for an image generation request to the /image/generate endpoint.
This model defines the structure for requesting image generation using Venice AI’s native image generation API. It provides comprehensive control over generation parameters including model selection, prompts, dimensions, and various quality and style settings.
- Parameters:
model (str) – ID of the model to use for image generation (e.g., "venice-sd35").
prompt (str) – Text prompt describing the image to generate.
cfg_scale (float) – Optional. Classifier Free Guidance scale (1.0-30.0). Higher values adhere more strictly to the prompt.
embed_exif_metadata (bool) – Optional. Whether to embed generation metadata in EXIF data.
format (Literal["jpeg", "png", "webp"]) – Optional. Output image format.
height (int) – Optional. Height of the generated image in pixels.
hide_watermark (bool) – Optional. Whether to hide the Venice AI watermark from the generated image.
lora_strength (int) – Optional. Strength of LoRA model adaptation (0-100).
negative_prompt (str) – Optional. Text describing what to avoid in the generated image.
return_binary (bool) – Optional. If True, return raw image bytes instead of a JSON response with base64 data.
safe_mode (bool) – Optional. Whether to enable content filtering for safer outputs.
seed (int) – Optional. Random seed for reproducible image generation results.
steps (int) – Optional. Number of diffusion steps. Higher values generally improve quality but increase generation time.
style_preset (str) – Optional. Style preset ID to apply to the generated image.
width (int) – Optional. Width of the generated image in pixels.
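The documented ranges for cfg_scale (1.0-30.0) and lora_strength (0-100) can be checked client-side before a request is sent. The sketch below assembles a plain dictionary with the field names listed above. `build_generate_image_payload` is a hypothetical helper, and whether the server enforces the same bounds is not stated in this reference.

```python
from typing import Any, Dict, Optional


def build_generate_image_payload(
    model: str,
    prompt: str,
    *,
    cfg_scale: Optional[float] = None,
    lora_strength: Optional[int] = None,
    **optional_fields: Any,
) -> Dict[str, Any]:
    """Assemble a request body, validating the two documented numeric ranges."""
    if cfg_scale is not None and not 1.0 <= cfg_scale <= 30.0:
        raise ValueError("cfg_scale must be between 1.0 and 30.0")
    if lora_strength is not None and not 0 <= lora_strength <= 100:
        raise ValueError("lora_strength must be between 0 and 100")

    payload: Dict[str, Any] = {"model": model, "prompt": prompt}
    candidates = dict(optional_fields, cfg_scale=cfg_scale, lora_strength=lora_strength)
    # Leave out anything the caller did not set, so server defaults apply.
    payload.update({key: value for key, value in candidates.items() if value is not None})
    return payload
```

Omitting unset optional fields rather than sending explicit nulls keeps the payload close to what the caller actually asked for.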
- class venice_ai.types.image.ImageDataItem(**data: Any)[source]
Represents an individual image data item within a SimpleImageResponse.
This model defines the structure for a single generated image in OpenAI-compatible responses, providing either base64-encoded image data or a URL reference to the generated image depending on the requested response format.
Contains either base64 encoded image data or a URL to the image, but not both.
- Parameters:
b64_json (Optional[str]) – Base64-encoded image data as a JSON string (when response_format is "b64_json").
url (Optional[str]) – URL pointing to the generated image (when response_format is "url").
data (typing.Any)
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
- class venice_ai.types.image.ImageResponse(**data: Any)[source]
Represents the response structure from the /image/generate endpoint.
This model defines the complete response format for Venice AI’s native image generation API, containing the generated images as base64-encoded data along with metadata including timing information and the original request parameters.
- Parameters:
id (str) – Unique identifier for the image generation request, used for tracking and reference.
images (List[str]) – List of base64-encoded image data strings representing the generated images.
request (Optional[Dict[str, Any]]) – Optional. Echo of the original request parameters that were used for generation.
timing (TimingInfo) – Detailed timing information and performance metrics for the request.
created (Optional[str]) – Optional. ISO 8601 timestamp indicating when the image generation was completed.
data (typing.Any)
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
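Since ImageResponse carries its images as base64 strings, they need decoding before they can be written to disk or passed to an imaging library. A minimal sketch, with `decode_images` as a hypothetical helper that takes the `images` list directly:

```python
import base64
from typing import List


def decode_images(images: List[str]) -> List[bytes]:
    """Turn the base64 strings from ImageResponse.images into raw image bytes."""
    return [base64.b64decode(encoded) for encoded in images]
```

With a response object in hand, the call would look like `decode_images(response.images)`. The resulting bytes are in whatever format was requested for generation.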
- class venice_ai.types.image.ImageStyleEnum(*values)[source]
Represents common or example styles for image generation.
This enum provides a static, curated list of frequently used image styles. For a comprehensive and dynamically updated list of all available styles, it is recommended to use the venice_ai.resources.image.Image.get_available_styles() method (or its asynchronous counterpart venice_ai.resources.image.AsyncImage.get_available_styles()) to fetch the current styles directly from the API.
Example static values:
- ANALOG_FILM = 'Analog Film'
Vintage analog film photography style
- ANIME = 'Anime'
Japanese anime/manga artistic style
- CINEMATIC = 'Cinematic'
Movie-like cinematic style with dramatic lighting
- COMIC_BOOK = 'Comic Book'
Comic book illustration style
- THREE_D_MODEL = '3D Model'
3D rendered model style
- class venice_ai.types.image.ImageStyleList[source]
Represents the response structure from the /image/styles endpoint.
This model defines the format for retrieving available image style presets from the Venice AI API. These styles can be used with the style_preset parameter in image generation requests to influence the artistic direction and visual characteristics of generated images.
- Parameters:
data (List[str]) – List of available image style preset names that can be used in generation requests.
object (Literal["list"]) – Type of the response object, always "list" to indicate this is a list response.
- class venice_ai.types.image.SimpleGenerateImageRequest[source]
Represents parameters for an OpenAI-compatible image generation request to the /images/generations endpoint.
This model provides a simplified interface for image generation that maintains compatibility with OpenAI’s image generation API. It offers streamlined parameters for common image generation tasks while supporting Venice AI’s enhanced features like custom quality settings and output formats.
- Parameters:
prompt (str) – Text prompt describing the image to generate.
background (Optional[Literal["transparent", "opaque", "auto"]]) – Optional. Background style for the generated image.
model (str) – ID of the model to use for image generation.
moderation (Optional[Literal["low", "auto"]]) – Optional. Content moderation level to apply during generation.
n (Optional[int]) – Optional. Number of images to generate (typically 1-10).
output_compression (Optional[int]) – Optional. Output image compression level (0-100, where 100 is highest quality).
output_format (Literal["jpeg", "png", "webp"]) – Optional. Output image format.
quality (Optional[Literal["auto", "high", "medium", "low", "hd", "standard"]]) – Optional. Image quality setting that affects generation parameters.
response_format (Optional[Literal["b64_json", "url"]]) – Optional. Format of the response data (base64 JSON or URL).
size (Optional[Literal["auto", "256x256", "512x512", "1024x1024", "1536x1024", "1024x1536", "1792x1024", "1024x1792"]]) – Optional. Dimensions of the generated image in pixels.
style (Optional[Literal["vivid", "natural"]]) – Optional. Artistic style of the generated image.
user (str) – Optional. User identifier for tracking and analytics purposes.
- class venice_ai.types.image.SimpleImageResponse(**data: Any)[source]
Represents the response structure from the /images/generations (OpenAI-compatible) endpoint.
This model provides an OpenAI-compatible response format for image generation requests, containing a list of generated images and creation timestamp. It maintains compatibility with existing OpenAI client libraries and workflows.
- Parameters:
created (int) – Unix timestamp (seconds since epoch) indicating when the image generation was initiated.
images (List[ImageDataItem]) – List of image data items. The API provides this under the ‘data’ key.
data (typing.Any)
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
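Each ImageDataItem holds either b64_json or url, never both, and `created` is a Unix timestamp in seconds. The sketch below shows one way to handle both facts. `image_payload` and `created_at` are hypothetical helpers that operate on the field values, not on the library's model classes.

```python
import base64
from datetime import datetime, timezone
from typing import Optional, Union


def image_payload(b64_json: Optional[str], url: Optional[str]) -> Union[bytes, str]:
    """Return decoded bytes for b64_json items, or the URL string for url items."""
    if (b64_json is None) == (url is None):
        raise ValueError("expected exactly one of b64_json or url")
    if b64_json is not None:
        return base64.b64decode(b64_json)
    return url


def created_at(created: int) -> datetime:
    """Convert the Unix timestamp (seconds) into a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(created, tz=timezone.utc)
```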
- class venice_ai.types.image.TimingInfo(**data: Any)[source]
Represents timing metrics for image generation and processing operations.
This model provides detailed performance information about various stages of image generation requests, enabling monitoring and optimization of processing times across different components of the Venice AI pipeline. All timing values are measured in seconds.
- Parameters:
inferenceDuration (float) – Duration of the actual inference/generation process in seconds.
inferencePreprocessingTime (float) – Time spent on preprocessing operations before inference begins, in seconds.
inferenceQueueTime (float) – Time spent waiting in the inference queue before processing starts, in seconds.
total (float) – Total time taken for the entire request from start to completion, in seconds.
data (typing.Any)
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
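For monitoring, it can help to see how much of `total` the three named stages do not explain. The sketch below assumes that the stages do not overlap and that `total` includes all of them, which this reference does not state. `unaccounted_time` is a hypothetical helper that takes the timing fields as a plain mapping.

```python
from typing import Mapping


def unaccounted_time(timing: Mapping[str, float]) -> float:
    """Seconds of `total` not covered by queue, preprocessing, and inference."""
    measured = (
        timing["inferenceDuration"]
        + timing["inferencePreprocessingTime"]
        + timing["inferenceQueueTime"]
    )
    # ASSUMPTION: stages are sequential; clamp at zero in case they overlap.
    return max(0.0, timing["total"] - measured)
```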
- class venice_ai.types.image.UpscaleImageRequest[source]
Represents parameters for an image upscaling request to the /image/upscale endpoint.
This model defines the structure for requesting image upscaling and enhancement operations. It allows for scaling existing images to higher resolutions while optionally applying AI-powered enhancements to improve quality and detail.
Note: The ‘image’ data is sent base64-encoded within the JSON payload.
- Parameters:
enhance (Literal["true", "false"]) – Optional. Whether to enhance image quality during upscaling ("true" or "false").
enhance_creativity (Optional[float]) – Optional. Creativity level for enhancement (0.0-1.0, where 1.0 is most creative).
enhance_prompt (str) – Optional. Text prompt to guide the enhancement process.
replication (Optional[float]) – Optional. Replication factor for matching the original image (0.0-1.0, where 1.0 matches exactly).
scale (float) – Optional. Scaling factor for upscaling (e.g., 2.0 for 2x upscaling).
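Two details of this request are easy to miss: the image travels base64-encoded inside the JSON body, and `enhance` is the string "true" or "false" rather than a boolean. A minimal sketch of a payload builder that handles both follows. `build_upscale_payload` is a hypothetical helper, and the `image` key name is taken from the note above.

```python
import base64
from typing import Any, Dict, Optional


def build_upscale_payload(
    image_bytes: bytes,
    *,
    scale: Optional[float] = None,
    enhance: Optional[bool] = None,
) -> Dict[str, Any]:
    """Build a JSON-serializable body for an upscale request."""
    payload: Dict[str, Any] = {"image": base64.b64encode(image_bytes).decode("ascii")}
    if scale is not None:
        payload["scale"] = scale
    if enhance is not None:
        # The documented type is a string literal, not a JSON boolean.
        payload["enhance"] = "true" if enhance else "false"
    return payload
```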
Type definitions for Venice AI API Keys functionality.
- class venice_ai.types.api_keys.ApiKey[source]
Represents a complete API key object in the Venice AI system.
This type defines the structure of an API key as returned by the Venice AI API key management endpoints. Contains all metadata, configuration, and usage information associated with an API key, including its type, limits, creation details, and current usage statistics.
Retrieved from /api_keys endpoint.
- apiKeyType: typing.Literal['INFERENCE', 'ADMIN']
Type of the API key, determining its access permissions and capabilities.
- consumptionLimits: venice_ai.types.api_keys.ConsumptionLimit
Consumption limits and spending constraints associated with this API key.
- createdAt: typing.Optional[str]
ISO 8601 timestamp indicating when the API key was created.
- description: str
Human-readable description or name assigned to the API key.
- expiresAt: typing.Optional[str]
ISO 8601 timestamp when the API key expires, or None if it never expires.
- id: str
Unique identifier for the API key used in management operations.
- last6Chars: str
Last 6 characters of the actual API key value for identification purposes.
- lastUsedAt: typing.Optional[str]
ISO 8601 timestamp of the most recent API request made with this key.
- usage: venice_ai.types.api_keys.ApiKeyUsage
Current usage statistics and consumption metrics for this API key.
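The timestamp fields on ApiKey are ISO 8601 strings, so checking expiry means parsing them first. A minimal sketch, assuming the strings are either offset-qualified or end in "Z", and treating a naive timestamp as UTC (an assumption, not documented behaviour). `parse_iso8601` and `is_expired` are hypothetical helpers.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_iso8601(value: str) -> datetime:
    """Parse an ISO 8601 string into a timezone-aware datetime."""
    # datetime.fromisoformat() only accepts a trailing "Z" on Python 3.11+.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # ASSUMPTION: naive means UTC
    return parsed


def is_expired(expires_at: Optional[str], now: Optional[datetime] = None) -> bool:
    """A key with no expiresAt value never expires."""
    if not expires_at:
        return False
    current = now or datetime.now(timezone.utc)
    return parse_iso8601(expires_at) <= current
```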
- class venice_ai.types.api_keys.ApiKeyCreateRequest[source]
Request payload for creating a new API key.
This type defines the structure of the request body used when creating a new API key through the Venice AI API. Includes all configurable parameters such as key type, consumption limits, expiration settings, and optional Web3 integration parameters.
Used with POST /api_keys endpoint.
- apiKeyType: typing.Literal['INFERENCE', 'ADMIN']
Type of API key to create, determining access permissions and capabilities.
- consumptionLimit: venice_ai.types.api_keys.ConsumptionLimit
Spending and usage limits to apply to the new API key.
- description: str
Human-readable description or name for the new API key.
- expiresAt: typing.Optional[str]
Optional expiration date in ISO 8601 format, or empty string for no expiration.
- web3_address: typing.Optional[str]
Optional Web3 wallet address for blockchain-authenticated API keys.
- web3_network_id: typing.Optional[str]
Optional Web3 network identifier for blockchain-authenticated API keys.
- class venice_ai.types.api_keys.ApiKeyCreateResponse[source]
Response payload returned after successful API key creation.
This type represents the response structure returned by the Venice AI API when a new API key is successfully created. Contains the complete details of the newly created key, including the secret key value which is only shown once during creation for security purposes.
Contains the newly created API key details.
- data: typing.Dict[str, typing.Union[str, venice_ai.types.api_keys.ConsumptionLimit, None]]
Dictionary containing the created API key details, including the secret key value (shown only once).
- success: bool
Boolean flag indicating whether the API key creation operation was successful.
- class venice_ai.types.api_keys.ApiKeyGenerateWeb3KeyCreateRequest[source]
Request payload for creating a new Web3-authenticated API key.
This type defines the structure of the request body used when creating a Web3 API key through wallet signature verification. Includes all standard API key parameters plus Web3-specific fields for address verification, signature proof, and the authentication token.
Used with POST /api_keys/generate_web3_key endpoint.
- address: str
Web3 wallet address used for blockchain-based authentication.
- apiKeyType: typing.Literal['INFERENCE', 'ADMIN']
Type of API key to create, determining access permissions and capabilities.
- consumptionLimit: venice_ai.types.api_keys.ConsumptionLimit
Spending and usage limits to apply to the new Web3 API key.
- description: str
Human-readable description or name for the new Web3 API key.
- expiresAt: typing.Optional[str]
Optional expiration date in ISO 8601 format, or None for no expiration.
- signature: str
Cryptographic signature proving ownership of the specified wallet address.
- token: str
Authentication token obtained from the preliminary GET request.
- class venice_ai.types.api_keys.ApiKeyGenerateWeb3KeyCreateResponse[source]
Response payload returned after successful Web3 API key creation.
This type represents the response structure returned by the Venice AI API when a new Web3 API key is successfully created through wallet signature verification. Contains the complete details of the newly created key, including the secret key value which is only shown once during creation.
Response format for Web3 API key creation.
Contains the newly created API key details.
- data: typing.Dict[str, typing.Union[str, venice_ai.types.api_keys.ConsumptionLimit, None]]
Dictionary containing the created Web3 API key details, including the secret key value (shown only once).
- success: bool
Boolean flag indicating whether the Web3 API key creation operation was successful.
- class venice_ai.types.api_keys.ApiKeyGenerateWeb3KeyGetResponse[source]
Response payload for Web3 key generation token retrieval.
This type represents the response structure returned when requesting a token for Web3 API key generation. The token is required as part of the Web3 key creation process to ensure secure authentication through wallet signature verification.
Response from the GET /api_keys/generate_web3_key endpoint.
Contains token needed for Web3 key generation.
- data: typing.Dict[str, str]
Dictionary containing the authentication token required for subsequent Web3 key creation.
- success: bool
Boolean flag indicating whether the token retrieval operation was successful.
- class venice_ai.types.api_keys.ApiKeyList[source]
Response payload containing a collection of API key objects.
This type represents the response structure returned by the Venice AI API when retrieving a list of API keys. Provides a standardized container for multiple API key objects with metadata indicating the response type.
Retrieved from GET /api_keys endpoint.
- data: typing.List[venice_ai.types.api_keys.ApiKey]
Array of API key objects containing metadata and configuration details.
- object: typing.Literal['list']
Response type identifier, always “list” for collection responses.
- class venice_ai.types.api_keys.ApiKeyRateLimitItem[source]
Represents rate limit configuration for a specific API model.
This type defines the rate limiting rules applied to a particular model when accessed through an API key. Contains the model identifier and associated rate limit policies that govern request frequency and volume.
- apiModelId: str
Unique identifier of the API model to which these rate limits apply.
- rateLimits: typing.List[typing.Dict[str, typing.Union[float, str]]]
Array of rate limiting rules and policies governing usage of this model.
- class venice_ai.types.api_keys.ApiKeyUsage[source]
Represents usage statistics and metrics for an API key.
This type encapsulates usage information for an API key, providing insights into consumption patterns over specific time periods. Used to track and monitor API key activity for billing and rate limiting purposes.
- trailingSevenDays: typing.Dict[str, str]
Usage statistics for the trailing 7-day period, containing ‘usd’ and ‘vcu’ consumption values.
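Because trailingSevenDays maps currency names to strings, the values need converting before any arithmetic. Using `Decimal` avoids binary floating-point rounding on monetary amounts. A minimal sketch, with `parse_trailing_usage` as a hypothetical helper:

```python
from decimal import Decimal
from typing import Dict


def parse_trailing_usage(trailing: Dict[str, str]) -> Dict[str, Decimal]:
    """Convert the string amounts in trailingSevenDays into Decimal values."""
    return {currency: Decimal(amount) for currency, amount in trailing.items()}
```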
- class venice_ai.types.api_keys.ApiTier[source]
Represents API tier information and billing configuration.
This type defines the characteristics of an API tier, including its identifier and billing status. API tiers determine access levels, rate limits, and whether usage is subject to charges.
- id: str
Unique identifier of the API tier level.
- isCharged: bool
Boolean flag indicating whether usage under this tier incurs billing charges.
- class venice_ai.types.api_keys.Balances[source]
Represents current account balances in supported currencies.
This type contains the available balances for an account across different currency types supported by the Venice AI platform, including traditional USD and Venice Compute Units (VCU).
- USD: float
Current account balance in US Dollars.
- VCU: float
Current account balance in Venice Compute Units.
- class venice_ai.types.api_keys.ConsumptionLimit[source]
Defines consumption limits for API keys within each billing epoch.
This type represents the spending and usage constraints that can be applied to API keys to control resource consumption. Limits can be specified in both USD currency and Venice Compute Units (VCU).
- usd: typing.Optional[float]
Optional spending limit in US Dollars for the billing period.
- vcu: typing.Optional[float]
Optional usage limit in Venice Compute Units for the billing period.
- class venice_ai.types.api_keys.RateLimitInfo[source]
Comprehensive rate limit and access information for an API key.
This type represents the complete rate limiting context for an API key, including current access permissions, tier information, account balances, key expiration details, and specific rate limit configurations. Used to determine whether requests can be processed and what limits apply.
Retrieved from /api_keys/rate_limits endpoint.
- accessPermitted: bool
Boolean flag indicating whether API access is currently permitted based on rate limits and account status.
- apiTier: venice_ai.types.api_keys.ApiTier
API tier configuration and billing information associated with this key.
- balances: venice_ai.types.api_keys.Balances
Current account balances across supported currency types.
- keyExpiration: typing.Optional[str]
ISO 8601 timestamp when the API key expires, or None if it never expires.
- nextEpochBegins: str
ISO 8601 timestamp indicating when the next rate limiting epoch period begins.
- rateLimits: typing.List[venice_ai.types.api_keys.ApiKeyRateLimitItem]
Array of model-specific rate limiting rules and configurations applied to this key.
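A common use of RateLimitInfo is to look up the rules that apply to one model before sending traffic to it. The sketch below treats the response as a plain dictionary. `limits_for_model` is a hypothetical helper, and the keys inside each rule are left opaque because this reference does not list them.

```python
from typing import Any, Dict, List


def limits_for_model(info: Dict[str, Any], model_id: str) -> List[Dict[str, Any]]:
    """Return the rate limit rules for one model, or an empty list if none match."""
    if not info["accessPermitted"]:
        raise PermissionError("API access is not currently permitted for this key")
    for item in info["rateLimits"]:
        if item["apiModelId"] == model_id:
            return item["rateLimits"]
    return []
```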
- class venice_ai.types.api_keys.RateLimitLog[source]
Represents a single rate limit event log entry.
This type defines the structure of individual rate limit log entries that track rate limiting events for API keys. Contains details about the key, model, tier, event type, and timing information for auditing and monitoring rate limit enforcement.
Retrieved from /api_keys/rate_limits/log endpoint.
- apiKeyId: str
Unique identifier of the API key that triggered this rate limit event.
- modelId: str
Identifier of the API model involved in the rate limiting event.
- rateLimitTier: str
Rate limit tier that was active when this event occurred.
- rateLimitType: str
Type of rate limit event (e.g., “exceeded”, “reset”, “warning”).
- timestamp: str
ISO 8601 timestamp when this rate limit event was recorded.
- class venice_ai.types.api_keys.RateLimitLogList[source]
Response payload containing a collection of rate limit log entries.
This type represents the response structure returned by the Venice AI API when retrieving rate limit logs. Provides a standardized container for multiple rate limit log entries with metadata indicating the response type.
Retrieved from GET /api_keys/rate_limits/log endpoint.
- data: typing.List[venice_ai.types.api_keys.RateLimitLog]
Array of rate limit log entries containing event details and timestamps.
- object: typing.Literal['list']
Response type identifier, always “list” for collection responses.
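For auditing, the log entries can be grouped to see which models hit their limits most often. A minimal sketch over the dictionary shape described above, with `count_events_by_model` as a hypothetical helper:

```python
from collections import Counter
from typing import Any, Dict


def count_events_by_model(log_list: Dict[str, Any]) -> Counter:
    """Tally rate limit log entries per modelId."""
    return Counter(entry["modelId"] for entry in log_list["data"])
```

`Counter.most_common()` then gives the models ordered by how many events they logged.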
Type definitions for Venice AI Audio API.
This module contains TypedDict definitions and Enums for request objects in the Venice AI Audio API, covering the speech creation endpoint.
- class venice_ai.types.audio.CreateSpeechRequest[source]
Request parameters for creating speech audio from text input.
This TypedDict defines the structure for requests to the POST /audio/speech endpoint, which converts text into spoken audio using specified voice characteristics and output format. The request allows customization of voice selection, audio format, playback speed, and user identification for tracking purposes.
- model
ID of the model to use for speech generation (e.g., “tts-kokoro”).
- input
The text to convert to speech. Maximum length varies by model.
- voice
The voice to use for the generated audio. See Voice for available options.
- response_format
Optional. The format to return the audio in. Defaults to “mp3”. See ResponseFormat for available formats.
- speed
Optional. The speed of the generated audio. Select a value from 0.25 to 4.0. Defaults to 1.0.
- user
Optional. A unique identifier representing the end-user, which can help Venice AI to monitor and detect abuse.
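Since CreateSpeechRequest is a TypedDict, a request is an ordinary dictionary and the documented speed range (0.25 to 4.0) can be checked before it is sent. A minimal sketch follows. `build_speech_request` is a hypothetical helper, and the voice is passed as a plain string rather than a Voice enum member.

```python
from typing import Any, Dict, Optional


def build_speech_request(
    model: str,
    text: str,
    voice: str,
    *,
    response_format: Optional[str] = None,
    speed: Optional[float] = None,
    user: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble a speech request, leaving unset optional fields to their defaults."""
    if speed is not None and not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    request: Dict[str, Any] = {"model": model, "input": text, "voice": voice}
    optional = {"response_format": response_format, "speed": speed, "user": user}
    request.update({key: value for key, value in optional.items() if value is not None})
    return request
```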
- class venice_ai.types.audio.ResponseFormat(*values)[source]
Available audio response formats for speech generation output.
This enumeration defines the supported audio file formats that can be requested when generating speech from text. The format determines the encoding, compression, and quality characteristics of the returned audio data from the text-to-speech endpoint. Different formats offer trade-offs between file size, quality, and compatibility.
- class venice_ai.types.audio.Voice(*values)[source]
Available voices for speech generation in the Venice AI Audio API.
This enumeration defines the complete set of voice options that can be used when generating speech from text via the text-to-speech endpoint. Each voice represents different speaker characteristics including gender, accent, and vocal qualities. Voice names follow a pattern indicating language/region and gender (e.g., af for American Female, am for American Male).
- class venice_ai.types.audio.VoiceDetail[source]
Detailed information about a single text-to-speech voice.
This TypedDict represents the structure of voice information returned by the get_voices() method. It contains metadata about a voice including its unique identifier, associated model, gender characteristics, and regional/language information derived from the voice ID.
- id
The unique identifier for the voice as provided by the API (e.g., “af_alloy”, “zm_yunjian”).
- model_id
The ID of the TTS model this voice is associated with (e.g., “tts-kokoro”).
- gender
The perceived gender of the voice, parsed from the voice ID prefix. “unknown” if the prefix is not recognized or ambiguous.
- region_code
The raw two-letter prefix from the voice ID that typically indicates region/language and gender (e.g., “af”, “zm”).
- language
A descriptive name of the primary language associated with the voice, derived from the region_code (e.g., “American English”, “Mandarin Chinese”).
- accent
A descriptive name of the accent or locale associated with the voice, derived from the region_code (e.g., “US”, “Standard Chinese”).
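The derivation of region_code and gender from a voice ID can be illustrated with a short parser. This is a sketch of the idea only, not the library's implementation. `parse_voice_id` is a hypothetical name, the "female"/"male" strings are assumed values, and only the f/m gender letters implied by the examples in this reference are recognized.

```python
from typing import Dict, Optional

# ASSUMPTION: the second prefix letter encodes gender, as in "af" and "am".
GENDER_BY_LETTER = {"f": "female", "m": "male"}


def parse_voice_id(voice_id: str) -> Dict[str, Optional[str]]:
    """Split a voice ID such as "af_alloy" into its prefix-derived parts."""
    prefix, separator, _name = voice_id.partition("_")
    if not separator or len(prefix) != 2:
        # No recognizable two-letter prefix: report gender as "unknown".
        return {"id": voice_id, "region_code": None, "gender": "unknown"}
    return {
        "id": voice_id,
        "region_code": prefix,
        "gender": GENDER_BY_LETTER.get(prefix[1], "unknown"),
    }
```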
- class venice_ai.types.audio.VoiceList[source]
A list of voice details with optional filtering metadata.
This TypedDict represents the structure returned by the get_voices() method, containing a list of VoiceDetail objects along with metadata about any filters that were applied to generate the list. This follows the standard API pattern for list responses.
- object
A string indicating the type of API object, always “list” for lists.
- data
A list containing VoiceDetail objects.
- model_id_filter
The model_id that was used to filter the voices, if any. None if no model ID filter was applied.
- gender_filter
The gender that was used to filter the voices, if any. None if no gender filter was applied.
- region_code_filter
The region_code (e.g., “af”, “zm”) that was used to filter the voices, if any. None if no region code filter was applied.
Type definitions for Venice AI Embeddings API.
This module contains TypedDict definitions for request and response objects in the Venice AI Embeddings API, covering the embeddings creation endpoint.
- class venice_ai.types.embeddings.CreateEmbeddingRequest[source]
Request parameters for creating embeddings from text or token inputs.
This TypedDict defines the structure for requests to the POST /embeddings endpoint, which generates vector embeddings from input text or tokens using specified models. The embeddings can be used for semantic search, clustering, and similarity tasks.
- model
ID of the embedding model to use.
- input
Text or tokens to embed. Can be a string, list of strings, list of tokens, or list of token lists.
- dimensions
Optional. Number of dimensions for the output embeddings.
- encoding_format
Optional. Format for returned embeddings ("float" or "base64"). Defaults to "float".
- user
Optional. Unique identifier for the end-user.
- dimensions: typing.Optional[int]
The number of dimensions the resulting output embedding should have. Only supported in text-embedding-3 and later models. Optional parameter.
- encoding_format: typing.Literal['float', 'base64']
The format to return the embeddings in. Can be "float" (default) or "base64". Optional parameter.
- input: typing.Union[str, typing.List[str], typing.List[int], typing.List[typing.List[int]]]
Input text or tokens to embed.
- model: str
ID of the model to use.
- user: typing.Optional[str]
A unique identifier representing your end-user, which can help Venice AI monitor and detect abuse. Optional parameter.
- class venice_ai.types.embeddings.Embedding[source]
Represents a single embedding vector with its metadata.
This TypedDict defines an individual embedding result containing the vector representation of input text along with its position index and object type. Each embedding is part of the response from the /embeddings endpoint and contains the actual numerical vector that can be used for similarity calculations.
- embedding
Embedding vector as a list of floats or base64-encoded string.
- index
Index of this embedding in the list.
- object
Type of the object, always “embedding”.
- embedding: typing.Union[typing.List[float], str]
The embedding vector, which is a list of floats or a base64-encoded string (when encoding_format="base64"). The length of the vector depends on the model used.
- index: int
The index of the embedding in the list of embeddings.
- object: typing.Literal['embedding']
The type of the object, which is always "embedding".
- class venice_ai.types.embeddings.EmbeddingList[source]
Complete response from the embeddings creation endpoint.
This TypedDict represents the full response structure returned by the POST /embeddings endpoint, containing a list of embedding vectors, model information, and usage statistics. This is the primary response format for all embedding generation requests.
- data
List of embedding objects.
- model
Model used to generate the embeddings.
- object
Type of the object, always “list”.
- usage
Token usage statistics for the request.
-
data:
typing.List[venice_ai.types.embeddings.Embedding] The list of embeddings generated by the model.
-
model:
str The model ID used to generate the embeddings.
-
object:
typing.Literal['list'] The object type, which is always "list".
-
usage:
venice_ai.types.embeddings.EmbeddingUsage The usage statistics for the request.
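Because each item's embedding field may be either a list of floats or a base64 string, consuming code often normalizes it before doing any similarity math. The following is a minimal sketch, not the library's own decoder: it assumes base64 payloads are packed little-endian float32 values (a common convention that this reference does not confirm), and to_floats and cosine_similarity are hypothetical helper names.

```python
import base64
import math
import struct
from typing import List, Union


def to_floats(embedding: Union[List[float], str]) -> List[float]:
    """Normalize an ``embedding`` field to a list of floats.

    ASSUMPTION: base64 payloads hold little-endian float32 values.
    """
    if isinstance(embedding, str):
        raw = base64.b64decode(embedding)
        count = len(raw) // 4  # 4 bytes per float32
        return list(struct.unpack(f"<{count}f", raw))
    return list(embedding)


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up sample values, not real model output.
encoded = base64.b64encode(struct.pack("<3f", 1.0, 0.0, 0.5)).decode("ascii")
item = {"object": "embedding", "index": 0, "embedding": encoded}
vector = to_floats(item["embedding"])
```

If the byte layout differs from the assumption above, only the struct format string in to_floats needs to change.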
- class venice_ai.types.embeddings.EmbeddingUsage[source]
Token usage statistics for an embedding request.
This TypedDict provides information about token consumption during the embedding generation process, including both prompt tokens and total tokens used. This information is useful for tracking API usage and costs.
- prompt_tokens
Number of tokens in the input.
- total_tokens
Total number of tokens used in the request.
-
prompt_tokens:
int The number of tokens used by the input prompt.
-
total_tokens:
int The total number of tokens used by the request.
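Taken together, the three embedding types describe a response that can be handled as a plain dictionary. The sketch below sorts the items by index and reads the token counts. The sample values and the model ID are made up, and treating index as the position of the matching input is an assumption rather than something this reference states outright.

```python
from typing import Any, Dict, List


def vectors_by_index(response: Dict[str, Any]) -> List[Any]:
    """Return the ``embedding`` payloads sorted by their ``index`` field."""
    ordered = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in ordered]


# Made-up response shaped like EmbeddingList (values are illustrative only).
sample_response = {
    "object": "list",
    "model": "example-embedding-model",  # hypothetical model ID
    "data": [
        {"object": "embedding", "index": 1, "embedding": [0.0, 1.0]},
        {"object": "embedding", "index": 0, "embedding": [1.0, 0.0]},
    ],
    "usage": {"prompt_tokens": 8, "total_tokens": 8},
}

vectors = vectors_by_index(sample_response)
tokens_used = sample_response["usage"]["total_tokens"]
```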
Type definitions for Venice AI Billing API.
This module contains TypedDict definitions for request parameters and response objects in the Venice AI Billing API, covering the billing usage endpoint.
- class venice_ai.types.billing.BillingFormatEnum(*values)[source]
Defines available output formats for billing usage data responses.
This enumeration specifies the supported data formats that can be requested when retrieving billing usage information from the Venice AI API. Different formats may be suitable for different use cases, such as programmatic processing or data export.
Used to specify the desired response format in billing usage API requests.
- CSV = 'csv'
CSV format - returns raw CSV data as bytes for export purposes.
- JSON = 'json'
JSON format - returns structured data as BillingUsageResponse.
- class venice_ai.types.billing.BillingUsageEntry[source]
Represents a single billing usage record from the Venice AI API.
This model defines the structure of individual usage entries returned by the billing usage endpoint. Each entry represents a billable event with associated costs, units consumed, and metadata about the API usage.
Used as the primary data structure for tracking and reporting API usage costs across different services and time periods.
-
amount:
float Total amount charged for this usage entry.
-
currency:
typing.Literal['USD','VCU'] Currency denomination for the charge (either USD or Venice Compute Units).
-
inferenceDetails:
typing.Optional[venice_ai.types.billing.InferenceDetails] Detailed inference metadata, present only for LLM-related usage entries.
-
notes:
str Additional notes or description associated with this billing entry.
-
pricePerUnitUsd:
float Price per unit in USD for this specific usage type.
-
sku:
str Stock Keeping Unit (SKU) identifier for the product or service used.
-
timestamp:
str ISO 8601 formatted timestamp indicating when this usage occurred.
-
units:
float Quantity of units consumed for this billing entry.
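A typical use of these fields is to total charges per currency. The sketch below works on plain dictionaries shaped like BillingUsageEntry; the helper name total_by_currency and every sample value (SKUs, amounts, timestamps) are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable


def total_by_currency(entries: Iterable[Dict[str, Any]]) -> Dict[str, float]:
    """Sum the ``amount`` field of each entry, grouped by ``currency``."""
    totals: Dict[str, float] = defaultdict(float)
    for entry in entries:
        totals[entry["currency"]] += entry["amount"]
    return dict(totals)


# Made-up entries shaped like BillingUsageEntry.
sample_entries = [
    {"amount": 0.25, "currency": "USD", "sku": "example-sku-a", "units": 1.0,
     "pricePerUnitUsd": 0.25, "notes": "", "inferenceDetails": None,
     "timestamp": "2025-01-01T00:00:00Z"},
    {"amount": 0.5, "currency": "USD", "sku": "example-sku-b", "units": 2.0,
     "pricePerUnitUsd": 0.25, "notes": "", "inferenceDetails": None,
     "timestamp": "2025-01-02T00:00:00Z"},
    {"amount": 1.5, "currency": "VCU", "sku": "example-sku-a", "units": 6.0,
     "pricePerUnitUsd": 0.25, "notes": "", "inferenceDetails": None,
     "timestamp": "2025-01-03T00:00:00Z"},
]

totals = total_by_currency(sample_entries)
```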
- class venice_ai.types.billing.BillingUsagePagination[source]
Represents pagination metadata for billing usage API responses.
This model contains information about the pagination state of billing usage queries, including current page position, total available records, and pagination limits. Used in conjunction with billing usage responses to enable efficient navigation through large datasets.
Essential for handling paginated billing data retrieval from the Venice AI API.
-
limit:
float Maximum number of items returned per page in the current request.
-
page:
float Current page number in the paginated result set (1-based).
-
total:
float Total number of billing usage entries available across all pages.
-
totalPages:
float Total number of pages available for the current query parameters.
- class venice_ai.types.billing.BillingUsageRequestParams[source]
Represents query parameters for filtering and paginating billing usage data.
This model defines the optional parameters that can be used to customize billing usage queries to the Venice AI API. Supports filtering by date range, currency type, and pagination controls to retrieve specific subsets of billing data.
All parameters are optional and will use API defaults when not specified. Used to construct targeted billing usage requests based on specific criteria.
-
currency:
typing.Optional[typing.Literal['USD','VCU']] Filter results by currency type (USD for US Dollars, VCU for Venice Compute Units).
-
endDate:
typing.Optional[str] End date for the billing period filter in ISO 8601 format (e.g., “2025-05-01T00:00:00Z”).
-
limit:
typing.Optional[int] Maximum number of items to return per page (valid range: 1-500, default: 200).
-
page:
typing.Optional[int] Page number for pagination, starting from 1 (default: 1).
-
sortOrder:
typing.Optional[typing.Literal['asc','desc']] Sort order for results by timestamp (ascending or descending, default: ‘desc’).
-
startDate:
typing.Optional[str] Start date for the billing period filter in ISO 8601 format (e.g., “2025-01-01T00:00:00Z”).
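The parameters above can be assembled into a dictionary with the documented key names. This is a minimal sketch, assuming timezone-aware datetimes as input; build_usage_params and iso_utc are hypothetical helpers, and the range check only mirrors the 1-500 limit stated above (the API may enforce further rules).

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def iso_utc(moment: datetime) -> str:
    """Format an aware datetime as ISO 8601 UTC, e.g. 2025-01-01T00:00:00Z."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_usage_params(
    start: datetime,
    end: datetime,
    *,
    currency: Optional[str] = None,
    limit: int = 200,
    page: int = 1,
    sort_order: str = "desc",
) -> Dict[str, Any]:
    """Build a dict shaped like BillingUsageRequestParams."""
    if not 1 <= limit <= 500:
        raise ValueError("limit must be between 1 and 500")
    if page < 1:
        raise ValueError("page numbering starts at 1")
    params: Dict[str, Any] = {
        "startDate": iso_utc(start),
        "endDate": iso_utc(end),
        "limit": limit,
        "page": page,
        "sortOrder": sort_order,
    }
    if currency is not None:  # omit the filter entirely when not requested
        params["currency"] = currency
    return params


params = build_usage_params(
    datetime(2025, 1, 1, tzinfo=timezone.utc),
    datetime(2025, 5, 1, tzinfo=timezone.utc),
    currency="USD",
)
```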
- class venice_ai.types.billing.BillingUsageResponse[source]
Represents the complete response structure from the billing usage endpoint.
This model serves as the top-level container for billing usage data returned by the Venice AI API. It combines the actual usage records with pagination metadata, providing a comprehensive view of billing information for a given query.
Used as the primary response type for all billing usage API calls.
-
data:
typing.List[venice_ai.types.billing.BillingUsageEntry] Array of billing usage records for the requested time period and filters.
-
pagination:
venice_ai.types.billing.BillingUsagePagination Pagination metadata including current page, total items, and page limits.
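Since the response pairs data with pagination metadata, a caller can walk every page until totalPages is reached. The sketch below is one way to do that. fetch_page is a hypothetical callable standing in for whatever client call returns a BillingUsageResponse-shaped dictionary for a given page number, and fake_fetch exists only to exercise the loop.

```python
from typing import Any, Callable, Dict, Iterator


def iter_usage_entries(
    fetch_page: Callable[[int], Dict[str, Any]],
) -> Iterator[Dict[str, Any]]:
    """Yield every entry across all pages, starting from page 1."""
    page = 1
    while True:
        response = fetch_page(page)
        yield from response["data"]
        # Pagination fields are documented as floats, so convert before comparing.
        total_pages = int(response["pagination"]["totalPages"])
        if page >= total_pages:
            break
        page += 1


def fake_fetch(page: int) -> Dict[str, Any]:
    """Made-up two-page data source used only for this illustration."""
    pages = {1: ["sku-a", "sku-b"], 2: ["sku-c"]}
    return {
        "data": [{"sku": sku} for sku in pages[page]],
        "pagination": {"page": float(page), "limit": 2.0,
                       "total": 3.0, "totalPages": 2.0},
    }


all_skus = [entry["sku"] for entry in iter_usage_entries(fake_fetch)]
```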
- class venice_ai.types.billing.InferenceDetails[source]
Represents detailed information about an inference request for billing purposes.
This model contains metadata about LLM inference requests, including token counts and execution metrics. Used within billing usage entries to provide granular details about API usage costs and performance.
Note
These details are only present for LLM usage entries and may be absent for other types of API usage.
-
completionTokens:
typing.Optional[float] Number of tokens generated in the completion response (present only for LLM inference requests).
-
inferenceExecutionTime:
typing.Optional[float] Total execution time for the inference request in milliseconds.
-
promptTokens:
typing.Optional[float] Number of tokens in the input prompt (present only for LLM inference requests).
-
requestId:
typing.Optional[str] Unique identifier for the specific inference request.
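Every field here is optional, so derived metrics need to tolerate missing values. As an illustration, the sketch below estimates completion throughput from completionTokens and inferenceExecutionTime (documented as milliseconds). The helper name is hypothetical and the metric is not part of the API.

```python
from typing import Any, Dict, Optional


def completion_tokens_per_second(
    details: Optional[Dict[str, Any]],
) -> Optional[float]:
    """Return completion tokens per second, or None if data is missing."""
    if not details:
        return None
    tokens = details.get("completionTokens")
    millis = details.get("inferenceExecutionTime")
    if tokens is None or not millis:  # also guards against a zero duration
        return None
    return tokens / (millis / 1000.0)


# Made-up details shaped like InferenceDetails.
rate = completion_tokens_per_second(
    {"completionTokens": 120.0, "inferenceExecutionTime": 2000.0,
     "promptTokens": 40.0, "requestId": "example-request-id"}
)
```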
- class venice_ai.types.characters.Character(**data: Any)[source]
Represents an AI character definition in the Venice AI system.
This model defines a complete AI character with all its attributes including metadata, configuration settings, and behavioral parameters. Characters are used in chat completions and other AI interactions to provide specific personalities, knowledge bases, and response styles.
- Parameters:
data (typing.Any)
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- class venice_ai.types.characters.CharacterList(**data: Any)[source]
Represents a paginated collection of AI characters.
This model serves as a container for multiple character objects, typically returned by character listing and search API endpoints. It follows the standard API response format with a data array containing character objects and an object type identifier for response validation and parsing.
- Parameters:
data (typing.Any)
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
Exceptions¶
- exception venice_ai.exceptions.APIConnectionError(message: str = 'Connection error', *, original_error: Exception | None = None, request: Any | None = None, response: httpx.Response | None = None)[source]
Raised when there’s an issue connecting to the Venice AI API.
This exception is raised when network-level connectivity issues prevent the client from establishing a connection to the API server. This could be due to:
Network connectivity problems
DNS resolution failures
Connection timeouts during establishment
SSL/TLS handshake failures
Proxy configuration issues
- Parameters:
message (str) – Human-readable description of the error. Defaults to “Connection error”.
original_error (Optional[Exception]) – Optional. The original exception that caused this error.
request (Optional[Any]) – Optional. The httpx.Request object associated with the error.
response (Optional[httpx.Response]) – Optional. The httpx.Response object if available.
- Variables:
original_error (Optional[Exception]) – Original exception that caused this error.
request (Optional[Any]) – httpx.Request object associated with the error, if available.
- exception venice_ai.exceptions.APIError(message: str, *, request: httpx.Request | None = None, response: httpx.Response, body: Any | None = None)[source]
Raised when the API returns a non-2xx status code.
This is a general exception for API-related errors, including server errors (5xx status codes) and unhandled client errors (4xx status codes). More specific exception subclasses are available for common HTTP status codes.
- Parameters:
message (str) – Human-readable description of the error.
request (Optional[httpx.Request]) – Optional. The httpx.Request object that led to the error.
response (httpx.Response) – The httpx.Response object from the API call.
body (Optional[Any]) – Optional. The parsed response body, if available.
- Variables:
status_code (int) – HTTP status code of the response.
body (Optional[Any]) – Parsed response body, if available.
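The status codes named in the exception descriptions on this page can be collected into a lookup table. The sketch below is a reading aid built from those descriptions, not the library's dispatch code. It returns class names as strings, and giving 503 to ServiceUnavailableError rather than InternalServerError is an assumption, since both descriptions mention that code.

```python
from typing import Dict

# Status codes as described for each exception on this page.
STATUS_TO_EXCEPTION: Dict[int, str] = {
    400: "InvalidRequestError",
    401: "AuthenticationError",
    402: "PaymentRequiredError",
    403: "PermissionDeniedError",
    404: "NotFoundError",
    409: "ConflictError",
    413: "InvalidRequestError",  # file size exceeds limits
    415: "InvalidRequestError",  # unsupported content type
    422: "UnprocessableEntityError",
    429: "RateLimitError",
    503: "ServiceUnavailableError",  # ASSUMPTION: takes precedence over 5xx
}


def exception_name_for(status_code: int) -> str:
    """Return the documented exception name for a non-2xx status code."""
    if status_code in STATUS_TO_EXCEPTION:
        return STATUS_TO_EXCEPTION[status_code]
    if 500 <= status_code <= 599:
        return "InternalServerError"
    return "APIError"  # general fallback for unhandled codes
```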
- exception venice_ai.exceptions.APIResponseProcessingError(message: str, *, original_error: Exception | None = None, response: httpx.Response | None = None)[source]
Raised when there’s an error processing the API response.
This exception is raised when the client successfully receives a response from the Venice AI API but encounters an error while processing the response data. This could be due to:
Unexpected response format or structure
JSON parsing errors
Missing expected fields in the response
Data type conversion failures
Response validation errors
- Parameters:
- Variables:
original_error (Optional[Exception]) – Original exception that caused this error.
- exception venice_ai.exceptions.APITimeoutError(message: str = 'Request timed out', *, original_error: Exception | None = None, request: Any | None = None, response: httpx.Response | None = None)[source]
Raised when an API request times out.
This exception is raised when a request to the Venice AI API takes too long to complete and exceeds the configured timeout limit. This can occur during:
Long-running operations (e.g., image generation, large file processing)
Network latency issues
Server processing delays
Read timeout while waiting for response data
- Parameters:
message (str) – Human-readable description of the error. Defaults to “Request timed out”.
original_error (Optional[Exception]) – Optional. The original exception that caused this error.
request (Optional[Any]) – Optional. The httpx.Request object associated with the error.
response (Optional[httpx.Response]) – Optional. The httpx.Response object if available.
- Variables:
original_error (Optional[Exception]) – Original exception that caused this error.
request (Optional[Any]) – httpx.Request object associated with the error, if available.
- exception venice_ai.exceptions.AuthenticationError(message: str, *, request: httpx.Request | None = None, response: httpx.Response, body: Any | None = None)[source]
Raised for 401 Unauthorized errors, typically due to an invalid API key.
This exception is raised when the API returns a 401 status code, indicating that the request lacks valid authentication credentials. This commonly occurs when:
The API key is missing or invalid
The API key has been revoked or expired
The API key lacks the necessary permissions for the requested operation
- Parameters:
message (str) – Human-readable description of the error.
request (Optional[httpx.Request]) – Optional. The httpx.Request object that led to the error.
response (httpx.Response) – The httpx.Response object from the API call.
body (Optional[Any]) – Optional. The parsed response body, if available.
- exception venice_ai.exceptions.ConflictError(message: str, *, request: httpx.Request | None = None, response: httpx.Response, body: Any | None = None)[source]
Raised for 409 Conflict errors when a resource conflict occurs.
This exception is raised when the API returns a 409 status code, indicating that the request could not be completed due to a conflict with the current state of the resource. This may occur when:
Attempting to create a resource that already exists
Concurrent modifications to the same resource
Business logic constraints prevent the operation
- Parameters:
message (str) – Human-readable description of the error.
request (Optional[httpx.Request]) – Optional. The httpx.Request object that led to the error.
response (httpx.Response) – The httpx.Response object from the API call.
body (Optional[Any]) – Optional. The parsed response body, if available.
- exception venice_ai.exceptions.InternalServerError(message: str, *, request: httpx.Request | None = None, response: httpx.Response, body: Any | None = None)[source]
Raised for 500 Internal Server Error and other 5xx server-side errors.
This exception is raised when the API returns a 5xx status code, indicating that an error occurred on the API server’s end. This includes various server-side failures such as:
Internal server errors (500)
Service unavailable (503)
Gateway timeout (504)
Inference failures
Upscale failures
Unknown server errors
- Parameters:
message (str) – Human-readable description of the error.
request (Optional[httpx.Request]) – Optional. The httpx.Request object that led to the error.
response (httpx.Response) – The httpx.Response object from the API call.
body (Optional[Any]) – Optional. The parsed response body, if available.
- exception venice_ai.exceptions.InvalidRequestError(message: str, *, request: httpx.Request | None = None, response: httpx.Response, body: Any | None = None)[source]
Raised for 400 Bad Request errors due to invalid request parameters.
This exception is raised when the API returns a 400 status code, indicating that the server cannot process the request due to client error. This typically occurs when:
Required fields are missing from the request
Parameter values are invalid or malformed
The request payload format is incorrect
File size exceeds limits (413 status code also maps to this exception)
Unsupported content type (415 status code also maps to this exception)
- Parameters:
message (str) – Human-readable description of the error.
request (Optional[httpx.Request]) – Optional. The httpx.Request object that led to the error.
response (httpx.Response) – The httpx.Response object from the API call.
body (Optional[Any]) – Optional. The parsed response body, if available.
- exception venice_ai.exceptions.MissingStreamClassError(message: str, *, request: httpx.Request | None = None, response: httpx.Response | None = None)[source]
Raised when stream=True but no stream_cls is provided.
This exception is raised when attempting to use streaming functionality but the required stream class parameter is not provided. This typically occurs in chat completions or other streaming operations where the client needs to know how to handle the streamed response data.
- Parameters:
message (str) – Human-readable description of the error.
request (typing.Optional[httpx.Request])
response (typing.Optional[httpx.Response])
- exception venice_ai.exceptions.NotFoundError(message: str, *, request: httpx.Request | None = None, response: httpx.Response, body: Any | None = None)[source]
Raised for 404 Not Found errors when a requested resource is not found.
This exception is raised when the API returns a 404 status code, indicating that the requested resource could not be found. This commonly occurs when:
An incorrect model name is specified
A character slug does not exist
An API endpoint path is invalid
A resource identifier (ID) is not found
- Parameters:
message (str) – Human-readable description of the error.
request (Optional[httpx.Request]) – Optional. The httpx.Request object that led to the error.
response (httpx.Response) – The httpx.Response object from the API call.
body (Optional[Any]) – Optional. The parsed response body, if available.
- exception venice_ai.exceptions.PaymentRequiredError(message: str, *, request: httpx.Request | None = None, response: httpx.Response, body: Any | None = None)[source]
Raised for 402 Payment Required errors when there is insufficient balance.
This exception is raised when the API returns a 402 status code, indicating that the request cannot be processed due to insufficient USD or VCU (Venice Compute Units) balance in the account. The client should add funds or credits before retrying.
- Parameters:
message (str) – Human-readable description of the error.
request (Optional[httpx.Request]) – Optional. The httpx.Request object that led to the error.
response (httpx.Response) – The httpx.Response object from the API call.
body (Optional[Any]) – Optional. The parsed response body, if available.
- exception venice_ai.exceptions.PermissionDeniedError(message: str, *, request: httpx.Request | None = None, response: httpx.Response, body: Any | None = None)[source]
Raised for 403 Forbidden errors when access is denied.
This exception is raised when the API returns a 403 status code, indicating that the client does not have permission to perform the requested action. The request was valid and authenticated, but the server is refusing to authorize it.
- Parameters:
message (str) – Human-readable description of the error.
request (Optional[httpx.Request]) – Optional. The httpx.Request object that led to the error.
response (httpx.Response) – The httpx.Response object from the API call.
body (Optional[Any]) – Optional. The parsed response body, if available.
- exception venice_ai.exceptions.RateLimitError(message: str, *, request: httpx.Request | None = None, response: httpx.Response, body: Any | None = None, retry_after_seconds: int | None = None)[source]
Raised for 429 Too Many Requests errors when rate limits are exceeded.
This exception is raised when the API returns a 429 status code, indicating that the client has sent too many requests in a given time frame and has exceeded the rate limit. The client should wait before making additional requests.
- Parameters:
message (str) – Human-readable description of the error.
request (Optional[httpx.Request]) – Optional. The httpx.Request object that led to the error.
response (httpx.Response) – The httpx.Response object from the API call.
body (Optional[Any]) – Optional. The parsed response body, if available.
retry_after_seconds (Optional[int]) – Optional. The number of seconds to wait before retrying, parsed from the Retry-After header.
- Variables:
retry_after_seconds (Optional[int]) – Number of seconds to wait before retrying, if available.
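A caller can use retry_after_seconds to decide how long to pause before the next attempt. The sketch below shows the idea with a local stand-in class that mirrors only the documented attribute, so the block runs without the library installed. call_with_rate_limit_retry, max_attempts, and fallback_delay are hypothetical names, not part of venice_ai.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Local stand-in mirroring only the documented retry_after_seconds."""

    def __init__(self, message: str,
                 retry_after_seconds: Optional[int] = None) -> None:
        super().__init__(message)
        self.retry_after_seconds = retry_after_seconds


def call_with_rate_limit_retry(
    func: Callable[[], T],
    *,
    max_attempts: int = 3,
    fallback_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call ``func``, waiting as advised whenever RateLimitError is raised."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except RateLimitError as exc:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            advised = exc.retry_after_seconds
            # Fall back to a fixed delay when no Retry-After value was parsed.
            sleep(fallback_delay if advised is None else advised)
    raise AssertionError("unreachable")
```

With the real library, the stand-in class would be replaced by importing venice_ai.exceptions.RateLimitError; the retry loop itself stays the same.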
- exception venice_ai.exceptions.ServiceUnavailableError(message: str, *, request: httpx.Request | None = None, response: httpx.Response, body: Any | None = None)[source]
Raised for 503 Service Unavailable errors when the service is temporarily unavailable.
This exception is raised when the API returns a 503 status code, indicating that the service is temporarily unavailable. This commonly occurs when:
The requested model is at capacity
The service is undergoing maintenance
Temporary server overload
Clients should implement retry logic with exponential backoff when encountering this error.
- Parameters:
message (str) – Human-readable description of the error.
request (Optional[httpx.Request]) – Optional. The httpx.Request object that led to the error.
response (httpx.Response) – The httpx.Response object from the API call.
body (Optional[Any]) – Optional. The parsed response body, if available.
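The exponential backoff recommended above can be reduced to a small delay schedule. The sketch below computes capped exponential delays and, optionally, applies "full jitter" by picking a random wait between zero and the computed delay. The function names and default values are illustrative choices, not settings taken from the library.

```python
import random
from typing import Optional


def backoff_delay(attempt: int, *, base: float = 0.5,
                  factor: float = 2.0, cap: float = 30.0) -> float:
    """Delay in seconds before retry number ``attempt`` (0-based), capped."""
    return min(cap, base * (factor ** attempt))


def with_full_jitter(delay: float,
                     rng: Optional[random.Random] = None) -> float:
    """Pick a random wait in [0, delay] so clients do not retry in lockstep."""
    source = rng if rng is not None else random
    return source.uniform(0.0, delay)


# Delays for the first four retries with the default settings.
schedule = [backoff_delay(attempt) for attempt in range(4)]
```

A retry loop would sleep for with_full_jitter(backoff_delay(attempt)) after each ServiceUnavailableError and give up after a fixed number of attempts.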
- exception venice_ai.exceptions.StreamClosedError(message: str = 'Connection error', *, original_error: Exception | None = None, request: Any | None = None, response: httpx.Response | None = None)[source]
Raised when an attempt is made to operate on a stream whose underlying connection has been closed.
This exception is raised when trying to iterate over a stream whose underlying httpx.Response object has been closed. Once a stream’s underlying connection is closed, it cannot be iterated.
- Parameters:
message (str) – Human-readable description of the error.
request (Optional[httpx.Request]) – Optional. The httpx.Request object associated with the error.
response (Optional[httpx.Response]) – Optional. The httpx.Response object if available.
original_error (typing.Optional[Exception])
- exception venice_ai.exceptions.StreamConsumedError(message: str = 'Connection error', *, original_error: Exception | None = None, request: Any | None = None, response: httpx.Response | None = None)[source]
Raised when an attempt is made to operate on a stream that has already been consumed.
This exception is raised when trying to iterate over a stream that has already been fully consumed or exhausted. Once a stream has been consumed, it cannot be re-iterated.
- Parameters:
message (str) – Human-readable description of the error.
request (Optional[httpx.Request]) – Optional. The httpx.Request object associated with the error.
response (Optional[httpx.Response]) – Optional. The httpx.Response object if available.
original_error (typing.Optional[Exception])
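The single-pass behavior described for streams means that any chunk needed later has to be stored during the first iteration. The sketch below illustrates this with a toy stream. OneShotStream and the local StreamConsumedError are stand-ins written for this example only and do not reproduce the library's stream classes.

```python
from typing import Iterable, Iterator, List


class StreamConsumedError(Exception):
    """Local stand-in for the documented exception."""


class OneShotStream:
    """Hypothetical single-pass stream used only for this illustration."""

    def __init__(self, chunks: Iterable[str]) -> None:
        self._iterator = iter(chunks)
        self._consumed = False

    def __iter__(self) -> Iterator[str]:
        if self._consumed:
            raise StreamConsumedError("stream has already been consumed")
        for chunk in self._iterator:
            yield chunk
        self._consumed = True  # set only once the stream is exhausted


stream = OneShotStream(["Hel", "lo"])
collected: List[str] = list(stream)  # keep the chunks while iterating once
text = "".join(collected)

try:
    list(stream)  # a second pass is rejected
    second_pass_failed = False
except StreamConsumedError:
    second_pass_failed = True
```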
- exception venice_ai.exceptions.UnprocessableEntityError(message: str, *, request: httpx.Request | None = None, response: httpx.Response, body: Any | None = None)[source]
Raised for 422 Unprocessable Entity errors due to validation failures.
This exception is raised when the API returns a 422 status code, indicating that the request was well-formed but contained semantic errors that prevented it from being processed. This typically occurs when:
Request data fails server-side validation rules
Business logic constraints are violated
Data format is correct but values are semantically invalid
- Parameters:
message (str) – Human-readable description of the error.
request (Optional[httpx.Request]) – Optional. The httpx.Request object that led to the error.
response (httpx.Response) – The httpx.Response object from the API call.
body (Optional[Any]) – Optional. The parsed response body, if available.
- exception venice_ai.exceptions.VeniceError(message: str, *, request: httpx.Request | None = None, response: httpx.Response | None = None)[source]
Base exception for all errors raised by the Venice AI client.
This is the parent class for all custom exceptions in the Venice AI library. All other exception classes inherit from this base exception.
- Parameters:
message (str) – Human-readable description of the error.
request (Optional[httpx.Request]) – Optional. The httpx.Request object associated with the error.
response (Optional[httpx.Response]) – Optional. The httpx.Response object if the error originated from an API call.
- Variables:
message (str) – Human-readable description of the error.
request_obj (Optional[httpx.Request]) – The httpx.Request object associated with the error, if available.
response_obj (Optional[httpx.Response]) – The httpx.Response object if the error originated from an API call.
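Because every exception derives from VeniceError, handlers are usually ordered from most specific to most general, with the base class last. The sketch below demonstrates that ordering with local stand-in classes. The intermediate relationships shown (RateLimitError under APIError, APIConnectionError directly under VeniceError) are assumptions made for illustration; only the common VeniceError base is stated on this page.

```python
class VeniceError(Exception):
    """Stand-in for the documented base exception."""


class APIError(VeniceError):
    """Stand-in; ASSUMED to be the parent of status-code exceptions."""


class RateLimitError(APIError):
    """Stand-in for the 429 exception."""


class APIConnectionError(VeniceError):
    """Stand-in for the connection-level exception."""


def classify(exc: Exception) -> str:
    """Show which handler catches ``exc`` when ordered specific-to-general."""
    try:
        raise exc
    except RateLimitError:
        return "retry-later"
    except APIError:
        return "api-error"
    except VeniceError:
        return "client-error"
    except Exception:
        return "unexpected"
```

Placing the VeniceError handler first would swallow the more specific cases, which is why it appears near the end.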
Utilities¶
For utility functions provided by the library, please see the Client Utilities page.