Models

Built-in models

PyGPT has a preconfigured list of models (as of 2026-02-06):

  • bielik-11b-v2.3-instruct:Q4_K_M (Ollama)

  • chatgpt-4o-latest (OpenAI)

  • claude-3-5-sonnet-20240620 (Anthropic)

  • claude-3-7-sonnet (Anthropic)

  • claude-3-opus (Anthropic)

  • claude-3-opus (Anthropic)

  • claude-opus-4-0 (Anthropic)

  • claude-sonnet-4-0 (Anthropic)

  • claude-opus-4-5 (Anthropic)

  • claude-sonnet-4-5 (Anthropic)

  • codellama (Ollama)

  • codex-mini (OpenAI)

  • dall-e-2 (OpenAI)

  • dall-e-3 (OpenAI)

  • deep-research-pro-preview-12-2025 (Google)

  • deepseek-chat (DeepSeek)

  • deepseek-r1:1.5b (Ollama)

  • deepseek-r1:14b (Ollama)

  • deepseek-r1:32b (Ollama)

  • deepseek-r1:7b (Ollama)

  • deepseek-reasoner (DeepSeek)

  • gemini-2.5-computer-use-preview-10-2025 (Google)

  • gemini-1.5-flash (Google)

  • gemini-1.5-pro (Google)

  • gemini-2.0-flash-exp (Google)

  • gemini-2.5-flash (Google)

  • gemini-2.5-flash-preview-native-audio-dialog (Google, real-time)

  • gemini-2.5-pro (Google)

  • gemini-3-flash-preview (Google)

  • gemini-3-pro-image-preview (Google)

  • gemini-3-pro-preview (Google)

  • gpt-3.5-turbo (OpenAI)

  • gpt-3.5-turbo-16k (OpenAI)

  • gpt-3.5-turbo-instruct (OpenAI)

  • gpt-4 (OpenAI)

  • gpt-4-32k (OpenAI)

  • gpt-4-turbo (OpenAI)

  • gpt-4-vision-preview (OpenAI)

  • gpt-4.1 (OpenAI)

  • gpt-4.1-mini (OpenAI)

  • gpt-4.1-nano (OpenAI)

  • gpt-4o (OpenAI)

  • gpt-4o-realtime-preview (OpenAI, real-time)

  • gpt-4o-mini (OpenAI)

  • gpt-5 (OpenAI)

  • gpt-5-mini (OpenAI)

  • gpt-5-nano (OpenAI)

  • gpt-5.2 (OpenAI)

  • gpt-image-1 (OpenAI)

  • gpt-image-1.5 (OpenAI)

  • gpt-oss:20b (OpenAI - via Ollama and HuggingFace Router)

  • gpt-oss:120b (OpenAI - via Ollama and HuggingFace Router)

  • gpt-realtime (OpenAI, real-time)

  • grok-2-vision (xAI)

  • grok-3 (xAI)

  • grok-3-fast (xAI)

  • grok-3-mini (xAI)

  • grok-3-mini-fast (xAI)

  • grok-4 (xAI)

  • grok-imagine-image (xAI)

  • grok-imagine-video (xAI)

  • llama2-uncensored (Ollama)

  • llama3.1 (Ollama)

  • llama3.1:70b (Ollama)

  • mistral (Ollama)

  • mistral-large (Ollama)

  • mistral-small3.1 (Ollama)

  • nano-banana-pro-preview (Google)

  • o1 (OpenAI)

  • o1-mini (OpenAI)

  • o1-pro (OpenAI)

  • o3 (OpenAI)

  • o3-deep-research (OpenAI)

  • o3-mini (OpenAI)

  • o3-pro (OpenAI)

  • o4-mini (OpenAI)

  • o4-mini-deep-research (OpenAI)

  • qwen2:7b (Ollama)

  • qwen2.5-coder:7b (Ollama)

  • qwen3:8b (Ollama)

  • qwen3:30b-a3b (Ollama)

  • r1 (Perplexity)

  • sonar (Perplexity)

  • sonar-deep-research (Perplexity)

  • sonar-pro (Perplexity)

  • sonar-reasoning (Perplexity)

  • sonar-reasoning-pro (Perplexity)

  • sora-2 (OpenAI)

  • veo-3.0-generate-preview (Google)

  • veo-3.0-fast-generate-preview (Google)

  • veo-3.1-generate-preview (Google)

  • veo-3.1-fast-generate-preview (Google)

All models are specified in the configuration file models.json, which you can customize. This file is located in your working directory. You can add new models provided directly by OpenAI API (or compatible), Google Gen AI API, Anthropic API, xAI API, and those supported by LlamaIndex or Ollama to this file. Configuration for LlamaIndex in placed in llama_index key.

You can import new models by manually editing models.json or by using the model importer in the Config -> Models -> Import menu.

Tip

The models on the list are sorted by provider, not by manufacturer. A model from a particular manufacturer may be available through different providers (e.g., OpenAI models can be provided by the OpenAI API or by OpenRouter). If you want to use a specific model through a particular provider, you need to configure the provider in Config -> Models -> Edit, or import it directly via Config -> Models -> Import.

Tip

Anthropic and Deepseek API providers use VoyageAI for embeddings (Chat with Files and attachments RAG), so you must also configure the Voyage API key if you want to use embeddings from these providers.

Adding a custom model

You can add your own models. See the section Extending PyGPT / Adding a new model for more info.

There is built-in support for those LLM providers:

  • Anthropic

  • Azure OpenAI (native SDK)

  • Deepseek API

  • Google (native SDK)

  • HuggingFace API

  • HuggingFace Router (wrapper for OpenAI compatible ChatCompletions)

  • Local models (OpenAI API compatible)

  • Mistral AI

  • Ollama

  • OpenAI (native SDK)

  • OpenRouter

  • Perplexity

  • xAI (native SDK)

How to use local or non-GPT models

Llama 3, Mistral, DeepSeek, Qwen, gpt-oss, and other local models

How to use locally installed Llama 3, DeepSeek, Mistral, etc. models:

  1. Choose a working mode: Chat or Chat with Files.

  2. On the models list - select, edit, or add a new model (with ollama provider). You can edit the model settings through the menu Config -> Models -> Edit, then configure the model parameters in the advanced section.

  3. Download and install Ollama from here: https://github.com/ollama/ollama

For example, on Linux:

$ curl -fsSL https://ollama.com/install.sh | sh
  1. Run the model (e.g. Llama 3) locally on your machine. For example, on Linux:

$ ollama run llama3.1
  1. Return to PyGPT and select the correct model from models list to chat with selected model using Ollama running locally.

Example available models:

  • llama3.1

  • codellama

  • mistral

  • llama2-uncensored

  • deepseek-r1

etc.

You can add more models by editing the models list.

Real-time importer

You can also import models in real-time from a running Ollama instance using the Config -> Models -> Import... tool.

Custom Ollama endpoint

The default endpoint for Ollama is: http://localhost:11434

You can change it globally by setting the environment variable OLLAMA_API_BASE in Settings -> General -> Advanced -> Application environment.

You can also change the “base_url” for a specific model in its configuration:

Config -> Models -> Edit, then in the Advanced -> [LlamaIndex] ENV Vars section add the variable:

NAME: OLLAMA_API_BASE VALUE: http://my_endpoint.com:11434

List of all models supported by Ollama:

https://ollama.com/library

https://github.com/ollama/ollama

IMPORTANT: Remember to define the correct model name in the **kwargs list in the model settings.

Using local embeddings

Refer to: https://docs.llamaindex.ai/en/stable/examples/embeddings/ollama_embedding/

You can use an Ollama instance for embeddings. Simply select the ollama provider in:

Config -> Settings -> Indexes / LlamaIndex -> Embeddings -> Embeddings provider

Define parameters like model name and Ollama base URL in the Embeddings provider **kwargs list, e.g.:

  • name: model_name, value: llama3.1, type: str

  • name: base_url, value: http://localhost:11434, type: str

Google Gemini, Anthropic Claude, xAI Grok, etc.

If you want to use non-OpenAI models in Chat with Files and Agents (LlamaIndex) modes, then remember to configure the required parameters like API keys in the model config fields. Chat mode works via OpenAI SDK (compatible API), Chat with Files and Agents (LlamaIndex) modes works via LlamaIndex.

Google Gemini

Required ENV:

  • GOOGLE_API_KEY = {api_key_google}

Required **kwargs:

  • model

Anthropic Claude

Required ENV:

  • ANTHROPIC_API_KEY = {api_key_anthropic}

Required **kwargs:

  • model

xAI Grok (Chat mode only)

Required ENV:

  • OPENAI_API_KEY = {api_key_xai}

  • OPENAI_API_BASE = {api_endpoint_xai}

Required **kwargs:

  • model

Mistral AI

Required ENV:

  • MISTRAL_API_KEY = {api_key_mistral}

Required **kwargs:

  • model

Perplexity

Required ENV:

  • PPLX_API_KEY = {api_key_perplexity}

Required **kwargs:

  • model

HuggingFace API (Chat with Files mode only)

Required ENV:

  • HUGGING_FACE_TOKEN = {api_key_hugging_face}

Required **kwargs:

  • model_name | model

  • token

  • provider = auto