How to Migrate from ChatGPT to Open WebUI + Ollama (2026)
Why Leave ChatGPT?
ChatGPT Plus costs $20/month ($240/year). Every conversation you have is sent to OpenAI's servers, logged, and potentially used for training. At scale — a team of 10 paying for Plus — that's $2,400/year.
More significantly: if your work involves confidential information, client data, code that can't leave your organization, or anything sensitive, ChatGPT's cloud model creates real risk.
Open WebUI + Ollama addresses both problems, without a steep quality tradeoff:
- Cost: Self-hosted LLMs run on your hardware. No monthly subscription.
- Privacy: All conversation data stays on your machine or server. Nothing leaves your network.
- Quality: In 2026, open models like Llama 3.3 70B and Mistral Small 3 24B perform comparably to GPT-4 for most everyday tasks.
This guide walks you through setting up a ChatGPT-equivalent experience that you control.
What You're Building
The stack has two components:
Ollama: A runtime that downloads and serves open source language models locally. Think of it as the backend that loads the AI models and handles inference. Supports 100+ models.
Open WebUI: A web interface that looks and works like ChatGPT. Chat history, conversation management, file uploads, model switching, user accounts, and more. Connects to Ollama as its model backend.
Together they give you a ChatGPT-like interface with complete data privacy.
Hardware Requirements
Your hardware determines which models you can run and how fast they respond.
For Local Use (Your Own Machine)
| Hardware | RAM | Models You Can Run |
|---|---|---|
| MacBook M1/M2/M3 (8GB) | 8GB | 3B-7B models (fast) |
| MacBook M1/M2/M3 (16GB) | 16GB | 7B-13B models (fast) |
| MacBook M3 Pro/Max (36GB) | 36GB | 30B-70B models (fast) |
| PC with NVIDIA RTX 3060 (12GB VRAM) | 12GB | 7B-13B models (GPU-fast) |
| PC with NVIDIA RTX 4090 (24GB VRAM) | 24GB | 30B-70B models (GPU-fast) |
Apple Silicon (M-series) Macs have unified memory — both CPU and GPU share it. This makes Macs excellent for local model inference without a discrete GPU.
For Server/Team Use (VPS)
To serve a team, you need a server with enough RAM for your chosen model:
| Model Size | RAM Required | Hetzner Server | Monthly Cost |
|---|---|---|---|
| 7B (good quality) | 8GB | CPX21 | $6.50 |
| 13B (better quality) | 16GB | CPX31 | $10 |
| 70B (best quality) | 48GB | CCX43 | $54 |
CPU-only inference is slow but workable: on a CPX21 without a GPU, 7B models generate roughly 5-10 tokens/second, which is around human reading speed and acceptable for interactive chat.
GPU servers: For 70B models at useful speed, GPU instances are necessary. GPU cloud servers cost $1-8/hour. Run them on-demand if you don't need 24/7 service.
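The RAM figures in the tables above follow a back-of-the-envelope rule: a model quantized to 4 bits (Ollama's default for most tags) needs about half a gigabyte per billion parameters for its weights, plus overhead for the KV cache and runtime. A small sketch — the 1.2 overhead factor is an assumption for illustration, not a measured value:

```python
def estimate_ram_gb(params_billions: float, bits: int = 4, overhead: float = 1.2) -> float:
    """Rough RAM needed to run a quantized model locally.

    params_billions: model size in billions of parameters
    bits: quantization width (Ollama tags default to 4-bit quants)
    overhead: assumed multiplier for KV cache, activations, and runtime
    """
    weights_gb = params_billions * bits / 8  # 1B params at 4-bit ~ 0.5 GB
    return round(weights_gb * overhead, 1)

for size in (3, 7, 13, 70):
    print(f"{size}B model: ~{estimate_ram_gb(size)} GB RAM")
```

This matches the tables: a 7B model fits comfortably in 8GB, while a 70B model wants 40GB+ before you account for the OS and other processes.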
Step 1: Install Ollama
macOS
# Install via Homebrew
brew install ollama
# Or download from ollama.com
curl -L https://ollama.com/download/ollama-darwin.zip -o ollama.zip
unzip ollama.zip
Linux
curl -fsSL https://ollama.com/install.sh | sh
Ollama installs as a system service and starts automatically.
Windows
Download the installer from ollama.com and run it. The native Windows build supports NVIDIA GPUs directly, so WSL2 is not required.
Verify Installation
ollama --version
# ollama version 0.5.x
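Ollama also exposes a local HTTP API on port 11434; `GET /api/tags` lists the models you have installed. A quick Python probe (the `model_names` helper is ours, not part of Ollama, and the script degrades gracefully if the server isn't running):

```python
import json
import urllib.request

OLLAMA_URL = "http://127.0.0.1:11434"  # Ollama's default listen address

def model_names(tags_payload: dict) -> list[str]:
    """Extract model names from the /api/tags response body."""
    return [m["name"] for m in tags_payload.get("models", [])]

try:
    with urllib.request.urlopen(f"{OLLAMA_URL}/api/tags", timeout=5) as resp:
        payload = json.load(resp)
    print("Ollama is up. Installed models:", model_names(payload) or "(none yet)")
except OSError:
    print("Ollama is not reachable on", OLLAMA_URL)
```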
Step 2: Download Your First Model
Choose a model based on your hardware:
Recommended Starting Models
For 8GB RAM (lighter models):
ollama pull llama3.2:3b # Meta Llama 3.2 3B — fast, decent quality
ollama pull phi4-mini # Microsoft Phi-4 Mini — surprisingly capable
For 16GB RAM (sweet spot):
ollama pull llama3.1:8b # Meta Llama 3.1 8B — good balance
ollama pull mistral # Mistral 7B v0.3 — fast and capable
ollama pull qwen2.5:7b # Alibaba Qwen 2.5 7B — strong coding
For 32GB+ RAM (high quality):
ollama pull llama3.1:70b # Meta Llama 3.1 70B — near GPT-4 quality
ollama pull qwen2.5:32b # Qwen 2.5 32B — excellent coding
ollama pull mistral-small3 # Mistral Small 3 24B — efficient and capable
Model downloads range from roughly 2GB (3B models) to 40GB (70B models). Downloaded models are stored in ~/.ollama/models/.
Test Your Model
ollama run llama3.1:8b
# >>> Hello! How can I help you today?
Type a message and press Enter. When the model responds, you've confirmed Ollama is working. Exit with /bye or Ctrl+D.
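The same chat works over the HTTP API, which is handy for scripting. A minimal non-streaming call to Ollama's `/api/generate` endpoint — the helper names here are ours, and you should swap the model tag for one you've actually pulled:

```python
import json
import urllib.request

def build_payload(prompt: str, model: str = "llama3.1:8b") -> dict:
    """Request body for /api/generate; stream=False returns one JSON object."""
    return {"model": model, "prompt": prompt, "stream": False}

def ask_ollama(prompt: str, model: str = "llama3.1:8b",
               url: str = "http://127.0.0.1:11434") -> str:
    """Send one prompt to a local Ollama server and return the response text."""
    data = json.dumps(build_payload(prompt, model)).encode()
    req = urllib.request.Request(f"{url}/api/generate", data=data,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.load(resp)["response"]

try:
    print(ask_ollama("Say hello in five words."))
except OSError:
    print("Ollama not reachable; is the Ollama service running?")
```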
Step 3: Install Open WebUI
Open WebUI provides the ChatGPT-like interface. Install via Docker:
Docker (Simplest)
# If Ollama is running locally on the same machine
docker run -d \
--name open-webui \
--network=host \
-v open-webui:/app/backend/data \
-e OLLAMA_BASE_URL=http://127.0.0.1:11434 \
--restart always \
ghcr.io/open-webui/open-webui:main
Access Open WebUI at http://localhost:8080
Docker Compose (Recommended for Production)
services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    volumes:
      - ollama_data:/root/.ollama
    restart: always
    # For GPU support, uncomment:
    # deploy:
    #   resources:
    #     reservations:
    #       devices:
    #         - capabilities: [gpu]

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    ports:
      - "3000:8080"
    volumes:
      - open-webui:/app/backend/data
    environment:
      OLLAMA_BASE_URL: http://ollama:11434
      WEBUI_SECRET_KEY: your-secret-key-here
    restart: always
    depends_on:
      - ollama

volumes:
  ollama_data:
  open-webui:
docker compose up -d
# Pull models via the Ollama container
docker exec ollama ollama pull llama3.1:8b
Configure Open WebUI
- Navigate to http://localhost:3000 (or your server IP)
- Create an admin account (first account automatically becomes admin)
- Open WebUI will auto-detect your Ollama installation
Step 4: Configure to Match Your ChatGPT Workflow
Set Default Model
Go to Settings → Interface → Default Model and select your preferred model. For daily use, Llama 3.1 8B or Mistral 7B work well.
Enable Web Search
Open WebUI supports web search integration — similar to ChatGPT's browsing feature. Configure in Settings → Web Search:
Options:
- SearXNG (self-hosted, fully private): Best for privacy
- Brave Search API (free tier: 2,000 queries/month)
- Tavily (developer-friendly API)
Enable Document Upload (RAG)
Open WebUI has built-in RAG (Retrieval-Augmented Generation) — upload PDFs, text files, and other documents and chat with them.
Configure in Settings → Documents:
- Set chunk size and overlap
- Configure the embedding model (Open WebUI downloads one automatically)
This replicates ChatGPT's "Upload files" feature, but your documents never leave your server.
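To build intuition for what the chunk size and overlap settings do, here is a deliberately simplified character-based chunker. Open WebUI's real pipeline is token-aware and embeds each chunk for retrieval, so treat this purely as an illustration of the sliding-window idea:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping chunks, as a RAG ingestion step does.

    Each chunk shares `overlap` characters with the previous one, so a
    sentence straddling a boundary still appears intact in one chunk.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]

doc = "word " * 300  # stand-in for an uploaded document
chunks = chunk_text(doc, chunk_size=500, overlap=50)
print(len(chunks), "chunks of up to 500 chars each")
```

Larger chunks give the model more context per retrieved passage; more overlap reduces the chance of cutting an answer in half, at the cost of some duplicated storage.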
Connect External API Models (Optional)
If you also want access to cloud models like GPT-4 or Claude alongside local models, add API keys in Settings → Connections:
OpenAI API: sk-...
Anthropic API: sk-ant-...
Open WebUI shows all models (local Ollama + API) in a single model dropdown. Switch between a local Llama model and cloud GPT-4 in the same interface.
Step 5: Migrate Your ChatGPT Habits
Export ChatGPT History (Optional)
If you want to preserve your ChatGPT conversation history:
- ChatGPT → Settings → Data Controls → Export Data
- Wait for email with download link
- Download ZIP containing conversations.json
Open WebUI doesn't have an import function for ChatGPT exports, but you can reference old conversations manually or build a simple script to reformat and import them.
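Such a reformatting script might look like the sketch below. It assumes the `conversations.json` layout seen in recent ChatGPT exports (a list of conversations, each with a `title` and a `mapping` of message nodes); OpenAI may change this format, so verify the field names against your own export:

```python
import json
import sys

def conversation_to_markdown(conv: dict) -> str:
    """Flatten one exported ChatGPT conversation into Markdown.

    Nodes are sorted by create_time instead of walking the reply tree,
    which is good enough for linear chats.
    """
    lines = [f"# {conv.get('title', 'Untitled')}"]
    nodes = [n.get("message") for n in conv.get("mapping", {}).values()]
    msgs = [m for m in nodes if m and m.get("content", {}).get("parts")]
    msgs.sort(key=lambda m: m.get("create_time") or 0)
    for m in msgs:
        role = m["author"]["role"]
        text = "\n".join(p for p in m["content"]["parts"] if isinstance(p, str))
        if text.strip():
            lines.append(f"\n**{role}:** {text}")
    return "\n".join(lines)

# Usage: python convert.py conversations.json > history.md
if __name__ == "__main__" and len(sys.argv) > 1:
    for conv in json.load(open(sys.argv[1])):
        print(conversation_to_markdown(conv), "\n\n---\n")
```

The resulting Markdown files can then be uploaded into Open WebUI as documents, making your old conversations searchable via RAG even without a native importer.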
Custom System Prompts
ChatGPT Plus lets you set custom instructions. Open WebUI calls these "System Prompts":
- Settings → Interface → System Prompt
- Add your custom instructions (persona, response style, domain context)
Example system prompt for a coding assistant:
You are an expert programmer. When writing code:
- Default to TypeScript unless another language is specified
- Include error handling
- Add brief comments for non-obvious logic
- Suggest testing approaches when relevant
Conversation Management
Open WebUI mirrors ChatGPT's conversation sidebar:
- Previous conversations organized by date
- Search through conversation history
- Pin important conversations
- Archive or delete old chats
Keyboard Shortcuts
Open WebUI supports many ChatGPT-familiar shortcuts:
- Ctrl/Cmd + Enter: Submit message
- Ctrl/Cmd + Shift + O: New conversation
- ↑: Edit last message
Step 6: Team Setup (Optional)
If you're replacing ChatGPT for a team:
Enable Multi-User Mode
Open WebUI supports multiple user accounts:
- Admin creates accounts via Admin Panel → Users → Add User
- Or enable Open Registration for self-service signup
- Assign roles: Admin, User
Set Usage Policies
In Admin Panel → Settings:
- Limit which models specific user groups can access
- Set rate limits (if needed for resource management)
- Configure shared conversation spaces
Configure Authentication (Team)
For team deployments, integrate with your identity provider:
- OAuth: Google, GitHub, Microsoft, Authentik, Keycloak
- LDAP: Active Directory integration
- SAML: Enterprise SSO
Model Quality Comparison vs ChatGPT
In 2026, the gap between the best open models and ChatGPT has closed significantly for common tasks:
| Task | ChatGPT GPT-4o | Llama 3.1 70B | Mistral Small 3 |
|---|---|---|---|
| General Q&A | Excellent | Excellent | Very Good |
| Code generation | Excellent | Excellent | Excellent |
| Creative writing | Excellent | Very Good | Good |
| Reasoning | Excellent | Very Good | Good |
| Math | Excellent | Very Good | Good |
| Multimodal (images) | Yes | Via llava model | Limited |
| Response speed (local) | Fast (cloud) | Slow (needs hardware) | Fast |
For most everyday tasks — code help, writing assistance, question answering, summarization — local 7B-13B models are sufficient. 70B models approach GPT-4 quality.
Cost Analysis
ChatGPT Plus (Per Year)
| Users | Annual Cost |
|---|---|
| 1 | $240 |
| 5 | $1,200 |
| 10 | $2,400 |
| 25 | $6,000 |
Self-Hosted Open WebUI + Ollama
Personal Mac use: $0 (runs on your existing hardware)
Dedicated server for a team:
| Server | Model Quality | Annual |
|---|---|---|
| Hetzner CPX21 (8GB) | 7B models | $78 |
| Hetzner CPX31 (16GB) | 13B models | $120 |
| Hetzner CPX51 (32GB) | 30B models | $360 |
For a 10-person team: Open WebUI server at $120/year vs ChatGPT Plus at $2,400/year. Savings: $2,280/year.
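The comparison generalizes: ChatGPT Plus scales per seat, while a self-hosted server is one flat cost shared by the whole team. A quick calculator using the prices from the tables above:

```python
def annual_savings(team_size: int, server_cost_per_year: float,
                   chatgpt_per_seat_monthly: float = 20.0) -> float:
    """Yearly savings of one shared self-hosted server vs per-seat ChatGPT Plus."""
    return team_size * chatgpt_per_seat_monthly * 12 - server_cost_per_year

# Team sizes paired with a plausible server tier from the tables above
for team, server in [(1, 0), (5, 120), (10, 120), (25, 360)]:
    print(f"{team:>2} users: ${annual_savings(team, server):,.0f}/year saved")
```

Note the break-even point is low: even a single user on a paid server tier saves money within the first year.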
Find More AI Alternatives
Browse all ChatGPT alternatives on OSSAlt — compare Open WebUI, LibreChat, Jan, AnythingLLM, and every other open source AI interface with deployment guides and model comparisons.