Self-Host Perplexica: Open Source Perplexity AI in 2026
TL;DR
Perplexica is the closest open-source equivalent to Perplexity AI. It combines SearXNG (private web search) with any LLM backend — local Ollama models or cloud providers — to give you cited, AI-generated answers to your questions without sending data to Perplexity's servers. Self-hosting takes 10 minutes with Docker Compose and costs nothing to run if you already have a server.
Key Takeaways
- Perplexica searches the web via SearXNG, reranks results with embeddings, and generates cited answers using your LLM of choice
- Fully private: no queries leave your infrastructure when using Ollama + SearXNG
- LLM flexibility: supports Ollama (local), OpenAI, Anthropic Claude, Google Gemini, Groq
- Search modes: Web, Academic, YouTube, Reddit/Discussions, Wolfram Alpha
- License: MIT — fully open-source, no cloud dependency required
- Requirements: 2GB+ RAM for the stack; 8GB+ if running local Ollama models
- Perplexity Pro costs $20/month — Perplexica costs $5–10/month in VPS fees or runs free on existing hardware
What Is Perplexica?
Perplexity AI became popular by combining real-time web search with LLM-generated answers and source citations. Rather than a keyword search engine returning links, it answers questions directly with inline citations you can verify.
Perplexica replicates this workflow entirely with open-source components:
User query
↓
SearXNG (private meta-search engine)
↓ top N results
Embedding reranker (filters most relevant results)
↓ relevant context
LLM (Ollama/OpenAI/Claude/Gemini)
↓
Cited answer with source links
The key difference from just using an LLM: Perplexica fetches current web content before generating the answer. Ask about a news event from yesterday and it finds the relevant articles first, then synthesizes a response — the LLM isn't relying on training data.
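The rerank step in the middle of that pipeline can be sketched in a few lines of Python. This is an illustrative reimplementation, not Perplexica's actual code: it scores toy search snippets against the query by cosine similarity of their embeddings (the vectors here are made-up stand-ins for real embedding-model output) and keeps the top k as LLM context.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

def rerank(query_emb, docs, top_k=2):
    """Return the titles of the top_k docs closest to the query embedding."""
    scored = sorted(docs, key=lambda d: cosine(query_emb, d["emb"]), reverse=True)
    return [d["title"] for d in scored[:top_k]]

# Toy embeddings standing in for real model output
query = [1.0, 0.0, 0.5]
docs = [
    {"title": "relevant article", "emb": [0.9, 0.1, 0.4]},
    {"title": "off-topic page",   "emb": [0.0, 1.0, 0.0]},
    {"title": "related blog",     "emb": [0.7, 0.2, 0.6]},
]
print(rerank(query, docs))  # the two on-topic results rank first
```

Only the surviving snippets reach the LLM, which keeps the prompt small and the answer grounded in current sources.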
What You Get vs. Perplexity Pro
| Feature | Perplexity Pro ($20/mo) | Perplexica (self-hosted) |
|---|---|---|
| AI-powered web search | ✅ | ✅ |
| Source citations | ✅ | ✅ |
| Local/private queries | ❌ | ✅ |
| Model choice | Limited | Any (Ollama, OpenAI, Claude…) |
| Academic search | ✅ | ✅ |
| Reddit/discussion search | ✅ | ✅ |
| YouTube search | ✅ | ✅ |
| Image generation | ✅ | ❌ |
| API access | ✅ (paid) | ✅ (self-hosted) |
| Mobile app | ✅ | ❌ |
| Monthly cost | $20 | $0–10 |
The main gaps vs. Perplexity Pro are the mobile app and image generation (neither is in the base install). The core search-and-answer workflow is fully replicated.
System Requirements
Minimum (cloud LLM like OpenAI/Groq):
- RAM: 2GB
- CPU: 1 vCPU
- Storage: 5GB

Recommended (local Ollama models):
- RAM: 16GB+ (for 7B–8B models like Llama 3.1 8B)
- GPU: optional but recommended (NVIDIA for CUDA)
- Storage: 20GB+ (model weights)
Tested on:
- Ubuntu 22.04 / Debian 12
- VPS: Hetzner CPX21 (~$7/mo) for cloud LLM mode
- Local: M-series Mac, modern Linux desktop for Ollama mode
Docker Compose Install
Perplexica ships as a multi-container Docker Compose stack. SearXNG is included automatically — you don't install it separately.
Step 1: Clone and Configure
git clone https://github.com/ItzCrazyKns/Perplexica.git
cd Perplexica
cp sample.config.toml config.toml
Edit config.toml:
[GENERAL]
PORT = 3000 # Web UI port
SIMILARITY_MEASURE = "cosine" # or "dot_product"
[API_KEYS]
OPENAI = "sk-..." # Optional — for OpenAI models
GROQ = "gsk_..." # Optional — for Groq (fast free tier)
ANTHROPIC = "" # Optional — for Claude
[API_ENDPOINTS]
OLLAMA = "http://ollama:11434" # If running Ollama in Docker
SEARXNG = "http://searxng:8080"
[CHAT_MODEL]
PROVIDER = "ollama" # or "openai", "anthropic", "groq"
MODEL = "llama3.1:8b" # Model name for your provider
CUSTOM_OPENAI_BASE_URL = "" # For OpenAI-compatible endpoints
CUSTOM_OPENAI_KEY = ""
[EMBEDDING_MODEL]
PROVIDER = "local" # Uses local HuggingFace model
MODEL = "togethercomputer/m2-bert-80M-8k-retrieval"
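The SIMILARITY_MEASURE setting controls how the reranker compares embeddings. The practical difference is easy to see with toy vectors: dot product is sensitive to vector magnitude, while cosine normalizes it away and compares direction only (a rough illustration, not Perplexica's code):

```python
from math import sqrt

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cos_sim(a, b):
    return dot(a, b) / (sqrt(dot(a, a)) * sqrt(dot(b, b)))

v = [1.0, 2.0]
scaled = [2.0, 4.0]  # same direction, twice the magnitude

print(dot(v, v), dot(v, scaled))          # 5.0 vs 10.0: magnitude changes the score
print(cos_sim(v, v), cos_sim(v, scaled))  # both ≈ 1.0: direction is all that matters
```

Cosine is the safe default unless your embedding model's documentation specifically recommends dot product.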
Step 2: Docker Compose
# docker-compose.yml — full Perplexica stack
version: "3.8"

services:
  perplexica-backend:
    image: itzcrazykns/perplexica-backend:main
    depends_on:
      - searxng
    ports:
      - "3001:3001"
    volumes:
      - ./config.toml:/home/perplexica/config.toml
      - perplexica-uploads:/home/perplexica/uploads
    networks:
      - perplexica-network
    restart: unless-stopped

  perplexica-frontend:
    image: itzcrazykns/perplexica-frontend:main
    depends_on:
      - perplexica-backend
    ports:
      - "3000:3000"
    environment:
      NEXT_PUBLIC_API_URL: http://localhost:3001
      NEXT_PUBLIC_WS_URL: ws://localhost:3001
    networks:
      - perplexica-network
    restart: unless-stopped

  searxng:
    image: searxng/searxng:latest
    volumes:
      - ./searxng:/etc/searxng
    networks:
      - perplexica-network
    restart: unless-stopped

networks:
  perplexica-network:
    driver: bridge

volumes:
  perplexica-uploads:
# Start the full stack
docker compose up -d
# Check all containers are running
docker compose ps
# View logs if something fails
docker compose logs -f perplexica-backend
Open http://localhost:3000 — the search UI is ready.
Step 3: Configure SearXNG
SearXNG needs its settings file to allow JSON format (required by Perplexica):
# Create SearXNG config directory
mkdir -p searxng

# Create settings.yml — note the unquoted EOF, so $(openssl ...) expands
# into a real random secret instead of being written literally
cat > searxng/settings.yml << EOF
use_default_settings: true

server:
  secret_key: "$(openssl rand -hex 32)"
  limiter: false
  image_proxy: true

search:
  safe_search: 0
  autocomplete: ""
  default_lang: "en"
  formats:
    - html
    - json  # Required by Perplexica

outgoing:
  request_timeout: 10.0

engines:
  - name: google
    engine: google
    shortcut: g
  - name: bing
    engine: bing
    shortcut: b
  - name: duckduckgo
    engine: duckduckgo
    shortcut: ddg

ui:
  static_use_hash: true

enabled_plugins:
  - 'Hash plugin'
  - 'Search on category select'
  - 'Tracker URL remover'
EOF
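The `json` format matters because Perplexica consumes SearXNG programmatically rather than scraping HTML. A JSON response has roughly the shape below (the `results` entries carry `title`, `url`, and `content` fields; the values here are made up for illustration), and pulling usable LLM context out of it is trivial:

```python
import json

# Illustrative payload in the shape SearXNG returns for ?format=json
raw = json.dumps({
    "query": "self-hosted ai search",
    "results": [
        {"title": "Perplexica on GitHub",
         "url": "https://github.com/ItzCrazyKns/Perplexica",
         "content": "Open source AI-powered search engine..."},
        {"title": "SearXNG docs",
         "url": "https://docs.searxng.org",
         "content": "Privacy-respecting metasearch engine..."},
    ],
})

def extract_context(payload, limit=5):
    """Pull (title, url, snippet) triples out of a SearXNG JSON response."""
    data = json.loads(payload)
    return [(r["title"], r["url"], r["content"]) for r in data["results"][:limit]]

for title, url, snippet in extract_context(raw):
    print(f"{title}: {snippet[:40]} ({url})")
```

If Perplexica's backend logs show SearXNG errors, a missing `json` entry under `formats` is the first thing to check.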
Adding Ollama for Local Models
For fully private search (no data leaves your server), add Ollama to the stack:
# Add to docker-compose.yml services:
ollama:
  image: ollama/ollama:latest
  volumes:
    - ollama_data:/root/.ollama
  ports:
    - "11434:11434"
  networks:
    - perplexica-network
  restart: unless-stopped
  # For GPU support, add:
  # deploy:
  #   resources:
  #     reservations:
  #       devices:
  #         - driver: nvidia
  #           count: all
  #           capabilities: [gpu]

# And under volumes:
volumes:
  ollama_data:  # Add this alongside perplexica-uploads
# Pull a model after starting Ollama (exec via the service name, so it
# works regardless of your Compose project's container naming)
docker compose exec ollama ollama pull llama3.1:8b

# For a smaller, faster model on machines with less RAM:
docker compose exec ollama ollama pull mistral:7b
# Update config.toml to use Ollama:
# PROVIDER = "ollama"
# MODEL = "llama3.1:8b"
# OLLAMA = "http://ollama:11434"
With this configuration, your search queries go to your own SearXNG instance and answers are generated by your local Llama 3.1 — nothing reaches external services.
Search Modes
Perplexica ships with specialized search modes beyond general web search:
Web (default)
→ SearXNG web results → LLM answer with citations
Academic
→ Searches arXiv, Semantic Scholar, PubMed
→ Best for research questions
YouTube
→ Searches YouTube, returns video links with summaries
→ Useful for "how to" questions
Reddit / Discussions
→ Searches Reddit for community discussion
→ Good for "what do people think about X" questions
Wolfram Alpha (optional, requires API key)
→ Computational/factual queries
→ Math, unit conversions, factual lookups
Configure the default mode in config.toml:
[FOCUS_MODE]
DEFAULT = "webSearch" # or "academicSearch", "youtubeSearch", "redditSearch"
Reverse Proxy with Caddy
For public access with HTTPS:
# Caddyfile
search.yourdomain.com {
    reverse_proxy localhost:3000
}
# With Caddy installed:
caddy run --config /etc/caddy/Caddyfile
# Or add to docker-compose.yml as a Caddy service
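If you prefer keeping everything in one stack, Caddy can run as a fourth container. A sketch (the network name matches the compose file above; the volume name `caddy_data` is my choice, adjust paths as needed):

```yaml
# Add to docker-compose.yml services:
caddy:
  image: caddy:latest
  ports:
    - "80:80"
    - "443:443"
  volumes:
    - ./Caddyfile:/etc/caddy/Caddyfile
    - caddy_data:/data  # persists TLS certificates across restarts
  networks:
    - perplexica-network
  restart: unless-stopped

# And under volumes:
# caddy_data:
```

Inside the Docker network, point the Caddyfile at the service name instead of localhost: `reverse_proxy perplexica-frontend:3000`.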
Cost Comparison
| Setup | Monthly Cost | Privacy | Speed |
|---|---|---|---|
| Perplexity Pro | $20 | ❌ Cloud | ⚡ Fast |
| Perplexica + Groq (free tier) | $5 (VPS only) | ⚠️ Groq API | ⚡ Fast |
| Perplexica + OpenAI | $5–15 (VPS + API) | ⚠️ OpenAI API | ⚡ Fast |
| Perplexica + Ollama (cloud) | $15–30 (GPU VPS) | ✅ Full | ⚡ Moderate |
| Perplexica + Ollama (local) | $0 (existing hw) | ✅ Full | ⚡ Depends on GPU |
The sweet spot for most developers: run Perplexica on a $5/month VPS with Groq's free API tier (28K tokens/minute free) for fast, nearly free AI-powered search with only the LLM call leaving your server.
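To sanity-check that the free tier is enough for interactive use: assuming roughly 2,000 tokens per search (reranked context plus the generated answer; this per-search figure is my assumption and varies with result count and answer length), the 28K tokens/minute budget quoted above leaves comfortable headroom:

```python
TOKENS_PER_MINUTE = 28_000  # Groq free-tier figure quoted above
TOKENS_PER_SEARCH = 2_000   # rough assumption: context + answer

searches_per_minute = TOKENS_PER_MINUTE // TOKENS_PER_SEARCH
print(searches_per_minute)  # 14 searches per minute before hitting the cap
```

Far more than one person (or a small team) will issue by hand.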
More self-hosted AI search tools at OSSAlt.
Related: Best Open Source Cursor Alternatives 2026 · Self-Host Coolify: Open Source Vercel Alternative