
· OSSAlt Team
Tags: perplexica · perplexity · self-host · ai-search · searxng · ollama · open-source · 2026

Self-Host Perplexica: Open Source Perplexity AI in 2026

TL;DR

Perplexica is the closest open-source equivalent to Perplexity AI. It combines SearXNG (private web search) with any LLM backend — local Ollama models or cloud providers — to give you cited, AI-generated answers to your questions without sending data to Perplexity's servers. Self-hosting takes 10 minutes with Docker Compose and costs nothing to run if you already have a server.

Key Takeaways

  • Perplexica searches the web via SearXNG, reranks results with embeddings, and generates cited answers using your LLM of choice
  • Fully private: no queries leave your infrastructure when using Ollama + SearXNG
  • LLM flexibility: supports Ollama (local), OpenAI, Anthropic Claude, Google Gemini, Groq
  • Search modes: Web, Academic, YouTube, Reddit/Discussions, Wolfram Alpha
  • License: MIT — fully open-source, no cloud dependency required
  • Requirements: 2GB+ RAM for the stack; 8GB+ for small local Ollama models (16GB+ recommended for 7B models like Llama 3.1)
  • Perplexity Pro costs $20/month — Perplexica costs $5–10/month in VPS fees or runs free on existing hardware

What Is Perplexica?

Perplexity AI became popular by combining real-time web search with LLM-generated answers and source citations. Rather than a keyword search engine returning links, it answers questions directly with inline citations you can verify.

Perplexica replicates this workflow entirely with open-source components:

User query
    ↓
SearXNG (private meta-search engine)
    ↓ top N results
Embedding reranker (filters most relevant results)
    ↓ relevant context
LLM (Ollama/OpenAI/Claude/Gemini)
    ↓
Cited answer with source links

The key difference from just using an LLM: Perplexica fetches current web content before generating the answer. Ask about a news event from yesterday and it finds the relevant articles first, then synthesizes a response — the LLM isn't relying on training data.
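The embedding-rerank step in the middle of that pipeline can be sketched in a few lines. This is a toy illustration with hand-made vectors, not Perplexica's actual code; the real pipeline embeds the query and each search result with its configured embedding model, then scores them with the configured similarity measure:

```python
# Sketch of the rerank step: score each search result against the
# query embedding and keep only the most similar ones as LLM context.
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def rerank(query_vec, results, top_k=2):
    """results: list of (title, embedding) pairs; returns top_k titles."""
    scored = sorted(results, key=lambda r: cosine(query_vec, r[1]), reverse=True)
    return [title for title, _ in scored[:top_k]]

# Toy vectors standing in for real embeddings
query = [1.0, 0.0, 0.5]
results = [
    ("relevant article", [0.9, 0.1, 0.4]),
    ("off-topic page",   [0.0, 1.0, 0.0]),
    ("related post",     [0.7, 0.2, 0.6]),
]
print(rerank(query, results))  # ['relevant article', 'related post']
```

Only the surviving top-k results are passed to the LLM, which keeps the prompt small and the answer grounded in the most relevant pages.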


What You Get vs. Perplexity Pro

| Feature | Perplexity Pro ($20/mo) | Perplexica (self-hosted) |
|---|---|---|
| AI-powered web search | ✅ | ✅ |
| Source citations | ✅ | ✅ |
| Local/private queries | ❌ | ✅ |
| Model choice | Limited | Any (Ollama, OpenAI, Claude…) |
| Academic search | ✅ | ✅ |
| Reddit/discussion search | ✅ | ✅ |
| YouTube search | ✅ | ✅ |
| Image generation | ✅ | ❌ |
| API access | ✅ (paid) | ✅ (self-hosted) |
| Mobile app | ✅ | ❌ |
| Monthly cost | $20 | $0–10 |

The main gaps vs. Perplexity Pro: no mobile app and no image generation in the base install. Everything else in the core search-and-answer workflow is replicated.


System Requirements

Minimum (cloud LLM like OpenAI/Groq):
  RAM: 2GB
  CPU: 1 vCPU
  Storage: 5GB

Recommended (local Ollama models):
  RAM: 16GB+ (for 7B models like Llama 3.1)
  GPU: Optional but recommended (NVIDIA for CUDA)
  Storage: 20GB+ (model weights)

Tested on:
  - Ubuntu 22.04 / Debian 12
  - VPS: Hetzner CPX21 (~$7/mo) for cloud LLM mode
  - Local: M-series Mac, modern Linux desktop for Ollama mode

Docker Compose Install

Perplexica ships as a multi-container Docker Compose stack. SearXNG is included automatically — you don't install it separately.

Step 1: Clone and Configure

git clone https://github.com/ItzCrazyKns/Perplexica.git
cd Perplexica
cp sample.config.toml config.toml

Edit config.toml:

[GENERAL]
PORT = 3000                    # Web UI port
SIMILARITY_MEASURE = "cosine"  # or "dot_product"

[API_KEYS]
OPENAI = "sk-..."              # Optional — for OpenAI models
GROQ = "gsk_..."               # Optional — for Groq (fast free tier)
ANTHROPIC = ""                 # Optional — for Claude

[API_ENDPOINTS]
OLLAMA = "http://ollama:11434" # If running Ollama in Docker
SEARXNG = "http://searxng:8080"

[CHAT_MODEL]
PROVIDER = "ollama"            # or "openai", "anthropic", "groq"
MODEL = "llama3.1:8b"          # Model name for your provider
CUSTOM_OPENAI_BASE_URL = ""    # For OpenAI-compatible endpoints
CUSTOM_OPENAI_KEY = ""

[EMBEDDING_MODEL]
PROVIDER = "local"             # Uses local HuggingFace model
MODEL = "togethercomputer/m2-bert-80M-8k-retrieval"
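The two SIMILARITY_MEASURE options agree on unit-length embeddings and only diverge when vector magnitudes differ. A quick self-contained check with toy vectors (not real embeddings) shows the difference:

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

q = [1.0, 1.0]
short = [0.5, 0.5]   # same direction as q, small magnitude
long_ = [3.0, 0.0]   # different direction, large magnitude

# Dot product rewards magnitude...
print(dot(q, short), dot(q, long_))      # 1.0 3.0
# ...while cosine only compares direction.
print(round(cosine(q, short), 3), round(cosine(q, long_), 3))  # 1.0 0.707
```

If your embedding model outputs normalized vectors, the two settings rank results identically, so "cosine" is a safe default.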

Step 2: Docker Compose

# docker-compose.yml — full Perplexica stack
version: "3.8"

services:
  perplexica-backend:
    image: itzcrazykns/perplexica-backend:main
    depends_on:
      - searxng
    ports:
      - "3001:3001"
    volumes:
      - ./config.toml:/home/perplexica/config.toml
      - perplexica-uploads:/home/perplexica/uploads
    networks:
      - perplexica-network
    restart: unless-stopped

  perplexica-frontend:
    image: itzcrazykns/perplexica-frontend:main
    depends_on:
      - perplexica-backend
    ports:
      - "3000:3000"
    environment:
      NEXT_PUBLIC_API_URL: http://localhost:3001
      NEXT_PUBLIC_WS_URL: ws://localhost:3001
    networks:
      - perplexica-network
    restart: unless-stopped

  searxng:
    image: searxng/searxng:latest
    volumes:
      - ./searxng:/etc/searxng
    networks:
      - perplexica-network
    restart: unless-stopped

networks:
  perplexica-network:
    driver: bridge

volumes:
  perplexica-uploads:

# Start the full stack
docker compose up -d

# Check all containers are running
docker compose ps

# View logs if something fails
docker compose logs -f perplexica-backend

Open http://localhost:3000 — the search UI is ready.

Step 3: Configure SearXNG

SearXNG needs its settings file to allow JSON format (required by Perplexica):

# Create SearXNG config directory
mkdir -p searxng

# Create settings.yml (unquoted EOF so the $(openssl ...)
# substitution expands into a real secret key)
cat > searxng/settings.yml << EOF
use_default_settings: true

server:
  secret_key: "$(openssl rand -hex 32)"
  limiter: false
  image_proxy: true

search:
  safe_search: 0
  autocomplete: ""
  default_lang: "en"
  formats:  # JSON output is required by Perplexica
    - html
    - json

outgoing:
  request_timeout: 10.0

engines:
  - name: google
    engine: google
    shortcut: g
  - name: bing
    engine: bing
    shortcut: b
  - name: duckduckgo
    engine: duckduckgo
    shortcut: ddg

ui:
  static_use_hash: true

enabled_plugins:
  - 'Hash plugin'
  - 'Search on category select'
  - 'Tracker URL remover'
EOF
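With the json format enabled, Perplexica queries SearXNG's search endpoint with format=json. A small sketch of what such a request URL looks like (localhost:8080 is an assumption for host-side access; inside the Compose network the backend uses http://searxng:8080):

```python
from urllib.parse import urlencode

# Build the kind of query URL Perplexica sends to SearXNG.
base = "http://localhost:8080/search"                   # assumed host-side address
params = {"q": "self-hosted search", "format": "json"}  # the format enabled above
url = f"{base}?{urlencode(params)}"
print(url)
```

If SearXNG returns a 403 for this URL instead of JSON results, the formats setting above has not been picked up; restart the searxng container.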

Adding Ollama for Local Models

For fully private search (no data leaves your server), add Ollama to the stack:

# Add to docker-compose.yml services:
  ollama:
    image: ollama/ollama:latest
    volumes:
      - ollama_data:/root/.ollama
    ports:
      - "11434:11434"
    networks:
      - perplexica-network
    restart: unless-stopped
    # For GPU support, add:
    # deploy:
    #   resources:
    #     reservations:
    #       devices:
    #         - driver: nvidia
    #           count: all
    #           capabilities: [gpu]

volumes:
  ollama_data:  # Add this alongside perplexica-uploads

# Pull a model after starting Ollama
docker exec -it perplexica-ollama-1 ollama pull llama3.1:8b

# For a smaller, faster model on 4GB RAM:
docker exec -it perplexica-ollama-1 ollama pull mistral:7b

# Update config.toml to use Ollama:
# PROVIDER = "ollama"
# MODEL = "llama3.1:8b"
# OLLAMA = "http://ollama:11434"
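Collected in one place, the config.toml changes from the comments above (the model name assumes the llama3.1:8b pull):

```toml
[API_ENDPOINTS]
OLLAMA = "http://ollama:11434"  # Docker service name, not localhost

[CHAT_MODEL]
PROVIDER = "ollama"
MODEL = "llama3.1:8b"
```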

With this configuration, your search queries go to your own SearXNG instance and answers are generated by your local Llama 3.1 — nothing reaches external services.


Search Modes

Perplexica ships with specialized search modes beyond general web search:

Web (default)
  → SearXNG web results → LLM answer with citations

Academic
  → Searches arXiv, Semantic Scholar, PubMed
  → Best for research questions

YouTube
  → Searches YouTube, returns video links with summaries
  → Useful for "how to" questions

Reddit / Discussions
  → Searches Reddit for community discussion
  → Good for "what do people think about X" questions

Wolfram Alpha (optional, requires API key)
  → Computational/factual queries
  → Math, unit conversions, factual lookups

Configure the default mode in config.toml:

[FOCUS_MODE]
DEFAULT = "webSearch"  # or "academicSearch", "youtubeSearch", "redditSearch"

Reverse Proxy with Caddy

For public access with HTTPS:

# Caddyfile
search.yourdomain.com {
    reverse_proxy localhost:3000
}
# With Caddy installed:
caddy run --config /etc/caddy/Caddyfile

# Or add to docker-compose.yml as a Caddy service
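The commented-out alternative (running Caddy inside the same Compose project) could look like this sketch; the service and volume names are illustrative:

```yaml
# Illustrative Caddy service for docker-compose.yml
  caddy:
    image: caddy:latest
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile
      - caddy_data:/data      # TLS certificates persist here
    networks:
      - perplexica-network
    restart: unless-stopped
```

Add caddy_data: under the top-level volumes: section, and note that when Caddy runs inside the Compose network, the Caddyfile should use reverse_proxy perplexica-frontend:3000 rather than localhost:3000.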

Cost Comparison

| Setup | Monthly Cost | Privacy | Speed |
|---|---|---|---|
| Perplexity Pro | $20 | ❌ Cloud | ⚡ Fast |
| Perplexica + Groq (free tier) | $5 (VPS only) | ⚠️ Groq API | ⚡ Fast |
| Perplexica + OpenAI | $5–15 (VPS + API) | ⚠️ OpenAI API | ⚡ Fast |
| Perplexica + Ollama (cloud) | $15–30 (GPU VPS) | ✅ Full | Moderate |
| Perplexica + Ollama (local) | $0 (existing hw) | ✅ Full | Depends on GPU |

The sweet spot for most developers: run Perplexica on a $5/month VPS with Groq's free API tier (28K tokens/minute free) for fast, nearly free AI-powered search with only the LLM call leaving your server.
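As a sanity check on that sweet spot, some back-of-envelope math under an assumed per-query token budget (actual usage depends on how much search context ends up in the prompt):

```python
# Rough capacity of Groq's stated free tier for search queries.
tokens_per_minute = 28_000   # free-tier limit cited above
tokens_per_query = 3_500     # assumption: ~3K context + ~500-token answer

queries_per_minute = tokens_per_minute // tokens_per_query
print(queries_per_minute)    # 8 queries/minute under these assumptions
```

Even with a generous context budget, that rate limit comfortably covers interactive personal use.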


More self-hosted AI search tools at OSSAlt.

Related: Best Open Source Cursor Alternatives 2026 · Self-Host Coolify: Open Source Vercel Alternative
