
Self-Host Perplexica: Open Source Perplexity 2026

OSSAlt Team
Tags: perplexica, open-source, self-hosted, ai-search, perplexity-alternative, ollama, privacy, 2026

Perplexity AI is genuinely good — but every query you run trains their models, builds their data profile, and eventually ends up funding a closed-source moat. Perplexica is the answer: a 33,000-star MIT-licensed AI search engine that runs entirely on your hardware, uses your choice of LLM, and routes web searches through SearXNG instead of a vendor-controlled black box.

This guide gets you from zero to a working Perplexica instance in under 15 minutes.

TL;DR

Perplexica is the best open-source Perplexity alternative in 2026. At 33k GitHub stars, MIT licensed, and actively maintained through v1.12.1, it's mature enough for daily use. Run it with Ollama for 100% local operation, or wire it to OpenAI, Anthropic, or Groq if you want faster inference. Every search stays on your hardware — no query history, no ad targeting, no training data contribution.

Key Takeaways

  • 33,000 GitHub stars — the most popular self-hosted AI search engine by a wide margin
  • MIT licensed — fork it, modify it, run it commercially without restriction
  • Model-agnostic — Ollama (local), OpenAI, Anthropic Claude, or Groq all work out of the box
  • Powered by SearXNG — aggregates Google, Bing, DuckDuckGo, and 70+ engines without tracking
  • 3 focus modes — Speed (fast), Balanced (daily use), Quality (deep research) tunable per query
  • v1.12.1 released December 2025 — actively maintained, 976 commits on master
  • Docker Compose install — five commands from clone to running instance

Why Perplexica Over Perplexity AI

Perplexity AI charges $20/month for Pro. The free tier limits your daily searches and uses their proprietary model stack. More importantly, your search history is stored, analyzed, and potentially used for model training. Perplexity's privacy policy is standard enterprise: "we may use your content to improve our services."

Perplexica flips this completely. The architecture is:

Your query → SearXNG (aggregates search engines) → LLM of your choice → Cited answer

Nothing leaves your network unless you configure it to use a cloud API. With Ollama, the entire stack — search aggregation, AI inference, answer generation — runs on your machine.

The tradeoff is real: Perplexica requires setup, hardware, and occasional maintenance. But for privacy-conscious users, researchers handling sensitive topics, or developers who want full control over their AI stack, this is the better choice.


What You're Actually Installing

Perplexica isn't a single binary — it's a small stack of services:

Service             | Role                                  | Port
Perplexica frontend | Chat interface (Next.js)              | 3000
Perplexica backend  | Search orchestration + LLM calls      | 3001
SearXNG             | Privacy-preserving meta-search engine | 4000

The backend handles the logic: it takes your question, constructs search queries, sends them to SearXNG, fetches the top results, and passes the context to your LLM with a prompt asking it to synthesize a cited answer.
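That loop is easy to picture in code. The sketch below is illustrative only — the function names and prompt wording are hypothetical, not Perplexica's actual internals — but it captures the search-then-synthesize flow:

```python
def build_prompt(question, results):
    """Assemble a synthesis prompt with numbered sources the LLM can cite."""
    sources = "\n".join(
        f"[{i + 1}] {r['title']} ({r['url']})\n{r['snippet']}"
        for i, r in enumerate(results)
    )
    return (
        "Answer the question using ONLY the sources below. "
        "Cite claims with bracketed numbers like [1].\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )


def answer(question, search, llm, top_k=5):
    """search: query -> list of {title, url, snippet}; llm: prompt -> str."""
    results = search(question)[:top_k]  # SearXNG sits behind `search`
    return llm(build_prompt(question, results))
```

Swap in a real SearXNG client and an LLM call and you have the core of any Perplexity-style tool: retrieval narrows the context, the prompt forces citations.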

Docker Compose spins all three up together. You don't need to configure SearXNG separately — Perplexica's docker-compose.yaml handles it.


System Requirements

Before installing, confirm you have:

  • Docker + Docker Compose — v24+ recommended
  • RAM: Minimum 4GB free. 8GB+ if running Ollama locally alongside Perplexica
  • Disk: ~2-5GB for Docker images, plus model storage if using Ollama
  • LLM backend (pick one):
    • Ollama — free, fully local, needs GPU for acceptable speed (CPU works but is slow)
    • OpenAI API key — fastest results, ~$0.01 per search at gpt-4o-mini
    • Anthropic API key — Claude Haiku is a good balance of speed and quality
    • Groq API key — free tier available, fastest inference for Llama-based models

Step 1: Clone the repository

git clone https://github.com/ItzCrazyKns/Perplexica.git
cd Perplexica

Step 2: Configure

Copy the sample config and open it for editing:

cp sample.config.toml config.toml

Open config.toml. The key sections:

[GENERAL]
PORT = 3001
SIMILARITY_MEASURE = "cosine"  # or "dot_product"

[API_KEYS]
OPENAI = ""           # paste your key here if using OpenAI
GROQ = ""             # paste if using Groq
ANTHROPIC = ""        # paste if using Anthropic

[API_ENDPOINTS]
OLLAMA = ""           # e.g. "http://host.docker.internal:11434" for local Ollama
SEARXNG = "http://searxng:4000"  # leave as-is for Docker setup
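The SIMILARITY_MEASURE setting controls how embedding vectors are compared when reranking results. Cosine similarity normalizes for vector length; a raw dot product does not. A quick illustration with toy vectors:

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    # Dot product normalized by both magnitudes: depends only on direction.
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

a = [1.0, 2.0]
b = [2.0, 4.0]  # same direction as `a`, twice the magnitude
print(dot(a, b))     # 10.0 (grows with vector length)
print(cosine(a, b))  # 1.0  (identical direction)
```

Most embedding models produce normalized vectors anyway, in which case the two measures rank identically — the default of "cosine" is a safe choice.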

If using Ollama locally on Linux: You must expose Ollama to the Docker network. Set the environment variable before starting Ollama:

OLLAMA_HOST=0.0.0.0:11434 ollama serve

Then set the endpoint in config.toml:

OLLAMA = "http://host.docker.internal:11434"

Step 3: Start

docker-compose up -d

Wait 30-60 seconds for all containers to initialize. The first run pulls images (~1.5GB total).

Step 4: Open and configure the model

Navigate to http://localhost:3000. On first load, you'll see the Settings gear icon — click it to select your LLM:

  • Chat model: The model that generates answers. Groq's llama-3.3-70b-versatile is a great free option for cloud. For local, llama3.2:3b works on 8GB RAM, llama3.1:8b is better on 16GB+.
  • Embedding model: Used for document reranking. Leave at default unless you're customizing.

Type a question. Try something where citations matter:

"What changed in Node.js 24 vs Node.js 22?"

You'll see Perplexica query SearXNG, pull sources, and generate a cited answer with numbered footnotes linking to exact URLs. This is the core loop.
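The same loop is scriptable: the Perplexica repository also documents an HTTP search API. The payload sketch below follows the field names from the project's API docs (`focusMode`, `optimizationMode`), but treat the exact schema, endpoint, and port as version-dependent and check the repo before relying on them:

```python
import json

def search_payload(query, focus_mode="webSearch", optimization="speed"):
    """Build a request body for Perplexica's documented /api/search route.
    Field names follow the project's API docs and may vary by version."""
    return {
        "query": query,
        "focusMode": focus_mode,           # e.g. webSearch, academicSearch
        "optimizationMode": optimization,  # e.g. speed, balanced
    }

body = json.dumps(search_payload("What changed in Node.js 24 vs Node.js 22?"))
# POST it to your instance, for example:
#   curl -s http://localhost:3001/api/search \
#     -H 'Content-Type: application/json' -d "$body"
```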


Search Modes Explained

Perplexica's three focus modes change how it approaches each query:

Mode     | Speed   | Depth   | Best For
Speed    | ~3-5s   | Shallow | Quick factual lookups, recent events
Balanced | ~8-15s  | Medium  | Everyday research, most queries
Quality  | ~20-40s | Deep    | Technical comparisons, research papers

Switch modes per query in the interface. Quality mode runs multiple search rounds and uses more context tokens — it costs more on paid APIs but produces noticeably better synthesis on complex topics.

Beyond the base modes, you can narrow the search source:

  • All Sources — default, aggregates everything
  • Academic — prioritizes Google Scholar, arXiv, Semantic Scholar
  • Reddit/Discussions — community sentiment, real user experiences
  • News — time-sensitive results prioritized
  • YouTube — video content surfaced with summaries

Using File Upload for Local Analysis

Perplexica supports uploading PDFs and text files to query against them locally. Drop a research paper into the chat and ask questions directly — the backend embeds the document and runs similarity search against your query before calling the LLM.

This is particularly useful for:

  • Analyzing internal documents without sending them to any cloud service
  • Summarizing long PDFs with source citations pointing to specific pages
  • Cross-referencing a document against web results in a single query
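Under the hood this is standard retrieval: chunk the file, embed each chunk, and keep the chunks most similar to the question. A minimal sketch, with `embed` and `similarity` standing in for the configured embedding model and SIMILARITY_MEASURE (the names and chunk sizes are illustrative, not Perplexica's actual code):

```python
def chunk_text(text, size=500, overlap=50):
    """Split a document into overlapping character windows before embedding."""
    chunks, step = [], size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break
    return chunks

def top_chunks(question_vec, chunks, embed, similarity, k=3):
    """Return the k chunks whose embeddings are most similar to the question."""
    scored = sorted(chunks, key=lambda c: similarity(question_vec, embed(c)),
                    reverse=True)
    return scored[:k]
```

Only the winning chunks go into the LLM prompt, which is why a 200-page PDF can be queried without blowing past the model's context window.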

Perplexica vs the Alternatives

Tool          | Stars  | LLM                      | Privacy              | Setup Complexity
Perplexica    | 33,000 | Any (local or cloud API) | ✅ Full local option | Medium
SearXNG alone | 16,000 | None — raw search        | ✅ Full local        | Low
Morphic       | 8,000  | OpenAI/Vercel AI only    | ❌ Cloud required    | Medium
MindSearch    | 14,000 | Multiple                 | Partial              | High
Perplexity AI | n/a    | Proprietary              | ❌ Vendor-controlled | None (SaaS)

SearXNG is what powers Perplexica's search layer. If you only want privacy-respecting web search without AI synthesis, SearXNG alone is simpler and lighter. But Perplexica is the right choice when you want the full Perplexity-style workflow — question → web research → cited answer.

Morphic is the most polished open-source competitor, but it's designed to run on Vercel and requires OpenAI. It's not truly self-hostable in the local sense — it's more "deploy your own cloud instance." Perplexica runs on your Raspberry Pi if you want it to.


Keeping Perplexica Updated

Perplexica releases frequently — v1.12.1 landed December 2025. To update:

cd Perplexica
docker-compose down
git pull origin master
docker-compose pull
docker-compose up -d

Check the GitHub releases page for the changelog. Most updates add model support or fix SearXNG integration bugs — watch for breaking config.toml changes in the release notes.


When Perplexica Isn't the Right Choice

Be honest about the tradeoffs:

Don't use Perplexica if:

  • You want a polished, no-maintenance AI assistant — Perplexity AI is better
  • You're on a low-power VPS without enough RAM — SearXNG alone is lighter
  • You need mobile-first access — Perplexica's web interface isn't optimized for phones
  • You need real-time data with sub-second responses — cloud APIs will always be faster for latency

Use Perplexica if:

  • You're researching sensitive topics and don't want your queries logged
  • You already run Ollama and want to add web context to your local LLM
  • You want to customize prompts, add focus modes, or build on top of the search pipeline
  • You're building an internal research tool for a team or organization



Looking for other self-hosted AI tools? See our guides on self-hosting Dify, Best Open Source Cursor Alternatives, and Self-Hosted AI Agent Frameworks. Compare all options on the OSSAlt homepage.
