Self-Host Perplexica: Open Source Perplexity 2026
Perplexity AI is genuinely good — but every query you run trains their models, builds their data profile, and eventually ends up funding a closed-source moat. Perplexica is the answer: a 33,000-star MIT-licensed AI search engine that runs entirely on your hardware, uses your choice of LLM, and routes web searches through SearXNG instead of a vendor-controlled black box.
This guide gets you from zero to a working Perplexica instance in under 15 minutes.
TL;DR
Perplexica is the best open-source Perplexity alternative in 2026. At 33k GitHub stars, MIT licensed, and actively maintained through v1.12.1, it's mature enough for daily use. Run it with Ollama for 100% local operation, or wire it to OpenAI, Anthropic, or Groq if you want faster inference. Every search stays on your hardware — no query history, no ad targeting, no training data contribution.
Key Takeaways
- 33,000 GitHub stars — the most popular self-hosted AI search engine by a wide margin
- MIT licensed — fork it, modify it, run it commercially without restriction
- Model-agnostic — Ollama (local), OpenAI, Anthropic Claude, or Groq all work out of the box
- Powered by SearXNG — aggregates Google, Bing, DuckDuckGo, and 70+ engines without tracking
- 3 focus modes — Speed (fast), Balanced (daily use), and Quality (deep research) — switchable per query
- v1.12.1 released December 2025 — actively maintained, 976 commits on master
- Docker Compose install — five commands from clone to running instance
Why Perplexica Over Perplexity AI
Perplexity AI charges $20/month for Pro. The free tier limits your daily searches and uses their proprietary model stack. More importantly, your search history is stored, analyzed, and potentially used for model training. Perplexity's privacy policy is standard enterprise: "we may use your content to improve our services."
Perplexica flips this completely. The architecture is:
Your query → SearXNG (aggregates search engines) → LLM of your choice → Cited answer
Nothing leaves your network unless you configure it to use a cloud API. With Ollama, the entire stack — search aggregation, AI inference, answer generation — runs on your machine.
The tradeoff is real: Perplexica requires setup, hardware, and occasional maintenance. But for privacy-conscious users, researchers handling sensitive topics, or developers who want full control over their AI stack, this is the better choice.
What You're Actually Installing
Perplexica isn't a single binary — it's a small stack of services:
| Service | Role | Port |
|---|---|---|
| Perplexica frontend | Chat interface (Next.js) | 3000 |
| Perplexica backend | Search orchestration + LLM calls | 3001 |
| SearXNG | Privacy-preserving meta-search engine | 4000 |
The backend handles the logic: it takes your question, constructs search queries, sends them to SearXNG, fetches the top results, and passes the context to your LLM with a prompt asking it to synthesize a cited answer.
Docker Compose spins all three up together. You don't need to configure SearXNG separately — Perplexica's docker-compose.yaml handles it.
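You can watch the first hop of that pipeline yourself once the stack is up. A sketch, assuming the bundled SearXNG is listening on host port 4000 (per the table above) with its JSON output format enabled, which the backend relies on:

```shell
# Query the SearXNG layer directly -- this JSON result list is the raw
# material Perplexica's backend hands to your LLM for synthesis.
# Requires the stack to already be running.
curl -s "http://localhost:4000/search?q=node.js+24+changes&format=json"
```

If this returns HTML or a 403 instead of JSON, the SearXNG instance doesn't have the JSON format enabled, and the backend won't be able to consume it either.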
System Requirements
Before installing, confirm you have:
- Docker + Docker Compose — v24+ recommended
- RAM: Minimum 4GB free. 8GB+ if running Ollama locally alongside Perplexica
- Disk: ~2-5GB for Docker images, plus model storage if using Ollama
- LLM backend (pick one):
- Ollama — free, fully local, needs GPU for acceptable speed (CPU works but is slow)
- OpenAI API key — fastest results, ~$0.01 per search with gpt-4o-mini
- Anthropic API key — Claude Haiku is a good balance of speed and quality
- Groq API key — free tier available, fastest inference for Llama-based models
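A quick preflight sketch to check these requirements before you start (the RAM and disk figures are the minimums listed above):

```shell
# Preflight: confirm the tools the Docker install needs are present,
# then print memory and disk so you can check them against the minimums.
missing=0
for cmd in git docker curl; do
  command -v "$cmd" >/dev/null 2>&1 || { echo "missing: $cmd"; missing=1; }
done
if [ "$missing" -eq 0 ]; then
  echo "all required commands found"
fi
free -h 2>/dev/null || true   # RAM (Linux); use Activity Monitor on macOS
df -h . | tail -1             # free disk on the current filesystem
```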
Installation: Docker Compose (Recommended)
Step 1: Clone the repository
```shell
git clone https://github.com/ItzCrazyKns/Perplexica.git
cd Perplexica
```
Step 2: Configure
Copy the sample config and open it for editing:
```shell
cp sample.config.toml config.toml
```
Open config.toml. The key sections:
```toml
[GENERAL]
PORT = 3001
SIMILARITY_MEASURE = "cosine" # or "dot_product"

[API_KEYS]
OPENAI = ""    # paste your key here if using OpenAI
GROQ = ""      # paste if using Groq
ANTHROPIC = "" # paste if using Anthropic

[API_ENDPOINTS]
OLLAMA = ""    # e.g. "http://host.docker.internal:11434" for local Ollama
SEARXNG = "http://searxng:4000" # leave as-is for Docker setup
```
If using Ollama locally on Linux: You must expose Ollama to the Docker network. Set the environment variable before starting Ollama:
```shell
OLLAMA_HOST=0.0.0.0:11434 ollama serve
```
Then set the endpoint in config.toml:
```toml
OLLAMA = "http://host.docker.internal:11434"
```
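Before starting the stack, it's worth confirming Ollama is actually reachable at that address. Ollama's `GET /api/tags` endpoint lists installed models; running the same check from a throwaway container verifies Docker's network can see it too (the `curlimages/curl` image here is just a convenient disposable container):

```shell
# From the host: should return a JSON list of installed models.
curl -s http://localhost:11434/api/tags

# From inside Docker's network -- the same path the backend will use.
docker run --rm curlimages/curl -s http://host.docker.internal:11434/api/tags
```

On Linux, plain `docker run` doesn't define `host.docker.internal` by default; add `--add-host=host.docker.internal:host-gateway` to the second command if it fails to resolve.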
Step 3: Start
```shell
docker-compose up -d
```
Wait 30-60 seconds for all containers to initialize. The first run pulls images (~1.5GB total).
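To confirm all three services came up, check container status and watch the logs (service names come from Perplexica's docker-compose.yaml; adjust if yours differ):

```shell
docker-compose ps                  # every service should show "Up"
docker-compose logs -f --tail=50   # watch startup output; Ctrl-C stops following
```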
Step 4: Open and configure the model
Navigate to http://localhost:3000. On first load, you'll see the Settings gear icon — click it to select your LLM:
- Chat model: the model that generates answers. Groq's `llama-3.3-70b-versatile` is a great free cloud option. For local inference, `llama3.2:3b` works on 8GB RAM; `llama3.1:8b` is better on 16GB+.
- Embedding model: used for document reranking. Leave at the default unless you're customizing.
Step 5: Run your first search
Type a question. Try something where citations matter:
"What changed in Node.js 24 vs Node.js 22?"
You'll see Perplexica query SearXNG, pull sources, and generate a cited answer with numbered footnotes linking to exact URLs. This is the core loop.
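The same loop is scriptable: Perplexica also exposes an HTTP search API. The request below is a sketch only — the field names and endpoint path may differ between versions, so verify them against the API documentation in the repo before relying on this:

```shell
# Sketch of a programmatic search (assumed schema -- check the repo's
# API docs; field names have changed across releases).
curl -s http://localhost:3000/api/search \
  -H "Content-Type: application/json" \
  -d '{
        "chatModel": { "provider": "ollama", "name": "llama3.1:8b" },
        "focusMode": "webSearch",
        "query": "What changed in Node.js 24 vs Node.js 22?"
      }'
```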
Search Modes Explained
Perplexica's three focus modes change how it approaches each query:
| Mode | Speed | Depth | Best For |
|---|---|---|---|
| Speed | ~3-5s | Shallow | Quick factual lookups, recent events |
| Balanced | ~8-15s | Medium | Everyday research, most queries |
| Quality | ~20-40s | Deep | Technical comparisons, research papers |
Switch modes per query in the interface. Quality mode runs multiple search rounds and uses more context tokens — it costs more on paid APIs but produces noticeably better synthesis on complex topics.
Beyond the base modes, you can narrow the search source:
- All Sources — default, aggregates everything
- Academic — prioritizes Google Scholar, arXiv, Semantic Scholar
- Reddit/Discussions — community sentiment, real user experiences
- News — time-sensitive results prioritized
- YouTube — video content surfaced with summaries
Using File Upload for Local Analysis
Perplexica supports uploading PDFs and text files to query against them locally. Drop a research paper into the chat and ask questions directly — the backend embeds the document and runs similarity search against your query before calling the LLM.
This is particularly useful for:
- Analyzing internal documents without sending them to any cloud service
- Summarizing long PDFs with source citations pointing to specific pages
- Cross-referencing a document against web results in a single query
Perplexica vs the Alternatives
| Tool | Stars | LLM | Privacy | Setup Complexity |
|---|---|---|---|---|
| Perplexica | 33,000 | Any (local or cloud API) | ✅ Full local option | Medium |
| SearXNG alone | 16,000 | None — raw search | ✅ Full local | Low |
| Morphic | 8,000 | OpenAI/Vercel AI only | ❌ Cloud required | Medium |
| MindSearch | 14,000 | Multiple | Partial | High |
| Perplexity AI | n/a | Proprietary | ❌ Vendor-controlled | None (SaaS) |
SearXNG is what powers Perplexica's search layer. If you only want privacy-respecting web search without AI synthesis, SearXNG alone is simpler and lighter. But Perplexica is the right choice when you want the full Perplexity-style workflow — question → web research → cited answer.
Morphic is the most polished open-source competitor, but it's designed to run on Vercel and requires OpenAI. It's not truly self-hostable in the local sense — it's more "deploy your own cloud instance." Perplexica runs on your Raspberry Pi if you want it to.
Keeping Perplexica Updated
Perplexica releases frequently — v1.12.1 landed December 2025. To update:
```shell
cd Perplexica
docker-compose down
git pull origin master
docker-compose pull
docker-compose up -d
```
Check the GitHub releases page for the changelog. Most updates add model support or fix SearXNG integration bugs — watch for breaking config.toml changes in the release notes.
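If you update often, a small wrapper that backs up config.toml first softens those breaking-config surprises. A sketch — the install directory and backup naming are assumptions, so adjust to your setup:

```shell
#!/bin/sh
# update-perplexica.sh -- back up config, pull the latest, restart the stack.
set -e
cd "${PERPLEXICA_DIR:-$HOME/Perplexica}"          # assumed install location
cp config.toml "config.toml.bak.$(date +%Y%m%d)"  # keep a dated backup
docker-compose down
git pull origin master
docker-compose pull
docker-compose up -d
echo "updated; diff config.toml against the backup if something breaks"
```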
When Perplexica Isn't the Right Choice
Be honest about the tradeoffs:
Don't use Perplexica if:
- You want a polished, no-maintenance AI assistant — Perplexity AI is better
- You're on a low-power VPS without enough RAM — SearXNG alone is lighter
- You need mobile-first access — Perplexica's web interface isn't optimized for phones
- You need real-time data with sub-second responses — cloud APIs will always be faster for latency
Use Perplexica if:
- You're researching sensitive topics and don't want your queries logged
- You already run Ollama and want to add web context to your local LLM
- You want to customize prompts, add focus modes, or build on top of the search pipeline
- You're building an internal research tool for a team or organization
Methodology
- GitHub data: ItzCrazyKns/Perplexica, March 2026
- Installation verified against v1.12.1 docker-compose.yaml and official README
- Comparison data from openalternative.co and direct GitHub repositories
- Search mode descriptions from Perplexica DeepWiki
Looking for other self-hosted AI tools? See our guides on self-hosting Dify, Best Open Source Cursor Alternatives, and Self-Hosted AI Agent Frameworks. Compare all options on the OSSAlt homepage.