Dify: Self-Hosted AI Agents Without Code 2026
Dify crossed 100,000 GitHub stars in October 2025 — making it one of the fastest-growing AI infrastructure repos ever. It lets you build chatbots, AI workflows, autonomous agents, and RAG pipelines through a visual editor, then expose them as REST APIs. The self-hosted Community Edition is completely free with no usage limits, no per-message credits, and no data leaving your server.
If you've been paying for OpenAI Assistants, LangSmith, or PromptLayer for prototyping — or building LLM app infrastructure from scratch — Dify is worth 30 minutes of your time.
TL;DR
Dify is the most polished no-code AI application builder available for self-hosting in 2026. It handles the entire LLM stack: model provider integrations (OpenAI, Anthropic, Ollama, DeepSeek, and dozens more), RAG over your documents, a visual workflow builder with Python/JS code nodes, agent reasoning strategies, and a built-in chat UI. The self-hosted Community Edition is fully free. The trade-off vs. Flowise: Dify runs 11 Docker containers and needs 4 GB RAM minimum; for simple use cases, Flowise is lighter.
Key Takeaways
- 100,000+ GitHub stars (crossed Oct 2025); Apache 2.0 license with minor commercial restrictions
- v1.10.x current (2026); v1.0.0 introduced the plugin ecosystem in 2025
- Supported LLMs: OpenAI, Anthropic, Gemini, DeepSeek, Llama, Mistral, Ollama (local), any OpenAI-compatible endpoint
- Self-hosted Community Edition: no credit limits, no per-message fees, no data sent to Dify servers
- 11 Docker containers: API, worker, web, plugin daemon, PostgreSQL, Redis, Weaviate, Nginx, sandbox, SSRF proxy, worker beat
- Minimum requirements: 4 GB RAM, 2 CPU cores; 8 GB+ recommended for production with large knowledge bases
What You Can Build
Chatbots — Conversational apps with persistent memory, session history, and context management. Connect to any LLM provider, tune system prompts, and set context window limits. Deploy as a hosted chat UI or embed via iframe.
AI Workflows — Multi-step pipelines using a visual node editor. Nodes include: LLM (query a model), Code (Python/JS sandbox), HTTP Request (call external APIs), Knowledge Retrieval (RAG lookup), Tool (web search, calculator, etc.), Agent (autonomous reasoning), Conditional Branch, Loop, and Variable Aggregator. Build a pipeline that ingests a PDF, extracts data, transforms it with Python, queries a database, and returns a formatted report — all without writing a backend.
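The Code node is the escape hatch when a visual node isn't enough. Dify runs the node body in its sandbox container by calling a `main()` function with the variables you map in the node's input panel, and expects a dict whose keys match the node's declared output variables. A minimal sketch (the variable names are illustrative, not required):

```python
# Body of a Dify Python Code node. Dify invokes main() with the mapped
# input variables and reads the returned dict as the node's outputs.
def main(raw_text: str) -> dict:
    words = raw_text.split()  # illustrative transform: normalize whitespace
    return {
        "clean_text": " ".join(words),  # maps to a declared output variable
        "word_count": len(words),
    }
```

Downstream nodes can then reference `clean_text` and `word_count` like any other workflow variable.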
Autonomous Agents — The Agent Node gives an LLM planning capability using strategies like Chain-of-Thought, Tree-of-Thought, Graph-of-Thought, or BoT. The agent decides which tools to use, executes them, observes results, and iterates. Connect tools for web search, code execution, file reading, and custom HTTP endpoints.
RAG Pipelines — Upload documents (PDF, Word, PPT, Markdown, plain text, URLs), configure chunking and embedding, and query them with hybrid retrieval. Dify handles the full pipeline: ingestion → chunking → embedding → vector storage → retrieval → reranking. Built-in support for Weaviate (bundled), Qdrant, Milvus, Pinecone, pgvector, and Chroma.
APIs — Every Dify app generates a REST API endpoint with an API key. Use Dify as a Backend-as-a-Service layer: your front-end or other services call the Dify API, and Dify handles the LLM orchestration. OpenAI-compatible API format available for drop-in replacement.
Docker Compose Setup
# Clone the repository
git clone https://github.com/langgenius/dify.git
cd dify/docker
# Copy environment config
cp .env.example .env
# Edit .env — at minimum, set a secure SECRET_KEY
# SECRET_KEY=$(openssl rand -base64 42)
# Start all services
docker compose up -d
Access Dify at http://localhost (port 80 via Nginx). Create your admin account on first visit.
The 11 containers this starts:
# Core application
api: # Flask backend (langgenius/dify-api)
worker: # Celery async worker
worker_beat: # Celery scheduler
web: # Next.js frontend (langgenius/dify-web)
plugin_daemon: # Plugin execution sandbox
# Infrastructure
db: # PostgreSQL (primary database)
redis: # Task queue + caching
weaviate: # Bundled vector database
nginx: # Reverse proxy (ports 80/443)
ssrf_proxy: # Outbound HTTP proxy (SSRF protection)
sandbox: # Isolated code execution for Code nodes
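To confirm everything actually came up, you can parse the output of `docker compose ps --format json` (Compose v2 emits one JSON object per line; the field names below assume that format — adjust if your Compose version differs):

```python
import json
import subprocess

def running_services(ps_output: str) -> list[str]:
    """Service names in 'running' state, parsed from
    `docker compose ps --format json` (one JSON object per line)."""
    services = []
    for line in ps_output.strip().splitlines():
        if not line:
            continue
        obj = json.loads(line)
        if obj.get("State") == "running":
            services.append(obj.get("Service", obj.get("Name", "?")))
    return services

def check_dify(expected: int = 11) -> None:
    """Run from dify/docker; reports how many of the services are up."""
    out = subprocess.run(
        ["docker", "compose", "ps", "--format", "json"],
        capture_output=True, text=True, check=True,
    ).stdout
    up = running_services(out)
    print(f"{len(up)}/{expected} services running: {sorted(up)}")
```

If fewer than 11 show as running, `docker compose logs <service>` on the missing one is the fastest way to find the culprit (usually a port conflict on 80 or an under-provisioned host).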
Production .env settings to configure:
# Required: generate a strong secret key
SECRET_KEY=your-generated-secret-key-here
# Set your domain for cookie security
CONSOLE_API_URL=https://dify.yourdomain.com
APP_API_URL=https://dify.yourdomain.com
# Storage: local (default) or S3/Azure/GCS for file uploads
STORAGE_TYPE=local
# For S3:
# STORAGE_TYPE=s3
# S3_BUCKET_NAME=your-bucket
# AWS_ACCESS_KEY_ID=...
# AWS_SECRET_ACCESS_KEY=...
# File size limits
UPLOAD_FILE_SIZE_LIMIT=50 # MB
UPLOAD_IMAGE_FILE_SIZE_LIMIT=10
Connecting LLM Providers
After first login: Settings → Model Provider → add providers.
For cloud models (OpenAI, Anthropic, etc.), paste your API key. The model appears immediately in the workflow editor's model selector.
For local models via Ollama:
# First, set up Ollama (if not already running)
docker run -d \
--name ollama \
--gpus all \
-p 11434:11434 \
-v ollama:/root/.ollama \
ollama/ollama
# Pull a model
docker exec ollama ollama pull qwen2.5:14b
In Dify Settings → Model Provider → Ollama:
- Base URL: http://host.docker.internal:11434 (macOS/Windows) or http://172.17.0.1:11434 (Linux Docker host IP)
- Select models to use from the ones pulled in Ollama
This gives you fully private, offline LLM inference — no API keys, no data sent anywhere.
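Before wiring Ollama into Dify, it's worth confirming the endpoint is reachable and the model actually pulled. Ollama's /api/tags endpoint lists pulled models; this sketch assumes the default port and a local install:

```python
import json
from urllib.request import urlopen

OLLAMA_URL = "http://localhost:11434"  # assumption: default Ollama port

def model_names(tags: dict) -> list[str]:
    """Extract model names from Ollama's /api/tags response."""
    return [m["name"] for m in tags.get("models", [])]

def list_pulled_models() -> list[str]:
    """Query a running Ollama instance for its pulled models."""
    with urlopen(f"{OLLAMA_URL}/api/tags") as resp:
        return model_names(json.load(resp))

# Usage (with Ollama running):
#   print(list_pulled_models())  # e.g. ['qwen2.5:14b']
```

If this works from your host but Dify can't connect, the base URL is the usual suspect — from inside Dify's containers, localhost is the container itself, which is why the host.docker.internal / 172.17.0.1 addresses above are needed.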
Building a RAG Knowledge Base
RAG (Retrieval-Augmented Generation) is where Dify shines. Building a document Q&A system takes under 10 minutes:
- Knowledge tab → Create Knowledge Base
- Upload documents (PDF, Word, Markdown, URLs — up to 50 MB per file)
- Indexing mode: Economy (keyword only) or High Quality (embeddings, requires embedding API key)
- Retrieval setting: Semantic Search (vector), Full-text Search, or Hybrid (recommended)
- Enable Reranking for better result quality (requires a reranker API like Cohere)
- Set Top K (how many chunks to retrieve) and score threshold
Once indexed, attach the knowledge base to any chatbot or workflow node. The Knowledge Retrieval node in workflows takes a query string and returns the top matching chunks.
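Programmatically, a workflow containing a Knowledge Retrieval node is invoked like any other Dify app. A sketch against the workflow-run endpoint — the server URL, API key, and the name of the query input variable are assumptions; match them to your own app:

```python
import json
from urllib import request

DIFY_URL = "http://your-dify-server"  # assumption: your Dify instance URL
API_KEY = "app-your-workflow-key"     # assumption: the workflow app's API key

def build_payload(query: str, user: str) -> dict:
    """Request body for POST /v1/workflows/run; 'query' must match the
    input variable wired into the Knowledge Retrieval node."""
    return {
        "inputs": {"query": query},
        "response_mode": "blocking",
        "user": user,
    }

def run_workflow(query: str, user: str = "user-123") -> dict:
    """Invoke the workflow and return the parsed JSON response."""
    req = request.Request(
        f"{DIFY_URL}/v1/workflows/run",
        data=json.dumps(build_payload(query, user)).encode(),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )
    with request.urlopen(req) as resp:
        return json.load(resp)
```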
For fully local RAG (no external APIs):
- Use Ollama for both the chat LLM and embedding model
- Dify supports local embedding via Ollama's embedding endpoint
- All data stays on your server
Dify vs Flowise vs LangFlow
| Dimension | Dify | Flowise | LangFlow |
|---|---|---|---|
| Target user | Business users + devs | Developers | Python developers |
| No-code UX | Best-in-class | Good | Steeper curve |
| Debug tooling | Full trace logs, version history | Minimal | Moderate |
| Nested workflows | Yes (loops, branches, sub-flows) | Limited | Yes |
| Plugin ecosystem | Yes (marketplace, v1.0+) | No | No |
| RAG | Built-in, rich | Plugin-based | Plugin-based |
| Resource usage | 4 GB+ RAM (11 containers) | ~1 GB RAM | ~2 GB RAM |
| Setup complexity | Moderate | Very simple | Moderate |
| License | Apache 2.0 + restrictions | MIT | MIT |
| Enterprise SSO | Yes (paid) | No | Limited |
Choose Dify if: You want the most polished builder with full MLOps observability, rich debugging, and the ability to let non-engineers build and deploy AI apps. Dify's workflow editor is genuinely better than its competitors'.
Choose Flowise if: You want a lightweight single-container deployment with minimal setup. Flowise is the fastest path from zero to a working LangChain/LlamaIndex pipeline.
Choose LangFlow if: You're a Python developer who needs to modify component internals and want full code-level control over the pipeline.
The Plugin Ecosystem (v1.0+)
Dify's v1.0.0 release introduced a plugin marketplace — tools, model providers, and agent strategies installable like browser extensions:
- Tools: web search, code execution, image generation, file operations, API connectors
- Model providers: new providers added via plugin (no Dify version upgrade required)
- Agent strategies: custom reasoning modules (beyond built-in CoT/ToT)
- Extensions: custom integrations for Slack, Notion, GitHub, Google Drive
Install from the Marketplace (built into the UI) or via plugin URL. Community-contributed plugins follow the same sandbox architecture as built-in tools.
Exposing Dify as an API
Every app gets an API endpoint accessible via the Dify backend URL:
import requests
# Chat with a Dify chatbot app
response = requests.post(
"http://your-dify-server/v1/chat-messages",
headers={
"Authorization": "Bearer your-app-api-key",
"Content-Type": "application/json"
},
json={
"inputs": {},
"query": "Summarize the Q3 financial report",
"response_mode": "blocking",
"conversation_id": "",
"user": "user-123"
}
)
print(response.json()["answer"])
For streaming responses (real-time output):
json={
"response_mode": "streaming", # Returns SSE stream
...
}
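In streaming mode the response body is a Server-Sent Events stream, with each chunk arriving as a `data: {json}` line. A small parser sketch — the `event`/`answer` field names follow Dify's documented chat stream, but verify them against your version:

```python
import json

def parse_sse(raw: str) -> list[dict]:
    """Decode 'data: {...}' lines from an SSE body into event dicts."""
    return [
        json.loads(line[len("data: "):])
        for line in raw.splitlines()
        if line.startswith("data: ")
    ]

def collect_answer(raw: str) -> str:
    """Concatenate the 'answer' fragments of 'message' events."""
    return "".join(
        e.get("answer", "")
        for e in parse_sse(raw)
        if e.get("event") == "message"
    )
```

In production you would process chunks as they arrive rather than buffering the whole body, but the per-line decoding is the same.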
The OpenAI-compatible API lets you swap in Dify for any app already using the OpenAI SDK — just change the base_url to your Dify server and the api_key to your app's API key.
MCP Protocol Support
Dify added HTTP-based MCP (Model Context Protocol, spec 2025-03-26) support in 2025. This means:
- External MCP clients (Claude Desktop, other MCP hosts) can invoke Dify workflows as tools
- Dify agents can consume external MCP servers as tools
- Interoperability with the growing MCP ecosystem (GitHub, filesystem, databases) without custom integration code
This is significant for homelab and enterprise deployments where you want Dify to serve as a central AI orchestration layer that other agents and tools connect to.
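Under the hood, MCP is JSON-RPC 2.0 over a transport. The request an MCP client would send to invoke a Dify workflow exposed as a tool looks roughly like this — the message shape follows the MCP specification's tools/call method, while the tool name and arguments are placeholders:

```python
import json

def mcp_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """JSON-RPC 2.0 'tools/call' request, per the MCP specification."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })

# e.g. mcp_tool_call(1, "summarize_report", {"query": "Q3 revenue"})
```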
Self-Hosted vs. Dify Cloud
| | Community Edition (Self-Hosted) | Cloud Professional | Cloud Team |
|---|---|---|---|
| Price | Free | $59/month | $159/month |
| Message credits | Unlimited | 5,000/month | 10,000/month |
| Apps | Unlimited | 50 | Unlimited |
| Vector storage | Unlimited (your disk) | 5 GB | 20 GB |
| Documents | Unlimited | 500 | 1,000 |
| SSO/SAML | Enterprise license | No | Yes |
| Data residency | Your server | Dify servers | Dify servers |
For privacy-sensitive use cases — medical records, legal documents, proprietary code — self-hosted Community Edition is the only option that keeps data on your infrastructure. The unlimited usage is a genuine advantage over the credit-based cloud tiers.
When to Use Dify
Use Dify if:
- You want to build AI-powered apps without writing a custom backend
- You need RAG over internal documents without sending them to OpenAI's servers
- Your team includes non-engineers who need to modify AI prompts and workflows
- You want to compare LLM providers side-by-side with the same workflow
- You're building on local models (Ollama) for complete privacy
Skip Dify if:
- You only need a simple chatbot with no workflow logic (use Open WebUI directly)
- You need extreme customization of LangChain/LlamaIndex pipeline internals (use LangFlow)
- Your VPS has under 4 GB RAM (use Flowise instead)
- You need enterprise SSO without paying for the enterprise license
Browse all AI agent alternatives at OSSAlt. Related: Activepieces vs n8n automation comparison, self-hosted LLM guide with DeepSeek and Qwen.