
Dify: Self-Hosted AI Agents Without Code 2026

OSSAlt Team

Tags: dify, ai-agents, rag, self-hosting, docker, llm, no-code, workflows


Dify crossed 100,000 GitHub stars in October 2025 — making it one of the fastest-growing AI infrastructure repos ever. It lets you build chatbots, AI workflows, autonomous agents, and RAG pipelines through a visual editor, then expose them as REST APIs. The self-hosted Community Edition is completely free with no usage limits, no per-message credits, and no data leaving your server.

If you've been paying for OpenAI Assistants, LangSmith, or PromptLayer for prototyping — or building LLM app infrastructure from scratch — Dify is worth 30 minutes of your time.

TL;DR

Dify is the most polished no-code AI application builder available for self-hosting in 2026. It handles the entire LLM stack: model provider integrations (OpenAI, Anthropic, Ollama, DeepSeek, and dozens more), RAG over your documents, a visual workflow builder with Python/JS code nodes, agent reasoning strategies, and a built-in chat UI. The self-hosted Community Edition is fully free. The trade-off vs. Flowise: Dify runs 11 Docker containers and needs 4 GB RAM minimum. For simple use cases, Flowise is lighter.

Key Takeaways

  • 100,000+ GitHub stars (crossed Oct 2025); Apache 2.0 license with minor commercial restrictions
  • v1.10.x current (2026); v1.0.0 introduced the plugin ecosystem in 2025
  • Supported LLMs: OpenAI, Anthropic, Gemini, DeepSeek, Llama, Mistral, Ollama (local), any OpenAI-compatible endpoint
  • Self-hosted Community Edition: no credit limits, no per-message fees, no data sent to Dify servers
  • 11 Docker containers: API, worker, web, plugin daemon, PostgreSQL, Redis, Weaviate, Nginx, sandbox, SSRF proxy, worker beat
  • Minimum requirements: 4 GB RAM, 2 CPU cores; 8 GB+ recommended for production with large knowledge bases

What You Can Build

Chatbots — Conversational apps with persistent memory, session history, and context management. Connect to any LLM provider, tune system prompts, and set context window limits. Deploy as a hosted chat UI or embed via iframe.

AI Workflows — Multi-step pipelines using a visual node editor. Nodes include: LLM (query a model), Code (Python/JS sandbox), HTTP Request (call external APIs), Knowledge Retrieval (RAG lookup), Tool (web search, calculator, etc.), Agent (autonomous reasoning), Conditional Branch, Loop, and Variable Aggregator. Build a pipeline that ingests a PDF, extracts data, transforms it with Python, queries a database, and returns a formatted report — all without writing a backend.
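The Code node above deserves a concrete sketch. Dify's Python Code node expects a `main` function whose keyword arguments map to the node's input variables and whose returned dict keys become the node's output variables; the variable names here (`raw_text`, `line_count`, `totals`) are illustrative, not fixed by Dify:

```python
# Sketch of a Dify Code node (Python). The sandbox calls `main` with the
# input variables configured in the node's UI and maps the returned dict
# keys to output variables. `raw_text` is an assumed input name.

def main(raw_text: str) -> dict:
    # Normalize whitespace and pull out lines that look like totals,
    # e.g. from text extracted out of a PDF earlier in the workflow.
    lines = [ln.strip() for ln in raw_text.splitlines() if ln.strip()]
    totals = [ln for ln in lines if ln.lower().startswith("total")]
    return {
        "line_count": len(lines),
        "totals": totals,
    }
```

Downstream nodes can then reference `line_count` and `totals` like any other workflow variable.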

Autonomous Agents — The Agent Node gives an LLM planning capability using strategies like Chain-of-Thought, Tree-of-Thought, Graph-of-Thought, or BoT. The agent decides which tools to use, executes them, observes results, and iterates. Connect tools for web search, code execution, file reading, and custom HTTP endpoints.

RAG Pipelines — Upload documents (PDF, Word, PPT, Markdown, plain text, URLs), configure chunking and embedding, and query them with hybrid retrieval. Dify handles the full pipeline: ingestion → chunking → embedding → vector storage → retrieval → reranking. Built-in support for Weaviate (bundled), Qdrant, Milvus, Pinecone, pgvector, and Chroma.

APIs — Every Dify app generates a REST API endpoint with an API key. Use Dify as a Backend-as-a-Service layer: your front-end or other services call the Dify API, and Dify handles the LLM orchestration. OpenAI-compatible API format available for drop-in replacement.


Docker Compose Setup

# Clone the repository
git clone https://github.com/langgenius/dify.git
cd dify/docker

# Copy environment config
cp .env.example .env

# Edit .env — at minimum, set a secure SECRET_KEY, e.g.:
sed -i "s|^SECRET_KEY=.*|SECRET_KEY=$(openssl rand -base64 42)|" .env

# Start all services
docker compose up -d

Access Dify at http://localhost (port 80 via Nginx). Create your admin account on first visit.

The 11 containers this starts:

# Core application
api:        # Flask backend (langgenius/dify-api)
worker:     # Celery async worker
worker_beat: # Celery scheduler
web:        # Next.js frontend (langgenius/dify-web)
plugin_daemon: # Plugin execution sandbox

# Infrastructure
db:         # PostgreSQL (primary database)
redis:      # Task queue + caching
weaviate:   # Bundled vector database
nginx:      # Reverse proxy (ports 80/443)
ssrf_proxy: # Outbound HTTP proxy (SSRF protection)
sandbox:    # Isolated code execution for Code nodes

Production .env settings to configure:

# Required: generate a strong secret key
SECRET_KEY=your-generated-secret-key-here

# Set your domain for cookie security
CONSOLE_API_URL=https://dify.yourdomain.com
APP_API_URL=https://dify.yourdomain.com

# Storage: local (default) or S3/Azure/GCS for file uploads
STORAGE_TYPE=local
# For S3:
# STORAGE_TYPE=s3
# S3_BUCKET_NAME=your-bucket
# AWS_ACCESS_KEY_ID=...
# AWS_SECRET_ACCESS_KEY=...

# File size limits
UPLOAD_FILE_SIZE_LIMIT=50   # MB
UPLOAD_IMAGE_FILE_SIZE_LIMIT=10

Connecting LLM Providers

After first login: Settings → Model Provider → add providers.

For cloud models (OpenAI, Anthropic, etc.), paste your API key. The model appears immediately in the workflow editor's model selector.

For local models via Ollama:

# First, set up Ollama (if not already running)
docker run -d \
  --name ollama \
  --gpus all \
  -p 11434:11434 \
  -v ollama:/root/.ollama \
  ollama/ollama

# Pull a model
docker exec ollama ollama pull qwen2.5:14b

In Dify Settings → Model Provider → Ollama:

  • Base URL: http://host.docker.internal:11434 (macOS/Windows) or http://172.17.0.1:11434 (Linux Docker host IP)
  • Select models to use from the ones pulled in Ollama

This gives you fully private, offline LLM inference — no API keys, no data sent anywhere.
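To verify the base URL before typing it into Dify's settings, you can ask Ollama which models it has pulled via its `GET /api/tags` endpoint. The helper below is a sketch; the platform-to-URL mapping just encodes the rule of thumb above:

```python
import json
import urllib.request

def ollama_base_url(platform: str) -> str:
    """Base URL for reaching Ollama from inside Dify's containers.
    host.docker.internal works on Docker Desktop (macOS/Windows);
    172.17.0.1 is the default bridge gateway on Linux."""
    if platform in ("darwin", "win32"):
        return "http://host.docker.internal:11434"
    return "http://172.17.0.1:11434"

def list_models(base_url: str) -> list:
    """List pulled models via Ollama's GET /api/tags endpoint."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        data = json.load(resp)
    return [m["name"] for m in data.get("models", [])]
```

If `list_models("http://localhost:11434")` shows `qwen2.5:14b` from the host but Dify can't see it, the base URL (not Ollama) is the problem.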


Building a RAG Knowledge Base

RAG (Retrieval Augmented Generation) is where Dify shines. Building a document Q&A system takes under 10 minutes:

  1. Knowledge tab → Create Knowledge Base
  2. Upload documents (PDF, Word, Markdown, URLs — up to 50 MB per file)
  3. Indexing mode: Economy (keyword only) or High Quality (embeddings, requires embedding API key)
  4. Retrieval setting: Semantic Search (vector), Full-text Search, or Hybrid (recommended)
  5. Enable Reranking for better result quality (requires a reranker API like Cohere)
  6. Set Top K (how many chunks to retrieve) and score threshold

Once indexed, attach the knowledge base to any chatbot or workflow node. The Knowledge Retrieval node in workflows takes a query string and returns the top matching chunks.
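You can also hit the knowledge base directly, without going through an app. A sketch of the retrieval call — the `/v1/datasets/{id}/retrieve` path and `retrieval_model` fields follow Dify's knowledge API, but verify them against your version's docs, and note this uses a dataset API key, not an app key:

```python
# Sketch of a direct knowledge-base retrieval request against Dify's
# dataset API. Endpoint path and field names are assumptions — check
# your Dify version's API reference.

def retrieval_request(server: str, dataset_id: str, query: str,
                      top_k: int = 4, score_threshold: float = 0.5):
    """Build the URL and JSON body for a hybrid-search retrieval call."""
    url = f"{server}/v1/datasets/{dataset_id}/retrieve"
    body = {
        "query": query,
        "retrieval_model": {
            "search_method": "hybrid_search",  # matches the Hybrid setting
            "top_k": top_k,
            "score_threshold": score_threshold,
        },
    }
    return url, body

# Usage with requests (dataset API key from the Knowledge settings page):
# url, body = retrieval_request("http://your-dify-server", "your-dataset-id",
#                               "What were Q3 revenues?")
# resp = requests.post(url, json=body,
#                      headers={"Authorization": "Bearer your-dataset-api-key"})
```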

For fully local RAG (no external APIs):

  • Use Ollama for both the chat LLM and embedding model
  • Dify supports local embedding via Ollama's embedding endpoint
  • All data stays on your server
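Dify calls Ollama's embedding endpoint internally once the model is registered, but it's useful to confirm the endpoint works before indexing a large knowledge base. A sketch against Ollama's `POST /api/embeddings` endpoint (the model name `nomic-embed-text` is just an example):

```python
import json
import urllib.request

def embeddings_request(model: str, text: str):
    """Build the (path, body) for Ollama's embeddings endpoint."""
    return "/api/embeddings", {"model": model, "prompt": text}

def embed(base_url: str, model: str, text: str) -> list:
    """Fetch one embedding vector from Ollama — a manual check of what
    Dify will do for every chunk during High Quality indexing."""
    path, body = embeddings_request(model, text)
    req = urllib.request.Request(
        f"{base_url}{path}",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)["embedding"]
```

Pull an embedding model first (`ollama pull nomic-embed-text`), then check that `embed(...)` returns a non-empty vector.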

Dify vs Flowise vs LangFlow

| Dimension | Dify | Flowise | LangFlow |
|---|---|---|---|
| Target user | Business users + devs | Developers | Python developers |
| No-code UX | Best-in-class | Good | Steeper curve |
| Debug tooling | Full trace logs, version history | Minimal | Moderate |
| Nested workflows | Yes (loops, branches, sub-flows) | Limited | Yes |
| Plugin ecosystem | Yes (marketplace, v1.0+) | No | No |
| RAG | Built-in, rich | Plugin-based | Plugin-based |
| Resource usage | 4 GB+ RAM (11 containers) | ~1 GB RAM | ~2 GB RAM |
| Setup complexity | Moderate | Very simple | Moderate |
| License | Apache 2.0 + restrictions | MIT | MIT |
| Enterprise SSO | Yes (paid) | No | Limited |

Choose Dify if: You want the most polished builder with full MLOps observability, rich debugging, and the ability to let non-engineers build and deploy AI apps. Dify's workflow editor is genuinely better than its competitors.

Choose Flowise if: You want a lightweight single-container deployment with minimal setup. Flowise is the fastest path from zero to a working LangChain/LlamaIndex pipeline.

Choose LangFlow if: You're a Python developer who needs to modify component internals and want full code-level control over the pipeline.


The Plugin Ecosystem (v1.0+)

Dify's v1.0.0 release introduced a plugin marketplace — tools, model providers, and agent strategies installable like browser extensions:

  • Tools: web search, code execution, image generation, file operations, API connectors
  • Model providers: new providers added via plugin (no Dify version upgrade required)
  • Agent strategies: custom reasoning modules (beyond built-in CoT/ToT)
  • Extensions: custom integrations for Slack, Notion, GitHub, Google Drive

Install from the Marketplace (built into the UI) or via plugin URL. Community-contributed plugins follow the same sandbox architecture as built-in tools.


Exposing Dify as an API

Every app gets an API endpoint accessible via the Dify backend URL:

import requests

# Chat with a Dify chatbot app
response = requests.post(
    "http://your-dify-server/v1/chat-messages",
    headers={
        "Authorization": "Bearer your-app-api-key",
        "Content-Type": "application/json"
    },
    json={
        "inputs": {},
        "query": "Summarize the Q3 financial report",
        "response_mode": "blocking",
        "conversation_id": "",
        "user": "user-123"
    }
)

print(response.json()["answer"])

For streaming responses (real-time output):

json={
    "response_mode": "streaming",  # Returns SSE stream
    ...
}
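In streaming mode the response body is a Server-Sent Events stream: each payload line starts with `data:` followed by a JSON event. A minimal parsing sketch — the event names (`message`, `message_end`) follow Dify's streaming format, so verify them against your version's API docs:

```python
import json

def parse_sse_line(line: str):
    """Decode one SSE line from a Dify streaming response.
    Returns the event dict, or None for blanks and keep-alive comments."""
    line = line.strip()
    if not line.startswith("data:"):
        return None
    return json.loads(line[len("data:"):].strip())

# Typical consumption with requests:
# with requests.post(url, headers=headers, json=payload, stream=True) as r:
#     for raw in r.iter_lines(decode_unicode=True):
#         event = parse_sse_line(raw or "")
#         if event and event.get("event") == "message":
#             print(event.get("answer", ""), end="", flush=True)
```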

The OpenAI-compatible API lets you swap in Dify for any app already using the OpenAI SDK — just change the base_url to your Dify server and the api_key to your app's API key.
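Sketched out, the swap looks like this. The `/v1/chat/completions` path mirrors the OpenAI API; how Dify treats the `model` field may vary by version, so the placeholder value here is an assumption to verify against your docs:

```python
# Sketch of calling Dify through its OpenAI-compatible endpoint.
# Path and payload mirror the OpenAI chat completions API; the model
# name is a placeholder (Dify routes by app API key, not model).

def chat_completions_request(dify_server: str, app_api_key: str,
                             messages: list):
    """Build the URL, headers, and body for an OpenAI-style call."""
    url = f"{dify_server}/v1/chat/completions"
    headers = {
        "Authorization": f"Bearer {app_api_key}",
        "Content-Type": "application/json",
    }
    payload = {"model": "dify", "messages": messages}
    return url, headers, payload

# With the official OpenAI SDK the same swap is just:
# client = OpenAI(base_url="http://your-dify-server/v1",
#                 api_key="your-app-api-key")
```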


MCP Protocol Support

Dify added HTTP-based MCP (Model Context Protocol, spec 2025-03-26) support in 2025. This means:

  • External MCP clients (Claude Desktop, other MCP hosts) can invoke Dify workflows as tools
  • Dify agents can consume external MCP servers as tools
  • Interoperability with the growing MCP ecosystem (GitHub, filesystem, databases) without custom integration code

This is significant for homelab and enterprise deployments where you want Dify to serve as a central AI orchestration layer that other agents and tools connect to.
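For reference, every MCP conversation opens with a JSON-RPC 2.0 `initialize` handshake. The field names below come from the MCP specification (the `clientInfo` values are placeholders); double-check against the spec revision your tools target:

```python
# Sketch of the MCP `initialize` request an MCP client sends first,
# per the 2025-03-26 spec revision Dify supports. Client name/version
# are illustrative placeholders.

def mcp_initialize(client_name: str,
                   protocol_version: str = "2025-03-26") -> dict:
    """Build the JSON-RPC 2.0 initialize request for an MCP session."""
    return {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": protocol_version,
            "capabilities": {},
            "clientInfo": {"name": client_name, "version": "0.1.0"},
        },
    }
```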


Self-Hosted vs. Dify Cloud

| | Community Edition (Self-Hosted) | Cloud Professional | Cloud Team |
|---|---|---|---|
| Price | Free | $59/month | $159/month |
| Message credits | Unlimited | 5,000/month | 10,000/month |
| Apps | Unlimited | 50 | Unlimited |
| Vector storage | Unlimited (your disk) | 5 GB | 20 GB |
| Documents | Unlimited | 500 | 1,000 |
| SSO/SAML | Enterprise license | No | Yes |
| Data residency | Your server | Dify servers | Dify servers |

For privacy-sensitive use cases — medical records, legal documents, proprietary code — self-hosted Community Edition is the only option that keeps data on your infrastructure. The unlimited usage is a genuine advantage over the credit-based cloud tiers.


When to Use Dify

Use Dify if:

  • You want to build AI-powered apps without writing a custom backend
  • You need RAG over internal documents without sending them to OpenAI's servers
  • Your team includes non-engineers who need to modify AI prompts and workflows
  • You want to compare LLM providers side-by-side with the same workflow
  • You're building on local models (Ollama) for complete privacy

Skip Dify if:

  • You only need a simple chatbot with no workflow logic (use Open WebUI directly)
  • You need extreme customization of LangChain/LlamaIndex pipeline internals (use LangFlow)
  • Your VPS has under 4 GB RAM (use Flowise instead)
  • You need enterprise SSO without paying for the enterprise license

Browse all AI agent alternatives at OSSAlt. Related: Activepieces vs n8n automation comparison, self-hosted LLM guide with DeepSeek and Qwen.
