Self-Host Stirling PDF: Open Source PDF Tools 2026

TL;DR

Stirling PDF (MIT, ~45K GitHub stars, Java) is a self-hosted PDF toolkit with 50+ operations: merge, split, compress, convert, rotate, watermark, add page numbers, OCR, redact, sign, and more. Adobe Acrobat Standard costs $12.99/month ($155.88/year). Stirling PDF is free and processes everything on your own server, so your documents never leave it. Those two points, price and privacy, are why it has become one of the most-starred self-hosted tools of 2025-2026.

Key Takeaways

Stirling PDF: MIT, ~45K stars, Java — 50+ PDF operations in one self-hosted tool
No cloud uploads: All processing happens on your server — documents stay private
OCR: Built-in Tesseract OCR to make scanned PDFs text-searchable
Batch operations: Process multiple PDFs at once
API: Full REST API for automation (compress all PDFs from Paperless, etc.)
vs Adobe Acrobat: Stirling covers most everyday Acrobat features at no cost and with no subscription

Feature Overview

Category	Operations
Organize	Merge, Split, Remove pages, Rotate, Reorder, PDF to single image per page
Convert	PDF→Word/Excel/PPT, Word/Excel/PPT→PDF, Image→PDF, PDF→Images, HTML→PDF
Optimize	Compress, Repair, Flatten annotations, Reduce file size
Security	Add/Remove passwords, Redact text, Add/Remove watermarks, Flatten forms
Other	OCR (text recognition), Add signatures, Add page numbers, Extract images, Extract text, Compare PDFs

Part 1: Docker Setup

# docker-compose.yml
services:
  stirling-pdf:
    image: stirlingtools/stirling-pdf:latest  # previously published as frooodle/s-pdf
    container_name: stirling-pdf
    restart: unless-stopped
    ports:
      - "8080:8080"
    volumes:
      - stirling_trainingData:/usr/share/tessdata  # OCR language data (older Debian-based images used /usr/share/tesseract-ocr/5/tessdata)
      - stirling_extraConfigs:/configs   # Custom configs
      - stirling_logs:/logs
      - stirling_customFiles:/customFiles
    environment:
      DOCKER_ENABLE_SECURITY: "false"  # Set true to enable login
      SECURITY_ENABLELOGIN: "false"
      LANGS: "en_GB"
      TZ: "America/Los_Angeles"
      # Optional Calibre install for e-book and advanced HTML conversion (unrelated to OCR):
      INSTALL_BOOK_AND_ADVANCED_HTML_OPS: "false"
      # OCR language packs to install at container start:
      TESSERACT_LANGS: "eng"

volumes:
  stirling_trainingData:
  stirling_extraConfigs:
  stirling_logs:
  stirling_customFiles:

docker compose up -d

Visit http://your-server:8080
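
The application can take a little while to start after `docker compose up -d`, so the first page load may fail. Below is a minimal sketch of a readiness check, assuming curl is installed on the host. The function name `wait_for_http` and its retry defaults are illustrative and not part of Stirling PDF.

```shell
# wait_for_http URL [TRIES] [DELAY_SECONDS]
# Polls URL until it answers without an HTTP error (4xx/5xx), or gives up.
wait_for_http() {
  wait_url=$1
  wait_tries=${2:-30}
  wait_delay=${3:-2}
  wait_attempt=1
  while [ "$wait_attempt" -le "$wait_tries" ]; do
    # -f makes curl return non-zero on HTTP errors; redirects count as success.
    if curl -fsS -o /dev/null --max-time 5 "$wait_url" 2>/dev/null; then
      echo "ready after $wait_attempt attempt(s): $wait_url"
      return 0
    fi
    wait_attempt=$((wait_attempt + 1))
    sleep "$wait_delay"
  done
  echo "not ready after $wait_tries attempt(s): $wait_url" >&2
  return 1
}
```

Usage: `wait_for_http http://your-server:8080 && echo "Stirling PDF is up"`.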

Part 2: HTTPS with Caddy

pdf.yourdomain.com {
    reverse_proxy localhost:8080
}

Part 3: Authentication (Optional)

For multi-user or internet-exposed deployments:

environment:
  DOCKER_ENABLE_SECURITY: "true"
  SECURITY_ENABLELOGIN: "true"
  SECURITY_INITIALLOGIN_USERNAME: "admin"
  SECURITY_INITIALLOGIN_PASSWORD: "your-admin-password"
  # Users can be managed in the UI after login
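
Once login is enabled, API calls need credentials as well. Stirling PDF accepts a per-user API key sent in an `X-API-KEY` header. Where the key is displayed in the UI may differ between versions, so treat that part as an assumption. The wrapper below is a sketch: `stirling_post`, `STIRLING_BASE`, `STIRLING_API_KEY`, and `STIRLING_DRY_RUN` are hypothetical names chosen for this example.

```shell
# stirling_post ENDPOINT [CURL_ARGS...]
# POSTs to $STIRLING_BASE/ENDPOINT and adds the API key header when
# STIRLING_API_KEY is set. With STIRLING_DRY_RUN set, the curl command
# is printed instead of executed.
stirling_post() {
  post_endpoint=$1
  shift
  if [ -n "${STIRLING_API_KEY:-}" ]; then
    # Prepend the header to whatever curl arguments the caller passed.
    set -- -H "X-API-KEY: $STIRLING_API_KEY" "$@"
  fi
  ${STIRLING_DRY_RUN:+echo} curl -fsS -X POST \
    "${STIRLING_BASE:-https://pdf.yourdomain.com/api/v1}/$post_endpoint" "$@"
}
```

Usage: `STIRLING_API_KEY=your-key stirling_post misc/ocr-pdf -F "fileInput=@scan.pdf" -F "languages=eng" --output scan-ocr.pdf`. Keeping the key in an environment variable avoids hard-coding it in scripts.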

Part 4: Common Operations

Merge PDFs

Merge/Split → Merge PDFs
Upload multiple PDFs
Drag to reorder
Merge → download combined PDF

Split a PDF

Merge/Split → Split PDF
Upload PDF
Choose: split by page ranges, split every N pages, split on specific pages
Download as ZIP of split files

Compress PDF

Other → Compress PDF
Upload PDF
Choose compression level (lower quality = smaller size)
Download compressed PDF

OCR a scanned document

Other → PDF OCR
Upload scanned PDF (image-only)
Select language: English
Stirling runs Tesseract OCR
Download text-searchable PDF

# Via API (for automation):
curl -X POST "https://pdf.yourdomain.com/api/v1/misc/ocr-pdf" \
  -F "fileInput=@scan.pdf" \
  -F "languages=eng" \
  --output scan-ocr.pdf
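
The same call extends naturally to a whole folder of scans. This is a sketch that reuses the endpoint and form fields from the curl example above. The function name `ocr_dir` and the `STIRLING_BASE` variable are illustrative, and it assumes login is disabled. Use a separate output directory so the originals are not overwritten.

```shell
# ocr_dir IN_DIR OUT_DIR [LANG]
# Sends every *.pdf in IN_DIR to the OCR endpoint and writes the results,
# under the same file names, to OUT_DIR.
ocr_dir() {
  ocr_in=$1
  ocr_out=$2
  ocr_lang=${3:-eng}
  ocr_base=${STIRLING_BASE:-https://pdf.yourdomain.com/api/v1}
  ocr_count=0
  mkdir -p "$ocr_out" || return 1
  for ocr_pdf in "$ocr_in"/*.pdf; do
    [ -e "$ocr_pdf" ] || continue   # the glob matched nothing
    ocr_name=$(basename "$ocr_pdf")
    if curl -fsS -X POST "$ocr_base/misc/ocr-pdf" \
         -F "fileInput=@$ocr_pdf" \
         -F "languages=$ocr_lang" \
         --output "$ocr_out/$ocr_name"; then
      ocr_count=$((ocr_count + 1))
    else
      echo "failed: $ocr_pdf" >&2
      rm -f "$ocr_out/$ocr_name"    # drop any partial output
    fi
  done
  echo "processed $ocr_count file(s)"
}
```

Usage: `ocr_dir ./scans ./scans-ocr eng`.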

Redact sensitive information

Security → Redact PDF
Upload PDF
Draw rectangles over sensitive areas (SSN, account numbers, etc.)
Redact → areas permanently blacked out

Add passwords

Security → Encrypt PDF
Upload PDF
Set an owner password (needed to change permissions) and a user password (needed to open the file)
Encryption: AES-256
Download encrypted PDF

Part 5: REST API

Full API for automation and integration with Paperless, n8n, etc.:

BASE="https://pdf.yourdomain.com/api/v1"

# Merge two PDFs:
curl -X POST "$BASE/general/merge-pdfs" \
  -F "fileInput=@doc1.pdf" \
  -F "fileInput=@doc2.pdf" \
  --output merged.pdf

# Compress a PDF:
curl -X POST "$BASE/misc/compress-pdf" \
  -F "fileInput=@large.pdf" \
  -F "optimizeLevel=2" \
  --output compressed.pdf

# Convert PDF to images:
curl -X POST "$BASE/convert/pdf/img" \
  -F "fileInput=@document.pdf" \
  -F "imageFormat=png" \
  -F "singleOrMultiple=multiple" \
  --output images.zip

# Convert Word to PDF:
curl -X POST "$BASE/convert/file/pdf" \
  -F "fileInput=@document.docx" \
  --output document.pdf

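# Extract text from PDF as a plain-text file
# (confirm the endpoint in your instance's Swagger UI, as paths vary between versions):
curl -X POST "$BASE/convert/pdf/text" \
  -F "fileInput=@document.pdf" \
  -F "outputFormat=txt" \
  --output document.txt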

# Remove pages from PDF:
curl -X POST "$BASE/general/remove-pages" \
  -F "fileInput=@document.pdf" \
  -F "pageNumbers=1,3,5-7" \
  --output trimmed.pdf
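
The `pageNumbers` value in the last call combines single pages and ranges. Since removing pages is destructive, it can help to preview which pages a spec selects before sending it. This sketch handles only plain numbers and ascending `N-M` ranges. Any other syntax your Stirling PDF version accepts is out of scope, and `expand_pages` is a hypothetical helper name.

```shell
# expand_pages SPEC
# Prints the page numbers selected by a spec such as "1,3,5-7", separated
# by spaces. The body runs in a subshell ( ... ) so the IFS and globbing
# changes do not leak into the calling shell.
expand_pages() (
  spec=$1
  [ -n "$spec" ] || { echo "empty page spec" >&2; exit 1; }
  out=""
  set -f      # no filename expansion while splitting the spec
  IFS=,
  for part in $spec; do
    case $part in
      *-*) first=${part%%-*}; last=${part#*-} ;;
      *)   first=$part; last=$part ;;
    esac
    # Pages are 1-based integers: reject empty, zero-prefixed, or non-numeric values.
    for value in "$first" "$last"; do
      case $value in
        ''|0*|*[!0-9]*) echo "invalid page spec: $part" >&2; exit 1 ;;
      esac
    done
    page=$first
    while [ "$page" -le "$last" ]; do
      out="$out${out:+ }$page"
      page=$((page + 1))
    done
  done
  echo "$out"
)
```

For example, `expand_pages "1,3,5-7"` prints `1 3 5 6 7`.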

n8n automation: compress all new PDFs

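// n8n Code node ("Run Once for All Items"): compress the PDF attached to the first item.
// Assumes the incoming binary property is named "data" and that fetch, FormData, and Blob
// are available in your n8n version's Code node; otherwise use an HTTP Request node.
const pdf = await this.helpers.getBinaryDataBuffer(0, 'data');

const formData = new FormData();
formData.append('fileInput', new Blob([pdf], { type: 'application/pdf' }), 'input.pdf');
formData.append('optimizeLevel', '2');

const response = await fetch('https://pdf.yourdomain.com/api/v1/misc/compress-pdf', {
  method: 'POST',
  body: formData
});
if (!response.ok) throw new Error(`Compression failed: HTTP ${response.status}`);

// Wrap the result in n8n's binary format so downstream nodes can use it.
const compressed = Buffer.from(await response.arrayBuffer());
const binary = await this.helpers.prepareBinaryData(compressed, 'compressed.pdf', 'application/pdf');
return [{ json: {}, binary: { data: binary } }];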

Part 6: Multi-Language OCR

Add more languages for OCR:

environment:
  TESSERACT_LANGS: "eng,fra,deu,spa,jpn"  # comma-separated list of Tesseract language codes

Or install language packs manually:

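# Install Tesseract language packs inside the running container.
# Recent images are Alpine-based, so this uses apk rather than apt-get.
# Packages installed this way are lost when the container is recreated.
docker exec stirling-pdf apk add --no-cache \
  tesseract-ocr-data-fra \
  tesseract-ocr-data-deu \
  tesseract-ocr-data-spa

# Or copy .traineddata files into the tessdata directory
# that the compose file mounts as the OCR language volume.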

Available languages: 100+ via Tesseract's traineddata files.
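
To confirm which languages the container can actually use, Tesseract's `--list-langs` flag prints the installed set. The sketch below compares that list against the languages you expect. `missing_langs` is a hypothetical helper name, and the usage line assumes the `tesseract` binary is on the PATH inside the container.

```shell
# missing_langs LANG...
# Reads `tesseract --list-langs` output on stdin and prints every requested
# language code that does not appear in it. Returns 1 if any are missing.
missing_langs() {
  langs_available=$(cat)
  langs_status=0
  for langs_wanted in "$@"; do
    # -x matches whole lines only, so the header line is never a false hit;
    # -F treats the code as a literal string rather than a regex.
    if ! printf '%s\n' "$langs_available" | grep -qxF -- "$langs_wanted"; then
      echo "$langs_wanted"
      langs_status=1
    fi
  done
  return "$langs_status"
}
```

Usage: `docker exec stirling-pdf tesseract --list-langs | missing_langs eng fra deu`. No output means every requested language is installed.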

Part 7: Pipeline Operations

Chain multiple operations together:

# Pipeline: compress → add page numbers → add watermark:
# Via UI: Other → Pipeline
# 1. Add Step: Compress PDF (level: 2)
# 2. Add Step: Add Page Numbers
# 3. Add Step: Add Watermark (text: "CONFIDENTIAL", opacity: 0.3)
# Run pipeline on uploaded file
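
A similar chain can be approximated from a script by feeding the output of one API call into the next. This sketch is separate from the built-in Pipeline feature. `run_pipeline`, `STIRLING_BASE`, and the `endpoint|field=value` step format are inventions for this example, and it assumes login is disabled.

```shell
# run_pipeline INPUT OUTPUT [STEP...]
# Each STEP is "endpoint|field=value|field=value". The result of one step
# becomes the fileInput of the next. The body runs in a subshell ( ... ),
# so `exit` only leaves the function and the IFS changes stay local.
run_pipeline() (
  [ "$#" -ge 2 ] || { echo "usage: run_pipeline INPUT OUTPUT [STEP...]" >&2; exit 1; }
  input=$1
  output=$2
  shift 2
  base=${STIRLING_BASE:-https://pdf.yourdomain.com/api/v1}
  [ -f "$input" ] || { echo "no such file: $input" >&2; exit 1; }

  work=$(mktemp -d) || exit 1
  trap 'rm -rf "$work"' EXIT          # clean up the scratch directory
  cp "$input" "$work/current.pdf" || exit 1

  set -f                              # no filename expansion while splitting
  for step in "$@"; do
    endpoint=${step%%|*}
    fields=${step#"$endpoint"}
    fields=${fields#|}
    set --                            # rebuild the args as curl -F options
    IFS='|'
    for field in $fields; do
      set -- "$@" -F "$field"
    done
    unset IFS
    curl -fsS -X POST "$base/$endpoint" \
      -F "fileInput=@$work/current.pdf" "$@" \
      --output "$work/next.pdf" || exit 1
    mv "$work/next.pdf" "$work/current.pdf"
  done

  cp "$work/current.pdf" "$output"
)
```

Usage, with two endpoints from the REST API section: `run_pipeline scan.pdf result.pdf "misc/ocr-pdf|languages=eng" "general/remove-pages|pageNumbers=1"`. Check the field names for any other step against your instance's API documentation before relying on them.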

Maintenance

# Update:
docker compose pull
docker compose up -d

# Check version:
curl https://pdf.yourdomain.com/api/v1/info/status

# Logs:
docker compose logs -f stirling-pdf

# Clear temporary files (auto-cleaned, but manual if needed).
# The glob must be expanded inside the container, hence sh -c:
docker exec stirling-pdf sh -c 'rm -rf /tmp/stirling-pdf-*'

See all open source PDF and productivity tools at OSSAlt.com/categories/productivity.
