Vendored deer-flow upstream (bytedance/deer-flow) plus prompt-injection hardening:

- New deerflow.security package: content_delimiter, html_cleaner, sanitizer (8 layers: invisible chars, control chars, symbols, NFC, PUA, tag chars, horizontal whitespace collapse with newline/tab preservation, length cap)
- New deerflow.community.searx package: web_search, web_fetch, image_search backed by a private SearX instance, every external string sanitized and wrapped in <<<EXTERNAL_UNTRUSTED_CONTENT>>> delimiters
- All native community web providers (ddg_search, tavily, exa, firecrawl, jina_ai, infoquest, image_search) replaced with hard-fail stubs that raise NativeWebToolDisabledError at import time, so a misconfigured tool.use path fails loud rather than silently falling back to unsanitized output
- Native client back-doors (jina_client.py, infoquest_client.py) stubbed too
- Native-tool tests quarantined under tests/_disabled_native/ (collect_ignore_glob via local conftest.py)
- Sanitizer Layer 7 fix: only collapse horizontal whitespace, preserve newlines and tabs so list/table structure survives
- Hardened runtime config.yaml references only the searx-backed tools
- Factory overlay (backend/) kept in sync with deer-flow tree as a reference / source

See HARDENING.md for the full audit trail and verification steps.
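The sanitizer and delimiter wrapping named in the commit message might look roughly like the sketch below. It is not the deerflow.security implementation: the function names `sanitize` and `wrap_untrusted`, the closing marker, the default length cap, and the use of Unicode general categories to drop invisible, control, PUA, and tag characters are all assumptions made for illustration, and the "symbols" layer is omitted because the commit message does not say what it removes.

```python
import re
import unicodedata

OPEN_MARK = "<<<EXTERNAL_UNTRUSTED_CONTENT>>>"
# Assumption: the commit message names only the opening delimiter;
# this closing marker is hypothetical.
CLOSE_MARK = "<<<END_EXTERNAL_UNTRUSTED_CONTENT>>>"

# Any whitespace character except newline and tab.
_HORIZONTAL_WS = re.compile(r"[^\S\n\t]+")

# Cc = control chars, Cf = format chars (zero-width and tag characters),
# Co = private-use area (PUA).
_DROPPED_CATEGORIES = {"Cc", "Cf", "Co"}


def sanitize(text: str, max_len: int = 10_000) -> str:
    """Hypothetical reduced sanitizer: NFC, character filtering,
    horizontal whitespace collapse, length cap."""
    text = unicodedata.normalize("NFC", text)
    kept = []
    for ch in text:
        # Newlines and tabs are category Cc but are kept on purpose,
        # so list and table structure survives.
        if ch in "\n\t" or unicodedata.category(ch) not in _DROPPED_CATEGORIES:
            kept.append(ch)
    text = "".join(kept)
    # Collapse runs of spaces (and other horizontal whitespace) to one space.
    text = _HORIZONTAL_WS.sub(" ", text)
    return text[:max_len]


def wrap_untrusted(text: str) -> str:
    """Sanitize an external string and fence it with the delimiters."""
    return f"{OPEN_MARK}\n{sanitize(text)}\n{CLOSE_MARK}"


if __name__ == "__main__":
    raw = "result\u200b   line one\n\tindented\U000E0041\ue000"
    print(wrap_untrusted(raw))
```

Filtering by general category keeps the sketch short, but it also drops characters such as the soft hyphen; a production sanitizer would likely use explicit code-point ranges per layer.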
88 lines
3.9 KiB
Docker
# Backend Dockerfile: multi-stage build
# Stage 1 (builder): compiles native Python extensions with build-essential
# Stage 2 (dev): retains toolchain for dev containers (uv sync at startup)
# Stage 3 (runtime): clean image without compiler toolchain for production

# UV source image (override for restricted networks that cannot reach ghcr.io)
ARG UV_IMAGE=ghcr.io/astral-sh/uv:0.7.20
FROM ${UV_IMAGE} AS uv-source

# ── Stage 1: Builder ──────────────────────────────────────────────────────────
FROM python:3.12-slim-bookworm AS builder

ARG NODE_MAJOR=22
ARG APT_MIRROR
ARG UV_INDEX_URL

# Optionally override apt mirror for restricted networks (e.g. APT_MIRROR=mirrors.aliyun.com)
RUN if [ -n "${APT_MIRROR}" ]; then \
        sed -i "s|deb.debian.org|${APT_MIRROR}|g" /etc/apt/sources.list.d/debian.sources 2>/dev/null || true; \
        sed -i "s|deb.debian.org|${APT_MIRROR}|g" /etc/apt/sources.list 2>/dev/null || true; \
    fi

# Install build tools + Node.js (build-essential needed for native Python extensions)
RUN apt-get update && apt-get install -y \
    curl \
    build-essential \
    gnupg \
    ca-certificates \
    && mkdir -p /etc/apt/keyrings \
    && curl -fsSL https://deb.nodesource.com/gpgkey/nodesource-repo.gpg.key | gpg --dearmor -o /etc/apt/keyrings/nodesource.gpg \
    && echo "deb [signed-by=/etc/apt/keyrings/nodesource.gpg] https://deb.nodesource.com/node_${NODE_MAJOR}.x nodistro main" > /etc/apt/sources.list.d/nodesource.list \
    && apt-get update \
    && apt-get install -y nodejs \
    && rm -rf /var/lib/apt/lists/*

# Install uv (source image overridable via UV_IMAGE build arg)
COPY --from=uv-source /uv /uvx /usr/local/bin/

# Set working directory
WORKDIR /app

# Copy backend source code
COPY backend ./backend

# Install dependencies with cache mount
RUN --mount=type=cache,target=/root/.cache/uv \
    sh -c "cd backend && UV_INDEX_URL=${UV_INDEX_URL:-https://pypi.org/simple} uv sync"

# ── Stage 2: Dev ──────────────────────────────────────────────────────────────
# Retains compiler toolchain from builder so startup-time `uv sync` can build
# source distributions in development containers.
FROM builder AS dev

# Install Docker CLI (for DooD: allows starting sandbox containers via host Docker socket)
COPY --from=docker:cli /usr/local/bin/docker /usr/local/bin/docker

EXPOSE 8001 2024

CMD ["sh", "-c", "cd backend && PYTHONPATH=. uv run uvicorn app.gateway.app:app --host 0.0.0.0 --port 8001"]

# ── Stage 3: Runtime ──────────────────────────────────────────────────────────
# Clean image without build-essential: reduces size (~200 MB) and attack surface.
FROM python:3.12-slim-bookworm

# Copy Node.js runtime from builder (provides npx for MCP servers)
COPY --from=builder /usr/bin/node /usr/bin/node
COPY --from=builder /usr/lib/node_modules /usr/lib/node_modules
RUN ln -s ../lib/node_modules/npm/bin/npm-cli.js /usr/bin/npm \
    && ln -s ../lib/node_modules/npm/bin/npx-cli.js /usr/bin/npx

# Install Docker CLI (for DooD: allows starting sandbox containers via host Docker socket)
COPY --from=docker:cli /usr/local/bin/docker /usr/local/bin/docker

# Install uv (source image overridable via UV_IMAGE build arg)
COPY --from=uv-source /uv /uvx /usr/local/bin/

# Set working directory
WORKDIR /app

# Copy backend with pre-built virtualenv from builder
COPY --from=builder /app/backend ./backend

# Expose ports (gateway: 8001, langgraph: 2024)
EXPOSE 8001 2024

# Default command (can be overridden in docker-compose)
CMD ["sh", "-c", "cd backend && PYTHONPATH=. uv run --no-sync uvicorn app.gateway.app:app --host 0.0.0.0 --port 8001"]