Initial commit: hardened DeerFlow factory

Vendored deer-flow upstream (bytedance/deer-flow) plus prompt-injection
hardening:

- New deerflow.security package: content_delimiter, html_cleaner,
  sanitizer (8 layers: invisible chars, control chars, symbols, NFC,
  PUA, tag chars, horizontal whitespace collapse with newline/tab
  preservation, length cap)
- New deerflow.community.searx package: web_search, web_fetch,
  image_search backed by a private SearX instance, every external
  string sanitized and wrapped in <<<EXTERNAL_UNTRUSTED_CONTENT>>>
  delimiters
- All native community web providers (ddg_search, tavily, exa,
  firecrawl, jina_ai, infoquest, image_search) replaced with hard-fail
  stubs that raise NativeWebToolDisabledError at import time, so a
  misconfigured tool.use path fails loud rather than silently falling
  back to unsanitized output
- Native client back-doors (jina_client.py, infoquest_client.py)
  stubbed too
- Native-tool tests quarantined under tests/_disabled_native/
  (collect_ignore_glob via local conftest.py)
- Sanitizer Layer 7 fix: only collapse horizontal whitespace, preserve
  newlines and tabs so list/table structure survives
- Hardened runtime config.yaml references only the searx-backed tools
- Factory overlay (backend/) kept in sync with deer-flow tree as a
  reference / source
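
The sanitizer layers listed above can be sketched roughly as follows. This is
a minimal illustration, not the code in deerflow.security: the function names,
character ranges, default length cap, and closing delimiter are assumptions,
and the "symbols" layer is omitted because this message does not say what it
covers.

```python
import re
import unicodedata

# All names and ranges below are illustrative assumptions, not the real
# deerflow.security API.
_INVISIBLE = re.compile("[\u200b-\u200f\u2060-\u2064\ufeff]")  # zero-width and directional marks
_PUA = re.compile("[\ue000-\uf8ff\U000f0000-\U000ffffd\U00100000-\U0010fffd]")
_TAG_CHARS = re.compile("[\U000e0000-\U000e007f]")  # Unicode tag block
_HSPACE = re.compile(r"[^\S\n\t]+")  # whitespace other than newline and tab

OPEN = "<<<EXTERNAL_UNTRUSTED_CONTENT>>>"
CLOSE = "<<<END_EXTERNAL_UNTRUSTED_CONTENT>>>"  # hypothetical closing marker


def sanitize(text: str, max_len: int = 10_000) -> str:
    """Apply the layers in order; newlines and tabs survive every step."""
    text = _INVISIBLE.sub("", text)
    # Drop control characters (category Cc) except newline and tab.
    text = "".join(
        ch for ch in text if ch in "\n\t" or unicodedata.category(ch) != "Cc"
    )
    text = unicodedata.normalize("NFC", text)
    text = _PUA.sub("", text)
    text = _TAG_CHARS.sub("", text)
    # Collapse runs of horizontal whitespace only, so list and table
    # structure carried by newlines and tabs is preserved.
    text = _HSPACE.sub(" ", text)
    return text[:max_len]


def wrap_untrusted(text: str) -> str:
    """Sanitize external text and fence it between the delimiters."""
    return f"{OPEN}\n{sanitize(text)}\n{CLOSE}"
```

Running the length cap last means it applies to the cleaned text, so removed
characters do not count against the budget.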

See HARDENING.md for the full audit trail and verification steps.
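
The hard-fail stubs described in the list above might look like the sketch
below. It is an assumption-laden illustration: the stub body, the error
message, the ImportError base class, and the helper names are hypothetical;
only the exception name NativeWebToolDisabledError comes from this commit.

```python
import importlib.util
import tempfile
from pathlib import Path

# Hypothetical contents of a stubbed provider module. The real stubs may
# define or import the exception differently.
STUB_SOURCE = '''
class NativeWebToolDisabledError(ImportError):
    """Raised when a disabled native web provider is imported."""


raise NativeWebToolDisabledError(
    "native web tool disabled; use the searx-backed tools instead"
)
'''


def import_from_path(name: str, path: Path):
    """Import a module from an explicit file path."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def try_import_stub() -> str:
    """Write the stub to a temp dir, import it, and report what happened."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "ddg_search_stub.py"
        path.write_text(STUB_SOURCE)
        try:
            import_from_path("ddg_search_stub", path)
        except ImportError as exc:
            return f"{type(exc).__name__}: {exc}"
    return "imported"
```

Raising at module level means the failure happens as soon as a config entry
resolves to the stubbed module, before any tool function can be called.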
2026-04-12 14:23:57 +02:00
commit 6de0bf9f5b
889 changed files with 173052 additions and 0 deletions

@@ -0,0 +1,128 @@
name: Runtime Information
description: Report runtime/environment details to help reproduce an issue.
title: "[runtime] "
labels:
  - needs-triage
body:
  - type: markdown
    attributes:
      value: |
        Thanks for sharing runtime details.
        Complete this form so maintainers can quickly reproduce and diagnose the problem.
  - type: input
    id: summary
    attributes:
      label: Problem summary
      description: Short summary of the issue.
      placeholder: e.g. make dev fails to start gateway service
    validations:
      required: true
  - type: textarea
    id: expected
    attributes:
      label: Expected behavior
      placeholder: What did you expect to happen?
    validations:
      required: true
  - type: textarea
    id: actual
    attributes:
      label: Actual behavior
      placeholder: What happened instead? Include key error lines.
    validations:
      required: true
  - type: dropdown
    id: os
    attributes:
      label: Operating system
      options:
        - macOS
        - Linux
        - Windows
        - Other
    validations:
      required: true
  - type: input
    id: platform_details
    attributes:
      label: Platform details
      description: Add architecture and shell if relevant.
      placeholder: e.g. arm64, zsh
  - type: input
    id: python_version
    attributes:
      label: Python version
      placeholder: e.g. Python 3.12.9
  - type: input
    id: node_version
    attributes:
      label: Node.js version
      placeholder: e.g. v23.11.0
  - type: input
    id: pnpm_version
    attributes:
      label: pnpm version
      placeholder: e.g. 10.26.2
  - type: input
    id: uv_version
    attributes:
      label: uv version
      placeholder: e.g. 0.7.20
  - type: dropdown
    id: run_mode
    attributes:
      label: How are you running DeerFlow?
      options:
        - Local (make dev)
        - Docker (make docker-dev)
        - CI
        - Other
    validations:
      required: true
  - type: textarea
    id: reproduce
    attributes:
      label: Reproduction steps
      description: Provide exact commands and sequence.
      placeholder: |
        1. make check
        2. make install
        3. make dev
        4. ...
    validations:
      required: true
  - type: textarea
    id: logs
    attributes:
      label: Relevant logs
      description: Paste key lines from logs (for example logs/gateway.log, logs/frontend.log).
      render: shell
    validations:
      required: true
  - type: textarea
    id: git_info
    attributes:
      label: Git state
      description: Share output of git branch and latest commit SHA.
      placeholder: |
        branch: feature/my-branch
        commit: abcdef1
  - type: textarea
    id: additional
    attributes:
      label: Additional context
      description: Add anything else that might help triage.

@@ -0,0 +1,213 @@
# Copilot Onboarding Instructions for DeerFlow
Use this file as the default operating guide for this repository. Follow it first, and only search the codebase when this file is incomplete or incorrect.
## 1) Repository Summary
DeerFlow is a full-stack "super agent harness".
- Backend: Python 3.12, LangGraph + FastAPI gateway, sandbox/tool system, memory, MCP integration.
- Frontend: Next.js 16 + React 19 + TypeScript + pnpm.
- Local dev entrypoint: root `Makefile` starts backend + frontend + nginx on `http://localhost:2026`.
- Docker dev entrypoint: `make docker-*` (mode-aware provisioner startup from `config.yaml`).
Current repo footprint is medium-large (backend service, frontend app, docker stack, skills library, docs).
## 2) Runtime and Toolchain Requirements
Validated in this repo on macOS:
- Node.js `>=22` (validated with Node `23.11.0`)
- pnpm (repo expects lockfile generated by pnpm 10; validated with pnpm `10.26.2` and `10.15.0`)
- Python `>=3.12` (CI uses `3.12`)
- `uv` (validated with `0.7.20`)
- `nginx` (required for `make dev` unified local endpoint)
Always run from repo root unless a command explicitly says otherwise.
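
The minimum versions above can be checked mechanically. The following is a small sketch, assuming the version strings look like the output of `node --version` and `python --version`; the helper names and the `MINIMUMS` table are hypothetical and not part of the repository.

```python
import re

# Minimums taken from the list above; the helper names are hypothetical.
MINIMUMS = {"node": (22,), "python": (3, 12)}


def parse_version(text: str) -> tuple[int, ...]:
    """Extract the first dotted number from output such as 'v23.11.0'."""
    match = re.search(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number in {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(tool: str, version_output: str) -> bool:
    """Compare numerically, so 3.12 correctly ranks above 3.9."""
    return parse_version(version_output) >= MINIMUMS[tool]
```

For example, `meets_minimum("node", "v23.11.0")` is true, while `meets_minimum("python", "Python 3.11.4")` is false.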
## 3) Build/Test/Lint/Run - Verified Command Sequences
These were executed and validated in this repository.
### A. Bootstrap and install
1. Check prerequisites:
```bash
make check
```
Observed: passes when required tools are installed.
2. Install dependencies (recommended order: backend then frontend, as implemented by `make install`):
```bash
make install
```
### B. Backend CI-equivalent validation
Run from `backend/`:
```bash
make lint
make test
```
Validated results:
- `make lint`: pass (`ruff check .`)
- `make test`: pass (`277 passed, 15 warnings in ~76.6s`)
CI parity:
- `.github/workflows/backend-unit-tests.yml` runs on pull requests and on pushes to `main`.
- That workflow executes `uv sync --group dev`, then `make test` in `backend/`; `make lint` runs in the separate Lint Check workflow.
### C. Frontend validation
Run from `frontend/`.
Recommended reliable sequence:
```bash
pnpm lint
pnpm typecheck
BETTER_AUTH_SECRET=local-dev-secret pnpm build
```
Observed failure modes and workarounds:
- `pnpm build` fails without `BETTER_AUTH_SECRET` in production-mode env validation.
- Workaround: set `BETTER_AUTH_SECRET` (best) or set `SKIP_ENV_VALIDATION=1`.
- Even with `SKIP_ENV_VALIDATION=1`, Better Auth can still warn/error in logs about default secret; prefer setting a real non-default secret.
- `pnpm check` currently fails (`next lint` invocation is incompatible here and resolves to an invalid directory). Do not rely on `pnpm check`; run `pnpm lint` and `pnpm typecheck` explicitly.
### D. Run locally (all services)
From root:
```bash
make dev
```
Behavior:
- Stops existing local services first.
- Starts LangGraph (`2024`), Gateway (`8001`), Frontend (`3000`), nginx (`2026`).
- Unified app endpoint: `http://localhost:2026`.
- Logs: `logs/langgraph.log`, `logs/gateway.log`, `logs/frontend.log`, `logs/nginx.log`.
Stop services:
```bash
make stop
```
If tool sessions/timeouts interrupt `make dev`, run `make stop` again to ensure cleanup.
### E. Config bootstrap
From root:
```bash
make config
```
Important behavior:
- This intentionally aborts if `config.yaml` (or `config.yml`/`configure.yml`) already exists.
- Use `make config` only for first-time setup in a clean clone.
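
The abort-if-present behavior can be pictured with the sketch below. It is a hypothetical stand-in, not the actual `Makefile` recipe: the function name and the copy from `config.example.yaml` are assumptions; only the three guarded file names come from the text above.

```python
from pathlib import Path
import shutil

# File names come from the text above; the function itself is a hypothetical
# stand-in for whatever the Makefile target actually runs.
EXISTING_NAMES = ("config.yaml", "config.yml", "configure.yml")


def bootstrap_config(root: Path) -> Path:
    """Create config.yaml from the template, refusing to touch an existing one."""
    for name in EXISTING_NAMES:
        if (root / name).exists():
            raise FileExistsError(f"{name} already exists; refusing to overwrite")
    target = root / "config.yaml"
    shutil.copyfile(root / "config.example.yaml", target)
    return target
```

Failing instead of overwriting protects local edits, which is why a second run in the same clone is expected to abort.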
## 4) Command Order That Minimizes Failures
Use this exact order for local code changes:
1. `make check`
2. `make install` (if frontend fails with proxy errors, rerun frontend install with proxy vars unset)
3. Backend checks: `cd backend && make lint && make test`
4. Frontend checks: `cd frontend && pnpm lint && pnpm typecheck`
5. Frontend build (if UI changes or release-sensitive changes): `BETTER_AUTH_SECRET=... pnpm build`
Always run backend lint/tests before opening PRs because CI enforces them; the Lint Check workflow additionally runs the frontend format, lint, typecheck, and build steps.
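
The ordering in the numbered list above could be scripted along these lines. This is a sketch under stated assumptions: the `STEPS` table and `run_steps` helper are hypothetical, the working directories are relative to the repo root, and the conditional frontend build step is left out because it needs `BETTER_AUTH_SECRET`.

```python
import subprocess

# Steps mirror the numbered list above; names here are hypothetical.
STEPS = [
    (".", ["make", "check"]),
    (".", ["make", "install"]),
    ("backend", ["make", "lint"]),
    ("backend", ["make", "test"]),
    ("frontend", ["pnpm", "lint"]),
    ("frontend", ["pnpm", "typecheck"]),
]


def run_steps(steps=STEPS, runner=subprocess.run) -> list[str]:
    """Run each step in order and stop at the first non-zero exit code."""
    done = []
    for cwd, cmd in steps:
        result = runner(cmd, cwd=cwd)
        if result.returncode != 0:
            raise RuntimeError(f"step failed in {cwd}: {' '.join(cmd)}")
        done.append(" ".join(cmd))
    return done
```

The `runner` parameter exists so the sequence can be dry-run with a fake in place of `subprocess.run`.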
## 5) Project Layout and Architecture (High-Value Paths)
Root-level orchestration and config:
- `Makefile` - main local/dev/docker command entrypoints
- `config.example.yaml` - primary app config template
- `config.yaml` - local active config (gitignored)
- `docker/docker-compose-dev.yaml` - Docker dev topology
- `.github/workflows/backend-unit-tests.yml` - PR validation workflow
Backend core:
- `backend/packages/harness/deerflow/agents/` - lead agent, middleware chain, memory
- `backend/app/gateway/` - FastAPI gateway API
- `backend/packages/harness/deerflow/sandbox/` - sandbox provider + tool wrappers
- `backend/packages/harness/deerflow/subagents/` - subagent registry/execution
- `backend/packages/harness/deerflow/mcp/` - MCP integration
- `backend/langgraph.json` - graph entrypoint (`deerflow.agents:make_lead_agent`)
- `backend/pyproject.toml` - Python deps and `requires-python`
- `backend/ruff.toml` - lint/format policy
- `backend/tests/` - backend unit and integration-like tests
Frontend core:
- `frontend/src/app/` - Next.js routes/pages
- `frontend/src/components/` - UI components
- `frontend/src/core/` - app logic (threads, tools, API, models)
- `frontend/src/env.js` - env schema/validation (critical for build behavior)
- `frontend/package.json` - scripts/deps
- `frontend/eslint.config.js` - lint rules
- `frontend/tsconfig.json` - TS config
Skills and assets:
- `skills/public/` - built-in skill packs loaded by agent runtime
## 6) Pre-Checkin / Validation Expectations
Before submitting changes, run at minimum:
- Backend: `cd backend && make lint && make test`
- Frontend (if touched): `cd frontend && pnpm lint && pnpm typecheck`
- Frontend build when changing env/auth/routing/build-sensitive files: `BETTER_AUTH_SECRET=... pnpm build`
If touching orchestration/config (`Makefile`, `docker/*`, `config*.yaml`), also run `make dev` and verify the four services start.
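
One way to verify that the four services started is to probe the ports listed in the "Run locally" section. The sketch below assumes the services accept TCP connections on `127.0.0.1`; the `SERVICES` table and helper names are hypothetical.

```python
import socket

# Ports as listed in the "Run locally" section; helper names are hypothetical.
SERVICES = {"langgraph": 2024, "gateway": 8001, "frontend": 3000, "nginx": 2026}


def is_listening(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return sock.connect_ex((host, port)) == 0


def service_status(services=SERVICES) -> dict[str, bool]:
    """Map each service name to whether its port accepts connections."""
    return {name: is_listening(port) for name, port in services.items()}
```

A successful connect only shows that something is bound to the port; the files under `logs/` remain the place to confirm that each service started cleanly.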
## 7) Non-Obvious Dependencies and Gotchas
- Proxy env vars can silently break frontend network operations (`pnpm install`/registry access).
- `BETTER_AUTH_SECRET` is effectively required for reliable frontend production build validation.
- Next.js may warn about multiple lockfiles and workspace root inference; this is currently a warning, not a build blocker.
- `make config` aborts by design when a config file already exists; it never overwrites one.
- `make dev` includes process cleanup and can emit shutdown logs/noise if interrupted; this is expected.
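
For the proxy gotcha in the list above, a child process can be given an environment with the proxy variables removed. This is a sketch only: which variables matter is an assumption (the text does not name them), and `env_without_proxies` is a hypothetical helper.

```python
import os

# Assumed set of proxy variables, matched case-insensitively.
PROXY_VARS = ("http_proxy", "https_proxy", "all_proxy")


def env_without_proxies(env=None) -> dict[str, str]:
    """Copy the environment, dropping proxy variables in either case."""
    source = dict(os.environ if env is None else env)
    return {k: v for k, v in source.items() if k.lower() not in PROXY_VARS}
```

It could then be passed to a rerun of the frontend install, for example `subprocess.run(["pnpm", "install"], cwd="frontend", env=env_without_proxies())`.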
## 8) Root Inventory (quick reference)
Important root entries:
- `.github/`
- `backend/`
- `frontend/`
- `docker/`
- `skills/`
- `scripts/`
- `docs/`
- `README.md`
- `CONTRIBUTING.md`
- `Makefile`
- `config.example.yaml`
- `extensions_config.example.json`
## 9) Instruction Priority
Trust this onboarding guide first.
Only do broad repo searches (`grep/find/code search`) when:
- you need file-level implementation details not listed here,
- a command here fails and you need updated replacement behavior,
- or CI/workflow definitions have changed since this file was written.

@@ -0,0 +1,40 @@
name: Unit Tests
on:
  push:
    branches: [ 'main' ]
  pull_request:
    types: [opened, synchronize, reopened, ready_for_review]
concurrency:
  group: unit-tests-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: true
permissions:
  contents: read
jobs:
  backend-unit-tests:
    if: github.event.pull_request.draft == false
    runs-on: ubuntu-latest
    timeout-minutes: 15
    steps:
      - name: Checkout
        uses: actions/checkout@v6
      - name: Set up Python
        uses: actions/setup-python@v6
        with:
          python-version: '3.12'
      - name: Install uv
        uses: astral-sh/setup-uv@v7
      - name: Install backend dependencies
        working-directory: backend
        run: uv sync --group dev
      - name: Run unit tests of backend
        working-directory: backend
        run: make test

@@ -0,0 +1,74 @@
name: Lint Check
on:
  push:
    branches: [ 'main' ]
  pull_request:
    branches: [ '*' ]
permissions:
  contents: read
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6
      - name: Set up Python
        uses: actions/setup-python@v6
        with:
          python-version: '3.12'
      - name: Install uv
        uses: astral-sh/setup-uv@v7
      - name: Install dependencies
        working-directory: backend
        run: |
          uv sync --group dev
      - name: Lint backend
        working-directory: backend
        run: make lint
  lint-frontend:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '22'
      - name: Enable Corepack
        run: corepack enable
      - name: Use pinned pnpm version
        run: corepack prepare pnpm@10.26.2 --activate
      - name: Install frontend dependencies
        run: |
          cd frontend
          pnpm install --frozen-lockfile
      - name: Check frontend formatting
        run: |
          cd frontend
          pnpm format
      - name: Run frontend linting
        run: |
          cd frontend
          pnpm lint
      - name: Check TypeScript types
        run: |
          cd frontend
          pnpm typecheck
      - name: Build frontend
        run: |
          cd frontend
          BETTER_AUTH_SECRET=local-dev-secret pnpm build