Initial commit: hardened DeerFlow factory

Vendored deer-flow upstream (bytedance/deer-flow) plus prompt-injection
hardening:

- New deerflow.security package: content_delimiter, html_cleaner,
  sanitizer (8 layers — invisible chars, control chars, symbols, NFC,
  PUA, tag chars, horizontal whitespace collapse with newline/tab
  preservation, length cap)
- New deerflow.community.searx package: web_search, web_fetch,
  image_search backed by a private SearX instance, every external
  string sanitized and wrapped in <<<EXTERNAL_UNTRUSTED_CONTENT>>>
  delimiters
- All native community web providers (ddg_search, tavily, exa,
  firecrawl, jina_ai, infoquest, image_search) replaced with hard-fail
  stubs that raise NativeWebToolDisabledError at import time, so a
  misconfigured tool.use path fails loud rather than silently falling
  back to unsanitized output
- Native client back-doors (jina_client.py, infoquest_client.py)
  stubbed too
- Native-tool tests quarantined under tests/_disabled_native/
  (collect_ignore_glob via local conftest.py)
- Sanitizer Layer 7 fix: only collapse horizontal whitespace, preserve
  newlines and tabs so list/table structure survives
- Hardened runtime config.yaml references only the searx-backed tools
- Factory overlay (backend/) kept in sync with deer-flow tree as a
  reference / source

See HARDENING.md for the full audit trail and verification steps.
2026-04-12 14:23:57 +02:00
commit 6de0bf9f5b
889 changed files with 173052 additions and 0 deletions

# API Reference
This document provides a complete reference for the DeerFlow backend APIs.
## Overview
DeerFlow backend exposes two sets of APIs:
1. **LangGraph API** - Agent interactions, threads, and streaming (`/api/langgraph/*`)
2. **Gateway API** - Models, MCP, skills, uploads, and artifacts (`/api/*`)
All APIs are accessed through the Nginx reverse proxy at port 2026.
## LangGraph API
Base URL: `/api/langgraph`
The LangGraph API is provided by the LangGraph server and follows the LangGraph SDK conventions.
### Threads
#### Create Thread
```http
POST /api/langgraph/threads
Content-Type: application/json
```
**Request Body:**
```json
{
  "metadata": {}
}
```
**Response:**
```json
{
  "thread_id": "abc123",
  "created_at": "2024-01-15T10:30:00Z",
  "metadata": {}
}
```
#### Get Thread State
```http
GET /api/langgraph/threads/{thread_id}/state
```
**Response:**
```json
{
  "values": {
    "messages": [...],
    "sandbox": {...},
    "artifacts": [...],
    "thread_data": {...},
    "title": "Conversation Title"
  },
  "next": [],
  "config": {...}
}
```
### Runs
#### Create Run
Execute the agent with input.
```http
POST /api/langgraph/threads/{thread_id}/runs
Content-Type: application/json
```
**Request Body:**
```json
{
  "input": {
    "messages": [
      {
        "role": "user",
        "content": "Hello, can you help me?"
      }
    ]
  },
  "config": {
    "recursion_limit": 100,
    "configurable": {
      "model_name": "gpt-4",
      "thinking_enabled": false,
      "is_plan_mode": false
    }
  },
  "stream_mode": ["values", "messages-tuple", "custom"]
}
```
**Stream Mode Compatibility:**
- Use: `values`, `messages-tuple`, `custom`, `updates`, `events`, `debug`, `tasks`, `checkpoints`
- Do not use: `tools` (deprecated/invalid in current `langgraph-api` and will trigger schema validation errors)
**Recursion Limit:**
`config.recursion_limit` caps the number of graph steps LangGraph will execute
in a single run. The `/api/langgraph/*` endpoints go straight to the LangGraph
server and therefore inherit LangGraph's native default of **25**, which is
too low for plan-mode or subagent-heavy runs — the agent typically errors out
with `GraphRecursionError` after the first round of subagent results comes
back, before the lead agent can synthesize the final answer.
DeerFlow's own Gateway and IM-channel paths mitigate this by defaulting to
`100` in `build_run_config` (see `backend/app/gateway/services.py`), but
clients calling the LangGraph API directly must set `recursion_limit`
explicitly in the request body. `100` matches the Gateway default and is a
safe starting point; increase it if you run deeply nested subagent graphs.
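For clients that call `/api/langgraph/*` directly, it can help to centralize the request body so the limit is never forgotten. A minimal sketch (the helper name and the `gpt-4` default are illustrative, not part of the API):

```python
def build_run_payload(user_text: str, recursion_limit: int = 100,
                      model_name: str = "gpt-4") -> dict:
    """Build a body for POST /api/langgraph/threads/{thread_id}/runs.

    recursion_limit defaults to 100 to match the Gateway default; omitting
    it would leave LangGraph's native default of 25 in effect.
    """
    return {
        "input": {"messages": [{"role": "user", "content": user_text}]},
        "config": {
            "recursion_limit": recursion_limit,
            "configurable": {"model_name": model_name},
        },
        "stream_mode": ["values", "messages-tuple", "custom"],
    }
```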
**Configurable Options:**
- `model_name` (string): Override the default model
- `thinking_enabled` (boolean): Enable extended thinking for supported models
- `is_plan_mode` (boolean): Enable TodoList middleware for task tracking
**Response:** Server-Sent Events (SSE) stream
```
event: values
data: {"messages": [...], "title": "..."}

event: messages
data: {"content": "Hello! I'd be happy to help.", "role": "assistant"}

event: end
data: {}
```
#### Get Run History
```http
GET /api/langgraph/threads/{thread_id}/runs
```
**Response:**
```json
{
  "runs": [
    {
      "run_id": "run123",
      "status": "success",
      "created_at": "2024-01-15T10:30:00Z"
    }
  ]
}
```
#### Stream Run
Stream responses in real-time.
```http
POST /api/langgraph/threads/{thread_id}/runs/stream
Content-Type: application/json
```
Same request body as Create Run. Returns SSE stream.
---
## Gateway API
Base URL: `/api`
### Models
#### List Models
Get all available LLM models from configuration.
```http
GET /api/models
```
**Response:**
```json
{
  "models": [
    {
      "name": "gpt-4",
      "display_name": "GPT-4",
      "supports_thinking": false,
      "supports_vision": true
    },
    {
      "name": "claude-3-opus",
      "display_name": "Claude 3 Opus",
      "supports_thinking": false,
      "supports_vision": true
    },
    {
      "name": "deepseek-v3",
      "display_name": "DeepSeek V3",
      "supports_thinking": true,
      "supports_vision": false
    }
  ]
}
```
#### Get Model Details
```http
GET /api/models/{model_name}
```
**Response:**
```json
{
  "name": "gpt-4",
  "display_name": "GPT-4",
  "model": "gpt-4",
  "max_tokens": 4096,
  "supports_thinking": false,
  "supports_vision": true
}
```
### MCP Configuration
#### Get MCP Config
Get current MCP server configurations.
```http
GET /api/mcp/config
```
**Response:**
```json
{
  "mcpServers": {
    "github": {
      "enabled": true,
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_TOKEN": "***"
      },
      "description": "GitHub operations"
    },
    "filesystem": {
      "enabled": false,
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem"],
      "description": "File system access"
    }
  }
}
```
#### Update MCP Config
Update MCP server configurations.
```http
PUT /api/mcp/config
Content-Type: application/json
```
**Request Body:**
```json
{
  "mcpServers": {
    "github": {
      "enabled": true,
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_TOKEN": "$GITHUB_TOKEN"
      },
      "description": "GitHub operations"
    }
  }
}
```
**Response:**
```json
{
  "success": true,
  "message": "MCP configuration updated"
}
```
### Skills
#### List Skills
Get all available skills.
```http
GET /api/skills
```
**Response:**
```json
{
  "skills": [
    {
      "name": "pdf-processing",
      "display_name": "PDF Processing",
      "description": "Handle PDF documents efficiently",
      "enabled": true,
      "license": "MIT",
      "path": "public/pdf-processing"
    },
    {
      "name": "frontend-design",
      "display_name": "Frontend Design",
      "description": "Design and build frontend interfaces",
      "enabled": false,
      "license": "MIT",
      "path": "public/frontend-design"
    }
  ]
}
```
#### Get Skill Details
```http
GET /api/skills/{skill_name}
```
**Response:**
```json
{
  "name": "pdf-processing",
  "display_name": "PDF Processing",
  "description": "Handle PDF documents efficiently",
  "enabled": true,
  "license": "MIT",
  "path": "public/pdf-processing",
  "allowed_tools": ["read_file", "write_file", "bash"],
  "content": "# PDF Processing\n\nInstructions for the agent..."
}
```
#### Enable Skill
```http
POST /api/skills/{skill_name}/enable
```
**Response:**
```json
{
  "success": true,
  "message": "Skill 'pdf-processing' enabled"
}
```
#### Disable Skill
```http
POST /api/skills/{skill_name}/disable
```
**Response:**
```json
{
  "success": true,
  "message": "Skill 'pdf-processing' disabled"
}
```
#### Install Skill
Install a skill from a `.skill` file.
```http
POST /api/skills/install
Content-Type: multipart/form-data
```
**Request Body:**
- `file`: The `.skill` file to install
**Response:**
```json
{
  "success": true,
  "message": "Skill 'my-skill' installed successfully",
  "skill": {
    "name": "my-skill",
    "display_name": "My Skill",
    "path": "custom/my-skill"
  }
}
```
### File Uploads
#### Upload Files
Upload one or more files to a thread.
```http
POST /api/threads/{thread_id}/uploads
Content-Type: multipart/form-data
```
**Request Body:**
- `files`: One or more files to upload
**Response:**
```json
{
  "success": true,
  "files": [
    {
      "filename": "document.pdf",
      "size": 1234567,
      "path": ".deer-flow/threads/abc123/user-data/uploads/document.pdf",
      "virtual_path": "/mnt/user-data/uploads/document.pdf",
      "artifact_url": "/api/threads/abc123/artifacts/mnt/user-data/uploads/document.pdf",
      "markdown_file": "document.md",
      "markdown_path": ".deer-flow/threads/abc123/user-data/uploads/document.md",
      "markdown_virtual_path": "/mnt/user-data/uploads/document.md",
      "markdown_artifact_url": "/api/threads/abc123/artifacts/mnt/user-data/uploads/document.md"
    }
  ],
  "message": "Successfully uploaded 1 file(s)"
}
```
**Supported Document Formats** (auto-converted to Markdown):
- PDF (`.pdf`)
- PowerPoint (`.ppt`, `.pptx`)
- Excel (`.xls`, `.xlsx`)
- Word (`.doc`, `.docx`)
#### List Uploaded Files
```http
GET /api/threads/{thread_id}/uploads/list
```
**Response:**
```json
{
  "files": [
    {
      "filename": "document.pdf",
      "size": 1234567,
      "path": ".deer-flow/threads/abc123/user-data/uploads/document.pdf",
      "virtual_path": "/mnt/user-data/uploads/document.pdf",
      "artifact_url": "/api/threads/abc123/artifacts/mnt/user-data/uploads/document.pdf",
      "extension": ".pdf",
      "modified": 1705997600.0
    }
  ],
  "count": 1
}
```
#### Delete File
```http
DELETE /api/threads/{thread_id}/uploads/{filename}
```
**Response:**
```json
{
  "success": true,
  "message": "Deleted document.pdf"
}
```
### Thread Cleanup
Remove DeerFlow-managed local thread files under `.deer-flow/threads/{thread_id}` after the LangGraph thread itself has been deleted.
```http
DELETE /api/threads/{thread_id}
```
**Response:**
```json
{
  "success": true,
  "message": "Deleted local thread data for abc123"
}
```
**Error behavior:**
- `422` for invalid thread IDs
- `500` returns a generic `{"detail": "Failed to delete local thread data."}` response while full exception details stay in server logs
### Artifacts
#### Get Artifact
Download or view an artifact generated by the agent.
```http
GET /api/threads/{thread_id}/artifacts/{path}
```
**Path Examples:**
- `/api/threads/abc123/artifacts/mnt/user-data/outputs/result.txt`
- `/api/threads/abc123/artifacts/mnt/user-data/uploads/document.pdf`
**Query Parameters:**
- `download` (boolean): If `true`, force download with Content-Disposition header
**Response:** File content with appropriate Content-Type
---
## Error Responses
All APIs return errors in a consistent format:
```json
{
  "detail": "Error message describing what went wrong"
}
```
**HTTP Status Codes:**
- `400` - Bad Request: Invalid input
- `404` - Not Found: Resource not found
- `422` - Validation Error: Request validation failed
- `500` - Internal Server Error: Server-side error
---
## Authentication
Currently, DeerFlow does not implement authentication. All APIs are accessible without credentials.
Note: this applies only to authentication for DeerFlow's own APIs. Outbound MCP connections can still use OAuth for configured HTTP/SSE MCP servers.
For production deployments, it is recommended to:
1. Use Nginx for basic auth or OAuth integration
2. Deploy behind a VPN or private network
3. Implement custom authentication middleware
---
## Rate Limiting
No rate limiting is implemented by default. For production deployments, configure rate limiting in Nginx:
```nginx
limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;

location /api/ {
    limit_req zone=api burst=20 nodelay;
    proxy_pass http://backend;
}
```
---
## WebSocket Support
The LangGraph server supports WebSocket connections for real-time streaming. Connect to:
```
ws://localhost:2026/api/langgraph/threads/{thread_id}/runs/stream
```
---
## SDK Usage
### Python (LangGraph SDK)
```python
from langgraph_sdk import get_client

client = get_client(url="http://localhost:2026/api/langgraph")

# Create thread
thread = await client.threads.create()

# Run agent
async for event in client.runs.stream(
    thread["thread_id"],
    "lead_agent",
    input={"messages": [{"role": "user", "content": "Hello"}]},
    config={"configurable": {"model_name": "gpt-4"}},
    stream_mode=["values", "messages-tuple", "custom"],
):
    print(event)
```
### JavaScript/TypeScript
```typescript
// Using fetch for Gateway API
const response = await fetch('/api/models');
const data = await response.json();
console.log(data.models);

// Using EventSource for streaming
const eventSource = new EventSource(
  `/api/langgraph/threads/${threadId}/runs/stream`
);
eventSource.onmessage = (event) => {
  console.log(JSON.parse(event.data));
};
```
### cURL Examples
```bash
# List models
curl http://localhost:2026/api/models

# Get MCP config
curl http://localhost:2026/api/mcp/config

# Upload file
curl -X POST http://localhost:2026/api/threads/abc123/uploads \
  -F "files=@document.pdf"

# Enable skill
curl -X POST http://localhost:2026/api/skills/pdf-processing/enable

# Create thread and run agent
curl -X POST http://localhost:2026/api/langgraph/threads \
  -H "Content-Type: application/json" \
  -d '{}'

curl -X POST http://localhost:2026/api/langgraph/threads/abc123/runs \
  -H "Content-Type: application/json" \
  -d '{
    "input": {"messages": [{"role": "user", "content": "Hello"}]},
    "config": {
      "recursion_limit": 100,
      "configurable": {"model_name": "gpt-4"}
    }
  }'
```
> The `/api/langgraph/*` endpoints bypass DeerFlow's Gateway and inherit
> LangGraph's native `recursion_limit` default of 25, which is too low for
> plan-mode or subagent runs. Set `config.recursion_limit` explicitly — see
> the [Create Run](#create-run) section for details.

# Apple Container Support
DeerFlow now supports Apple Container as the preferred container runtime on macOS, with automatic fallback to Docker.
## Overview
Starting with this version, DeerFlow automatically detects and uses Apple Container on macOS when available, falling back to Docker when:
- Apple Container is not installed
- The platform is not macOS
This provides better performance on Apple Silicon Macs while maintaining compatibility across all platforms.
## Benefits
### On Apple Silicon Macs with Apple Container:
- **Better Performance**: Native ARM64 execution without Rosetta 2 translation
- **Lower Resource Usage**: Lighter weight than Docker Desktop
- **Native Integration**: Uses macOS Virtualization.framework
### Fallback to Docker:
- Full backward compatibility
- Works on all platforms (macOS, Linux, Windows)
- No configuration changes needed
## Requirements
### For Apple Container (macOS only):
- macOS 15.0 or later
- Apple Silicon (M1/M2/M3/M4)
- Apple Container CLI installed
### Installation:
```bash
# Download from GitHub releases
# https://github.com/apple/container/releases
# Verify installation
container --version
# Start the service
container system start
```
### For Docker (all platforms):
- Docker Desktop or Docker Engine
## How It Works
### Automatic Detection
The `AioSandboxProvider` automatically detects the available container runtime:
1. On macOS: Try `container --version`
- Success → Use Apple Container
- Failure → Fall back to Docker
2. On other platforms: Use Docker directly
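The detection order above can be sketched as follows. This is an illustrative reimplementation, not the actual `_detect_container_runtime()` code, and the five-second probe timeout is an assumption:

```python
import platform
import subprocess


def detect_container_runtime() -> str:
    """Pick "container" (Apple Container) on macOS when available, else "docker"."""
    if platform.system() == "Darwin":
        try:
            # Step 1: probe Apple Container; any failure falls through to Docker.
            subprocess.run(
                ["container", "--version"],
                check=True,
                capture_output=True,
                timeout=5,
            )
            return "container"
        except (FileNotFoundError, subprocess.CalledProcessError,
                subprocess.TimeoutExpired):
            pass
    # Step 2: non-macOS platforms (or a failed probe) use Docker directly.
    return "docker"
```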
### Runtime Differences
Both runtimes use nearly identical command syntax:
**Container Startup:**
```bash
# Apple Container
container run --rm -d -p 8080:8080 -v /host:/container -e KEY=value image
# Docker
docker run --rm -d -p 8080:8080 -v /host:/container -e KEY=value image
```
**Container Cleanup:**
```bash
# Apple Container (with --rm flag)
container stop <id> # Auto-removes due to --rm
# Docker (with --rm flag)
docker stop <id> # Auto-removes due to --rm
```
### Implementation Details
The implementation is in `backend/packages/harness/deerflow/community/aio_sandbox/aio_sandbox_provider.py`:
- `_detect_container_runtime()`: Detects available runtime at startup
- `_start_container()`: Uses detected runtime, skips Docker-specific options for Apple Container
- `_stop_container()`: Uses appropriate stop command for the runtime
## Configuration
No configuration changes are needed! The system works automatically.
However, you can verify the runtime in use by checking the logs:
```
INFO:deerflow.community.aio_sandbox.aio_sandbox_provider:Detected Apple Container: container version 0.1.0
INFO:deerflow.community.aio_sandbox.aio_sandbox_provider:Starting sandbox container using container: ...
```
Or for Docker:
```
INFO:deerflow.community.aio_sandbox.aio_sandbox_provider:Apple Container not available, falling back to Docker
INFO:deerflow.community.aio_sandbox.aio_sandbox_provider:Starting sandbox container using docker: ...
```
## Container Images
Both runtimes use OCI-compatible images. The default image works with both:
```yaml
sandbox:
use: deerflow.community.aio_sandbox:AioSandboxProvider
image: enterprise-public-cn-beijing.cr.volces.com/vefaas-public/all-in-one-sandbox:latest # Default image
```
Make sure your images are available for the appropriate architecture:
- ARM64 for Apple Container on Apple Silicon
- AMD64 for Docker on Intel Macs
- Multi-arch images work on both
### Pre-pulling Images (Recommended)
**Important**: Container images are typically large (500MB+) and are pulled on first use, which can cause a long wait time without clear feedback.
**Best Practice**: Pre-pull the image during setup:
```bash
# From project root
make setup-sandbox
```
This command will:
1. Read the configured image from `config.yaml` (or use default)
2. Detect available runtime (Apple Container or Docker)
3. Pull the image with progress indication
4. Verify the image is ready for use
**Manual pre-pull**:
```bash
# Using Apple Container
container image pull enterprise-public-cn-beijing.cr.volces.com/vefaas-public/all-in-one-sandbox:latest
# Using Docker
docker pull enterprise-public-cn-beijing.cr.volces.com/vefaas-public/all-in-one-sandbox:latest
```
If you skip pre-pulling, the image will be automatically pulled on first agent execution, which may take several minutes depending on your network speed.
## Cleanup Scripts
The project includes a unified cleanup script that handles both runtimes:
**Script:** `scripts/cleanup-containers.sh`
**Usage:**
```bash
# Clean up all DeerFlow sandbox containers
./scripts/cleanup-containers.sh deer-flow-sandbox
# Custom prefix
./scripts/cleanup-containers.sh my-prefix
```
**Makefile Integration:**
All cleanup commands in `Makefile` automatically handle both runtimes:
```bash
make stop # Stops all services and cleans up containers
make clean # Full cleanup including logs
```
## Testing
Test the container runtime detection:
```bash
cd backend
python test_container_runtime.py
```
This will:
1. Detect the available runtime
2. Optionally start a test container
3. Verify connectivity
4. Clean up
## Troubleshooting
### Apple Container not detected on macOS
1. Check if installed:
```bash
which container
container --version
```
2. Check if service is running:
```bash
container system start
```
3. Check logs for detection:
```bash
# Look for detection message in application logs
grep "container runtime" logs/*.log
```
### Containers not cleaning up
1. Manually check running containers:
```bash
# Apple Container
container list
# Docker
docker ps
```
2. Run cleanup script manually:
```bash
./scripts/cleanup-containers.sh deer-flow-sandbox
```
### Performance issues
- Apple Container should be faster on Apple Silicon
- If experiencing issues, you can force Docker by temporarily renaming the `container` command:
```bash
# Temporary workaround - not recommended for permanent use
sudo mv /opt/homebrew/bin/container /opt/homebrew/bin/container.bak
```
## References
- [Apple Container GitHub](https://github.com/apple/container)
- [Apple Container Documentation](https://github.com/apple/container/blob/main/docs/)
- [OCI Image Spec](https://github.com/opencontainers/image-spec)

# Architecture Overview
This document provides a comprehensive overview of the DeerFlow backend architecture.
## System Architecture
```
┌──────────────────────────────────────────────────────────────────────────┐
│ Client (Browser) │
└─────────────────────────────────┬────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────────────┐
│ Nginx (Port 2026) │
│ Unified Reverse Proxy Entry Point │
│ ┌────────────────────────────────────────────────────────────────────┐ │
│ │ /api/langgraph/* → LangGraph Server (2024) │ │
│ │ /api/* → Gateway API (8001) │ │
│ │ /* → Frontend (3000) │ │
│ └────────────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────┬────────────────────────────────────────┘
┌───────────────────────┼───────────────────────┐
│ │ │
▼ ▼ ▼
┌─────────────────────┐ ┌─────────────────────┐ ┌─────────────────────┐
│ LangGraph Server │ │ Gateway API │ │ Frontend │
│ (Port 2024) │ │ (Port 8001) │ │ (Port 3000) │
│ │ │ │ │ │
│ - Agent Runtime │ │ - Models API │ │ - Next.js App │
│ - Thread Mgmt │ │ - MCP Config │ │ - React UI │
│ - SSE Streaming │ │ - Skills Mgmt │ │ - Chat Interface │
│ - Checkpointing │ │ - File Uploads │ │ │
│ │ │ - Thread Cleanup │ │ │
│ │ │ - Artifacts │ │ │
└─────────────────────┘ └─────────────────────┘ └─────────────────────┘
│ │
│ ┌─────────────────┘
│ │
▼ ▼
┌──────────────────────────────────────────────────────────────────────────┐
│ Shared Configuration │
│ ┌─────────────────────────┐ ┌────────────────────────────────────────┐ │
│ │ config.yaml │ │ extensions_config.json │ │
│ │ - Models │ │ - MCP Servers │ │
│ │ - Tools │ │ - Skills State │ │
│ │ - Sandbox │ │ │ │
│ │ - Summarization │ │ │ │
│ └─────────────────────────┘ └────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────────────┘
```
## Component Details
### LangGraph Server
The LangGraph server is the core agent runtime, built on LangGraph for robust multi-agent workflow orchestration.
**Entry Point**: `packages/harness/deerflow/agents/lead_agent/agent.py:make_lead_agent`
**Key Responsibilities**:
- Agent creation and configuration
- Thread state management
- Middleware chain execution
- Tool execution orchestration
- SSE streaming for real-time responses
**Configuration**: `langgraph.json`
```json
{
  "agent": {
    "type": "agent",
    "path": "deerflow.agents:make_lead_agent"
  }
}
```
### Gateway API
FastAPI application providing REST endpoints for non-agent operations.
**Entry Point**: `app/gateway/app.py`
**Routers**:
- `models.py` - `/api/models` - Model listing and details
- `mcp.py` - `/api/mcp` - MCP server configuration
- `skills.py` - `/api/skills` - Skills management
- `uploads.py` - `/api/threads/{id}/uploads` - File upload
- `threads.py` - `/api/threads/{id}` - Local DeerFlow thread data cleanup after LangGraph deletion
- `artifacts.py` - `/api/threads/{id}/artifacts` - Artifact serving
- `suggestions.py` - `/api/threads/{id}/suggestions` - Follow-up suggestion generation
The web conversation delete flow is now split across both backend surfaces: LangGraph handles `DELETE /api/langgraph/threads/{thread_id}` for thread state, then the Gateway `threads.py` router removes DeerFlow-managed filesystem data via `Paths.delete_thread_dir()`.
### Agent Architecture
```
┌─────────────────────────────────────────────────────────────────────────┐
│ make_lead_agent(config) │
└────────────────────────────────────┬────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────┐
│ Middleware Chain │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ 1. ThreadDataMiddleware - Initialize workspace/uploads/outputs │ │
│ │ 2. UploadsMiddleware - Process uploaded files │ │
│ │ 3. SandboxMiddleware - Acquire sandbox environment │ │
│ │ 4. SummarizationMiddleware - Context reduction (if enabled) │ │
│ │ 5. TitleMiddleware - Auto-generate titles │ │
│ │ 6. TodoListMiddleware - Task tracking (if plan_mode) │ │
│ │ 7. ViewImageMiddleware - Vision model support │ │
│ │ 8. ClarificationMiddleware - Handle clarifications │ │
│ └──────────────────────────────────────────────────────────────────┘ │
└────────────────────────────────────┬────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────┐
│ Agent Core │
│ ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────────┐ │
│ │ Model │ │ Tools │ │ System Prompt │ │
│ │ (from factory) │ │ (configured + │ │ (with skills) │ │
│ │ │ │ MCP + builtin) │ │ │ │
│ └──────────────────┘ └──────────────────┘ └──────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────┘
```
### Thread State
The `ThreadState` extends LangGraph's `AgentState` with additional fields:
```python
class ThreadState(AgentState):
    # Core state from AgentState
    messages: list[BaseMessage]

    # DeerFlow extensions
    sandbox: dict          # Sandbox environment info
    artifacts: list[str]   # Generated file paths
    thread_data: dict      # {workspace, uploads, outputs} paths
    title: str | None      # Auto-generated conversation title
    todos: list[dict]      # Task tracking (plan mode)
    viewed_images: dict    # Vision model image data
```
### Sandbox System
```
┌─────────────────────────────────────────────────────────────────────────┐
│ Sandbox Architecture │
└─────────────────────────────────────────────────────────────────────────┘
┌─────────────────────────┐
│ SandboxProvider │ (Abstract)
│ - acquire() │
│ - get() │
│ - release() │
└────────────┬────────────┘
┌────────────────────┼────────────────────┐
│ │
▼ ▼
┌─────────────────────────┐ ┌─────────────────────────┐
│ LocalSandboxProvider │ │ AioSandboxProvider │
│ (packages/harness/deerflow/sandbox/local.py) │ │ (packages/harness/deerflow/community/) │
│ │ │ │
│ - Singleton instance │ │ - Docker-based │
│ - Direct execution │ │ - Isolated containers │
│ - Development use │ │ - Production use │
└─────────────────────────┘ └─────────────────────────┘
┌─────────────────────────┐
│ Sandbox │ (Abstract)
│ - execute_command() │
│ - read_file() │
│ - write_file() │
│ - list_dir() │
└─────────────────────────┘
```
**Virtual Path Mapping**:
| Virtual Path | Physical Path |
|-------------|---------------|
| `/mnt/user-data/workspace` | `backend/.deer-flow/threads/{thread_id}/user-data/workspace` |
| `/mnt/user-data/uploads` | `backend/.deer-flow/threads/{thread_id}/user-data/uploads` |
| `/mnt/user-data/outputs` | `backend/.deer-flow/threads/{thread_id}/user-data/outputs` |
| `/mnt/skills` | `deer-flow/skills/` |
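A hypothetical helper showing how the table's mapping can be applied; the paths come from the table, but the function itself is a sketch, not project code:

```python
from pathlib import PurePosixPath

# Virtual roots from the mapping table above (thread-scoped and shared).
VIRTUAL_ROOTS = {
    "/mnt/user-data": "backend/.deer-flow/threads/{thread_id}/user-data",
    "/mnt/skills": "deer-flow/skills",
}


def resolve_virtual_path(virtual: str, thread_id: str) -> str:
    """Translate a sandbox virtual path into its physical location."""
    path = PurePosixPath(virtual)
    for prefix, physical in VIRTUAL_ROOTS.items():
        try:
            rel = path.relative_to(prefix)
        except ValueError:
            continue  # not under this root, try the next one
        return str(PurePosixPath(physical.format(thread_id=thread_id)) / rel)
    raise ValueError(f"unknown virtual path: {virtual}")
```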
### Tool System
```
┌─────────────────────────────────────────────────────────────────────────┐
│ Tool Sources │
└─────────────────────────────────────────────────────────────────────────┘
┌─────────────────────┐ ┌─────────────────────┐ ┌─────────────────────┐
│ Built-in Tools │ │ Configured Tools │ │ MCP Tools │
│ (packages/harness/deerflow/tools/) │ │ (config.yaml) │ │ (extensions.json) │
├─────────────────────┤ ├─────────────────────┤ ├─────────────────────┤
│ - present_file │ │ - web_search │ │ - github │
│ - ask_clarification │ │ - web_fetch │ │ - filesystem │
│ - view_image │ │ - bash │ │ - postgres │
│ │ │ - read_file │ │ - brave-search │
│ │ │ - write_file │ │ - puppeteer │
│ │ │ - str_replace │ │ - ... │
│ │ │ - ls │ │ │
└─────────────────────┘ └─────────────────────┘ └─────────────────────┘
│ │ │
└───────────────────────┴───────────────────────┘
┌─────────────────────────┐
│ get_available_tools() │
│ (packages/harness/deerflow/tools/__init__) │
└─────────────────────────┘
```
### Model Factory
```
┌─────────────────────────────────────────────────────────────────────────┐
│ Model Factory │
│ (packages/harness/deerflow/models/factory.py) │
└─────────────────────────────────────────────────────────────────────────┘
config.yaml:
┌─────────────────────────────────────────────────────────────────────────┐
│ models: │
│ - name: gpt-4 │
│ display_name: GPT-4 │
│ use: langchain_openai:ChatOpenAI │
│ model: gpt-4 │
│ api_key: $OPENAI_API_KEY │
│ max_tokens: 4096 │
│ supports_thinking: false │
│ supports_vision: true │
└─────────────────────────────────────────────────────────────────────────┘
┌─────────────────────────┐
│ create_chat_model() │
│ - name: str │
│ - thinking_enabled │
└────────────┬────────────┘
┌─────────────────────────┐
│ resolve_class() │
│ (reflection system) │
└────────────┬────────────┘
┌─────────────────────────┐
│ BaseChatModel │
│ (LangChain instance) │
└─────────────────────────┘
```
**Supported Providers**:
- OpenAI (`langchain_openai:ChatOpenAI`)
- Anthropic (`langchain_anthropic:ChatAnthropic`)
- DeepSeek (`langchain_deepseek:ChatDeepSeek`)
- Custom via LangChain integrations
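The `use:` entries follow a `module:ClassName` convention, so the reflection step reduces to an import plus `getattr`. A minimal sketch of that step (the real `resolve_class()` may handle more cases):

```python
import importlib


def resolve_class(ref: str):
    """Resolve a "package.module:ClassName" reference to the class object.

    For example, "langchain_openai:ChatOpenAI" imports langchain_openai
    and returns its ChatOpenAI attribute.
    """
    module_name, _, class_name = ref.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)
```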
### MCP Integration
```
┌─────────────────────────────────────────────────────────────────────────┐
│ MCP Integration │
│ (packages/harness/deerflow/mcp/manager.py) │
└─────────────────────────────────────────────────────────────────────────┘
extensions_config.json:
┌─────────────────────────────────────────────────────────────────────────┐
│ { │
│ "mcpServers": { │
│ "github": { │
│ "enabled": true, │
│ "type": "stdio", │
│ "command": "npx", │
│ "args": ["-y", "@modelcontextprotocol/server-github"], │
│ "env": {"GITHUB_TOKEN": "$GITHUB_TOKEN"} │
│ } │
│ } │
│ } │
└─────────────────────────────────────────────────────────────────────────┘
┌─────────────────────────┐
│ MultiServerMCPClient │
│ (langchain-mcp-adapters)│
└────────────┬────────────┘
┌────────────────────┼────────────────────┐
│ │ │
▼ ▼ ▼
┌───────────┐ ┌───────────┐ ┌───────────┐
│ stdio │ │ SSE │ │ HTTP │
│ transport │ │ transport │ │ transport │
└───────────┘ └───────────┘ └───────────┘
```
### Skills System
```
┌─────────────────────────────────────────────────────────────────────────┐
│ Skills System │
│ (packages/harness/deerflow/skills/loader.py) │
└─────────────────────────────────────────────────────────────────────────┘
Directory Structure:
┌─────────────────────────────────────────────────────────────────────────┐
│ skills/ │
│ ├── public/ # Public skills (committed) │
│ │ ├── pdf-processing/ │
│ │ │ └── SKILL.md │
│ │ ├── frontend-design/ │
│ │ │ └── SKILL.md │
│ │ └── ... │
│ └── custom/ # Custom skills (gitignored) │
│ └── user-installed/ │
│ └── SKILL.md │
└─────────────────────────────────────────────────────────────────────────┘
SKILL.md Format:
┌─────────────────────────────────────────────────────────────────────────┐
│ --- │
│ name: PDF Processing │
│ description: Handle PDF documents efficiently │
│ license: MIT │
│ allowed-tools: │
│ - read_file │
│ - write_file │
│ - bash │
│ --- │
│ │
│ # Skill Instructions │
│ Content injected into system prompt... │
└─────────────────────────────────────────────────────────────────────────┘
```
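As a rough illustration, splitting the frontmatter from the instructions takes only a few lines of string handling. This deliberately minimal sketch understands just the flat `key: value` and `- item` lines shown above, not full YAML, and it is not the project's actual loader:

```python
def parse_skill_md(text: str) -> tuple[dict, str]:
    """Split a SKILL.md file into (frontmatter dict, instruction body)."""
    if not text.startswith("---\n"):
        raise ValueError("SKILL.md must start with a frontmatter block")
    header, _, body = text[4:].partition("\n---\n")
    meta: dict = {}
    current_key = None
    for raw in header.splitlines():
        line = raw.strip()
        if line.startswith("- ") and isinstance(meta.get(current_key), list):
            # Continuation of a list key such as allowed-tools:
            meta[current_key].append(line[2:].strip())
        elif ":" in line:
            key, _, value = line.partition(":")
            current_key = key.strip()
            # An empty value introduces a list; otherwise store the string.
            meta[current_key] = value.strip() or []
    return meta, body.strip()
```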
### Request Flow
```
┌─────────────────────────────────────────────────────────────────────────┐
│ Request Flow Example │
│ User sends message to agent │
└─────────────────────────────────────────────────────────────────────────┘
1. Client → Nginx
POST /api/langgraph/threads/{thread_id}/runs
{"input": {"messages": [{"role": "user", "content": "Hello"}]}}
2. Nginx → LangGraph Server (2024)
Proxied to LangGraph server
3. LangGraph Server
a. Load/create thread state
b. Execute middleware chain:
- ThreadDataMiddleware: Set up paths
- UploadsMiddleware: Inject file list
- SandboxMiddleware: Acquire sandbox
- SummarizationMiddleware: Check token limits
- TitleMiddleware: Generate title if needed
- TodoListMiddleware: Load todos (if plan mode)
- ViewImageMiddleware: Process images
- ClarificationMiddleware: Check for clarifications
c. Execute agent:
- Model processes messages
- May call tools (bash, web_search, etc.)
- Tools execute via sandbox
- Results added to messages
d. Stream response via SSE
4. Client receives streaming response
```
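Step 3d streams the response as Server-Sent Events. As a minimal, stdlib-only sketch of what the client consumes, an SSE payload can be split into `(event, data)` pairs like this (a real client would use the LangGraph SDK's streaming helpers instead):

```python
# Minimal SSE parser sketch: splits a raw event stream into (event, data) pairs.
# Illustrative only -- real clients should use the LangGraph SDK.
import json

def parse_sse(raw: str) -> list[tuple[str, dict]]:
    events = []
    event_name, data_lines = None, []
    for line in raw.splitlines() + [""]:   # trailing "" flushes the last event
        if line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
        elif line == "" and event_name:
            events.append((event_name, json.loads("\n".join(data_lines))))
            event_name, data_lines = None, []
    return events

raw = (
    "event: values\n"
    'data: {"title": "Greeting"}\n'
    "\n"
    "event: messages\n"
    'data: {"content": "Hello"}\n'
    "\n"
)
events = parse_sse(raw)
```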
## Data Flow
### File Upload Flow
```
1. Client uploads file
POST /api/threads/{thread_id}/uploads
Content-Type: multipart/form-data
2. Gateway receives file
- Validates file
- Stores in .deer-flow/threads/{thread_id}/user-data/uploads/
- If document: converts to Markdown via markitdown
3. Returns response
{
"files": [{
"filename": "doc.pdf",
"path": ".deer-flow/.../uploads/doc.pdf",
"virtual_path": "/mnt/user-data/uploads/doc.pdf",
"artifact_url": "/api/threads/.../artifacts/mnt/.../doc.pdf"
}]
}
4. Next agent run
- UploadsMiddleware lists files
- Injects file list into messages
- Agent can access via virtual_path
```
### Thread Cleanup Flow
```
1. Client deletes conversation via LangGraph
DELETE /api/langgraph/threads/{thread_id}
2. Web UI follows up with Gateway cleanup
DELETE /api/threads/{thread_id}
3. Gateway removes local DeerFlow-managed files
- Deletes .deer-flow/threads/{thread_id}/ recursively
- Missing directories are treated as a no-op
- Invalid thread IDs are rejected before filesystem access
```
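The cleanup semantics above can be sketched as follows. The thread-ID pattern and directory layout here are illustrative assumptions, not the Gateway's actual validation rules:

```python
# Sketch of Gateway cleanup: validate the thread ID, then remove the thread
# directory, treating a missing directory as a no-op. Illustrative only.
import re
import shutil
import tempfile
from pathlib import Path

THREAD_ID_RE = re.compile(r"^[A-Za-z0-9_-]+$")  # assumed ID pattern

def cleanup_thread(base_dir: str, thread_id: str) -> bool:
    if not THREAD_ID_RE.fullmatch(thread_id):
        raise ValueError(f"invalid thread id: {thread_id!r}")  # reject before touching the FS
    thread_dir = Path(base_dir) / "threads" / thread_id
    if not thread_dir.exists():
        return False                        # missing directory: no-op
    shutil.rmtree(thread_dir)
    return True

# Demo: create a fake thread directory, delete it twice.
base = tempfile.mkdtemp()
(Path(base) / "threads" / "t1" / "user-data").mkdir(parents=True)
removed = cleanup_thread(base, "t1")
removed_again = cleanup_thread(base, "t1")
```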
### Configuration Reload
```
1. Client updates MCP config
PUT /api/mcp/config
2. Gateway writes extensions_config.json
- Updates mcpServers section
- File mtime changes
3. MCP Manager detects change
- get_cached_mcp_tools() checks mtime
- If changed: reinitializes MCP client
- Loads updated server configurations
4. Next agent run uses new tools
```
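The reload in steps 2-3 hinges on mtime-based cache invalidation. A minimal sketch of that pattern, with hypothetical names (the real logic lives in the MCP manager's `get_cached_mcp_tools()`):

```python
# Sketch of mtime-based config cache invalidation. Illustrative only --
# class and method names here are hypothetical.
import json
import os
import tempfile
import time

class ConfigCache:
    def __init__(self, path: str):
        self.path = path
        self._mtime: float | None = None
        self._config: dict | None = None

    def get(self) -> dict:
        mtime = os.path.getmtime(self.path)
        if self._config is None or mtime != self._mtime:
            with open(self.path) as f:
                self._config = json.load(f)   # reload on change
            self._mtime = mtime
        return self._config

# Demo: read a config file, change it on disk, read again.
tmp = tempfile.NamedTemporaryFile("w", suffix=".json", delete=False)
json.dump({"mcpServers": {}}, tmp)
tmp.close()
cache = ConfigCache(tmp.name)
first = cache.get()
with open(tmp.name, "w") as f:
    json.dump({"mcpServers": {"searx": {}}}, f)
os.utime(tmp.name, (time.time() + 10, time.time() + 10))  # force a newer mtime
second = cache.get()
os.unlink(tmp.name)
```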
## Security Considerations
### Sandbox Isolation
- Agent code executes within sandbox boundaries
- Local sandbox: Direct execution (development only)
- Docker sandbox: Container isolation (production recommended)
- Path traversal prevention in file operations
### API Security
- Thread isolation: Each thread has separate data directories
- File validation: Uploads checked for path safety
- Environment variable resolution: Secrets not stored in config
### MCP Security
- Each MCP server runs in its own process
- Environment variables resolved at runtime
- Servers can be enabled/disabled independently
## Performance Considerations
### Caching
- MCP tools cached with file mtime invalidation
- Configuration loaded once, reloaded on file change
- Skills parsed once at startup, cached in memory
### Streaming
- SSE used for real-time response streaming
- Reduces time to first token
- Enables progress visibility for long operations
### Context Management
- Summarization middleware reduces context when limits approached
- Configurable triggers: tokens, messages, or fraction
- Preserves recent messages while summarizing older ones

# Automatic Thread Title Generation
## Overview
Automatically generates a title for each conversation thread, triggered after the user's first question receives a reply.
## Implementation
`TitleMiddleware` does this in its `after_agent` hook:
1. Detect the first exchange (exactly 1 user message + 1 assistant reply)
2. Check whether the state already has a title
3. Call the LLM to generate a concise title (at most 6 words by default)
4. Store the title in `ThreadState`, where the checkpointer persists it
TitleMiddleware first normalizes structured block/list content in LangChain message content to plain text before assembling the title prompt, so raw Python/JSON reprs do not leak into the title-generation model.
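That normalization step can be sketched as below; the function name and block handling are illustrative, not the actual TitleMiddleware code:

```python
# Sketch of flattening LangChain message content (str or list of blocks)
# into plain text before building the title prompt. Illustrative only.
def normalize_content(content) -> str:
    """Flatten str-or-list-of-blocks message content into plain text."""
    if isinstance(content, str):
        return content
    parts = []
    for block in content:
        if isinstance(block, str):
            parts.append(block)
        elif isinstance(block, dict) and block.get("type") == "text":
            parts.append(block.get("text", ""))
        # non-text blocks (images, tool_use, ...) are dropped from the prompt
    return "\n".join(p for p in parts if p)
```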
## ⚠️ Important: Storage Mechanism
### Where the Title Is Stored
The title is stored in **`ThreadState.title`**, not in thread metadata:
```python
class ThreadState(AgentState):
sandbox: SandboxState | None = None
title: str | None = None # ✅ Title stored here
```
### Persistence
| Deployment | Persisted | Notes |
|---------|--------|------|
| **LangGraph Studio (local)** | ❌ No | In-memory only; lost on restart |
| **LangGraph Platform** | ✅ Yes | Automatically persisted to the database |
| **Custom + Checkpointer** | ✅ Yes | Requires a PostgreSQL/SQLite checkpointer |
### Enabling Persistence
To persist titles during local development as well, configure a checkpointer:
```python
# Create checkpointer.py next to langgraph.json
from langgraph.checkpoint.postgres import PostgresSaver
checkpointer = PostgresSaver.from_conn_string(
"postgresql://user:pass@localhost/dbname"
)
```
Then reference it in `langgraph.json`:
```json
{
"graphs": {
"lead_agent": "deerflow.agents:lead_agent"
},
"checkpointer": "checkpointer:checkpointer"
}
```
## Configuration
Add to `config.yaml` (optional):
```yaml
title:
enabled: true
max_words: 6
max_chars: 60
  model_name: null # Use the default model
```
Or configure it in code:
```python
from deerflow.config.title_config import TitleConfig, set_title_config
set_title_config(TitleConfig(
enabled=True,
max_words=8,
max_chars=80,
))
```
## Client Usage
### Reading the Thread Title
```typescript
// Option 1: read from the thread state
const state = await client.threads.getState(threadId);
const title = state.values.title || "New Conversation";
// Option 2: listen for stream events
for await (const chunk of client.runs.stream(threadId, assistantId, {
input: { messages: [{ role: "user", content: "Hello" }] }
})) {
if (chunk.event === "values" && chunk.data.title) {
console.log("Title:", chunk.data.title);
}
}
```
### Displaying the Title
```typescript
// Render titles in the conversation list
function ConversationList() {
const [threads, setThreads] = useState([]);
useEffect(() => {
async function loadThreads() {
const allThreads = await client.threads.list();
// Fetch each thread's state to read its title
const threadsWithTitles = await Promise.all(
allThreads.map(async (t) => {
const state = await client.threads.getState(t.thread_id);
return {
id: t.thread_id,
title: state.values.title || "New Conversation",
updatedAt: t.updated_at,
};
})
);
setThreads(threadsWithTitles);
}
loadThreads();
}, []);
return (
<ul>
{threads.map(thread => (
<li key={thread.id}>
<a href={`/chat/${thread.id}`}>{thread.title}</a>
</li>
))}
</ul>
);
}
```
## Workflow
```mermaid
sequenceDiagram
participant User
participant Client
participant LangGraph
participant TitleMiddleware
participant LLM
participant Checkpointer
User->>Client: Send first message
Client->>LangGraph: POST /threads/{id}/runs
LangGraph->>Agent: Process message
Agent-->>LangGraph: Return reply
LangGraph->>TitleMiddleware: after_agent()
TitleMiddleware->>TitleMiddleware: Check whether a title is needed
TitleMiddleware->>LLM: Generate title
LLM-->>TitleMiddleware: Return title
TitleMiddleware->>LangGraph: return {"title": "..."}
LangGraph->>Checkpointer: Save state (incl. title)
LangGraph-->>Client: Return response
Client->>Client: Read from state.values.title
```
## Benefits
✅ **Reliable persistence** - Uses LangGraph's state mechanism; persisted automatically
✅ **Fully backend-driven** - No extra client-side logic required
✅ **Automatic trigger** - Generated after the first exchange
✅ **Configurable** - Custom length, model, and more
✅ **Fault tolerant** - Falls back to a default strategy on failure
✅ **Architecturally consistent** - Matches the existing SandboxMiddleware pattern
## Caveats
1. **Different read location**: The title lives in `state.values.title`, not `thread.metadata.title`
2. **Performance**: Title generation adds roughly 0.5-1 s of latency; a faster model reduces this
3. **Concurrency safety**: The middleware runs after the agent finishes and does not block the main flow
4. **Fallback strategy**: If the LLM call fails, the first few words of the user message are used as the title
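The fallback strategy in point 4 can be sketched as follows (illustrative; the real middleware may truncate differently):

```python
# Sketch of the fallback title: first few words of the user message,
# clamped to the configured limits. Names are illustrative.
def fallback_title(user_message: str, max_words: int = 6, max_chars: int = 60) -> str:
    words = user_message.strip().split()
    title = " ".join(words[:max_words])
    if len(title) > max_chars:
        title = title[: max_chars - 1].rstrip() + "…"
    return title or "New Conversation"
```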
## Testing
```python
# Test title generation
import pytest
from deerflow.agents.title_middleware import TitleMiddleware
def test_title_generation():
    # TODO: add unit tests
    pass
```
## Troubleshooting
### Title is not generated
1. Check that the feature is enabled: `get_title_config().enabled == True`
2. Check the logs for "Generated thread title" or error messages
3. Confirm it is the first exchange: generation only triggers with exactly 1 user message and 1 assistant reply
### Title is generated but not visible to the client
1. Confirm the read location: read from `state.values.title`, not `thread.metadata.title`
2. Inspect the API response: confirm the state contains a title field
3. Re-fetch the state: `client.threads.getState(threadId)`
### Title is lost after restart
1. Check whether a checkpointer is configured (required for local development)
2. Confirm the deployment mode: LangGraph Platform persists automatically
3. Inspect the database: confirm the checkpointer is working
## Architecture
### Why State Instead of Metadata?
| Aspect | State | Metadata |
|------|-------|----------|
| **Persistence** | ✅ Automatic (via checkpointer) | ⚠️ Implementation-dependent |
| **Versioning** | ✅ Supports time travel | ❌ Not supported |
| **Type safety** | ✅ Defined as a TypedDict | ❌ Arbitrary dict |
| **Traceability** | ✅ Every update is recorded | ⚠️ Latest value only |
| **Standardization** | ✅ Core LangGraph mechanism | ⚠️ Extension feature |
### Implementation Details
```python
# Core TitleMiddleware logic
@override
def after_agent(self, state: TitleMiddlewareState, runtime: Runtime) -> dict | None:
    """Generate and set thread title after the first agent response."""
    if self._should_generate_title(state, runtime):
        title = self._generate_title(runtime)
        print(f"Generated thread title: {title}")
        # ✅ The returned state update is persisted automatically by the checkpointer
        return {"title": title}
    return None
```
## Related Files
- [`packages/harness/deerflow/agents/thread_state.py`](../packages/harness/deerflow/agents/thread_state.py) - ThreadState definition
- [`packages/harness/deerflow/agents/middlewares/title_middleware.py`](../packages/harness/deerflow/agents/middlewares/title_middleware.py) - TitleMiddleware implementation
- [`packages/harness/deerflow/config/title_config.py`](../packages/harness/deerflow/config/title_config.py) - Configuration management
- [`config.yaml`](../../config.example.yaml) - Configuration file
- [`packages/harness/deerflow/agents/lead_agent/agent.py`](../packages/harness/deerflow/agents/lead_agent/agent.py) - Middleware registration
## References
- [LangGraph checkpointer documentation](https://langchain-ai.github.io/langgraph/concepts/persistence/)
- [LangGraph state management](https://langchain-ai.github.io/langgraph/concepts/low_level/#state)
- [LangGraph middleware](https://langchain-ai.github.io/langgraph/concepts/middleware/)

# Configuration Guide
This guide explains how to configure DeerFlow for your environment.
## Config Versioning
`config.example.yaml` contains a `config_version` field that tracks schema changes. When the example version is higher than your local `config.yaml`, the application emits a startup warning:
```
WARNING - Your config.yaml (version 0) is outdated — the latest version is 1.
Run `make config-upgrade` to merge new fields into your config.
```
- **Missing `config_version`** in your config is treated as version 0.
- Run `make config-upgrade` to auto-merge missing fields (your existing values are preserved, a `.bak` backup is created).
- When changing the config schema, bump `config_version` in `config.example.yaml`.
## Configuration Sections
### Models
Configure the LLM models available to the agent:
```yaml
models:
- name: gpt-4 # Internal identifier
display_name: GPT-4 # Human-readable name
use: langchain_openai:ChatOpenAI # LangChain class path
model: gpt-4 # Model identifier for API
api_key: $OPENAI_API_KEY # API key (use env var)
max_tokens: 4096 # Max tokens per request
temperature: 0.7 # Sampling temperature
```
**Supported Providers**:
- OpenAI (`langchain_openai:ChatOpenAI`)
- Anthropic (`langchain_anthropic:ChatAnthropic`)
- DeepSeek (`langchain_deepseek:ChatDeepSeek`)
- Claude Code OAuth (`deerflow.models.claude_provider:ClaudeChatModel`)
- Codex CLI (`deerflow.models.openai_codex_provider:CodexChatModel`)
- Any LangChain-compatible provider
CLI-backed provider examples:
```yaml
models:
- name: gpt-5.4
display_name: GPT-5.4 (Codex CLI)
use: deerflow.models.openai_codex_provider:CodexChatModel
model: gpt-5.4
supports_thinking: true
supports_reasoning_effort: true
- name: claude-sonnet-4.6
display_name: Claude Sonnet 4.6 (Claude Code OAuth)
use: deerflow.models.claude_provider:ClaudeChatModel
model: claude-sonnet-4-6
max_tokens: 4096
supports_thinking: true
```
**Auth behavior for CLI-backed providers**:
- `CodexChatModel` loads Codex CLI auth from `~/.codex/auth.json`
- The Codex Responses endpoint currently rejects `max_tokens` and `max_output_tokens`, so `CodexChatModel` does not expose a request-level token cap
- `ClaudeChatModel` accepts `CLAUDE_CODE_OAUTH_TOKEN`, `ANTHROPIC_AUTH_TOKEN`, `CLAUDE_CODE_OAUTH_TOKEN_FILE_DESCRIPTOR`, `CLAUDE_CODE_CREDENTIALS_PATH`, or plaintext `~/.claude/.credentials.json`
- On macOS, DeerFlow does not probe Keychain automatically. Use `scripts/export_claude_code_oauth.py` to export Claude Code auth explicitly when needed
To use OpenAI's `/v1/responses` endpoint with LangChain, keep using `langchain_openai:ChatOpenAI` and set:
```yaml
models:
- name: gpt-5-responses
display_name: GPT-5 (Responses API)
use: langchain_openai:ChatOpenAI
model: gpt-5
api_key: $OPENAI_API_KEY
use_responses_api: true
output_version: responses/v1
```
For OpenAI-compatible gateways (for example Novita or OpenRouter), keep using `langchain_openai:ChatOpenAI` and set `base_url`:
```yaml
models:
- name: novita-deepseek-v3.2
display_name: Novita DeepSeek V3.2
use: langchain_openai:ChatOpenAI
model: deepseek/deepseek-v3.2
api_key: $NOVITA_API_KEY
base_url: https://api.novita.ai/openai
supports_thinking: true
when_thinking_enabled:
extra_body:
thinking:
type: enabled
- name: minimax-m2.5
display_name: MiniMax M2.5
use: langchain_openai:ChatOpenAI
model: MiniMax-M2.5
api_key: $MINIMAX_API_KEY
base_url: https://api.minimax.io/v1
max_tokens: 4096
temperature: 1.0 # MiniMax requires temperature in (0.0, 1.0]
supports_vision: true
- name: minimax-m2.5-highspeed
display_name: MiniMax M2.5 Highspeed
use: langchain_openai:ChatOpenAI
model: MiniMax-M2.5-highspeed
api_key: $MINIMAX_API_KEY
base_url: https://api.minimax.io/v1
max_tokens: 4096
temperature: 1.0 # MiniMax requires temperature in (0.0, 1.0]
supports_vision: true
- name: openrouter-gemini-2.5-flash
display_name: Gemini 2.5 Flash (OpenRouter)
use: langchain_openai:ChatOpenAI
model: google/gemini-2.5-flash-preview
api_key: $OPENAI_API_KEY
base_url: https://openrouter.ai/api/v1
```
If your OpenRouter key lives in a different environment variable name, point `api_key` at that variable explicitly (for example `api_key: $OPENROUTER_API_KEY`).
**Thinking Models**:
Some models support "thinking" mode for complex reasoning:
```yaml
models:
- name: deepseek-v3
supports_thinking: true
when_thinking_enabled:
extra_body:
thinking:
type: enabled
```
**Gemini with thinking via OpenAI-compatible gateway**:
When routing Gemini through an OpenAI-compatible proxy (Vertex AI OpenAI compat endpoint, AI Studio, or third-party gateways) with thinking enabled, the API attaches a `thought_signature` to each tool-call object returned in the response. Every subsequent request that replays those assistant messages **must** echo those signatures back on the tool-call entries or the API returns:
```
HTTP 400 INVALID_ARGUMENT: function call `<tool>` in the N. content block is
missing a `thought_signature`.
```
Standard `langchain_openai:ChatOpenAI` silently drops `thought_signature` when serialising messages. Use `deerflow.models.patched_openai:PatchedChatOpenAI` instead — it re-injects the tool-call signatures (sourced from `AIMessage.additional_kwargs["tool_calls"]`) into every outgoing payload:
```yaml
models:
- name: gemini-2.5-pro-thinking
display_name: Gemini 2.5 Pro (Thinking)
use: deerflow.models.patched_openai:PatchedChatOpenAI
model: google/gemini-2.5-pro-preview # model name as expected by your gateway
api_key: $GEMINI_API_KEY
base_url: https://<your-openai-compat-gateway>/v1
max_tokens: 16384
supports_thinking: true
supports_vision: true
when_thinking_enabled:
extra_body:
thinking:
type: enabled
```
For Gemini accessed **without** thinking (e.g. via OpenRouter where thinking is not activated), the plain `langchain_openai:ChatOpenAI` with `supports_thinking: false` is sufficient and no patch is needed.
### Tool Groups
Organize tools into logical groups:
```yaml
tool_groups:
- name: web # Web browsing and search
- name: file:read # Read-only file operations
- name: file:write # Write file operations
- name: bash # Shell command execution
```
### Tools
Configure specific tools available to the agent:
```yaml
tools:
- name: web_search
group: web
use: deerflow.community.tavily.tools:web_search_tool
max_results: 5
# api_key: $TAVILY_API_KEY # Optional
```
**Built-in Tools**:
- `web_search` - Search the web (DuckDuckGo, Tavily, Exa, InfoQuest, Firecrawl)
- `web_fetch` - Fetch web pages (Jina AI, Exa, InfoQuest, Firecrawl)
- `ls` - List directory contents
- `read_file` - Read file contents
- `write_file` - Write file contents
- `str_replace` - String replacement in files
- `bash` - Execute bash commands
### Sandbox
DeerFlow supports multiple sandbox execution modes. Configure your preferred mode in `config.yaml`:
**Local Execution** (runs sandbox code directly on the host machine):
```yaml
sandbox:
use: deerflow.sandbox.local:LocalSandboxProvider # Local execution
allow_host_bash: false # default; host bash is disabled unless explicitly re-enabled
```
**Docker Execution** (runs sandbox code in isolated Docker containers):
```yaml
sandbox:
use: deerflow.community.aio_sandbox:AioSandboxProvider # Docker-based sandbox
```
**Docker Execution with Kubernetes** (runs sandbox code in Kubernetes pods via provisioner service):
This mode runs each sandbox in an isolated Kubernetes Pod on your **host machine's cluster**. Requires Docker Desktop K8s, OrbStack, or similar local K8s setup.
```yaml
sandbox:
use: deerflow.community.aio_sandbox:AioSandboxProvider
provisioner_url: http://provisioner:8002
```
When using Docker development (`make docker-start`), DeerFlow starts the `provisioner` service only if this provisioner mode is configured. In local or plain Docker sandbox modes, `provisioner` is skipped.
See [Provisioner Setup Guide](../../docker/provisioner/README.md) for detailed configuration, prerequisites, and troubleshooting.
In more detail, choose between local execution and Docker-based isolation:
**Option 1: Local Sandbox** (default, simpler setup):
```yaml
sandbox:
use: deerflow.sandbox.local:LocalSandboxProvider
allow_host_bash: false
```
`allow_host_bash` is intentionally `false` by default. DeerFlow's local sandbox is a host-side convenience mode, not a secure shell isolation boundary. If you need `bash`, prefer `AioSandboxProvider`. Only set `allow_host_bash: true` for fully trusted single-user local workflows.
**Option 2: Docker Sandbox** (isolated, more secure):
```yaml
sandbox:
use: deerflow.community.aio_sandbox:AioSandboxProvider
port: 8080
auto_start: true
container_prefix: deer-flow-sandbox
# Optional: Additional mounts
mounts:
- host_path: /path/on/host
container_path: /path/in/container
read_only: false
```
When you configure `sandbox.mounts`, DeerFlow exposes those `container_path` values in the agent prompt so the agent can discover and operate on mounted directories directly instead of assuming everything must live under `/mnt/user-data`.
### Skills
Configure the skills directory for specialized workflows:
```yaml
skills:
# Host path (optional, default: ../skills)
path: /custom/path/to/skills
# Container mount path (default: /mnt/skills)
container_path: /mnt/skills
```
**How Skills Work**:
- Skills are stored in `deer-flow/skills/{public,custom}/`
- Each skill has a `SKILL.md` file with metadata
- Skills are automatically discovered and loaded
- Available in both local and Docker sandbox via path mapping
**Per-Agent Skill Filtering**:
Custom agents can restrict which skills they load by defining a `skills` field in their `config.yaml` (located at `workspace/agents/<agent_name>/config.yaml`):
- **Omitted or `null`**: Loads all globally enabled skills (default fallback).
- **`[]` (empty list)**: Disables all skills for this specific agent.
- **`["skill-name"]`**: Loads only the explicitly specified skills.
### Title Generation
Automatic conversation title generation:
```yaml
title:
enabled: true
max_words: 6
max_chars: 60
model_name: null # Use first model in list
```
### GitHub API Token (Optional for GitHub Deep Research Skill)
The default GitHub API rate limits are quite restrictive. For frequent project research, we recommend configuring a personal access token (PAT) with read-only permissions.
**Configuration Steps**:
1. Uncomment the `GITHUB_TOKEN` line in the `.env` file and add your personal access token
2. Restart the DeerFlow service to apply changes
## Environment Variables
DeerFlow supports environment variable substitution using the `$` prefix:
```yaml
models:
- api_key: $OPENAI_API_KEY # Reads from environment
```
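A minimal sketch of this `$`-prefix substitution, assuming a simple "whole value starts with `$`" rule (the real resolver may support richer syntax):

```python
# Sketch of `$`-prefixed environment-variable substitution in config values.
# Illustrative only -- the real resolver lives in DeerFlow's config loader.
import os

def resolve_env(value):
    if isinstance(value, str) and value.startswith("$"):
        return os.environ.get(value[1:], "")
    if isinstance(value, dict):
        return {k: resolve_env(v) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve_env(v) for v in value]
    return value

os.environ["OPENAI_API_KEY"] = "sk-test"
model = resolve_env({"name": "gpt-4", "api_key": "$OPENAI_API_KEY"})
```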
**Common Environment Variables**:
- `OPENAI_API_KEY` - OpenAI API key
- `ANTHROPIC_API_KEY` - Anthropic API key
- `DEEPSEEK_API_KEY` - DeepSeek API key
- `NOVITA_API_KEY` - Novita API key (OpenAI-compatible endpoint)
- `TAVILY_API_KEY` - Tavily search API key
- `DEER_FLOW_CONFIG_PATH` - Custom config file path
## Configuration Location
The configuration file should be placed in the **project root directory** (`deer-flow/config.yaml`), not in the backend directory.
## Configuration Priority
DeerFlow searches for configuration in this order:
1. Path specified in code via `config_path` argument
2. Path from `DEER_FLOW_CONFIG_PATH` environment variable
3. `config.yaml` in current working directory (typically `backend/` when running)
4. `config.yaml` in parent directory (project root: `deer-flow/`)
## Best Practices
1. **Place `config.yaml` in project root** - Not in `backend/` directory
2. **Never commit `config.yaml`** - It's already in `.gitignore`
3. **Use environment variables for secrets** - Don't hardcode API keys
4. **Keep `config.example.yaml` updated** - Document all new options
5. **Test configuration changes locally** - Before deploying
6. **Use Docker sandbox for production** - Better isolation and security
## Troubleshooting
### "Config file not found"
- Ensure `config.yaml` exists in the **project root** directory (`deer-flow/config.yaml`)
- The backend searches the parent directory by default, so the root location is preferred
- Alternatively, set `DEER_FLOW_CONFIG_PATH` environment variable to custom location
### "Invalid API key"
- Verify environment variables are set correctly
- Check that `$` prefix is used for env var references
### "Skills not loading"
- Check that `deer-flow/skills/` directory exists
- Verify skills have valid `SKILL.md` files
- Check `skills.path` configuration if using custom path
### "Docker sandbox fails to start"
- Ensure Docker is running
- Check port 8080 (or configured port) is available
- Verify Docker image is accessible
## Examples
See `config.example.yaml` for complete examples of all configuration options.

# File Uploads
## Overview
The DeerFlow backend provides full file-upload support: multiple files per request, with automatic conversion of Office documents and PDFs to Markdown.
## Features
- ✅ Multiple files per upload
- ✅ Automatic document-to-Markdown conversion (PDF, PPT, Excel, Word)
- ✅ Files stored in thread-isolated directories
- ✅ The agent is automatically aware of uploaded files
- ✅ File listing and deletion
## API Endpoints
### 1. Upload Files
```
POST /api/threads/{thread_id}/uploads
```
**Request body:** `multipart/form-data`
- `files`: one or more files
**Response:**
```json
{
"success": true,
"files": [
{
"filename": "document.pdf",
"size": 1234567,
"path": ".deer-flow/threads/{thread_id}/user-data/uploads/document.pdf",
"virtual_path": "/mnt/user-data/uploads/document.pdf",
"artifact_url": "/api/threads/{thread_id}/artifacts/mnt/user-data/uploads/document.pdf",
"markdown_file": "document.md",
"markdown_path": ".deer-flow/threads/{thread_id}/user-data/uploads/document.md",
"markdown_virtual_path": "/mnt/user-data/uploads/document.md",
"markdown_artifact_url": "/api/threads/{thread_id}/artifacts/mnt/user-data/uploads/document.md"
}
],
"message": "Successfully uploaded 1 file(s)"
}
```
**Path fields:**
- `path`: actual filesystem path (relative to the `backend/` directory)
- `virtual_path`: virtual path the agent uses inside the sandbox
- `artifact_url`: URL the frontend uses to fetch the file over HTTP
### 2. List Uploaded Files
```
GET /api/threads/{thread_id}/uploads/list
```
**Response:**
```json
{
"files": [
{
"filename": "document.pdf",
"size": 1234567,
"path": ".deer-flow/threads/{thread_id}/user-data/uploads/document.pdf",
"virtual_path": "/mnt/user-data/uploads/document.pdf",
"artifact_url": "/api/threads/{thread_id}/artifacts/mnt/user-data/uploads/document.pdf",
"extension": ".pdf",
"modified": 1705997600.0
}
],
"count": 1
}
```
### 3. Delete a File
```
DELETE /api/threads/{thread_id}/uploads/{filename}
```
**Response:**
```json
{
"success": true,
"message": "Deleted document.pdf"
}
```
## Supported Document Formats
The following formats are automatically converted to Markdown:
- PDF (`.pdf`)
- PowerPoint (`.ppt`, `.pptx`)
- Excel (`.xls`, `.xlsx`)
- Word (`.doc`, `.docx`)
The converted Markdown file is saved in the same directory, named after the original file with a `.md` extension.
## Agent Integration
### Automatic File Listing
On every request, the agent automatically receives a list of uploaded files in the following format:
```xml
<uploaded_files>
The following files have been uploaded and are available for use:
- document.pdf (1.2 MB)
Path: /mnt/user-data/uploads/document.pdf
- document.md (45.3 KB)
Path: /mnt/user-data/uploads/document.md
You can read these files using the `read_file` tool with the paths shown above.
</uploaded_files>
```
### Using Uploaded Files
The agent runs in a sandbox and accesses files via virtual paths. It can read uploaded files directly with the `read_file` tool:
```python
# Read the original PDF (if supported)
read_file(path="/mnt/user-data/uploads/document.pdf")
# Read the converted Markdown (recommended)
read_file(path="/mnt/user-data/uploads/document.md")
```
**Path mapping:**
- Agent uses: `/mnt/user-data/uploads/document.pdf` (virtual path)
- Actual storage: `backend/.deer-flow/threads/{thread_id}/user-data/uploads/document.pdf`
- Frontend access: `/api/threads/{thread_id}/artifacts/mnt/user-data/uploads/document.pdf` (HTTP URL)
The upload flow is "thread directory first":
- Files are first written to `backend/.deer-flow/threads/{thread_id}/user-data/uploads/` as the authoritative store
- The local sandbox (`sandbox_id=local`) uses the thread directory contents directly
- Non-local sandboxes additionally sync to `/mnt/user-data/uploads/*` so the files are visible at runtime
## Testing Examples
### Testing with curl
```bash
# 1. Upload a single file
curl -X POST http://localhost:2026/api/threads/test-thread/uploads \
  -F "files=@/path/to/document.pdf"
# 2. Upload multiple files
curl -X POST http://localhost:2026/api/threads/test-thread/uploads \
  -F "files=@/path/to/document.pdf" \
  -F "files=@/path/to/presentation.pptx" \
  -F "files=@/path/to/spreadsheet.xlsx"
# 3. List uploaded files
curl http://localhost:2026/api/threads/test-thread/uploads/list
# 4. Delete a file
curl -X DELETE http://localhost:2026/api/threads/test-thread/uploads/document.pdf
```
### Testing with Python
```python
import requests
thread_id = "test-thread"
base_url = "http://localhost:2026"
# Upload files
files = [
("files", open("document.pdf", "rb")),
("files", open("presentation.pptx", "rb")),
]
response = requests.post(
f"{base_url}/api/threads/{thread_id}/uploads",
files=files
)
print(response.json())
# List files
response = requests.get(f"{base_url}/api/threads/{thread_id}/uploads/list")
print(response.json())
# Delete a file
response = requests.delete(
f"{base_url}/api/threads/{thread_id}/uploads/document.pdf"
)
print(response.json())
```
## File Storage Layout
```
backend/.deer-flow/threads/
└── {thread_id}/
    └── user-data/
        └── uploads/
            ├── document.pdf       # Original file
            ├── document.md        # Converted Markdown
            ├── presentation.pptx
            ├── presentation.md
            └── ...
```
## Limits
- Maximum file size: 100 MB (configurable via `client_max_body_size` in nginx.conf)
- Filename safety: file paths are validated automatically to prevent directory traversal attacks
- Thread isolation: each thread's uploads are isolated and cannot be accessed across threads
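The traversal check can be sketched as follows (illustrative; the real validation lives in the uploads router and may be stricter):

```python
# Sketch of an upload filename check that blocks directory traversal:
# the resolved target must sit directly inside the uploads directory.
# Illustrative only.
import tempfile
from pathlib import Path

def safe_upload_path(uploads_dir: str, filename: str) -> Path:
    base = Path(uploads_dir).resolve()
    target = (base / filename).resolve()
    if target.parent != base or target.name != filename:
        raise ValueError(f"unsafe filename: {filename!r}")
    return target

# Demo against a temporary uploads directory.
uploads = tempfile.mkdtemp()
ok = safe_upload_path(uploads, "doc.pdf")
```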
## Implementation
### Components
1. **Upload Router** (`app/gateway/routers/uploads.py`)
   - Handles upload, list, and delete requests
   - Converts documents via markitdown
2. **Uploads Middleware** (`packages/harness/deerflow/agents/middlewares/uploads_middleware.py`)
   - Injects the file list before every agent request
   - Generates the formatted file-list message automatically
3. **Nginx configuration** (`nginx.conf`)
   - Routes upload requests to the Gateway API
   - Configures large-file upload support
### Dependencies
- `markitdown>=0.0.1a2` - document conversion
- `python-multipart>=0.0.20` - file upload handling
## Troubleshooting
### Uploads fail
1. Check whether the file exceeds the size limit
2. Check that the Gateway API is running
3. Check that there is enough disk space
4. Inspect the Gateway logs: `make gateway`
### Document conversion fails
1. Check that markitdown is installed: `uv run python -c "import markitdown"`
2. Look for the specific error in the logs
3. Some corrupted or encrypted documents cannot be converted; the original file is still saved
### The agent cannot see uploaded files
1. Confirm UploadsMiddleware is registered in agent.py
2. Check that the thread_id is correct
3. Confirm the files actually landed in `backend/.deer-flow/threads/{thread_id}/user-data/uploads/`
4. For non-local sandboxes, confirm the upload endpoint returned no errors (the sandbox sync must complete successfully)
## Development Notes
### Frontend Integration
```typescript
// Example: upload files
async function uploadFiles(threadId: string, files: File[]) {
const formData = new FormData();
files.forEach(file => {
formData.append('files', file);
});
const response = await fetch(
`/api/threads/${threadId}/uploads`,
{
method: 'POST',
body: formData,
}
);
return response.json();
}
// List files
async function listFiles(threadId: string) {
const response = await fetch(
`/api/threads/${threadId}/uploads/list`
);
return response.json();
}
```
### Possible Extensions
1. **File preview**: add a preview endpoint for viewing files directly in the browser
2. **Bulk delete**: delete multiple files in one request
3. **File search**: search by filename or type
4. **Versioning**: keep multiple versions of a file
5. **Archive support**: automatically extract zip files
6. **Image OCR**: run OCR on uploaded images

# Guardrails: Pre-Tool-Call Authorization
> **Context:** [Issue #1213](https://github.com/bytedance/deer-flow/issues/1213) — DeerFlow has Docker sandboxing and human approval via `ask_clarification`, but no deterministic, policy-driven authorization layer for tool calls. An agent running autonomous multi-step tasks can execute any loaded tool with any arguments. Guardrails add a middleware that evaluates every tool call against a policy **before** execution.
## Why Guardrails
```
Without guardrails: With guardrails:
Agent Agent
│ │
▼ ▼
┌──────────┐ ┌──────────┐
│ bash │──▶ executes immediately │ bash │──▶ GuardrailMiddleware
│ rm -rf / │ │ rm -rf / │ │
└──────────┘ └──────────┘ ▼
┌──────────────┐
│ Provider │
│ evaluates │
│ against │
│ policy │
└──────┬───────┘
┌─────┴─────┐
│ │
ALLOW DENY
│ │
▼ ▼
Tool runs Agent sees:
normally "Guardrail denied:
rm -rf blocked"
```
- **Sandboxing** provides process isolation but not semantic authorization. A sandboxed `bash` can still `curl` data out.
- **Human approval** (`ask_clarification`) requires a human in the loop for every action. Not viable for autonomous workflows.
- **Guardrails** provide deterministic, policy-driven authorization that works without human intervention.
## Architecture
```
┌─────────────────────────────────────────────────────────────────────┐
│ Middleware Chain │
│ │
│ 1. ThreadDataMiddleware ─── per-thread dirs │
│ 2. UploadsMiddleware ─── file upload tracking │
│ 3. SandboxMiddleware ─── sandbox acquisition │
│ 4. DanglingToolCallMiddleware ── fix incomplete tool calls │
│ 5. GuardrailMiddleware ◄──── EVALUATES EVERY TOOL CALL │
│ 6. ToolErrorHandlingMiddleware ── convert exceptions to messages │
│ 7-12. (Summarization, Title, Memory, Vision, Subagent, Clarify) │
│ │
└─────────────────────────────────────────────────────────────────────┘
┌──────────────────────────┐
│ GuardrailProvider │ ◄── pluggable: any class
│ (configured in YAML) │ with evaluate/aevaluate
└────────────┬─────────────┘
┌─────────┼──────────────┐
│ │ │
▼ ▼ ▼
Built-in OAP Passport Custom
Allowlist Provider Provider
(zero dep) (open standard) (your code)
Any implementation
(e.g. APort, or
your own evaluator)
```
The `GuardrailMiddleware` implements `wrap_tool_call` / `awrap_tool_call` (the same `AgentMiddleware` pattern used by `ToolErrorHandlingMiddleware`). It:
1. Builds a `GuardrailRequest` with tool name, arguments, and passport reference
2. Calls `provider.evaluate(request)` on whatever provider is configured
3. If **deny**: returns `ToolMessage(status="error")` with the reason -- agent sees the denial and adapts
4. If **allow**: passes through to the actual tool handler
5. If **provider error** and `fail_closed=true` (default): blocks the call
6. `GraphBubbleUp` exceptions (LangGraph control signals) are always propagated, never caught
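Steps 1-6 can be sketched as follows; the class and field names here are approximations, not the real `GuardrailMiddleware` API:

```python
# Sketch of the guardrail wrap-tool-call flow: evaluate first, then deny,
# allow, or fail closed on provider errors. Illustrative only.
from dataclasses import dataclass

@dataclass
class Decision:
    allow: bool
    reason: str = ""

def wrap_tool_call(provider, tool_name, tool_args, handler, fail_closed=True):
    try:
        decision = provider.evaluate(tool_name, tool_args)
    except Exception as exc:                       # provider error
        if fail_closed:
            return {"status": "error", "content": f"Guardrail provider error: {exc}"}
        decision = Decision(allow=True)
    if not decision.allow:                         # deny: the agent sees the reason
        return {"status": "error", "content": f"Guardrail denied: {decision.reason}"}
    return handler(tool_args)                      # allow: run the tool normally

class DenyBash:
    """Toy provider that blocks bash and allows everything else."""
    def evaluate(self, tool_name, tool_args):
        if tool_name == "bash":
            return Decision(False, "tool 'bash' was blocked (oap.tool_not_allowed)")
        return Decision(True)

result = wrap_tool_call(DenyBash(), "bash", {"cmd": "rm -rf /"}, lambda args: {"status": "ok"})
allowed = wrap_tool_call(DenyBash(), "ls", {"path": "/"}, lambda args: {"status": "ok"})
```

Note that exceptions raised by `handler` itself (including LangGraph's `GraphBubbleUp` control signals) propagate unchanged, since only the provider call sits inside the `try`.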
## Three Provider Options
### Option 1: Built-in AllowlistProvider (Zero Dependencies)
The simplest option. Ships with DeerFlow. Block or allow tools by name. No external packages, no passport, no network.
**config.yaml:**
```yaml
guardrails:
enabled: true
provider:
use: deerflow.guardrails.builtin:AllowlistProvider
config:
denied_tools: ["bash", "write_file"]
```
This blocks `bash` and `write_file` for all requests. All other tools pass through.
You can also use an allowlist (only these tools are permitted):
```yaml
guardrails:
enabled: true
provider:
use: deerflow.guardrails.builtin:AllowlistProvider
config:
allowed_tools: ["web_search", "read_file", "ls"]
```
**Try it:**
1. Add the config above to your `config.yaml`
2. Start DeerFlow: `make dev`
3. Ask the agent: "Use bash to run echo hello"
4. The agent sees: `Guardrail denied: tool 'bash' was blocked (oap.tool_not_allowed)`
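The deny/allow semantics of the two configs above can be sketched as a single predicate (illustrative; the real `AllowlistProvider` also returns an OAP-style reason code such as `oap.tool_not_allowed`):

```python
# Sketch of AllowlistProvider semantics: denied_tools blocks by name;
# allowed_tools, when set, permits only the listed tools. Illustrative only.
def allowlist_decision(tool_name, denied_tools=(), allowed_tools=None):
    if tool_name in denied_tools:
        return False
    if allowed_tools is not None and tool_name not in allowed_tools:
        return False
    return True
```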
### Option 2: OAP Passport Provider (Policy-Based)
For policy enforcement based on the [Open Agent Passport (OAP)](https://github.com/aporthq/aport-spec) open standard. An OAP passport is a JSON document that declares an agent's identity, capabilities, and operational limits. Any provider that reads an OAP passport and returns OAP-compliant decisions works with DeerFlow.
```
┌─────────────────────────────────────────────────────────────┐
│ OAP Passport (JSON) │
│ (open standard, any provider) │
│ { │
│ "spec_version": "oap/1.0", │
│ "status": "active", │
│ "capabilities": [ │
│ {"id": "system.command.execute"}, │
│ {"id": "data.file.read"}, │
│ {"id": "data.file.write"}, │
│ {"id": "web.fetch"}, │
│ {"id": "mcp.tool.execute"} │
│ ], │
│ "limits": { │
│ "system.command.execute": { │
│ "allowed_commands": ["git", "npm", "node", "ls"], │
│ "blocked_patterns": ["rm -rf", "sudo", "chmod 777"] │
│ } │
│ } │
│ } │
└──────────────────────────┬──────────────────────────────────┘
Any OAP-compliant provider
┌────────────────┼────────────────┐
│ │ │
Your own APort (ref. Other future
evaluator implementation) implementations
```
**Creating a passport manually:**
An OAP passport is just a JSON file. You can create one by hand following the [OAP specification](https://github.com/aporthq/aport-spec/blob/main/oap/oap-spec.md) and validate it against the [JSON schema](https://github.com/aporthq/aport-spec/blob/main/oap/passport-schema.json). See the [examples](https://github.com/aporthq/aport-spec/tree/main/oap/examples) directory for templates.
**Using APort as a reference implementation:**
[APort Agent Guardrails](https://github.com/aporthq/aport-agent-guardrails) is one open-source (Apache 2.0) implementation of an OAP provider. It handles passport creation, local evaluation, and optional hosted API evaluation.
```bash
pip install aport-agent-guardrails
aport setup --framework deerflow
```
This creates:
- `~/.aport/deerflow/config.yaml` -- evaluator config (local or API mode)
- `~/.aport/deerflow/aport/passport.json` -- OAP passport with capabilities and limits
**config.yaml (using APort as the provider):**
```yaml
guardrails:
  enabled: true
  provider:
    use: aport_guardrails.providers.generic:OAPGuardrailProvider
```
**config.yaml (using your own OAP provider):**
```yaml
guardrails:
  enabled: true
  provider:
    use: my_oap_provider:MyOAPProvider
    config:
      passport_path: ./my-passport.json
```
Any provider that accepts `framework` as a kwarg and implements `evaluate`/`aevaluate` works. The OAP standard defines the passport format and decision codes; DeerFlow doesn't care which provider reads them.
**What the passport controls:**
| Passport field | What it does | Example |
|---|---|---|
| `capabilities[].id` | Which tool categories the agent can use | `system.command.execute`, `data.file.write` |
| `limits.*.allowed_commands` | Which commands are allowed | `["git", "npm", "node"]` or `["*"]` for all |
| `limits.*.blocked_patterns` | Patterns always denied | `["rm -rf", "sudo", "chmod 777"]` |
| `status` | Kill switch | `active`, `suspended`, `revoked` |
**Evaluation modes (provider-dependent):**
OAP providers may support different evaluation modes. For example, the APort reference implementation supports:
| Mode | How it works | Network | Latency |
|---|---|---|---|
| **Local** | Evaluates passport locally (bash script). | None | ~300ms |
| **API** | Sends passport + context to a hosted evaluator. Signed decisions. | Yes | ~65ms |
A custom OAP provider can implement any evaluation strategy -- the DeerFlow middleware doesn't care how the provider reaches its decision.
**Try it:**
1. Install and set up as above
2. Start DeerFlow and ask: "Create a file called test.txt with content hello"
3. Then ask: "Now delete it using bash rm -rf"
4. Guardrail blocks it: `oap.blocked_pattern: Command contains blocked pattern: rm -rf`
### Option 3: Custom Provider (Bring Your Own)
Any Python class with `evaluate(request)` and `aevaluate(request)` methods works. No base class or inheritance needed -- it's a structural protocol.
```python
# my_guardrail.py
class MyGuardrailProvider:
    name = "my-company"

    def evaluate(self, request):
        from deerflow.guardrails.provider import GuardrailDecision, GuardrailReason

        # Example: block any bash command containing "delete"
        if request.tool_name == "bash" and "delete" in str(request.tool_input):
            return GuardrailDecision(
                allow=False,
                reasons=[GuardrailReason(code="custom.blocked", message="delete not allowed")],
                policy_id="custom.v1",
            )
        return GuardrailDecision(allow=True, reasons=[GuardrailReason(code="oap.allowed")])

    async def aevaluate(self, request):
        return self.evaluate(request)
```
**config.yaml:**
```yaml
guardrails:
  enabled: true
  provider:
    use: my_guardrail:MyGuardrailProvider
```
Make sure `my_guardrail.py` is on the Python path (e.g. in the backend directory or installed as a package).
**Try it:**
1. Create `my_guardrail.py` in the backend directory
2. Add the config
3. Start DeerFlow and ask: "Use bash to delete test.txt"
4. Your provider blocks it
## Implementing a Provider
### Required Interface
```
┌──────────────────────────────────────────────────┐
│ GuardrailProvider Protocol │
│ │
│ name: str │
│ │
│ evaluate(request: GuardrailRequest) │
│ -> GuardrailDecision │
│ │
│ aevaluate(request: GuardrailRequest) (async) │
│ -> GuardrailDecision │
└──────────────────────────────────────────────────┘
┌──────────────────────────┐ ┌──────────────────────────┐
│ GuardrailRequest │ │ GuardrailDecision │
│ │ │ │
│ tool_name: str │ │ allow: bool │
│ tool_input: dict │ │ reasons: [GuardrailReason]│
│ agent_id: str | None │ │ policy_id: str | None │
│ thread_id: str | None │ │ metadata: dict │
│ is_subagent: bool │ │ │
│ timestamp: str │ │ GuardrailReason: │
│ │ │ code: str │
└──────────────────────────┘ │ message: str │
└──────────────────────────┘
```
### DeerFlow Tool Names
These are the tool names your provider will see in `request.tool_name`:
| Tool | What it does |
|---|---|
| `bash` | Shell command execution |
| `write_file` | Create/overwrite a file |
| `str_replace` | Edit a file (find and replace) |
| `read_file` | Read file content |
| `ls` | List directory |
| `web_search` | Web search query |
| `web_fetch` | Fetch URL content |
| `image_search` | Image search |
| `present_file` | Present file to user |
| `view_image` | Display image |
| `ask_clarification` | Ask user a question |
| `task` | Delegate to subagent |
| `mcp__*` | MCP tools (dynamic) |
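A provider typically branches on these names; because MCP tools arrive with a dynamic `mcp__` prefix, prefix matching is the usual pattern. A sketch (the category names are illustrative, not part of DeerFlow):

```python
def classify_tool(tool_name: str) -> str:
    """Map a DeerFlow tool name to a coarse policy category (illustrative)."""
    if tool_name.startswith("mcp__"):
        return "mcp"          # dynamic MCP tools
    if tool_name == "bash":
        return "execute"      # shell execution
    if tool_name in {"write_file", "str_replace"}:
        return "write"        # filesystem writes
    if tool_name in {"read_file", "ls"}:
        return "read"         # filesystem reads
    if tool_name in {"web_search", "web_fetch", "image_search"}:
        return "network"      # outbound network access
    return "ui"               # present_file, view_image, ask_clarification, task
```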
### OAP Reason Codes
Standard codes used by the [OAP specification](https://github.com/aporthq/aport-spec):
| Code | Meaning |
|---|---|
| `oap.allowed` | Tool call authorized |
| `oap.tool_not_allowed` | Tool not in allowlist |
| `oap.command_not_allowed` | Command not in allowed_commands |
| `oap.blocked_pattern` | Command matches a blocked pattern |
| `oap.limit_exceeded` | Operation exceeds a limit |
| `oap.passport_suspended` | Passport status is suspended/revoked |
| `oap.evaluator_error` | Provider crashed (fail-closed) |
### Provider Loading
DeerFlow loads providers via `resolve_variable()` -- the same mechanism used for models, tools, and sandbox providers. The `use:` field is a Python class path: `package.module:ClassName`.
The provider is instantiated with `**config` kwargs if `config:` is set, plus `framework="deerflow"` is always injected. Accept `**kwargs` to stay forward-compatible:
```python
class YourProvider:
    def __init__(self, framework: str = "generic", **kwargs):
        # framework="deerflow" tells you which config dir to use
        ...
```
## Configuration Reference
```yaml
guardrails:
  # Enable/disable guardrail middleware (default: false)
  enabled: true
  # Block tool calls if provider raises an exception (default: true)
  fail_closed: true
  # Passport reference -- passed as request.agent_id to the provider.
  # File path, hosted agent ID, or null (provider resolves from its config).
  passport: null
  # Provider: loaded by class path via resolve_variable
  provider:
    use: deerflow.guardrails.builtin:AllowlistProvider
    config:  # optional kwargs passed to provider.__init__
      denied_tools: ["bash"]
```
## Testing
```bash
cd backend
uv run python -m pytest tests/test_guardrail_middleware.py -v
```
25 tests covering:
- AllowlistProvider: allow, deny, both allowlist+denylist, async
- GuardrailMiddleware: allow passthrough, deny with OAP codes, fail-closed, fail-open, passport forwarding, empty reasons fallback, empty tool name, protocol isinstance check
- Async paths: awrap_tool_call for allow, deny, fail-closed, fail-open
- GraphBubbleUp: LangGraph control signals propagate through (not caught)
- Config: defaults, from_dict, singleton load/reset
## Files
```
packages/harness/deerflow/guardrails/
  __init__.py       # Public exports
  provider.py       # GuardrailProvider protocol, GuardrailRequest, GuardrailDecision
  middleware.py     # GuardrailMiddleware (AgentMiddleware subclass)
  builtin.py        # AllowlistProvider (zero deps)
packages/harness/deerflow/config/
  guardrails_config.py              # GuardrailsConfig Pydantic model + singleton
packages/harness/deerflow/agents/middlewares/
  tool_error_handling_middleware.py # Registers GuardrailMiddleware in chain
config.example.yaml                 # Three provider options documented
tests/test_guardrail_middleware.py  # 25 tests
docs/GUARDRAILS.md                  # This file
```
# DeerFlow Backend Split Design Doc (Harness + App)
> Status: Draft
> Author: DeerFlow Team
> Date: 2026-03-13
## 1. Background and Motivation
The DeerFlow backend is currently a single Python package (`src.*`) containing everything from low-level agent orchestration to the user-facing product. As the project grows, this structure causes several problems:
- **Hard to reuse**: other products (CLI tools, Slack bots, third-party integrations) that want the agent capabilities must depend on the entire backend, including FastAPI, IM SDKs, and other dependencies they do not need
- **Blurred responsibilities**: agent orchestration logic and user-product logic are mixed under the same `src/`, with no clear boundary
- **Dependency bloat**: the LangGraph Server runtime does not need FastAPI/uvicorn/Slack SDK, yet today all dependencies must be installed
This document proposes splitting the backend into two parts: **deerflow-harness** (a publishable agent framework package) and **app** (unpackaged user-product code).
## 2. Core Concepts
### 2.1 Harness (Framework Layer)
The harness is the agent construction and orchestration framework. It answers the question **"how to build and run agents"**:
- Agent factory and lifecycle management
- Middleware pipeline
- Tool system (built-in tools + MCP + community tools)
- Sandboxed execution environment
- Subagent delegation
- Memory system
- Skill loading and injection
- Model factory
- Configuration system
**The harness is a publishable Python package** (`deerflow-harness`) that can be installed and used independently.
**Harness design principle**: it is completely unaware of the application layer. It neither knows nor cares who is calling it, whether that is a web app, a CLI, a Slack bot, or a unit test.
### 2.2 App (Application Layer)
The app is the user-facing product code. It answers the question **"how to present agents to users"**:
- Gateway API (FastAPI REST interface)
- IM channels (Feishu, Slack, Telegram integrations)
- CRUD management for custom agents
- HTTP endpoints for file upload/download
**The app is not packaged or published**; it is internal application code of the DeerFlow project and runs directly.
**The app depends on the harness, but the harness never depends on the app.**
### 2.3 Boundary
| Module | Layer | Notes |
|------|------|------|
| `config/` | Harness | The configuration system is infrastructure |
| `reflection/` | Harness | Dynamic module loading utilities |
| `utils/` | Harness | General-purpose helpers |
| `agents/` | Harness | Agent factory, middlewares, state, memory |
| `subagents/` | Harness | Subagent delegation system |
| `sandbox/` | Harness | Sandboxed execution environment |
| `tools/` | Harness | Tool registration and discovery |
| `mcp/` | Harness | MCP protocol integration |
| `skills/` | Harness | Skill loading, parsing, schema definitions |
| `models/` | Harness | LLM model factory |
| `community/` | Harness | Community tools (tavily, jina, etc.) |
| `client.py` | Harness | Embedded Python client |
| `gateway/` | App | FastAPI REST API |
| `channels/` | App | IM platform integrations |
**About custom agents**: the agent definition format (the `config.yaml` + `SOUL.md` schema) is defined by `config/agents_config.py` in the harness layer, but file storage, CRUD, and discovery are handled by `gateway/routers/agents.py` in the app layer.
## 3. Target Architecture
### 3.1 Directory Layout
```
backend/
├── packages/
│   └── harness/
│       ├── pyproject.toml   # deerflow-harness package definition
│       └── deerflow/        # Python package root (import prefix: deerflow.*)
│           ├── __init__.py
│           ├── config/
│           ├── reflection/
│           ├── utils/
│           ├── agents/
│           │   ├── lead_agent/
│           │   ├── middlewares/
│           │   ├── memory/
│           │   ├── checkpointer/
│           │   └── thread_state.py
│           ├── subagents/
│           ├── sandbox/
│           ├── tools/
│           ├── mcp/
│           ├── skills/
│           ├── models/
│           ├── community/
│           └── client.py
├── app/                     # not packaged (import prefix: app.*)
│   ├── __init__.py
│   ├── gateway/
│   │   ├── __init__.py
│   │   ├── app.py
│   │   ├── config.py
│   │   ├── path_utils.py
│   │   └── routers/
│   └── channels/
│       ├── __init__.py
│       ├── base.py
│       ├── manager.py
│       ├── service.py
│       ├── store.py
│       ├── message_bus.py
│       ├── feishu.py
│       ├── slack.py
│       └── telegram.py
├── pyproject.toml           # uv workspace root
├── langgraph.json
├── tests/
├── docs/
└── Makefile
```
### 3.2 Import Rules
The two layers use different import prefixes, making the responsibility boundary obvious at a glance:
```python
# ---------------------------------------------------------------
# Harness-internal imports (deerflow.* prefix)
# ---------------------------------------------------------------
from deerflow.agents import make_lead_agent
from deerflow.models import create_chat_model
from deerflow.config import get_app_config
from deerflow.tools import get_available_tools
# ---------------------------------------------------------------
# App-internal imports (app.* prefix)
# ---------------------------------------------------------------
from app.gateway.app import app
from app.gateway.routers.uploads import upload_files
from app.channels.service import start_channel_service
# ---------------------------------------------------------------
# App calling harness (one-way dependency; harness never imports app)
# ---------------------------------------------------------------
from deerflow.agents import make_lead_agent
from deerflow.models import create_chat_model
from deerflow.skills import load_skills
from deerflow.config.extensions_config import get_extensions_config
```
**App calling harness: starting an agent from the gateway**
```python
# app/gateway/routers/chat.py
from deerflow.agents.lead_agent.agent import make_lead_agent
from deerflow.models import create_chat_model
from deerflow.config import get_app_config

async def create_chat_session(thread_id: str, model_name: str):
    config = get_app_config()
    model = create_chat_model(name=model_name)
    agent = make_lead_agent(config=...)
    # ... use the agent to handle the user's message
```
**App calling harness: querying skills from a channel**
```python
# app/channels/manager.py
from deerflow.skills import load_skills
from deerflow.agents.memory.updater import get_memory_data

def handle_status_command():
    skills = load_skills(enabled_only=True)
    memory = get_memory_data()
    return f"Skills: {len(skills)}, Memory facts: {len(memory.get('facts', []))}"
```
**Forbidden direction**: `from app.` / `import app.` must never appear in harness code.
### 3.3 Why the App Is Not Packaged
| Aspect | Packaged (under packages/) | Not packaged (under backend/app/) |
|------|------------------------|--------------------------|
| Namespace | Needs pkgutil `extend_path` merging, or a separate prefix | Naturally separate: `app.*` vs `deerflow.*` |
| Publishing need | None; the app is project-internal code | No pyproject.toml required |
| Complexity | Two packages' builds, versions, and dependency declarations to manage | Runs directly, zero extra configuration |
| How it runs | `pip install deerflow-app` | `PYTHONPATH=. uvicorn app.gateway.app:app` |
The app's only consumer is the DeerFlow project itself; there is no need to publish it independently. Keeping it under `backend/app/` as a plain Python package, made importable via `PYTHONPATH` or an editable install, is enough.
### 3.4 Dependency Graph
```
┌─────────────────────────────────────┐
│ app/ (not packaged, run directly)   │
│   ├── fastapi, uvicorn              │
│   ├── slack-sdk, lark-oapi, ...     │
│   └── import deerflow.*             │
└──────────────┬──────────────────────┘
               │
┌──────────────▼──────────────────────┐
│ deerflow-harness (publishable pkg)  │
│   ├── langgraph, langchain          │
│   ├── markitdown, pydantic, ...     │
│   └── zero app dependencies         │
└─────────────────────────────────────┘
```
**Dependency classification**
| Category | Packages |
|------|--------|
| Harness only | agent-sandbox, langchain*, langgraph*, markdownify, markitdown, pydantic, pyyaml, readabilipy, tavily-python, firecrawl-py, tiktoken, ddgs, duckdb, httpx, kubernetes, dotenv |
| App only | fastapi, uvicorn, sse-starlette, python-multipart, lark-oapi, slack-sdk, python-telegram-bot, markdown-to-mrkdwn |
| Shared | langgraph-sdk (channels use it as an HTTP client), pydantic, httpx |
### 3.5 Workspace Configuration
`backend/pyproject.toml` (workspace root):
```toml
[project]
name = "deer-flow"
version = "0.1.0"
requires-python = ">=3.12"
dependencies = ["deerflow-harness"]

[dependency-groups]
dev = ["pytest>=8.0.0", "ruff>=0.14.11"]
# The app's extra dependencies (fastapi, etc.) are also declared at the
# workspace root, because the app is not packaged
app = ["fastapi", "uvicorn", "sse-starlette", "python-multipart"]
channels = ["lark-oapi", "slack-sdk", "python-telegram-bot"]

[tool.uv.workspace]
members = ["packages/harness"]

[tool.uv.sources]
deerflow-harness = { workspace = true }
```
## 4. Current Cross-Layer Dependency Problems
Before the split, two reverse dependencies from harness to app in `client.py` must be resolved:
### 4.1 `_validate_skill_frontmatter`
```python
# client.py -- harness importing app-layer code
from src.gateway.routers.skills import _validate_skill_frontmatter
```
**Solution**: extract the function into `deerflow/skills/validation.py`. It is a pure logic function (parses YAML frontmatter and validates fields) with no FastAPI dependency.
### 4.2 `CONVERTIBLE_EXTENSIONS` + `convert_file_to_markdown`
```python
# client.py -- harness importing app-layer code
from src.gateway.routers.uploads import CONVERTIBLE_EXTENSIONS, convert_file_to_markdown
```
**Solution**: extract them into `deerflow/utils/file_conversion.py`. They depend only on `markitdown` + `pathlib` and are general-purpose utilities.
## 5. Infrastructure Changes
### 5.1 LangGraph Server
The LangGraph Server only needs the harness package. Updated `langgraph.json`:
```json
{
  "dependencies": ["./packages/harness"],
  "graphs": {
    "lead_agent": "deerflow.agents:make_lead_agent"
  },
  "checkpointer": {
    "path": "./packages/harness/deerflow/agents/checkpointer/async_provider.py:make_checkpointer"
  }
}
```
### 5.2 Gateway API
```bash
# serve.sh / Makefile
# PYTHONPATH includes the backend/ root so both app.* and deerflow.* resolve
PYTHONPATH=. uvicorn app.gateway.app:app --host 0.0.0.0 --port 8001
```
### 5.3 Nginx
No change needed (it only routes URLs and is unaware of Python module paths).
### 5.4 Docker
Module references in the Dockerfile change from `src.` to `deerflow.` / `app.`; `COPY` commands must cover the `packages/` and `app/` directories.
## 6. Implementation Plan
Executed incrementally across 3 PRs:
### PR 1: Extract Shared Utilities (Low Risk)
1. Create `src/skills/validation.py`, extracting `_validate_skill_frontmatter` from `gateway/routers/skills.py`
2. Create `src/utils/file_conversion.py`, extracting the file-conversion logic from `gateway/routers/uploads.py`
3. Update imports in `client.py`, `gateway/routers/skills.py`, and `gateway/routers/uploads.py`
4. Run the full test suite to confirm there are no regressions
### PR 2: Rename + Physical Split (High Risk, atomic)
1. Create the `packages/harness/` directory and its `pyproject.toml`
2. `git mv` harness modules from `src/` into `packages/harness/deerflow/`
3. `git mv` app modules from `src/` into `app/`
4. Global import replacement:
   - harness modules: `src.*` to `deerflow.*` (all `.py` files, `langgraph.json`, tests, docs)
   - app modules: `src.gateway.*` to `app.gateway.*`, `src.channels.*` to `app.channels.*`
5. Update the workspace-root `pyproject.toml`
6. Update `langgraph.json`, `Makefile`, `Dockerfile`
7. `uv sync` + full test suite + manually verify that services start
### PR 3: Boundary Checks + Docs (Low Risk)
1. Add a lint rule: verify that the harness does not import app modules
2. Update `CLAUDE.md` and `README.md`
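The boundary lint rule from PR 3 could be implemented as a small script. A sketch (the regex and function name here are assumptions, not the actual rule):

```python
import re
from pathlib import Path

# Match "from app." / "from app ..." / "import app" at line start,
# but not unrelated names like "application" or "apple".
FORBIDDEN = re.compile(r"^\s*(from\s+app[.\s]|import\s+app\b)", re.MULTILINE)


def find_violations(harness_root: Path) -> list[Path]:
    """Return harness files that import the app layer (the forbidden direction)."""
    return [
        path for path in harness_root.rglob("*.py")
        if FORBIDDEN.search(path.read_text(encoding="utf-8"))
    ]
```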
## 7. Risks and Mitigations
| Risk | Impact | Mitigation |
|------|------|----------|
| Global rename collateral damage | `src` inside strings replaced by mistake | Match precisely with the regex `\bsrc\.` and review the diff |
| LangGraph Server cannot find modules | Service fails to start | Point `dependencies` in `langgraph.json` at the correct harness package path |
| Missing `PYTHONPATH` for the app | Gateway/Channel import errors at startup | Set `PYTHONPATH=.` uniformly in the Makefile/Docker |
| `use` fields in `config.yaml` reference old paths | Module resolution fails at runtime | Update `use` fields in `config.yaml` to `deerflow.*` in the same change |
| `sys.path` confusion in tests | Test failures | Use an editable install (`uv sync`) so deerflow is importable, and add `app/` to `sys.path` in `conftest.py` |
## 8. Future Evolution
- **Independent publishing**: the harness can be published to an internal PyPI so other projects can simply `pip install deerflow-harness`
- **Pluggable apps**: different apps (web, CLI, bots) can evolve independently while depending on the same harness
- **Finer-grained splits**: if harness modules keep growing, further splits are possible (e.g. `deerflow-sandbox`, `deerflow-mcp`)
# MCP (Model Context Protocol) Configuration
DeerFlow supports configurable MCP servers and skills to extend its capabilities, which are loaded from a dedicated `extensions_config.json` file in the project root directory.
## Setup
1. Copy `extensions_config.example.json` to `extensions_config.json` in the project root directory.
```bash
# Copy example configuration
cp extensions_config.example.json extensions_config.json
```
2. Enable the desired MCP servers or skills by setting `"enabled": true`.
3. Configure each server's command, arguments, and environment variables as needed.
4. Restart the application to load and register MCP tools.
## OAuth Support (HTTP/SSE MCP Servers)
For `http` and `sse` MCP servers, DeerFlow supports OAuth token acquisition and automatic token refresh.
- Supported grants: `client_credentials`, `refresh_token`
- Configure per-server `oauth` block in `extensions_config.json`
- Secrets should be provided via environment variables (for example: `$MCP_OAUTH_CLIENT_SECRET`)
Example:
```json
{
  "mcpServers": {
    "secure-http-server": {
      "enabled": true,
      "type": "http",
      "url": "https://api.example.com/mcp",
      "oauth": {
        "enabled": true,
        "token_url": "https://auth.example.com/oauth/token",
        "grant_type": "client_credentials",
        "client_id": "$MCP_OAUTH_CLIENT_ID",
        "client_secret": "$MCP_OAUTH_CLIENT_SECRET",
        "scope": "mcp.read",
        "refresh_skew_seconds": 60
      }
    }
  }
}
```
## How It Works
MCP servers expose tools that are automatically discovered and integrated into DeerFlow's agent system at runtime. Once enabled, these tools become available to agents without additional code changes.
## Example Capabilities
MCP servers can provide access to:
- **File systems**
- **Databases** (e.g., PostgreSQL)
- **External APIs** (e.g., GitHub, Brave Search)
- **Browser automation** (e.g., Puppeteer)
- **Custom MCP server implementations**
## Learn More
For detailed documentation about the Model Context Protocol, visit:
https://modelcontextprotocol.io
# Memory System Improvements
This document tracks memory injection behavior and roadmap status.
## Status (As Of 2026-03-10)
Implemented in `main`:
- Accurate token counting via `tiktoken` in `format_memory_for_injection`.
- Facts are injected into prompt memory context.
- Facts are ranked by confidence (descending).
- Injection respects `max_injection_tokens` budget.
Planned / not yet merged:
- TF-IDF similarity-based fact retrieval.
- `current_context` input for context-aware scoring.
- Configurable similarity/confidence weights (`similarity_weight`, `confidence_weight`).
- Middleware/runtime wiring for context-aware retrieval before each model call.
## Current Behavior
Function today:
```python
def format_memory_for_injection(memory_data: dict[str, Any], max_tokens: int = 2000) -> str:
```
Current injection format:
- `User Context` section from `user.*.summary`
- `History` section from `history.*.summary`
- `Facts` section from `facts[]`, sorted by confidence, appended until token budget is reached
Token counting:
- Uses `tiktoken` (`cl100k_base`) when available
- Falls back to `len(text) // 4` if tokenizer import fails
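The counting behavior above can be sketched as follows (a simplified sketch; the actual helper lives in `packages/harness/deerflow/agents/memory/prompt.py` and its exact name may differ):

```python
def count_tokens(text: str) -> int:
    """Count tokens with tiktoken when available, else estimate ~4 chars/token."""
    try:
        import tiktoken
        enc = tiktoken.get_encoding("cl100k_base")
        return len(enc.encode(text))
    except Exception:
        # tiktoken missing, or the encoding failed to load: fall back to a
        # rough character-based estimate
        return len(text) // 4
```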
## Known Gap
Previous versions of this document described TF-IDF/context-aware retrieval as if it were already shipped.
That was not accurate for `main` and caused confusion.
Issue reference: `#1059`
## Roadmap (Planned)
Planned scoring strategy:
```text
final_score = (similarity * 0.6) + (confidence * 0.4)
```
Planned integration shape:
1. Extract recent conversational context from filtered user/final-assistant turns.
2. Compute TF-IDF cosine similarity between each fact and current context.
3. Rank by weighted score and inject under token budget.
4. Fall back to confidence-only ranking if context is unavailable.
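The planned fallback-aware ranking could look roughly like this (a sketch of unmerged, planned behavior; names and data shapes are assumptions):

```python
def rank_facts(facts, context_scores, similarity_weight=0.6, confidence_weight=0.4):
    """Rank facts by the planned weighted score (sketch of unmerged behavior).

    context_scores maps fact content to a TF-IDF cosine similarity in [0, 1].
    Falls back to confidence-only ranking when no context is available.
    """
    def score(fact):
        sim = context_scores.get(fact["content"]) if context_scores else None
        if sim is None:
            # No context available: confidence-only ranking
            return fact.get("confidence", 0.0)
        return sim * similarity_weight + fact.get("confidence", 0.0) * confidence_weight

    return sorted(facts, key=score, reverse=True)
```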
## Validation
Current regression coverage includes:
- facts inclusion in memory injection output
- confidence ordering
- token-budget-limited fact inclusion
Tests:
- `backend/tests/test_memory_prompt_injection.py`
# Memory System Improvements - Summary
## Sync Note (2026-03-10)
This summary is synchronized with the `main` branch implementation.
TF-IDF/context-aware retrieval is **planned**, not merged yet.
## Implemented
- Accurate token counting with `tiktoken` in memory injection.
- Facts are injected into `<memory>` prompt content.
- Facts are ordered by confidence and bounded by `max_injection_tokens`.
## Planned (Not Yet Merged)
- TF-IDF cosine similarity recall based on recent conversation context.
- `current_context` parameter for `format_memory_for_injection`.
- Weighted ranking (`similarity` + `confidence`).
- Runtime extraction/injection flow for context-aware fact selection.
## Why This Sync Was Needed
Earlier docs described TF-IDF behavior as already implemented, which did not match code in `main`.
This mismatch is tracked in issue `#1059`.
## Current API Shape
```python
def format_memory_for_injection(memory_data: dict[str, Any], max_tokens: int = 2000) -> str:
```
No `current_context` argument is currently available in `main`.
## Verification Pointers
- Implementation: `packages/harness/deerflow/agents/memory/prompt.py`
- Prompt assembly: `packages/harness/deerflow/agents/lead_agent/prompt.py`
- Regression tests: `backend/tests/test_memory_prompt_injection.py`
# Memory Settings Review
Use this when reviewing the Memory Settings add/edit flow locally with the fewest possible manual steps.
## Quick Review
1. Start DeerFlow locally using any working development setup you already use.
Examples:
```bash
make dev
```
or
```bash
make docker-start
```
If you already have DeerFlow running locally, you can reuse that existing setup.
2. Load the sample memory fixture.
```bash
python scripts/load_memory_sample.py
```
3. Open `Settings > Memory`.
Default local URLs:
- App: `http://localhost:2026`
- Local frontend-only fallback: `http://localhost:3000`
## Minimal Manual Test
1. Click `Add fact`.
2. Create a new fact with:
- Content: `Reviewer-added memory fact`
- Category: `testing`
- Confidence: `0.88`
3. Confirm the new fact appears immediately and shows `Manual` as the source.
4. Edit the sample fact `This sample fact is intended for edit testing.` and change it to:
- Content: `This sample fact was edited during manual review.`
- Category: `testing`
- Confidence: `0.91`
5. Confirm the edited fact updates immediately.
6. Refresh the page and confirm both the newly added fact and the edited fact still persist.
## Optional Sanity Checks
- Search `Reviewer-added` and confirm the new fact is matched.
- Search `workflow` and confirm category text is searchable.
- Switch between `All`, `Facts`, and `Summaries`.
- Delete the disposable sample fact `Delete fact testing can target this disposable sample entry.` and confirm the list updates immediately.
- Clear all memory and confirm the page enters the empty state.
## Fixture Files
- Sample fixture: `backend/docs/memory-settings-sample.json`
- Default local runtime target: `backend/.deer-flow/memory.json`
The loader script creates a timestamped backup automatically before overwriting an existing runtime memory file.
# File Path Usage Examples
## Three Path Types
DeerFlow's file upload system returns three different paths, each for a different scenario:
### 1. Actual Filesystem Path (path)
```
.deer-flow/threads/{thread_id}/user-data/uploads/document.pdf
```
**Use for:**
- The file's actual location on the server filesystem
- Relative to the `backend/` directory
- Direct filesystem access, backups, debugging, etc.
**Example:**
```python
# Direct access from Python code
from pathlib import Path
file_path = Path("backend/.deer-flow/threads/abc123/user-data/uploads/document.pdf")
content = file_path.read_bytes()
```
### 2. Virtual Path (virtual_path)
```
/mnt/user-data/uploads/document.pdf
```
**Use for:**
- The path the agent uses inside the sandbox environment
- The sandbox system maps it to the actual path automatically
- All of the agent's file tools use this path
**Example:**
The agent uses it in conversation:
```python
# Agent using the read_file tool
read_file(path="/mnt/user-data/uploads/document.pdf")
# Agent using the bash tool
bash(command="cat /mnt/user-data/uploads/document.pdf")
```
### 3. HTTP Access URL (artifact_url)
```
/api/threads/{thread_id}/artifacts/mnt/user-data/uploads/document.pdf
```
**Use for:**
- Frontend access to files over HTTP
- Downloading and previewing files
- Can be opened directly in a browser
**Example:**
```typescript
// Frontend TypeScript/JavaScript code
const threadId = 'abc123';
const filename = 'document.pdf';
// Download the file
const downloadUrl = `/api/threads/${threadId}/artifacts/mnt/user-data/uploads/${filename}?download=true`;
window.open(downloadUrl);
// Preview in a new window
const viewUrl = `/api/threads/${threadId}/artifacts/mnt/user-data/uploads/${filename}`;
window.open(viewUrl, '_blank');
// Fetch with the fetch API
const response = await fetch(viewUrl);
const blob = await response.blob();
```
## Complete Flow Example
### Scenario: Frontend uploads a file and has the agent process it
```typescript
// 1. Frontend uploads the file
async function uploadAndProcess(threadId: string, file: File) {
  // Upload the file
  const formData = new FormData();
  formData.append('files', file);
  const uploadResponse = await fetch(
    `/api/threads/${threadId}/uploads`,
    {
      method: 'POST',
      body: formData
    }
  );
  const uploadData = await uploadResponse.json();
  const fileInfo = uploadData.files[0];
  console.log('File info:', fileInfo);
  // {
  //   filename: "report.pdf",
  //   path: ".deer-flow/threads/abc123/user-data/uploads/report.pdf",
  //   virtual_path: "/mnt/user-data/uploads/report.pdf",
  //   artifact_url: "/api/threads/abc123/artifacts/mnt/user-data/uploads/report.pdf",
  //   markdown_file: "report.md",
  //   markdown_path: ".deer-flow/threads/abc123/user-data/uploads/report.md",
  //   markdown_virtual_path: "/mnt/user-data/uploads/report.md",
  //   markdown_artifact_url: "/api/threads/abc123/artifacts/mnt/user-data/uploads/report.md"
  // }

  // 2. Send a message to the agent
  await sendMessage(threadId, "Please analyze the PDF I just uploaded");
  // The agent automatically sees the file list, including:
  // - report.pdf (virtual path: /mnt/user-data/uploads/report.pdf)
  // - report.md (virtual path: /mnt/user-data/uploads/report.md)

  // 3. The frontend can access the converted Markdown directly
  const mdResponse = await fetch(fileInfo.markdown_artifact_url);
  const markdownContent = await mdResponse.text();
  console.log('Markdown content:', markdownContent);

  // 4. Or download the original PDF
  const downloadLink = document.createElement('a');
  downloadLink.href = fileInfo.artifact_url + '?download=true';
  downloadLink.download = fileInfo.filename;
  downloadLink.click();
}
```
## Path Conversion Table
| Scenario | Path type | Example |
|------|---------------|------|
| Server-side backend code, direct access | `path` | `.deer-flow/threads/abc123/user-data/uploads/file.pdf` |
| Agent tool calls | `virtual_path` | `/mnt/user-data/uploads/file.pdf` |
| Frontend download/preview | `artifact_url` | `/api/threads/abc123/artifacts/mnt/user-data/uploads/file.pdf` |
| Backup scripts | `path` | `.deer-flow/threads/abc123/user-data/uploads/file.pdf` |
| Logging | `path` | `.deer-flow/threads/abc123/user-data/uploads/file.pdf` |
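All three forms are derivable from `thread_id` and `filename` alone. A minimal sketch (the upload router remains the source of truth; the function name here is hypothetical):

```python
def upload_paths(thread_id: str, filename: str) -> dict[str, str]:
    """Derive the three path forms for an uploaded file (sketch)."""
    rel = f"user-data/uploads/{filename}"
    return {
        # Server filesystem, relative to backend/
        "path": f".deer-flow/threads/{thread_id}/{rel}",
        # What the agent sees inside the sandbox
        "virtual_path": f"/mnt/{rel}",
        # What the frontend fetches over HTTP
        "artifact_url": f"/api/threads/{thread_id}/artifacts/mnt/{rel}",
    }
```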
## Code Example Collection
### Python - Backend Processing
```python
from pathlib import Path
from deerflow.agents.middlewares.thread_data_middleware import THREAD_DATA_BASE_DIR

def process_uploaded_file(thread_id: str, filename: str):
    # Use the actual filesystem path
    base_dir = Path.cwd() / THREAD_DATA_BASE_DIR / thread_id / "user-data" / "uploads"
    file_path = base_dir / filename
    # Read it directly
    with open(file_path, 'rb') as f:
        content = f.read()
    return content
```
### JavaScript - Frontend Access
```javascript
// List uploaded files
async function listUploadedFiles(threadId) {
  const response = await fetch(`/api/threads/${threadId}/uploads/list`);
  const data = await response.json();
  // Create download links for each file
  data.files.forEach(file => {
    console.log(`File: ${file.filename}`);
    console.log(`Download: ${file.artifact_url}?download=true`);
    console.log(`Preview: ${file.artifact_url}`);
    // Documents also get a Markdown version
    if (file.markdown_artifact_url) {
      console.log(`Markdown: ${file.markdown_artifact_url}`);
    }
  });
  return data.files;
}
// Delete a file
async function deleteFile(threadId, filename) {
  const response = await fetch(
    `/api/threads/${threadId}/uploads/${filename}`,
    { method: 'DELETE' }
  );
  return response.json();
}
```
### React Component Example
```tsx
import React, { useState, useEffect } from 'react';

interface UploadedFile {
  filename: string;
  size: number;
  path: string;
  virtual_path: string;
  artifact_url: string;
  extension: string;
  modified: number;
  markdown_artifact_url?: string;
}

function FileUploadList({ threadId }: { threadId: string }) {
  const [files, setFiles] = useState<UploadedFile[]>([]);

  useEffect(() => {
    fetchFiles();
  }, [threadId]);

  async function fetchFiles() {
    const response = await fetch(`/api/threads/${threadId}/uploads/list`);
    const data = await response.json();
    setFiles(data.files);
  }

  async function handleUpload(event: React.ChangeEvent<HTMLInputElement>) {
    const fileList = event.target.files;
    if (!fileList) return;
    const formData = new FormData();
    Array.from(fileList).forEach(file => {
      formData.append('files', file);
    });
    await fetch(`/api/threads/${threadId}/uploads`, {
      method: 'POST',
      body: formData
    });
    fetchFiles(); // refresh the list
  }

  async function handleDelete(filename: string) {
    await fetch(`/api/threads/${threadId}/uploads/${filename}`, {
      method: 'DELETE'
    });
    fetchFiles(); // refresh the list
  }

  return (
    <div>
      <input type="file" multiple onChange={handleUpload} />
      <ul>
        {files.map(file => (
          <li key={file.filename}>
            <span>{file.filename}</span>
            <a href={file.artifact_url} target="_blank">Preview</a>
            <a href={`${file.artifact_url}?download=true`}>Download</a>
            {file.markdown_artifact_url && (
              <a href={file.markdown_artifact_url} target="_blank">Markdown</a>
            )}
            <button onClick={() => handleDelete(file.filename)}>Delete</button>
          </li>
        ))}
      </ul>
    </div>
  );
}
```
## Notes
1. **Path safety**
   - The actual path (`path`) includes the thread ID, guaranteeing isolation
   - The API validates paths to prevent directory-traversal attacks
   - The frontend should use `artifact_url`, never `path` directly
2. **Agent usage**
   - The agent only sees and uses `virtual_path`
   - The sandbox system maps it to the actual path automatically
   - The agent never needs to know the real filesystem layout
3. **Frontend integration**
   - Always access files via `artifact_url`
   - Do not try to access filesystem paths directly
   - Use the `?download=true` query parameter to force a download
4. **Markdown conversion**
   - On successful conversion, extra `markdown_*` fields are returned
   - Prefer the Markdown version (easier to process)
   - The original file is always kept
# Documentation
This directory contains detailed documentation for the DeerFlow backend.
## Quick Links
| Document | Description |
|----------|-------------|
| [ARCHITECTURE.md](ARCHITECTURE.md) | System architecture overview |
| [API.md](API.md) | Complete API reference |
| [CONFIGURATION.md](CONFIGURATION.md) | Configuration options |
| [SETUP.md](SETUP.md) | Quick setup guide |
## Feature Documentation
| Document | Description |
|----------|-------------|
| [STREAMING.md](STREAMING.md) | Token-level streaming design: Gateway vs DeerFlowClient paths, `stream_mode` semantics, per-id dedup |
| [FILE_UPLOAD.md](FILE_UPLOAD.md) | File upload functionality |
| [PATH_EXAMPLES.md](PATH_EXAMPLES.md) | Path types and usage examples |
| [summarization.md](summarization.md) | Context summarization feature |
| [plan_mode_usage.md](plan_mode_usage.md) | Plan mode with TodoList |
| [AUTO_TITLE_GENERATION.md](AUTO_TITLE_GENERATION.md) | Automatic title generation |
## Development
| Document | Description |
|----------|-------------|
| [TODO.md](TODO.md) | Planned features and known issues |
## Getting Started
1. **New to DeerFlow?** Start with [SETUP.md](SETUP.md) for quick installation
2. **Configuring the system?** See [CONFIGURATION.md](CONFIGURATION.md)
3. **Understanding the architecture?** Read [ARCHITECTURE.md](ARCHITECTURE.md)
4. **Building integrations?** Check [API.md](API.md) for API reference
## Document Organization
```
docs/
├── README.md # This file
├── ARCHITECTURE.md # System architecture
├── API.md # API reference
├── CONFIGURATION.md # Configuration guide
├── SETUP.md # Setup instructions
├── FILE_UPLOAD.md # File upload feature
├── PATH_EXAMPLES.md # Path usage examples
├── summarization.md # Summarization feature
├── plan_mode_usage.md # Plan mode feature
├── STREAMING.md # Token-level streaming design
├── AUTO_TITLE_GENERATION.md # Title generation
├── TITLE_GENERATION_IMPLEMENTATION.md # Title implementation details
└── TODO.md # Roadmap and issues
```

View File

@@ -0,0 +1,92 @@
# Setup Guide
Quick setup instructions for DeerFlow.
## Configuration Setup
DeerFlow uses a YAML configuration file that should be placed in the **project root directory**.
### Steps
1. **Navigate to project root**:
```bash
cd /path/to/deer-flow
```
2. **Copy example configuration**:
```bash
cp config.example.yaml config.yaml
```
3. **Edit configuration**:
```bash
# Option A: Set environment variables (recommended)
export OPENAI_API_KEY="your-key-here"
# Option B: Edit config.yaml directly
vim config.yaml # or your preferred editor
```
4. **Verify configuration**:
```bash
cd backend
python -c "from deerflow.config import get_app_config; print('✓ Config loaded:', get_app_config().models[0].name)"
```
## Important Notes
- **Location**: `config.yaml` should be in `deer-flow/` (project root), not `deer-flow/backend/`
- **Git**: `config.yaml` is automatically ignored by git (contains secrets)
- **Priority**: If both `backend/config.yaml` and `../config.yaml` exist, backend version takes precedence
## Configuration File Locations
The backend searches for `config.yaml` in this order:
1. `DEER_FLOW_CONFIG_PATH` environment variable (if set)
2. `backend/config.yaml` (current directory when running from backend/)
3. `deer-flow/config.yaml` (parent directory - **recommended location**)
**Recommended**: Place `config.yaml` in project root (`deer-flow/config.yaml`).
## Sandbox Setup (Optional but Recommended)
If you plan to use Docker/Container-based sandbox (configured in `config.yaml` under `sandbox.use: deerflow.community.aio_sandbox:AioSandboxProvider`), it's highly recommended to pre-pull the container image:
```bash
# From project root
make setup-sandbox
```
**Why pre-pull?**
- The sandbox image (~500MB+) is pulled on first use, causing a long wait
- Pre-pulling provides clear progress indication
- Avoids confusion when first using the agent
If you skip this step, the image will be automatically pulled on first agent execution, which may take several minutes depending on your network speed.
## Troubleshooting
### Config file not found
```bash
# Check where the backend is looking
cd deer-flow/backend
python -c "from deerflow.config.app_config import AppConfig; print(AppConfig.resolve_config_path())"
```
If it can't find the config:
1. Ensure you've copied `config.example.yaml` to `config.yaml`
2. Verify you're in the correct directory
3. Check the file exists: `ls -la ../config.yaml`
### Permission denied
```bash
chmod 600 ../config.yaml # Protect sensitive configuration
```
## See Also
- [Configuration Guide](CONFIGURATION.md) - Detailed configuration options
- [Architecture Overview](../CLAUDE.md) - System architecture

View File

@@ -0,0 +1,351 @@
# DeerFlow Streaming Output Design
This document explains how DeerFlow delivers the LangGraph agent's event stream end to end to two kinds of consumers (HTTP clients and embedded Python callers): why the two paths **must** coexist, what each path's contract is, and the non-obvious invariants baked into the design.
---
## TL;DR
- DeerFlow has **two parallel** streaming paths: the **Gateway path** (async / HTTP SSE / JSON serialization) serves the browser and IM channels; the **DeerFlowClient path** (sync / in-process / native LangChain objects) serves Jupyter, scripts, and tests. They **cannot be merged** because their consumer models differ.
- Both paths start from the `create_agent()` factory, and both subscribe to LangGraph's `stream_mode=["values", "messages", "custom"]`: `values` is a node-level state snapshot, `messages` is the LLM token-level delta, and `custom` carries explicit `StreamWriter` events. **These three modes are not a verbosity gradient; they are three independent event sources** — if you want a token stream, you must subscribe to `messages` explicitly.
- The embedded client maintains three `set[str]` instances per `stream()` call: `seen_ids` / `streamed_ids` / `counted_usage_ids`. They look similar but guard **three independent invariants** and cannot be merged.
---
## Why There Are Two Streaming Paths
The two paths serve fundamentally different consumer models:
| Dimension | Gateway path | DeerFlowClient path |
|---|---|---|
| Entry point | FastAPI `/runs/stream` endpoint | `DeerFlowClient.stream(message)` |
| Driver | `runtime/runs/worker.py::run_agent` | `packages/harness/deerflow/client.py::DeerFlowClient.stream` |
| Execution model | `async def` + `agent.astream()` | sync generator + `agent.stream()` |
| Event transport | `StreamBridge` (asyncio Queue) + `sse_consumer` | direct `yield` |
| Serialization | `serialize(chunk)` → plain JSON dicts matching the LangGraph Platform wire format | `StreamEvent.data` carrying native LangChain objects |
| Consumers | frontend `useStream` React hook, Feishu/Slack/Telegram channels, LangGraph SDK clients | Jupyter notebooks, integration tests, internal Python scripts |
| Lifecycle management | `RunManager` (run_id tracking, disconnect semantics, multitask policy, heartbeats) | none; the call ends when the function returns |
| Disconnect recovery | `Last-Event-ID` SSE reconnection | not needed |
**Keeping both paths is a deliberate trade-off against DRY**: the Gateway's entire infrastructure (async + Queue + JSON + RunManager) exists **to push events across a network boundary to HTTP consumers**. When the producer (the agent) and the consumer (the Python call stack) live in the same process, all of that is pure overhead.
### Why DeerFlowClient Cannot Reuse the Gateway
Three reuse options were considered and rejected:
1. **Turn `client.stream()` into `async def client.astream()`**
   A breaking change. Users would have to cram `async for` / `asyncio.run()` into Jupyter notebooks and synchronous scripts for no benefit. A major selling point of DeerFlowClient ("call the agent like an ordinary function") disappears.
2. **Spin up a dedicated event-loop thread inside `client.stream()` and bridge sync/async with `StreamBridge`**
   This introduces a thread pool, queues, and semaphores. To "remove duplication" it trades lines of code for **complexity** — a textbook "wrong abstraction" whose overhead exceeds the reuse benefit.
3. **Make `run_agent` itself support a sync mode**
   This adds a dead branch the Gateway never exercises and dilutes the focus of worker.py.
So the event-handling logic of the two paths is **similar but deliberately not shared**. This is a design decision, not an oversight.
---
## The Three-Layer Semantics of LangGraph `stream_mode`
LangGraph's `agent.stream(stream_mode=[...])` is a **multiplexing** interface: one call subscribes to several modes at once, and each mode is an independent event source. The three core modes:
```mermaid
flowchart LR
    classDef values fill:#B8C5D1,stroke:#5A6B7A,color:#2C3E50
    classDef messages fill:#C9B8A8,stroke:#7A6B5A,color:#2C3E50
    classDef custom fill:#B5C4B1,stroke:#5A7A5A,color:#2C3E50
    subgraph LG["LangGraph agent graph"]
        direction TB
        Node1["node: LLM call"]
        Node2["node: tool call"]
        Node3["node: reducer"]
    end
    LG -->|"after each node completes"| V["values: full state snapshot"]
    Node1 -->|"each token the LLM yields"| M["messages: (AIMessageChunk, meta)"]
    Node1 -->|"StreamWriter.write()"| C["custom: arbitrary dict"]
    class V values
    class M messages
    class C custom
```
| Mode | Emitted when | Payload | Granularity |
|---|---|---|---|
| `values` | after each graph node completes | full state dict (title, messages, artifacts) | node-level |
| `messages` | each chunk the LLM yields (and tool-node completion) | `(AIMessageChunk \| ToolMessage, metadata_dict)` | token-level |
| `custom` | explicit `StreamWriter.write()` calls in user code | arbitrary dict | application-defined |
### Where the Two Names Come From
The same concept carries three names across **three protocol layers**:
```
Application              HTTP / SSE                LangGraph Graph
┌──────────────┐         ┌──────────────┐          ┌──────────────┐
│ frontend     │         │ LangGraph    │          │ agent.astream│
│ useStream    │──"messages-│ Platform SDK │──"messages"──│ graph.astream│
│ Feishu IM    │   tuple"───│ HTTP wire   │          │              │
└──────────────┘         └──────────────┘          └──────────────┘
```
- **Graph layer** (`agent.stream` / `agent.astream`, LangGraph's direct Python API): the mode is named **`"messages"`**.
- **Platform SDK layer** (the `langgraph-sdk` HTTP client, the cross-process HTTP contract): the mode is named **`"messages-tuple"`**.
- The **Gateway worker** translates explicitly: `if m == "messages-tuple": lg_modes.append("messages")` (`runtime/runs/worker.py:117-121`).
**Consequence**: `DeerFlowClient.stream()` calls `agent.stream()` directly (Graph layer), so it must pass `"messages"`; `app/channels/manager.py` goes through the `langgraph-sdk` HTTP SDK, so it passes `"messages-tuple"`. **The two strings are not interchangeable**, nor should they be hoisted into "one shared constant" — they are type aliases at different protocol layers, and sharing them only forces one layer to speak a language that is not its own.
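The worker's mode translation described above can be sketched as a tiny function (a condensed sketch following the snippet at `runtime/runs/worker.py:117-121`; variable names are assumptions):

```python
# Translate Platform-SDK wire names into graph-level mode names before
# calling agent.astream(). Only "messages-tuple" needs renaming.
def to_graph_modes(sdk_modes: list[str]) -> list[str]:
    lg_modes: list[str] = []
    for m in sdk_modes:
        if m == "messages-tuple":
            lg_modes.append("messages")   # Platform wire name -> graph-level name
        else:
            lg_modes.append(m)
    return lg_modes

graph_modes = to_graph_modes(["values", "messages-tuple", "custom"])
print(graph_modes)
```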
---
## The Gateway Path (async + HTTP SSE)
```mermaid
sequenceDiagram
    participant Client as HTTP Client
    participant API as FastAPI<br/>thread_runs.py
    participant Svc as services.py<br/>start_run
    participant Worker as worker.py<br/>run_agent (async)
    participant Bridge as StreamBridge<br/>(asyncio.Queue)
    participant Agent as LangGraph<br/>agent.astream
    participant SSE as sse_consumer
    Client->>API: POST /runs/stream
    API->>Svc: start_run(body)
    Svc->>Bridge: create bridge
    Svc->>Worker: asyncio.create_task(run_agent(...))
    Svc-->>API: StreamingResponse(sse_consumer)
    API-->>Client: event-stream opens
    par worker (producer)
        Worker->>Agent: astream(stream_mode=lg_modes)
        loop each chunk
            Agent-->>Worker: (mode, chunk)
            Worker->>Bridge: publish(run_id, event, serialize(chunk))
        end
        Worker->>Bridge: publish_end(run_id)
    and sse_consumer (consumer)
        SSE->>Bridge: subscribe(run_id)
        loop each event
            Bridge-->>SSE: StreamEvent
            SSE-->>Client: "event: <name>\ndata: <json>\n\n"
        end
    end
```
Key components:
- `runtime/runs/worker.py::run_agent` — runs `agent.astream()` inside an `asyncio.Task`, converts each chunk to JSON via `serialize(chunk, mode=mode)`, and calls `bridge.publish()`.
- `runtime/stream_bridge` — an abstract queue. `publish/subscribe` decouples producer and consumer and supports `Last-Event-ID` reconnection, heartbeats, and multi-subscriber fan-out.
- `app/gateway/services.py::sse_consumer` — subscribes to the bridge and formats events as SSE wire frames.
- `runtime/serialization.py::serialize` — mode-aware serialization; in `messages` mode, `serialize_messages_tuple` turns `(chunk, metadata)` into `[chunk.model_dump(), metadata]`.
**Why `StreamBridge` exists**: when the producer (the `run_agent` task) and the consumer (the HTTP connection) run in different asyncio tasks, something must carry events across the task boundary. The queue also doubles as the reconnection buffer and the fan-out point for multiple subscribers.
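A stripped-down sketch of that publish/subscribe pattern is below. It is illustrative only: the real `StreamBridge` additionally handles `Last-Event-ID` replay, heartbeats, and multi-subscriber fan-out, and its API surely differs.

```python
import asyncio

# Minimal bridge: one asyncio.Queue carrying events from a producer task
# to a consumer task, with None as the end-of-stream sentinel.
class MiniBridge:
    def __init__(self) -> None:
        self._queue: asyncio.Queue = asyncio.Queue()

    async def publish(self, event: str, data: dict) -> None:
        await self._queue.put((event, data))

    async def publish_end(self) -> None:
        await self._queue.put(None)  # sentinel: stream finished

    async def subscribe(self):
        while (item := await self._queue.get()) is not None:
            yield item

async def demo() -> list:
    bridge = MiniBridge()

    async def producer() -> None:
        await bridge.publish("messages", {"content": "hi"})
        await bridge.publish_end()

    asyncio.create_task(producer())          # producer and consumer are separate tasks
    return [e async for e in bridge.subscribe()]

events = asyncio.run(demo())
print(events)
```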
---
## The DeerFlowClient Path (sync + in-process)
```mermaid
sequenceDiagram
    participant User as Python caller
    participant Client as DeerFlowClient.stream
    participant Agent as LangGraph<br/>agent.stream (sync)
    User->>Client: for event in client.stream("hi"):
    Client->>Agent: stream(stream_mode=["values","messages","custom"])
    loop each chunk
        Agent-->>Client: (mode, chunk)
        Client->>Client: dispatch on mode<br/>build StreamEvent
        Client-->>User: yield StreamEvent
    end
    Client-->>User: yield StreamEvent(type="end")
```
By comparison, every step of the sync path has dramatically fewer moving parts:
- No `RunManager` — one `stream()` call maps to one lifecycle; no run_id needed.
- No `StreamBridge` — events are `yield`ed directly; producer and consumer share one Python call stack, so no cross-task intermediary is needed.
- No JSON serialization — `StreamEvent.data` carries native LangChain objects (`AIMessage.content`, `usage_metadata` as a `UsageMetadata` TypedDict). Jupyter users get real types, not anonymous dicts.
- No asyncio — callers write a plain `for event in ...`, never `async for`.
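The consumer-side shape of that loop looks like the sketch below. A fake client stands in for the real `DeerFlowClient` (which lives in `packages/harness/deerflow/client.py`) so the loop is runnable without a model; the event shapes follow the `messages-tuple` payloads described in this document.

```python
from dataclasses import dataclass, field

# Stand-in for the real StreamEvent: a type tag plus a data dict.
@dataclass
class StreamEvent:
    type: str
    data: dict = field(default_factory=dict)

# Fake client emitting the same event sequence a real stream() call would.
class FakeClient:
    def stream(self, message: str):
        yield StreamEvent("values", {"messages": ["..."]})
        yield StreamEvent("messages-tuple", {"type": "ai", "content": "Hel", "id": "ai-1"})
        yield StreamEvent("messages-tuple", {"type": "ai", "content": "lo", "id": "ai-1"})
        yield StreamEvent("end", {"usage": {}})

parts: list[str] = []
for event in FakeClient().stream("hi"):      # a plain for-loop; no asyncio required
    if event.type == "messages-tuple" and event.data.get("type") == "ai":
        parts.append(event.data["content"])  # accumulate token deltas
print("".join(parts))
```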
---
## Consumption Semantics: delta vs cumulative
LangGraph's `messages` mode emits **deltas**: each `AIMessageChunk.content` contains only the tokens yielded this time, **not** the cumulative text from the start.
This matches the semantics of LangChain's `fs2 Stream` style: **upstream sends increments, downstream accumulates**. On the Gateway path, the frontend `useStream` React hook keeps its own accumulator; on the DeerFlowClient path, the `chat()` method accumulates on the caller's behalf.
### The O(n) Accumulator in `DeerFlowClient.chat()`
```python
chunks: dict[str, list[str]] = {}
last_id: str = ""
for event in self.stream(message, thread_id=thread_id, **kwargs):
    if event.type == "messages-tuple" and event.data.get("type") == "ai":
        msg_id = event.data.get("id") or ""
        delta = event.data.get("content", "")
        if delta:
            chunks.setdefault(msg_id, []).append(delta)
            last_id = msg_id
return "".join(chunks.get(last_id, ()))
```
**Why not `buffers[id] = buffers.get(id, "") + delta`?** CPython's in-place string-concat optimization only applies when the refcount is 1 and the LHS is a local name; here the string lives in a dict and gets reassigned, so the optimization never kicks in, every append is an O(n) copy, and the total cost is O(n²). Measured on a 50 KB / 5000-chunk reply, that is 100-300 ms of pure copying. `list` + `"".join()` is O(n).
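The two accumulation strategies can be compared side by side. Both produce the same string; the list-based one is the O(n) approach `chat()` uses, while the dict-string one degrades to O(n²) on long replies:

```python
# Token deltas keyed by message id, as a messages-mode stream would emit them.
deltas = [("ai-1", d) for d in ("one", "two", "three")]

# O(n^2) worst case: the dict-held string is re-copied on every reassignment.
buffers: dict[str, str] = {}
for msg_id, delta in deltas:
    buffers[msg_id] = buffers.get(msg_id, "") + delta

# O(n): append chunks cheaply, join exactly once at the end.
chunks: dict[str, list[str]] = {}
for msg_id, delta in deltas:
    chunks.setdefault(msg_id, []).append(delta)
joined = "".join(chunks["ai-1"])

print(buffers["ai-1"] == joined == "onetwothree")
```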
---
## Why the Three id Sets Cannot Be Merged
`DeerFlowClient.stream()` maintains three `set[str]` instances over the lifetime of one call:
```python
seen_ids: set[str] = set()            # dedup within the values path
streamed_ids: set[str] = set()        # messages → values cross-mode dedup
counted_usage_ids: set[str] = set()   # idempotent usage_metadata accounting
```
At first glance they look like "three copies of nearly the same thing", but each guards a **different invariant**.
| Set | Invariant it guards | Filled by | Queried by |
|---|---|---|---|
| `seen_ids` | the same message appearing in two consecutive `values` snapshots produces only one `messages-tuple` event | the values branch, after processing each message | the values branch, before processing the next message |
| `streamed_ids` | if a message already streamed token-by-token via `messages` mode, **do not** re-synthesize a full `messages-tuple` when the values snapshot arrives | the messages branch, on every AI/tool event it emits | the values branch, when it sees the message |
| `counted_usage_ids` | the same `usage_metadata` appears both on the final messages chunk and on the final AIMessage in the values snapshot; the **cumulative total must count it exactly once** | `_account_usage()`, whenever it accepts usage | `_account_usage()`, on every call |
### Why a Single Set Is Not Enough
The key observation: **the same message id joins the three sets at different moments**.
```mermaid
sequenceDiagram
    participant M as messages mode
    participant V as values mode
    participant SS as streamed_ids
    participant SU as counted_usage_ids
    participant SE as seen_ids
    Note over M: first AI text chunk arrives
    M->>SS: add(msg_id)
    Note over M: final chunk carries usage
    M->>SU: add(msg_id)
    Note over V: snapshot arrives with the same AI message
    V->>SE: add(msg_id)
    V->>SS: lookup → present, skip text synthesis
    V->>SU: lookup → present, do not double-count
```
- `seen_ids` is **always added when a values snapshot arrives**, so it marks "processed by values". A message that only ever appears in the messages stream (rare, but possible) never enters `seen_ids`.
- `streamed_ids` is added **on the first meaningful event in the messages stream**. A non-AI message that only arrives via a values snapshot (a HumanMessage, a truncated tool message) never enters `streamed_ids`.
- `counted_usage_ids` is added **only when non-empty `usage_metadata` is seen**. A message with no usage at all (a tool message, an error message) never enters it.
**Set containment**: `counted_usage_ids ⊆ (streamed_ids ∪ seen_ids)` roughly holds, but it is **not a strict subset you can rely on**: a message can finish streaming its text in messages mode yet be overtaken by a values snapshot **before** the final usage-bearing chunk arrives — at that point it is already in `streamed_ids` but not yet in `counted_usage_ids`. Merging them into one dict-of-flags would make this subtle timing dependency **disappear from the type system** and survive only as a comment. Three separate sets make the invariants explicit: each set's name answers a question you can ask out loud.
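The values-branch decision that `seen_ids` and `streamed_ids` drive can be condensed into a few lines. This is an illustrative sketch: the real logic lives inline in `DeerFlowClient.stream`, and the helper name is hypothetical.

```python
# Decide what the values branch does with one message from a snapshot,
# given the two dedup sets described above.
def values_branch_action(msg_id: str, seen_ids: set[str], streamed_ids: set[str]) -> str:
    if msg_id in seen_ids:
        return "skip: handled in an earlier snapshot"
    seen_ids.add(msg_id)                       # mark as processed by values
    if msg_id in streamed_ids:
        return "skip: already streamed token-by-token"
    return "synthesize full messages-tuple event"

seen: set[str] = set()
streamed: set[str] = {"ai-1"}                  # ai-1 already streamed via messages mode
first = values_branch_action("ai-1", seen, streamed)
second = values_branch_action("human-1", seen, streamed)
third = values_branch_action("ai-1", seen, streamed)   # next snapshot, same message
print(first, "|", second, "|", third)
```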
---
## End to End: the Event Timeline of One Real Conversation
Suppose we call `client.stream("Count from 1 to 15")` and the LLM answers "one\ntwo\n...\nfifteen" (88 characters), which the tokenizer splits into ~35 BPE chunks. A condensed version of the event arrival sequence:
```mermaid
sequenceDiagram
    participant U as User
    participant C as DeerFlowClient
    participant A as LangGraph<br/>agent.stream
    U->>C: stream("Count ... 15")
    C->>A: stream(mode=["values","messages","custom"])
    A-->>C: ("values", {messages: [HumanMessage]})
    C-->>U: StreamEvent(type="values", ...)
    Note over A,C: LLM starts yielding tokens
    loop 35 times, ~476 ms
        A-->>C: ("messages", (AIMessageChunk(content="ele"), meta))
        C->>C: streamed_ids.add(ai-1)
        C-->>U: StreamEvent(type="messages-tuple",<br/>data={type:ai, content:"ele", id:ai-1})
    end
    Note over A: LLM finish_reason=stop; final chunk carries usage
    A-->>C: ("messages", (AIMessageChunk(content="", usage_metadata={...}), meta))
    C->>C: counted_usage_ids.add(ai-1)<br/>(no text, nothing yielded)
    A-->>C: ("values", {messages: [..., AIMessage(complete)]})
    C->>C: ai-1 in streamed_ids → skip synthesis
    C->>C: capture usage (already in counted_usage_ids, no-op)
    C-->>U: StreamEvent(type="values", ...)
    C-->>U: StreamEvent(type="end", data={usage:{...}})
```
Key observations:
1. The user sees **35 messages-tuple events** spread over roughly 476 ms, each carrying one token delta and the same `id=ai-1`.
2. The `AIMessage` in the final `values` snapshot does **not** trigger another full `messages-tuple` event — `ai-1 in streamed_ids` skips the synthesis.
3. The `usage` in the `end` event equals exactly one copy of the cumulative usage, **not twice that** — `counted_usage_ids` already absorbed it on the final messages chunk, so the values branch's revisit is a no-op.
4. The `content` the consumer receives is an **increment**: "ele" is just 3 characters, not "one\ntwo\n...ele". To get the full text, accumulate by `id` — `chat()` already does this for you.
---
## Why This Design Is Bug-Prone, and the Testing Strategy
The direct trigger for this document is bytedance/deer-flow#1969: `DeerFlowClient.stream()` originally subscribed only to `["values", "custom"]` and **omitted `"messages"`**. As a result, `client.stream("hello")` behaved like a single all-at-once return, visually indistinguishable from `chat()`.
This class of bug has three structural causes:
1. **Multi-layer naming**: `messages` / `messages-tuple` / the HTTP SSE `messages` event are three names for one concept. Getting it wrong at one layer raises no error at the other two.
2. **Multiple consumer models**: the Gateway and DeerFlowClient are two independent implementations with **no single source of truth for "which modes to subscribe to"**. Subscribing correctly in one says nothing about the other.
3. **Mocked tests bypassed the real path**: the old tests used `agent.stream.return_value = iter([dict_chunk, ...])` to feed values-shaped dicts simulating state snapshots. Input constructed that way **never enters the `messages`-mode branch**, so even with an element missing from `stream_mode`, CI stayed green.
### Defenses
The real line of defense is to **assert explicitly that the "messages" mode is subscribed, and mock with real chunk shapes**:
```python
# tests/test_client.py::test_messages_mode_emits_token_deltas
agent.stream.return_value = iter([
    ("messages", (AIMessageChunk(content="Hel", id="ai-1"), {})),
    ("messages", (AIMessageChunk(content="lo ", id="ai-1"), {})),
    ("messages", (AIMessageChunk(content="world!", id="ai-1"), {})),
    ("values", {"messages": [HumanMessage(...), AIMessage(content="Hello world!", id="ai-1")]}),
])
# ...
assert [e.data["content"] for e in ai_text_events] == ["Hel", "lo ", "world!"]
assert len(ai_text_events) == 3  # values snapshot must NOT re-synthesize
assert "messages" in agent.stream.call_args.kwargs["stream_mode"]
```
**Why this beats "hoisting a shared constant"**: a shared constant only guarantees the right string for whoever uses it — someone adding a new consumer may never find the constant. Behavioral assertions force every change through the **actual execution path**: reverting to `["values", "custom"]` immediately fails `assert "messages" in ...`.
### A Liveness Signal: BPE Subword Boundaries
The final regression check is to have a real LLM count 1-15 and look for the tokenizer's subword splits in the output:
```
[5.460s] 'ele' / 'ven'          (eleven split into two tokens)
[5.508s] 'tw'  / 'elve'         (twelve split into two)
[5.568s] 'th'  / 'irteen'       (thirteen split into two)
[5.623s] 'four'/ 'teen'         (fourteen split into two)
[5.677s] 'f' / 'if' / 'teen'    (fifteen split into three)
```
Subword splits are an external fact of the tokenizer — they **cannot be faked**. Seeing them proves the data flowed **chunk by chunk** through the entire pipeline without any intermediate layer buffering it into one blob. In streaming systems, this kind of "liveness signal" is higher-confidence evidence than a unit test.
---
## Where to Look in the Source
| What you care about | Where to look |
|---|---|
| DeerFlowClient embedded stream | `packages/harness/deerflow/client.py::DeerFlowClient.stream` |
| `chat()`'s delta accumulator | `packages/harness/deerflow/client.py::DeerFlowClient.chat` |
| Gateway async stream | `packages/harness/deerflow/runtime/runs/worker.py::run_agent` |
| HTTP SSE frame output | `app/gateway/services.py::sse_consumer` / `format_sse` |
| Wire-format serialization | `packages/harness/deerflow/runtime/serialization.py` |
| LangGraph mode-name translation | `packages/harness/deerflow/runtime/runs/worker.py:117-121` |
| Feishu channel incremental card updates | `app/channels/manager.py::_handle_streaming_chat` |
| Channels' defensive delta/cumulative accumulation | `app/channels/manager.py::_merge_stream_text` |
| Modes supported by the frontend useStream | `frontend/src/core/api/stream-mode.ts` |
| Core regression test | `backend/tests/test_client.py::TestStream::test_messages_mode_emits_token_deltas` |

View File

@@ -0,0 +1,222 @@
# Automatic Title Generation: Implementation Summary
## ✅ Completed Work
### 1. Core Implementation Files
#### [`packages/harness/deerflow/agents/thread_state.py`](../packages/harness/deerflow/agents/thread_state.py)
- ✅ Added the `title: str | None = None` field to `ThreadState`
#### [`packages/harness/deerflow/config/title_config.py`](../packages/harness/deerflow/config/title_config.py) (new)
- ✅ Added the `TitleConfig` configuration class
- ✅ Supported options: enabled, max_words, max_chars, model_name, prompt_template
- ✅ Provided `get_title_config()` and `set_title_config()`
- ✅ Provided `load_title_config_from_dict()` to load from the config file
#### [`packages/harness/deerflow/agents/middlewares/title_middleware.py`](../packages/harness/deerflow/agents/middlewares/title_middleware.py) (new)
- ✅ Added `TitleMiddleware`
- ✅ Implemented `_should_generate_title()` to decide whether generation is needed
- ✅ Implemented `_generate_title()` to call the LLM
- ✅ Implemented the `after_agent()` hook to trigger automatically after the first exchange
- ✅ Included a fallback strategy (use the first few words of the user message when the LLM fails)
#### [`packages/harness/deerflow/config/app_config.py`](../packages/harness/deerflow/config/app_config.py)
- ✅ Imported `load_title_config_from_dict`
- ✅ Loaded the title config in `from_file()`
#### [`packages/harness/deerflow/agents/lead_agent/agent.py`](../packages/harness/deerflow/agents/lead_agent/agent.py)
- ✅ Imported `TitleMiddleware`
- ✅ Registered it in the `middleware` list: `[SandboxMiddleware(), TitleMiddleware()]`
### 2. Configuration
#### [`config.yaml`](../../config.example.yaml)
- ✅ Added the title config section:
```yaml
title:
  enabled: true
  max_words: 6
  max_chars: 60
  model_name: null
```
### 3. Documentation
#### [`docs/AUTO_TITLE_GENERATION.md`](../docs/AUTO_TITLE_GENERATION.md) (new)
- ✅ Full feature documentation
- ✅ Implementation approach and architecture
- ✅ Configuration reference
- ✅ Client usage example (TypeScript)
- ✅ Workflow diagram (Mermaid)
- ✅ Troubleshooting guide
- ✅ State vs Metadata comparison
#### [`TODO.md`](TODO.md)
- ✅ Recorded the completed feature
### 4. Tests
#### [`tests/test_title_generation.py`](../tests/test_title_generation.py) (new)
- ✅ Config class tests
- ✅ Middleware initialization tests
- ✅ TODO: integration tests (requires mocking the Runtime)
---
---
## 🎯 核心设计决策
### 为什么使用 State 而非 Metadata
| 方面 | State (✅ 采用) | Metadata (❌ 未采用) |
|------|----------------|---------------------|
| **持久化** | 自动(通过 checkpointer | 取决于实现,不可靠 |
| **版本控制** | 支持时间旅行 | 不支持 |
| **类型安全** | TypedDict 定义 | 任意字典 |
| **标准化** | LangGraph 核心机制 | 扩展功能 |
### 工作流程
```
用户发送首条消息
Agent 处理并返回回复
TitleMiddleware.after_agent() 触发
检查:是否首次对话?是否已有 title
调用 LLM 生成 title
返回 {"title": "..."} 更新 state
Checkpointer 自动持久化(如果配置了)
客户端从 state.values.title 读取
```
---
## 📋 Usage Guide
### Backend Configuration
1. **Enable/disable the feature**
```yaml
# config.yaml
title:
  enabled: true  # set to false to disable
```
2. **Customize the configuration**
```yaml
title:
  enabled: true
  max_words: 8      # at most 8 words in the title
  max_chars: 80     # at most 80 characters
  model_name: null  # use the default model
```
3. **Persistence (optional)**
To persist titles during local development:
```python
# checkpointer.py
from langgraph.checkpoint.sqlite import SqliteSaver
checkpointer = SqliteSaver.from_conn_string("checkpoints.db")
```
```json
// langgraph.json
{
  "graphs": {
    "lead_agent": "deerflow.agents:lead_agent"
  },
  "checkpointer": "checkpointer:checkpointer"
}
```
### Client Usage
```typescript
// Fetch the thread title
const state = await client.threads.getState(threadId);
const title = state.values.title || "New Conversation";
// Render it in the conversation list
<li>{title}</li>
```
**⚠️ Note**: The title lives in `state.values.title`, not `thread.metadata.title`.
---
## 🧪 Tests
```bash
# Run the feature tests
pytest tests/test_title_generation.py -v
# Run all tests
pytest
```
---
## 🔍 Troubleshooting
### No title generated?
1. Check the config: `title.enabled = true`
2. Check the logs: search for "Generated thread title"
3. Confirm it is the first exchange (1 user message + 1 assistant reply)
### Title generated but not visible?
1. Confirm where you read it: `state.values.title` (not `thread.metadata.title`)
2. Check whether the API response includes the title
3. Re-fetch the state
### Title lost after a restart?
1. Local development needs a configured checkpointer
2. LangGraph Platform persists automatically
3. Inspect the database to confirm the checkpointer is working
---
## 📊 Performance Impact
- **Added latency**: ~0.5-1 s (one LLM call)
- **Concurrency**: runs in `after_agent` and does not block the main flow
- **Resource cost**: generated once per thread
### Optimization Tips
1. Use a faster model (e.g. `gpt-3.5-turbo`)
2. Reduce `max_words` and `max_chars`
3. Make the prompt more concise
---
## 🚀 Next Steps
- [ ] Add integration tests (requires mocking the LangGraph Runtime)
- [ ] Support custom prompt templates
- [ ] Support multilingual title generation
- [ ] Add a "regenerate title" action
- [ ] Monitor title-generation success rate and latency
---
## 📚 Related Resources
- [Full documentation](../docs/AUTO_TITLE_GENERATION.md)
- [LangGraph Middleware](https://langchain-ai.github.io/langgraph/concepts/middleware/)
- [LangGraph State management](https://langchain-ai.github.io/langgraph/concepts/low_level/#state)
- [LangGraph Checkpointer](https://langchain-ai.github.io/langgraph/concepts/persistence/)
---
*Implementation completed: 2026-01-14*

View File

@@ -0,0 +1,34 @@
# TODO List
## Completed Features
- [x] Launch the sandbox only after the first file system or bash tool is called
- [x] Add Clarification Process for the whole process
- [x] Implement Context Summarization Mechanism to avoid context explosion
- [x] Integrate MCP (Model Context Protocol) for extensible tools
- [x] Add file upload support with automatic document conversion
- [x] Implement automatic thread title generation
- [x] Add Plan Mode with TodoList middleware
- [x] Add vision model support with ViewImageMiddleware
- [x] Skills system with SKILL.md format
## Planned Features
- [ ] Pooling the sandbox resources to reduce the number of sandbox containers
- [ ] Add authentication/authorization layer
- [ ] Implement rate limiting
- [ ] Add metrics and monitoring
- [ ] Support for more document formats in upload
- [ ] Skill marketplace / remote skill installation
- [ ] Optimize async concurrency in agent hot path (IM channels multi-task scenario)
- Replace `time.sleep(5)` with `asyncio.sleep()` in `packages/harness/deerflow/tools/builtins/task_tool.py` (subagent polling)
- Replace `subprocess.run()` with `asyncio.create_subprocess_shell()` in `packages/harness/deerflow/sandbox/local/local_sandbox.py`
- Replace sync `requests` with `httpx.AsyncClient` in community tools (tavily, jina_ai, firecrawl, infoquest, image_search)
- Replace sync `model.invoke()` with async `model.ainvoke()` in title_middleware and memory updater
- Consider `asyncio.to_thread()` wrapper for remaining blocking file I/O
- For production: use `langgraph up` (multi-worker) instead of `langgraph dev` (single-worker)
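One item above — replacing the blocking `time.sleep(5)` in the subagent polling loop — could look roughly like the sketch below. The function name and polling shape are assumptions for illustration; only the sleep-replacement idea comes from the TODO entry.

```python
import asyncio

# Hedged sketch of the proposed change: an awaitable asyncio.sleep lets
# other tasks (e.g. concurrent IM-channel runs) keep making progress,
# where time.sleep(5) would block the whole event loop.
async def poll_subagent(check_done, interval: float = 5.0, max_polls: int = 100) -> bool:
    for _ in range(max_polls):
        if check_done():
            return True
        await asyncio.sleep(interval)  # yields the event loop instead of blocking it
    return False

# Usage with a stub that reports done on the third poll:
calls = {"n": 0}
def check_done() -> bool:
    calls["n"] += 1
    return calls["n"] >= 3

done = asyncio.run(poll_subagent(check_done, interval=0))
print(done)
```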
## Resolved Issues
- [x] Make sure that no duplicated files in `state.artifacts`
- [x] Long thinking but with empty content (answer inside thinking process)

View File

@@ -0,0 +1,114 @@
{
"version": "1.0",
"lastUpdated": "2026-03-28T10:30:00Z",
"user": {
"workContext": {
"summary": "Working on DeerFlow memory management UX, including local search, local filters, clear-all, and single-fact deletion in Settings > Memory.",
"updatedAt": "2026-03-28T10:30:00Z"
},
"personalContext": {
"summary": "Prefers Chinese during collaboration, but wants GitHub PR titles and bodies written in English with a Chinese translation provided alongside them.",
"updatedAt": "2026-03-28T10:28:00Z"
},
"topOfMind": {
"summary": "Wants reviewers to be able to reproduce the memory search and filter flow quickly with pre-populated sample data.",
"updatedAt": "2026-03-28T10:26:00Z"
}
},
"history": {
"recentMonths": {
"summary": "Recently contributed multiple DeerFlow pull requests covering memory, uploads, and compatibility fixes.",
"updatedAt": "2026-03-28T10:24:00Z"
},
"earlierContext": {
"summary": "Often prefers shipping smaller, reviewable changes with explicit validation notes.",
"updatedAt": "2026-03-28T10:22:00Z"
},
"longTermBackground": {
"summary": "Actively building open-source contribution experience and improving end-to-end delivery quality.",
"updatedAt": "2026-03-28T10:20:00Z"
}
},
"facts": [
{
"id": "fact_review_001",
"content": "User prefers Chinese for day-to-day collaboration.",
"category": "preference",
"confidence": 0.95,
"createdAt": "2026-03-28T09:50:00Z",
"source": "thread_pref_cn"
},
{
"id": "fact_review_002",
"content": "PR titles and bodies should be drafted in English and accompanied by a Chinese translation.",
"category": "workflow",
"confidence": 0.93,
"createdAt": "2026-03-28T09:52:00Z",
"source": "thread_pr_style"
},
{
"id": "fact_review_003",
"content": "User implemented memory search and filter improvements in the DeerFlow settings page.",
"category": "project",
"confidence": 0.91,
"createdAt": "2026-03-28T09:54:00Z",
"source": "thread_memory_filters"
},
{
"id": "fact_review_004",
"content": "User added clear-all memory support through the gateway memory API.",
"category": "project",
"confidence": 0.89,
"createdAt": "2026-03-28T09:56:00Z",
"source": "thread_memory_clear"
},
{
"id": "fact_review_005",
"content": "User added single-fact deletion support for persisted memory entries.",
"category": "project",
"confidence": 0.9,
"createdAt": "2026-03-28T09:58:00Z",
"source": "thread_memory_delete"
},
{
"id": "fact_review_006",
"content": "Reviewer can search for keyword memory to see multiple matching facts.",
"category": "testing",
"confidence": 0.84,
"createdAt": "2026-03-28T10:00:00Z",
"source": "thread_review_demo"
},
{
"id": "fact_review_007",
"content": "Reviewer can search for keyword Chinese to verify cross-category matching.",
"category": "testing",
"confidence": 0.82,
"createdAt": "2026-03-28T10:02:00Z",
"source": "thread_review_demo"
},
{
"id": "fact_review_008",
"content": "Reviewer can search for workflow to verify category text is included in local filtering.",
"category": "testing",
"confidence": 0.81,
"createdAt": "2026-03-28T10:04:00Z",
"source": "thread_review_demo"
},
{
"id": "fact_review_009",
"content": "Delete fact testing can target this disposable sample entry.",
"category": "testing",
"confidence": 0.78,
"createdAt": "2026-03-28T10:06:00Z",
"source": "thread_delete_demo"
},
{
"id": "fact_review_010",
"content": "This sample fact is intended for edit testing.",
"category": "testing",
"confidence": 0.8,
"createdAt": "2026-03-28T10:08:00Z",
"source": "manual"
}
]
}

View File

@@ -0,0 +1,291 @@
# Middleware Execution Flow
## Middleware List
The complete middleware chain that `create_deerflow_agent` assembles through `RuntimeFeatures` (with every feature enabled by default):
| # | Middleware | `before_agent` | `before_model` | `after_model` | `after_agent` | `wrap_tool_call` | Lead agent | Subagent | Source |
|---|-----------|:-:|:-:|:-:|:-:|:-:|:-:|:-:|------|
| 0 | ThreadDataMiddleware | ✓ | | | | | ✓ | ✓ | `sandbox` |
| 1 | UploadsMiddleware | ✓ | | | | | ✓ | ✗ | `sandbox` |
| 2 | SandboxMiddleware | ✓ | | | ✓ | | ✓ | ✓ | `sandbox` |
| 3 | DanglingToolCallMiddleware | | | ✓ | | | ✓ | ✗ | always on |
| 4 | GuardrailMiddleware | | | | | ✓ | ✓ | ✓ | *added in Phase 2* |
| 5 | ToolErrorHandlingMiddleware | | | | | ✓ | ✓ | ✓ | always on |
| 6 | SummarizationMiddleware | | | ✓ | | | ✓ | ✗ | `summarization` |
| 7 | TodoMiddleware | | | ✓ | | | ✓ | ✗ | `plan_mode` parameter |
| 8 | TitleMiddleware | | | ✓ | | | ✓ | ✗ | `auto_title` |
| 9 | MemoryMiddleware | | | | ✓ | | ✓ | ✗ | `memory` |
| 10 | ViewImageMiddleware | | ✓ | | | | ✓ | ✗ | `vision` |
| 11 | SubagentLimitMiddleware | | | ✓ | | | ✓ | ✗ | `subagent` |
| 12 | LoopDetectionMiddleware | | | ✓ | | | ✓ | ✗ | always on |
| 13 | ClarificationMiddleware | | | ✓ | | | ✓ | ✗ | always last |
The lead agent runs **14** middlewares (`make_lead_agent`); subagents run **4** (ThreadData, Sandbox, Guardrail, ToolErrorHandling); Phase 1 of `create_deerflow_agent` implements **13** (Guardrail only accepts custom instances, with no built-in default).
## Execution Order
LangChain `create_agent` rules:
- **`before_*` hooks run in list order** (position 0 → N)
- **`after_*` hooks run in reverse list order** (position N → 0)
```mermaid
graph TB
    START(["invoke"]) --> TD
    subgraph BA ["<b>before_agent</b> in order 0→N"]
        direction TB
        TD["[0] ThreadData<br/>create thread directory"] --> UL["[1] Uploads<br/>scan uploaded files"] --> SB["[2] Sandbox<br/>acquire sandbox"]
    end
    subgraph BM ["<b>before_model</b> in order 0→N"]
        direction TB
        VI["[10] ViewImage<br/>inject image base64"]
    end
    SB --> VI
    VI --> M["<b>MODEL</b>"]
    subgraph AM ["<b>after_model</b> reverse order N→0"]
        direction TB
        CL["[13] Clarification<br/>intercept ask_clarification"] --> LD["[12] LoopDetection<br/>detect loops"] --> SL["[11] SubagentLimit<br/>truncate extra tasks"] --> TI["[8] Title<br/>generate title"] --> SM["[6] Summarization<br/>compress context"] --> DTC["[3] DanglingToolCall<br/>patch missing ToolMessage"]
    end
    M --> CL
    subgraph AA ["<b>after_agent</b> reverse order N→0"]
        direction TB
        SBR["[2] Sandbox<br/>release sandbox"] --> MEM["[9] Memory<br/>enqueue memory"]
    end
    DTC --> SBR
    MEM --> END(["response"])
    classDef beforeNode fill:#a0a8b5,stroke:#636b7a,color:#2d3239
    classDef modelNode fill:#b5a8a0,stroke:#7a6b63,color:#2d3239
    classDef afterModelNode fill:#b5a0a8,stroke:#7a636b,color:#2d3239
    classDef afterAgentNode fill:#a0b5a8,stroke:#637a6b,color:#2d3239
    classDef terminalNode fill:#a8b5a0,stroke:#6b7a63,color:#2d3239
    class TD,UL,SB,VI beforeNode
    class M modelNode
    class CL,LD,SL,TI,SM,DTC afterModelNode
    class SBR,MEM afterAgentNode
    class START,END terminalNode
```
## Sequence Diagram
```mermaid
sequenceDiagram
    participant U as User
    participant TD as ThreadDataMiddleware
    participant UL as UploadsMiddleware
    participant SB as SandboxMiddleware
    participant VI as ViewImageMiddleware
    participant M as MODEL
    participant CL as ClarificationMiddleware
    participant SL as SubagentLimitMiddleware
    participant TI as TitleMiddleware
    participant SM as SummarizationMiddleware
    participant DTC as DanglingToolCallMiddleware
    participant MEM as MemoryMiddleware
    U ->> TD: invoke
    activate TD
    Note right of TD: before_agent: create thread directory
    TD ->> UL: before_agent
    activate UL
    Note right of UL: before_agent: scan uploaded files
    UL ->> SB: before_agent
    activate SB
    Note right of SB: before_agent: acquire sandbox
    SB ->> VI: before_model
    activate VI
    Note right of VI: before_model: inject image base64
    VI ->> M: messages + tools
    activate M
    M -->> CL: AI response
    deactivate M
    activate CL
    Note right of CL: after_model: intercept ask_clarification
    CL -->> SL: after_model
    deactivate CL
    activate SL
    Note right of SL: after_model: truncate extra tasks
    SL -->> TI: after_model
    deactivate SL
    activate TI
    Note right of TI: after_model: generate title
    TI -->> SM: after_model
    deactivate TI
    activate SM
    Note right of SM: after_model: compress context
    SM -->> DTC: after_model
    deactivate SM
    activate DTC
    Note right of DTC: after_model: patch missing ToolMessage
    DTC -->> VI: done
    deactivate DTC
    VI -->> SB: done
    deactivate VI
    Note right of SB: after_agent: release sandbox
    SB -->> UL: done
    deactivate SB
    UL -->> TD: done
    deactivate UL
    Note right of MEM: after_agent: enqueue memory
    TD -->> U: response
    deactivate TD
```
## The Onion Model
List position determines the layer in the onion — position 0 is outermost, position N is innermost:
```
Entering  before_*   [0] → [1] → [2] → ... → [10] → MODEL
Exiting   after_*    MODEL → [13] → [11] → ... → [6] → [3] → [2] → [0]
                              ↑ innermost runs first
```
> [!important] Core rule
> The middleware at the end of the list has its `after_model` run **first**.
> ClarificationMiddleware sits at the end of the list, so it is the first to intercept the model output.
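The ordering rule above can be demonstrated in a few lines (the middleware names are just labels for illustration):

```python
# before_* hooks run in list order (0 -> N); after_* hooks run in
# reverse list order (N -> 0), so the last-listed middleware's after_*
# fires first on the way out.
order: list[str] = []
middlewares = ["ThreadData", "Sandbox", "Clarification"]

for name in middlewares:                 # before_*: position 0 -> N
    order.append(f"before:{name}")
order.append("MODEL")
for name in reversed(middlewares):       # after_*: position N -> 0
    order.append(f"after:{name}")

print(order)
```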
## Comparison: a True Onion vs What DeerFlow Actually Does
### A True Onion (e.g. Koa/Express)
Every middleware handles both before and after, forming symmetric nesting:
```mermaid
sequenceDiagram
    participant U as User
    participant A as AuthMiddleware
    participant L as LogMiddleware
    participant R as RateLimitMiddleware
    participant H as Handler
    U ->> A: request
    activate A
    Note right of A: before: validate token
    A ->> L: next()
    activate L
    Note right of L: before: record request time
    L ->> R: next()
    activate R
    Note right of R: before: check rate limit
    R ->> H: next()
    activate H
    H -->> R: result
    deactivate H
    Note right of R: after: update counters
    R -->> L: result
    deactivate R
    Note right of L: after: record elapsed time
    L -->> A: result
    deactivate L
    Note right of A: after: clean up context
    A -->> U: response
    deactivate A
```
> [!tip] Onion hallmarks
> Every middleware does symmetric before/after work, and each `activate` bar spans the entire inner execution, forming perfect nesting.
### DeerFlow 的实际情况
不是洋葱,是管道。大部分 middleware 只用一个钩子,不存在对称嵌套。多轮对话时 before_model / after_model 循环执行:
```mermaid
sequenceDiagram
participant U as User
participant TD as ThreadData
participant UL as Uploads
participant SB as Sandbox
participant VI as ViewImage
participant M as MODEL
participant CL as Clarification
participant SL as SubagentLimit
participant TI as Title
participant SM as Summarization
participant MEM as Memory
U ->> TD: invoke
Note right of TD: before_agent: create directory
TD ->> UL: .
Note right of UL: before_agent: scan files
UL ->> SB: .
Note right of SB: before_agent: acquire sandbox
loop Each turn (tool call loop)
SB ->> VI: .
Note right of VI: before_model: inject images
VI ->> M: messages + tools
M -->> CL: AI response
Note right of CL: after_model: intercept ask_clarification
CL -->> SL: .
Note right of SL: after_model: truncate extra task calls
SL -->> TI: .
Note right of TI: after_model: generate title
TI -->> SM: .
Note right of SM: after_model: context compression
end
Note right of SB: after_agent: release sandbox
SB -->> MEM: .
Note right of MEM: after_agent: enqueue memory
MEM -->> U: response
```
> [!warning] Not an onion
> Of the 14 middlewares, only SandboxMiddleware has before/after symmetry (acquire/release). The rest are one-way: they act either only in `before_*` or only in `after_*`. `before_agent` / `after_agent` run once; `before_model` / `after_model` run on every loop iteration.
There are only 2 hard ordering dependencies:
1. **ThreadData before Sandbox**: the sandbox needs the thread directory
2. **Clarification last in the list**: its `after_model` runs first in the reversed pass, so it is the first to intercept `ask_clarification`
### Conclusion
| | True onion | DeerFlow in practice |
|---|---|---|
| Each middleware | symmetric before + after | mostly a single hook |
| Activation bars | nested (outer longer than inner) | not nested (serial) |
| Meaning of reverse order | pairs cleanup with initialization | only sets after_model execution priority |
| Typical example | Auth: validate token / clean up context | ThreadData: creates a directory, no cleanup |
## Key Design Points
### Why is ClarificationMiddleware last in the list?
Last position = its `after_model` runs first. It must be the **first** to see the model output and check for an `ask_clarification` tool call. If one is present, it interrupts immediately (`Command(goto=END)`), and the `after_model` hooks of the remaining middlewares never run.
### SandboxMiddleware's symmetry
`before_agent` (3rd in forward order) acquires the sandbox; `after_agent` (1st in reverse order) releases it. Outer layer in, outer layer out: natural onion symmetry.
### Most middlewares use only one hook
Of the 14 middlewares, only SandboxMiddleware uses both `before_agent` + `after_agent` (acquire/release). The rest run in a single phase. The onion model's reverse ordering mainly matters for execution order within the `after_model` phase.
@@ -0,0 +1,204 @@
# Plan Mode with TodoList Middleware
This document describes how to enable and use the Plan Mode feature with TodoList middleware in DeerFlow 2.0.
## Overview
Plan Mode adds a TodoList middleware to the agent, which provides a `write_todos` tool that helps the agent:
- Break down complex tasks into smaller, manageable steps
- Track progress as tasks are completed
- Provide visibility to users about what's being done
The TodoList middleware is built on LangChain's `TodoListMiddleware`.
## Configuration
### Enabling Plan Mode
Plan mode is controlled via **runtime configuration** through the `is_plan_mode` parameter in the `configurable` section of `RunnableConfig`. This allows you to dynamically enable or disable plan mode on a per-request basis.
```python
from langchain_core.runnables import RunnableConfig
from deerflow.agents.lead_agent.agent import make_lead_agent

# Enable plan mode via runtime configuration
config = RunnableConfig(
    configurable={
        "thread_id": "example-thread",
        "thinking_enabled": True,
        "is_plan_mode": True,  # Enable plan mode
    }
)

# Create agent with plan mode enabled
agent = make_lead_agent(config)
```
### Configuration Options
- **is_plan_mode** (bool): Whether to enable plan mode with TodoList middleware. Default: `False`
- Pass via `config.get("configurable", {}).get("is_plan_mode", False)`
- Can be set dynamically for each agent invocation
- No global configuration needed
## Default Behavior
When plan mode is enabled with default settings, the agent will have access to a `write_todos` tool with the following behavior:
### When to Use TodoList
The agent will use the todo list for:
1. Complex multi-step tasks (3+ distinct steps)
2. Non-trivial tasks requiring careful planning
3. When user explicitly requests a todo list
4. When user provides multiple tasks
### When NOT to Use TodoList
The agent will skip using the todo list for:
1. Single, straightforward tasks
2. Trivial tasks (< 3 steps)
3. Purely conversational or informational requests
### Task States
- **pending**: Task not yet started
- **in_progress**: Currently working on (can have multiple parallel tasks)
- **completed**: Task finished successfully
## Usage Examples
### Basic Usage
```python
from langchain_core.runnables import RunnableConfig
from deerflow.agents.lead_agent.agent import make_lead_agent

# Create agent with plan mode ENABLED
config_with_plan_mode = RunnableConfig(
    configurable={
        "thread_id": "example-thread",
        "thinking_enabled": True,
        "is_plan_mode": True,  # TodoList middleware will be added
    }
)
agent_with_todos = make_lead_agent(config_with_plan_mode)

# Create agent with plan mode DISABLED (default)
config_without_plan_mode = RunnableConfig(
    configurable={
        "thread_id": "another-thread",
        "thinking_enabled": True,
        "is_plan_mode": False,  # No TodoList middleware
    }
)
agent_without_todos = make_lead_agent(config_without_plan_mode)
```
### Dynamic Plan Mode per Request
You can enable/disable plan mode dynamically for different conversations or tasks:
```python
from langchain_core.runnables import RunnableConfig
from deerflow.agents.lead_agent.agent import make_lead_agent

def create_agent_for_task(task_complexity: str):
    """Create agent with plan mode based on task complexity."""
    is_complex = task_complexity in ["high", "very_high"]
    config = RunnableConfig(
        configurable={
            "thread_id": f"task-{task_complexity}",
            "thinking_enabled": True,
            "is_plan_mode": is_complex,  # Enable only for complex tasks
        }
    )
    return make_lead_agent(config)

# Simple task - no TodoList needed
simple_agent = create_agent_for_task("low")

# Complex task - TodoList enabled for better tracking
complex_agent = create_agent_for_task("high")
```
## How It Works
1. When `make_lead_agent(config)` is called, it extracts `is_plan_mode` from `config.configurable`
2. The config is passed to `_build_middlewares(config)`
3. `_build_middlewares()` reads `is_plan_mode` and calls `_create_todo_list_middleware(is_plan_mode)`
4. If `is_plan_mode=True`, a `TodoListMiddleware` instance is created and added to the middleware chain
5. The middleware automatically adds a `write_todos` tool to the agent's toolset
6. The agent can use this tool to manage tasks during execution
7. The middleware handles the todo list state and provides it to the agent
## Architecture
```
make_lead_agent(config)
├─> Extracts: is_plan_mode = config.configurable.get("is_plan_mode", False)
└─> _build_middlewares(config)
    ├─> ThreadDataMiddleware
    ├─> SandboxMiddleware
    ├─> SummarizationMiddleware (if enabled via global config)
    ├─> TodoListMiddleware (if is_plan_mode=True) ← NEW
    ├─> TitleMiddleware
    └─> ClarificationMiddleware
```
## Implementation Details
### Agent Module
- **Location**: `packages/harness/deerflow/agents/lead_agent/agent.py`
- **Function**: `_create_todo_list_middleware(is_plan_mode: bool)` - Creates TodoListMiddleware if plan mode is enabled
- **Function**: `_build_middlewares(config: RunnableConfig)` - Builds middleware chain based on runtime config
- **Function**: `make_lead_agent(config: RunnableConfig)` - Creates agent with appropriate middlewares
### Runtime Configuration
Plan mode is controlled via the `is_plan_mode` parameter in `RunnableConfig.configurable`:
```python
config = RunnableConfig(
    configurable={
        "is_plan_mode": True,  # Enable plan mode
        # ... other configurable options
    }
)
```
## Key Benefits
1. **Dynamic Control**: Enable/disable plan mode per request without global state
2. **Flexibility**: Different conversations can have different plan mode settings
3. **Simplicity**: No need for global configuration management
4. **Context-Aware**: Plan mode decision can be based on task complexity, user preferences, etc.
## Custom Prompts
DeerFlow uses custom `system_prompt` and `tool_description` for the TodoListMiddleware that match the overall DeerFlow prompt style:
### System Prompt Features
- Uses XML tags (`<todo_list_system>`) for structure consistency with DeerFlow's main prompt
- Emphasizes CRITICAL rules and best practices
- Clear "When to Use" vs "When NOT to Use" guidelines
- Focuses on real-time updates and immediate task completion
### Tool Description Features
- Detailed usage scenarios with examples
- Strong emphasis on NOT using for simple tasks
- Clear task state definitions (pending, in_progress, completed)
- Comprehensive best practices section
- Task completion requirements to prevent premature marking
The custom prompts are defined in `_create_todo_list_middleware()` in `backend/packages/harness/deerflow/agents/lead_agent/agent.py`.
## Notes
- TodoList middleware uses LangChain's built-in `TodoListMiddleware` with **custom DeerFlow-style prompts**
- Plan mode is **disabled by default** (`is_plan_mode=False`) to maintain backward compatibility
- The middleware is positioned before `ClarificationMiddleware` to allow todo management during clarification flows
- Custom prompts emphasize the same principles as DeerFlow's main system prompt (clarity, action-oriented, critical rules)
@@ -0,0 +1,503 @@
# RFC: `create_deerflow_agent`, a Pure-Parameter SDK Factory API
## 1. Problem
The harness's only public entry point today is `make_lead_agent(config: RunnableConfig)`. Internally it does:
```
make_lead_agent
├─ get_app_config()        ← reads config.yaml
├─ _resolve_model_name()   ← reads config.yaml
├─ load_agent_config()     ← reads agents/{name}/config.yaml
├─ create_chat_model(name) ← reads config.yaml, loads the model class via reflection
├─ get_available_tools()   ← reads config.yaml + extensions_config.json
├─ apply_prompt_template() ← reads the skills directory + memory.json
└─ _build_middlewares()    ← reads config.yaml (summarization, model vision)
```
**Six sites of implicit I/O**, all file-system dependent. If you want to embed `deerflow-harness` as a Python library in your own application, you must provide `config.yaml` + `extensions_config.json` + a skills directory. That is unacceptable for SDK users.
### Comparison
| | `langchain.create_agent` | `make_lead_agent` | `DeerFlowClient` (enhanced) |
|---|---|---|---|
| Positioning | low-level primitive | internal factory | **sole public API** |
| Config source | pure parameters | YAML files | **parameters first, config fallback** |
| Built-in capabilities | none | Sandbox/Memory/Skills/Subagent/... | **composed on demand + management APIs** |
| User interface | `graph.invoke(state)` | internal use | **`client.chat("hello")`** |
| Audience | people writing LangChain | internal use | **all DeerFlow users** |
## 2. Design Principles
### DI best practice in Python
1. **Function parameters are the injection**: no global state reads; all dependencies arrive as parameters
2. **Protocols define contracts**: depend on behavioral interfaces, not concrete classes
3. **Sensible defaults**: `sandbox=True` is equivalent to `sandbox=LocalSandboxProvider()`
4. **Layered API**: simple usage fits on one line; complex usage has escape hatches
### Layered architecture
```
┌──────────────────────┐
│ DeerFlowClient │ ← sole public API (chat/stream + management)
└──────────┬───────────┘
┌──────────▼───────────┐
│ make_lead_agent │ ← internal: config-driven factory
└──────────┬───────────┘
┌──────────▼───────────┐
│ create_deerflow_agent │ ← internal: pure-parameter factory
└──────────┬───────────┘
┌──────────▼───────────┐
│ langchain.create_agent│ ← low-level primitive
└──────────────────────┘
```
`DeerFlowClient` is the sole public API. `create_deerflow_agent` and `make_lead_agent` are both internal implementations.
Users control behavior through three `DeerFlowClient` parameters:
| Parameter | Type | Responsibility |
|------|------|------|
| `config` | `dict` | override any config.yaml setting |
| `features` | `RuntimeFeatures` | replace built-in middleware implementations |
| `extra_middleware` | `list[AgentMiddleware]` | add user middleware |
No parameters → read config.yaml (fully compatible with existing behavior).
### Core constraints
- **Config override**: `config` dict > config.yaml > defaults
- **Three non-overlapping layers**: config carries parameters, features carries instances, extra_middleware carries additions
- **Backward compatible**: the existing no-argument `DeerFlowClient()` constructor behaves unchanged
- **Harness boundary compliance**: never imports `app.*` (enforced by `test_harness_boundary.py`)
## 3. API Design
### 3.1 `DeerFlowClient`: the sole public API
Three optional parameters are added to the existing constructor:
```python
from deerflow.client import DeerFlowClient
from deerflow.agents.features import RuntimeFeatures

client = DeerFlowClient(
    # 1. config — override any config.yaml key (same structure as the yaml)
    config={
        "models": [{"name": "gpt-4o", "use": "langchain_openai:ChatOpenAI", "model": "gpt-4o", "api_key": "sk-..."}],
        "memory": {"max_facts": 50, "enabled": True},
        "title": {"enabled": False},
        "summarization": {"enabled": True, "trigger": [{"type": "tokens", "value": 10000}]},
        "sandbox": {"use": "deerflow.sandbox.local:LocalSandboxProvider"},
    },
    # 2. features — replace built-in middleware implementations
    features=RuntimeFeatures(
        memory=MyMemoryMiddleware(),
        auto_title=MyTitleMiddleware(),
    ),
    # 3. extra_middleware — add user middleware
    extra_middleware=[
        MyAuditMiddleware(),   # @Next(SandboxMiddleware)
        MyFilterMiddleware(),  # @Prev(ClarificationMiddleware)
    ],
)
```
Typical usage patterns:
```python
# Usage 1: read everything from config.yaml (existing behavior unchanged)
client = DeerFlowClient()

# Usage 2: change parameters only, keep the built-in implementations
client = DeerFlowClient(config={"memory": {"max_facts": 50}})

# Usage 3: replace a middleware implementation
client = DeerFlowClient(features=RuntimeFeatures(auto_title=MyTitleMiddleware()))

# Usage 4: add custom middleware
client = DeerFlowClient(extra_middleware=[MyAuditMiddleware()])

# Usage 5: pure SDK, no config.yaml
client = DeerFlowClient(config={
    "models": [{"name": "gpt-4o", "use": "langchain_openai:ChatOpenAI", ...}],
    "tools": [{"name": "bash", "use": "deerflow.sandbox.tools:bash_tool", "group": "bash"}],
    "memory": {"enabled": True},
})
```
Internally: `final_config = deep_merge(file_config, code_config)`
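A plausible `deep_merge` (the name comes from the line above; this exact implementation is an assumption):

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Recursively merge `override` into `base`; `override` wins on conflicts.

    Nested dicts are merged key by key; lists and scalars are replaced
    wholesale, so a `models` list supplied in code fully replaces the yaml one.
    The inputs are not mutated.
    """
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


file_config = {"memory": {"enabled": True, "max_facts": 20}, "title": {"enabled": True}}
code_config = {"memory": {"max_facts": 50}}
final_config = deep_merge(file_config, code_config)
# memory.enabled survives from the file; max_facts is overridden to 50
```

Replacing (rather than merging) lists matches the intent that a `models` list given in code fully replaces the yaml one; whether the real implementation treats lists differently is not specified here.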
### 3.2 `create_deerflow_agent`: internal factory (not public)
```python
def create_deerflow_agent(
    model: BaseChatModel,
    tools: list[BaseTool] | None = None,
    *,
    system_prompt: str | None = None,
    middleware: list[AgentMiddleware] | None = None,
    features: RuntimeFeatures | None = None,
    state_schema: type | None = None,
    checkpointer: BaseCheckpointSaver | None = None,
    name: str = "default",
) -> CompiledStateGraph:
    ...
```
`DeerFlowClient` calls this function internally.
### 3.3 `RuntimeFeatures`: built-in middleware replacement
It does one thing: replace a built-in middleware with a custom instance. It does not carry configuration parameters; those go through the `config` dict:
```python
@dataclass
class RuntimeFeatures:
    sandbox: bool | AgentMiddleware = True
    memory: bool | AgentMiddleware = False
    summarization: bool | AgentMiddleware = False
    subagent: bool | AgentMiddleware = False
    vision: bool | AgentMiddleware = False
    auto_title: bool | AgentMiddleware = False
    loop_detection: bool | AgentMiddleware = False  # referenced by the assembly code below
```
| Value | Meaning |
|---|---|
| `True` | use the default middleware (parameters read from config) |
| `False` | disable the feature |
| `AgentMiddleware` instance | replace the whole implementation |
There is no `MemoryOptions`, `TitleOptions`, etc. Parameter tweaks go through the `config` dict:
```python
# Change memory parameters → config
client = DeerFlowClient(config={"memory": {"max_facts": 50}})

# Swap the memory implementation → features
client = DeerFlowClient(features=RuntimeFeatures(memory=MyMemoryMiddleware()))

# Combine both: config parameters feed the default middlewares, but title swaps its implementation
client = DeerFlowClient(
    config={"memory": {"max_facts": 50}},
    features=RuntimeFeatures(auto_title=MyTitleMiddleware()),
)
```
### 3.4 Middleware chain assembly
No numeric priority sorting; the list is built by appending in a fixed order:
```python
def _resolve(spec, default_cls):
    """bool → default implementation; AgentMiddleware → replacement"""
    if isinstance(spec, AgentMiddleware):
        return spec
    return default_cls()

def _assemble_from_features(
    feat: RuntimeFeatures,
    config: AppConfig,
    extra_middleware: list[AgentMiddleware],
) -> tuple[list, list]:
    chain = []
    extra_tools = []
    if feat.sandbox:
        chain.append(ThreadDataMiddleware())  # hard dependency: before Sandbox
        chain.append(UploadsMiddleware())
        chain.append(_resolve(feat.sandbox, SandboxMiddleware))
    chain.append(DanglingToolCallMiddleware())
    chain.append(ToolErrorHandlingMiddleware())
    if feat.summarization:
        chain.append(_resolve(feat.summarization, SummarizationMiddleware))
    if config.title.enabled and feat.auto_title is not False:
        chain.append(_resolve(feat.auto_title, TitleMiddleware))
    if feat.memory:
        chain.append(_resolve(feat.memory, MemoryMiddleware))
    if feat.vision:
        chain.append(ViewImageMiddleware())
        extra_tools.append(view_image_tool)
    if feat.subagent:
        chain.append(_resolve(feat.subagent, SubagentLimitMiddleware))
        extra_tools.append(task_tool)
    if feat.loop_detection:
        chain.append(_resolve(feat.loop_detection, LoopDetectionMiddleware))
    # Insert extra_middleware (positioned by @Next/@Prev declarations)
    _insert_extra(chain, extra_middleware)
    # Clarification always last
    chain.append(ClarificationMiddleware())
    extra_tools.append(ask_clarification_tool)
    return chain, extra_tools
```
### 3.6 Middleware ordering strategy
**Two-phase ordering: built-ins fixed, externals inserted**
1. **Built-in chain order is fixed**: determined by the append order in code; built-ins do not participate in @Next/@Prev
2. **External middleware insertion**: middlewares in `extra_middleware` declare anchors via @Next/@Prev and may anchor to any middleware (built-in or other external)
3. **Conflict detection**: if two external middlewares @Next/@Prev the same target, raise `ValueError`
**This is not a total ordering.** The built-in chain's order is already fixed in code; external middlewares only perform insertions. This avoids built-ins and externals competing for the same position.
### 3.7 The `@Next` / `@Prev` decorators
User middlewares declare their position in the chain via decorators, type-safely:
```python
from deerflow.agents import Next, Prev

@Next(SandboxMiddleware)
class MyAuditMiddleware(AgentMiddleware):
    """Placed right after SandboxMiddleware."""
    def before_agent(self, state, runtime):
        ...

@Prev(ClarificationMiddleware)
class MyFilterMiddleware(AgentMiddleware):
    """Placed right before ClarificationMiddleware."""
    def after_model(self, state, runtime):
        ...
```
Implementation:
```python
def Next(anchor: type[AgentMiddleware]):
    """Decorator: place this middleware immediately after `anchor`."""
    def decorator(cls: type[AgentMiddleware]) -> type[AgentMiddleware]:
        cls._next_anchor = anchor
        return cls
    return decorator

def Prev(anchor: type[AgentMiddleware]):
    """Decorator: place this middleware immediately before `anchor`."""
    def decorator(cls: type[AgentMiddleware]) -> type[AgentMiddleware]:
        cls._prev_anchor = anchor
        return cls
    return decorator
```
The `_insert_extra` algorithm:
1. Iterate over `extra_middleware`, reading each middleware's `_next_anchor` / `_prev_anchor`
2. **Conflict detection**: if two external middlewares share the same anchor (same direction, same target), raise `ValueError`
3. Anchored middlewares are inserted at the target position (@Next goes after the target, @Prev before it)
4. Middlewares without a declaration are appended just before Clarification
## 4. Middleware Execution Model
### LangChain's execution rules
```
before_agent   forward → [0] → [1] → ... → [N]
before_model   forward → [0] → [1] → ... → [N]     ← every loop iteration
MODEL
after_model    reverse ← [N] → [N-1] → ... → [0]   ← every loop iteration
after_agent    reverse ← [N] → [N-1] → ... → [0]
```
`before_agent` / `after_agent` run only once. `before_model` / `after_model` run on every tool-call loop iteration.
### DeerFlow in practice
**Not an onion, but a pipeline.** Of the 11 middlewares, only SandboxMiddleware has before/after symmetry (acquire/release); the rest use a single hook.
There are only 2 hard ordering dependencies:
1. **ThreadData before Sandbox**: the sandbox needs the thread directory
2. **Clarification last in the list**: its after_model runs first in the reversed pass, so it is the first to intercept `ask_clarification`
See [middleware-execution-flow.md](middleware-execution-flow.md) for details.
## 5. Usage Examples
### 5.1 Read everything from config.yaml (existing behavior unchanged)
```python
from deerflow.client import DeerFlowClient
client = DeerFlowClient()
response = client.chat("Hello")
```
### 5.2 Override configuration parameters
```python
client = DeerFlowClient(config={
    "memory": {"max_facts": 50},
    "title": {"enabled": False},
    "summarization": {"trigger": [{"type": "tokens", "value": 10000}]},
})
```
### 5.3 Pure SDK, no config.yaml
```python
client = DeerFlowClient(config={
    "models": [{"name": "gpt-4o", "use": "langchain_openai:ChatOpenAI", "model": "gpt-4o", "api_key": "sk-..."}],
    "tools": [
        {"name": "bash", "group": "bash", "use": "deerflow.sandbox.tools:bash_tool"},
        {"name": "web_search", "group": "web", "use": "deerflow.community.tavily.tools:web_search_tool"},
    ],
    "memory": {"enabled": True, "max_facts": 50},
    "sandbox": {"use": "deerflow.sandbox.local:LocalSandboxProvider"},
})
```
### 5.4 Replace a built-in middleware
```python
from deerflow.agents.features import RuntimeFeatures

client = DeerFlowClient(
    features=RuntimeFeatures(
        memory=MyMemoryMiddleware(),     # replace
        auto_title=MyTitleMiddleware(),  # replace
        vision=False,                    # disable
    ),
)
```
### 5.5 Insert custom middleware
```python
from deerflow.agents import Next, Prev
from deerflow.sandbox.middleware import SandboxMiddleware
from deerflow.agents.middlewares.clarification_middleware import ClarificationMiddleware

@Next(SandboxMiddleware)
class MyAuditMiddleware(AgentMiddleware):
    def before_agent(self, state, runtime):
        log_sandbox_acquired(state)

@Prev(ClarificationMiddleware)
class MyFilterMiddleware(AgentMiddleware):
    def after_model(self, state, runtime):
        filter_sensitive_output(state)

client = DeerFlowClient(
    extra_middleware=[MyAuditMiddleware(), MyFilterMiddleware()],
)
```
## 6. Phase 1 Limitations
In the current implementation the following middlewares still read `config.yaml` internally, which SDK users should be aware of:
| Middleware | Reads | Phase 2 solution |
|------------|---------|-----------------|
| TitleMiddleware | `get_title_config()` + `create_chat_model()` | `TitleOptions(model=...)` parameter override |
| MemoryMiddleware | `get_memory_config()` | `MemoryOptions(...)` parameter override |
| SandboxMiddleware | `get_sandbox_provider()` | direct `SandboxProvider` instance |
In Phase 1, `auto_title` defaults to `False` to avoid crashing when no config is present. Other config-dependent features also default to `False`.
## 7. Migration Path
```
Phase 1 (current PR #1203):
  ✓ Add create_deerflow_agent + RuntimeFeatures (internal API)
  ✓ No changes to DeerFlowClient or make_lead_agent
  ✗ Middlewares still read config internally (known limitation)
Phase 2 (#1380):
  - Add optional DeerFlowClient constructor parameters (model, tools, features, system_prompt)
  - Options parameters override config (MemoryOptions, TitleOptions, etc.)
  - @Next/@Prev decorators
  - Add missing middlewares (Guardrail, TokenUsage, DeferredToolFilter)
  - Turn make_lead_agent into a thin shell over create_deerflow_agent
Phase 3:
  - SDK docs and examples
  - Stabilize the deerflow.client API
```
## 8. Design Decisions
| Question | Decision | Rationale |
|------|------|------|
| Public API | `DeerFlowClient` as sole entry point | top-down: evolve the existing API first, extract lower layers later |
| create_deerflow_agent | internal, not public | users never need to touch CompiledStateGraph |
| Config override | `config` dict (same structure as config.yaml) | no new concepts; deep-merge override |
| Middleware replacement | `features=RuntimeFeatures(memory=MyMW())` | bool switches plus instance replacement |
| Middleware extension | separate `extra_middleware` parameter | kept apart from the built-in features |
| Middleware placement | `@Next/@Prev` decorators | type-safe; hides ordering details |
| Ordering mechanism | sequential append + @Next/@Prev | numeric priorities carry no functional meaning |
| Runtime switches | keep `RunnableConfig` | plan_mode, thread_id, etc. toggle per request |
## 9. Appendix: The Middleware Chain
```mermaid
graph TB
subgraph BA ["before_agent forward"]
direction TB
TD["ThreadData<br/>create directory"] --> UL["Uploads<br/>scan files"] --> SB["Sandbox<br/>acquire sandbox"]
end
subgraph BM ["before_model forward, each turn"]
direction TB
VI["ViewImage<br/>inject images"]
end
SB --> VI
VI --> M["MODEL"]
subgraph AM ["after_model reverse, each turn"]
direction TB
CL["Clarification<br/>intercept + interrupt"] --> LD["LoopDetection<br/>detect loops"] --> SL["SubagentLimit<br/>truncate task calls"] --> TI["Title<br/>generate title"] --> DTC["DanglingToolCall<br/>backfill missing messages"]
end
M --> CL
subgraph AA ["after_agent reverse"]
direction TB
SBR["Sandbox<br/>release sandbox"] --> MEM["Memory<br/>enqueue memory"]
end
DTC --> SBR
classDef beforeNode fill:#a0a8b5,stroke:#636b7a,color:#2d3239
classDef modelNode fill:#b5a8a0,stroke:#7a6b63,color:#2d3239
classDef afterModelNode fill:#b5a0a8,stroke:#7a636b,color:#2d3239
classDef afterAgentNode fill:#a0b5a8,stroke:#637a6b,color:#2d3239
class TD,UL,SB,VI beforeNode
class M modelNode
class CL,LD,SL,TI,DTC afterModelNode
class SBR,MEM afterAgentNode
```
Hard dependencies:
- ThreadData → Uploads → Sandbox (before_agent phase)
- Clarification must be last in the list (its after_model runs first in the reversed pass)
## 10. Middleware Differences: Lead Agent vs. Subagent
The lead agent and subagents share a base middleware chain (`_build_runtime_middlewares`); subagents trim it down:
| Middleware | Lead Agent | Subagent | Notes |
|------------|:-------:|:--------:|------|
| ThreadDataMiddleware | ✓ | ✓ | shared: creates the thread directory |
| UploadsMiddleware | ✓ | ✗ | lead agent only: scans uploaded files |
| SandboxMiddleware | ✓ | ✓ | shared: acquires/releases the sandbox |
| DanglingToolCallMiddleware | ✓ | ✗ | lead agent only: backfills missing ToolMessage |
| GuardrailMiddleware | ✓ | ✓ | shared: tool-call authorization (optional) |
| ToolErrorHandlingMiddleware | ✓ | ✓ | shared: tool exception handling |
| SummarizationMiddleware | ✓ | ✗ | |
| TodoMiddleware | ✓ | ✗ | |
| TitleMiddleware | ✓ | ✗ | |
| MemoryMiddleware | ✓ | ✗ | |
| ViewImageMiddleware | ✓ | ✗ | |
| SubagentLimitMiddleware | ✓ | ✗ | |
| LoopDetectionMiddleware | ✓ | ✗ | |
| ClarificationMiddleware | ✓ | ✗ | |
**Design principles**:
- `RuntimeFeatures`, `@Next/@Prev`, and the ordering mechanism apply only to the **lead agent**
- The subagent chain is short and fixed (4 middlewares); no dynamic assembly needed
- `extra_middleware` currently affects only the lead agent and is not passed down to subagents
@@ -0,0 +1,190 @@
# RFC: Extract Shared Skill Installer and Upload Manager into Harness
## 1. Problem
Gateway (`app/gateway/routers/skills.py`, `uploads.py`) and Client (`deerflow/client.py`) each independently implement the same business logic:
### Skill Installation
| Logic | Gateway (`skills.py`) | Client (`client.py`) |
|-------|----------------------|---------------------|
| Zip safety check | `_is_unsafe_zip_member()` | Inline `Path(info.filename).is_absolute()` |
| Symlink filtering | `_is_symlink_member()` | `p.is_symlink()` post-extraction delete |
| Zip bomb defence | `total_size += info.file_size` (declared) | `total_size > 100MB` (declared) |
| macOS metadata filter | `_should_ignore_archive_entry()` | None |
| Frontmatter validation | `_validate_skill_frontmatter()` | `_validate_skill_frontmatter()` |
| Duplicate detection | `HTTPException(409)` | `ValueError` |
**Two implementations, inconsistent behaviour**: Gateway streams writes and tracks real decompressed size; Client sums declared `file_size`. Gateway skips symlinks during extraction; Client extracts everything then walks and deletes symlinks.
### Upload Management
| Logic | Gateway (`uploads.py`) | Client (`client.py`) |
|-------|----------------------|---------------------|
| Directory access | `get_uploads_dir()` + `mkdir` | `_get_uploads_dir()` + `mkdir` |
| Filename safety | Inline `Path(f).name` + manual checks | No checks, uses `src_path.name` directly |
| Duplicate handling | None (overwrites) | None (overwrites) |
| Listing | Inline `iterdir()` | Inline `os.scandir()` |
| Deletion | Inline `unlink()` + traversal check | Inline `unlink()` + traversal check |
| Path traversal | `resolve().relative_to()` | `resolve().relative_to()` |
**The same traversal check is written twice** — any security fix must be applied to both locations.
## 2. Design Principles
### Dependency Direction
```
app.gateway.routers.skills ──┐
app.gateway.routers.uploads ──┤── calls ──→ deerflow.skills.installer
deerflow.client ──┘ deerflow.uploads.manager
```
- Shared modules live in the harness layer (`deerflow.*`), pure business logic, no FastAPI dependency
- Gateway handles HTTP adaptation (`UploadFile` → bytes, exceptions → `HTTPException`)
- Client handles local adaptation (`Path` → copy, exceptions → Python exceptions)
- Satisfies `test_harness_boundary.py` constraint: harness never imports app
### Exception Strategy
| Shared Layer Exception | Gateway Maps To | Client |
|----------------------|-----------------|--------|
| `FileNotFoundError` | `HTTPException(404)` | Propagates |
| `ValueError` | `HTTPException(400)` | Propagates |
| `SkillAlreadyExistsError` | `HTTPException(409)` | Propagates |
| `PermissionError` | `HTTPException(403)` | Propagates |
Replaces stringly-typed routing (`"already exists" in str(e)`) with typed exception matching (`SkillAlreadyExistsError`).
## 3. New Modules
### 3.1 `deerflow.skills.installer`
```python
# Safety checks
is_unsafe_zip_member(info: ZipInfo) -> bool # Absolute path / .. traversal
is_symlink_member(info: ZipInfo) -> bool # Unix symlink detection
should_ignore_archive_entry(path: Path) -> bool # __MACOSX / dotfiles
# Extraction
safe_extract_skill_archive(zip_ref, dest_path, max_total_size=512MB)
# Streaming write, accumulates real bytes (vs declared file_size)
# Dual traversal check: member-level + resolve-level
# Directory resolution
resolve_skill_dir_from_archive(temp_path: Path) -> Path
# Auto-enters single directory, filters macOS metadata
# Install entry point
install_skill_from_archive(zip_path, *, skills_root=None) -> dict
# is_file() pre-check before extension validation
# SkillAlreadyExistsError replaces ValueError
# Exception
class SkillAlreadyExistsError(ValueError)
```
### 3.2 `deerflow.uploads.manager`
```python
# Directory management
get_uploads_dir(thread_id: str) -> Path # Pure path, no side effects
ensure_uploads_dir(thread_id: str) -> Path # Creates directory (for write paths)
# Filename safety
normalize_filename(filename: str) -> str
# Path.name extraction + rejects ".." / "." / backslash / >255 bytes
deduplicate_filename(name: str, seen: set) -> str
# _N suffix increment for dedup, mutates seen in place
# Path safety
validate_path_traversal(path: Path, base: Path) -> None
# resolve().relative_to(), raises PermissionError on failure
# File operations
list_files_in_dir(directory: Path) -> dict
# scandir with stat inside context (no re-stat)
# follow_symlinks=False to prevent metadata leakage
# Non-existent directory returns empty list
delete_file_safe(base_dir: Path, filename: str) -> dict
# Validates traversal first, then unlinks
# URL helpers
upload_artifact_url(thread_id, filename) -> str # Percent-encoded for HTTP safety
upload_virtual_path(filename) -> str # Sandbox-internal path
enrich_file_listing(result, thread_id) -> dict # Adds URLs, stringifies sizes
```
## 4. Changes
### 4.1 Gateway Slimming
**`app/gateway/routers/skills.py`**:
- Remove `_is_unsafe_zip_member`, `_is_symlink_member`, `_safe_extract_skill_archive`, `_should_ignore_archive_entry`, `_resolve_skill_dir_from_archive_root` (~80 lines)
- `install_skill` route becomes a single call to `install_skill_from_archive(path)`
- Exception mapping: `SkillAlreadyExistsError → 409`, `ValueError → 400`, `FileNotFoundError → 404`
**`app/gateway/routers/uploads.py`**:
- Remove inline `get_uploads_dir` (replaced by `ensure_uploads_dir`/`get_uploads_dir`)
- `upload_files` uses `normalize_filename()` instead of inline safety checks
- `list_uploaded_files` uses `list_files_in_dir()` + enrichment
- `delete_uploaded_file` uses `delete_file_safe()` + companion markdown cleanup
### 4.2 Client Slimming
**`deerflow/client.py`**:
- Remove `_get_uploads_dir` static method
- Remove ~50 lines of inline zip handling in `install_skill`
- `install_skill` delegates to `install_skill_from_archive()`
- `upload_files` uses `deduplicate_filename()` + `ensure_uploads_dir()`
- `list_uploads` uses `get_uploads_dir()` + `list_files_in_dir()`
- `delete_upload` uses `get_uploads_dir()` + `delete_file_safe()`
- `update_mcp_config` / `update_skill` now reset `_agent_config_key = None`
### 4.3 Read/Write Path Separation
| Operation | Function | Creates dir? |
|-----------|----------|:------------:|
| upload (write) | `ensure_uploads_dir()` | Yes |
| list (read) | `get_uploads_dir()` | No |
| delete (read) | `get_uploads_dir()` | No |
Read paths no longer have `mkdir` side effects — non-existent directories return empty lists.
## 5. Security Improvements
| Improvement | Before | After |
|-------------|--------|-------|
| Zip bomb detection | Sum of declared `file_size` | Streaming write, accumulates real bytes |
| Symlink handling | Gateway skips / Client deletes post-extract | Unified skip + log |
| Traversal check | Member-level only | Member-level + `resolve().is_relative_to()` |
| Filename backslash | Gateway checks / Client doesn't | Unified rejection |
| Filename length | No check | Reject > 255 bytes (OS limit) |
| thread_id validation | None | Reject unsafe filesystem characters |
| Listing symlink leak | `follow_symlinks=True` (default) | `follow_symlinks=False` |
| 409 status routing | `"already exists" in str(e)` | `SkillAlreadyExistsError` type match |
| Artifact URL encoding | Raw filename in URL | `urllib.parse.quote()` |
## 6. Alternatives Considered
| Alternative | Why Not |
|-------------|---------|
| Keep logic in Gateway, Client calls Gateway via HTTP | Adds network dependency to embedded Client; defeats the purpose of `DeerFlowClient` as an in-process API |
| Abstract base class with Gateway/Client subclasses | Over-engineered for what are pure functions; no polymorphism needed |
| Move everything into `client.py` and have Gateway import it | Violates harness/app boundary — Client is in harness, but Gateway-specific models (Pydantic response types) should stay in app layer |
| Merge Gateway and Client into one module | They serve different consumers (HTTP vs in-process) with different adaptation needs |
## 7. Breaking Changes
**None.** All public APIs (Gateway HTTP endpoints, `DeerFlowClient` methods) retain their existing signatures and return formats. The `SkillAlreadyExistsError` is a subclass of `ValueError`, so existing `except ValueError` handlers still catch it.
## 8. Tests
| Module | Test File | Count |
|--------|-----------|:-----:|
| `skills.installer` | `tests/test_skills_installer.py` | 22 |
| `uploads.manager` | `tests/test_uploads_manager.py` | 20 |
| `client` hardening | `tests/test_client.py` (new cases) | ~40 |
| `client` e2e | `tests/test_client_e2e.py` (new file) | ~20 |
Coverage: unsafe zip / symlink / zip bomb / frontmatter / duplicate / extension / macOS filter / normalize / deduplicate / traversal / list / delete / agent invalidation / upload lifecycle / thread isolation / URL encoding / config pollution.
@@ -0,0 +1,446 @@
# [RFC] Add `grep` and `glob` File Search Tools to DeerFlow
## Summary
I believe this direction is right, and worth doing.
If DeerFlow wants to get closer to the real workflow of coding agents like Claude Code, `ls` / `read_file` / `write_file` / `str_replace` alone are not enough. Before making edits, a model usually needs two more capabilities:
- `glob`: quickly find files by path pattern
- `grep`: quickly find candidate locations by content pattern
The value of these tools is not that "bash could do it anyway"; it is that they replace the model's habit of reaching for `bash find` / `bash grep` / `rg` with lower token cost, stronger constraints, and a more stable output format.
But only if the implementation is right: **they should be read-only, structured, constrained, auditable native tools, not thin wrappers around shell commands.**
## Problem
DeerFlow's file tool layer currently covers:
- `ls`: browse directory structure
- `read_file`: read file contents
- `write_file`: write files
- `str_replace`: perform local string replacements
- `bash`: fallback command execution
This set can get tasks done, but it is inefficient during codebase exploration.
Typical problems:
1. To find "all `*.tsx` page files", the model can only `ls` through nested directories repeatedly, or fall back to `bash find`
2. To find "where a symbol / copy string / config key appears", it can only `read_file` file by file, or fall back to `bash grep` / `rg`
3. Once it falls back to `bash`, the tool call loses structured output, and results become harder to trim, paginate, audit, and keep consistent across sandboxes
4. In local modes without host bash enabled, `bash` may not even be available, leaving no sufficiently strong read-only search capability
Conclusion: what DeerFlow lacks is not "one more shell command" but a **filesystem search layer**.
## Goals
- Give the agent stable path-search and content-search capabilities
- Reduce dependence on `bash`, especially during repository exploration
- Stay consistent with the existing sandbox security model
- Produce structured output, so the model can chain into `read_file` / `str_replace`
- Let the local sandbox, container sandboxes, and future MCP filesystem tools obey the same semantics
## Non-Goals
- 不做通用 shell 兼容层
- 不暴露完整 grep/find/rg CLI 语法
- 不在第一版支持二进制检索、复杂 PCRE 特性、上下文窗口高亮渲染等重功能
- 不把它做成“任意磁盘搜索”,仍然只允许在 DeerFlow 已授权的路径内执行
## Why This Is Worth Doing
Following the design approach of agents like Claude Code, the core value of `glob` and `grep` is not new capability per se, but moving the common "explore the codebase" actions from an open-ended shell down into a controlled tool layer.
This yields several direct benefits:
1. **Lower burden on the model**
   The model no longer has to assemble `find`, `grep`, `rg`, `xargs`, quoting, and other command details itself.
2. **More stable cross-environment behavior**
   Local, Docker, and AIO sandboxes no longer depend on whether `rg` is installed in the container, and behavior cannot drift due to shell differences.
3. **Stronger security and auditing**
   The call parameters are simply "what to search, where to search, how many results at most", which is inherently easier to audit and rate-limit than arbitrary commands.
4. **Better token efficiency**
   `grep` returns match summaries rather than whole file sections; the model then calls `read_file` only on a few candidate paths.
5. **Friendly to `tool_search`**
   As DeerFlow's tool set keeps growing, `grep` / `glob` will become very high-frequency foundational tools, worth keeping as built-ins rather than letting the model always fall back to generic bash.
## Proposal
Add two built-in sandbox tools:
- `glob`
- `grep`
Recommended location, continuing the current layout:
- `backend/packages/harness/deerflow/sandbox/tools.py`
Also add them to the `file:read` group by default in `config.example.yaml`.
### 1. The `glob` Tool
Purpose: find files or directories by path pattern.
Suggested schema:
```python
@tool("glob", parse_docstring=True)
def glob_tool(
    runtime: ToolRuntime[ContextT, ThreadState],
    description: str,
    pattern: str,
    path: str,
    include_dirs: bool = False,
    max_results: int = 200,
) -> str:
    ...
```
Parameter semantics:
- `description`: consistent with existing tools
- `pattern`: a glob pattern, e.g. `**/*.py` or `src/**/test_*.ts`
- `path`: the search root, which must be an absolute path
- `include_dirs`: whether to return directories
- `max_results`: maximum number of entries returned, to avoid blowing up the context in one call
Suggested return format:
```text
Found 3 paths under /mnt/user-data/workspace
1. /mnt/user-data/workspace/backend/app.py
2. /mnt/user-data/workspace/backend/tests/test_app.py
3. /mnt/user-data/workspace/scripts/build.py
```
If we later want output better suited to frontend consumption, this could become a JSON string; for the first version, readable text keeps it consistent with the existing tool style.
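As a rough sketch of the intended local-mode traversal (the function name and the ignore set here are assumptions for illustration, not DeerFlow's actual implementation; path-permission validation would run before this in the real tool):

```python
from pathlib import Path

# Assumed shared ignore set; the RFC proposes aligning this with list_dir
DEFAULT_IGNORES = {".git", "node_modules", "__pycache__", ".venv"}


def glob_paths(root: str, pattern: str, include_dirs: bool = False,
               max_results: int = 200) -> list[str]:
    """Read-only glob over an already-validated root directory."""
    base = Path(root)
    if not base.is_dir():
        raise ValueError(f"Not a directory: {root}")
    results: list[str] = []
    for p in sorted(base.glob(pattern)):
        # Skip anything inside a default-ignored directory
        if DEFAULT_IGNORES.intersection(p.relative_to(base).parts):
            continue
        if p.is_dir() and not include_dirs:
            continue
        results.append(str(p))
        if len(results) >= max_results:
            break  # hard cap to protect the context window
    return results
```

Sorting before truncation keeps output stable across calls, which matters when the model compares successive results.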
### 2. The `grep` Tool
Purpose: search file contents by pattern and return match-location summaries.
Suggested schema:
```python
@tool("grep", parse_docstring=True)
def grep_tool(
    runtime: ToolRuntime[ContextT, ThreadState],
    description: str,
    pattern: str,
    path: str,
    glob: str | None = None,
    literal: bool = False,
    case_sensitive: bool = False,
    max_results: int = 100,
) -> str:
    ...
```
Parameter semantics:
- `pattern`: a search term or regular expression
- `path`: the search root, which must be an absolute path
- `glob`: optional path filter, e.g. `**/*.py`
- `literal`: when `True`, match as a plain string rather than interpreting the pattern as a regex
- `case_sensitive`: whether matching is case-sensitive
- `max_results`: maximum number of matches (not files) returned
Suggested return format:
```text
Found 4 matches under /mnt/user-data/workspace
/mnt/user-data/workspace/backend/config.py:12: TOOL_GROUPS = [...]
/mnt/user-data/workspace/backend/config.py:48: def load_tool_config(...):
/mnt/user-data/workspace/backend/tools.py:91: "tool_groups"
/mnt/user-data/workspace/backend/tests/test_config.py:22: assert "tool_groups" in data
```
The first version should return only:
- file path
- line number
- a summary of the matching line
No context blocks, to keep results small. If the model needs context, it can call `read_file(path, start_line, end_line)`.
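The scanning loop could be sketched like this (a minimal illustration under assumed names; the real tool would additionally apply the `glob` filter and size caps discussed below):

```python
import re
from pathlib import Path

MAX_LINE_LEN = 200  # truncate long matching lines


def grep_paths(root: str, pattern: str, *, literal: bool = False,
               case_sensitive: bool = False, max_results: int = 100) -> list[str]:
    """Scan text files under root, returning 'path:line: summary' entries."""
    flags = 0 if case_sensitive else re.IGNORECASE
    rx = re.compile(re.escape(pattern) if literal else pattern, flags)
    hits: list[str] = []
    # Sorted traversal keeps output stable across runs
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for lineno, line in enumerate(text.splitlines(), start=1):
            if rx.search(line):
                hits.append(f"{path}:{lineno}: {line.strip()[:MAX_LINE_LEN]}")
                if len(hits) >= max_results:
                    return hits  # hard cap on matches, not files
    return hits
```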
## Design Principles
### A. No shell wrappers
`grep` should not be implemented as:
```python
subprocess.run("grep ...")
```
Nor should we assemble `find` / `rg` commands inside the container.
Reasons:
- It introduces shell quoting issues and an injection surface
- It depends on whether each sandbox image ships the same set of commands
- Behavior differs across Windows / macOS / Linux
- It is hard to reliably control result counts and output format
The right direction:
- `glob` uses Python standard-library path traversal
- `grep` scans files one by one in Python
- Output is formatted by DeerFlow itself
If we later want to prefer `rg` for performance, that should be encapsulated inside the provider with unchanged external semantics, rather than exposing the CLI to the model.
### B. Reuse DeerFlow's path-permission model
Both tools must reuse the path-validation logic of the current `ls` / `read_file`:
- Local mode goes through `validate_local_tool_path(..., read_only=True)`
- Support `/mnt/skills/...`
- Support `/mnt/acp-workspace/...`
- Support virtual-path resolution for thread workspace / uploads / outputs
- Explicitly reject out-of-scope paths and path traversal
In other words, they belong to **file:read**; they are not a privilege-escalation side door around `bash`.
### C. Results must be hard-limited
`glob` / `grep` without hard limits can easily blow up the context.
The first version should limit at least:
- `glob.max_results`: default 200, maximum 1000
- `grep.max_results`: default 100, maximum 500
- Maximum length of a single match-line summary, e.g. 200 characters
- Binary files skipped
- Oversized files skipped, e.g. single files above 1 MB, or controlled by configuration
Additionally, when the number of matches exceeds the threshold, the response should state:
- how many entries are shown
- the fact that results were truncated
- a suggestion to narrow the search scope
For example:
```text
Found more than 100 matches, showing first 100. Narrow the path or add a glob filter.
```
### D. Tool semantics should complement each other
The recommended model workflow is:
1. `glob` to find candidate files
2. `grep` to find candidate locations
3. `read_file` to read local context
4. `str_replace` / `write_file` to make the change
This keeps tool boundaries clean and makes it easier to teach the model a stable habit in the prompt.
## Implementation Approach
### Option A: Implement the first version directly in `sandbox/tools.py`
This is the starting approach I recommend.
Approach:
- Add `glob_tool` and `grep_tool` to `sandbox/tools.py`
- In the local sandbox scenario, use Python filesystem APIs directly
- In non-local sandbox scenarios, prefer implementing through DeerFlow's own path-access layer as well
Pros:
- Small change surface
- Lets us validate the agent-facing value quickly
- No need to change the `Sandbox` abstraction first
Cons:
- `tools.py` keeps getting fatter
- A later provider-side performance optimization would require another round of abstraction
### Option B: Extend the `Sandbox` abstraction first
For example, add:
```python
class Sandbox(ABC):
    def glob(self, path: str, pattern: str, include_dirs: bool = False, max_results: int = 200) -> list[str]:
        ...

    def grep(
        self,
        path: str,
        pattern: str,
        *,
        glob: str | None = None,
        literal: bool = False,
        case_sensitive: bool = False,
        max_results: int = 100,
    ) -> list[GrepMatch]:
        ...
```
Pros:
- Cleaner abstraction
- Container / remote sandboxes can each optimize independently
Cons:
- Higher up-front cost
- Requires updating all sandbox providers in lockstep
Conclusion:
**For the first version, go with Option A; once the tools' value is proven, push them down into the `Sandbox` abstraction layer.**
## Detailed Behavior
### `glob` Behavior
- Root directory does not exist: return a clear error
- Root path is not a directory: return a clear error
- Invalid pattern: return a clear error
- Empty result: return `No files matched`
- Default ignores should align with the current `list_dir` as much as possible, e.g.:
  - `.git`
  - `node_modules`
  - `__pycache__`
  - `.venv`
  - build-output directories
A shared ignore set should be factored out here, so `ls` and `glob` results do not diverge in style.
### `grep` Behavior
- Scan only text files by default
- Skip files detected as binary
- Skip oversized files outright, or scan only the first N KB
- Return a parameter error when regex compilation fails
- Continue to use virtual paths in output rather than exposing real host paths
- Sort results by file path then line number by default, to keep output stable
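The binary-skip check can be a simple heuristic; this sketch (an assumption, mirroring the common "NUL byte in the first block" convention used by grep-like tools) probes only the start of the file:

```python
def is_text_file(path: str, probe_bytes: int = 8192) -> bool:
    """Heuristic text check: a NUL byte in the first block means binary."""
    with open(path, "rb") as f:
        return b"\x00" not in f.read(probe_bytes)
```

Reading only the first few kilobytes keeps the check cheap even for the oversized files that would be skipped anyway.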
## Prompting Guidance
If these two tools are introduced, the file-operation guidance in the system prompt should be updated alongside them:
- Prefer `glob` when searching for filename patterns
- Prefer `grep` when searching for code symbols, config keys, or copy
- Fall back to `bash` only when these tools cannot accomplish the goal
Otherwise the model will keep habitually reaching for `bash` first.
## Risks
### 1. Overlap with `bash`
This is true, but not a problem.
`ls` and `read_file` can also be replaced by `bash`, yet we keep them, because structured tools suit agents better.
### 2. Performance
On large repositories, a pure-Python `grep` can be slower than `rg`.
Mitigations:
- Start with result caps and file-size caps in the first version
- Require a root path on every call
- Provide `glob` filtering to narrow the scan scope
- If necessary later, optimize with `rg` inside the provider while keeping the same schema
### 3. Inconsistent ignore rules
If `ls` can see a path that `glob` cannot, the model will be confused.
Mitigations:
- Unify the ignore rules
- Document explicitly that common dependency and build directories are skipped by default
### 4. Overly complex regex search
Supporting many grep dialects in the first version would blur the boundaries badly.
Mitigations:
- Support only Python `re` in the first version
- Provide a simple `literal=True` mode
## Alternatives Considered
### A. Add nothing and rely entirely on `bash`
Not recommended.
This would leave DeerFlow persistently behind on codebase-exploration experience, and weaken it in bash-less or bash-restricted environments.
### B. Add only `glob`, not `grep`
Not recommended.
It solves "find the file" but not "find the location"; the model would still fall back to `bash grep`.
### C. Add only `grep`, not `glob`
Also not recommended.
Without path-pattern filtering, `grep`'s scan scope is often too large; `glob` is its natural companion.
### D. Use an MCP filesystem server's search capability directly
Not recommended as the primary path in the short term.
MCP can be a supplement, but as foundational DeerFlow coding tools, `glob` / `grep` are best kept built-in so they are reliably available in a default installation.
## Acceptance Criteria
- `glob` and `grep` can be enabled by default in `config.example.yaml`
- Both tools belong to the `file:read` group
- Existing path permissions are strictly enforced in the local sandbox
- Output does not leak real host paths
- Large result sets are truncated with a clear notice
- The model can complete a typical code-change flow via `glob -> grep -> read_file -> str_replace`
- Repository-exploration capability improves noticeably in local mode with host bash disabled
## Rollout Plan
1.`sandbox/tools.py` 中实现 `glob_tool``grep_tool`
2. 抽取与 `list_dir` 一致的 ignore 规则,避免行为漂移
3.`config.example.yaml` 默认加入工具配置
4. 为本地路径校验、虚拟路径映射、结果截断、二进制跳过补测试
5. 更新 README / backend docs / prompt guidance
6. 收集实际 agent 调用数据,再决定是否下沉到 `Sandbox` 抽象
## Suggested Config
```yaml
tools:
  - name: glob
    group: file:read
    use: deerflow.sandbox.tools:glob_tool
  - name: grep
    group: file:read
    use: deerflow.sandbox.tools:grep_tool
```
## Final Recommendation
The conclusion: **yes, we can add them, and we should.**
But I would draw three hard boundaries:
1. `grep` / `glob` must be built-in, read-only, structured tools
2. The first version must not be a shell wrapper; do not expose CLI dialects to the model
3. Prove the value in `sandbox/tools.py` first, then consider pushing down into the `Sandbox` provider abstraction
Done this way, it will meaningfully improve DeerFlow's usability in coding / repo-exploration scenarios, with controllable risk.

# Conversation Summarization
DeerFlow includes automatic conversation summarization to handle long conversations that approach model token limits. When enabled, the system automatically condenses older messages while preserving recent context.
## Overview
The summarization feature uses LangChain's `SummarizationMiddleware` to monitor conversation history and trigger summarization based on configurable thresholds. When activated, it:
1. Monitors message token counts in real-time
2. Triggers summarization when thresholds are met
3. Keeps recent messages intact while summarizing older exchanges
4. Maintains AI/Tool message pairs together for context continuity
5. Injects the summary back into the conversation
## Configuration
Summarization is configured in `config.yaml` under the `summarization` key:
```yaml
summarization:
  enabled: true
  model_name: null  # Use default model or specify a lightweight model

  # Trigger conditions (OR logic - any condition triggers summarization)
  trigger:
    - type: tokens
      value: 4000
    # Additional triggers (optional)
    # - type: messages
    #   value: 50
    # - type: fraction
    #   value: 0.8  # 80% of model's max input tokens

  # Context retention policy
  keep:
    type: messages
    value: 20

  # Token trimming for summarization call
  trim_tokens_to_summarize: 4000

  # Custom summary prompt (optional)
  summary_prompt: null
```
### Configuration Options
#### `enabled`
- **Type**: Boolean
- **Default**: `false`
- **Description**: Enable or disable automatic summarization
#### `model_name`
- **Type**: String or null
- **Default**: `null` (uses default model)
- **Description**: Model to use for generating summaries. Recommended to use a lightweight, cost-effective model like `gpt-4o-mini` or equivalent.
#### `trigger`
- **Type**: Single `ContextSize` or list of `ContextSize` objects
- **Required**: At least one trigger must be specified when enabled
- **Description**: Thresholds that trigger summarization. Uses OR logic - summarization runs when ANY threshold is met.
**ContextSize Types:**
1. **Token-based trigger**: Activates when token count reaches the specified value
```yaml
trigger:
  type: tokens
  value: 4000
```
2. **Message-based trigger**: Activates when message count reaches the specified value
```yaml
trigger:
  type: messages
  value: 50
```
3. **Fraction-based trigger**: Activates when token usage reaches a percentage of the model's maximum input tokens
```yaml
trigger:
  type: fraction
  value: 0.8  # 80% of max input tokens
```
**Multiple Triggers:**
```yaml
trigger:
  - type: tokens
    value: 4000
  - type: messages
    value: 50
```
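The OR semantics can be sketched as follows (the function name and dict-based trigger representation are illustrative, not the middleware's actual API):

```python
def should_summarize(token_count: int, message_count: int,
                     max_input_tokens: int, triggers: list[dict]) -> bool:
    """Return True when ANY configured trigger threshold is met (OR logic)."""
    for t in triggers:
        if t["type"] == "tokens" and token_count >= t["value"]:
            return True
        if t["type"] == "messages" and message_count >= t["value"]:
            return True
        if t["type"] == "fraction" and token_count >= t["value"] * max_input_tokens:
            return True
    return False
```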
#### `keep`
- **Type**: `ContextSize` object
- **Default**: `{type: messages, value: 20}`
- **Description**: Specifies how much recent conversation history to preserve after summarization.
**Examples:**
```yaml
# Keep most recent 20 messages
keep:
  type: messages
  value: 20

# Keep most recent 3000 tokens
keep:
  type: tokens
  value: 3000

# Keep most recent 30% of model's max input tokens
keep:
  type: fraction
  value: 0.3
```
#### `trim_tokens_to_summarize`
- **Type**: Integer or null
- **Default**: `4000`
- **Description**: Maximum tokens to include when preparing messages for the summarization call itself. Set to `null` to skip trimming (not recommended for very long conversations).
#### `summary_prompt`
- **Type**: String or null
- **Default**: `null` (uses LangChain's default prompt)
- **Description**: Custom prompt template for generating summaries. The prompt should guide the model to extract the most important context.
**Default Prompt Behavior:**
The default LangChain prompt instructs the model to:
- Extract highest quality/most relevant context
- Focus on information critical to the overall goal
- Avoid repeating completed actions
- Return only the extracted context
## How It Works
### Summarization Flow
1. **Monitoring**: Before each model call, the middleware counts tokens in the message history
2. **Trigger Check**: If any configured threshold is met, summarization is triggered
3. **Message Partitioning**: Messages are split into:
- Messages to summarize (older messages beyond the `keep` threshold)
- Messages to preserve (recent messages within the `keep` threshold)
4. **Summary Generation**: The model generates a concise summary of the older messages
5. **Context Replacement**: The message history is updated:
- All old messages are removed
- A single summary message is added
- Recent messages are preserved
6. **AI/Tool Pair Protection**: The system ensures AI messages and their corresponding tool messages stay together
### Token Counting
- Uses approximate token counting based on character count
- For Anthropic models: ~3.3 characters per token
- For other models: Uses LangChain's default estimation
- Can be customized with a custom `token_counter` function
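A character-based estimate in the spirit of the above (the exact divisor is model-dependent; 3.3 is the Anthropic figure cited here, and this helper is an illustration rather than the middleware's internal counter):

```python
def approx_token_count(text: str, chars_per_token: float = 3.3) -> int:
    # Rough character-count heuristic; production code may use real tokenizers
    return max(1, round(len(text) / chars_per_token))
```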
### Message Preservation
The middleware intelligently preserves message context:
- **Recent Messages**: Always kept intact based on `keep` configuration
- **AI/Tool Pairs**: Never split - if a cutoff point falls within tool messages, the system adjusts to keep the entire AI + Tool message sequence together
- **Summary Format**: Summary is injected as a HumanMessage with the format:
```
Here is a summary of the conversation to date:
[Generated summary text]
```
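The pair-protection rule can be sketched like this (the message shape and helper name are assumptions; the real middleware operates on LangChain message objects):

```python
def safe_cutoff(messages: list[dict], keep_last: int) -> int:
    """Index separating messages to summarize from messages to keep.

    If the naive cutoff would strand ToolMessages from the AIMessage
    that issued their tool calls, move it earlier so the pair stays intact.
    """
    cut = max(0, len(messages) - keep_last)
    while cut > 0 and messages[cut]["role"] == "tool":
        cut -= 1  # back up to include the preceding AI message
    return cut
```

Everything before the returned index is summarized; everything at or after it is preserved verbatim.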
## Best Practices
### Choosing Trigger Thresholds
1. **Token-based triggers**: Recommended for most use cases
- Set to 60-80% of your model's context window
- Example: For 8K context, use 4000-6000 tokens
2. **Message-based triggers**: Useful for controlling conversation length
- Good for applications with many short messages
- Example: 50-100 messages depending on average message length
3. **Fraction-based triggers**: Ideal when using multiple models
- Automatically adapts to each model's capacity
- Example: 0.8 (80% of model's max input tokens)
### Choosing Retention Policy (`keep`)
1. **Message-based retention**: Best for most scenarios
- Preserves natural conversation flow
- Recommended: 15-25 messages
2. **Token-based retention**: Use when precise control is needed
- Good for managing exact token budgets
- Recommended: 2000-4000 tokens
3. **Fraction-based retention**: For multi-model setups
- Automatically scales with model capacity
- Recommended: 0.2-0.4 (20-40% of max input)
### Model Selection
- **Recommended**: Use a lightweight, cost-effective model for summaries
- Examples: `gpt-4o-mini`, `claude-haiku`, or equivalent
- Summaries don't require the most powerful models
- Significant cost savings on high-volume applications
- **Default**: If `model_name` is `null`, uses the default model
- May be more expensive but ensures consistency
- Good for simple setups
### Optimization Tips
1. **Balance triggers**: Combine token and message triggers for robust handling
```yaml
trigger:
  - type: tokens
    value: 4000
  - type: messages
    value: 50
```
2. **Conservative retention**: Keep more messages initially, adjust based on performance
```yaml
keep:
  type: messages
  value: 25  # Start higher, reduce if needed
```
3. **Trim strategically**: Limit tokens sent to summarization model
```yaml
trim_tokens_to_summarize: 4000 # Prevents expensive summarization calls
```
4. **Monitor and iterate**: Track summary quality and adjust configuration
## Troubleshooting
### Summary Quality Issues
**Problem**: Summaries losing important context
**Solutions**:
1. Increase `keep` value to preserve more messages
2. Decrease trigger thresholds to summarize earlier
3. Customize `summary_prompt` to emphasize key information
4. Use a more capable model for summarization
### Performance Issues
**Problem**: Summarization calls taking too long
**Solutions**:
1. Use a faster model for summaries (e.g., `gpt-4o-mini`)
2. Reduce `trim_tokens_to_summarize` to send less context
3. Increase trigger thresholds to summarize less frequently
### Token Limit Errors
**Problem**: Still hitting token limits despite summarization
**Solutions**:
1. Lower trigger thresholds to summarize earlier
2. Reduce `keep` value to preserve fewer messages
3. Check if individual messages are very large
4. Consider using fraction-based triggers
## Implementation Details
### Code Structure
- **Configuration**: `packages/harness/deerflow/config/summarization_config.py`
- **Integration**: `packages/harness/deerflow/agents/lead_agent/agent.py`
- **Middleware**: Uses `langchain.agents.middleware.SummarizationMiddleware`
### Middleware Order
Summarization runs after ThreadData and Sandbox initialization but before Title and Clarification:
1. ThreadDataMiddleware
2. SandboxMiddleware
3. **SummarizationMiddleware** ← Runs here
4. TitleMiddleware
5. ClarificationMiddleware
### State Management
- Summarization is stateless - configuration is loaded once at startup
- Summaries are added as regular messages in the conversation history
- The checkpointer persists the summarized history automatically
## Example Configurations
### Minimal Configuration
```yaml
summarization:
  enabled: true
  trigger:
    type: tokens
    value: 4000
  keep:
    type: messages
    value: 20
```
### Production Configuration
```yaml
summarization:
  enabled: true
  model_name: gpt-4o-mini  # Lightweight model for cost efficiency
  trigger:
    - type: tokens
      value: 6000
    - type: messages
      value: 75
  keep:
    type: messages
    value: 25
  trim_tokens_to_summarize: 5000
```
### Multi-Model Configuration
```yaml
summarization:
  enabled: true
  model_name: gpt-4o-mini
  trigger:
    type: fraction
    value: 0.7  # 70% of model's max input
  keep:
    type: fraction
    value: 0.3  # Keep 30% of max input
  trim_tokens_to_summarize: 4000
```
### Conservative Configuration (High Quality)
```yaml
summarization:
enabled: true
model_name: gpt-4 # Use full model for high-quality summaries
trigger:
type: tokens
value: 8000
keep:
type: messages
value: 40 # Keep more context
trim_tokens_to_summarize: null # No trimming
```
## References
- [LangChain Summarization Middleware Documentation](https://docs.langchain.com/oss/python/langchain/middleware/built-in#summarization)
- [LangChain Source Code](https://github.com/langchain-ai/langchain)

# Task Tool Improvements
## Overview
The task tool has been improved to eliminate wasteful LLM polling. Previously, when using background tasks, the LLM had to repeatedly call `task_status` to poll for completion, causing unnecessary API requests.
## Changes Made
### 1. Removed `run_in_background` Parameter
The `run_in_background` parameter has been removed from the `task` tool. All subagent tasks now run asynchronously by default, but the tool handles completion automatically.
**Before:**
```python
# LLM had to manage polling
task_id = task(
    subagent_type="bash",
    prompt="Run tests",
    description="Run tests",
    run_in_background=True,
)
# Then LLM had to poll repeatedly:
while True:
    status = task_status(task_id)
    if completed:
        break
```
**After:**
```python
# Tool blocks until complete, polling happens in backend
result = task(
    subagent_type="bash",
    prompt="Run tests",
    description="Run tests",
)
# Result is available immediately after the call returns
```
### 2. Backend Polling
The `task_tool` now:
- Starts the subagent task asynchronously
- Polls for completion in the backend (every 2 seconds)
- Blocks the tool call until completion
- Returns the final result directly
This means:
- ✅ LLM makes only ONE tool call
- ✅ No wasteful LLM polling requests
- ✅ Backend handles all status checking
- ✅ Timeout protection (5 minutes max)
### 3. Removed `task_status` from LLM Tools
The `task_status_tool` is no longer exposed to the LLM. It's kept in the codebase for potential internal/debugging use, but the LLM cannot call it.
### 4. Updated Documentation
- Updated `SUBAGENT_SECTION` in `prompt.py` to remove all references to background tasks and polling
- Simplified usage examples
- Made it clear that the tool automatically waits for completion
## Implementation Details
### Polling Logic
Located in `packages/harness/deerflow/tools/builtins/task_tool.py`:
```python
# Start background execution
task_id = executor.execute_async(prompt)

# Poll for task completion in the backend
poll_count = 0
while True:
    result = get_background_task_result(task_id)

    # Check whether the task completed or failed
    if result.status == SubagentStatus.COMPLETED:
        return f"[Subagent: {subagent_type}]\n\n{result.result}"
    elif result.status == SubagentStatus.FAILED:
        return f"[Subagent: {subagent_type}] Task failed: {result.error}"

    # Timeout protection (150 polls x 2 seconds = 5 minutes)
    poll_count += 1
    if poll_count > 150:
        return "Task timed out after 5 minutes"

    # Wait before next poll
    time.sleep(2)
```
### Execution Timeout
In addition to polling timeout, subagent execution now has a built-in timeout mechanism:
**Configuration** (`packages/harness/deerflow/subagents/config.py`):
```python
@dataclass
class SubagentConfig:
    # ...
    timeout_seconds: int = 300  # 5 minutes default
```
**Thread Pool Architecture**:
To avoid nested thread pools and resource waste, we use two dedicated thread pools:
1. **Scheduler Pool** (`_scheduler_pool`):
- Max workers: 4
- Purpose: Orchestrates background task execution
- Runs `run_task()` function that manages task lifecycle
2. **Execution Pool** (`_execution_pool`):
- Max workers: 8 (larger to avoid blocking)
- Purpose: Actual subagent execution with timeout support
- Runs `execute()` method that invokes the agent
**How it works**:
```python
# In execute_async():
_scheduler_pool.submit(run_task) # Submit orchestration task
# In run_task():
future = _execution_pool.submit(self.execute, task) # Submit execution
exec_result = future.result(timeout=timeout_seconds) # Wait with timeout
```
**Benefits**:
- ✅ Clean separation of concerns (scheduling vs execution)
- ✅ No nested thread pools
- ✅ Timeout enforcement at the right level
- ✅ Better resource utilization
**Two-Level Timeout Protection**:
1. **Execution Timeout**: Subagent execution itself has a 5-minute timeout (configurable in SubagentConfig)
2. **Polling Timeout**: Tool polling has a 5-minute timeout (150 polls × 2 seconds)
This ensures that even if subagent execution hangs, the system won't wait indefinitely.
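A self-contained sketch of the two-pool pattern (pool sizes match the description above; the task body and names here are stand-ins, not the actual DeerFlow code):

```python
import concurrent.futures as cf
import time

scheduler_pool = cf.ThreadPoolExecutor(max_workers=4)  # orchestrates task lifecycle
execution_pool = cf.ThreadPoolExecutor(max_workers=8)  # runs the actual subagent work


def execute(task: str) -> str:
    time.sleep(0.1)  # stand-in for real subagent execution
    return f"done: {task}"


def run_task(task: str, timeout_seconds: float = 300.0) -> str:
    # Execution runs in a separate pool so the timeout wraps it cleanly
    future = execution_pool.submit(execute, task)
    try:
        return future.result(timeout=timeout_seconds)
    except cf.TimeoutError:
        return "Task timed out"


# The scheduler pool runs run_task; the caller just waits on its future
result = scheduler_pool.submit(run_task, "run tests").result()
```

Because the two pools are separate, a hung `execute` call occupies an execution worker but never blocks the scheduler from timing out and returning.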
### Benefits
1. **Reduced API Costs**: No more repeated LLM requests for polling
2. **Simpler UX**: LLM doesn't need to manage polling logic
3. **Better Reliability**: Backend handles all status checking consistently
4. **Timeout Protection**: Two-level timeout prevents infinite waiting (execution + polling)
## Testing
To verify the changes work correctly:
1. Start a subagent task that takes a few seconds
2. Verify the tool call blocks until completion
3. Verify the result is returned directly
4. Verify no `task_status` calls are made
Example test scenario:
```python
# This should block for ~10 seconds then return result
result = task(
    subagent_type="bash",
    prompt="sleep 10 && echo 'Done'",
    description="Test task",
)
# result should contain "Done"
```
## Migration Notes
For users/code that previously used `run_in_background=True`:
- Simply remove the parameter
- Remove any polling logic
- The tool will automatically wait for completion
No other changes needed - the API is backward compatible (minus the removed parameter).