Initial commit: hardened DeerFlow factory
Vendored deer-flow upstream (bytedance/deer-flow) plus prompt-injection hardening: - New deerflow.security package: content_delimiter, html_cleaner, sanitizer (8 layers — invisible chars, control chars, symbols, NFC, PUA, tag chars, horizontal whitespace collapse with newline/tab preservation, length cap) - New deerflow.community.searx package: web_search, web_fetch, image_search backed by a private SearX instance, every external string sanitized and wrapped in <<<EXTERNAL_UNTRUSTED_CONTENT>>> delimiters - All native community web providers (ddg_search, tavily, exa, firecrawl, jina_ai, infoquest, image_search) replaced with hard-fail stubs that raise NativeWebToolDisabledError at import time, so a misconfigured tool.use path fails loud rather than silently falling back to unsanitized output - Native client back-doors (jina_client.py, infoquest_client.py) stubbed too - Native-tool tests quarantined under tests/_disabled_native/ (collect_ignore_glob via local conftest.py) - Sanitizer Layer 7 fix: only collapse horizontal whitespace, preserve newlines and tabs so list/table structure survives - Hardened runtime config.yaml references only the searx-backed tools - Factory overlay (backend/) kept in sync with deer-flow tree as a reference / source See HARDENING.md for the full audit trail and verification steps.
This commit is contained in:
92
deer-flow/backend/docs/SETUP.md
Normal file
92
deer-flow/backend/docs/SETUP.md
Normal file
@@ -0,0 +1,92 @@
|
||||
# Setup Guide
|
||||
|
||||
Quick setup instructions for DeerFlow.
|
||||
|
||||
## Configuration Setup
|
||||
|
||||
DeerFlow uses a YAML configuration file that should be placed in the **project root directory**.
|
||||
|
||||
### Steps
|
||||
|
||||
1. **Navigate to project root**:
|
||||
```bash
|
||||
cd /path/to/deer-flow
|
||||
```
|
||||
|
||||
2. **Copy example configuration**:
|
||||
```bash
|
||||
cp config.example.yaml config.yaml
|
||||
```
|
||||
|
||||
3. **Edit configuration**:
|
||||
```bash
|
||||
# Option A: Set environment variables (recommended)
|
||||
export OPENAI_API_KEY="your-key-here"
|
||||
|
||||
# Option B: Edit config.yaml directly
|
||||
vim config.yaml # or your preferred editor
|
||||
```
|
||||
|
||||
4. **Verify configuration**:
|
||||
```bash
|
||||
cd backend
|
||||
python -c "from deerflow.config import get_app_config; print('✓ Config loaded:', get_app_config().models[0].name)"
|
||||
```
|
||||
|
||||
## Important Notes
|
||||
|
||||
- **Location**: `config.yaml` should be in `deer-flow/` (project root), not `deer-flow/backend/`
|
||||
- **Git**: `config.yaml` is automatically ignored by git (contains secrets)
|
||||
- **Priority**: If both `backend/config.yaml` and `../config.yaml` exist, backend version takes precedence
|
||||
|
||||
## Configuration File Locations
|
||||
|
||||
The backend searches for `config.yaml` in this order:
|
||||
|
||||
1. `DEER_FLOW_CONFIG_PATH` environment variable (if set)
|
||||
2. `backend/config.yaml` (current directory when running from backend/)
|
||||
3. `deer-flow/config.yaml` (parent directory - **recommended location**)
|
||||
|
||||
**Recommended**: Place `config.yaml` in project root (`deer-flow/config.yaml`).
|
||||
|
||||
## Sandbox Setup (Optional but Recommended)
|
||||
|
||||
If you plan to use Docker/Container-based sandbox (configured in `config.yaml` under `sandbox.use: deerflow.community.aio_sandbox:AioSandboxProvider`), it's highly recommended to pre-pull the container image:
|
||||
|
||||
```bash
|
||||
# From project root
|
||||
make setup-sandbox
|
||||
```
|
||||
|
||||
**Why pre-pull?**
|
||||
- The sandbox image (~500MB+) is pulled on first use, causing a long wait
|
||||
- Pre-pulling provides clear progress indication
|
||||
- Avoids confusion when first using the agent
|
||||
|
||||
If you skip this step, the image will be automatically pulled on first agent execution, which may take several minutes depending on your network speed.
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Config file not found
|
||||
|
||||
```bash
|
||||
# Check where the backend is looking
|
||||
cd deer-flow/backend
|
||||
python -c "from deerflow.config.app_config import AppConfig; print(AppConfig.resolve_config_path())"
|
||||
```
|
||||
|
||||
If it can't find the config:
|
||||
1. Ensure you've copied `config.example.yaml` to `config.yaml`
|
||||
2. Verify you're in the correct directory
|
||||
3. Check the file exists: `ls -la ../config.yaml`
|
||||
|
||||
### Permission denied
|
||||
|
||||
```bash
|
||||
chmod 600 ../config.yaml # Protect sensitive configuration
|
||||
```
|
||||
|
||||
## See Also
|
||||
|
||||
- [Configuration Guide](CONFIGURATION.md) - Detailed configuration options
|
||||
- [Architecture Overview](../CLAUDE.md) - System architecture
|
||||
Reference in New Issue
Block a user