Initial commit: hardened DeerFlow factory
Vendored deer-flow upstream (bytedance/deer-flow) plus prompt-injection hardening: - New deerflow.security package: content_delimiter, html_cleaner, sanitizer (8 layers — invisible chars, control chars, symbols, NFC, PUA, tag chars, horizontal whitespace collapse with newline/tab preservation, length cap) - New deerflow.community.searx package: web_search, web_fetch, image_search backed by a private SearX instance, every external string sanitized and wrapped in <<<EXTERNAL_UNTRUSTED_CONTENT>>> delimiters - All native community web providers (ddg_search, tavily, exa, firecrawl, jina_ai, infoquest, image_search) replaced with hard-fail stubs that raise NativeWebToolDisabledError at import time, so a misconfigured tool.use path fails loud rather than silently falling back to unsanitized output - Native client back-doors (jina_client.py, infoquest_client.py) stubbed too - Native-tool tests quarantined under tests/_disabled_native/ (collect_ignore_glob via local conftest.py) - Sanitizer Layer 7 fix: only collapse horizontal whitespace, preserve newlines and tabs so list/table structure survives - Hardened runtime config.yaml references only the searx-backed tools - Factory overlay (backend/) kept in sync with deer-flow tree as a reference / source See HARDENING.md for the full audit trail and verification steps.
This commit is contained in:
55
deer-flow/backend/docs/README.md
Normal file
55
deer-flow/backend/docs/README.md
Normal file
@@ -0,0 +1,55 @@
|
||||
# Documentation
|
||||
|
||||
This directory contains detailed documentation for the DeerFlow backend.
|
||||
|
||||
## Quick Links
|
||||
|
||||
| Document | Description |
|
||||
|----------|-------------|
|
||||
| [ARCHITECTURE.md](ARCHITECTURE.md) | System architecture overview |
|
||||
| [API.md](API.md) | Complete API reference |
|
||||
| [CONFIGURATION.md](CONFIGURATION.md) | Configuration options |
|
||||
| [SETUP.md](SETUP.md) | Quick setup guide |
|
||||
|
||||
## Feature Documentation
|
||||
|
||||
| Document | Description |
|
||||
|----------|-------------|
|
||||
| [STREAMING.md](STREAMING.md) | Token-level streaming design: Gateway vs DeerFlowClient paths, `stream_mode` semantics, per-id dedup |
|
||||
| [FILE_UPLOAD.md](FILE_UPLOAD.md) | File upload functionality |
|
||||
| [PATH_EXAMPLES.md](PATH_EXAMPLES.md) | Path types and usage examples |
|
||||
| [summarization.md](summarization.md) | Context summarization feature |
|
||||
| [plan_mode_usage.md](plan_mode_usage.md) | Plan mode with TodoList |
|
||||
| [AUTO_TITLE_GENERATION.md](AUTO_TITLE_GENERATION.md) | Automatic title generation |
|
||||
|
||||
## Development
|
||||
|
||||
| Document | Description |
|
||||
|----------|-------------|
|
||||
| [TODO.md](TODO.md) | Planned features and known issues |
|
||||
|
||||
## Getting Started
|
||||
|
||||
1. **New to DeerFlow?** Start with [SETUP.md](SETUP.md) for quick installation
|
||||
2. **Configuring the system?** See [CONFIGURATION.md](CONFIGURATION.md)
|
||||
3. **Understanding the architecture?** Read [ARCHITECTURE.md](ARCHITECTURE.md)
|
||||
4. **Building integrations?** Check [API.md](API.md) for API reference
|
||||
|
||||
## Document Organization
|
||||
|
||||
```
|
||||
docs/
|
||||
├── README.md # This file
|
||||
├── ARCHITECTURE.md # System architecture
|
||||
├── API.md # API reference
|
||||
├── CONFIGURATION.md # Configuration guide
|
||||
├── SETUP.md # Setup instructions
|
||||
├── FILE_UPLOAD.md # File upload feature
|
||||
├── PATH_EXAMPLES.md # Path usage examples
|
||||
├── summarization.md # Summarization feature
|
||||
├── plan_mode_usage.md # Plan mode feature
|
||||
├── STREAMING.md # Token-level streaming design
|
||||
├── AUTO_TITLE_GENERATION.md # Title generation
|
||||
├── TITLE_GENERATION_IMPLEMENTATION.md # Title implementation details
|
||||
└── TODO.md # Roadmap and issues
|
||||
```
|
||||
Reference in New Issue
Block a user