# DeerFlow Prompt Injection Protection Integration Plan **Based on OpenClaw Hardened Scripts Analysis** **Date:** 2026-04-11 **Source Reference:** `~/.openclaw/workspace-websearch/searx-scripts/` and `~/.openclaw/workspace-websearch/fetch-scripts/` --- ## Executive Summary This document outlines the integration of OpenClaw-style prompt injection hardening into DeerFlow's web search and web fetch tools. The OpenClaw implementation demonstrates a **defense-in-depth** approach with multiple sanitization layers and clear content delimitation. **Current State:** DeerFlow has NO prompt injection protection for web search/fetch results. **Target State:** Multi-layer sanitization with content delimiters and hardened script execution. --- ## 1. Analysis of OpenClaw Protection Layers ### 1.1 Content Delimiter Pattern (CRITICAL) OpenClaw wraps external content with explicit markers: ``` <<>> {sanitized_search_results} <<>> ``` **Benefit:** LLM can semantically distinguish between system instructions and untrusted external data. ### 1.2 Unicode Attack Surface Reduction | Category | Characters | Purpose | |----------|-----------|---------| | Zero-width | `\u200b-\u200f`, `\u2060-\u2064` | Steganography, hidden payloads | | BOM/Format | `\ufeff`, `\ufffe` | Byte-order confusion | | Control | `\u00ad`, `\u034f` | Soft hyphen, grapheme joiner | | Private Use | `\uE000-\uF8FF` | Custom glyph substitution attacks | | Tag Characters | `\uE0000-\uE007F` | Unicode tag sequences | ### 1.3 HTML Threat Reduction Removed elements: `

World

" result = extract_secure_text(html) assert "script" not in result.lower() assert "alert" not in result assert "Hello" in result assert "World" in result ``` --- ## 5. Deployment Plan ### Step 1: Add Security Module ```bash cd /home/data/deerflow-factory/deer-flow/backend/packages/harness/deerflow mkdir -p security # Create sanitizer.py, content_delimiter.py, html_cleaner.py ``` ### Step 2: Add SearX Provider ```bash mkdir -p community/searx # Create __init__.py, tools.py ``` ### Step 3: Update Dependencies ```bash # Verify httpx is available (should be via langchain) uv pip show httpx ``` ### Step 4: Configuration 1. Copy `config.example.yaml` to `config.yaml` 2. Replace `web_search` and `web_fetch` tools with hardened SearX versions 3. Set `searx_url` to your private instance ### Step 5: Testing ```bash cd backend uv run python -m pytest tests/test_security_sanitizer.py -v uv run python -m pytest tests/test_searx_tools.py -v ``` --- ## 6. Migration Guide for Existing Deployments ### From DuckDuckGo/Tavily to Hardened SearX 1. **Backup current config:** ```bash cp config.yaml config.yaml.pre-security.bak ``` 2. **Update tools section:** ```yaml # OLD (remove or comment) # - name: web_search # group: web # use: deerflow.community.ddg_search.tools:web_search_tool # NEW (add) - name: web_search group: web use: deerflow.community.searx.tools:web_search_tool searx_url: http://your-searx:8888 max_results: 10 ``` 3. **Restart services:** ```bash make docker-restart # or make dev-restart ``` --- ## 7. Verification Checklist - [ ] Sanitizer unit tests pass - [ ] Content delimiter tests pass - [ ] HTML cleaner tests pass - [ ] SearX search integration tests pass - [ ] SearX fetch integration tests pass - [ ] Malicious payload test: zero-width characters removed - [ ] Malicious payload test: control characters removed - [ ] Malicious payload test: script tags stripped - [ ] Content delimiters present in output - [ ] Private SearX instance responds correctly - [ ] Configuration migration documented --- ## 8. References ### OpenClaw Sources - `~/.openclaw/workspace-websearch/searx-scripts/search.sh` - `~/.openclaw/workspace-websearch/fetch-scripts/fetch.sh` - `~/.openclaw/workspace-websearch/AGENTS.md` - `~/.openclaw/workspace-websearch/SOUL.md` ### OWASP Resources - OWASP Top 10 for LLM Applications: LLM01 (Prompt Injection) - OWASP LLM Threats: https://genai.owasp.org/llm-top-10/ ### DeerFlow Integration Points - `deerflow/community/ddg_search/tools.py` (reference) - `deerflow/community/jina_ai/tools.py` (reference) - `deerflow/guardrails/` (existing security framework) --- ## 9. Summary This integration plan brings OpenClaw's battle-tested prompt injection hardening to DeerFlow through: 1. **Content Delimiters**: Clear semantic boundary markers 2. **Unicode Sanitization**: Removal of zero-width and invisible characters 3. **HTML Threat Reduction**: Stripping of dangerous elements 4. **Length Limiting**: Context overflow protection 5. **Clean Architecture**: Reusable security module **Estimated Effort:** 2-3 days for full implementation and testing **Risk Level:** LOW (additive changes, existing tools remain available) **Security Impact:** HIGH (eliminates major prompt injection vector)