Initial commit: hardened DeerFlow factory

Vendored deer-flow upstream (bytedance/deer-flow) plus prompt-injection hardening:

- New deerflow.security package: content_delimiter, html_cleaner, sanitizer (8 layers: invisible chars, control chars, symbols, NFC, PUA, tag chars, horizontal whitespace collapse with newline/tab preservation, length cap)
- New deerflow.community.searx package: web_search, web_fetch, image_search backed by a private SearX instance, every external string sanitized and wrapped in <<<EXTERNAL_UNTRUSTED_CONTENT>>> delimiters
- All native community web providers (ddg_search, tavily, exa, firecrawl, jina_ai, infoquest, image_search) replaced with hard-fail stubs that raise NativeWebToolDisabledError at import time, so a misconfigured tool.use path fails loud rather than silently falling back to unsanitized output
- Native client back-doors (jina_client.py, infoquest_client.py) stubbed too
- Native-tool tests quarantined under tests/_disabled_native/ (collect_ignore_glob via local conftest.py)
- Sanitizer Layer 7 fix: only collapse horizontal whitespace, preserve newlines and tabs so list/table structure survives
- Hardened runtime config.yaml references only the searx-backed tools
- Factory overlay (backend/) kept in sync with deer-flow tree as a reference / source

See HARDENING.md for the full audit trail and verification steps.
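
The sanitizer layers and delimiter wrapping named in the commit message can be illustrated with a minimal sketch. This is not the code in deerflow.security: the function names, the closing delimiter, the default length cap, and the delimiter-stripping step are all assumptions made for illustration, and the "symbols" layer is omitted because the message does not say what it covers. Only the opening delimiter string and the list of layers come from the commit message.

```python
import re
import unicodedata

# Opening delimiter as quoted in the commit message.
OPEN_DELIM = "<<<EXTERNAL_UNTRUSTED_CONTENT>>>"
# ASSUMPTION: the real closing delimiter is not shown in the commit message.
CLOSE_DELIM = "<<<END_EXTERNAL_UNTRUSTED_CONTENT>>>"

# Runs of whitespace other than newline and tab (spaces, NBSP, and so on).
_HSPACE_RUN = re.compile(r"[^\S\n\t]+")


def _is_dropped(ch: str) -> bool:
    """Return True for characters this sketch removes outright."""
    if ch in "\n\t":
        return False  # keep list/table structure
    # Cc = control chars, Cf = format chars (zero-width, bidi, tag chars),
    # Co = private-use area (PUA).
    return unicodedata.category(ch) in {"Cc", "Cf", "Co"}


def sanitize(text: str, max_len: int = 10_000) -> str:
    """Hypothetical sanitizer covering most of the layers listed above."""
    text = unicodedata.normalize("NFC", text)                  # NFC
    text = "".join(ch for ch in text if not _is_dropped(ch))   # invisible/control/PUA/tag
    text = _HSPACE_RUN.sub(" ", text)                          # horizontal whitespace only
    return text[:max_len]                                      # length cap


def _strip_delims(text: str) -> str:
    """ASSUMPTION: remove delimiter look-alikes so content cannot close the wrapper."""
    previous = None
    while previous != text:  # repeat in case removal re-forms a delimiter
        previous = text
        text = text.replace(OPEN_DELIM, "").replace(CLOSE_DELIM, "")
    return text


def wrap_untrusted(text: str) -> str:
    """Sanitize external text and mark it as untrusted for the model."""
    return f"{OPEN_DELIM}\n{_strip_delims(sanitize(text))}\n{CLOSE_DELIM}"
```

Filtering by Unicode general category keeps the sketch short; a production sanitizer would more likely use explicit code-point ranges per layer so that each layer can be audited and tested separately.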
2026-04-12 14:23:57 +02:00
commit 6de0bf9f5b
889 changed files with 173052 additions and 0 deletions
--- a/deer-flow/skills/public/github-deep-research/assets/report_template.md
+++ b/deer-flow/skills/public/github-deep-research/assets/report_template.md
@@ -0,0 +1,192 @@
+[!NOTE] Generate this report in the user's own language.
+
+# {TITLE}
+
+- **Research Date:** {DATE}
+- **Timestamp:** {TIMESTAMP}
+- **Confidence Level:** {CONFIDENCE_LEVEL}
+- **Subject:** {SUBJECT_DESCRIPTION}
+
+---
+
+## Repository Information
+
+- **Name:** {REPOSITORY_NAME}
+- **Description:** {REPOSITORY_DESCRIPTION}
+- **URL:** {REPOSITORY_URL}
+- **Stars:** {REPOSITORY_STARS}
+- **Forks:** {REPOSITORY_FORKS}
+- **Open Issues:** {REPOSITORY_OPEN_ISSUES}
+- **Language(s):** {REPOSITORY_LANGUAGES}
+- **License:** {REPOSITORY_LICENSE}
+- **Created At:** {REPOSITORY_CREATED_AT}
+- **Updated At:** {REPOSITORY_UPDATED_AT}
+- **Pushed At:** {REPOSITORY_PUSHED_AT}
+- **Topics:** {REPOSITORY_TOPICS}
+
+---
+
+## Executive Summary
+
+{EXECUTIVE_SUMMARY}
+
+**IMPORTANT**: Include inline citations using `[citation:Title](URL)` format after each claim. Example:
+"The project gained 10k stars in 3 months [citation:GitHub Stats](https://github.com/owner/repo)."
+
+---
+
+## Complete Chronological Timeline
+
+### PHASE 1: {PHASE_1_NAME}
+
+#### {PHASE_1_PERIOD}
+
+{PHASE_1_CONTENT}
+
+### PHASE 2: {PHASE_2_NAME}
+
+#### {PHASE_2_PERIOD}
+
+{PHASE_2_CONTENT}
+
+### PHASE 3: {PHASE_3_NAME}
+
+#### {PHASE_3_PERIOD}
+
+{PHASE_3_CONTENT}
+
+---
+
+## Key Analysis
+
+**IMPORTANT**: Support each analysis point with inline citations `[citation:Title](URL)`.
+
+### {ANALYSIS_SECTION_1_TITLE}
+
+{ANALYSIS_SECTION_1_CONTENT}
+
+### {ANALYSIS_SECTION_2_TITLE}
+
+{ANALYSIS_SECTION_2_CONTENT}
+
+---
+
+## Architecture / System Overview
+
+```mermaid
+flowchart TD
+    A[Component A] --> B[Component B]
+    B --> C[Component C]
+    C --> D[Component D]
+```
+
+{ARCHITECTURE_DESCRIPTION}
+
+---
+
+## Metrics & Impact Analysis
+
+### Growth Trajectory
+
+```
+{METRICS_TIMELINE}
+```
+
+### Key Metrics
+
+| Metric | Value | Assessment |
+|--------|-------|------------|
+| {METRIC_1} | {VALUE_1} | {ASSESSMENT_1} |
+| {METRIC_2} | {VALUE_2} | {ASSESSMENT_2} |
+| {METRIC_3} | {VALUE_3} | {ASSESSMENT_3} |
+
+---
+
+## Comparative Analysis
+
+### Feature Comparison
+
+| Feature | {SUBJECT} | {COMPETITOR_1} | {COMPETITOR_2} |
+|---------|-----------|----------------|----------------|
+| {FEATURE_1} | {SUBJ_F1} | {COMP1_F1} | {COMP2_F1} |
+| {FEATURE_2} | {SUBJ_F2} | {COMP1_F2} | {COMP2_F2} |
+| {FEATURE_3} | {SUBJ_F3} | {COMP1_F3} | {COMP2_F3} |
+
+### Market Positioning
+
+{MARKET_POSITIONING}
+
+---
+
+## Strengths & Weaknesses
+
+### Strengths
+
+{STRENGTHS}
+
+### Areas for Improvement
+
+{WEAKNESSES}
+
+---
+
+## Key Success Factors
+
+{SUCCESS_FACTORS}
+
+---
+
+## Sources
+
+### Primary Sources
+
+{PRIMARY_SOURCES}
+
+### Media Coverage
+
+{MEDIA_SOURCES}
+
+### Academic / Technical Sources
+
+{ACADEMIC_SOURCES}
+
+### Community Sources
+
+{COMMUNITY_SOURCES}
+
+---
+
+## Confidence Assessment
+
+**High Confidence (90%+) Claims:**
+{HIGH_CONFIDENCE_CLAIMS}
+
+**Medium Confidence (70-89%) Claims:**
+{MEDIUM_CONFIDENCE_CLAIMS}
+
+**Lower Confidence (50-69%) Claims:**
+{LOW_CONFIDENCE_CLAIMS}
+
+---
+
+## Research Methodology
+
+This report was compiled using:
+
+1. **Multi-source web search** - Broad discovery and targeted queries
+2. **GitHub repository analysis** - Commits, issues, PRs, activity metrics
+3. **Content extraction** - Official docs, technical articles, media coverage
+4. **Cross-referencing** - Verification across independent sources
+5. **Chronological reconstruction** - Timeline from timestamped data
+6. **Confidence scoring** - Claims weighted by source reliability
+
+**Research Depth:** {RESEARCH_DEPTH}
+**Time Scope:** {TIME_SCOPE}
+**Geographic Scope:** {GEOGRAPHIC_SCOPE}
+
+---
+
+**Report Prepared By:** GitHub Deep Research by DeerFlow
+**Date:** {REPORT_DATE}
+**Report Version:** 1.0
+**Status:** Complete
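
The template above relies on `{UPPER_SNAKE_CASE}` placeholders. The diff does not show how they are filled (the skill may simply have the model write the report directly), so the following is only a sketch of one possible programmatic substitution. The function name `fill_template` and the sample values are hypothetical. Unknown placeholders are left in place and reported rather than raising an error, so a partially researched report still renders.

```python
from __future__ import annotations

import re

# Matches placeholders such as {TITLE} or {PHASE_1_NAME}.
PLACEHOLDER = re.compile(r"\{([A-Z][A-Z0-9_]*)\}")


def fill_template(template: str, values: dict[str, str]) -> tuple[str, list[str]]:
    """Substitute known placeholders; return the text and the names left unfilled."""
    missing: list[str] = []

    def substitute(match: re.Match[str]) -> str:
        name = match.group(1)
        if name in values:
            return values[name]
        if name not in missing:
            missing.append(name)
        return match.group(0)  # leave the placeholder untouched

    return PLACEHOLDER.sub(substitute, template), missing


# Hypothetical usage with a three-line excerpt of the template.
snippet = "# {TITLE}\n\n- **Research Date:** {DATE}\n- **Stars:** {REPOSITORY_STARS}\n"
filled, missing = fill_template(snippet, {"TITLE": "Example Repo", "DATE": "2026-04-12"})
```

A regex is used instead of `str.format` so that a missing key does not abort the whole render and so that any literal braces in generated content are not misread as format fields.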