Initial commit: hardened DeerFlow factory
Vendored deer-flow upstream (bytedance/deer-flow) plus prompt-injection hardening:

- New deerflow.security package: content_delimiter, html_cleaner, sanitizer (8 layers: invisible chars, control chars, symbols, NFC, PUA, tag chars, horizontal whitespace collapse with newline/tab preservation, length cap)
- New deerflow.community.searx package: web_search, web_fetch, image_search backed by a private SearX instance; every external string is sanitized and wrapped in <<<EXTERNAL_UNTRUSTED_CONTENT>>> delimiters
- All native community web providers (ddg_search, tavily, exa, firecrawl, jina_ai, infoquest, image_search) replaced with hard-fail stubs that raise NativeWebToolDisabledError at import time, so a misconfigured tool.use path fails loud rather than silently falling back to unsanitized output
- Native client back-doors (jina_client.py, infoquest_client.py) stubbed too
- Native-tool tests quarantined under tests/_disabled_native/ (collect_ignore_glob via local conftest.py)
- Sanitizer Layer 7 fix: only collapse horizontal whitespace, preserving newlines and tabs so list/table structure survives
- Hardened runtime config.yaml references only the searx-backed tools
- Factory overlay (backend/) kept in sync with the deer-flow tree as a reference/source

See HARDENING.md for the full audit trail and verification steps.
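The delimiter wrapping and the Layer 7 whitespace rule described in the commit message can be sketched as follows. This is a reduced, illustrative sketch: the function names, the subset of layers shown, and the length-cap default are assumptions, not the actual deerflow.security API.

```python
import re
import unicodedata

# Delimiter named in the commit message; everything else here is illustrative.
DELIMITER = "<<<EXTERNAL_UNTRUSTED_CONTENT>>>"


def sanitize(text: str, max_len: int = 10_000) -> str:
    """Reduced sketch of the 8-layer sanitizer (only a few layers shown)."""
    # Unicode NFC normalization.
    text = unicodedata.normalize("NFC", text)
    # Drop control/format characters, but keep newline and tab so that
    # list and table structure survives (the Layer 7 concern noted above).
    text = "".join(
        ch for ch in text
        if ch in "\n\t" or not unicodedata.category(ch).startswith("C")
    )
    # Collapse runs of horizontal whitespace only; \n and \t are untouched.
    text = re.sub(r"[ \u00A0]{2,}", " ", text)
    # Length cap.
    return text[:max_len]


def wrap_untrusted(text: str) -> str:
    """Wrap sanitized external content so its untrusted origin stays visible."""
    return f"{DELIMITER}\n{sanitize(text)}\n{DELIMITER}"
```

In the same spirit, each disabled native provider is reduced to a stub whose module body raises NativeWebToolDisabledError on import, so any code path that still references a native tool fails immediately rather than returning unsanitized output.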
deer-flow/skills/public/video-generation/SKILL.md (new file, +139 lines)
---
name: video-generation
description: Use this skill when the user requests to generate, create, or imagine videos. Supports structured prompts and reference images for guided generation.
---

# Video Generation Skill

## Overview

This skill generates high-quality videos using structured prompts and a Python script. The workflow includes creating JSON-formatted prompts and executing video generation with optional reference images.

## Core Capabilities

- Create structured JSON prompts for AIGC video generation
- Support reference images as guidance or as the first/last frame of the video
- Generate videos through automated Python script execution

## Workflow

### Step 1: Understand Requirements

When a user requests video generation, identify:

- Subject/content: What should be in the video
- Style preferences: Art style, mood, color palette
- Technical specs: Aspect ratio, composition, lighting
- Reference images: Any images to guide generation
- You don't need to check the folder under `/mnt/user-data`

### Step 2: Create Structured Prompt

Generate a structured JSON file in `/mnt/user-data/workspace/` with the naming pattern `{descriptive-name}.json`.

### Step 3: Create a Reference Image (Optional when the image-generation skill is available)

Generate reference images for the video generation.

- If only one image is provided, use it as the guiding frame of the video

### Step 4: Execute Generation

Call the Python script:

```bash
python /mnt/skills/public/video-generation/scripts/generate.py \
  --prompt-file /mnt/user-data/workspace/prompt-file.json \
  --reference-images /path/to/ref1.jpg \
  --output-file /mnt/user-data/outputs/generated-video.mp4 \
  --aspect-ratio 16:9
```

Parameters:

- `--prompt-file`: Absolute path to the JSON prompt file (required)
- `--reference-images`: Absolute paths to reference images (optional)
- `--output-file`: Absolute path to the output video file (required)
- `--aspect-ratio`: Aspect ratio of the generated video (optional, default: 16:9)

> [!NOTE]
> Do NOT read the Python file; just call it with the parameters.

## Video Generation Example

User request: "Generate a short video clip depicting the opening scene from *The Chronicles of Narnia: The Lion, the Witch and the Wardrobe*"

Step 1: Search online for the opening scene of *The Chronicles of Narnia: The Lion, the Witch and the Wardrobe*

Step 2: Create a JSON prompt file with the following content:

```json
{
  "title": "The Chronicles of Narnia - Train Station Farewell",
  "background": {
    "description": "World War II evacuation scene at a crowded London train station. Steam and smoke fill the air as children are being sent to the countryside to escape the Blitz.",
    "era": "1940s wartime Britain",
    "location": "London railway station platform"
  },
  "characters": ["Mrs. Pevensie", "Lucy Pevensie"],
  "camera": {
    "type": "Close-up two-shot",
    "movement": "Static with subtle handheld movement",
    "angle": "Profile view, intimate framing",
    "focus": "Both faces in focus, background soft bokeh"
  },
  "dialogue": [
    {
      "character": "Mrs. Pevensie",
      "text": "You must be brave for me, darling. I'll come for you... I promise."
    },
    {
      "character": "Lucy Pevensie",
      "text": "I will be, mother. I promise."
    }
  ],
  "audio": [
    {
      "type": "Train whistle blows (signaling departure)",
      "volume": 1
    },
    {
      "type": "Strings swell emotionally, then fade",
      "volume": 0.5
    },
    {
      "type": "Ambient sound of the train station",
      "volume": 0.5
    }
  ]
}
```
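Before the script is invoked, a prompt file like the one above can be sanity-checked as valid JSON. The helper below and its list of required keys are illustrative (drawn from the example schema), not part of the skill itself:

```python
import json
from pathlib import Path


def validate_prompt(path: str) -> dict:
    """Load a prompt file and check a few top-level keys from the example schema."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    for key in ("title", "background", "camera"):
        if key not in data:
            raise ValueError(f"prompt file missing {key!r}")
    return data
```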

Step 3: Use the image-generation skill to generate the reference image

Load the image-generation skill and generate a single reference image, `narnia-farewell-scene-01.jpg`, following that skill's workflow.

Step 4: Use the generate.py script to generate the video

```bash
python /mnt/skills/public/video-generation/scripts/generate.py \
  --prompt-file /mnt/user-data/workspace/narnia-farewell-scene.json \
  --reference-images /mnt/user-data/outputs/narnia-farewell-scene-01.jpg \
  --output-file /mnt/user-data/outputs/narnia-farewell-scene-01.mp4 \
  --aspect-ratio 16:9
```

> Do NOT read the Python file; just call it with the parameters.
## Output Handling

After generation:

- Videos are typically saved in `/mnt/user-data/outputs/`
- Share the generated videos first, along with any generated reference images, with the user using the `present_files` tool
- Provide a brief description of the generation result
- Offer to iterate if adjustments are needed

## Notes

- Always use English for prompts regardless of the user's language
- JSON format ensures structured, parsable prompts
- Reference images enhance generation quality significantly
- Iterative refinement is normal for optimal results

deer-flow/skills/public/video-generation/scripts/generate.py (new file, +116 lines)

```python
import argparse
import base64
import os
import time

import requests

API_BASE = "https://generativelanguage.googleapis.com/v1beta"
MODEL = "veo-3.1-generate-preview"


def generate_video(
    prompt_file: str,
    reference_images: list[str],
    output_file: str,
    aspect_ratio: str = "16:9",
) -> str:
    api_key = os.getenv("GEMINI_API_KEY")
    if not api_key:
        return "GEMINI_API_KEY is not set"

    with open(prompt_file, "r", encoding="utf-8") as f:
        prompt = f.read()

    payload = {
        "instances": [{"prompt": prompt}],
        # Pass the requested aspect ratio through to the API
        # (previously parsed on the CLI but never sent).
        "parameters": {"aspectRatio": aspect_ratio},
    }

    # Attach base64-encoded reference images, if any were provided.
    encoded_refs = []
    for reference_image in reference_images:
        with open(reference_image, "rb") as f:
            image_b64 = base64.b64encode(f.read()).decode("utf-8")
        encoded_refs.append(
            {
                "image": {"mimeType": "image/jpeg", "bytesBase64Encoded": image_b64},
                "referenceType": "asset",
            }
        )
    if encoded_refs:
        payload["instances"][0]["referenceImages"] = encoded_refs

    # Start a long-running video generation operation.
    response = requests.post(
        f"{API_BASE}/models/{MODEL}:predictLongRunning",
        headers={
            "x-goog-api-key": api_key,
            "Content-Type": "application/json",
        },
        json=payload,
    )
    response.raise_for_status()
    operation_name = response.json()["name"]

    # Poll the operation until it completes, then download the first sample.
    while True:
        response = requests.get(
            f"{API_BASE}/{operation_name}",
            headers={"x-goog-api-key": api_key},
        )
        response.raise_for_status()
        result = response.json()
        if result.get("done", False):
            sample = result["response"]["generateVideoResponse"]["generatedSamples"][0]
            download(sample["video"]["uri"], output_file)
            break
        time.sleep(3)
    return f"The video has been generated successfully to {output_file}"


def download(url: str, output_file: str) -> None:
    """Download the generated video; the file endpoint also requires the API key."""
    api_key = os.getenv("GEMINI_API_KEY")
    response = requests.get(url, headers={"x-goog-api-key": api_key})
    response.raise_for_status()
    with open(output_file, "wb") as f:
        f.write(response.content)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Generate videos using the Gemini API")
    parser.add_argument(
        "--prompt-file",
        required=True,
        help="Absolute path to the JSON prompt file",
    )
    parser.add_argument(
        "--reference-images",
        nargs="*",
        default=[],
        help="Absolute paths to reference images (space-separated)",
    )
    parser.add_argument(
        "--output-file",
        required=True,
        help="Output path for the generated video",
    )
    parser.add_argument(
        "--aspect-ratio",
        required=False,
        default="16:9",
        help="Aspect ratio of the generated video",
    )

    args = parser.parse_args()

    try:
        print(
            generate_video(
                args.prompt_file,
                args.reference_images,
                args.output_file,
                args.aspect_ratio,
            )
        )
    except Exception as e:
        print(f"Error while generating video: {e}")
```