name: video-generation
description: Use this skill when the user asks to generate, create, or imagine videos. Supports structured prompts and a reference image for guided generation.

Video Generation Skill

Overview

This skill generates high-quality videos using structured prompts and a Python script. The workflow consists of creating a JSON-formatted prompt and running video generation, optionally with a reference image.

Core Capabilities

  • Create structured JSON prompts for AIGC video generation
  • Support a reference image as guidance or as the first/last frame of the video
  • Generate videos through automated Python script execution

Workflow

Step 1: Understand Requirements

When a user requests video generation, identify:

  • Subject/content: What should be in the video
  • Style preferences: Art style, mood, color palette
  • Technical specs: Aspect ratio, composition, lighting
  • Reference image: Any image to guide generation
  • You don't need to check the folder under /mnt/user-data

Step 2: Create Structured Prompt

Generate a structured JSON file in /mnt/user-data/workspace/ with naming pattern: {descriptive-name}.json
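
A minimal sketch of this step in Python, assuming the prompt is held as a plain dict. The helper names `prompt_path` and `write_prompt_file` and the slug rule are illustrative and not part of the skill; only the workspace directory and the {descriptive-name}.json pattern come from the text above.

```python
import json
import re
from pathlib import Path

# Workspace directory named in the skill; pass another one when testing locally.
WORKSPACE = Path("/mnt/user-data/workspace")


def prompt_path(descriptive_name: str, workspace: Path = WORKSPACE) -> Path:
    """Build {workspace}/{descriptive-name}.json from a free-form name.

    Assumption: names are lowercased and hyphen-separated, to match
    file names such as "narnia-farewell-scene.json".
    """
    slug = re.sub(r"[^a-z0-9]+", "-", descriptive_name.lower()).strip("-")
    if not slug:
        raise ValueError("descriptive_name must contain letters or digits")
    return workspace / f"{slug}.json"


def write_prompt_file(prompt: dict, descriptive_name: str,
                      workspace: Path = WORKSPACE) -> Path:
    """Serialize `prompt` as indented JSON and return the path written."""
    path = prompt_path(descriptive_name, workspace)
    path.parent.mkdir(parents=True, exist_ok=True)
    # ensure_ascii=False keeps any non-ASCII characters readable in the file.
    path.write_text(json.dumps(prompt, indent=2, ensure_ascii=False),
                    encoding="utf-8")
    return path
```

Calling `write_prompt_file(prompt, "Narnia farewell scene")` would write /mnt/user-data/workspace/narnia-farewell-scene.json, which can then be passed to --prompt-file.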

Step 3: Create Reference Image (Optional, when the image-generation skill is available)

Generate a reference image for the video.

  • If only one image is provided, use it as the guiding frame of the video

Step 4: Execute Generation

Call the Python script:

python /mnt/skills/public/video-generation/scripts/generate.py \
  --prompt-file /mnt/user-data/workspace/prompt-file.json \
  --reference-images /path/to/ref1.jpg \
  --output-file /mnt/user-data/outputs/generated-video.mp4 \
  --aspect-ratio 16:9

Parameters:

  • --prompt-file: Absolute path to JSON prompt file (required)
  • --reference-images: Absolute path(s) to the reference image(s) (optional)
  • --output-file: Absolute path to the output video file (required)
  • --aspect-ratio: Aspect ratio of the generated video (optional, default: 16:9)

[!NOTE] Do NOT read the Python file; just call it with the parameters.
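
If the call is assembled programmatically, the documented flags can be put together as below. This is a sketch under stated assumptions: `build_command` and `run_generation` are hypothetical helpers, and passing several reference images as space-separated values after a single --reference-images flag is a guess based on the plural flag name, not something the skill confirms.

```python
import subprocess
from pathlib import PurePosixPath

# Script path taken from the skill text; the script itself is never read here.
SCRIPT = "/mnt/skills/public/video-generation/scripts/generate.py"


def build_command(prompt_file, output_file, reference_images=(),
                  aspect_ratio="16:9"):
    """Return the argv list for generate.py using only the documented flags."""
    for p in (prompt_file, output_file, *reference_images):
        # The parameter list asks for absolute paths everywhere.
        if not PurePosixPath(p).is_absolute():
            raise ValueError(f"expected an absolute path, got {p!r}")
    cmd = ["python", SCRIPT, "--prompt-file", str(prompt_file)]
    if reference_images:
        # Assumption: multiple images follow one flag, space-separated.
        cmd += ["--reference-images", *map(str, reference_images)]
    cmd += ["--output-file", str(output_file), "--aspect-ratio", aspect_ratio]
    return cmd


def run_generation(cmd):
    """Run the command; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(cmd, check=True)
```

Building the argv as a list rather than a shell string avoids quoting problems when a path contains spaces.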

Video Generation Example

User request: "Generate a short video clip depicting the opening scene from 'The Chronicles of Narnia: The Lion, the Witch and the Wardrobe'"

Step 1: Search for the opening scene of "The Chronicles of Narnia: The Lion, the Witch and the Wardrobe" online

Step 2: Create a JSON prompt file, narnia-farewell-scene.json, with the following content:

{
  "title": "The Chronicles of Narnia - Train Station Farewell",
  "background": {
    "description": "World War II evacuation scene at a crowded London train station. Steam and smoke fill the air as children are being sent to the countryside to escape the Blitz.",
    "era": "1940s wartime Britain",
    "location": "London railway station platform"
  },
  "characters": ["Mrs. Pevensie", "Lucy Pevensie"],
  "camera": {
    "type": "Close-up two-shot",
    "movement": "Static with subtle handheld movement",
    "angle": "Profile view, intimate framing",
    "focus": "Both faces in focus, background soft bokeh"
  },
  "dialogue": [
    {
      "character": "Mrs. Pevensie",
      "text": "You must be brave for me, darling. I'll come for you... I promise."
    },
    {
      "character": "Lucy Pevensie",
      "text": "I will be, mother. I promise."
    }
  ],
  "audio": [
    {
      "type": "Train whistle blows (signaling departure)",
      "volume": 1
    },
    {
      "type": "Strings swell emotionally, then fade",
      "volume": 0.5
    },
    {
      "type": "Ambient sound of the train station",
      "volume": 0.5
    }
  ]
}
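
Before running generation, a prompt like the one above can be given a quick consistency check. The sketch below is illustrative only: the skill does not define a schema, so `check_prompt` is a hypothetical helper, and both rules (every dialogue speaker appears in "characters", every audio "volume" lies between 0 and 1) are assumptions inferred from the example values.

```python
def check_prompt(prompt: dict) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    known = set(prompt.get("characters", []))
    for i, line in enumerate(prompt.get("dialogue", [])):
        speaker = line.get("character")
        if speaker not in known:
            problems.append(f"dialogue[{i}]: unknown character {speaker!r}")
        if not line.get("text"):
            problems.append(f"dialogue[{i}]: empty text")
    for i, cue in enumerate(prompt.get("audio", [])):
        volume = cue.get("volume")
        # Assumption: volume is a number in [0, 1], as in the example above.
        if not isinstance(volume, (int, float)) or not 0 <= volume <= 1:
            problems.append(f"audio[{i}]: volume {volume!r} outside 0..1")
    return problems
```

For the Narnia prompt above, both speakers are listed under "characters" and the volumes are 1 and 0.5, so the check reports no problems.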

Step 3: Use the image-generation skill to generate the reference image

Load the image-generation skill and follow it to generate a single reference image, narnia-farewell-scene-01.jpg.

Step 4: Use the generate.py script to generate the video

python /mnt/skills/public/video-generation/scripts/generate.py \
  --prompt-file /mnt/user-data/workspace/narnia-farewell-scene.json \
  --reference-images /mnt/user-data/outputs/narnia-farewell-scene-01.jpg \
  --output-file /mnt/user-data/outputs/narnia-farewell-scene-01.mp4 \
  --aspect-ratio 16:9

Do NOT read the Python file; just call it with the parameters.

Output Handling

After generation:

  • Videos are typically saved in /mnt/user-data/outputs/
  • Share the generated videos first, followed by any generated images, with the user via the present_files tool
  • Provide a brief description of the generation result
  • Offer to iterate if adjustments are needed
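
The "videos first" ordering can be sketched as a stable sort over the output paths. `order_for_presentation` is a hypothetical helper and the set of video extensions is an assumption; the resulting list is what would be handed to the present_files tool.

```python
from pathlib import PurePosixPath

# Assumption: these extensions are what counts as a video here.
VIDEO_SUFFIXES = {".mp4", ".mov", ".webm"}


def order_for_presentation(paths):
    """Return paths with videos first, keeping the original order in each group."""
    # sorted() is stable, so a boolean key only moves videos ahead of the rest.
    return sorted(
        paths,
        key=lambda p: PurePosixPath(p).suffix.lower() not in VIDEO_SUFFIXES,
    )
```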

Notes

  • Always use English for prompts, regardless of the user's language
  • JSON format ensures structured, parsable prompts
  • A reference image significantly enhances generation quality
  • Iterative refinement is normal for optimal results