Files
deerflow-factory/deer-flow/backend/docs/task_tool_improvements.md
DATA 6de0bf9f5b Initial commit: hardened DeerFlow factory
Vendored deer-flow upstream (bytedance/deer-flow) plus prompt-injection
hardening:

- New deerflow.security package: content_delimiter, html_cleaner,
  sanitizer (8 layers — invisible chars, control chars, symbols, NFC,
  PUA, tag chars, horizontal whitespace collapse with newline/tab
  preservation, length cap)
- New deerflow.community.searx package: web_search, web_fetch,
  image_search backed by a private SearX instance, every external
  string sanitized and wrapped in <<<EXTERNAL_UNTRUSTED_CONTENT>>>
  delimiters
- All native community web providers (ddg_search, tavily, exa,
  firecrawl, jina_ai, infoquest, image_search) replaced with hard-fail
  stubs that raise NativeWebToolDisabledError at import time, so a
  misconfigured tool.use path fails loud rather than silently falling
  back to unsanitized output
- Native client back-doors (jina_client.py, infoquest_client.py)
  stubbed too
- Native-tool tests quarantined under tests/_disabled_native/
  (collect_ignore_glob via local conftest.py)
- Sanitizer Layer 7 fix: only collapse horizontal whitespace, preserve
  newlines and tabs so list/table structure survives
- Hardened runtime config.yaml references only the searx-backed tools
- Factory overlay (backend/) kept in sync with deer-flow tree as a
  reference / source

See HARDENING.md for the full audit trail and verification steps.
2026-04-12 14:23:57 +02:00

175 lines
5.1 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Task Tool Improvements
## Overview
The task tool has been improved to eliminate wasteful LLM polling. Previously, when using background tasks, the LLM had to repeatedly call `task_status` to poll for completion, causing unnecessary API requests.
## Changes Made
### 1. Removed `run_in_background` Parameter
The `run_in_background` parameter has been removed from the `task` tool. All subagent tasks now run asynchronously by default, but the tool handles completion automatically.
**Before:**
```python
# LLM had to manage polling
task_id = task(
subagent_type="bash",
prompt="Run tests",
description="Run tests",
run_in_background=True
)
# Then LLM had to poll repeatedly:
while True:
status = task_status(task_id)
if completed:
break
```
**After:**
```python
# Tool blocks until complete, polling happens in backend
result = task(
subagent_type="bash",
prompt="Run tests",
description="Run tests"
)
# Result is available immediately after the call returns
```
### 2. Backend Polling
The `task_tool` now:
- Starts the subagent task asynchronously
- Polls for completion in the backend (every 2 seconds)
- Blocks the tool call until completion
- Returns the final result directly
This means:
- ✅ LLM makes only ONE tool call
- ✅ No wasteful LLM polling requests
- ✅ Backend handles all status checking
- ✅ Timeout protection (5 minutes max)
### 3. Removed `task_status` from LLM Tools
The `task_status_tool` is no longer exposed to the LLM. It's kept in the codebase for potential internal/debugging use, but the LLM cannot call it.
### 4. Updated Documentation
- Updated `SUBAGENT_SECTION` in `prompt.py` to remove all references to background tasks and polling
- Simplified usage examples
- Made it clear that the tool automatically waits for completion
## Implementation Details
### Polling Logic
Located in `packages/harness/deerflow/tools/builtins/task_tool.py`:
```python
# Start background execution
task_id = executor.execute_async(prompt)
# Poll for task completion in backend
while True:
result = get_background_task_result(task_id)
# Check if task completed or failed
if result.status == SubagentStatus.COMPLETED:
return f"[Subagent: {subagent_type}]\n\n{result.result}"
elif result.status == SubagentStatus.FAILED:
return f"[Subagent: {subagent_type}] Task failed: {result.error}"
# Wait before next poll
time.sleep(2)
# Timeout protection (5 minutes)
if poll_count > 150:
return "Task timed out after 5 minutes"
```
### Execution Timeout
In addition to polling timeout, subagent execution now has a built-in timeout mechanism:
**Configuration** (`packages/harness/deerflow/subagents/config.py`):
```python
@dataclass
class SubagentConfig:
# ...
timeout_seconds: int = 300 # 5 minutes default
```
**Thread Pool Architecture**:
To avoid nested thread pools and resource waste, we use two dedicated thread pools:
1. **Scheduler Pool** (`_scheduler_pool`):
- Max workers: 4
- Purpose: Orchestrates background task execution
- Runs `run_task()` function that manages task lifecycle
2. **Execution Pool** (`_execution_pool`):
- Max workers: 8 (larger to avoid blocking)
- Purpose: Actual subagent execution with timeout support
- Runs `execute()` method that invokes the agent
**How it works**:
```python
# In execute_async():
_scheduler_pool.submit(run_task) # Submit orchestration task
# In run_task():
future = _execution_pool.submit(self.execute, task) # Submit execution
exec_result = future.result(timeout=timeout_seconds) # Wait with timeout
```
**Benefits**:
- ✅ Clean separation of concerns (scheduling vs execution)
- ✅ No nested thread pools
- ✅ Timeout enforcement at the right level
- ✅ Better resource utilization
**Two-Level Timeout Protection**:
1. **Execution Timeout**: Subagent execution itself has a 5-minute timeout (configurable in SubagentConfig)
2. **Polling Timeout**: Tool polling has a 5-minute timeout (30 polls × 10 seconds)
This ensures that even if subagent execution hangs, the system won't wait indefinitely.
### Benefits
1. **Reduced API Costs**: No more repeated LLM requests for polling
2. **Simpler UX**: LLM doesn't need to manage polling logic
3. **Better Reliability**: Backend handles all status checking consistently
4. **Timeout Protection**: Two-level timeout prevents infinite waiting (execution + polling)
## Testing
To verify the changes work correctly:
1. Start a subagent task that takes a few seconds
2. Verify the tool call blocks until completion
3. Verify the result is returned directly
4. Verify no `task_status` calls are made
Example test scenario:
```python
# This should block for ~10 seconds then return result
result = task(
subagent_type="bash",
prompt="sleep 10 && echo 'Done'",
description="Test task"
)
# result should contain "Done"
```
## Migration Notes
For users/code that previously used `run_in_background=True`:
- Simply remove the parameter
- Remove any polling logic
- The tool will automatically wait for completion
No other changes needed - the API is backward compatible (minus the removed parameter).