Initial commit: hardened DeerFlow factory

Vendored deer-flow upstream (bytedance/deer-flow) plus prompt-injection
hardening:

- New deerflow.security package: content_delimiter, html_cleaner,
  sanitizer (8 layers — invisible chars, control chars, symbols, NFC,
  PUA, tag chars, horizontal whitespace collapse with newline/tab
  preservation, length cap)
- New deerflow.community.searx package: web_search, web_fetch,
  image_search backed by a private SearX instance, every external
  string sanitized and wrapped in <<<EXTERNAL_UNTRUSTED_CONTENT>>>
  delimiters
- All native community web providers (ddg_search, tavily, exa,
  firecrawl, jina_ai, infoquest, image_search) replaced with hard-fail
  stubs that raise NativeWebToolDisabledError at import time, so a
  misconfigured tool.use path fails loud rather than silently falling
  back to unsanitized output
- Native client back-doors (jina_client.py, infoquest_client.py)
  stubbed too
- Native-tool tests quarantined under tests/_disabled_native/
  (collect_ignore_glob via local conftest.py)
- Sanitizer Layer 7 fix: only collapse horizontal whitespace, preserve
  newlines and tabs so list/table structure survives
- Hardened runtime config.yaml references only the searx-backed tools
- Factory overlay (backend/) kept in sync with deer-flow tree as a
  reference / source

See HARDENING.md for the full audit trail and verification steps.
This commit is contained in:
2026-04-12 14:23:57 +02:00
commit 6de0bf9f5b
889 changed files with 173052 additions and 0 deletions

View File

@@ -0,0 +1,105 @@
# Agents Architecture
## Overview
DeerFlow is built on a sophisticated agent-based architecture using the [LangGraph SDK](https://github.com/langchain-ai/langgraph) to enable intelligent, stateful AI interactions. This document outlines the agent system architecture, patterns, and best practices for working with agents in the frontend application.
## Architecture Overview
### Core Components
```
┌────────────────────────────────────────────────────────┐
│ Frontend (Next.js) │
├────────────────────────────────────────────────────────┤
│ ┌──────────────┐ ┌──────────────┐ ┌──────────┐ │
│ │ UI Components│───▶│ Thread Hooks │───▶│ LangGraph│ │
│ │ │ │ │ │ SDK │ │
│ └──────────────┘ └──────────────┘ └──────────┘ │
│ │ │ │ │
│ │ ▼ │ │
│ │ ┌──────────────┐ │ │
│ └───────────▶│ Thread State │◀──────────┘ │
│ │ Management │ │
│ └──────────────┘ │
└────────────────────────────────────────────────────────┘
┌────────────────────────────────────────────────────────┐
│ LangGraph Backend (lead_agent) │
│ ┌────────────┐ ┌──────────┐ ┌───────────────────┐ │
│ │Main Agent │─▶│Sub-Agents│─▶│ Tools & Skills │ │
│ └────────────┘ └──────────┘ └───────────────────┘ │
└────────────────────────────────────────────────────────┘
```
## Project Structure
```
src/
├── app/ # Next.js App Router pages
│ ├── api/ # API routes
│ ├── workspace/ # Main workspace pages
│ └── mock/ # Mock/demo pages
├── components/ # React components
│ ├── ui/ # Reusable UI components
│ ├── workspace/ # Workspace-specific components
│ ├── landing/ # Landing page components
│ └── ai-elements/ # AI-related UI elements
├── core/ # Core business logic
│ ├── api/ # API client & data fetching
│ ├── artifacts/ # Artifact management
│ ├── config/ # App configuration
│ ├── i18n/ # Internationalization
│ ├── mcp/ # MCP integration
│ ├── messages/ # Message handling
│ ├── models/ # Data models & types
│ ├── settings/ # User settings
│ ├── skills/ # Skills system
│ ├── threads/ # Thread management
│ ├── todos/ # Todo system
│ └── utils/ # Utility functions
├── hooks/ # Custom React hooks
├── lib/ # Shared libraries & utilities
├── server/ # Server-side code (Not available yet)
│ └── better-auth/ # Authentication setup (Not available yet)
└── styles/ # Global styles
```
### Technology Stack
- **LangGraph SDK** (`@langchain/langgraph-sdk@1.5.3`) - Agent orchestration and streaming
- **LangChain Core** (`@langchain/core@1.1.15`) - Fundamental AI building blocks
- **TanStack Query** (`@tanstack/react-query@5.90.17`) - Server state management
- **React Hooks** - Thread lifecycle and state management
- **Shadcn UI** - UI components
- **MagicUI** - Magic UI components
- **React Bits** - React bits components
### Interaction Ownership
- `src/app/workspace/chats/[thread_id]/page.tsx` owns composer busy-state wiring.
- `src/core/threads/hooks.ts` owns pre-submit upload state and thread submission.
- `src/hooks/usePoseStream.ts` is a passive store selector; global WebSocket lifecycle stays in `App.tsx`.
## Resources
- [LangGraph Documentation](https://langchain-ai.github.io/langgraph/)
- [LangChain Core Concepts](https://js.langchain.com/docs/concepts)
- [TanStack Query Documentation](https://tanstack.com/query/latest)
- [Next.js App Router](https://nextjs.org/docs/app)
## Contributing
When adding new agent features:
1. Follow the established project structure
2. Add comprehensive TypeScript types
3. Implement proper error handling
4. Write tests for new functionality
5. Update this documentation
6. Follow the code style guide (ESLint + Prettier)
## License
This agent architecture is part of the DeerFlow project.