This report performs a deep-dive audit of the "Brain & Hands" layer, focusing on rag.py, tool.py, and sub_agent.py through the lens of 12-Factor App Methodology, Pythonic Code Style, and Architectural Stability.
| Factor | Status | Observation |
|---|---|---|
| V. Build, Release, Run | 🔴 Major Issue | Dynamic Skill Loading: ToolService parses Markdown (SKILL.md) and YAML at runtime for every tool query. This violates the separation of Build/Run and introduces high-latency Disk I/O into the hot path of the LLM interaction. |
| VI. Processes | 🔴 Problem | Task State Volatility: SubAgent monitors long-running gRPC tasks using in-memory loops. If the Hub restarts, the "monitoring" state is lost and the Hub cannot re-attach to the still-running node-side task. |
| IX. Disposability | 🔴 Performance | Blocking I/O in Async Loop: rag.py (Line 287) performs Synchronous DB Commits inside an async-for streaming loop. This blocks the entire Python event loop, causing latency spikes for all concurrent users during every token-emission cycle. |
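
The Factor IX row above can be illustrated with a minimal sketch, assuming a synchronous session object. `FakeSyncSession`, `token_stream`, and `stream_and_persist` are hypothetical stand-ins and are not taken from rag.py. The sketch moves the commit out of the per-token loop and runs it in a worker thread via `asyncio.to_thread`, so the event loop stays free while the blocking call completes.

```python
import asyncio
import time


class FakeSyncSession:
    """Hypothetical stand-in for a synchronous DB session (not the real one)."""

    def __init__(self) -> None:
        self.commits = 0

    def commit(self) -> None:
        time.sleep(0.05)  # simulate blocking disk or network I/O
        self.commits += 1


async def token_stream(n: int):
    """Hypothetical async generator standing in for the LLM token stream."""
    for i in range(n):
        await asyncio.sleep(0)  # yield control, as a real stream would
        yield f"tok{i}"


async def stream_and_persist(db: FakeSyncSession, n: int) -> list[str]:
    tokens: list[str] = []
    async for token in token_stream(n):
        tokens.append(token)  # no blocking call inside the streaming loop
    # Commit once after streaming, in a worker thread, so other
    # coroutines keep running while the blocking commit finishes.
    await asyncio.to_thread(db.commit)
    return tokens


if __name__ == "__main__":
    session = FakeSyncSession()
    print(asyncio.run(stream_and_persist(session, 3)), session.commits)
```

Whether a real session can safely be handed to a worker thread depends on the driver and ORM in use. With an async driver, `await db.commit()` is likely the simpler option.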
app/core/services/rag.py

The orchestrator for LLM chat and context injection.

Identified Problems:
- `joinedload(models.Session.messages)` is called on every request. For long-running sessions, this loads hundreds of messages into memory unnecessarily when only the last few might be needed for the prompt window.

app/core/services/tool.py

Manages skill discovery, permission routing, and execution.

> [!CAUTION]
> CRITICAL SECURITY RISK: Shell Injection
>
> Line 383: `cmd = bash_logic.replace(f"${{{k}}}", str(v))` performs raw string replacement for shell commands. An attacker providing an argument value like `; rm -rf /` or `$(curl ...)` will successfully execute arbitrary code if the `bash_logic` template is not carefully constructed. Fix: ALWAYS use `shlex.quote()` for shell argument interpolation (as seen correctly on line 387).

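
The fix described above can be sketched as follows. This is an illustrative rewrite and not the code in tool.py: `render_command` is a hypothetical helper, and it swaps the per-key `str.replace` loop for a single `re.sub` pass so that an already-substituted value is never re-scanned for placeholders. Every value passes through `shlex.quote` before it reaches the command string.

```python
import re
import shlex

# Matches placeholders of the form ${name}; the syntax is assumed from the audited line.
_PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def render_command(template: str, args: dict) -> str:
    """Fill ${name} placeholders with shell-quoted argument values."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in args:
            raise KeyError(f"missing argument: {name}")
        # shlex.quote wraps unsafe values in single quotes so the shell
        # treats them as one literal word instead of as syntax.
        return shlex.quote(str(args[name]))

    return _PLACEHOLDER.sub(substitute, template)


if __name__ == "__main__":
    print(render_command("ls ${path}", {"path": "; rm -rf /"}))
```

Quoting only protects placeholders that appear as bare words in the template. A placeholder that the template author has already wrapped in quotes, or embedded in an `eval`, would need separate review.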
Identified Problems:
- `get_available_tools` is currently 160 lines long and handles DB queries, YAML parsing, regex Markdown extraction, and LiteLLM model info lookups.
- `DynamicBashPlugin` is defined inside a loop. This is an anti-pattern that slows down execution and makes debugging stack traces difficult.

app/core/services/sub_agent.py

The atomic execution watcher.

Identified Problems:
- Retry detection relies on a string-matching `any(x in err_msg)` check (Line 85). If the gRPC error string changes slightly, retries will fail to trigger.
- `max_checks = 120` (10 minutes) is hardcoded (Line 200). Long-running code builds or system updates on slow nodes will be prematurely cut off by the SubAgent even if they are still healthy.

Recommendations:

- Update the `SKILL.md` parsing logic in tool.py to ensure `shlex.quote` is applied to ALL user-controllable inputs before shell execution.
- Use `await db.commit()` (if using an async driver) or move `db.commit()` to an external thread to avoid freezing the FastAPI event loop during token streaming.

Please review this third feature audit. If approved, I will proceed to the final backend feature: API, Routing & Security.