Code Review Report: Feature 3 — RAG & Tool Services

This report performs a deep-dive audit of the "Brain & Hands" layer, focusing on rag.py, tool.py, and sub_agent.py through the lens of 12-Factor App Methodology, Pythonic Code Style, and Architectural Stability.

🏗️ 12-Factor App Compliance Audit

Factor	Status	Observation
V. Build, Release, Run	🔴 Major Issue	Dynamic Skill Loading: `ToolService` parses Markdown (`SKILL.md`) and YAML at runtime for every tool query. This violates the separation of Build/Run and introduces high-latency Disk I/O into the hot path of the LLM interaction.
VI. Processes	🔴 Problem	Task State Volatility: `SubAgent` monitors long-running gRPC tasks using in-memory loops. If the Hub restarts, the "monitoring" state is lost and the Hub cannot re-attach to the still-running node-side task.
IX. Disposability	🔴 Performance	Blocking I/O in Async Loop: `rag.py` (Line 287) performs Synchronous DB Commits inside an `async-for` streaming loop. This blocks the entire Python event loop, causing latency spikes for all concurrent users during every token-emission cycle.

🔍 File-by-File Diagnostic

1. `app/core/services/rag.py`

The orchestrator for LLM chat and context injection.

Identified Problems:

Memory Pressure: joinedload(models.Session.messages) is called on every request. For long-running sessions, this loads hundreds of messages into memory unnecessarily when only the last few might be needed for the prompt window.
Attached Node Bloat: If a session has many attached nodes, the loop (Lines 116-194) pulls terminal history for every node into the system prompt. This drastically increases token costs and degrades RAG accuracy due to "Context Pollution".

2. `app/core/services/tool.py`

Manages skill discovery, permission routing, and execution.

[!CAUTION] CRITICAL SECURITY RISK: Shell Injection Line 383: cmd = bash_logic.replace(f"${{{k}}}", str(v)) performs raw string replacement for shell commands. An attacker providing an argument value like ; rm -rf / or $(curl ...) will successfully execute arbitrary code if the bash_logic template is not carefully constructed. Fix: ALWAYS use shlex.quote() for shell argument interpolation (as seen correctly on line 387).

Identified Problems:

Complexity / SRP Violation: get_available_tools is currently 160 lines long and handles DB queries, YAML parsing, Regex Markdown extraction, and LiteLLM model info lookups.
Runtime Class Definition: DynamicBashPlugin is defined inside a loop. This is an anti-pattern that slows down execution and makes debugging stack traces difficult.

3. `app/core/services/sub_agent.py`

The atomic execution watcher.

Identified Problems:

Fragile Error Handling: Retries are based on a simple any(x in err_msg) check (Line 85). If the gRPC error string changes slightly, retries will fail to trigger.
Timeout Rigidity: max_checks = 120 (10 minutes) is hardcoded (Line 200). Long-running code builds or system updates on slow nodes will be prematurely cut off by the SubAgent even if they are still healthy.

🛠️ Summary Recommendations

Security Fix: Immediately audit all SKILL.md parsing logic in tool.py to ensure shlex.quote is applied to ALL user-controllable inputs before shell execution.
Optimize DB Usage: Use await db.commit() (if using an async driver) or move db.commit() to an external thread to avoid freezing the FastAPI event loop during token streaming.
Cache Skill Definitions: Pre-calculate tool schemas during the Bootstrap/Build phase or cache them in Redis. Parsing Markdown for every tool listed in every chat message is unsustainable.

Please review this third feature audit. If approved, I will proceed to the final backend feature: API, Routing & Security.

Code Review Report: Feature 3 — RAG & Tool Services

🏗️ 12-Factor App Compliance Audit

🔍 File-by-File Diagnostic

1. app/core/services/rag.py

2. app/core/services/tool.py

3. app/core/services/sub_agent.py

🛠️ Summary Recommendations

1. `app/core/services/rag.py`

2. `app/core/services/tool.py`

3. `app/core/services/sub_agent.py`