# Code Review Report: Feature 3 — RAG & Tool Services

This report performs a deep-dive audit of the "Brain & Hands" layer, focusing on `rag.py`, `tool.py`, and `sub_agent.py` through the lens of **12-Factor App Methodology**, **Pythonic Code Style**, and **Architectural Stability**.

---

## 🏗️ 12-Factor App Compliance Audit

| Factor | Status | Observation |
| :--- | :--- | :--- |
| **V. Build, Release, Run** | 🔴 **Major Issue** | **Dynamic Skill Loading**: `ToolService` parses Markdown (`SKILL.md`) and YAML at **runtime** for every tool query. This violates the separation of Build/Run and introduces high-latency Disk I/O into the hot path of the LLM interaction. |
| **VI. Processes** | 🔴 **Major Issue** | **Task State Volatility**: `SubAgent` monitors long-running gRPC tasks using in-memory loops. If the Hub restarts, the "monitoring" state is lost and the Hub cannot re-attach to the still-running node-side task. |
| **IX. Disposability** | 🔴 **Performance Issue** | **Blocking I/O in Async Loop**: `rag.py` (Line 287) performs **Synchronous DB Commits** inside an `async-for` streaming loop. This blocks the entire Python event loop, causing latency spikes for all concurrent users during every token-emission cycle. |
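
The Disposability issue above can be fixed without switching to an async driver. A minimal sketch of the pattern, using hypothetical stand-ins (`Message`, `stream_and_persist`) for the streaming handler in `rag.py`: the tokens stream as before, but the synchronous commit is pushed onto a worker thread via `asyncio.to_thread`, so the event loop is never blocked.

```python
import asyncio

class Message:
    """Stand-in for the ORM message row being streamed into."""
    def __init__(self):
        self.content = ""

async def stream_and_persist(db, stream, message):
    """Yield tokens as they arrive; commit once, off the event loop."""
    async for token in stream:
        message.content += token
        yield token
    # The synchronous SQLAlchemy commit runs in a worker thread, so the
    # event loop keeps serving other users while the disk write happens.
    await asyncio.to_thread(db.commit)
```

Note that this sketch also commits once per stream rather than once per token, which removes the per-token-cycle latency spike entirely.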

---

## 🔍 File-by-File Diagnostic

### 1. `app/core/services/rag.py`
The orchestrator for LLM chat and context injection.

**Identified Problems**:
*   **Memory Pressure**: `joinedload(models.Session.messages)` is called on every request. For long-running sessions, this loads hundreds of messages into memory unnecessarily when only the last few might be needed for the prompt window.
*   **Attached Node Bloat**: If a session has many attached nodes, the loop (Lines 116-194) pulls terminal history for **every** node into the system prompt. This drastically increases token costs and degrades RAG accuracy due to "Context Pollution".

---

### 2. `app/core/services/tool.py`
Manages skill discovery, permission routing, and execution.

> [!CAUTION]
> **CRITICAL SECURITY RISK: Shell Injection**
> Line 383: `cmd = bash_logic.replace(f"${{{k}}}", str(v))` performs raw string replacement for shell commands.
> An attacker providing an argument value like `; rm -rf /` or `$(curl ...)` can execute arbitrary code, because the value is substituted into the shell command without any quoting or escaping.
> **Fix**: ALWAYS use `shlex.quote()` for shell argument interpolation (as seen correctly on line 387).
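
A sketch of the safe interpolation pattern, mirroring the `${k}` placeholder substitution from Line 383 (`render_command` and the argument names are illustrative, not the actual `tool.py` API):

```python
import shlex

def render_command(bash_logic: str, args: dict) -> str:
    """Substitute ${k} placeholders with shell-quoted argument values."""
    cmd = bash_logic
    for k, v in args.items():
        # shlex.quote turns `; rm -rf /` into the harmless literal '; rm -rf /'
        cmd = cmd.replace(f"${{{k}}}", shlex.quote(str(v)))
    return cmd
```

With quoting applied, the injection payload becomes an inert string argument instead of a second shell command.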

**Identified Problems**:
*   **Complexity / SRP Violation**: `get_available_tools` is currently 160 lines long and handles DB queries, YAML parsing, regex-based Markdown extraction, and LiteLLM model-info lookups in a single function. It should be decomposed into focused helpers.
*   **Runtime Class Definition**: `DynamicBashPlugin` is defined inside a loop. This is an anti-pattern that slows down execution and makes debugging stack traces difficult.
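
The runtime-class-definition bullet is usually fixed by hoisting the class to module level and parametrizing it per skill. A sketch (names such as `BashPlugin` and the `skills` shape are illustrative, not the real `tool.py` structures):

```python
from dataclasses import dataclass

@dataclass
class BashPlugin:
    """Defined once at module level; one instance per discovered skill."""
    name: str
    bash_logic: str

def build_plugins(skills: list[dict]) -> dict[str, BashPlugin]:
    # One class, many instances: stack traces now reference a stable,
    # importable type instead of a fresh class object created per loop pass.
    return {s["name"]: BashPlugin(s["name"], s["bash"]) for s in skills}
```

Instantiating a shared class is also cheaper than executing a `class` statement on every iteration of the discovery loop.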

---

### 3. `app/core/services/sub_agent.py`
The atomic execution watcher.

**Identified Problems**:
*   **Fragile Error Handling**: Retries are triggered by substring matching on the error message (`any(x in err_msg for x in ...)`, Line 85). If the gRPC error string changes even slightly, retries will silently stop firing.
*   **Timeout Rigidity**: `max_checks = 120` (10 minutes) is hardcoded (Line 200). Long-running code builds or system updates on slow nodes will be prematurely cut off by the SubAgent even if they are still healthy.
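
Both bullets point at the same fix: match on structured gRPC status codes instead of error-message substrings, and read the poll budget from configuration instead of a hardcoded constant. A dependency-free sketch, using a stdlib `Enum` as a stand-in for `grpc.StatusCode` (in the real service you would compare `err.code()` against `grpc.StatusCode` members directly; the env var name is an assumption):

```python
import os
from enum import Enum, auto

class StatusCode(Enum):
    """Stand-in for grpc.StatusCode, so the sketch runs without grpcio."""
    UNAVAILABLE = auto()
    DEADLINE_EXCEEDED = auto()
    INVALID_ARGUMENT = auto()

RETRYABLE = {StatusCode.UNAVAILABLE, StatusCode.DEADLINE_EXCEEDED}

def should_retry(code: StatusCode) -> bool:
    # Compare structured codes, not substrings of the rendered error text.
    return code in RETRYABLE

# Make the poll budget configurable instead of hardcoding max_checks = 120.
MAX_CHECKS = int(os.getenv("SUBAGENT_MAX_CHECKS", "120"))  # 5s polls ≈ 10 min
```

Status codes are part of the gRPC contract and stable across releases; error message text is not.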

---

## 🛠️ Summary Recommendations

1.  **Security Fix**: Immediately audit all `SKILL.md` parsing logic in `tool.py` to ensure `shlex.quote` is applied to ALL user-controllable inputs before shell execution.
2.  **Optimize DB Usage**: Use `await db.commit()` (if using an async driver) or offload `db.commit()` to a worker thread (e.g. via `asyncio.to_thread`) so the FastAPI event loop is never frozen during token streaming.
3.  **Cache Skill Definitions**: Pre-calculate tool schemas during the **Bootstrap/Build** phase or cache them in Redis. Parsing Markdown for every tool listed in every chat message is unsustainable.
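
A minimal sketch of recommendation 3: scan and parse every `SKILL.md` once during bootstrap, then serve tool schemas from an in-process cache on the hot path. The directory layout and schema shape here are assumptions for illustration (a Redis-backed cache would follow the same split):

```python
from pathlib import Path

_SCHEMA_CACHE: dict[str, dict] = {}

def bootstrap_skills(skills_dir: Path) -> None:
    """Build-phase work: one disk scan, executed before serving traffic."""
    for md in skills_dir.glob("*/SKILL.md"):
        name = md.parent.name
        # Real code would parse YAML frontmatter + Markdown here, once.
        _SCHEMA_CACHE[name] = {"name": name, "description": md.read_text()}

def get_tool_schema(name: str) -> dict:
    """Hot path: a dictionary lookup, no disk I/O or Markdown parsing."""
    return _SCHEMA_CACHE[name]
```

This restores the Build/Run separation from Factor V: skill changes take effect on the next release (or an explicit cache refresh), not mid-request.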

---

**Please review this third feature audit. If approved, I will proceed to the final backend feature: API, Routing & Security.**
