Newer
Older
cortex-hub / docs / reviews / feature_review_rag_tools.md

Code Review Report: Feature 3 — RAG & Tool Services

This report performs a deep-dive audit of the "Brain & Hands" layer, focusing on rag.py, tool.py, and sub_agent.py through the lens of 12-Factor App Methodology, Pythonic Code Style, and Architectural Stability.


🏗️ 12-Factor App Compliance Audit

Factor Status Observation
V. Build, Release, Run 🔴 Major Issue Dynamic Skill Loading: ToolService parses Markdown (SKILL.md) and YAML at runtime for every tool query. This violates the separation of Build/Run and introduces high-latency Disk I/O into the hot path of the LLM interaction.
VI. Processes 🔴 Problem Task State Volatility: SubAgent monitors long-running gRPC tasks using in-memory loops. If the Hub restarts, the "monitoring" state is lost and the Hub cannot re-attach to the still-running node-side task.
IX. Disposability 🔴 Performance Blocking I/O in Async Loop: rag.py (Line 287) performs Synchronous DB Commits inside an async-for streaming loop. This blocks the entire Python event loop, causing latency spikes for all concurrent users during every token-emission cycle.

🔍 File-by-File Diagnostic

1. app/core/services/rag.py

The orchestrator for LLM chat and context injection.

Identified Problems:

  • Memory Pressure: joinedload(models.Session.messages) is called on every request. For long-running sessions, this loads hundreds of messages into memory unnecessarily when only the last few might be needed for the prompt window.
  • Attached Node Bloat: If a session has many attached nodes, the loop (Lines 116-194) pulls terminal history for every node into the system prompt. This drastically increases token costs and degrades RAG accuracy due to "Context Pollution".

2. app/core/services/tool.py

Manages skill discovery, permission routing, and execution.

[!CAUTION] CRITICAL SECURITY RISK: Shell Injection Line 383: cmd = bash_logic.replace(f"${{{k}}}", str(v)) performs raw string replacement for shell commands. An attacker providing an argument value like ; rm -rf / or $(curl ...) will successfully execute arbitrary code if the bash_logic template is not carefully constructed. Fix: ALWAYS use shlex.quote() for shell argument interpolation (as seen correctly on line 387).

Identified Problems:

  • Complexity / SRP Violation: get_available_tools is currently 160 lines long and handles DB queries, YAML parsing, Regex Markdown extraction, and LiteLLM model info lookups.
  • Runtime Class Definition: DynamicBashPlugin is defined inside a loop. This is an anti-pattern that slows down execution and makes debugging stack traces difficult.

3. app/core/services/sub_agent.py

The atomic execution watcher.

Identified Problems:

  • Fragile Error Handling: Retries are based on a simple any(x in err_msg) check (Line 85). If the gRPC error string changes slightly, retries will fail to trigger.
  • Timeout Rigidity: max_checks = 120 (10 minutes) is hardcoded (Line 200). Long-running code builds or system updates on slow nodes will be prematurely cut off by the SubAgent even if they are still healthy.

🛠️ Summary Recommendations

  1. Security Fix: Immediately audit all SKILL.md parsing logic in tool.py to ensure shlex.quote is applied to ALL user-controllable inputs before shell execution.
  2. Optimize DB Usage: Use await db.commit() (if using an async driver) or move db.commit() to an external thread to avoid freezing the FastAPI event loop during token streaming.
  3. Cache Skill Definitions: Pre-calculate tool schemas during the Bootstrap/Build phase or cache them in Redis. Parsing Markdown for every tool listed in every chat message is unsustainable.

Please review this third feature audit. If approved, I will proceed to the final backend feature: API, Routing & Security.