Newer
Older
cortex-hub / docs / reviews / feature_review_skills_fs.md

Code Review Report: Feature 10 — Skill Loading & Filesystem Workflows

This report performs a deep-dive audit of the dynamic skill loading engine, focusing on fs_loader.py through the lens of 12-Factor App Methodology, Pythonic Code Style, and Filesystem Efficiency.


🏗️ 12-Factor App Compliance Audit

Factor Status Observation
V. Build, Release, Run 🔴 Performance Problem Expensive Runtime Discovery: The FileSystemSkillLoader performs a recursive os.walk (Line 93) across all skill directories every time skills are requested. This violates the separation of "Build" (identifying skills) from "Run" (executing them). In a large-scale deployment, this constant Disk I/O will significantly slow down Hub response times.
I. Codebase 🟡 Warning Fragile Path Resolution: Line 155 uses a deep relative path (../../../skills) to resolve the system skills directory. This makes the Hub codebase fragile to refactoring and violates Factor I's principle of deterministic environment resolution.
X. Dev/Prod Parity Success The loader correctly distinguishes between system skills (read-only) and user-generated skills in settings.DATA_DIR, maintaining proper environmental separation.

🔍 File-by-File Diagnostic

1. app/core/skills/fs_loader.py

The engine that turns Markdown and YAML files on disk into interactive AI capabilities.

[!TIP] Optimization: Lazy Metadata Loading The LazyFileContent (Line 101) is a good pattern to avoid memory-bloat during initial walk. However, the vfs_files list still grows per-file recorded on disk. Fix: Implement a "High-Water Mark" for skill scanning. If a folder contains more than X files (e.g., 50), stop scanning and return a warning to prevent Hub memory exhaustion from accidental node_modules or .git folder inclusion.

Identified Problems:

  • Symbolic Link Hazard: os.walk (Line 93) does not explicitly handle symlink loops. If an AI agent accidentally creates a circular symlink in a skill directory, the get_all_skills call will enter an infinite loop, crashing the Hub.
  • Lack of File-Type Filtering: The VFS records every file except .metadata.json and hidden files (Line 96). It should strictly only index relevant text files (.py, .md, .json, .sql) to avoid polluting the LLM's context with binary assets (images, PDFs) that it cannot process natively without specific tools.

🛠️ Summary Recommendations

  1. Index Caching: Implement a simple file-watcher or a cache (e.g., JSON file on disk) to store the list of available skills, rather than re-walking the tree on every request.
  2. Hardened Path Resolution: Move the system skills path to app/config.py as a centralized settings.SYSTEM_SKILLS_DIR constant.
  3. Strict File Extensions: Update the VFS walk to only include known text-based extensions to optimize memory and LLM context usage.

This concludes Feature 10. I have persisted this report to /app/docs/reviews/feature_review_skills_fs.md. I have now reviewed the majority of the backend codebase and identified several critical security and performance risks. How would you like to proceed?