🔍 Cortex Agent Node: Feature Gap Analysis & Roadmap
This document outlines the critical missing features required to transition the current gRPC Proof of Concept (PoC) into a full-scale, production-ready Distributed AI Agent System.
1. 🗄️ Workspace & File Synchronization
The current node executes commands but lacks a native way to manage project-level files.
- The Gap: No bidirectional sync between the server and the node (e.g., pushing local server files to the node workspace and pulling the node's changes back).
- Required: A content-addressable synchronization layer (Merkle Tree / Hash-based) to efficiently mirror workspaces to remote nodes without redundant transfers.
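
The hash-based comparison can be sketched as follows. This is a minimal illustration, not the planned implementation: it uses a flat per-file SHA-256 manifest rather than a full Merkle tree, and the names `build_manifest` and `diff_manifests` are hypothetical. A Merkle tree would additionally hash directories so that unchanged subtrees can be skipped without listing their files.

```python
import hashlib
from pathlib import Path


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's relative POSIX path to the SHA-256 of its contents."""
    manifest: dict[str, str] = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def diff_manifests(
    source: dict[str, str], target: dict[str, str]
) -> tuple[list[str], list[str]]:
    """Return (paths to upload, paths to delete) so that target mirrors source."""
    # A file is transferred only if it is missing on the target or its hash differs.
    to_upload = sorted(p for p, h in source.items() if target.get(p) != h)
    # Files that exist only on the target are stale and should be removed.
    to_delete = sorted(p for p in target if p not in source)
    return to_upload, to_delete
```

Only the paths in `to_upload` need to cross the network; files whose hashes already match are skipped.
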
2. 🌊 Real-time Log Streaming (Observability)
Currently, stdout/stderr is only returned upon task completion.
- The Gap: No visibility into long-running tasks or hanging builds.
- Required: gRPC server-streaming RPCs (or bidirectional streaming) that deliver console output live, so the Main AI can detect progress or failures as they occur.
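
As a rough sketch of the node side, the generator below yields each output line as soon as the child process writes it, instead of buffering until exit. The gRPC layer is deliberately left out: `stream_command` and its `("log", ...)` / `("exit", ...)` event tuples are hypothetical, standing in for whatever message type a streaming handler would yield.

```python
import subprocess
import sys
from typing import Iterator


def stream_command(argv: list[str]) -> Iterator[tuple[str, str]]:
    """Yield ("log", line) events while the command runs, then ("exit", code)."""
    proc = subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the same stream
        text=True,
        bufsize=1,  # line-buffered on our side of the pipe
    )
    assert proc.stdout is not None
    try:
        for line in proc.stdout:
            yield ("log", line.rstrip("\n"))
        yield ("exit", str(proc.wait()))
    finally:
        # If the consumer stops early (e.g. the caller cancels), stop the child too.
        if proc.poll() is None:
            proc.kill()
            proc.wait()
        proc.stdout.close()


if __name__ == "__main__":
    for kind, value in stream_command([sys.executable, "-c", "print('building')"]):
        print(kind, value)
```

A streaming RPC handler could iterate over this generator and forward each event, which gives the caller visibility into hanging builds before the task finishes.
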
3. 🛡️ Robust Sandbox Isolation
The current sandbox relies on string-based filtering of shell commands.
- The Gap: Vulnerable to complex shell escapes, symlink attacks, and environment manipulation.
- Required: OS-level isolation via containers (Docker, Podman) or microVMs (Firecracker), so that each task is confined to its own namespace or virtual machine.
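
One possible shape for the container option is sketched below. It only builds the `docker run` argument list; the helper name `build_sandbox_argv`, the `/workspace` mount point, and the resource limits are illustrative assumptions, not settings taken from the current system.

```python
def build_sandbox_argv(
    image: str,
    workspace: str,
    command: list[str],
    memory: str = "512m",  # assumed default, tune per workload
) -> list[str]:
    """Build a `docker run` invocation that executes `command` in a locked-down container."""
    return [
        "docker", "run", "--rm",
        "--network", "none",                    # no network access
        "--read-only",                          # read-only root filesystem
        "--cap-drop", "ALL",                    # drop all Linux capabilities
        "--security-opt", "no-new-privileges",  # block privilege escalation
        "--memory", memory,
        "--pids-limit", "128",                  # bound the number of processes
        "--volume", f"{workspace}:/workspace",  # only the task workspace is writable
        "--workdir", "/workspace",
        image,
        *command,
    ]
```

Passing the task as an argument list (for example to `subprocess.run`) means no host shell ever parses it, which removes the string-filtering step on the host entirely.
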
4. 🔗 Specialized Sub-Worker Protocols (CDP/LSP)
The agent treats browser automation and coding as generic shell commands.
- The Gap: Inefficiency; starting a fresh browser for every click is slow and loses state.
- Required: Persistent sub-bridges (e.g., a long-lived Chrome DevTools Protocol or Language Server Protocol connection) that let the Main AI maintain one session across multiple delegated tasks.
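
The idea of reusing one session across tasks can be sketched with a small registry. Everything here is hypothetical: `SessionRegistry`, `acquire`, and `release` are illustrative names, and the `factory` callable stands in for whatever actually opens a browser or language-server connection.

```python
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


class SessionRegistry(Generic[T]):
    """Keep long-lived sessions alive between tasks, keyed by a session id."""

    def __init__(self, factory: Callable[[], T], closer: Callable[[T], None]) -> None:
        self._factory = factory
        self._closer = closer
        self._sessions: dict[str, T] = {}

    def acquire(self, session_id: str) -> T:
        """Return the existing session for this id, creating it on first use."""
        if session_id not in self._sessions:
            self._sessions[session_id] = self._factory()
        return self._sessions[session_id]

    def release(self, session_id: str) -> bool:
        """Close and forget a session; return False if the id was unknown."""
        session = self._sessions.pop(session_id, None)
        if session is None:
            return False
        self._closer(session)
        return True
```

Two tasks that pass the same `session_id` receive the same underlying object, so state such as cookies or open documents survives between them.
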
5. 📦 Binary Artifact & Large Data Handling
The system currently lacks logic for large file transport.
- The Gap: gRPC's default maximum receive message size (4 MB) causes the call to fail with a RESOURCE_EXHAUSTED error if a node tries to return a video capture or large log file in a single message.
- Required: Chunked file upload/download logic for artifacts like screenshots, videos, and build binaries.
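
Chunking can be sketched independently of the transport. In the example below, `iter_chunks` and `reassemble` are hypothetical helpers and the 1 MiB chunk size is an assumption chosen to stay well under the default message limit; each `(offset, data)` pair would map to one message in a streaming RPC.

```python
import hashlib
from typing import BinaryIO, Iterable, Iterator

CHUNK_SIZE = 1024 * 1024  # 1 MiB per message (assumed value)


def iter_chunks(
    stream: BinaryIO, chunk_size: int = CHUNK_SIZE
) -> Iterator[tuple[int, bytes]]:
    """Yield (offset, data) pairs covering the whole stream."""
    offset = 0
    while True:
        data = stream.read(chunk_size)
        if not data:
            break
        yield offset, data
        offset += len(data)


def reassemble(
    chunks: Iterable[tuple[int, bytes]], out: BinaryIO, expected_sha256: str
) -> int:
    """Write chunks in order, verify the overall digest, and return bytes written."""
    hasher = hashlib.sha256()
    written = 0
    for offset, data in chunks:
        if offset != written:
            raise ValueError(f"expected offset {written}, got {offset}")
        out.write(data)
        hasher.update(data)
        written += len(data)
    if hasher.hexdigest() != expected_sha256:
        raise ValueError("artifact digest mismatch")
    return written
```

The offset check catches missing or reordered chunks early, and the final digest check catches corruption. Both also give a natural place to resume an interrupted transfer.
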
🏗️ Node Lifecycle & Quality of Life
- Automatic Updates: Mechanism for nodes to self-update their binary/logic when the central protocol evolves.
- Graceful Shutdown: Handling system signals to allow background workers to finish or clean up before disconnection.
- Local Cache: Persistence for task history and metadata on the node to handle temporary network partitions.
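
For the local cache item, one simple option is an SQLite file on the node. The sketch below is an assumption about how that could look, not an existing component: `TaskCache`, its table layout, and the `synced` flag are all hypothetical.

```python
import json
import sqlite3


class TaskCache:
    """Persist task results locally until they have been reported upstream."""

    def __init__(self, path: str = ":memory:") -> None:
        # Use a real file path on the node; ":memory:" is only for demonstration.
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS tasks ("
            "task_id TEXT PRIMARY KEY, "
            "payload TEXT NOT NULL, "
            "synced INTEGER NOT NULL DEFAULT 0)"
        )
        self._db.commit()

    def record(self, task_id: str, result: dict) -> None:
        """Store (or overwrite) a task result and mark it as not yet synced."""
        self._db.execute(
            "INSERT OR REPLACE INTO tasks (task_id, payload, synced) VALUES (?, ?, 0)",
            (task_id, json.dumps(result)),
        )
        self._db.commit()

    def pending(self) -> list[tuple[str, dict]]:
        """Return results that still need to be sent, in insertion order."""
        rows = self._db.execute(
            "SELECT task_id, payload FROM tasks WHERE synced = 0 ORDER BY rowid"
        ).fetchall()
        return [(task_id, json.loads(payload)) for task_id, payload in rows]

    def mark_synced(self, task_id: str) -> None:
        """Flag a result as delivered once the server has acknowledged it."""
        self._db.execute("UPDATE tasks SET synced = 1 WHERE task_id = ?", (task_id,))
        self._db.commit()
```

After a network partition heals, the node can replay everything returned by `pending()` and call `mark_synced` for each acknowledged entry.
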
[!NOTE] These features bridge the gap between "Command Execution" and "Full Autonomous Collaboration."