
πŸ“ Cortex Distributed Agent: Project TODO List

This document tracks prioritized tasks, technical debt, and future implementations for the Cortex project.

🚀 High Priority (Infrastructure)

[ ] Persistent Sub-Worker Bridges (CDP/LSP) - ✅ FOUNDATIONS BUILT

  • Status: Basic navigation, screenshotting, and persistent session state are implemented.
  • Goal: Support a professional, high-fidelity Antigravity Browser Skill.

🌐 Comprehensive Browser Skill Requirements:

  • [ ] JS Console Tunnel: Pipe console.log/error from the browser back to the server in real-time.
  • [ ] Network Observability: Capture and return XHR/Fetch traffic (HAR or failed requests only) for AI debugging.
  • [ ] A11y Tree Perception: Provide the Accessibility Tree (JSON) to the AI instead of just raw HTML/DOM for better semantic understanding.
  • [ ] Advanced Interactions: Support Hover, Scroll, Drag & Drop, and Multi-key complex input.
  • [ ] EVAL Skill: Allow the AI to inject and execute arbitrary JavaScript (page.evaluate()) to extract data or trigger events.
  • [ ] Smart Wait Logic: Implement wait_for_network_idle, wait_for_selector, and custom predicates to reduce task flakiness.
  • [ ] Artifact Extraction: Export high-definition Videos (chunked) and HAR files for audit trails.
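
The Smart Wait Logic item above could start from a generic polling helper. The following is a minimal sketch, assuming a hypothetical `wait_for` function that is not part of the current codebase; it uses no browser library and only shows the custom-predicate case, with the clock and sleep functions injectable so the logic can be tested without real delays.

```python
import time
from typing import Callable


def wait_for(predicate: Callable[[], bool],
             timeout: float = 5.0,
             interval: float = 0.05,
             clock: Callable[[], float] = time.monotonic,
             sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll `predicate` until it returns True or `timeout` seconds elapse.

    Hypothetical helper: returns True on success and False on timeout,
    so the caller decides whether a timeout is an error.
    """
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A browser-backed version would pass a predicate that queries the page (for example, "no in-flight requests" or "selector present"); returning a boolean instead of raising keeps the retry policy in the caller's hands.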

[ ] Multi-Tenancy & Resource Isolation

  • Description: Isolate node groups by user/tenant and enforce hardware quotas.
  • Why: Allows the Main AI (Antigravity) to manage resource usage and forcefully cancel zombie tasks that may be hanging or orphaned, ensuring node health.
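
As a rough illustration of the quota and zombie-cancellation idea, here is a minimal sketch. All names (`TenantScheduler`, `TenantQuota`, `reap_zombies`) are hypothetical, a concurrent-task limit stands in for real hardware quotas, and a missed heartbeat is assumed to be the signal that a task is hung or orphaned.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TenantQuota:
    max_concurrent_tasks: int  # stand-in for CPU/memory/GPU quotas


@dataclass
class Task:
    task_id: str
    tenant: str
    last_heartbeat: float


class TenantScheduler:
    """Admits tasks per tenant and reaps tasks whose heartbeat went stale."""

    def __init__(self, quotas: Dict[str, TenantQuota],
                 heartbeat_timeout: float = 30.0) -> None:
        self.quotas = quotas
        self.heartbeat_timeout = heartbeat_timeout
        self.running: Dict[str, Task] = {}

    def admit(self, task_id: str, tenant: str, now: float) -> bool:
        quota = self.quotas.get(tenant)
        if quota is None:
            return False  # unknown tenants get no resources
        active = sum(1 for t in self.running.values() if t.tenant == tenant)
        if active >= quota.max_concurrent_tasks:
            return False
        self.running[task_id] = Task(task_id, tenant, now)
        return True

    def heartbeat(self, task_id: str, now: float) -> None:
        if task_id in self.running:
            self.running[task_id].last_heartbeat = now

    def reap_zombies(self, now: float) -> List[str]:
        """Remove and return IDs of tasks that missed their heartbeat window."""
        stale = [tid for tid, t in self.running.items()
                 if now - t.last_heartbeat > self.heartbeat_timeout]
        for tid in stale:
            del self.running[tid]
        return stale
```

In a real deployment the reaper would also send a cancel message to the owning node; this sketch only covers the server-side bookkeeping.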

[ ] Binary Artifact & Large Data Handling (Chunking)

  • Description: Implement gRPC stream-based chunking for large artifacts.
  • Specific Case: Support high-fidelity Video Recordings from Browser sessions (multi-GB files).
  • Requirement: Transparency. The Main AI should just see a "File" result; reassembly happens at the server layer.
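
The chunk-and-reassemble flow can be sketched independently of gRPC. The example below is a minimal in-memory illustration under stated assumptions: `Chunk`, `split_into_chunks`, and `reassemble` are hypothetical names, the SHA-256 integrity check is an added assumption, and a real implementation for multi-GB recordings would stream to disk rather than buffer in memory.

```python
import hashlib
from typing import Iterable, Iterator, NamedTuple


class Chunk(NamedTuple):
    index: int     # position in the stream, starting at 0
    data: bytes
    is_last: bool  # marks the final chunk of the artifact


def split_into_chunks(payload: bytes,
                      chunk_size: int = 64 * 1024) -> Iterator[Chunk]:
    """Yield fixed-size chunks; an empty payload yields one empty final chunk."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    total = max(1, -(-len(payload) // chunk_size))  # ceiling division
    for i in range(total):
        start = i * chunk_size
        yield Chunk(i, payload[start:start + chunk_size], i == total - 1)


def reassemble(chunks: Iterable[Chunk], expected_sha256: str) -> bytes:
    """Rebuild the artifact server-side, rejecting gaps and corruption."""
    buf = bytearray()
    expected_index = 0
    saw_last = False
    for chunk in chunks:
        if chunk.index != expected_index:
            raise ValueError(f"out-of-order chunk {chunk.index}")
        buf.extend(chunk.data)
        expected_index += 1
        if chunk.is_last:
            saw_last = True
            break
    if not saw_last:
        raise ValueError("stream ended before the final chunk")
    if hashlib.sha256(buf).hexdigest() != expected_sha256:
        raise ValueError("checksum mismatch")
    return bytes(buf)
```

Because reassembly and verification happen in one server-side function, the caller only ever sees the complete file, which matches the transparency requirement.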

[ ] Architectural Refinement: Unified Worker Shim

  • Description: Re-evaluate the "Skill" abstraction. Move towards a model where each task is a specialized worker process that decides its capability (Shell vs Playwright) at startup.
  • Goal: Simplifies context isolation and reduces manager-thread overhead.
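
One way to picture the shim is a worker that binds to exactly one capability when it is constructed. This is only a sketch of the shape: `Worker`, `HANDLERS`, and the two handler functions are hypothetical, and the handlers are stand-ins that return strings instead of running a shell or a browser.

```python
from typing import Callable, Dict


def run_shell(payload: str) -> str:
    # Stand-in: a real handler would execute the command in a sandbox.
    return f"shell:{payload}"


def run_browser(payload: str) -> str:
    # Stand-in: a real handler would drive a Playwright session.
    return f"playwright:{payload}"


HANDLERS: Dict[str, Callable[[str], str]] = {
    "shell": run_shell,
    "playwright": run_browser,
}


class Worker:
    """A worker process that fixes its capability once, at startup."""

    def __init__(self, capability: str) -> None:
        if capability not in HANDLERS:
            raise ValueError(f"unknown capability: {capability}")
        self.capability = capability
        self._handler = HANDLERS[capability]

    def execute(self, payload: str) -> str:
        return self._handler(payload)
```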

[ ] Graceful Shutdown & Local Task Persistence (Built-in)

  • Description: Handle node interrupts (SIGTERM/SIGINT) so that workers can finish or checkpoint their tasks. Store a local task_history.json on the node to recover state after a crash or restart.
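
A minimal sketch of this behavior follows. It assumes a hypothetical `NodeRuntime` class, a flat task-ID-to-status mapping as the file format, and an "interrupted" status for tasks that were still running when the node went down; none of these are taken from the existing code.

```python
import json
import os
import signal
import tempfile
import threading
from typing import Dict


class NodeRuntime:
    def __init__(self, history_path: str) -> None:
        self.history_path = history_path
        self.stop_requested = threading.Event()
        self.tasks: Dict[str, str] = {}  # task_id -> status

    def install_signal_handlers(self) -> None:
        # signal.signal() must be called from the main thread.
        for sig in (signal.SIGTERM, signal.SIGINT):
            signal.signal(sig, self._on_signal)

    def _on_signal(self, signum, frame) -> None:
        # Only set a flag here; workers poll it and finish or checkpoint.
        self.stop_requested.set()

    def checkpoint(self) -> None:
        """Write the history atomically: temp file first, then rename."""
        directory = os.path.dirname(os.path.abspath(self.history_path))
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(self.tasks, fh)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp_path, self.history_path)

    def recover(self) -> Dict[str, str]:
        """Load the history; tasks left 'running' are marked 'interrupted'."""
        try:
            with open(self.history_path, encoding="utf-8") as fh:
                self.tasks = json.load(fh)
        except FileNotFoundError:
            self.tasks = {}
        for task_id, status in self.tasks.items():
            if status == "running":
                self.tasks[task_id] = "interrupted"
        return self.tasks
```

Writing to a temporary file and renaming it means a crash in the middle of a checkpoint leaves the previous task_history.json intact.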

[ ] Server-Side Registry & Task Persistence

  • Description: Migrate NodeRegistry and WorkPool from in-memory to a persistent backend (Postgres/Redis).
  • Priority: Deferred until Full System Integration phase.

[ ] Workspace Mirroring & Efficient File Sync

  • Description: Maintain a server-side mirror of node workspaces for zero-latency AI perception.

[ ] Real-time gRPC Log Streaming

  • Description: Bidirectional stream for live stdout/stderr.

🐒 Low Priority / Observation

[ ] OS-Level Isolation (Firecracker/VNC)

  • Description: Lightweight virtualization (microVMs) for worker execution.
  • Status: Monitored.

[ ] Node Lifecycle: Auto-Updates

  • Description: Mechanism for nodes to self-update.

[ ] Vertical & Horizontal Scalability

  • Description: Migrate to a stateless server design with load balancing.

πŸ—ΊοΈ Future Roadmap (Strategic)

[ ] Advanced Scheduling & Capability Routing

  • Description: Sophisticated scheduler to match complex constraints (GPU, Region, Priority).
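
Constraint matching of this kind can be sketched as a filter followed by a tie-break. The example below is illustrative only: `Node`, `TaskSpec`, and `pick_node` are hypothetical names, capabilities are modeled as plain string tags such as "gpu", and the least-loaded tie-break is an assumption. Priority ordering of the task queue is left out.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Node:
    node_id: str
    capabilities: FrozenSet[str]
    region: str
    active_tasks: int


@dataclass(frozen=True)
class TaskSpec:
    required: FrozenSet[str]
    region: Optional[str] = None  # None means any region is acceptable


def pick_node(task: TaskSpec, nodes: List[Node]) -> Optional[Node]:
    """Return the least-loaded node that satisfies every constraint."""
    eligible = [
        n for n in nodes
        if task.required <= n.capabilities
        and (task.region is None or n.region == task.region)
    ]
    if not eligible:
        return None
    # Break load ties by node_id so the choice is deterministic.
    return min(eligible, key=lambda n: (n.active_tasks, n.node_id))
```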

[ ] mTLS Certificate Lifecycle Management

  • Description: Automated renewal, revocation, and rotation of node certificates.

[ ] Immutable Audit & Compliance

  • Description: Cryptographically signed records of every TaskRequest and TaskResponse for forensics.
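
A tamper-evident log can be sketched with a hash chain. The example below is a minimal illustration, not a compliance-grade design: `append_record` and `verify_log` are hypothetical names, and HMAC-SHA256 with a shared key stands in for real asymmetric signatures, which are not available in the Python standard library.

```python
import hashlib
import hmac
import json
from typing import Dict, List

GENESIS = "0" * 64  # placeholder "previous MAC" for the first record


def _mac(prev: str, payload: Dict, key: bytes) -> str:
    # Canonical JSON so the same record always hashes to the same value.
    body = json.dumps({"prev": prev, "payload": payload},
                      sort_keys=True, separators=(",", ":"))
    return hmac.new(key, body.encode("utf-8"), hashlib.sha256).hexdigest()


def append_record(log: List[Dict], payload: Dict, key: bytes) -> Dict:
    """Append a record whose MAC covers the payload and the previous MAC."""
    prev = log[-1]["mac"] if log else GENESIS
    record = {"prev": prev, "payload": payload, "mac": _mac(prev, payload, key)}
    log.append(record)
    return record


def verify_log(log: List[Dict], key: bytes) -> bool:
    """Check every MAC and every link in the chain."""
    prev = GENESIS
    for record in log:
        if record["prev"] != prev:
            return False
        expected = _mac(record["prev"], record["payload"], key)
        if not hmac.compare_digest(expected, record["mac"]):
            return False
        prev = record["mac"]
    return True
```

Because each MAC covers the previous one, editing or deleting any earlier TaskRequest or TaskResponse record invalidates every record after it.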