
📁 Ghost Mirror & File Sync: AI Integration Plan

Currently Implemented Architecture (The Baseline)

The Cortex Swarm features a robust bidirectional file synchronization engine ("Ghost Mirror") built over the bidirectional gRPC task stream.

1. Agent Node (core/sync.py, core/watcher.py)

  • Real-Time Watcher: The Node uses watchdog to monitor its local workspace folder. When a user or command modifies a file, the Node splits it into 64 KB FilePayload Protobuf chunks and streams them to the Hub.
  • FS Controller: The Node listens for SyncControl gRPC commands:
    • Watch Controls: START_WATCHING, STOP_WATCHING, LOCK (blocks local edits).
    • Explorer Actions: LIST, READ, WRITE, DELETE.
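The chunking step above can be sketched as follows. This is a minimal illustration, not the real watcher code: `FilePayload` here is a hypothetical dataclass stand-in for the actual Protobuf message, and the real implementation streams these over gRPC rather than yielding them.

```python
from dataclasses import dataclass

CHUNK_SIZE = 64 * 1024  # 64 KB, as used by the watcher


@dataclass
class FilePayload:
    """Hypothetical stand-in for the real FilePayload Protobuf message."""
    path: str
    offset: int
    data: bytes


def chunk_file(path: str):
    """Yield 64 KB FilePayload chunks suitable for streaming to the Hub."""
    with open(path, "rb") as f:
        offset = 0
        while True:
            data = f.read(CHUNK_SIZE)
            if not data:
                break
            yield FilePayload(path=path, offset=offset, data=data)
            offset += len(data)
```

Carrying the byte offset in each chunk lets the Hub reassemble the file deterministically even if chunks are processed out of order.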

2. AI Hub Server (core/grpc/core/mirror.py, assistant.py)

  • Ghost Mirror: The Hub receives the FilePayload chunks and maintains an exact, real-time replica of the workspace on the Hub's local disk at /app/data/mirrors/{session_id}.
  • Mesh Explorer: The AssistantService exposes .ls(), .cat(), .write(), and .rm(), which send SyncControl commands over the network to the Node and wait for the result.
  • AI Access: The current mesh_file_explorer skill uses these AssistantService functions, meaning every time the AI reads a file, it does a full network round-trip to the node, waiting for the node to read it and send it back.

💡 Strategic Swarm Use Cases (Why this is powerful)

The file sync infrastructure (Ghost Mirror) is incredibly powerful for the AI. Because the sync engine guarantees eventual consistency across assigned nodes and the central Hub mirror, it natively unlocks several advanced Swarm workflows:

Use Case 1: Centralized Large Scale Refactoring (Continuous Integration Flow)

When the AI is tasked with refactoring an entire codebase, it can use the mesh_file_explorer to apply massive multi-file changes entirely on the Hub's local mirrored repository.

  • The Flow: The AI modifies the Hub's master files instantly. The Ghost Mirror automatically syncs these changes out to the specific Edge Node.
  • Testing: Once the AI finishes the refactor, it simply calls the execute skill on that remote edge node (e.g., npm run test). The code is already there. If it fails, the AI iterates locally on the hub.
  • Benefit: The AI operates in a clean, consistent local environment without fighting SSH streams or network latency during the heavy editing phase.

Use Case 2: Multi-Agent Collaboration & State Sharing

A single synced directory can act as a live shared memory state across multiple disjoint agent nodes that are otherwise unaware of each other.

  • The Flow: Node A is running a backend scraper script. Node B is running a Web UI testing tool. Both are mapped to the same logical session_id workspace.
  • State Bridge: Node A writes its scraped data to shared/data.json. The Hub's Ghost Mirror immediately replicates that change down to Node B. Node B's watcher detects the new file and begins processing it.
  • Benefit: By exposing a synced folder, individual node agents can collaborate as in a true distributed system, passing data through the filesystem rather than relying entirely on complex gRPC signaling.
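The state-bridge handoff above can be sketched as two functions, one per node, under the assumption that each node sees the synced workspace as an ordinary local directory (the Ghost Mirror handles replication in between). Both function names are illustrative; a real Node B would use its watchdog watcher rather than the simple polling loop shown here.

```python
import json
import time
from pathlib import Path


def publish_result(workspace: str, records: list) -> None:
    """Node A: write scraped records into the synced workspace.

    Writes to a temp file and renames it, so Node B (and the sync
    engine) never observes a half-written data.json."""
    shared = Path(workspace) / "shared"
    shared.mkdir(parents=True, exist_ok=True)
    tmp = shared / "data.json.tmp"
    tmp.write_text(json.dumps(records))
    tmp.replace(shared / "data.json")  # atomic rename on POSIX


def wait_for_result(workspace: str, timeout: float = 30.0, poll: float = 0.25):
    """Node B: block until shared/data.json appears, then consume it."""
    target = Path(workspace) / "shared" / "data.json"
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if target.exists():
            return json.loads(target.read_text())
        time.sleep(poll)
    raise TimeoutError(f"{target} never synced")
```

The write-then-rename step matters in a synced folder: it keeps the watcher from shipping a partially written JSON file to the other node.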

🎯 The Problem

Though the AI can technically access files via the mesh_file_explorer skill, the current implementation is inefficient: when the AI debugs a file in a synced workspace (session_id), it wastes time on a network round-trip to the Agent Node for every ls and cat command.

Since the Hub already maintains a perfect local Ghost Mirror (/app/data/mirrors), the AI should query the local Hub disk instead and get results instantly.


🚀 AI Integration TODO Phase Plan

Phase 1: Local Mirror Fast-Paths (Zero-Latency File Explorer)

Goal: Reduce AI file exploration latency from ~1000ms (network) to ~1ms (local disk).

Implementation Blueprint (For AI reading this document):

  • Target File: Modifications should primarily happen in /app/ai-hub/app/core/grpc/services/assistant.py where ls and cat are defined.
  • Task: Intercept ls and cat if session_id != "__fs_explorer__".
    • The __fs_explorer__ session is a special session used by the UI to browse arbitrarily anywhere on the machine. This must still use gRPC.
    • Any other UUID session_id means the Hub has a mirror at /app/data/mirrors/{session_id}.
  • Local ls:
    • Use os.scandir or os.walk on the local Hub path (/app/data/mirrors/{session_id}/{path}).
    • Construct a dictionary matching the existing UI response signature: {"path": path, "files": [{"name": "file.txt", "size": 123, "is_dir": False}, ...]}.
    • Return this instantly, bypassing self.journal.register(...) and node.queue.put(...) entirely.
  • Local cat:
    • Read the file from the local Hub mirror with Python's built-in open()/read() (ideally inside a with block so the handle is closed).
    • Return {"path": path, "content": text}.
  • Reconciled write/rm:
    • Update the Hub's local mirror synchronously using Python's os and shutil tools.
    • Keep the node.queue.put(FileSyncMessage(SyncControl.WRITE)) line, but make it fire-and-forget or await it concurrently, returning success to the AI instantly.

Phase 2: Active AI Sync Orchestration

Goal: Empower the Swarm AI to autonomously manage replication and locks across nodes.

Implementation Blueprint (For AI reading this document):

  • Target Files: Create or update /app/ai-hub/app/core/skills/definitions/mesh_sync_control.json and map it in /app/ai-hub/app/core/services/tool.py.
  • Capability Signatures:
    • start_sync(node_id: str, path: str): Sends SyncControl.START_WATCHING via AssistantService to instruct a new edge node to hook into the mesh.
    • lock_node(node_id: str): Sends SyncControl.LOCK to prevent a human dev from altering files while the SubAgent is running multi-file edits.
    • resync_node(node_id: str): Sends SyncControl.RESYNC to force the node to hash-check itself against the master mirror to fix desync errors naturally.
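The capability signatures above might be wired up roughly as follows. The SyncControl values mirror those named in this document, but the Enum and the `_send` shim are hypothetical stand-ins: the real implementation would enqueue a FileSyncMessage on the node's gRPC task stream via AssistantService.

```python
from enum import Enum, auto


class SyncControl(Enum):
    """Stand-in for the real SyncControl enum in the Protobuf schema."""
    START_WATCHING = auto()
    STOP_WATCHING = auto()
    LOCK = auto()
    RESYNC = auto()


def _send(node_id: str, command: SyncControl, **kwargs) -> dict:
    """Hypothetical transport shim; the real code would put a
    FileSyncMessage on the target node's gRPC queue."""
    return {"node_id": node_id, "command": command.name, **kwargs}


def start_sync(node_id: str, path: str) -> dict:
    """Hook a new edge node into the mesh by watching `path`."""
    return _send(node_id, SyncControl.START_WATCHING, path=path)


def lock_node(node_id: str) -> dict:
    """Block local human edits while the SubAgent runs multi-file changes."""
    return _send(node_id, SyncControl.LOCK)


def resync_node(node_id: str) -> dict:
    """Force a hash-check against the master mirror to repair drift."""
    return _send(node_id, SyncControl.RESYNC)
```

Keeping the three skills as thin wrappers over one shim makes the JSON skill definitions in mesh_sync_control.json a direct one-to-one mapping.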

Phase 3: Autonomous Conflict Resolution

Goal: Allow the AI to act as the ultimate "git merge" authority over the distributed filesystem.

Implementation Blueprint (For AI reading this document):

  • Event Tunnel: In /app/ai-hub/app/core/grpc/services/assistant.py or the main task stream router, intercept SyncStatus.RECONCILE_REQUIRED events.
  • Action: Instead of just warning the UI, drop an Observation event directly into the SubAgent's RagPipeline queue.
    • "Warning: Node A has drifted. Hash mismatch on /src/lib.js."
  • New Skill: Provide the AI with an inspect_drift(node_id, file_path) skill which gives a unified diff of what the Hub thinks the file looks like vs. what the Node actually has, empowering the AI to issue the decisive write.