cortex-hub / docs / refactors / dedicated_browser_service.md

Design Document: Dedicated Browser Service Refactor

1. Rationale

Currently, every agent node in the Cortex Mesh can optionally support a "browser skill" via Playwright. While flexible, this introduces several issues:

  • Latency: High overhead in sending large DOM/A11Y snapshots over the bi-directional gRPC TaskStream.
  • Resource Heavy: Browser instances on edge nodes consume significant RAM/CPU.
  • Dependency Bloat: Every agent node needs Playwright/Chromium dependencies.
  • Complexity: Synchronizing browser state across a distributed mesh is difficult.

By moving browser automation to a dedicated service located alongside the AI Hub, we achieve near-zero latency for DOM extraction and more robust state management.

2. New Architecture

The new architecture introduces a standalone Browser Service container.

Component Diagram

```mermaid
graph TD
    User["USER / UI"] <--> Hub["AI Hub (ai-hub)"]

    subgraph "Internal Processing"
        Hub --> SA["Sub-Agent (Browser Expert)"]
        SA -- "gRPC / REST" --> BSC["Browser Service Client"]
    end

    BSC <--> BS["Dedicated Browser Service (Container)"]

    subgraph "Browser Service"
        BS <--> PW["Playwright / Chromium"]
    end

    Hub <--> Mesh["Agent Mesh (Edge Nodes)"]
    Mesh -- "Local Tasks" --> Shell["Shell / File System"]
```

Key Changes

  1. Hub Integration: The ToolService in the Hub will no longer dispatch browser tasks to the TaskAssistant (which routes them to arbitrary mesh nodes). Instead, it will talk directly to the BrowserService client.
  2. Stateless/Stateful Support: The BrowserService will manage browser contexts, allowing the Hub to reference a session_id to continue navigation on the same page.
  3. Removal from Agents: All agent-node code related to the browser (bridges, dependencies, and proto fields) will be removed.
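The session contract in change 2 can be sketched with an in-memory stand-in for the eventual gRPC client. The names here (`BrowserServiceClient`, `open_session`, `navigate`) are illustrative, not final API:

```python
import uuid
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass
class BrowserSession:
    """State kept per session (here just the current URL)."""
    session_id: str
    current_url: Optional[str] = None

class BrowserServiceClient:
    """In-memory stand-in for the gRPC client the Hub's ToolService would use."""

    def __init__(self) -> None:
        self._sessions: Dict[str, BrowserSession] = {}

    def open_session(self) -> str:
        """Create a fresh browser context and return its session_id."""
        sid = uuid.uuid4().hex
        self._sessions[sid] = BrowserSession(session_id=sid)
        return sid

    def navigate(self, session_id: str, url: str) -> str:
        """Reusing the same session_id continues navigation on the same page."""
        session = self._sessions[session_id]
        session.current_url = url
        return session.current_url
```

In the real client these calls would be RPCs against the Browser Service, but the session_id contract (open once, reuse to continue) would stay the same.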

3. Refactoring Plan

Phase 1: Protos & Infrastructure

  1. Update agent.proto:
    • Remove BrowserAction, BrowserResponse, and BrowserEvent messages.
    • Remove browser_action from TaskRequest.
    • Remove browser_result and browser_event from ServerTaskMessage.
  2. Define Browser Service API:
    • Create a new proto (e.g., browser_service.proto) defining actions like Navigate, Click, Extract, EvaluateJS.
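A hypothetical first cut of browser_service.proto; service, message, and field names are placeholders to be refined during Phase 1:

```proto
syntax = "proto3";
package cortex.browser.v1;

service BrowserService {
  rpc Navigate (NavigateRequest) returns (ActionResult);
  rpc Click (ClickRequest) returns (ActionResult);
  rpc Extract (ExtractRequest) returns (ActionResult);
  rpc EvaluateJS (EvaluateJSRequest) returns (ActionResult);
}

message NavigateRequest {
  string session_id = 1;  // reuse to continue on the same page
  string url = 2;
}

message ClickRequest {
  string session_id = 1;
  string selector = 2;
}

message ExtractRequest {
  string session_id = 1;
  string selector = 2;    // empty = full DOM snapshot
}

message EvaluateJSRequest {
  string session_id = 1;
  string script = 2;
}

message ActionResult {
  bool ok = 1;
  string error = 2;
  bytes payload = 3;        // small inline results
  string payload_path = 4;  // shared-memory path reference for large blobs
}
```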

Phase 2: Agent Node Cleanup

  1. Remove Skill Implementation:
    • Delete agent-node/src/agent_node/skills/browser_bridge.py.
  2. Update Manager:
    • Remove browser registration in agent-node/src/agent_node/skills/manager.py.
  3. Core Cleanup:
    • Remove browser capability detection in agent-node/src/agent_node/node.py.
    • Remove inbound browser_action routing in _process_server_message.

Phase 3: AI Hub Refactor

  1. Tool Routing:
    • In ai-hub/app/core/services/tool.py, update browser_automation_agent to use a BrowserServiceClient instead of assistant.dispatch_browser.
  2. Assistant Cleanup:
    • Remove dispatch_browser and all browser-related result handling from ai-hub/app/core/grpc/services/assistant.py.
  3. GRPC Server Cleanup:
    • Remove browser_event and result correlation logic from ai-hub/app/core/grpc/services/grpc_server.py.

Phase 4: New Browser Service Development

  1. Implement a new Python service using FastAPI or gRPC.
  2. Use Playwright with a pool of persistent contexts.
  3. Deploy as a separate container in the docker-compose.
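The persistent-context pool in step 2 could look like the following sketch. The pooling logic is shown Playwright-free so it stays self-contained: the `factory` callable stands in for Playwright's `browser.new_context()`, and all names are illustrative:

```python
from collections import OrderedDict
from typing import Callable, TypeVar

T = TypeVar("T")

class ContextPool:
    """LRU pool of persistent browser contexts keyed by session_id."""

    def __init__(self, factory: Callable[[], T], max_contexts: int = 8) -> None:
        self._factory = factory          # e.g. lambda: browser.new_context()
        self._max = max_contexts
        self._contexts: "OrderedDict[str, T]" = OrderedDict()

    def acquire(self, session_id: str) -> T:
        """Return the context for session_id, creating (and evicting) as needed."""
        if session_id in self._contexts:
            self._contexts.move_to_end(session_id)   # mark most recently used
        else:
            if len(self._contexts) >= self._max:
                self._contexts.popitem(last=False)   # evict least recently used
            self._contexts[session_id] = self._factory()
        return self._contexts[session_id]
```

In the real service, eviction would also close the Playwright context to release its memory.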

4. Performance Analysis & Optimization

To meet the "performance first" goal of near-zero latency, we must choose the communication stack carefully.

Comparison

| Feature | gRPC (HTTP/2) | REST (HTTP/1.1) | gRPC + Unix Sockets | Shared Memory (/dev/shm) |
| --- | --- | --- | --- | --- |
| Serialization | Protobuf (binary) | JSON (text) | Protobuf (binary) | Zero-copy / reference |
| Network overhead | Low (TCP) | High (TCP) | Near zero (IPC) | Zero |
| Speed (small result) | High | Medium | Ultra high | N/A |
| Speed (large DOM/A11Y) | Medium | Low | High | Instant |

The "Performance First" Recommendation

For a local container-to-container deployment, gRPC over Unix Domain Sockets (UDS) is the optimal choice for command/control. However, for large payloads (DOM snapshots, screenshots), we will implement a Sidecar Handoff via Shared Memory.

  1. Control Path: AI Hub --[gRPC over UDS]--> Browser Service.
  2. Data Path (Large Blobs):
    • Browser Service writes the large blob (e.g., a 2 MB DOM snapshot or a 5 MB screenshot) to a shared volume mounted as tmpfs (e.g., /dev/shm/cortex_browser/).
    • Browser Service returns the file path reference via gRPC.
    • AI Hub reads the file directly from RAM.
    • This bypasses the serialization/deserialization and stream-processing overhead of passing MBs of data through the network stack.
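A minimal sketch of that handoff, assuming a tmpfs mount at /dev/shm/cortex_browser (the CORTEX_SHM_DIR override and both function names are illustrative):

```python
import os
import uuid
from pathlib import Path

# RAM-backed directory shared between the two containers (tmpfs mount).
SHM_DIR = Path(os.environ.get("CORTEX_SHM_DIR", "/dev/shm/cortex_browser"))

def write_blob(data: bytes) -> str:
    """Browser Service side: persist a large DOM/screenshot, return its path."""
    SHM_DIR.mkdir(parents=True, exist_ok=True)
    path = SHM_DIR / f"{uuid.uuid4().hex}.bin"
    path.write_bytes(data)          # lands in RAM, not on disk (tmpfs)
    return str(path)

def read_blob(path: str, cleanup: bool = True) -> bytes:
    """AI Hub side: read the blob by reference, then free the RAM."""
    p = Path(path)
    data = p.read_bytes()
    if cleanup:
        p.unlink(missing_ok=True)   # release tmpfs memory once consumed
    return data
```

Only the short path string travels over gRPC; the megabytes of payload never touch the serialization layer.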

5. Implementation Roadmap Update

Phase 1: Shared Infrastructure

  • Configure docker-compose.production.yml to share a high-speed RAM volume (/dev/shm) between the ai-hub and the new browser-service.
  • Implement a gRPC server on the Browser Service that listens on a Unix Socket (/tmp/browser.sock).
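The volume wiring above might look like the following compose excerpt. Service names and mount points are illustrative; note that for the UDS to be visible to both containers, the socket (the plan's /tmp/browser.sock) would need to live on a volume both containers mount, e.g. /var/run/cortex/browser.sock:

```yaml
services:
  ai-hub:
    volumes:
      - /dev/shm/cortex_browser:/dev/shm/cortex_browser  # RAM-backed blob handoff
      - cortex-ipc:/var/run/cortex                       # holds the Unix socket
  browser-service:
    volumes:
      - /dev/shm/cortex_browser:/dev/shm/cortex_browser
      - cortex-ipc:/var/run/cortex

volumes:
  cortex-ipc:
```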

Phase 2: Agent Node Cleanup (Continued)

(Same as previously defined: removing all browser-skill code from nodes to lighten their footprint.)

Phase 3: Hub Logic

  • Implement the BrowserServiceClient to handle the UDS connection and the RAM-disk data retrieval for large snapshots.
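The client's result handling could follow this sketch: `ActionResult` mirrors the hypothetical proto shape (inline `payload` vs. `payload_path`), while the channel itself would be opened with gRPC's standard UDS target form, e.g. `grpc.insecure_channel("unix:///tmp/browser.sock")`. All names are illustrative:

```python
from dataclasses import dataclass
from pathlib import Path

@dataclass
class ActionResult:
    """Stand-in for the gRPC response message."""
    payload: bytes = b""       # small results arrive inline
    payload_path: str = ""     # large blobs arrive as a RAM-disk reference

def resolve_payload(result: ActionResult) -> bytes:
    """Return the result bytes, dereferencing the RAM-disk path if present."""
    if result.payload_path:
        p = Path(result.payload_path)
        data = p.read_bytes()
        p.unlink(missing_ok=True)  # free tmpfs memory once consumed
        return data
    return result.payload
```

The Hub thus handles small and large results uniformly, with the branch on `payload_path` hiding the shared-memory detour from callers.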

6. Impact Analysis

  • Latency: An estimated 95% reduction in large-data transfer time.
  • CPU/Memory: Drastically reduced on agent nodes; browser load is concentrated in a single optimized, high-memory container on the Hub host.
  • Architecture: Cleaner separation of concerns: agents handle hardware and local tasks, while specialized containers handle resource-intensive browser automation.

Status: DRAFT
Author: Cortex Architect
Ref: Session Refactor Request (Turn 818)