diff --git a/docs/refactors/dedicated_browser_service.md b/docs/refactors/dedicated_browser_service.md
new file mode 100644
index 0000000..78ac476
--- /dev/null
+++ b/docs/refactors/dedicated_browser_service.md
@@ -0,0 +1,113 @@
+# Design Document: Dedicated Browser Service Refactor
+
+## 1. Rationale
+Currently, every agent node in the Cortex Mesh can optionally support a "browser skill" via Playwright. While flexible, this introduces several issues:
+- **Latency**: High overhead in sending large DOM/A11Y snapshots over the bi-directional gRPC TaskStream.
+- **Resource Heavy**: Browser instances on edge nodes consume significant RAM/CPU.
+- **Dependency Bloat**: Every agent node needs Playwright/Chromium dependencies.
+- **Complexity**: Synchronizing browser state across a distributed mesh is difficult.
+
+By moving browser automation to a dedicated service located alongside the AI Hub, we achieve **near-zero latency** for DOM extraction and more robust state management.
+
+## 2. New Architecture
+The new architecture introduces a **standalone Browser Service** container.
+
+### Component Diagram
+```mermaid
+graph TD
+    User["USER / UI"] <--> Hub["AI Hub (ai-hub)"]
+
+    subgraph "Internal Processing"
+        Hub --> SA["Sub-Agent (Browser Expert)"]
+        SA -- "gRPC / REST" --> BSC["Browser Service Client"]
+    end
+
+    BSC <--> BS["Dedicated Browser Service (Container)"]
+
+    subgraph "Browser Service"
+        BS <--> PW["Playwright / Chromium"]
+    end
+
+    Hub <--> Mesh["Agent Mesh (Edge Nodes)"]
+    Mesh -- "Local Tasks" --> Shell["Shell / File System"]
+```
+
+### Key Changes
+1. **Hub Integration**: The `ToolService` in the Hub will no longer dispatch browser tasks to the `TaskAssistant` (which routes to random nodes). Instead, it will talk directly to the `BrowserService` client.
+2. **Stateless/Stateful Support**: The `BrowserService` will manage browser contexts, allowing the Hub to reference a `session_id` to continue navigation on the same page.
+3. 
**Removal from Agents**: All `agent-node` code related to the browser (bridges, dependencies, and proto fields) will be removed.
+
+## 3. Refactoring Plan
+
+### Phase 1: Protos & Infrastructure
+1. **Update `agent.proto`**:
+   - Remove `BrowserAction`, `BrowserResponse`, and `BrowserEvent` messages.
+   - Remove `browser_action` from `TaskRequest`.
+   - Remove `browser_result` and `browser_event` from `ServerTaskMessage`.
+2. **Define Browser Service API**:
+   - Create a new proto (e.g., `browser_service.proto`) defining actions like `Navigate`, `Click`, `Extract`, `EvaluateJS`.
+
+### Phase 2: Agent Node Cleanup
+1. **Remove Skill Implementation**:
+   - Delete `agent-node/src/agent_node/skills/browser_bridge.py`.
+2. **Update Manager**:
+   - Remove `browser` registration in `agent-node/src/agent_node/skills/manager.py`.
+3. **Core Cleanup**:
+   - Remove browser capability detection in `agent-node/src/agent_node/node.py`.
+   - Remove inbound `browser_action` routing in `_process_server_message`.
+
+### Phase 3: AI Hub Refactor
+1. **Tool Routing**:
+   - In `ai-hub/app/core/services/tool.py`, update `browser_automation_agent` to use a `BrowserServiceClient` instead of `assistant.dispatch_browser`.
+2. **Assistant Cleanup**:
+   - Remove `dispatch_browser` and all browser-related result handling from `ai-hub/app/core/grpc/services/assistant.py`.
+3. **gRPC Server Cleanup**:
+   - Remove `browser_event` and result correlation logic from `ai-hub/app/core/grpc/services/grpc_server.py`.
+
+### Phase 4: New Browser Service Development
+1. Implement a new Python service using **FastAPI** or **gRPC**.
+2. Use **Playwright** with a pool of persistent contexts.
+3. Deploy as a separate container in the `docker-compose` configuration.
+
+## 4. Performance Analysis & Optimization
+To achieve "performance first" and "0 latency," we must choose the communication stack carefully.
+
+### Comparison
+| Feature | gRPC (HTTP/2) | REST (HTTP/1.1) | **gRPC + Unix Sockets** | **Shared Memory (/dev/shm)** |
+| :--- | :--- | :--- | :--- | :--- |
+| **Serialization** | Protobuf (Binary) | JSON (Text) | Protobuf (Binary) | Zero-copy / Reference |
+| **Network Overhead** | Low (TCP) | High (TCP) | **Near Zero (IPC)** | **Zero** |
+| **Speed (Small Result)** | High | Medium | **Ultra High** | N/A |
+| **Speed (Large DOM/A11Y)** | Medium | Low | High | **Instant** |
+
+### The "Performance First" Recommendation
+For a local container-to-container deployment, **gRPC over Unix Domain Sockets (UDS)** is the optimal choice for command/control. However, for large data (DOM snapshots, screenshots), we will implement a **Sidecar Handoff via Shared Memory**.
+
+1. **Control Path**: AI Hub --[gRPC over UDS]--> Browser Service.
+2. **Data Path (Large Blobs)**:
+   - Browser Service writes the large blob (e.g., a 2 MB DOM snapshot or a 5 MB screenshot) to a shared volume mounted as `tmpfs` (e.g., `/dev/shm/cortex_browser/`).
+   - Browser Service returns the **file path reference** via gRPC.
+   - AI Hub reads the file directly from RAM.
+   - This bypasses the serialization/deserialization and stream-processing overhead of passing megabytes of data through the network stack.
+
+## 5. Implementation Roadmap Update
+
+### Phase 1: Shared Infrastructure
+- Configure `docker-compose.production.yml` to share a high-speed RAM volume (`/dev/shm`) between the `ai-hub` and the new `browser-service`.
+- Implement a gRPC server on the Browser Service that listens on a Unix socket (`/tmp/browser.sock`).
+
+### Phase 2: Agent Node Cleanup (Continued)
+*(Same as previously defined - removing all browser skill code from nodes to lighten their footprint).*
+
+### Phase 3: Hub Logic
+- Implement the `BrowserServiceClient` to handle the UDS connection and the RAM-disk data retrieval for large snapshots.
+
+## 6. Impact Analysis
+- **Latency**: Estimated 95% reduction in large data transfer time.
+- **CPU/Memory**: Drastically reduced on agents; focused on one optimized high-memory container on the Hub host.
+- **Architecture**: Cleaner separation of concerns. Agents handle hardware/local tasks; specialized containers handle high-resource simulated tasks.
+
+---
+**Status**: DRAFT
+**Author**: Cortex Architect
+**Ref**: Session Refactor Request (Turn 818)
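
Reviewer sketch (not part of the patch): the Section 4 data path, where the Browser Service writes a large blob to the shared `tmpfs` directory and returns only a path reference, could look roughly like the following. This is a minimal illustration using only the Python standard library. The names `BlobRef`, `write_blob`, and `read_blob` are hypothetical and do not come from the design document, and a temporary directory stands in for `/dev/shm/cortex_browser/` so the example runs anywhere.

```python
import os
import tempfile
import uuid
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class BlobRef:
    """Hypothetical control-path message: a reference to the blob, not the blob."""
    path: str
    size: int


def write_blob(shared_dir: Path, payload: bytes) -> BlobRef:
    """Browser Service side: write the payload to the shared directory."""
    shared_dir.mkdir(parents=True, exist_ok=True)
    final_path = shared_dir / f"{uuid.uuid4().hex}.bin"
    tmp_path = final_path.with_suffix(".tmp")
    tmp_path.write_bytes(payload)
    # Rename only after the write completes, so a reader never sees a partial file.
    os.replace(tmp_path, final_path)
    return BlobRef(path=str(final_path), size=len(payload))


def read_blob(ref: BlobRef, shared_dir: Path) -> bytes:
    """AI Hub side: validate the reference, read the blob, then delete it."""
    path = Path(ref.path).resolve()
    # Refuse references that point outside the shared directory.
    if shared_dir.resolve() not in path.parents:
        raise ValueError(f"blob path outside shared directory: {ref.path}")
    data = path.read_bytes()
    if len(data) != ref.size:
        raise IOError("blob size does not match the reference")
    path.unlink()  # free the RAM-backed space once the blob is consumed
    return data


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        shared = Path(tmp) / "cortex_browser"  # stand-in for the tmpfs mount
        dom = b"<html>" + b"x" * (2 * 1024 * 1024) + b"</html>"
        ref = write_blob(shared, dom)
        assert read_blob(ref, shared) == dom
```

Two choices in this sketch are assumptions rather than requirements of the design: the reader deletes the file after consuming it, which keeps the RAM-backed volume from filling up, and the path check guards against a reference that escapes the shared directory. The actual service may prefer a TTL-based cleanup or a different ownership model.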