diff --git a/.agent/workflows/browser_setup.md b/.agent/workflows/browser_setup.md new file mode 100644 index 0000000..a87ec21 --- /dev/null +++ b/.agent/workflows/browser_setup.md @@ -0,0 +1,63 @@ +--- +description: How to enable browser visual access (Screenshot/Interaction) for the AI agent when running in a Dev Container +--- + +## Overview +When running inside a Dev Container, the AI agent lacks the system libraries to launch a local browser. This workflow explains how to "borrow" the user's host machine browser via remote debugging and a network bridge. + +## Step 1: Detect Environment +Run the following to check if you are in a constrained container: +```bash +if [ -f /.dockerenv ] || grep -q "docker\|lxc" /proc/1/cgroup; then + echo "DEBUG: Running in a container. Browser visuals restricted." +fi +``` + +## Step 2: User Action (Host Machine) +Ask the user to launch Chrome in remote debugging mode on their **Host machine** and expose the port. + +> [!IMPORTANT] +> **User Instructions:** +> Please run the following command in your local terminal (not inside the container): +> +> **MacOS:** +> ```bash +> killall "Google Chrome"; /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222 +> ``` +> +> **Linux:** +> ```bash +> google-chrome --remote-debugging-port=9222 +> ``` +> +> **Windows (PowerShell):** +> ```powershell +> & "C:\Program Files\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9222 +> ``` +> +> **Then:** In VS Code, go to the **Ports** tab and ensure port **9222** is forwarded. + +## Step 3: Container Setup (By Agent) +Once the user confirms the browser is running, the agent must bridge the networking. + +// turbo +1. Install `socat`: +```bash +sudo apt-get update && sudo apt-get install -y socat +``` + +2. Establish the bridge from the container's loopback to the host's loopback: +```bash +# Detect the host IP (usually host.docker.internal) +HOST_ADDR="host.docker.internal" +socat TCP-LISTEN:9222,fork,reuseaddr TCP:$HOST_ADDR:9222 & +``` + +## Step 4: Verification +Confirm the agent can communicate with the remote browser: +```bash +curl -I http://127.0.0.1:9222/json/version +``` + +## Step 5: Execute Browser Task +Now the agent can use `browser_subagent` or `open_browser_url` tools. They will automatically connect to the forwarded port at `127.0.0.1:9222`.