yangyang xie authored 7 days ago

aegis-agent

Node agent for Aegis. It runs on each managed host and connects back to the Aegis control plane over an mTLS WebSocket tunnel.

What it does

  • Manages containerd containers on the local host (deploy, start, stop, restart, remove)
  • Enforces container restart policies via a watchdog goroutine
  • Streams exec sessions (interactive terminal) back to the Aegis UI
  • Sends heartbeats with node metrics (CPU, memory, GPU, container counts) every 30 seconds
  • Optionally manages existing Docker (moby) containers while a host is being migrated to a containerd-native setup

Requirements

  • Linux host with containerd installed and running
  • ctr CLI (bundled with containerd)
  • systemd (for agent process management)

Installation

Generate a join token from the Aegis UI (Nodes → Add Node), then run the displayed install script on your target machine. It will:

  1. Pull the agent image via ctr
  2. Install a systemd service (aegis-agent.service)
  3. Join the node to your Aegis instance

No Docker required.

Architecture

Aegis (control plane)
    │  mTLS WebSocket (port 8766)
    ▼
aegis-agent
    │
    ├── containerd socket  (/run/containerd/containerd.sock)
    └── logs               (/var/log/aegis/)
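
To make the mTLS part of the diagram more concrete, the following Go sketch builds a client-side TLS configuration from PEM data using only the standard library. The function name newMTLSConfig, its parameters, and the TLS 1.2 minimum are assumptions for illustration. How the agent actually loads its certificates and which WebSocket library it uses are not stated here.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
)

// newMTLSConfig is a hypothetical helper. It returns a client TLS config that
// presents the agent's certificate and trusts only the given CA bundle.
func newMTLSConfig(certPEM, keyPEM, caPEM []byte) (*tls.Config, error) {
	// Client certificate and key that the control plane will verify.
	cert, err := tls.X509KeyPair(certPEM, keyPEM)
	if err != nil {
		return nil, err
	}
	// CA pool used to verify the control plane's server certificate.
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		return nil, errors.New("no CA certificates found in PEM input")
	}
	return &tls.Config{
		Certificates: []tls.Certificate{cert},
		RootCAs:      pool,
		MinVersion:   tls.VersionTLS12, // assumed floor, not taken from the agent
	}, nil
}
```

A config built this way could then be handed to whichever WebSocket dialer is in use when connecting to the control plane on port 8766.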

Agent config is persisted to /var/lib/aegis-agent/agent.json after the initial join so the agent survives restarts without re-joining.