diff --git a/docs/features/harness_engineering/co_pilot_agent_design.md b/docs/features/harness_engineering/co_pilot_agent_design.md
new file mode 100644
index 0000000..1589250
--- /dev/null
+++ b/docs/features/harness_engineering/co_pilot_agent_design.md
@@ -0,0 +1,129 @@
# Feature Design: Co-Worker (Evaluation & Rework Loop)

## 1. Executive Summary
The **Co-Worker Agent** (or Co-Pilot) is an autonomous shadow agent tasked with ensuring the quality and accuracy of the **Main Agent**'s output. By running alongside the main execution thread, it provides a check-and-balance loop that enables continuous self-improvement through structured evaluation and threshold-based reworks.

---

## 2. Core Architecture: The Two-Agent System

### A. Main Agent
The primary agent responsible for executing tasks (e.g., code generation, research, or system orchestration).

### B. Co-Worker Agent
A secondary agent that runs in parallel for every task.
- **Enabled/Disabled**: Can be toggled on/off in the Agent Instance settings.
- **Fresh Context**: Starts with a fresh, clean context window every time it is triggered, to ensure objective evaluation.
- **Visibility**: Has access to the Main Agent's **incoming request** and its **generated results**.
- **Shared Skills & Tools**: The Co-Worker has full access to all **Skills and Tools** assigned to the Main Agent. This allows it to perform its own independent verification (e.g., running tests, checking files) during the evaluation phase.

---

## 3. The Execution Lifecycle

The Co-Worker is triggered at two critical points during the Main Agent's run:

### Phase 1: Pre-Execution (The "Ask-Oriented" Expectation Setter)
- **Trigger**: Immediately after the Main Agent receives a request.
- **Role**: A purely reflective turn. The Co-Worker must **not perform any actions or use tools** during this setup phase; it is strictly analytical.
- **Action**:
  1. The Co-Worker analyzes the specific requirements of the current request.
  2. It defines a **Task-Adaptive Evaluation Mechanism** by building a custom context around the **Core Rubric**:
     - **Expectations**: A checklist of specific results the Main Agent should satisfy for *this* request (e.g., "Add the login endpoint to `auth.py`").
     - **Core Rubric (The Foundation)**: Always includes **Quality**, **Accuracy**, and **Non-AI Alike**, but with **task-optimized weights** (e.g., 70% Accuracy for code, 70% Quality/Tone for creative docs).
- **Persistence**: The Co-Worker saves this request-specific rubric and the expectations to the `evaluation.md` file in the Mirror System.
- **Purge Rule**: Any previous evaluation data is purged at the start of the round.

### Phase 2: Post-Execution (The Dual-Stage Quality Gate)
- **Trigger**: After the Main Agent completes its task.
- **Stage 2A: The Blind Rating (Objective Evaluation)**
  1. **Fresh Context**: The Co-Worker starts a new run with **zero knowledge** of previous rework attempts.
  2. **Action**: It compares the **Current Result** against the **Original Expectations** (from the `evaluation.md` setup).
  3. **Output**: It generates a score (0-100) and writes it into the `evaluation.md` file.
  4. *Rationale*: This ensures the Co-Worker remains objective and doesn't "grade on a curve" or feel pressured to raise the score just because it is a second or third attempt.
- **Stage 2B: The Rework Justification (Historical Awareness)**
  1. **Trigger**: Occurs only if the score from Stage 2A is below the threshold.
  2. **Action**: The Co-Worker performs a **second run** in which it pulls all **historical ratings and feedback** from `evaluation.md`.
  3. **Output**: It compares the current "Blind" score to the previous ones, then justifies the next rework prompt by highlighting exactly what improved and what still needs work, ensuring the next iteration is measurably better.

---

## 4. The Self-Improvement Loop (The "Teaching" Rework)

Continuous improvement is driven by a mentoring dynamic between the agents, featuring a "Double-Turn" feedback logic:

1. **User Threshold & Max Reworks (UI)**: The user defines the **Rework Threshold** (e.g., 85/100) and **Max Reworks** (e.g., 3).
2. **Double-Turn Logic (Blind -> Aware)**:
   - **Turn 1 (Objective)**: Score the result without bias.
   - **Turn 2 (Mentoring)**: Compare the new score with the history to refine the teaching prompt.
3. **Context Preservation (Main Agent)**: While the Co-Worker uses fresh windows for rating, the **Main Agent's history is strictly preserved**. The Co-Worker's "Aware" prompt from Stage 2B is injected into the Main Agent's context, allowing it to see its own journey and learn from the critique.
4. **Execution**:
   - If **Score < Threshold** and **Attempts < Max Reworks**: The Main Agent is re-triggered with the Co-Worker's justified feedback.
   - If **Max Reworks** is reached: The loop halts and alerts the user for manual intervention.

---

## 5. Mirror System Integration (evaluation.md)

The system relies on a single, shared file within the **private Workspace Jail** of each agent for synchronization.

- **File Path**: `.cortex/evaluation.md` (scoped to the unique Jail path of the Agent Instance).
- **Isolation**: This is strictly **per-agent**. Different agents cannot see or interfere with each other's evaluations, as they operate in completely separate filesystem jails.
- **Updated Content Example**:

  ```markdown
  # Current Round Evaluation

  ## Expectations
  - [ ] Implement a clean API in Flask.
  - [ ] Use environment variables for secrets.

  ## Rating System (Weights)
  - Quality: 30%
  - Accuracy: 50%
  - Non-AI Alike: 20%

  ## Historical Loop Log
  - **Attempt 1**: Score: 60/100 | "Logic good, but used hardcoded DB password."
  - **Attempt 2**: Score: 78/100 | "Moved to .env, but formatting is broken."

  ## Current Status (Post-Run)
  - **Attempt 3**: Score: 92/100
    - Feedback: "Perfect. Both logic and formatting are now solid."
    - Action: "Accepted."
  ```

---

## 6. UI / UX Design: User-Friendly Configuration

To make the Co-Worker feature accessible, we will integrate it into the existing Agent orchestration dashboard with a focus on toggles and visual feedback.

### A. Deployment Configuration (`DeployAgentModal`)
When deploying a new agent, a new **"Co-Worker settings"** section will be added:
- **Enable Co-Worker Toggle**: A primary toggle switch.
- **Rework Threshold & Max Reworks**: High-level sliders to control the quality gate and iteration limit.
  - *Tooltip*: "If the Co-Worker scores the result below [Threshold], a rework is automatically triggered up to [Max Reworks] times."

### B. Live Management (`AgentDrillDown` - Config Tab)
Users can modify the Co-Worker settings live without redeploying:
- **Toggles & Sliders**: Replicated in the "Metadata & System" tab, including the iteration limit.
- **Weighted Ratings**: A simple UI to adjust the relative importance of Quality vs. Accuracy vs. Non-AI Alike (e.g., three sliders that must sum to 100%).

### C. The "Evaluation" Tab (`AgentDrillDown`)
A new dedicated tab alongside "Metadata" and "Workspace":
- **Live Markdown Preview**: A rendered view of the current `evaluation.md` from the mirror system.
- **Quality Badges**: Displays the most recent score (e.g., a green "85/100" badge).
- **Rework History**: A small log showing how many times the Co-Worker triggered a rework for the current request.

### D. Agent Dashboard Visibility (`AgentCard`)
- **Co-Worker Indicator**: A small "Evaluating" or "Co-Worker On" badge on the agent card.
- **Last Score**: The most recent evaluation score, displayed next to the "Success Rate" metric.

---

## 7. Implementation Checklist
- [ ] Add a `co_worker_enabled` toggle to the `AgentInstance` DB model.
- [ ] Implement `pre_run` and `post_run` hooks in `AgentExecutor`.
- [ ] Add UI sliders for `rework_threshold` and `max_reworks`.
- [ ] Develop the logic to purge/write/read the `evaluation.md` in the workspace.
- [ ] Create the "Self-Improvement Loop" that re-dispatches the Main Agent with Co-Worker feedback.
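
---

## 8. Appendix: Illustrative Rework Loop Sketch

The double-turn quality gate described in Sections 3 and 4 can be sketched in a few lines of Python. This is a minimal, hypothetical sketch only: `run_with_co_worker`, the stub agent classes, and the in-memory `EvaluationLog` are illustrative names invented here, not the actual `AgentExecutor` hooks or the real `.cortex/evaluation.md` handling.

```python
from dataclasses import dataclass, field

@dataclass
class EvaluationRecord:
    attempt: int
    score: int
    feedback: str

@dataclass
class EvaluationLog:
    # In-memory stand-in for the per-agent `.cortex/evaluation.md` mirror file.
    history: list = field(default_factory=list)

def run_with_co_worker(main_agent, co_worker, request, threshold=85, max_reworks=3):
    """Dual-stage quality gate: blind rating (Stage 2A), then history-aware rework (Stage 2B)."""
    log = EvaluationLog()                                # Purge Rule: fresh log each round
    expectations = co_worker.set_expectations(request)   # Phase 1: reflective turn, no tools
    prompt = request
    for attempt in range(1, max_reworks + 1):
        result = main_agent.run(prompt)                  # Main Agent's own context persists
        # Stage 2A: blind rating -- only the expectations and current result are visible
        score, feedback = co_worker.rate(expectations, result)
        log.history.append(EvaluationRecord(attempt, score, feedback))
        if score >= threshold:
            return result, log                           # quality gate passed
        # Stage 2B: history-aware justification becomes the next rework prompt
        prompt = co_worker.justify_rework(expectations, result, log.history)
    return None, log                                     # Max Reworks hit: manual intervention

class StubMainAgent:
    """Toy Main Agent whose output "improves" each time it is re-prompted."""
    def __init__(self):
        self.attempts = 0
    def run(self, prompt):
        self.attempts += 1
        return f"draft v{self.attempts}"

class StubCoWorker:
    """Toy Co-Worker whose blind scores rise 60 -> 78 -> 92, as in the Section 5 example log."""
    SCORES = {1: 60, 2: 78, 3: 92}
    def set_expectations(self, request):
        return f"expectations for: {request}"
    def rate(self, expectations, result):
        version = int(result.rsplit("v", 1)[1])
        return self.SCORES[version], f"feedback on {result}"
    def justify_rework(self, expectations, result, history):
        last = history[-1]
        return f"rework request: last score {last.score}; address: {last.feedback}"

result, log = run_with_co_worker(StubMainAgent(), StubCoWorker(), "add login endpoint")
# result == "draft v3"; log.history holds scores [60, 78, 92]
```

With a threshold of 85, the stub scores fail the gate twice, the Co-Worker's history-aware critique is fed back as the next prompt, and the third attempt is accepted, mirroring the three-attempt log in the Section 5 example.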