diff --git a/docs/features/harness_engineering/co_pilot_agent_design.md b/docs/features/harness_engineering/co_pilot_agent_design.md
index 1589250..18521ef 100644
--- a/docs/features/harness_engineering/co_pilot_agent_design.md
+++ b/docs/features/harness_engineering/co_pilot_agent_design.md
@@ -36,15 +36,18 @@
 ### Phase 2: Post-Execution (The Dual-Stage Quality Gate)
 - **Trigger**: After the Main Agent completes its task.
-- **Stage 2A: The Blind Rating (Objective Evaluation)**
-  1. **Fresh Context**: The Co-Worker starts a new run with **zero knowledge** of previous rework attempts.
-  2. **Action**: It compares the **Current Result** against the **Original Expectations** (from the `evaluation.md` setup).
-  3. **Output**: It generates a score (0-100) and writes it into the `evaluation.md` file.
-  4. *Rationale*: This ensures the Co-Worker remains objective and doesn't "grade on a curve" or feel pressured to increase the score just because it's a second or third attempt.
-- **Stage 2B: The Rework Justification (Historical Awareness)**
+
+- **Stage 2A: The Blind Rating (Absolute Objectivity)**
+  1. **Stateless Context**: The Co-Worker starts a new run with **zero knowledge** of previous scores or rework attempts.
+  2. **Visibility**: It sees only the **Original Request**, the **Core Rubric**, and the **Current Result**.
+  3. **Action**: It generates a "Blind Score" (0-100) based strictly on the current state of the implementation.
+  4. **Rationale**: This prevents "Score Chasing," where an evaluator feels pressured to raise a score simply because it is a later round. The work must be judged as if it were being seen for the first time.
+
+- **Stage 2B: The Delta Analysis (Feedback Loop)**
   1. **Trigger**: Occurs only if the score from Stage 2A is below the threshold.
-  2. **Action**: The Co-Worker performs a **second run** where it specifically pulls all **historical ratings and feedback** from the `evaluation.md`.
-  3. **Output**: It compares the current "Blind" score to the previous ones. It then justifies the next rework prompt by highlighting exactly what improved and what still needs work, ensuring the next iteration is pointedly better.
+  2. **Historical Context (Strictly Textual)**: The Co-Worker reads the **Previous Rework Instructions** (to see what was asked for) but is **precluded from seeing previous scores**.
+  3. **Action**: It identifies the "Delta" (what improved vs. what is still failing).
+  4. **Output**: It generates a new set of instructions. If the agent fixed A but broke B, the Co-Worker must report this regression objectively.
 
 ---
 
@@ -53,9 +56,9 @@
 The "Continuous Improvement" is driven by a mentoring dynamic between the agents, featuring a "Double-Turn" feedback logic:
 
 1. **User Threshold & Max Reworks (UI)**: The user defines the **Rework Threshold** (e.g., 85/100) and **Max Reworks** (e.g., 3).
-2. **Double-Turn Logic (Blind -> Aware)**:
-   - **Turn 1 (Objective)**: Score the result without bias.
-   - **Turn 2 (Mentoring)**: Compare the new score with the history to refine the teaching prompt.
+2. **Double-Turn Logic (Blind Score -> Gap Analysis)**:
+   - **Turn 1 (Blind)**: Assign a score based on the *result only*.
+   - **Turn 2 (Gap)**: Compare the result to the previous *feedback* (not scores) to refine the instructions.
 3. **Context Preservation (Main Agent)**: While the Co-Worker uses fresh windows for rating, the **Main Agent’s history is strictly preserved**. The Co-Worker's "Aware" prompt from Step 2B is injected into the Main Agent’s context, allowing it to see its own journey and learn from the critique.
 4. **Execution**:
    - If **Score < Threshold** and **Attempts < Max Reworks**: The Main Agent is re-triggered with the Co-Worker’s justified feedback.
@@ -69,27 +72,29 @@
 - **File Path**: `.cortex/evaluation.md` (Scoped to the unique Jail path of the Agent Instance).
 - **Isolation**: This is strictly **per-agent**. Different agents cannot see or interfere with each other's evaluations as they operate in completely separate filesystem jails.
-- **Updated Content Example**:
+- **File Structure**:
+  - `.cortex/rubric.md`: The static checklist (read-only for all rounds).
+  - `.cortex/feedback.md`: The active rework instructions (transferred to the Main Agent).
+  - `.cortex/history.log`: A hidden, append-only log of scores and timestamped justifications (JSON format).
+
+- **Updated `feedback.md` Example**:
   ```markdown
-  # Current Round Evaluation
+  # Rework Instructions (Round 2)
 
-  ## Expectations
-  - [ ] Implement a clean API in Flask.
-  - [ ] Use environment variables for secrets.
+  ## Progress Assessment
+  - [x] Security: Moved password to .env (Fixed).
+  - [ ] Formatting: Code is still unindented in `auth.py` (Remaining).
 
-  ## Rating System (Weights)
-  - Quality: 30%
-  - Accuracy: 50%
-  - Non-AI Alike: 20%
-
-  ## Historical Loop Log
-  - **Attempt 1**: Score: 60/100 | "Logic good, but used hardcoded DB password."
-  - **Attempt 2**: Score: 78/100 | "Moved to .env, but formatting is broken."
-
-  ## Current Status (Post-Run)
-  - **Attempt 3**: Score: 92/100
-  - Feedback: "Perfect. Both logic and formatting are now solid."
-  - Action: "Accepted."
+  ## Required Actions
+  Please apply PEP8 formatting to the new logic in `auth.py`. The logic is now secure, but readability is still failing the Quality rubric.
+  ```
+
+- **Hidden `history.log` (Internal Only)**:
+  ```json
+  [
+    {"round": 1, "score": 60, "reason": "Hardcoded password"},
+    {"round": 2, "score": 82, "reason": "Security fixed, formatting broken"}
+  ]
   ```
 
 ---
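
For reviewers, the double-turn control flow introduced by this patch can be sketched in Python. This is a minimal illustration only: `ReworkLoop` and the `execute`/`blind_rate`/`delta_analyze` callbacks are hypothetical names standing in for the Main Agent and the Co-Worker's two runs, not part of the documented design or any real API.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class ReworkLoop:
    """Drives the Main Agent through the dual-stage quality gate (sketch)."""
    rework_threshold: int = 85   # user-defined pass mark (0-100)
    max_reworks: int = 3         # user-defined cap on re-triggers
    # In-memory mirror of the hidden .cortex/history.log entries.
    history: List[dict] = field(default_factory=list)

    def run(
        self,
        execute: Callable[[Optional[str]], str],           # Main Agent; its own context is preserved
        blind_rate: Callable[[str], Tuple[int, str]],      # Stage 2A: sees the result only, never history
        delta_analyze: Callable[[str, Optional[str]], str],  # Stage 2B: sees prior feedback text, not scores
    ) -> Tuple[str, int, str]:
        feedback: Optional[str] = None
        result, score = "", 0
        # First attempt plus up to `max_reworks` re-triggers.
        for round_no in range(1, self.max_reworks + 2):
            result = execute(feedback)
            # Stage 2A: blind score, appended to the append-only log.
            score, reason = blind_rate(result)
            self.history.append({"round": round_no, "score": score, "reason": reason})
            if score >= self.rework_threshold:
                return result, score, "accepted"
            if round_no <= self.max_reworks:
                # Stage 2B: derive the next rework instructions from the delta.
                feedback = delta_analyze(result, feedback)
        return result, score, "max_reworks_reached"


# Demo with stub agents whose blind score improves each round (60 -> 82 -> 92).
_scores = iter([60, 82, 92])
loop = ReworkLoop(rework_threshold=85, max_reworks=3)
_, final, status = loop.run(
    execute=lambda fb: "implementation",
    blind_rate=lambda r: (next(_scores), "stub justification"),
    delta_analyze=lambda r, fb: "fix formatting in auth.py",
)
print(status, final)  # accepted 92
```

The key property of the design is visible in the signatures: `blind_rate` receives only the result, while `delta_analyze` receives the previous feedback text but no scores, matching the Stage 2A/2B separation described in the patch.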