diff --git a/.gitignore b/.gitignore
index 7f24cdf..24e5896 100644
--- a/.gitignore
+++ b/.gitignore
@@ -16,4 +16,5 @@
 data/*
 .env.gitbucket
 .env.ai
-**/.env*
\ No newline at end of file
+**/.env*
+app/CaudeCodeSourceCode/*
\ No newline at end of file
diff --git a/docs/features/harness_engineering/co_pilot_agent_design.md b/docs/features/harness_engineering/co_pilot_agent_design.md
index d54d8d5..2023cad 100644
--- a/docs/features/harness_engineering/co_pilot_agent_design.md
+++ b/docs/features/harness_engineering/co_pilot_agent_design.md
@@ -146,3 +146,52 @@
 ### 🟢 Stage 4: Testing & Safety
 - [ ] **Blind Context Audit**: Verify that the Co-Pilot in Stage 2A receives zero knowledge of previous rounds.
 - [ ] **Loop Breaker Test**: Ensure `max_reworks` correctly stops an infinite implementation loop.
+
+---
+
+## 8. Lessons from Claude Code (CC) Architecture
+
+After a deep dive into the Claude Code (recovered) source, we should adopt the following "premium" patterns to harden the Co-Worker system.
+
+### A. Memory Mechanics (The Index Pattern)
+Claude Code uses a two-tier memory system (`MEMORY.md` as an index + topic files). We should adopt this for `.cortex/evaluation.md`.
+- **Implementation**: `evaluation.md` should serve as a **Table of Contents**. Detailed rationales and rework logs should be split into `.cortex/logs/round_N.md`.
+- **Benefit**: Keeps the main evaluation context "concise and cache-friendly" while allowing the agent to "Deep Dive" into previous failures only when necessary.
+- **Reference**: `src/memdir/memdir.ts`
+
+### B. System Prompt Boundaries & "No Gold-Plating"
+Claude Code uses a `SYSTEM_PROMPT_DYNAMIC_BOUNDARY` to optimize caching and has strict "Doing Tasks" edicts.
+- **Edicts to Adopt** (directly from CC's `# Doing tasks` section):
+  - *"Don't add features, refactor code, or make 'improvements' beyond what was asked. A bug fix doesn't need surrounding code cleaned up."*
+  - *"Don't add docstrings, comments, or type annotations to code you didn't change."*
+  - *"Don't create helpers, utilities, or abstractions for one-time operations. Three similar lines of code is better than a premature abstraction."*
+  - *"Before reporting a task complete, verify it actually works: run the test, execute the script, check the output."*
+- **The Boundary Pattern**: 
+  1. Insert a marker like `__DYNAMIC_BOUNDARY__` after the static system instructions.
+  2. Everything before this marker stays identical across requests, so the LLM provider can cache it (e.g., Anthropic's Prompt Caching).
+  3. Per-session state (CWD, Tool List, Memory) is appended after this marker.
+- **Application**: The Co-Worker will use these edicts as its **Evaluation Criteria**. If the Main Agent adds a comment that was not requested, the Co-Worker will flag it as "Gold-Plating" and lower the Quality score.
+- **Reference**: `src/constants/prompts.ts`
+
+### C. Context Compaction Awareness
+Long rework loops will eventually hit token limits.
+- **CC Pattern**: The "Microcompact" and "Autocompact" strategies.
+- **Application**: If `Attempts > 2`, the Co-Worker should be instructed to **Summarize the Rework History** instead of providing the full text of previous rounds. This prevents the Main Agent's context from becoming bloated with "criticism noise."
+- **Reference**: `src/query.ts` (`queryLoop` state management).
+
+### D. Visual "Buddy" Status
+Claude Code uses an animated sprite to show state.
+- **Application**: The `AgentCard` UI should display a unique **Co-Worker Avatar** whose expression changes based on the `Blind Score`.
+  - **90+**: Smiling/Approving.
+  - **70-89**: Thinking/Skeptical.
+  - **<70**: Warning/Frustrated signal.
+- **Benefit**: Immediate visual feedback for the user on perceived quality without reading logs.
+- **Reference**: `src/buddy/CompanionSprite.tsx`
+
+### E. The "Directive" Fork Pattern
+When the Co-Worker triggers a rework, the prompt shouldn't just be "Fix this."
+- **CC Pattern**: Sub-agents receive a **Directive** ("Brief the agent like a smart colleague who just walked into the room").
+- **Application**: The Stage 2B feedback should be formatted as a **Direct Command Set**, not a conversational critique.
+  - **Bad**: "I think the code is a bit messy, maybe fix it?"
+  - **Good**: "Directive: Refactor `auth.py:L24` to use the `.env` variable instead of the hardcoded string."
+- **Reference**: `src/tools/AgentTool/prompt.ts`
diff --git a/docs/features/harness_engineering/co_pilot_task_list.md b/docs/features/harness_engineering/co_pilot_task_list.md
index 8630592..5979c6a 100644
--- a/docs/features/harness_engineering/co_pilot_task_list.md
+++ b/docs/features/harness_engineering/co_pilot_task_list.md
@@ -1,40 +1,35 @@
-# Task List: Co-Pilot Agent Harness Implementation
+# Master Index: Co-Pilot Agent Harness Implementation
 
-This document tracks the progress of the autonomous evaluation and self-improvement loop for the Cortex Hub agents.
+This is the central index for tracking progress on the autonomous evaluation system. Detailed tasks are split into topic files to keep the context window lightweight during orchestration.
 
-## 🟢 Stage 1: Data & Models (Foundation)
-- [ ] **DB Model Update**: Modify the backend `AgentInstance` model (PostgreSQL/MongoDB as applicable) to include:
-    - `co_worker_enabled`: (Boolean) Default: `False`.
-    - `rework_threshold`: (Integer) Range 0-100. Default: `80`.
-    - `max_rework_count`: (Integer) Default: `3`.
-- [ ] **Workspace Mirroring**:
-    - [ ] Create `.cortex/` directory in the agent's unique jail during initialization.
-    - [ ] Implement `history.log` append logic (JSON format).
+---
 
-## 🟢 Stage 2: Orchestration Logic (The Engine)
-- [ ] **Request-Specific Rubric Generator**:
-    - [ ] Implement a pre-execution hook in `agent_loop.py`.
-    - [ ] Prompt the Co-Pilot to generate a task-specific `rubric.md`.
-- [ ] **Dual-Stage Post-Run Hook**:
-    - [ ] **Stage 2A (Blind Rating)**: Implement gRPC/Executor logic to call the Co-Pilot with a stripped context.
-    - [ ] **Stage 2B (Delta Analysis)**: Implement context-aware gap discovery (Score-Anonymized).
-- [ ] **Recursive Execution Logic**:
-    - [ ] Logic in `AgentExecutor` to recursively re-trigger if `Score < Threshold` and `Reworks < Max`.
+## 📈 Overall Status: [🟡 INITIALIZING]
 
-## 🟢 Stage 3: User Interface (Dashboard)
-- [ ] **Agent Config Tab**:
-    - [ ] Add the "Co-Worker Settings" section to `DeployAgentModal.tsx`.
-    - [ ] Implement HSL-styled sliders for threshold and count.
-- [ ] **Evaluation Tab (`AgentDrillDown`)**:
-    - [ ] Create a real-time markdown renderer for `.cortex/feedback.md`.
-    - [ ] Build a "Rework History" component that visualizes `history.log` JSON data.
-- [ ] **Status Badges**:
-    - [ ] Display "Evaluating..." state on the agent card during post-run turns.
-    - [ ] Show a permanent "Quality Score" badge (Green/Yellow/Red) derived from the last log entry.
+### 1. [Stage 1: Foundation (Data & Models)](./harness_tasks/foundation.md)
+   - **Focus**: DB updates and Mirror System filesystem setup.
+   - **Status**: [🟢 IN PROGRESS]
+   - **Key File**: `.cortex/evaluation.md`
 
-## 🟢 Stage 4: Reliability & Testing
-- [ ] **Integration Tests**:
-    - [ ] Test: A task that fails on attempt 1, reworks, and passes on attempt 2.
-    - [ ] Test: A task that reaches `max_reworks` and stops even if score is still low.
-- [ ] **Bias Validation**:
-    - [ ] Audit logs to ensure Stage 2A truly receives zero context of previous rounds.
+### 2. [Stage 2: Engine (Orchestration Logic)](./harness_tasks/orchestration.md)
+   - **Focus**: Dual-Pass evaluation loop and recursive re-triggering.
+   - **Status**: [⚪ PLANNED]
+   - **Key File**: `agent_loop.py` hooks.
+
+### 3. [Stage 3: Dashboard (User Interface)](./harness_tasks/ui_dashboard.md)
+   - **Focus**: Controls, markdown streaming, and quality badges.
+   - **Status**: [⚪ PLANNED]
+   - **Key File**: `AgentDrillDown.tsx`
+
+### 4. [Stage 4: Quality (Reliability & Testing)](./harness_tasks/reliability.md)
+   - **Focus**: Bias validation and loop breaker stability.
+   - **Status**: [⚪ PLANNED]
+
+---
+
+## 🛠 Lessons from Claude Code (Memory Mechanics Adherence)
+*Pattern: `MEMORY.md` Index + Topic Files*
+
+1. **Lightweight Index**: This file (the index) remains small so it can be loaded into any agent turn without busting the token budget.
+2. **Topic Segregation**: Details for each stage (Foundation, Engine, Dashboard, and Quality) are stored in `./harness_tasks/`. The agent only "reads" the relevant topic file when working on that specific stage.
+3. **Consistency**: Changes to tasks should be made in the topic files; the index only tracks high-level "Status" bubbles.
diff --git a/docs/features/harness_engineering/harness_tasks/foundation.md b/docs/features/harness_engineering/harness_tasks/foundation.md
new file mode 100644
index 0000000..4a4f400
--- /dev/null
+++ b/docs/features/harness_engineering/harness_tasks/foundation.md
@@ -0,0 +1,22 @@
+---
+title: Stage 1 - Data & Models (Foundation)
+status: IN_PROGRESS
+priority: HIGH
+---
+
+## Core Objectives
+Establish the underlying database structure and filesystem mirroring required for the Co-Worker agent's state management.
+
+## Task Breakdown
+- [ ] **DB Model Update**: Modify the backend `AgentInstance` model (PostgreSQL/MongoDB as applicable) to include:
+    - `co_worker_enabled`: (Boolean) Default: `False`.
+    - `rework_threshold`: (Integer) Range 0-100. Default: `80`.
+    - `max_rework_count`: (Integer) Default: `3`.
+- [ ] **Workspace Mirroring**:
+    - [ ] Create `.cortex/` directory in the agent's unique jail during initialization.
+    - [ ] Implement `history.log` append logic (JSON format).
+
+## Claude Code Inspiration: Memory Context
+*Reference: `src/memdir/memdir.ts`*
+- Ensure the `.cortex/` directory exists immediately on agent startup (idempotent initialization).
+- Use an append-only format with one JSON object per line (JSON Lines) for `history.log`, so that an interrupted write can corrupt at most the final line.
diff --git a/docs/features/harness_engineering/harness_tasks/orchestration.md b/docs/features/harness_engineering/harness_tasks/orchestration.md
new file mode 100644
index 0000000..a22c898
--- /dev/null
+++ b/docs/features/harness_engineering/harness_tasks/orchestration.md
@@ -0,0 +1,28 @@
+---
+title: Stage 2 - Orchestration Logic (The Engine)
+status: PLANNED
+priority: CRITICAL
+---
+
+## Core Objectives
+Implement the logic that triggers the Co-Worker agent at pre-run and post-run phases, managing the dual-stage evaluation.
+
+## Task Breakdown
+- [ ] **Request-Specific Rubric Generator**:
+    - [ ] Implement a pre-execution hook in `agent_loop.py`.
+    - [ ] Prompt the Co-Pilot to generate a task-specific `rubric.md`.
+- [ ] **Dual-Stage Post-Run Hook**:
+    - [ ] **Stage 2A (Blind Rating)**: Implement gRPC/Executor logic to call the Co-Pilot with a stripped context.
+    - [ ] **Stage 2B (Delta Analysis)**: Implement context-aware gap discovery (Score-Anonymized).
+- [ ] **Directive-Based Rework Injection**:
+    - [ ] Update the `agent_loop.py` rework trigger logic.
+    - [ ] Instead of passing raw feedback, format the Co-Worker's gaps into a **Directive block** (e.g., *"Actionable Command: Refactor X to resolve Y"*).
+- [ ] **Context Compaction Gate**:
+    - [ ] Implement logic to track token usage and turn count in the rework loop.
+    - [ ] If `Attempts > 2`, trigger the Co-Pilot to summarize the `.cortex/history.log` and replace the full rework history with a **Compacted Delta** for the Main Agent.
+
+## Claude Code Inspiration: Loop Orchestration
+*Reference: `src/query.ts`*
+- Adopt a `QueryLoop`-style state object that carries a counter across iterations (CC tracks `maxOutputTokensRecoveryCount`; in our case, `reworkCount`) so that the state needed to terminate the loop is never lost between iterations.
+- Use the **"Directive Fork"** pattern: in Stage 2B, provide a strict directive rather than commentary to improve fix accuracy.
+- **Context Management**: Adopt the `Microcompact` and `Autocompact` principles—summarize previous attempts in long sessions to save tokens and focus the agent's attention on the latest delta.
diff --git a/docs/features/harness_engineering/harness_tasks/reliability.md b/docs/features/harness_engineering/harness_tasks/reliability.md
new file mode 100644
index 0000000..652a0a5
--- /dev/null
+++ b/docs/features/harness_engineering/harness_tasks/reliability.md
@@ -0,0 +1,19 @@
+---
+title: Stage 4 - Reliability & Testing
+status: PLANNED
+priority: HIGH
+---
+
+## Core Objectives
+Validate the rework loop's stability and ensure objectivity in the evaluation process.
+
+## Task Breakdown
+- [ ] **Integration Tests**:
+    - [ ] Test: A task that fails on attempt 1, reworks, and passes on attempt 2.
+    - [ ] Test: A task that reaches `max_reworks` and stops even if score is still low.
+- [ ] **Bias Validation**:
+    - [ ] Audit logs to ensure Stage 2A truly receives zero context of previous rounds.
+
+## Claude Code Inspiration: Recovery Circuit Breakers
+*Reference: `src/query.ts`*
+- Ensure the `max_reworks` logic is a hard circuit breaker (similar to `MAX_OUTPUT_TOKENS_RECOVERY_LIMIT`) to avoid infinite loops and runaway costs.
diff --git a/docs/features/harness_engineering/harness_tasks/ui_dashboard.md b/docs/features/harness_engineering/harness_tasks/ui_dashboard.md
new file mode 100644
index 0000000..0483aa8
--- /dev/null
+++ b/docs/features/harness_engineering/harness_tasks/ui_dashboard.md
@@ -0,0 +1,27 @@
+---
+title: Stage 3 - User Interface (Dashboard)
+status: PLANNED
+priority: MEDIUM
+---
+
+## Core Objectives
+Build the user-facing controls and monitoring tabs for the evaluation loop.
+
+## Task Breakdown
+- [ ] **Agent Config Tab**:
+    - [ ] Add the "Co-Worker Settings" section to `DeployAgentModal.tsx`.
+    - [ ] Implement HSL-styled sliders for threshold and count.
+- [ ] **Evaluation Tab (`AgentDrillDown`)**:
+    - [ ] Create a real-time markdown renderer for `.cortex/feedback.md`.
+    - [ ] Build a "Rework History" component that visualizes `history.log` JSON data.
+- [ ] **Mood-Based Co-Worker Avatar**:
+    - [ ] Create a `CoWorkerAvatar` component to be displayed in the `AgentDrillDown` and `AgentCard`.
+    - [ ] Implement logic to map the numerical `Quality Score` to an avatar mood:
+        - `90-100`: High Approval (Happy/Smiling).
+        - `70-89`: Skeptical (Thinking).
+        - `<70`: Critical (Warn/Stern).
+
+## Claude Code Inspiration: Visual Feedback
+*Reference: `src/buddy/CompanionSprite.tsx`*
+- **Deterministic Avatars**: CC uses a seeded calculation based on user IDs (`userId + SALT`) to determine the "Buddy." While we want a single Co-Worker persona, the **Mood State Tree** (e.g., `HAPPY`, `THINKING`, `WARN`) is directly applicable to our Quality Score mapping.
+- **Personality through Animation**: Consider adding micro-animations to the avatar (e.g., a "Thinking" spin during the Co-Pilot evaluation phase) to match CC's high-polish terminal experience.