# Code Review Report: Feature 1 — Autonomous Rework & Evaluation Hub

This report performs a deep-dive audit of the backend orchestration layer, focusing on `agent_loop.py`, `harness_evaluator.py`, and `architect.py` through the lens of **12-Factor App Methodology**, **Pythonic Code Style**, and **Architectural Stability**.

---

## 🏗️ 12-Factor App Compliance Audit

| Factor | Status | Observation |
| :--- | :--- | :--- |
| **III. Config** | 🟡 **Warning** | While using Pydantic `Settings`, `agent_loop.py` contains some hardcoded fallbacks and heuristic parsing of provider strings (e.g., `split("/")`). This logic should be encapsulated in a `ConfigurationResolver` service. |
| **VI. Processes** | 🔴 **Problem** | `AgentExecutor` runs as a long-lived background task. If the Hub process restarts, there is no state persistence/recovery for the "active" loop. It relies on the database status, but the `asyncio` task is lost. |
| **XI. Logs** | 🔴 **Major Issue** | Evaluation "logs" (`history.log`) are treated as **application state** stored as files on remote mesh nodes. This violates the "Logs as Event Streams" principle and introduces significant latency/corruption risks. |
| **XII. Admin** | ✅ **Success** | Database migrations and node provisioning are handled as one-off processes, maintaining a clean runtime environment. |

---

## 🔍 File-by-File Diagnostic

### 1. `app/core/orchestration/agent_loop.py`
The "brain" of the autonomous rework cycle.

> [!WARNING]
> **Single Responsibility Principle (SRP) Violation**
> This file is currently 25KB and performs DB management, LLM provider instantiation, Token Usage regex counting, AND the rework state machine.
> **Fix**: Extract `TokenUsageCounter`, `ReWorkStrategy`, and `LLMProviderFactory` into standalone services.

**Identified Problems**:
*   **Legacy Logging**: Still uses several `print()` statements for exception reporting (Lines 328, 377). This makes production monitoring difficult via ELK/CloudWatch.
*   **Manual Token Counting**: Uses regex/manual subtraction instead of relying on provider-level usage metadata where possible.
*   **Database Contention**: Performs frequent `instance = db.query(...).first()` calls inside the streaming loop. This creates high DB pressure during high-throughput reasoning tasks.

---

### 2. `app/core/orchestration/harness_evaluator.py`
Manages quality gates and rubric generation.

> [!CAUTION]
> **Catastrophic Failure Potential: Persistent Audit Store**
> Lines 313-343 use a `cat` $\rightarrow$ `json.loads` $\rightarrow$ `write` pattern to manage rework history. 
> 1. **Concurrency**: If multiple processes write to history, data is lost.
> 2. **Scalability**: If the log grows to several megabytes, every audit attempt will timeout due to gRPC payload limits for essentially a "database" operation executed via shell commands.
> **Fix**: Store evaluation history in the primary SQL database or an append-only JSONL stream.

**Identified Problems**:
*   **Hardcoded Prompts**: System prompts for Rubric and Delta Analysis are hardcoded in the class (Lines 98, 176). This makes prompt engineering difficult without code deploys.
*   **Error Handling**: `initialize_cortex` ignores failures. If a node has a read-only filesystem, the agent will appear to "hang" silently.

---

### 3. `app/core/orchestration/architect.py`
The tool-calling execution engine.

**Identified Problems**:
*   **Memory Management**: The context-chunk injection logic doesn't strictly enforce model context limits before dispatching the final turn, which could lead to `context_window_exceeded` errors for smaller models.
*   **Tool Filtering**: The tool selection logic is repeated in several places. Consistency is a concern if a new "Guardrail" tool is added.

---

## 🛠️ Summary Recommendations

1.  **Stop using `history.log` as primary state**: Move rework attempt records to a Database table (`AgentAttempt`). The file-based log should be a secondary "mirror" only.
2.  **Modularize `AgentExecutor`**: Break down the massive `run` method into distinct phases: `PreConfig`, `Execute`, `Audit`, `Iterate`.
3.  **Upgrade Logging**: Replace all `print()` and `logger.error(f"error: {e}")` with `logger.exception("Context-rich error message")` to capture stack traces.

---

**Please review this first feature audit. If approved, I will proceed to the next feature: Node Registry & Mesh Monitoring.**
