diff --git a/docs/features/harness_engineering/harness_engineering_execution_plan.md b/docs/features/harness_engineering/harness_engineering_execution_plan.md
index 0a0d234..a5e31a7 100644
--- a/docs/features/harness_engineering/harness_engineering_execution_plan.md
+++ b/docs/features/harness_engineering/harness_engineering_execution_plan.md
@@ -24,6 +24,11 @@
 - **Action (Backend):** Expand the `Session` DB model with an `is_locked` (Boolean) property. Update the `DELETE /api/v1/sessions/purge` endpoint to strictly run a condition (`WHERE is_locked = False`), making active Agents immune to global deletion sweeps.
 - **Action (Frontend):** In the normal Swarm Control sidebar, visually render a literal 🔒 Lock Icon next to any session where `is_locked` is true. Disable the manual delete button for that specific row, enforcing that an Agent's memory can only be purged by destroying the Agent itself from the Orchestrator Dashboard.
 
+### Task 0.4: Context Truncation (Head & Tail Preservation)
+If `history=session.messages` is passed natively to the AI orchestrator, an autonomous loop will rapidly exceed the 128k token API limit and crash with an `HTTP 400 ContextWindowExceededError`. However, blindly slicing the last 20 messages destroys the Agent's foundational mission prompt at the start of the chat.
+- **Action:** Refactor `chat_with_rag` to aggressively chunk the message array. Provide the AI with the **Head** (The initial 3 messages containing its core directive) and the **Tail** (The most recent 10-15 messages containing its immediate working memory/errors).
+- **Action:** Before hitting the API, completely remove the vast middle section of the array to guarantee the Agent never exceeds the API limit, while still retaining absolute knowledge of *why* it is working.
+
 ---
 
 ## Area 1: Core Database & Context Scaffolding
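
The head-and-tail truncation described in Task 0.4 could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the helper name `truncate_head_tail` and its `head_size` / `tail_size` parameters are hypothetical, and the defaults (3 and 15) are simply taken from the numbers in the task text. How the result would be wired into `chat_with_rag` is left open.

```python
from typing import List, TypeVar

T = TypeVar("T")


def truncate_head_tail(messages: List[T], head_size: int = 3, tail_size: int = 15) -> List[T]:
    """Keep the first `head_size` and last `tail_size` messages; drop the middle.

    Hypothetical helper: the name and defaults are assumptions for illustration.
    """
    if head_size < 0 or tail_size < 0:
        raise ValueError("head_size and tail_size must be non-negative")

    # Short histories are returned unchanged (as a copy), so head and tail
    # never overlap and no message is duplicated.
    if len(messages) <= head_size + tail_size:
        return list(messages)

    head = messages[:head_size]
    # Index from the front so that tail_size == 0 yields an empty tail
    # (messages[-0:] would return the whole list).
    tail = messages[len(messages) - tail_size:]
    return head + tail


if __name__ == "__main__":
    history = [f"m{i}" for i in range(100)]
    trimmed = truncate_head_tail(history)
    print(len(trimmed), trimmed[:3], trimmed[3], trimmed[-1])
```

Note that a count-based cut like this bounds the number of messages, not the number of tokens. If individual messages can be very large (for example, long tool outputs), a separate token-budget check would likely still be needed before the request is sent.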