This report performs a deep-dive audit of the task-tracking state machine and memory protection layer within journal.py, focusing on 12-Factor App Methodology, Memory Management, and Agent Reliability.
| Factor | Status | Observation |
|---|---|---|
| VI. Processes | 🔴 Major Issue | Volatile Journal State: The `TaskJournal` stores all active task metadata and stream buffers in memory (`self.tasks`, Line 20). If the Hub process restarts (for example, during a deployment or after a crash), all currently running agent sub-tasks lose their "result hook." The AI Agent will continue waiting indefinitely for a result that no longer exists in the Hub's memory. This state must be synchronized to a persistent store such as SQLite or Redis. |
| IX. Disposability | ✅ Success | Robust Memory Sandboxing: The Hub's "Head + Tail" buffer strategy (`_trim_stream`, Line 41) is well suited to agentic systems. It prevents the Hub from OOM-crashing during accidental massive stdout bursts while preserving the critical initial context and final status needed by the AI. |
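The "Head + Tail" idea from the Disposability row can be illustrated with a minimal sketch. This is not the code in `journal.py`: the function name `trim_stream`, the character limits, and the truncation marker are all assumptions chosen for illustration.

```python
def trim_stream(text: str, head: int = 2000, tail: int = 2000) -> str:
    """Keep the first `head` and last `tail` characters of an oversized stream.

    Hypothetical helper: the limits and the marker format are assumptions,
    not values taken from journal.py.
    """
    limit = head + tail
    if len(text) <= limit:
        # Small outputs pass through untouched.
        return text
    dropped = len(text) - limit
    marker = f"\n...[{dropped} characters truncated]...\n"
    # Slice from an explicit index so that tail=0 yields an empty tail
    # (text[-0:] would return the whole string).
    return text[:head] + marker + text[len(text) - tail:]
```

The head preserves the command's initial context and the tail preserves its final status, so memory use stays bounded by roughly `head + tail` characters per stream regardless of how much output a node produces.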
`app/core/grpc/core/journal.py`

The Hub's short-term memory for tracking asynchronous node execution.
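Because this journal is only short-term memory, the Processes finding suggests mirroring task registrations to a persistent store. The following is a minimal sketch of that idea, assuming SQLite via Python's standard `sqlite3` module. The class name `TaskStore`, the table layout, and the status values are hypothetical and do not come from the audited code.

```python
import sqlite3
import time


class TaskStore:
    """Hypothetical durable mirror of task registrations (not the journal.py API)."""

    def __init__(self, path: str = ":memory:"):
        # A file path would be used in practice; ":memory:" keeps the sketch
        # self-contained. By default the connection may only be used from the
        # thread that created it.
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS tasks ("
            "task_id TEXT PRIMARY KEY, node_id TEXT NOT NULL, "
            "status TEXT NOT NULL, created_at REAL NOT NULL)"
        )
        self._db.commit()

    def register(self, task_id: str, node_id: str) -> None:
        # `with self._db` commits on success and rolls back on error.
        with self._db:
            self._db.execute(
                "INSERT OR REPLACE INTO tasks VALUES (?, ?, 'running', ?)",
                (task_id, node_id, time.time()),
            )

    def complete(self, task_id: str) -> None:
        with self._db:
            self._db.execute(
                "UPDATE tasks SET status = 'done' WHERE task_id = ?", (task_id,)
            )

    def pending(self) -> list:
        # After a Hub restart, these rows identify tasks to re-attach to.
        rows = self._db.execute(
            "SELECT task_id, node_id FROM tasks "
            "WHERE status = 'running' ORDER BY created_at"
        )
        return rows.fetchall()
```

On startup, a Hub using a store like this could call `pending()` and ask each listed node for the current state of its task, rather than leaving the agent waiting on a result hook that no longer exists.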
> [!TIP]
> Performance: Thread Safety vs. Throughput
>
> Line 19: `self.lock = threading.Lock()`
>
> The journal uses a single global lock for all task updates (thought logs, stdout chunks, result fulfillment). For a mesh of 100+ nodes streaming build logs, this lock will become a significant point of contention.
>
> Fix: Shard the task registry (e.g., 16 separate dictionaries, each with its own lock) based on the `task_id` hash to improve concurrent update performance.
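The sharding fix can be sketched as follows. This is an illustrative outline only: the class `ShardedTaskRegistry` and its methods are hypothetical names, the per-task value is simplified to a list of chunks, and `zlib.crc32` is one possible stable hash, not necessarily what the project would choose.

```python
import threading
import zlib


class ShardedTaskRegistry:
    """Hypothetical registry that spreads tasks across independently locked shards."""

    def __init__(self, shard_count: int = 16):
        # Each shard is a (dict, lock) pair; updates to tasks in different
        # shards never wait on the same lock.
        self._shards = [({}, threading.Lock()) for _ in range(shard_count)]

    def _shard_for(self, task_id: str):
        # crc32 gives the same value in every process, unlike the built-in
        # hash() for strings, which is randomized per interpreter run.
        index = zlib.crc32(task_id.encode("utf-8")) % len(self._shards)
        return self._shards[index]

    def append_chunk(self, task_id: str, chunk: str) -> None:
        tasks, lock = self._shard_for(task_id)
        with lock:
            tasks.setdefault(task_id, []).append(chunk)

    def get_chunks(self, task_id: str) -> list:
        tasks, lock = self._shard_for(task_id)
        with lock:
            # Return a copy so callers cannot mutate shared state unlocked.
            return list(tasks.get(task_id, []))
```

Operations that span every task, such as a cleanup sweep, would need to visit each shard in turn. How much throughput this gains under CPython's global interpreter lock is something to confirm by profiling rather than assume.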
Identified Problems:
- The cleanup task (Line 216) removes completed results after only 120 seconds. If a calling service (such as the UI or a background RAG aggregator) fails to poll within that window, for example because of network latency, the result is lost.
- … `task_id` and assigned `node_id` in the backend database upon registration to enable "Re-attachment" logic after a Hub reboot.
- … `app/config.py` to allow tuning for specific AI model context windows.

This concludes Feature 15. I have persisted this report to `/app/docs/reviews/feature_review_task_journal.md`. All major gRPC core components have now been audited. Shall I proceed to the final review of the mesh "Assistant" and STT/TTS providers?