This report performs a deep-dive audit of the file synchronization and "Ghost Mirror" architecture within mirror.py, focusing on 12-Factor App Methodology, NFS Performance, and Filesystem Security.
| Factor | Status | Observation |
|---|---|---|
| VI. Processes | 🔴 Performance Hazard | Volatile Hash Cache: self.hash_cache (Line 17) is kept in-memory only. Upon Hub restart (deployment/crash), the Hub loses all file hashes. On the next reconciliation cycle, it must perform a full recursive re-hash of all mirrored files. On NFS-backed storage, this "Re-hashing Wave" will cause catastrophic I/O spikes and delay mesh synchronization for minutes. |
| IX. Disposability | ✅ Success | Stale Lock Reclamation: The Hub implements rigorous background cleanup logic (purge_stale_locks, Line 513) to ensure orphaned .cortex_lock and .cortex_tmp files do not accumulate indefinitely during crashed transfers. |
| X. Dev/Prod Parity | 🟡 Warning | Local Path Logic: The use of os.chown (Line 100) and parent_stat.st_uid assumes the Hub process has sufficient permissions to manipulate file ownership on the underlying filesystem. This works in a privileged dev container but may fail in restrictive Rootless K8s/Docker production environments. |
app/core/grpc/core/mirror.pyThe synchronization engine that bridges Hub-side storage and Mesh-node workspaces.
[!CAUTION] I/O Starvation Risk (Reconciliation) Line 485:
with ThreadPoolExecutor(max_workers=32) as executor:The reconciliation process uses a high thread-count for hash verification. While good for parallelization, 32 threads performing 1MB buffer reads simultaneously from NFS storage can lead to I/O starvation for other Hub services (like the Database or API). Fix: Externalize thehash_cacheto a persistent SQLite or Redis store to eliminate redundant re-hashing across Hub reboots.
Identified Problems:
_check_skill_promotion (Line 177) automatically moves directories based on shutil.move (Line 219). If the target directory already exists or contains broken symlinks, this operation can fail silently or cause data loss in the global skill registry..skills/ (Line 59) is robust for symlinks, but it does not prevent a node from "shadowing" a system skill by creating a physical directory of the same name BEFORE the Hub can symlink it.hash_cache to a local persistent store (e.g., DATA_DIR/hashes.db) to enable sub-second reconciliation after Hub restarts.app/config.py to allow tuning for specific storage backends (SSD vs. NFS)._check_skill_promotion uses a transactional move strategy (check $\rightarrow$ temp $\rightarrow$ rename) to prevent race conditions during concurrent skill creation.This concludes Feature 13. I have persisted this report to /app/docs/reviews/feature_review_mirror_sync.md. Which of these storage optimizations should we prioritize?