| 2026-04-30 |

Self-recovery: dispatch dict, disable updater in Docker, fix watchdog semantics
...
1. Replace if/elif dispatch chain with _DISPATCH dict (engines/node.py)
Adding a proto message kind now requires exactly one entry in _DISPATCH —
impossible to add a callback slot without the routing case. Missing kinds
emit WARNING immediately. Extractor is colocated with the route so the
wrong-object bug (Bug 1) cannot recur.
2. Honour CORTEX_DISABLE_AUTO_UPDATE env var (updater.py)
In Docker, updates are delivered via image rebuilds. The old behaviour
called sys.exit(0) to spawn a bootstrapper, which Docker restart:always
turned into an infinite boot loop. Setting CORTEX_DISABLE_AUTO_UPDATE=1
in the container env prevents this entirely.
3. Watchdog ticks unconditionally in health reporter (node.py)
Previously the watchdog tick was skipped when the hub was unreachable,
causing os._exit(1) after 300s of any disconnect — even during normal
gRPC reconnect backoff. Now the watchdog proves the reporter thread is
alive regardless of hub state; it only fires if the thread itself deadlocks.
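The _DISPATCH pattern in item 1 can be sketched as follows. This is an illustrative stand-in only: `Node`, the handler names, and the dict-shaped messages approximate the real engines/node.py types and protobuf classes.

```python
import logging

logger = logging.getLogger(__name__)

# Illustrative stand-in for the real node class in engines/node.py.
class Node:
    def __init__(self):
        self.mode = None
        self.pool_size = None

    def _on_policy_update(self, policy):
        self.mode = policy["mode"]

    def _handle_work_pool(self, pool):
        self.pool_size = pool["size"]

# One entry per proto message kind: extractor and handler are colocated,
# so a kind cannot gain a callback slot without a routing case, and the
# handler always receives the sub-message rather than the envelope.
_DISPATCH = {
    "policy": (lambda msg: msg["policy"], Node._on_policy_update),
    "work_pool_update": (lambda msg: msg["work_pool_update"], Node._handle_work_pool),
}

def on_message(node, kind, message):
    entry = _DISPATCH.get(kind)
    if entry is None:
        logger.warning("unhandled message kind: %s", kind)  # loud, not silent
        return
    extract, handle = entry
    handle(node, extract(message))
```

Unknown kinds fall through to a single WARNING branch, so a missing route surfaces in logs instead of vanishing.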
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Harden mesh dispatch against silent failures
...
Three structural improvements to prevent future silent drops:
1. Add else-warning in on_message() dispatch chain — any unhandled proto
message kind now logs a WARNING in production instead of being silently
swallowed. This would have made the missing work_pool_update route
immediately visible.
2. Fix on_ready to use getattr(message, kind) — same safe extraction pattern
now applied consistently to all dispatch branches. Prevents Bug 1 class
from recurring if _on_mesh_ready ever starts using the argument.
3. Remove redundant _health_thread_started boolean from send_health() guard —
_start_health_stream() already gates on thread.is_alive(), so the outer
boolean was a stale fast-path that could mislead future readers back into
the original thread-restart bug.
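The getattr(message, kind) extraction in item 2 can be sketched like this; `ServerMessage` and the handler table are simplified stand-ins, not the real protobuf types.

```python
import logging

logger = logging.getLogger(__name__)

# Simplified stand-in for the proto envelope: each kind is an attribute
# holding the corresponding sub-message.
class ServerMessage:
    def __init__(self, **kinds):
        self.__dict__.update(kinds)

def dispatch(handlers, message, kind):
    handler = handlers.get(kind)
    if handler is None:
        logger.warning("unhandled message kind: %s", kind)
        return
    # Every branch extracts the same way: the handler gets the sub-message
    # named by `kind`, never the whole envelope object.
    handler(getattr(message, kind))
```

Because extraction is uniform, a handler that later starts reading its argument cannot accidentally receive the envelope.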
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Fix node disconnect: health stream restart, policy dispatch, work_pool routing
...
Three bugs causing nodes to drop offline and stay offline after disconnect:
1. on_policy passed the full ServerMessage instead of the sub-message —
caused 'Dispatch Error: mode' on every connection (AgentNode._on_policy_update
accesses policy.mode directly)
2. _health_thread_started never reset in close() — health stream could not
restart after reconnect, so hub watchdog eventually timed out the node
3. work_pool_update had no dispatch case in on_message() — _handle_work_pool
was dead code, causing hub to flood nodes with unclaimed task updates
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
| 2026-04-28 |
Fix Windows agent offline issues: add run_loop.bat wrapper and fix Sandbox policy handshake
|
Fix thread-unsafe sequence counter causing heapq comparison crash
...
The _send_seq += 1 approach had a race condition: two threads calling
send() simultaneously could load the same counter value, producing two
queue items with identical (priority, seq) tuples. heapq then compares
the third element (protobuf message), which crashes with TypeError.
Replace with itertools.count() whose next() is GIL-atomic in CPython —
each call is a single C-level operation that cannot be interrupted
mid-increment. Also fix the legacy fallback path in
LiveNodeRecord.send_message and remove the duplicate 'import queue' in
node_registry.py.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Fix PriorityQueue crash when protobuf messages are compared as tiebreakers
...
Python's heapq orders tuples element by element, so later elements are
compared whenever earlier ones tie. Using timestamp as the second key
means two messages queued within the same millisecond trigger comparison
of the protobuf message objects, which don't support '<'. Replace the
timestamp with a monotonic sequence counter so the message object is
never reached in the comparison.
Fixes: Reader thread FATAL exception: '<' not supported between instances
of 'ClientTaskMessage' and 'ClientTaskMessage'
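The failure mode can be reproduced in a few lines, using a plain class in place of the generated protobuf type:

```python
import heapq

class ClientTaskMessage:  # stand-in: protobuf messages define no __lt__
    pass

heap = []
# An identical (priority, timestamp) prefix forces tuple comparison to
# fall through into slot 3, where '<' between messages is undefined.
heapq.heappush(heap, (5, 1714000000, ClientTaskMessage()))
try:
    heapq.heappush(heap, (5, 1714000000, ClientTaskMessage()))
except TypeError as exc:
    print(exc)  # '<' not supported between instances ...
```

This is exactly the exception quoted in the commit above; any unique second key that sorts before the message prevents it.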
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Add token self-recovery to survive auth failures without SSH access
...
- GrpcMeshTransport: on handshake rejection, calls /api/v1/agent/token-sync
using stable secret_key to fetch fresh invite_token and retries handshake.
Persists recovered token to all known config file locations.
- agent_update.py: new /token-sync endpoint (auth: hub SECRET_KEY header).
- node.py: wire hub_http_url + secret_key into GrpcMeshTransport constructor.
- reinstall_windows_agent.ps1: idempotent all-in-one reinstall script —
kills competing python processes, disables ghost startup bat, syncs config,
optionally updates token, re-registers task with RestartCount=3.
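The recovery flow above can be sketched as follows. The endpoint path comes from the commit, but the header name, the exception type standing in for a handshake rejection, and the transport attributes are illustrative assumptions, not the real API.

```python
import json
import urllib.request

def fetch_fresh_token(hub_http_url, secret_key):
    # Assumed header name; the real endpoint authenticates with the hub
    # SECRET_KEY as described in the commit.
    req = urllib.request.Request(
        hub_http_url + "/api/v1/agent/token-sync",
        headers={"X-Hub-Secret-Key": secret_key},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["invite_token"]

def handshake_with_recovery(transport, fetch_token=fetch_fresh_token):
    try:
        return transport.handshake()
    except PermissionError:  # stand-in for a handshake rejection
        # Use the stable secret_key to recover a fresh invite_token,
        # persist it, then retry the handshake exactly once.
        transport.invite_token = fetch_token(
            transport.hub_http_url, transport.secret_key)
        transport.persist_token(transport.invite_token)
        return transport.handshake()
```

Injecting `fetch_token` keeps the retry logic testable without a live hub.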
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
| 2026-04-26 |
fix(mesh): resolve Windows interactive terminal backspace and reconnect loops, optimize link CSS
yangyang xie
|
| 2026-04-25 |
Fix integration tests, deadlocks, and race conditions in coworker flow
Antigravity AI
|
refactor done
yangyang xie
|
| 2026-04-24 |
half done refactoring
Antigravity AI
|