For the past 12 months, I have been executing a solo build of JARVIS — a three-repository, multi-process autonomous AI operating system spanning Python, C++, Rust, Go, Swift, Objective-C, and TypeScript. The system orchestrates 60+ asynchronous agents across a neural mesh, routes inference dynamically between local Apple Silicon and GCP, performs real-time voice biometric authentication, controls macOS at the native API level, and continuously trains its own models through a self-improving feedback loop.
Full Stack Inventory
| Category | Technologies |
|---|---|
| Languages | Python, C, C++, Rust, Go, Swift, Objective-C, Objective-C++, TypeScript, JavaScript, SQL, Shell/Bash, ARM64 Assembly (NEON SIMD), Metal Shading Language, AppleScript, Protobuf, HCL/Terraform, CUDA, HTML/CSS |
| ML / Inference | PyTorch, Transformers, llama.cpp, llama-cpp-python, GGUF quantization, ONNX Runtime, CoreML Tools, SpeechBrain, scikit-learn, SentenceTransformers, HuggingFace Hub, safetensors, tiktoken, Numba (JIT), sympy, LangChain, YOLO |
| Training | LoRA, DPO, RLHF, FSDP, MAML (meta-learning), curriculum learning, federated learning, causal reasoning, world model training, online learning, active learning, EWC |
| Models / Vision | LLaVA (multimodal), ECAPA-TDNN (speaker verification), Whisper (faster-whisper, openai-whisper), Porcupine/Picovoice (wake word), Piper TTS, OmniParser (OCR) |
| LLM APIs | Anthropic Claude API (chat, vision, computer use), OpenAI API (chat completions, embeddings), Google Gemini API, Ollama (local inference) |
| Rust | PyO3, ndarray, rayon, parking_lot, DashMap, crossbeam, serde, mimalloc, image crate, Metal (GPU compute), tokio, zstd, lz4, candle (on-device ML) |
| Swift / macOS | Swift Package Manager, CoreLocation, WeatherKit, AppKit, Foundation, Quartz/CoreGraphics, Accessibility API, AVFoundation, pyobjc, launchd, osascript, yabai |
| Vector / Data | ChromaDB, FAISS, Redis, PostgreSQL (asyncpg, psycopg2), SQLite (aiosqlite), NetworkX, bloom filters |
| Infrastructure | GCP (Compute Engine, Cloud SQL, Cloud Run, Secret Manager, Monitoring), Docker, docker-compose, Terraform, Kubernetes, systemd, CMake, pybind11, cpp-httplib |
| CI/CD | GitHub Actions (30+ workflows), CodeQL, Super-Linter, Dependabot, Gitleaks, Postman/Newman, git worktrees |
| Backend | FastAPI, uvicorn, uvloop, gRPC, Protobuf, asyncio, aiohttp, httpx, WebSocket, Cloud SQL Proxy, circuit breakers, exponential backoff, distributed locks, epoch fencing |
| Observability | OpenTelemetry (tracing + metrics + OTLP/gRPC export), Prometheus, structlog, psutil, Pydantic, JSONL telemetry pipeline, LangFuse, Helicone, PostHog |
| Frontend | React 19, Next.js, Framer Motion, Axios, WebSocket real-time streaming |
| Audio / Vision | OpenCV, sounddevice, PyAudio, webrtcvad (VAD), Silero VAD, speexdsp (AEC), librosa, pyautogui, CoreML VAD, Tesseract OCR |
| Voice / TTS | ElevenLabs, GCP TTS, Piper TTS, Edge-TTS, gTTS, pyttsx3, macOS Say, Wav2Vec2 |
| C++ (ReactorCore) | Custom mlforge ML library: KD-trees, graph structures, trie, matrix ops, linear/logistic regression, decision trees, neural nets, model serialization, deployment API |
| AI Orchestration | LangChain, LangGraph, CrewAI, OpenHands, Open Interpreter, OmniParser |
| Experiment Tracking | Weights & Biases (wandb), TensorBoard |
| Browser Automation | Playwright, DuckDuckGo Search, Beautiful Soup |
| Quality / Linting | pytest, Ruff, Black, isort, Flake8, mypy, Pyright, Bandit, ESLint, pre-commit |
| Notifications | Discord, Slack, Telegram, SMTP/Email |
| External APIs | OpenWeather, Alpha Vantage, News API, Wikipedia API, Google Safe Browsing |
Full AI & Dev Tools Inventory
| Category | Tools |
|---|---|
| LLM Platforms | Anthropic Claude (chat, vision, computer use), OpenAI (Whisper, embeddings), Google Gemini, Ollama, HuggingFace Transformers, llama.cpp (GGUF), Apple MLX, Candle (Rust ML), ONNX Runtime, CoreML |
| AI Development | Cursor IDE, Claude Code CLI, Claude GitHub Actions (5 workflows: PR analyzer, docs generator, test generator, security analyzer, auto-fix) |
| AI Orchestration | LangChain, LangGraph, CrewAI (multi-agent), OpenHands (coding assistant), Open Interpreter, OmniParser (vision parsing) |
| Experiment Tracking | Weights & Biases (wandb), TensorBoard, LangFuse (LLM observability), Helicone (LLM cost tracking), PostHog (product analytics) |
| Voice & Audio | OpenAI Whisper, Faster-Whisper, SpeechBrain, Wav2Vec2, ElevenLabs TTS, GCP TTS, Piper TTS, Edge-TTS, gTTS, pyttsx3, Picovoice/Porcupine (wake word), WebRTC VAD, Silero VAD, CoreML VAD |
| Browser Automation | Playwright, DuckDuckGo Search, Beautiful Soup, Google Safe Browsing API |
| Testing & Quality | pytest, Ruff, Black, isort, Flake8, mypy, Pyright, Bandit, ESLint, Super-Linter, CodeQL, Dependabot, Gitleaks, Postman/Newman, pre-commit hooks |
| Notifications | Discord, Slack, Telegram, SMTP/Email (Gmail) |
| External Data APIs | OpenWeather, Alpha Vantage (stocks), News API, Wikipedia API, Google NotebookLM |
Every component below is production code running in the JARVIS ecosystem — not academic exercises.
Data Structures (50+ types)
| Category | Structures | Implementation |
|---|---|---|
| Trees | Quadtree (spatial indexing), KD-Tree (nearest neighbor + radius search), Trie (prefix search), DAG (startup dependency graph), Scene Graph, Knowledge Graph, Process Tree | Python + Rust + C++ |
| Graphs | Reasoning Graph, Dependency Graph, Multi-Space Context Graph, Window Relationship Graph, Service Mesh Discovery Graph, LangGraph state machines, Causal Graphs (do-calculus) | Python |
| Hash-Based | Bloom Filters (3 languages), LSH Semantic Cache, LRU Cache, TTL Cache, Consistent Hashing, DashMap (lock-free concurrent), Bitmaps/Bitsets | Python + Rust + Swift |
| Heaps & Queues | Binary Heap (heapq), Priority Queue, Bounded Queue, Ring Buffer, Circular Buffer, Work-Stealing Queue, Zero-Copy IPC (mmap), Lock-Free SPSC Queue | Python + Rust + JS |
| Concurrent | Arc<Mutex<>>, RwLock, DashMap, mpsc channels, Vector Clock, CRDT, Distributed Lock, asyncio.Queue | Rust + Python |
| Matrices & Tensors | Matrix2D, Matrix3D (row-major), Sparse Matrices (nalgebra-sparse), PyTorch Tensors, Quantized Tensors (INT8/INT4), Embedding Vectors | Rust + C++ + Python |
| Memory | Memory Pool, Slab Allocator, Zero-Copy Buffers, Object Recycler, mmap Ring Buffers | Rust + Python |
| State | Finite State Machine, Event Bus, Event Store, Sliding Window, Bounded Collections | Python |
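To give one row of the table concrete shape: the Bloom Filter entry reduces to a bit array plus k hash probes. This is a minimal illustrative sketch, not the production implementation; the `BloomFilter` name, sizes, and hash count here are arbitrary choices:

```python
import hashlib

class BloomFilter:
    """Probabilistic set membership: no false negatives, tunable false-positive rate."""

    def __init__(self, size_bits: int = 8192, num_hashes: int = 4) -> None:
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item: str):
        # Derive k independent positions from salted SHA-256 digests
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # All k bits set -> "probably present"; any bit clear -> definitely absent
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))

bf = BloomFilter()
bf.add("model:llava")
assert "model:llava" in bf   # always true once added
```

The same structure works as a negative cache in front of a slower store: a miss in the filter means the key was never written, so the lookup can be skipped entirely.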
Algorithms (80+ implementations)
| Category | Algorithms | Where |
|---|---|---|
| Resilience | Circuit Breaker (5 variants), Exponential Backoff w/ Jitter, Graceful Degradation, Self-Healing, Leader Election, Distributed Locking, Distributed Transactions, Distributed Dedup | JARVIS + Prime |
| Scheduling | Round Robin, Token Bucket, Leaky Bucket, Sliding Window Rate Limiter, Work Stealing, Backpressure Control, Adaptive ML-Based Rate Limiting | All three repos |
| Graph / Search | Topological Sort (DAG), BFS/DFS, A* Search, Dijkstra's Shortest Path, K-Nearest Neighbor, PageRank (file importance ranking) | All three repos |
| Statistical / Bayesian | Bayesian Inference (Beta-Bernoulli, Normal-Normal posteriors), Bayesian Confidence Fusion, Multi-Armed Bandit (Thompson Sampling, epsilon-greedy), Monte Carlo Validation, Kalman Filter (RSSI smoothing), Markov Chain Prediction | JARVIS + Prime |
| ML Training | LoRA/QLoRA, DPO (preference optimization), RLHF (PPO pipeline), FSDP (parameter sharding), MAML/Reptile (meta-learning), Federated Learning (FedAvg, FedProx, Byzantine-robust), Curriculum Learning, Causal Reasoning (do-calculus), Online Learning w/ EWC, World Model Training (Dreamer/MuZero-inspired), Knowledge Distillation (Hinton, FitNets, attention transfer, multi-teacher), Gradient Accumulation, Mixed Precision (BF16/FP16) | ReactorCore + Prime |
| ML Inference | Quantized INT8/INT4, Cosine Similarity, LSH, Vector Search, Anomaly Detection, Pattern Recognition, Goal Inference, Activity Recognition, Tiered Complexity Routing, Flash Attention | JARVIS + Prime |
| Neural Networks | Multi-Head Self-Attention, Dropout, BatchNorm, LayerNorm, LSTM + Attention, Feedforward w/ Backpropagation, Cognitive Layers (cross-attention + residual) | All three repos |
| Clustering & Reduction | K-Means, DBSCAN, PCA, Truncated SVD, TF-IDF Vectorization | JARVIS + Reactor |
| Ensemble Methods | Random Forest, Gradient Boosting, Isolation Forest, Ensemble STT (multi-model voting), Weighted Model Ensemble (majority/cascade) | JARVIS + Reactor |
| Signal Processing | VAD (WebRTC + Silero + CoreML), MFCC/Mel Filterbanks, Spectrogram, Anti-Spoofing, Barge-In Detection, ECAPA-TDNN Speaker Verification | JARVIS |
| Compression | Zstd, LZ4, Gzip/Zlib, Custom Vision Compression | Rust + Python |
| Cryptography | HMAC, SHA-256, MD5, JWT, Secure Password Hashing, File Integrity Checksums, Checkpoint Verification | All three repos |
| Caching | LRU Eviction, TTL Eviction, Predictive Cache Warming (EWMA + time-series), LSH Semantic Cache, Bloom Filter Negative Cache, Memoization (lru_cache) | All three repos |
| Evolutionary | Genetic Algorithm (Ouroboros self-programming loop — B+ branch-isolated sagas, v262.0 fully activated) | JARVIS |
| Concurrency | Deadlock Prevention, CPU Affinity Pinning, Parallel DAG Initialization, Zero-Copy mmap IPC, Lock-Free Channels | JARVIS + Prime |
| GPU / SIMD | Metal Compute Shaders, ARM64 NEON SIMD Intrinsics | JARVIS (Rust + C + Assembly) |
| C++ ML (mlforge) | Linear Regression (Ridge/Lasso), Logistic Regression, Decision Tree (Gini), Neural Net (backprop), Matrix Serialization, KD-Tree, Graph (BFS/DFS), Trie | ReactorCore |
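As a flavor of the algorithms listed above, the Multi-Armed Bandit row (Thompson Sampling, Beta-Bernoulli) fits in a few lines. This is a generic sketch; the arm names are hypothetical stand-ins for inference backends, not the shipped code:

```python
import random

class ThompsonSampler:
    """Beta-Bernoulli Thompson Sampling over a set of arms (e.g. inference backends)."""

    def __init__(self, arms):
        # Beta(1, 1) prior on each arm == uniform belief over success rates
        self.state = {arm: [1, 1] for arm in arms}

    def choose(self) -> str:
        # Sample a plausible success rate per arm; exploit the best draw
        return max(self.state, key=lambda a: random.betavariate(*self.state[a]))

    def update(self, arm: str, success: bool) -> None:
        # Bayesian update: successes bump alpha, failures bump beta
        self.state[arm][0 if success else 1] += 1

bandit = ThompsonSampler(["gcp", "local", "api"])
arm = bandit.choose()
bandit.update(arm, success=True)
```

Because each choice samples from the posterior rather than taking the mean, exploration decays naturally as evidence accumulates, with no epsilon schedule to tune.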
JARVIS is not a chatbot wrapper. It is a distributed AI operating system composed of three interdependent repositories — each a standalone production system, together forming a self-improving autonomous intelligence.
- Single command control plane: `python3 unified_supervisor.py` boots Body, Mind, and Forge with deterministic lifecycle ownership
- Trinity operating model: `JARVIS` executes, `JARVIS-Prime` reasons/routes, `ReactorCore` trains and redeploys
- Reliability-first inference: policy-based failover from GCP golden image to local Apple Silicon to API fallback
- Closed learning loop: runtime telemetry flows to Reactor training, then gated deployment returns improved models to Prime
- Native autonomy stack: async agent mesh, Google Workspace workflows, voice biometrics, and vision-driven macOS control
- Safety by design: policy gates, contract checks, kill-switch controls, circuit breakers, and probation-based rollback
```mermaid
flowchart TD
K["UNIFIED SUPERVISOR<br/>single control plane"] --> B["JARVIS (Body)<br/>agents + tools + execution"]
K --> P["JARVIS-Prime (Mind)<br/>routing + reasoning"]
K --> R["ReactorCore (Forge)<br/>training + deployment gates"]
B <--> P
P --> R
R --> P
B --> R
P --> T1["Tier 1: GCP Golden Image"]
T1 -->|"degraded"| T2["Tier 2: Local Apple Silicon"]
T2 -->|"degraded"| T3["Tier 3: API Fallback"]
R --> G["Gate + Probation"]
G -->|"pass"| P
G -->|"fail"| RB["Rollback"]
```
Three repos previously made independent lifecycle decisions (restart/health/kill), which created restart storms, readiness split-brain, and contract drift. This architecture is now unified under a single root authority model.
```mermaid
flowchart TD
U["UNIFIED SUPERVISOR<br/>Root Control Plane"] --> W["RootAuthorityWatcher<br/>Policy Brain"]
U --> O["ProcessOrchestrator<br/>Execution Plane"]
O --> P["JARVIS-Prime<br/>managed mode"]
O --> R["Reactor-Core<br/>managed mode"]
W -->|LifecycleVerdict| O
O -->|ExecutionResult| W
P -->|health + drain contract| W
R -->|health + drain contract| W
W --> H{"Handshake Gate"}
H -->|"schema N/N-1 + capability hash pass"| READY["ALIVE/READY"]
H -->|"contract mismatch"| REJECT["REJECTED"]
W --> E["Escalation Engine"]
E --> D["drain"]
E --> T["SIGTERM"]
E --> K["process-group SIGKILL"]
```
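The escalation ladder in the diagram (drain, then SIGTERM, then process-group SIGKILL) can be sketched as one bounded function. This is a simplified illustration; the `escalate` helper and its drain callable are hypothetical, and the real escalation engine adds identity checks and policy-driven timeouts:

```python
import os
import signal
import subprocess

def escalate(proc: subprocess.Popen, drain, drain_timeout=5.0, term_timeout=5.0) -> str:
    """Bounded kill ladder: drain -> SIGTERM -> process-group SIGKILL."""
    try:
        if drain(timeout=drain_timeout):            # step 1: cooperative drain
            proc.wait(timeout=drain_timeout)
            return "drained"
    except Exception:
        pass                                         # drain failure is expected under crash
    proc.send_signal(signal.SIGTERM)                 # step 2: polite terminate
    try:
        proc.wait(timeout=term_timeout)
        return "terminated"
    except subprocess.TimeoutExpired:
        # step 3: child is its own group leader (start_new_session), kill the whole group
        os.killpg(os.getpgid(proc.pid), signal.SIGKILL)
        proc.wait()
        return "group-killed"

# demo: a child that ignored drain and SIGTERM would reach step 3;
# `sleep` honors SIGTERM, so this run stops at step 2
child = subprocess.Popen(["sleep", "30"], start_new_session=True)
result = escalate(child, drain=lambda timeout: False, term_timeout=2.0)
print(result)  # terminated
```

`start_new_session=True` is what makes step 3 safe: the SIGKILL targets the child's process group, not the supervisor's.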
What we built (21 tasks, 5 waves, 3 repos)
- Wave 0 — Foundation types: canonical lifecycle contracts (`LifecycleAction`, `SubsystemState`, `ProcessIdentity`, `LifecycleVerdict`, policy/timeout structures) + managed-mode contract + golden conformance tests
- Wave 1 — Root authority watcher: lifecycle state machine ownership, verdict emission, incident dedup, and policy/execution separation via `VerdictExecutor`
- Wave 2 — Prime/Reactor conformance: managed-mode behavior (`JARVIS_ROOT_MANAGED`), health envelope enrichment, authenticated `/lifecycle/drain`
- Wave 3 — Orchestrator integration + shadow mode: `ProcessOrchestrator` adapter methods wired; active crash watch (`proc.wait`) + jittered health polling
- Wave 4 — Activation hardening: active verdict dispatch, contract hash gating at boot handshake, policy delegation hooks for restart/health ownership
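Wave 4's contract hash gating can be illustrated with a small sketch: hash a canonicalized contract and admit only peers in the schema N/N-1 window whose capability hash matches. Every name and version number below is an illustrative assumption, not the actual handshake code:

```python
import hashlib
import json

# Hypothetical schema window: accept the current and previous versions only
SUPPORTED_SCHEMAS = {262, 261}

def capability_hash(contract: dict) -> str:
    """Stable hash over a canonicalized contract (sorted keys, no whitespace drift)."""
    canonical = json.dumps(contract, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

def handshake(peer_schema: int, peer_hash: str, local_contract: dict) -> str:
    if peer_schema not in SUPPORTED_SCHEMAS:
        return "REJECTED: schema out of N/N-1 window"
    if peer_hash != capability_hash(local_contract):
        return "REJECTED: contract hash mismatch"
    return "ALIVE/READY"

contract = {"actions": ["drain", "restart"], "health": "v2"}
assert handshake(262, capability_hash(contract), contract) == "ALIVE/READY"
assert handshake(260, capability_hash(contract), contract).startswith("REJECTED")
```

Canonicalizing before hashing is the important detail: two repos that serialize the same contract with different key order must still agree on the hash.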
What this resolved
- Restart storms: single restart policy with budgeted windows and deduplication
- Readiness split-brain: unified two-field liveness/readiness state ownership
- Contract drift: cross-repo managed-mode parity with conformance tests and compatibility gates
- Crash blind spots: ms-latency process-exit detection plus health-path observability
- Competing supervisors: Prime/Reactor demoted to managed mode while root authority owns lifecycle decisions
- Escalation ambiguity: deterministic kill ladder (`drain -> SIGTERM -> process-group SIGKILL`)
- PID reuse risk: identity validation strengthened via multi-factor `ProcessIdentity`
- Control-plane auth gaps: HMAC-authenticated lifecycle commands and session-aware checks
Production rollout path (remaining ops work)
- Shadow soak: run in `shadow` mode and verify decision parity against legacy behavior
- Per-subsystem activation: promote one subsystem at a time (`reactor-core`, then `jarvis-prime`)
- Final policy cut-wire: fully bypass legacy autonomous monitor decisions when delegation flags are enabled
- CI anti-drift: enforce cross-repo parity checks for managed-mode contract files on every PR
Hidden profile bullet packs (copy-ready)
Ultra-short TL;DR
- Triple Authority Fixed: one root control plane governs restart/readiness/lifecycle
- Safe by Contract: managed-mode + authenticated lifecycle endpoints + handshake gating
- Staged Rollout: shadow parity -> subsystem activation -> full active cutover
Recruiter-friendly
- Architecture leadership: unified three competing supervisors into one production control plane
- Reliability outcome: removed restart storms and readiness split-brain via centralized lifecycle policy
- Security hardening: added authenticated lifecycle controls and contract-gated activation
- Operational rigor: designed staged rollout for safe production adoption
Infra-architect
- Control-plane convergence: root watcher owns lifecycle state transitions across Body/Prime/Reactor
- Policy/execution isolation: watcher emits verdicts; orchestrator executes side effects
- Deterministic escalation: bounded `drain -> term -> group-kill` with race-safe identity checks
- Protocol hardening: schema/capability handshake gates + managed-mode health/drain envelopes
- Progressive activation: shadow validation, per-subsystem enablement, legacy path retirement
`unified_supervisor.py` grew into a ~96K-line orchestration monolith with multiple high-impact domains in one file. The risk is not just size; it is coupling density: local edits can create non-local regressions.
```mermaid
flowchart TD
E["Single Entry Point<br/>python3 unified_supervisor.py"] --> S["Kernel Shell (thin)"]
S --> R["Domain Controller Registry"]
R --> L["Lifecycle Controller"]
R --> H["Health Controller"]
R --> W["Workflow Controller"]
R --> M["Resource Controller"]
R --> X["Self-Healing Controller"]
R --> A["AGI/Training Controller"]
L --> C["Contract Boundaries<br/>typed interfaces + DTOs"]
H --> C
W --> C
M --> C
X --> C
A --> C
C --> T["Isolated Domain Tests"]
C --> O["Cross-Domain Observability"]
```
Why this is dangerous
- Reasoning collapse: too many orthogonal responsibilities in one file
- Test isolation gap: difficult to unit-test a single subsystem without broad kernel context
- High merge friction: concentrated edit surface increases conflict rate
- Refactor risk: tooling and human review quality degrade as coupling grows
- Mandate conflict: monolith bottleneck violates "no single structural choke point"
Structural cure path
- Preserve single boot command while shrinking policy from the shell
- Extract domain controllers behind protocol boundaries
- Replace direct cross-calls with typed contract interfaces
- Enforce isolation tests per domain before integration tests
- Ship in waves with parity gates to avoid behavioral drift
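The "typed contract interfaces" step above can be sketched with `typing.Protocol`: extracted controllers satisfy a structural contract instead of inheriting from the kernel. The names below (`DomainController`, `HealthReport`) are illustrative, not the actual extraction targets:

```python
import asyncio
from dataclasses import dataclass
from typing import Protocol, runtime_checkable

@dataclass(frozen=True)
class HealthReport:
    """Typed DTO crossing the domain boundary instead of a raw dict."""
    subsystem: str
    alive: bool
    ready: bool

@runtime_checkable
class DomainController(Protocol):
    """Contract every extracted controller must satisfy."""
    name: str
    async def start(self) -> None: ...
    async def health(self) -> HealthReport: ...
    async def stop(self) -> None: ...

class HealthController:
    # No inheritance from the kernel: conformance is structural
    name = "health"
    async def start(self) -> None: ...
    async def health(self) -> HealthReport:
        return HealthReport(self.name, alive=True, ready=True)
    async def stop(self) -> None: ...

registry: dict[str, DomainController] = {}
ctl = HealthController()
assert isinstance(ctl, DomainController)   # structural check via runtime_checkable
registry[ctl.name] = ctl
print(asyncio.run(ctl.health()).ready)  # True
```

Because conformance is structural, each controller can be unit-tested in isolation against the protocol, with no kernel bootstrapping required.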
Hidden profile bullets (copy-ready)
Ultra-short TL;DR
- Monolith Risk Neutralized (in progress): convert a 96K-line supervisor choke point into contract-bounded controllers
- Single Entry Point Preserved: one boot command, modular internals
- Safer Evolution: isolation tests + parity-gated extraction waves
Recruiter-friendly
- Architecture insight: identified the monolith paradox as the largest systemic reliability and velocity risk
- Execution strategy: designed a phased decomposition that keeps runtime stable while reducing coupling
- Engineering rigor: paired extraction with contract boundaries and isolation testing to prevent regressions
Infra-architect
- Kernel shell model: retain entrypoint authority but move domain policy to controller registry
- Protocol-first decomposition: typed interfaces replace direct cross-domain invocation
- Risk-managed migration: parity validation, observability gates, and staged rollout per domain
Purpose, Problem, Challenge, Solution
- Purpose: Define the three-system operating model (`JARVIS`, `JARVIS-Prime`, `ReactorCore`) under one unified kernel.
- Problem: Most AI systems stop at a single model endpoint and fail at end-to-end autonomy, coordination, and lifecycle management.
- Core Challenge: Keep orchestration, inference, and training decoupled enough to scale independently while still behaving like one product.
- What This Solves: Creates a durable systems contract: `JARVIS` runs operations, `Prime` serves intelligence, `Reactor` continuously improves intelligence.
```mermaid
%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1b27', 'primaryTextColor': '#a9b1d6', 'primaryBorderColor': '#70a5fd', 'lineColor': '#545c7e', 'secondaryColor': '#24283b', 'tertiaryColor': '#1a1b27', 'fontSize': '14px', 'fontFamily': 'JetBrains Mono, monospace' }}}%%
flowchart TD
KERNEL["<b>UNIFIED SUPERVISOR KERNEL</b><br/>Single Entry Point · 50K+ LOC<br/>7-Zone Parallel Initialization"]
KERNEL -->|"orchestrates"| JARVIS
KERNEL -->|"routes inference"| PRIME
KERNEL -->|"triggers training"| REACTOR
subgraph JARVIS["<b>JARVIS — The Body</b> Python / Rust / Swift :8010"]
direction TB
J1["🕸️ Neural Mesh<br/><i>16+ async agents · capability routing</i>"]
J2["🎙️ Voice & Auth<br/><i>ECAPA-TDNN · full-duplex · wake word</i>"]
J3["👁️ Vision & Spatial<br/><i>LLaVA · YOLO · Ghost Display · OCR</i>"]
J4["🍎 macOS Native<br/><i>Swift 203 files · ObjC · Rust · CoreML</i>"]
J5["🧠 Intelligence<br/><i>RAG · Ouroboros · Google Workspace</i>"]
end
subgraph PRIME["<b>JARVIS-Prime — The Mind</b> Python / GGUF :8000-8001"]
direction TB
P1["📡 Task-Type Router<br/><i>11 specialist models · 40.4 GB</i>"]
P2["⚡ Neural Switchboard<br/><i>v98.1 · WebSocket contracts</i>"]
P3["👁️ LLaVA Vision Server<br/><i>multimodal · OpenAI-compatible API</i>"]
P4["💭 Reasoning Engine<br/><i>CoT / ToT / self-reflection</i>"]
P5["📊 Telemetry Capture<br/><i>JSONL · deployment feedback loop</i>"]
end
subgraph REACTOR["<b>ReactorCore — The Forge</b> C++ / Python :8090"]
direction TB
R1["🔥 Training Pipeline<br/><i>LoRA · DPO · RLHF · FSDP</i>"]
R2["🚪 Deployment Gate<br/><i>integrity validation · probation monitor</i>"]
R3["🧬 Model Lineage<br/><i>full provenance chain · append-only JSONL</i>"]
R4["☁️ GCP Spot Recovery<br/><i>checkpoint persistence · 60% cost savings</i>"]
R5["⚙️ C++ Kernels<br/><i>CMake · pybind11 · native performance</i>"]
end
PRIME -.->|"telemetry + experiences"| REACTOR
REACTOR -.->|"improved GGUF models"| PRIME
JARVIS <-.->|"inference requests / responses"| PRIME
REACTOR -.->|"training signals"| JARVIS
style KERNEL fill:#1a1b27,stroke:#70a5fd,stroke-width:2px,color:#70a5fd
style JARVIS fill:#0d1117,stroke:#70a5fd,stroke-width:2px,color:#a9b1d6
style PRIME fill:#0d1117,stroke:#bf91f3,stroke-width:2px,color:#a9b1d6
style REACTOR fill:#0d1117,stroke:#bb9af7,stroke-width:2px,color:#a9b1d6
```
Purpose, Problem, Challenge, Solution
- Purpose: Show the runtime request path from multimodal inputs to routed inference and back to user-visible action.
- Problem: Input streams (voice, screen, command) are heterogeneous and require different model strategies and latencies.
- Core Challenge: Route by task type in real time while capturing high-quality telemetry for future model improvement.
- What This Solves: Demonstrates a closed execution path where each response both serves the user now and improves the system later.
```mermaid
%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1b27', 'primaryTextColor': '#a9b1d6', 'lineColor': '#545c7e', 'fontSize': '13px', 'fontFamily': 'JetBrains Mono, monospace' }}}%%
flowchart LR
A["🎤 Voice Input"] --> B["JARVIS Kernel"]
C["👁️ Screen Capture"] --> B
D["⌨️ User Command"] --> B
B --> E["JARVIS-Prime<br/><i>inference routing</i>"]
E --> F{"Task Type?"}
F -->|"math"| G["Qwen2.5-7B"]
F -->|"code"| H["DeepCoder"]
F -->|"vision"| I["LLaVA"]
F -->|"simple"| J["Fast 2.2GB"]
F -->|"complex"| K["Claude API"]
G & H & I & J & K --> L["Response"]
L --> B
E -->|"telemetry"| M["ReactorCore"]
M -->|"LoRA/DPO training"| N["Improved Model"]
N -->|"deploy + probation"| E
style B fill:#1a1b27,stroke:#70a5fd,stroke-width:2px,color:#70a5fd
style E fill:#1a1b27,stroke:#bf91f3,stroke-width:2px,color:#bf91f3
style M fill:#1a1b27,stroke:#bb9af7,stroke-width:2px,color:#bb9af7
style F fill:#24283b,stroke:#545c7e,stroke-width:1px,color:#a9b1d6
```
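Stripped to its essence, the routing step in the diagram is classify-then-dispatch. The classifier below is a deliberately naive stand-in for the real task-type router, and the route table simply mirrors the diagram's labels; none of it is the shipped code:

```python
import asyncio

# Hypothetical routing table mirroring the diagram above
ROUTES = {
    "math": "qwen2.5-7b",
    "code": "deepcoder",
    "vision": "llava",
    "simple": "fast-2.2gb",
    "complex": "claude-api",
}

def classify(prompt: str, has_image: bool = False) -> str:
    """Toy heuristic classifier standing in for the real task-type router."""
    if has_image:
        return "vision"
    if any(tok in prompt.lower() for tok in ("def ", "class ", "function", "bug")):
        return "code"
    if any(ch.isdigit() for ch in prompt) and any(op in prompt for op in "+-*/="):
        return "math"
    return "simple" if len(prompt) < 120 else "complex"

async def route(prompt: str, **kw) -> tuple[str, str]:
    task = classify(prompt, **kw)
    return task, ROUTES[task]

task, model = asyncio.run(route("fix the bug in this function"))
print(task, model)  # code deepcoder
```

The key property is that the route decision and the telemetry record share the same `task` label, so training data downstream stays aligned with what was actually served.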
Purpose, Problem, Challenge, Solution
- Purpose: Define a deterministic fallback ladder for reliability under changing infrastructure and hardware conditions.
- Problem: A single inference backend is a single point of failure (downtime, cold starts, local resource pressure, API outages).
- Core Challenge: Preserve quality and uptime while controlling cost and avoiding hard dependency on any one execution tier.
- What This Solves: Guarantees service continuity through policy-based failover: `GCP -> Local Metal -> Claude API`.
```mermaid
%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1b27', 'primaryTextColor': '#a9b1d6', 'lineColor': '#545c7e', 'fontSize': '13px', 'fontFamily': 'JetBrains Mono, monospace' }}}%%
flowchart LR
REQ["Inference Request"] --> T1
T1["☁️ Tier 1: GCP Golden Image<br/><i>11 models · ~30s cold start</i>"]
T1 -->|"unavailable"| T2["💻 Tier 2: Local Apple Silicon<br/><i>M1 Metal GPU · on-device</i>"]
T2 -->|"resource constrained"| T3["🔑 Tier 3: Claude API<br/><i>emergency fallback</i>"]
T1 -->|"✅ success"| RES["Response"]
T2 -->|"✅ success"| RES
T3 -->|"✅ success"| RES
style T1 fill:#1a1b27,stroke:#70a5fd,stroke-width:2px,color:#70a5fd
style T2 fill:#1a1b27,stroke:#bf91f3,stroke-width:2px,color:#bf91f3
style T3 fill:#1a1b27,stroke:#bb9af7,stroke-width:2px,color:#bb9af7
style REQ fill:#24283b,stroke:#545c7e,stroke-width:1px,color:#a9b1d6
style RES fill:#24283b,stroke:#545c7e,stroke-width:1px,color:#a9b1d6
```
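A minimal sketch of the ladder's control flow, with jittered per-tier retries before failing over. The backends here are stubs (Tier 1 is simulated as unavailable purely to show the failover path); the real tiers carry health checks and circuit breakers:

```python
import random
import time

class TierUnavailable(Exception): ...

def call_gcp(prompt):                      # stub: simulate Tier 1 cold start / outage
    raise TierUnavailable("cold start")

def call_local(prompt):                    # stub for on-device Metal inference
    return f"[local-metal] {prompt}"

def call_api(prompt):                      # stub for the emergency API fallback
    return f"[claude-api] {prompt}"

TIERS = [("gcp", call_gcp), ("local", call_local), ("api", call_api)]

def infer(prompt: str, retries_per_tier: int = 2) -> tuple[str, str]:
    """Walk the ladder in order; retry each tier with jittered backoff, then fail over."""
    for name, backend in TIERS:
        for attempt in range(retries_per_tier):
            try:
                return name, backend(prompt)
            except TierUnavailable:
                # exponential backoff with full jitter, capped small for the demo
                time.sleep(min(0.05, (2 ** attempt) * 0.01 * random.random()))
    raise RuntimeError("all tiers exhausted")

tier, answer = infer("summarize today's telemetry")
print(tier)  # local
```

In production the same loop would consult a per-tier circuit breaker so a known-dead tier is skipped instantly instead of burning its retry budget.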
Purpose, Problem, Challenge, Solution
- Purpose: Wire autonomy lifecycle events through the Trinity loop so the system can learn from its own autonomous actions.
- Problem: JARVIS Body performs autonomous actions (Google Workspace agent) but the outcomes are not captured as structured training signals.
- Core Challenge: Events must be strictly validated, deduplicated, and classified before reaching the training pipeline — malformed or replayed events would corrupt model weights.
- What This Solves: Creates a closed feedback loop where autonomous actions generate training data, improving future autonomy decisions without manual intervention.
```mermaid
%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1b27', 'primaryTextColor': '#a9b1d6', 'lineColor': '#545c7e', 'fontSize': '13px', 'fontFamily': 'JetBrains Mono, monospace' }}}%%
flowchart TD
AGENT["🤖 Google Workspace Agent<br/><i>execute_task()</i>"]
AGENT -->|"7 event types"| EMIT["📡 _emit_autonomy_event()<br/><i>strict metadata schema</i>"]
EMIT -->|"token-bucket<br/>rate limiter"| FWD["🔀 CrossRepoExperienceForwarder<br/><i>forward_autonomy_event()</i>"]
FWD -->|"ExperienceEvent<br/>(type=METRIC)"| ING["🔬 AutonomyEventIngestor"]
ING --> V{"Validate<br/>7 required keys?"}
V -->|"❌ malformed"| Q["🗃️ Quarantine<br/><i>disk-based · 7d retention</i>"]
V -->|"✅ valid"| D{"Deduplicate<br/>composite key?"}
D -->|"duplicate"| SKIP["⏭️ Skip"]
D -->|"unique"| CLS["🏷️ AutonomyEventClassifier"]
CLS -->|"committed / failed"| TRAIN["🔥 UnifiedPipeline<br/><i>DPO / LoRA training</i>"]
CLS -->|"infrastructure /<br/>excluded"| EXCLUDE["📊 Metrics Only<br/><i>no training</i>"]
AGENT <-.->|"autonomy_policy /<br/>action_plan"| PRIME["💭 JARVIS-Prime<br/><i>policy gate</i>"]
SUP["🛡️ Supervisor Boot"] -->|"check_autonomy_contracts()"| COMPAT{"Schema<br/>Compatible?"}
COMPAT -->|"✅ pass"| FULL["Full Autonomy Mode"]
COMPAT -->|"❌ mismatch"| RO["Read-Only Mode"]
style AGENT fill:#1a1b27,stroke:#70a5fd,stroke-width:2px,color:#70a5fd
style PRIME fill:#1a1b27,stroke:#bf91f3,stroke-width:2px,color:#bf91f3
style ING fill:#1a1b27,stroke:#bb9af7,stroke-width:2px,color:#bb9af7
style TRAIN fill:#1a1b27,stroke:#9ece6a,stroke-width:2px,color:#9ece6a
style Q fill:#1a1b27,stroke:#f7768e,stroke-width:2px,color:#f7768e
style SUP fill:#1a1b27,stroke:#e0af68,stroke-width:2px,color:#e0af68
```
How it works:
- Body emits 7 canonical events — Every autonomous action (email send, calendar create, doc edit) emits a lifecycle event: `intent_written` (about to execute), `committed` (success), `failed` (error), `policy_denied` (blocked by Prime), `deduplicated` (suppressed duplicate), `superseded` (stale intent), `no_journal_lease` (fail-closed safety)
- Strict metadata schema — Each event carries 7 required keys (`autonomy_event_type`, `autonomy_schema_version`, `idempotency_key`, `trace_id`, `correlation_id`, `action`, `request_kind`). Malformed events are quarantined to disk, never silently coerced
- Token-bucket rate limiter — Prevents replay storms during startup reconciliation (default: 50 events/second)
- Effectively-once semantics — Deduplication by composite key `(idempotency_key, autonomy_event_type, trace_id)` with a 50K sliding window
- Centralized classification — `AutonomyEventClassifier` is the single source of truth: only `committed` and `failed` are trainable; infrastructure events are excluded from training but retained for observability
- Boot contract validation — Supervisor checks schema version compatibility across all three repos at startup. Any mismatch degrades to read-only autonomy mode (no autonomous writes)
- Prime as policy gate — Body attaches `autonomy_policy` (allowed/denied actions, risk thresholds) to commands; Prime validates and returns a structured `action_plan` with a `policy_compatible` flag
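The validate, quarantine, and dedup path described above can be sketched in a few lines. The class below is an illustrative reduction, not the production `AutonomyEventIngestor` (the quarantine here is in-memory rather than on disk):

```python
from collections import OrderedDict

REQUIRED_KEYS = {
    "autonomy_event_type", "autonomy_schema_version", "idempotency_key",
    "trace_id", "correlation_id", "action", "request_kind",
}

class AutonomyIngestor:
    """Validate the 7-key envelope, then dedup on a composite key with a bounded window."""

    def __init__(self, window: int = 50_000) -> None:
        self.window = window
        self.seen = OrderedDict()       # insertion-ordered set of composite keys
        self.quarantined = []           # stands in for the disk-based quarantine

    def ingest(self, event: dict) -> str:
        if not REQUIRED_KEYS <= event.keys():
            self.quarantined.append(event)   # never silently coerce malformed events
            return "quarantined"
        key = (event["idempotency_key"], event["autonomy_event_type"], event["trace_id"])
        if key in self.seen:
            return "duplicate"
        self.seen[key] = None
        if len(self.seen) > self.window:     # sliding window: evict the oldest key
            self.seen.popitem(last=False)
        return "accepted"

ing = AutonomyIngestor()
ev = {k: "x" for k in REQUIRED_KEYS}
assert ing.ingest(ev) == "accepted"
assert ing.ingest(ev) == "duplicate"
assert ing.ingest({"action": "send"}) == "quarantined"
```

The bounded window is what makes this "effectively once" rather than "exactly once": a replay older than the eviction horizon could slip through, which is an accepted trade-off for constant memory.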
Purpose, Problem, Challenge, Solution
- Purpose: Enable JARVIS to autonomously detect, generate, validate, and apply code improvements across all three repos (JARVIS, JARVIS-Prime, Reactor-Core) in real time — without human intervention.
- Problem: Cross-repo code application without isolation is dangerous: partial failures leave repos in inconsistent states, no rollback exists, TARGET_MOVED (another commit landing mid-apply) goes undetected, and forensics branches are lost on failure.
- Core Challenge: Production-grade saga apply safety across three independent git repos — ephemeral branch isolation, deterministic lock ordering, ff-only promote gates, and bounded passive observability — all without changing the external execution contract.
- What This Solves (v262.0 B+): Full activation of the autonomous self-development loop with B+ branch-isolated sagas, a passive SagaMessageBus observer, a TestFailureSensor with a real polling watcher, and all 4 P0 config blockers resolved. `JARVIS_SAGA_BRANCH_ISOLATION=true` + `JARVIS_GOVERNANCE_MODE=governed` = fully operational.
```mermaid
%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1b27', 'primaryTextColor': '#a9b1d6', 'lineColor': '#545c7e', 'fontSize': '13px', 'fontFamily': 'JetBrains Mono, monospace' }}}%%
flowchart TD
subgraph INTAKE["Zone 6.9 — Intake Layer (per repo × 3)"]
B["📋 BacklogSensor<br/><i>polls .jarvis/backlog.json · 30s</i>"]
T["🧪 TestFailureSensor + TestWatcher<br/><i>pytest subprocess · streak ≥ 2 · 300s</i>"]
M["⛏️ OpportunityMiner<br/><i>complexity ≥ 10 · 300s</i>"]
V["🎤 VoiceCommandSensor<br/><i>event-driven · always on</i>"]
end
subgraph GLS["Zone 6.8 — Governed Loop Service"]
Q["📥 UnifiedIntakeRouter<br/><i>dedup · priority · human-ack</i>"]
FSM["🔄 PreemptionFsmEngine<br/><i>IDLE→ACTIVE→PAUSED→TERMINAL</i>"]
ORCH["🎯 Orchestrator<br/><i>CLASSIFY→ROUTE→EXPAND→GENERATE→VALIDATE→GATE→APPLY→VERIFY→COMPLETE</i>"]
BUS["📡 SagaMessageBus<br/><i>passive observer · max 500 msgs · TTL 300s</i>"]
end
subgraph SAGA["B+ Saga Apply (branch_isolation=True)"]
PRE["1. Preflight: assert clean worktree"]
BR["2. Create ouroboros/saga-<op_id>/<repo>"]
AP["3. Apply patch + git commit"]
LOCK["Two-Tier Lock:<br/>asyncio.Lock + fcntl.flock<br/><i>sorted order: jarvis → prime → reactor</i>"]
PROM["4. promote_all()<br/><i>check_promote_safe → git merge --ff-only</i>"]
COMP["5. On failure: _bplus_compensate_all()<br/><i>restore original_ref · keep forensics branch</i>"]
end
subgraph JPRIME["GCP J-Prime (Golden Image · 136.113.252.164:8000)"]
GEN["🧠 Code Generation<br/><i>schema 2c.1 · multi-repo patches</i>"]
NOOP["⚡ Noop Fast-Path<br/><i>2b.1-noop → GENERATE→COMPLETE</i>"]
end
B & T & M & V --> Q
Q --> FSM --> ORCH
ORCH -->|"GENERATE"| JPRIME
GEN & NOOP --> ORCH
ORCH -->|"APPLY"| PRE
PRE --> LOCK --> BR --> AP
AP -->|"success"| PROM
AP -->|"failure"| COMP
PROM -->|"SAGA_SUCCEEDED"| BUS
PROM -->|"TARGET_MOVED"| BUS
PROM -->|"SAGA_PARTIAL_PROMOTE"| BUS
COMP -->|"SAGA_ROLLED_BACK"| BUS
ORCH -->|"VERIFY fail"| BUS
style INTAKE fill:#0d1117,stroke:#70a5fd,stroke-width:2px,color:#a9b1d6
style GLS fill:#0d1117,stroke:#bf91f3,stroke-width:2px,color:#a9b1d6
style SAGA fill:#0d1117,stroke:#9ece6a,stroke-width:2px,color:#a9b1d6
style JPRIME fill:#0d1117,stroke:#e0af68,stroke-width:2px,color:#a9b1d6
style BUS fill:#1a1b27,stroke:#7dcfff,stroke-width:2px,color:#7dcfff
style LOCK fill:#1a1b27,stroke:#f7768e,stroke-width:2px,color:#f7768e
```
How it works:
- Zone 6.9 sensors fan out per repo — Each of the three repos (JARVIS, JARVIS-Prime, Reactor-Core) gets its own `BacklogSensor`, `TestFailureSensor` (with a real `TestWatcher` subprocess poller), and `OpportunityMinerSensor`. `VoiceCommandSensor` is always-on and event-driven.
- TestWatcher polls continuously — Runs `pytest` in a subprocess every 300s per repo. Emits a stable `intent:test_failure` envelope only after `streak ≥ 2` consecutive failures, preventing false alarms from transient flakes.
- B+ branch isolation — Every apply creates an ephemeral branch `ouroboros/saga-<op_id>/<repo>`. Patches are committed there. Promote uses `git merge --ff-only`; if the target moved (`TARGET_MOVED`), the gate fails and the saga compensates cleanly.
- Two-tier locking — `asyncio.Lock` (in-process) + `fcntl.flock` (cross-process), acquired in sorted repo name order (jarvis → prime → reactor-core) — deterministic and deadlock-free across concurrent ops.
- SAGA_PARTIAL_PROMOTE — If promotion succeeds for some repos but fails for others, the new `SAGA_PARTIAL_PROMOTE` terminal state triggers a scoped pause (`cross_repo_saga` scope) until the operator reviews the partial state.
- SagaMessageBus — A passive, fault-isolated observer (zero execution authority) records 8 event types: `SAGA_CREATED`, `SAGA_ADVANCED`, `SAGA_COMPLETED`, `SAGA_FAILED`, `SAGA_ROLLED_BACK`, `SAGA_PARTIAL_PROMOTE`, `TARGET_MOVED`, `ANCESTRY_VIOLATION`. Fire-and-forget — a broken bus never blocks an apply.
- SagaLedgerArtifact — A 15-field frozen dataclass records every saga op: original_ref, saga_branch, promoted_sha, rollback_reason, kept_forensics_branches, and timestamp_ns. Full audit trail in the durable ledger.
- J-Prime generates patches — The GCP golden image at `136.113.252.164:8000` generates schema 2c.1 multi-repo patches. A noop fast-path (2b.1-noop) skips directly to `COMPLETE` if the change is already present.
- Voice narration — `VoiceNarrator` announces intent, decision, and postmortem at each significant phase. `OUROBOROS_VOICE_DEBOUNCE_S` prevents over-narration (default 60s).
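The two-tier locking scheme above can be sketched in a few lines. This is a minimal illustration of the sorted-order acquisition idea, not the actual JARVIS implementation; the helper name and lock-file layout are assumptions.

```python
import asyncio
import fcntl
import os
from contextlib import asynccontextmanager

REPOS = ["jarvis", "jarvis-prime", "reactor-core"]
_process_locks = {repo: asyncio.Lock() for repo in REPOS}

@asynccontextmanager
async def acquire_repo_locks(repos, lock_dir="/tmp"):
    """Acquire in-process and cross-process locks in sorted repo order.

    Sorting gives every concurrent operation the same acquisition order,
    which rules out lock-order deadlocks across ops and processes.
    """
    ordered = sorted(repos)
    held_files = []
    held_async = []
    try:
        for repo in ordered:
            await _process_locks[repo].acquire()       # tier 1: in-process
            held_async.append(repo)
            f = open(os.path.join(lock_dir, f"{repo}.lock"), "w")
            fcntl.flock(f, fcntl.LOCK_EX)              # tier 2: cross-process
            held_files.append(f)
        yield
    finally:
        for f in reversed(held_files):                 # release in reverse order
            fcntl.flock(f, fcntl.LOCK_UN)
            f.close()
        for repo in reversed(held_async):
            _process_locks[repo].release()
```

Because every op sorts before acquiring, two ops that want overlapping repo sets always contend on the first shared repo rather than forming a cycle.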
Activation (v262.0 — all green):
# .env (required for full autonomous operation)
JARVIS_GOVERNANCE_MODE=governed
JARVIS_SAGA_BRANCH_ISOLATION=true
JARVIS_SAGA_KEEP_FORENSICS_BRANCHES=true
# Start
python3 unified_supervisor.py --force

What it does:
- Detects opportunities across all 3 repos (test failures, backlog, complexity, voice commands)
- Calls J-Prime on GCP, receives a schema 2c.1 multi-repo patch
- Applies with B+ saga safety — ephemeral branches, two-tier locks, ff-only promote gates, rollback
- Narrates every decision in real time via voice + TUI
- Commits and promotes across jarvis + prime + reactor without human touch
Where it stands vs. Claude Code:
| Capability | Claude Code | Ouroboros v262.0 |
|---|---|---|
| Read arbitrary files during an op | Full Read tool | Partial — TheOracle + context_expander (10 files max) |
| Run bash commands | Yes | No |
| Search the web | Yes | No |
| Edit code iteratively with feedback | Multi-turn, sees results | One-shot patch + apply |
| Test before committing | Runs tests, reads output, fixes | Applies first, verifies after |
| Persistent strategic goal memory | Deep conversation context | Per-op intent only |
Core difference: Claude Code is a full agentic loop with tool use — reads, runs, observes, revises, converges. Ouroboros is a code generation + automated apply pipeline. J-Prime generates a patch once; the B+ saga applies it. No iterative tool-use loop within an operation yet.
What would close the gap:
- Tool use in the generation loop — J-Prime calls `read_file`, `run_command`, and `run_tests` during generation
- Multi-turn op execution — generate → run → observe → revise → converge
- Persistent goal memory — accumulates your long-running intent across sessions into every op's context
- Sandboxed shell — Ouroboros verifies its own changes before committing
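The roadmap items above imply a loop of this shape. This is a hypothetical sketch of the target multi-turn flow, not current Ouroboros behavior; `generate` and `run_tests` are stand-in callables, not existing APIs.

```python
def iterative_op(generate, run_tests, max_rounds=3):
    """Hypothetical generate -> run -> observe -> revise loop.

    `generate(feedback)` returns a candidate patch; `run_tests(patch)`
    returns (passed, output). Neither exists in Ouroboros today; this
    is the target shape, not current behavior.
    """
    feedback = None
    for _ in range(max_rounds):
        patch = generate(feedback)
        passed, output = run_tests(patch)
        if passed:
            return patch                  # converged: tests are green
        feedback = output                 # feed failures back into generation
    return None                           # did not converge within budget
```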
Bottom line: Real, production-grade autonomous code delivery — not a demo. JARVIS will find work, generate patches via J-Prime, and commit across 3 repos without you touching anything. Not yet at Claude Code-level agentic tool use. That is the explicit next evolution.
Purpose, Problem, Challenge, Solution
- Purpose: Run high-throughput inference and training on GCP while preserving local fallback and cost control.
- Problem: On-demand cloud is expensive at scale, while local-only inference cannot absorb peak load or large-model demand.
- Core Challenge: Balance latency, uptime, and spend when Spot VMs can be preempted without warning.
- What This Solves: Introduces hybrid execution with preemption-aware orchestration, checkpoint recovery, and automatic failover to local/API tiers.
%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1b27', 'primaryTextColor': '#a9b1d6', 'lineColor': '#545c7e', 'fontSize': '13px', 'fontFamily': 'JetBrains Mono, monospace' }}}%%
flowchart LR
REQ["Inference / Training Request"] --> ORCH["Hybrid Orchestrator"]
ORCH --> SPOT["GCP Spot VM Pool<br/><i>primary cost-optimized execution</i>"]
ORCH --> LOCAL["Local Apple Silicon Tier<br/><i>low-latency fallback</i>"]
ORCH --> API["Claude API Tier<br/><i>emergency overflow</i>"]
SPOT --> PREEMPT{"Preempted?"}
PREEMPT -->|"no"| RUN["Run Workload"]
PREEMPT -->|"yes"| RECOVER["Resume From Checkpoint"]
RECOVER --> RUN
RUN --> TELE["Telemetry + Cost Signals"]
TELE --> ORCH
RUN --> RES["Response / Model Artifact"]
LOCAL --> RES
API --> RES
style ORCH fill:#1a1b27,stroke:#70a5fd,stroke-width:2px,color:#70a5fd
style SPOT fill:#1a1b27,stroke:#bf91f3,stroke-width:2px,color:#bf91f3
style LOCAL fill:#1a1b27,stroke:#7dcfff,stroke-width:2px,color:#7dcfff
style API fill:#1a1b27,stroke:#bb9af7,stroke-width:2px,color:#bb9af7
style PREEMPT fill:#24283b,stroke:#545c7e,stroke-width:1px,color:#a9b1d6
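The preemption path in the diagram reduces to checkpoint-and-resume. A minimal sketch, assuming JSON checkpoints and a step-indexed workload; the real orchestrator also handles tier failover and cost signals.

```python
import json
import os

def run_with_checkpoints(steps, ckpt_path, work, save_every=10):
    """Resume-from-checkpoint sketch for preemptible Spot workloads.

    `work(step)` performs one unit of work; progress is persisted so a
    preempted VM can restart and continue where it left off.
    """
    start = 0
    if os.path.exists(ckpt_path):
        with open(ckpt_path) as f:
            start = json.load(f)["next_step"]     # resume point after preemption
    for step in range(start, steps):
        work(step)
        if (step + 1) % save_every == 0 or step + 1 == steps:
            tmp = ckpt_path + ".tmp"
            with open(tmp, "w") as f:
                json.dump({"next_step": step + 1}, f)
            os.replace(tmp, ckpt_path)            # atomic checkpoint write
    return steps
```

The write-to-temp-then-rename pattern matters here: a preemption mid-write leaves the previous checkpoint intact instead of a torn file.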
Purpose, Problem, Challenge, Solution
- Purpose: Eliminate repeated cold setup by pre-baking model runtimes and dependencies into immutable machine images.
- Problem: Dynamic provisioning causes long startup times, dependency drift, and inconsistent behavior across nodes.
- Core Challenge: Keep images reproducible and secure while continuously shipping model/runtime updates.
- What This Solves: Establishes an immutable golden-image pipeline with validation gates and rollout controls for consistent low-latency boot.
%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1b27', 'primaryTextColor': '#a9b1d6', 'lineColor': '#545c7e', 'fontSize': '13px', 'fontFamily': 'JetBrains Mono, monospace' }}}%%
flowchart LR
SRC["Model + Runtime Source"] --> BUILD["Image Builder Pipeline"]
BUILD --> BAKE["Bake Golden Image<br/><i>models + deps + startup contracts</i>"]
BAKE --> VALIDATE["Validation Gate<br/><i>health, integrity, startup SLA</i>"]
VALIDATE -->|"pass"| REG["Image Registry"]
VALIDATE -->|"fail"| REJECT["Reject Build"]
REG --> SCALE["Autoscaled GCP Inference Nodes"]
SCALE --> PRIME["JARVIS-Prime Router"]
PRIME --> MON["Observability + Drift Monitoring"]
MON --> BUILD
style BUILD fill:#1a1b27,stroke:#70a5fd,stroke-width:2px,color:#70a5fd
style BAKE fill:#1a1b27,stroke:#bf91f3,stroke-width:2px,color:#bf91f3
style VALIDATE fill:#1a1b27,stroke:#7dcfff,stroke-width:2px,color:#7dcfff
style REG fill:#1a1b27,stroke:#bb9af7,stroke-width:2px,color:#bb9af7
style REJECT fill:#1a1b27,stroke:#f7768e,stroke-width:2px,color:#f7768e
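The validation gate can be illustrated as a pure predicate over a build report. The field names here (`health_ok`, `checksums_ok`, `boot_seconds`) are assumptions for illustration, not the pipeline's actual schema.

```python
def validate_image(report, max_boot_s=30.0):
    """Validation-gate sketch: promote only if health, integrity, and the
    startup SLA all hold; otherwise reject with the failing check names.
    """
    checks = {
        "health": report.get("health_ok", False),
        "integrity": report.get("checksums_ok", False),
        "startup_sla": report.get("boot_seconds", float("inf")) <= max_boot_s,
    }
    failed = [name for name, ok in checks.items() if not ok]
    return (len(failed) == 0, failed)   # (promote to registry?, reasons)
```

Keeping the gate a pure function of the report makes pass/fail decisions reproducible and easy to audit alongside the build.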
Purpose, Problem, Challenge, Solution
- Purpose: Separate operational concerns into control, data, and model planes for clearer ownership and safer evolution.
- Problem: Without plane separation, policy, state, and model behavior become tightly coupled and brittle during scale-out.
- Core Challenge: Enforce governance and safety globally while allowing model and data pipelines to move quickly.
- What This Solves: Makes architecture auditable and composable: control governs, data persists context, models execute decisions.
%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1b27', 'primaryTextColor': '#a9b1d6', 'lineColor': '#545c7e', 'fontSize': '13px', 'fontFamily': 'JetBrains Mono, monospace' }}}%%
flowchart TB
subgraph CONTROL["🛡️ Control Plane"]
C1["Policy Engine"]
C2["Auth + Approval Gates"]
C3["Secrets + Key Management"]
C4["Kill Switch + Guardrails"]
end
subgraph DATA["📦 Data Plane"]
D1["JARVIS Runtime Events"]
D2["Redis + Cloud SQL State"]
D3["ChromaDB / FAISS Memory"]
D4["JSONL Telemetry + Lineage"]
end
subgraph MODEL["🧠 Model Plane"]
M1["Prime Inference Router"]
M2["Tiered Execution (GCP/Local/Claude)"]
M3["Reactor Training Pipeline"]
M4["Deployment Gate + Probation"]
end
CONTROL -->|"policy constraints"| DATA
CONTROL -->|"permit / deny"| MODEL
DATA -->|"context + telemetry"| MODEL
MODEL -->|"decisions + artifacts"| DATA
MODEL -->|"health + risk signals"| CONTROL
style CONTROL fill:#0d1117,stroke:#70a5fd,stroke-width:2px,color:#a9b1d6
style DATA fill:#0d1117,stroke:#bf91f3,stroke-width:2px,color:#a9b1d6
style MODEL fill:#0d1117,stroke:#bb9af7,stroke-width:2px,color:#a9b1d6
Purpose, Problem, Challenge, Solution
- Purpose: Govern shared Apple Silicon UMA memory with explicit, lease-based control across model loads, display surfaces, and agent runtime.
- Problem: GPU/compositor pressure is often invisible to process-level memory metrics, so systems can appear healthy while heading into swap thrash.
- Core Challenge: Coordinate memory decisions across heterogeneous consumers while preventing flapping and preserving critical capabilities.
- What This Solves: Introduces deterministic memory governance with pressure-aware lease grants, stepwise shedding, and crash-safe lease reconciliation.
%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1b27', 'primaryTextColor': '#a9b1d6', 'primaryBorderColor': '#70a5fd', 'lineColor': '#545c7e', 'secondaryColor': '#24283b', 'tertiaryColor': '#1a1b27', 'fontSize': '13px', 'fontFamily': 'JetBrains Mono, monospace' }}}%%
flowchart TB
subgraph OBS["📊 UMA Observability"]
Q["MemoryQuantizer<br/><i>system + process sampling</i>"]
S["Frozen MemorySnapshot<br/><i>headroom, pressure tier, thrash state</i>"]
Q --> S
end
subgraph BROKER["🧠 MemoryBudgetBroker"]
B1["Lease Manager<br/><i>grant / deny / preempt</i>"]
B2["Budget Engine<br/><i>tier multipliers + safety reserve</i>"]
B3["Recovery Ledger<br/><i>epoch fencing + stale lease reclaim</i>"]
end
subgraph CONSUMERS["📦 Lease Holders"]
M["Model Loaders<br/><i>LLM, vision, speaker ID</i>"]
A["Agent Runtime<br/><i>mesh workers + queues</i>"]
D["Ghost Display<br/><i>display:ghost@v1</i>"]
end
subgraph CONTROL["🖥️ DisplayPressureController"]
C1["Policy State Machine<br/><i>one-step downgrade invariant</i>"]
C2["Shedding Ladder<br/><i>1080p -> 900p -> 720p -> 576p -> off</i>"]
C3["Flap Guards<br/><i>dwell, cooldown, rate limits</i>"]
end
S -->|"pressure tier + headroom"| B2
B2 --> B1
B3 --> B1
B1 -->|"lease outcomes"| M
B1 -->|"lease outcomes"| A
B1 -->|"lease outcomes"| D
B1 -->|"pressure signal"| C1
C1 --> C2
C2 -->|"resolution action"| D
C1 --> C3
C3 -->|"allow / delay"| C2
D -->|"amend_lease_bytes"| B1
B1 -->|"events + decisions"| T["Telemetry Pipeline"]
T -->|"drift + anomaly feedback"| Q
style OBS fill:#0d1117,stroke:#70a5fd,stroke-width:2px,color:#a9b1d6
style BROKER fill:#0d1117,stroke:#bf91f3,stroke-width:2px,color:#a9b1d6
style CONSUMERS fill:#0d1117,stroke:#bb9af7,stroke-width:2px,color:#a9b1d6
style CONTROL fill:#0d1117,stroke:#7dcfff,stroke-width:2px,color:#a9b1d6
style T fill:#24283b,stroke:#545c7e,stroke-width:1px,color:#a9b1d6
Key design decisions
- Lease-first memory policy — Components must request memory leases before expensive allocations; brokered leases are the source of truth.
- Typed pressure tiers — Budget aggressiveness changes by pressure tier to avoid hardcoded, brittle thresholds.
- Deterministic shedding — Display degradation follows ordered one-step transitions, preventing abrupt multi-level drops.
- Flap prevention controls — Dwell windows, cooldowns, and rate limits stop oscillation under noisy pressure signals.
- Crash-safe reconciliation — Epoch fencing and stale lease recovery reclaim orphaned allocations after process failures.
- Closed-loop observability — Broker and controller events feed telemetry so memory policy can be calibrated over time.
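The deterministic-shedding and flap-guard decisions above combine into a small pure function. A sketch with illustrative pressure-tier names and an assumed dwell window, not the controller's real policy table:

```python
LADDER = ["1080p", "900p", "720p", "576p", "off"]

def next_resolution(current, pressure_tier, seconds_since_change, dwell_s=10.0):
    """One-step shedding sketch: move at most one rung per decision,
    and only after a dwell window, so noisy pressure readings cannot
    cause multi-level drops or oscillation.
    """
    if seconds_since_change < dwell_s:
        return current                           # flap guard: still dwelling
    i = LADDER.index(current)
    if pressure_tier in ("warning", "critical") and i < len(LADDER) - 1:
        return LADDER[i + 1]                     # degrade one step
    if pressure_tier == "normal" and i > 0:
        return LADDER[i - 1]                     # recover one step
    return current
```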
Purpose, Problem, Challenge, Solution
- Purpose: Document the decision policy from risk classification to approval, execution, blocking, and audit.
- Problem: Autonomous systems can perform high-impact actions where incorrect execution is costly or irreversible.
- Core Challenge: Balance autonomy and velocity with explicit human control boundaries for high-risk operations.
- What This Solves: Provides a predictable safety envelope: low-risk auto-exec, medium-risk constrained mode, high-risk human-in-the-loop.
%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1b27', 'primaryTextColor': '#a9b1d6', 'lineColor': '#545c7e', 'fontSize': '13px', 'fontFamily': 'JetBrains Mono, monospace' }}}%%
flowchart LR
IN["Incoming Action"] --> CLASS["Risk Classifier"]
CLASS -->|"low risk"| AUTO["Auto Execute"]
CLASS -->|"medium risk"| SAFE["Safe Mode + Limits"]
CLASS -->|"high risk"| HITL["Human Approval Required"]
SAFE --> EXEC["Controlled Execution"]
HITL -->|"approved"| EXEC
HITL -->|"denied"| BLOCK["Blocked + Logged"]
EXEC --> MON["Runtime Monitor"]
MON -->|"policy violation"| TRIP["Circuit Breaker Trip"]
TRIP --> FB["Fallback Route / Degrade Gracefully"]
MON -->|"healthy"| OK["Commit Result"]
BLOCK --> AUD["Audit Trail"]
FB --> AUD
OK --> AUD
style IN fill:#24283b,stroke:#545c7e,stroke-width:1px,color:#a9b1d6
style CLASS fill:#1a1b27,stroke:#70a5fd,stroke-width:2px,color:#70a5fd
style HITL fill:#1a1b27,stroke:#ffb86c,stroke-width:2px,color:#ffb86c
style TRIP fill:#1a1b27,stroke:#f7768e,stroke-width:2px,color:#f7768e
style AUD fill:#1a1b27,stroke:#bb9af7,stroke-width:2px,color:#bb9af7
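The classification policy above reduces to a small routing function. A sketch with illustrative field names; `approve` stands in for the human-in-the-loop gate and is an assumption, not a real JARVIS API.

```python
def route_action(action, approve):
    """Risk-policy sketch: low risk auto-executes, medium risk runs in
    constrained safe mode, high risk requires human approval before
    execution. Unknown risk defaults to the safest (high-risk) path.
    """
    risk = action.get("risk", "high")            # fail closed, not open
    if risk == "low":
        return "auto_execute"
    if risk == "medium":
        return "safe_mode_execute"
    return "execute" if approve(action) else "blocked_and_logged"
```

Defaulting unclassified actions to the high-risk path is the key safety property: a classifier gap degrades to asking a human, never to silent execution.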
Purpose, Problem, Challenge, Solution
- Purpose: Show how runtime signals become training data, deployment decisions, and measurable model upgrades.
- Problem: Teams often collect telemetry but fail to operationalize it into safe, repeatable improvement cycles.
- Core Challenge: Detect regressions early, gate bad models, and continuously retrain without destabilizing production.
- What This Solves: Establishes a true learning loop: observe → detect → curate → train → gate/probation → deploy or rollback.
%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1b27', 'primaryTextColor': '#a9b1d6', 'lineColor': '#545c7e', 'fontSize': '13px', 'fontFamily': 'JetBrains Mono, monospace' }}}%%
flowchart LR
RUN["Live Inference + Agent Runtime"] --> OTEL["OpenTelemetry Traces/Metrics"]
RUN --> LOGS["Structured JSONL Logs"]
RUN --> COST["LangFuse + Helicone + PostHog"]
OTEL --> HUB["Unified Observability Hub"]
LOGS --> HUB
COST --> HUB
HUB --> ALERT["Anomaly/Regression Detection"]
ALERT -->|"critical"| ROLLBACK["Auto Rollback / Gate Fail"]
ALERT -->|"acceptable"| CURATE["Telemetry Curation"]
CURATE --> TRAIN["Reactor Training (LoRA/DPO/RLHF)"]
TRAIN --> GATE["Deployment Gate + Probation"]
GATE -->|"pass"| PRIME["Prime Model Registry"]
GATE -->|"fail"| ROLLBACK
PRIME --> RUN
style RUN fill:#1a1b27,stroke:#70a5fd,stroke-width:2px,color:#70a5fd
style HUB fill:#1a1b27,stroke:#bf91f3,stroke-width:2px,color:#bf91f3
style TRAIN fill:#1a1b27,stroke:#bb9af7,stroke-width:2px,color:#bb9af7
style ROLLBACK fill:#1a1b27,stroke:#f7768e,stroke-width:2px,color:#f7768e
Agent Architecture
- Neural Mesh — 16+ specialized agents (activity recognition, adaptive resource governor, context tracker, error analyzer, goal inference, Google Workspace, health monitor, memory, pattern recognition, predictive planning, spatial awareness, visual monitor, web search, coordinator) with asynchronous message passing, capability-based routing, and cross-agent data flow
- Autonomous Agent Runtime — multi-step goal decomposition, agentic task execution, tool orchestration, error recovery, and intervention decision engine with human-in-the-loop approval for destructive actions
- AGI OS Coordinator — proactive event stream, notification bridge, owner identity service, voice approval manager, and intelligent startup announcer
Voice and Authentication
- Real-time voice biometric authentication via ECAPA-TDNN speaker verification with cloud/local hybrid inference and multi-factor fusion (voice + proximity + behavioral)
- Real-time voice conversation — full-duplex audio (simultaneous mic + speaker), acoustic echo cancellation (speexdsp), streaming STT (faster-whisper), adaptive turn detection, barge-in control, and sliding 20-turn context window
- Wake word detection (Porcupine/Picovoice), Apple Watch Bluetooth proximity auth, continuous learning voice profiles
- Unified speech state management — STT hallucination guard, voice pipeline orchestration, parallel model loading
Vision and Spatial Intelligence
- Never-skip screen capture — two-phase monitoring (always-capture + conditional-analysis), self-hosted LLaVA multimodal analysis, Claude Vision escalation
- Ghost Display — virtual macOS display for non-intrusive background automation, Ghost Hands orchestrator for autonomous visual workflows
- Claude Computer Use — automated mouse, keyboard, and screenshot interaction via Anthropic's Computer Use API
- OCR / OmniParser — screen text extraction, window analysis, workspace name detection, multi-monitor and multi-space intelligence via yabai window manager
- YOLO + Claude hybrid vision — object detection with LLM-powered semantic understanding
- Rust vision core — native performance for fast image processing, bloom filter networks, and sliding window analysis
macOS Native Integration (Swift / Objective-C / Rust)
- Swift bridge (203 files) — CommandClassifier, SystemControl (preferences, security, clipboard, filesystem), PerformanceCore, ScreenCapture, WeatherKit, CoreLocation GPS
- Objective-C voice unlock daemon — JARVISVoiceAuthenticator, JARVISVoiceMonitor, permission manager, launchd service integration
- Rust performance layer — PyO3 bindings for memory pool management, quantized ML inference, vision fast processor, command classifier, health predictor; ARM64 SIMD assembly optimizations
- CoreML acceleration — on-device intent classification, voice processing
Infrastructure and Reliability
- Parallel initializer with cooperative cancellation, adaptive EMA-based deadlines, dependency propagation, and atomic state persistence
- CPU-pressure-aware cloud shifting — automatic workload offload to GCP when local resources are constrained
- Enterprise hardening — dependency injection container, enterprise process manager, system hardening, governance, Cloud SQL with race-condition-proof proxy management, TLS-safe connection factories, distributed lock manager
- Three-tier inference routing: GCP Golden Image (primary) → Local Apple Silicon (fallback) → Claude API (emergency)
- Trinity event bus — cross-repo IPC hub, heartbeat publishing, knowledge graph, state management, process coordination
- Cost tracking and rate limiting — GCP cost optimization with Bayesian confidence fusion, intelligent rate orchestration
- File integrity guardian — pre-commit integrity verification across the codebase
Intelligence and Learning
- Google Workspace Agent — Gmail read/search/draft, Google Calendar, natural language intent routing via tiered command router
- Proactive intelligence — predictive suggestions, proactive vision monitoring, proactive communication, emotional intelligence module
- RAG pipeline — ChromaDB vector store, FAISS similarity search, embedding service, long-term memory system
- Chain-of-thought / reasoning graph engine — LangGraph-based multi-step reasoning with conditional routing and reflection loops
- Ouroboros (v262.0 B+ — fully activated) — autonomous self-development across JARVIS, JARVIS-Prime, and Reactor-Core: B+ branch-isolated saga applies (ephemeral branches, two-tier locks, ff-only promote gates, rollback-via-branch-delete), SagaMessageBus passive observer, TestFailureSensor with real TestWatcher per repo, GCP J-Prime code generation (schema 2c.1), voice narration at every decision phase
- Web research service — autonomous web search and information synthesis
- A/B testing framework — vision pipeline experimentation
- Repository intelligence — code ownership analysis, dependency analyzer, API contract analyzer, AST transformer, cross-repo refactoring engine
Inference and Routing
- 11 specialist GGUF models (~40.4 GB) pre-baked into a GCP golden image with ~30-second cold starts
- Task-type routing — math queries hit Qwen2.5-7B, code queries hit DeepCoder, simple queries hit a 2.2 GB fast model, vision hits LLaVA
- GCP Model Swap Coordinator with intelligent hot-swapping, per-model configuration, and inference validation
- Neural Switchboard v98.1 — stable public API facade over routing and orchestration with WebSocket integration contracts
- Hollow Client mode for memory-constrained hardware — strict lazy imports, zero ML dependencies at startup on 16 GB machines
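Task-type routing of this kind is essentially a dispatch table with a fast-model default. A sketch; the keys and model labels are illustrative shorthand, not exact registry names.

```python
ROUTES = {
    "math": "qwen2.5-7b",      # math queries -> math specialist
    "code": "deepcoder",       # code queries -> code specialist
    "vision": "llava",         # multimodal queries -> vision model
}

def pick_model(task_type):
    """Task-type routing sketch: specialist models for math, code, and
    vision; everything else falls through to the small fast model.
    """
    return ROUTES.get(task_type, "fast-2.2gb")
```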
Reasoning and Telemetry
- Continuous learning hook — post-inference experience recording for Elastic Weight Consolidation via ReactorCore
- Reasoning engine activation — chain-of-thought scaffolding (CoT/ToT/self-reflection) for high-complexity requests above configurable thresholds
- APARS protocol (Adaptive Progress-Aware Readiness System) — 6-phase startup with real-time health reporting to the supervisor
- LLaVA vision server — multimodal inference on port 8001 with OpenAI-compatible API, semaphore serialization, queue depth cap
- Telemetry capture — structured JSONL interaction logging with deployment feedback loop and post-deployment probation monitoring
Training Pipeline
- Full training pipeline: telemetry ingestion → active learning selection → gatekeeper evaluation → LoRA SFT → GGUF export → deployment gate → probation monitoring → feedback loop
- DeploymentGate validates model integrity before deployment; rejects corrupt or degenerate outputs
- Post-deployment probation — 30-minute health monitoring window with automatic commit or rollback based on live inference quality
- Model lineage tracking — full provenance chain (hash, parent model, training method, evaluation scores, gate decision) in append-only JSONL
- Tier-2/Tier-3 runtime orchestration — curriculum learning, meta-learning (MAML), causal discovery with correlation-based fallback, world model training
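The lineage chain described above can be sketched as append-only JSONL, with each record hashing the artifact and pointing at its parent. The field names mirror the provenance fields listed, but the helper itself is illustrative, not the ReactorCore implementation.

```python
import hashlib
import json

def append_lineage(path, model_bytes, parent, method, scores, gate_decision):
    """Lineage sketch: one JSONL line per model version, forming a
    hash-linked provenance chain that is never rewritten in place.
    """
    record = {
        "hash": hashlib.sha256(model_bytes).hexdigest(),
        "parent_model": parent,               # hash of the model it was trained from
        "training_method": method,            # e.g. "lora_sft", "dpo"
        "evaluation_scores": scores,
        "gate_decision": gate_decision,       # "pass" / "fail"
    }
    with open(path, "a") as f:                # append-only: history is immutable
        f.write(json.dumps(record) + "\n")
    return record
```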
Infrastructure and Integration
- GCP Spot VM auto-recovery with training checkpoint persistence and 60% cost reduction over on-demand instances
- Native C++ training kernels via CMake/pybind11/cpp-httplib for performance-critical operations
- Atomic experience snapshots — buffer drain under async lock, JSONL with DataHash for dataset versioning
- PrimeConnector — WebSocket path rotation, health polling fallback, contract path discovery for cross-repo communication
- Cross-repo integration — Ghost Display state reader, cloud mode detection, Trinity Unified Loop Manager, pipeline event logger with correlation IDs
| Metric | Value |
|---|---|
| Total commits | 3,900+ across 3 repositories |
| Codebase | ~2.5 million lines across 18+ languages |
| Build duration | 12 months, solo |
| Unified kernel | 50,000+ lines in a single orchestration file |
| Neural Mesh agents | 16+ specialized agents with async message passing |
| Models served | 11 specialist GGUF models via task-type routing |
| Inference tiers | GCP Golden Image → Local Metal GPU → Claude API |
| Training pipeline | Automated: telemetry → active learning → gatekeeper → training → GGUF export → deployment gate → probation → feedback |
| Voice auth | Multi-factor: ECAPA-TDNN biometric + Apple Watch proximity + behavioral analysis |
| Vision pipeline | Never-skip capture, LLaVA self-hosted, Claude escalation, YOLO hybrid, OCR/OmniParser |
| Swift components | 203 files — system control, command classifier, screen capture, GPS, weather |
| Rust crates | 5 Cargo workspaces — memory pool, vision processor, ML inference, SIMD optimizations |
| Terraform modules | 7 modules (compute, network, security, storage, monitoring, budget, Spot templates) |
| Dockerfiles | 6 (backend, backend-slim, frontend, training, cloud, GCP inference) |
| GitHub Actions | 20+ workflows (CI/CD, CodeQL, e2e testing, deployment, database validation, file integrity) |
| macOS integration | Native Swift/ObjC daemons, yabai WM, Ghost Display, multi-space/multi-monitor, launchd services |
| Cloud infrastructure | GCP (Compute Engine, Cloud SQL, Cloud Run, Secret Manager, Monitoring), Spot VM auto-recovery |
| Google Workspace | Gmail read/search/draft, Calendar, natural language routing via tiered command router |
I graduated from Cal Poly San Luis Obispo with a B.S. in Computer Engineering after a 10-year non-traditional academic path that started in remedial algebra at community college. I retook courses, studied through the loss of family, and spent most of my twenties earning a degree that others finish in four years. The path was not conventional. The outcome was.
JARVIS is what happens when that level of persistence meets engineering capability. Twelve months of daily commits, architectural decisions at every layer of the stack, and a refusal to ship anything that is not production-grade.





