drussell23/README.md

For the past 12 months, I have been executing a solo build of JARVIS — a three-repository, multi-process autonomous AI operating system spanning Python, C++, Rust, Go, Swift, Objective-C, and TypeScript. The system orchestrates 60+ asynchronous agents across a neural mesh, routes inference dynamically between local Apple Silicon and GCP, performs real-time voice biometric authentication, controls macOS at the native API level, and continuously trains its own models through a self-improving feedback loop.


Now Vibing

Pushin P · DS4EVER · Click to listen on Spotify


Currently Building



Tech Stack

Languages

Objective-C · ARM64 Assembly · Metal Shading · SQL · AppleScript · Protobuf · HCL · CUDA

ML, Inference and Data

Infrastructure and Cloud

Backend and Frontend

Full Stack Inventory (text)
| Category | Technologies |
| --- | --- |
| Languages | Python, C, C++, Rust, Go, Swift, Objective-C, Objective-C++, TypeScript, JavaScript, SQL, Shell/Bash, ARM64 Assembly (NEON SIMD), Metal Shading Language, AppleScript, Protobuf, HCL/Terraform, CUDA, HTML/CSS |
| ML / Inference | PyTorch, Transformers, llama.cpp, llama-cpp-python, GGUF quantization, ONNX Runtime, CoreML Tools, SpeechBrain, scikit-learn, SentenceTransformers, HuggingFace Hub, safetensors, tiktoken, Numba (JIT), sympy, LangChain, YOLO |
| Training | LoRA, DPO, RLHF, FSDP, MAML (meta-learning), curriculum learning, federated learning, causal reasoning, world model training, online learning, active learning, EWC |
| Models / Vision | LLaVA (multimodal), ECAPA-TDNN (speaker verification), Whisper (faster-whisper, openai-whisper), Porcupine/Picovoice (wake word), Piper TTS, OmniParser (OCR) |
| LLM APIs | Anthropic Claude API (chat, vision, computer use), OpenAI API (chat completions, embeddings), Google Gemini API, Ollama (local inference) |
| Rust | PyO3, ndarray, rayon, parking_lot, DashMap, crossbeam, serde, mimalloc, image crate, Metal (GPU compute), tokio, zstd, lz4, candle (on-device ML) |
| Swift / macOS | Swift Package Manager, CoreLocation, WeatherKit, AppKit, Foundation, Quartz/CoreGraphics, Accessibility API, AVFoundation, pyobjc, launchd, osascript, yabai |
| Vector / Data | ChromaDB, FAISS, Redis, PostgreSQL (asyncpg, psycopg2), SQLite (aiosqlite), NetworkX, bloom filters |
| Infrastructure | GCP (Compute Engine, Cloud SQL, Cloud Run, Secret Manager, Monitoring), Docker, docker-compose, Terraform, Kubernetes, systemd, CMake, pybind11, cpp-httplib |
| CI/CD | GitHub Actions (30+ workflows), CodeQL, Super-Linter, Dependabot, Gitleaks, Postman/Newman, git worktrees |
| Backend | FastAPI, uvicorn, uvloop, gRPC, Protobuf, asyncio, aiohttp, httpx, WebSocket, Cloud SQL Proxy, circuit breakers, exponential backoff, distributed locks, epoch fencing |
| Observability | OpenTelemetry (tracing + metrics + OTLP/gRPC export), Prometheus, structlog, psutil, Pydantic, JSONL telemetry pipeline, LangFuse, Helicone, PostHog |
| Frontend | React 19, Next.js, Framer Motion, Axios, WebSocket real-time streaming |
| Audio / Vision | OpenCV, sounddevice, PyAudio, webrtcvad (VAD), Silero VAD, speexdsp (AEC), librosa, pyautogui, CoreML VAD, Tesseract OCR |
| Voice / TTS | ElevenLabs, GCP TTS, Piper TTS, Edge-TTS, gTTS, pyttsx3, macOS Say, Wav2Vec2 |
| C++ (ReactorCore) | Custom mlforge ML library: KD-trees, graph structures, trie, matrix ops, linear/logistic regression, decision trees, neural nets, model serialization, deployment API |
| AI Orchestration | LangChain, LangGraph, CrewAI, OpenHands, Open Interpreter, OmniParser |
| Experiment Tracking | Weights & Biases (wandb), TensorBoard |
| Browser Automation | Playwright, DuckDuckGo Search, Beautiful Soup |
| Quality / Linting | pytest, Ruff, Black, isort, Flake8, mypy, Pyright, Bandit, ESLint, pre-commit |
| Notifications | Discord, Slack, Telegram, SMTP/Email |
| External APIs | OpenWeather, Alpha Vantage, News API, Wikipedia API, Google Safe Browsing |

AI Tools & Development

Claude Cursor Claude Code Gemini OpenAI HuggingFace Ollama W&B LangChain ElevenLabs Playwright Postman

Full AI & Dev Tools Inventory
| Category | Tools |
| --- | --- |
| LLM Platforms | Anthropic Claude (chat, vision, computer use), OpenAI (Whisper, embeddings), Google Gemini, Ollama, HuggingFace Transformers, llama.cpp (GGUF), Apple MLX, Candle (Rust ML), ONNX Runtime, CoreML |
| AI Development | Cursor IDE, Claude Code CLI, Claude GitHub Actions (5 workflows: PR analyzer, docs generator, test generator, security analyzer, auto-fix) |
| AI Orchestration | LangChain, LangGraph, CrewAI (multi-agent), OpenHands (coding assistant), Open Interpreter, OmniParser (vision parsing) |
| Experiment Tracking | Weights & Biases (wandb), TensorBoard, LangFuse (LLM observability), Helicone (LLM cost tracking), PostHog (product analytics) |
| Voice & Audio | OpenAI Whisper, Faster-Whisper, SpeechBrain, Wav2Vec2, ElevenLabs TTS, GCP TTS, Piper TTS, Edge-TTS, gTTS, pyttsx3, Picovoice/Porcupine (wake word), WebRTC VAD, Silero VAD, CoreML VAD |
| Browser Automation | Playwright, DuckDuckGo Search, Beautiful Soup, Google Safe Browsing API |
| Testing & Quality | pytest, Ruff, Black, isort, Flake8, mypy, Pyright, Bandit, ESLint, Super-Linter, CodeQL, Dependabot, Gitleaks, Postman/Newman, pre-commit hooks |
| Notifications | Discord, Slack, Telegram, SMTP/Email (Gmail) |
| External Data APIs | OpenWeather, Alpha Vantage (stocks), News API, Wikipedia API, Google NotebookLM |

Demo

JARVIS Context Awareness Demo



JARVIS startup interface screenshot
JARVIS Boot Sequence Interface  ·  System startup telemetry and cognitive engine warmup state



Watch Full Demo


Data Structures & Algorithms

Every component below is production code running in the JARVIS ecosystem — not academic exercises.

Data Structures (50+ types)
| Category | Structures | Implementation |
| --- | --- | --- |
| Trees | Quadtree (spatial indexing), KD-Tree (nearest neighbor + radius search), Trie (prefix search), DAG (startup dependency graph), Scene Graph, Knowledge Graph, Process Tree | Python + Rust + C++ |
| Graphs | Reasoning Graph, Dependency Graph, Multi-Space Context Graph, Window Relationship Graph, Service Mesh Discovery Graph, LangGraph state machines, Causal Graphs (do-calculus) | Python |
| Hash-Based | Bloom Filters (3 languages), LSH Semantic Cache, LRU Cache, TTL Cache, Consistent Hashing, DashMap (lock-free concurrent), Bitmaps/Bitsets | Python + Rust + Swift |
| Heaps & Queues | Binary Heap (heapq), Priority Queue, Bounded Queue, Ring Buffer, Circular Buffer, Work-Stealing Queue, Zero-Copy IPC (mmap), Lock-Free SPSC Queue | Python + Rust + JS |
| Concurrent | `Arc<Mutex<>>`, RwLock, DashMap, mpsc channels, Vector Clock, CRDT, Distributed Lock, asyncio.Queue | Rust + Python |
| Matrices & Tensors | Matrix2D, Matrix3D (row-major), Sparse Matrices (nalgebra-sparse), PyTorch Tensors, Quantized Tensors (INT8/INT4), Embedding Vectors | Rust + C++ + Python |
| Memory | Memory Pool, Slab Allocator, Zero-Copy Buffers, Object Recycler, mmap Ring Buffers | Rust + Python |
| State | Finite State Machine, Event Bus, Event Store, Sliding Window, Bounded Collections | Python |
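As a flavor of the hash-based structures above, here is a minimal bloom-filter sketch in Python. The class and parameters are illustrative only, not JARVIS's actual implementation (which the table says exists in three languages):

```python
import hashlib

class BloomFilter:
    """Minimal bloom filter: cheap membership tests with no false negatives,
    and a tunable (small) false-positive rate."""

    def __init__(self, size_bits: int = 1 << 16, num_hashes: int = 4):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item: str):
        # Derive k independent bit positions from one SHA-256 digest.
        digest = hashlib.sha256(item.encode()).digest()
        for i in range(self.num_hashes):
            chunk = int.from_bytes(digest[i * 4:(i + 1) * 4], "big")
            yield chunk % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # True may be a false positive; False is always definitive.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))
```

Used as a negative cache (one row above), a `False` answer lets the caller skip an expensive lookup with certainty.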
Algorithms (80+ implementations)
| Category | Algorithms | Where |
| --- | --- | --- |
| Resilience | Circuit Breaker (5 variants), Exponential Backoff w/ Jitter, Graceful Degradation, Self-Healing, Leader Election, Distributed Locking, Distributed Transactions, Distributed Dedup | JARVIS + Prime |
| Scheduling | Round Robin, Token Bucket, Leaky Bucket, Sliding Window Rate Limiter, Work Stealing, Backpressure Control, Adaptive ML-Based Rate Limiting | All three repos |
| Graph / Search | Topological Sort (DAG), BFS/DFS, A* Search, Dijkstra's Shortest Path, K-Nearest Neighbor, PageRank (file importance ranking) | All three repos |
| Statistical / Bayesian | Bayesian Inference (Beta-Bernoulli, Normal-Normal posteriors), Bayesian Confidence Fusion, Multi-Armed Bandit (Thompson Sampling, epsilon-greedy), Monte Carlo Validation, Kalman Filter (RSSI smoothing), Markov Chain Prediction | JARVIS + Prime |
| ML Training | LoRA/QLoRA, DPO (preference optimization), RLHF (PPO pipeline), FSDP (parameter sharding), MAML/Reptile (meta-learning), Federated Learning (FedAvg, FedProx, Byzantine-robust), Curriculum Learning, Causal Reasoning (do-calculus), Online Learning w/ EWC, World Model Training (Dreamer/MuZero-inspired), Knowledge Distillation (Hinton, FitNets, attention transfer, multi-teacher), Gradient Accumulation, Mixed Precision (BF16/FP16) | ReactorCore + Prime |
| ML Inference | Quantized INT8/INT4, Cosine Similarity, LSH, Vector Search, Anomaly Detection, Pattern Recognition, Goal Inference, Activity Recognition, Tiered Complexity Routing, Flash Attention | JARVIS + Prime |
| Neural Networks | Multi-Head Self-Attention, Dropout, BatchNorm, LayerNorm, LSTM + Attention, Feedforward w/ Backpropagation, Cognitive Layers (cross-attention + residual) | All three repos |
| Clustering & Reduction | K-Means, DBSCAN, PCA, Truncated SVD, TF-IDF Vectorization | JARVIS + Reactor |
| Ensemble Methods | Random Forest, Gradient Boosting, Isolation Forest, Ensemble STT (multi-model voting), Weighted Model Ensemble (majority/cascade) | JARVIS + Reactor |
| Signal Processing | VAD (WebRTC + Silero + CoreML), MFCC/Mel Filterbanks, Spectrogram, Anti-Spoofing, Barge-In Detection, ECAPA-TDNN Speaker Verification | JARVIS |
| Compression | Zstd, LZ4, Gzip/Zlib, Custom Vision Compression | Rust + Python |
| Cryptography | HMAC, SHA-256, MD5, JWT, Secure Password Hashing, File Integrity Checksums, Checkpoint Verification | All three repos |
| Caching | LRU Eviction, TTL Eviction, Predictive Cache Warming (EWMA + time-series), LSH Semantic Cache, Bloom Filter Negative Cache, Memoization (lru_cache) | All three repos |
| Evolutionary | Genetic Algorithm (Ouroboros self-programming loop, B+ branch-isolated sagas, v262.0 fully activated) | JARVIS |
| Concurrency | Deadlock Prevention, CPU Affinity Pinning, Parallel DAG Initialization, Zero-Copy mmap IPC, Lock-Free Channels | JARVIS + Prime |
| GPU / SIMD | Metal Compute Shaders, ARM64 NEON SIMD Intrinsics | JARVIS (Rust + C + Assembly) |
| C++ ML (mlforge) | Linear Regression (Ridge/Lasso), Logistic Regression, Decision Tree (Gini), Neural Net (backprop), Matrix Serialization, KD-Tree, Graph (BFS/DFS), Trie | ReactorCore |
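The resilience row above pairs circuit breakers with jittered exponential backoff. A compact Python sketch of that pattern follows; the state handling and thresholds are illustrative, not the project's five production variants:

```python
import random
import time

class CircuitBreaker:
    """Sketch of a circuit breaker: closed -> open after repeated failures,
    then a jittered backoff window before allowing a recovery probe."""

    def __init__(self, failure_threshold: int = 3,
                 base_delay: float = 1.0, max_delay: float = 60.0):
        self.failure_threshold = failure_threshold
        self.base_delay = base_delay
        self.max_delay = max_delay
        self.failures = 0
        self.opened_at: float | None = None

    def _backoff(self) -> float:
        # Exponential backoff with full jitter, capped at max_delay.
        exp = min(self.max_delay, self.base_delay * (2 ** (self.failures - 1)))
        return random.uniform(0, exp)

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self._backoff():
                raise RuntimeError("circuit open: fast-failing")
            # Backoff window elapsed: allow one probe through (half-open).
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0          # any success fully closes the circuit
        self.opened_at = None
        return result
```

Fast-failing while open is what protects a degraded backend from a retry storm; the jitter keeps many callers from probing in lockstep.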

GitHub Stats

github-snake

Metrics Dashboard


The JARVIS Ecosystem

JARVIS is not a chatbot wrapper. It is a distributed AI operating system composed of three interdependent repositories — each a standalone production system, together forming a self-improving autonomous intelligence.

Hero Architecture (TL;DR)

  • Single command control plane: python3 unified_supervisor.py boots Body, Mind, and Forge with deterministic lifecycle ownership
  • Trinity operating model: JARVIS executes, JARVIS-Prime reasons/routes, ReactorCore trains and redeploys
  • Reliability-first inference: policy-based failover from GCP golden image to local Apple Silicon to API fallback
  • Closed learning loop: runtime telemetry flows to Reactor training, then gated deployment returns improved models to Prime
  • Native autonomy stack: async agent mesh, Google Workspace workflows, voice biometrics, and vision-driven macOS control
  • Safety by design: policy gates, contract checks, kill-switch controls, circuit breakers, and probation-based rollback
```mermaid
flowchart TD
    K["UNIFIED SUPERVISOR<br/>single control plane"] --> B["JARVIS (Body)<br/>agents + tools + execution"]
    K --> P["JARVIS-Prime (Mind)<br/>routing + reasoning"]
    K --> R["ReactorCore (Forge)<br/>training + deployment gates"]

    B <--> P
    P --> R
    R --> P
    B --> R

    P --> T1["Tier 1: GCP Golden Image"]
    T1 -->|"degraded"| T2["Tier 2: Local Apple Silicon"]
    T2 -->|"degraded"| T3["Tier 3: API Fallback"]

    R --> G["Gate + Probation"]
    G -->|"pass"| P
    G -->|"fail"| RB["Rollback"]
```

Triple Authority Resolution — Status Overview

Three repos previously made independent lifecycle decisions (restart/health/kill), which created restart storms, readiness split-brain, and contract drift. This architecture is now unified under a single root authority model.

```mermaid
flowchart TD
    U["UNIFIED SUPERVISOR<br/>Root Control Plane"] --> W["RootAuthorityWatcher<br/>Policy Brain"]
    U --> O["ProcessOrchestrator<br/>Execution Plane"]
    O --> P["JARVIS-Prime<br/>managed mode"]
    O --> R["Reactor-Core<br/>managed mode"]

    W -->|LifecycleVerdict| O
    O -->|ExecutionResult| W
    P -->|health + drain contract| W
    R -->|health + drain contract| W

    W --> H{"Handshake Gate"}
    H -->|"schema N/N-1 + capability hash pass"| READY["ALIVE/READY"]
    H -->|"contract mismatch"| REJECT["REJECTED"]

    W --> E["Escalation Engine"]
    E --> D["drain"]
    E --> T["SIGTERM"]
    E --> K["process-group SIGKILL"]
```
What we built (21 tasks, 5 waves, 3 repos)
  • Wave 0 — Foundation types: canonical lifecycle contracts (LifecycleAction, SubsystemState, ProcessIdentity, LifecycleVerdict, policy/timeout structures) + managed-mode contract + golden conformance tests
  • Wave 1 — Root authority watcher: lifecycle state machine ownership, verdict emission, incident dedup, and policy/execution separation via VerdictExecutor
  • Wave 2 — Prime/Reactor conformance: managed-mode behavior (JARVIS_ROOT_MANAGED), health envelope enrichment, authenticated /lifecycle/drain
  • Wave 3 — Orchestrator integration + shadow mode: ProcessOrchestrator adapter methods wired; active crash watch (proc.wait) + jittered health polling
  • Wave 4 — Activation hardening: active verdict dispatch, contract hash gating at boot handshake, policy delegation hooks for restart/health ownership
What this resolved
  • Restart storms: single restart policy with budgeted windows and deduplication
  • Readiness split-brain: unified two-field liveness/readiness state ownership
  • Contract drift: cross-repo managed-mode parity with conformance tests and compatibility gates
  • Crash blind spots: ms-latency process-exit detection plus health-path observability
  • Competing supervisors: Prime/Reactor demoted to managed mode while root authority owns lifecycle decisions
  • Escalation ambiguity: deterministic kill ladder (drain -> SIGTERM -> process-group SIGKILL)
  • PID reuse risk: identity validation strengthened via multi-factor ProcessIdentity
  • Control-plane auth gaps: HMAC-authenticated lifecycle commands and session-aware checks
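The deterministic kill ladder (drain -> SIGTERM -> process-group SIGKILL) can be sketched as below. The function name, timeout defaults, and injected `drain` callable are hypothetical stand-ins for the real escalation engine:

```python
import os
import signal
import time

def escalate_shutdown(pid: int, drain, drain_timeout: float = 10.0,
                      term_timeout: float = 5.0) -> str:
    """Deterministic kill ladder: drain -> SIGTERM -> process-group SIGKILL.
    `drain` is a callable asking the subsystem to stop accepting work."""

    def alive() -> bool:
        try:
            os.kill(pid, 0)        # signal 0 = existence check only
            return True
        except ProcessLookupError:
            return False

    def wait_exit(timeout: float) -> bool:
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            if not alive():
                return True
            time.sleep(0.05)
        return not alive()

    drain()                                         # step 1: cooperative drain
    if wait_exit(drain_timeout):
        return "drained"
    os.kill(pid, signal.SIGTERM)                    # step 2: polite termination
    if wait_exit(term_timeout):
        return "terminated"
    os.killpg(os.getpgid(pid), signal.SIGKILL)      # step 3: kill the whole group
    return "killed"
```

Each rung is bounded by a timeout, which is what makes the ladder deterministic; the group-level SIGKILL at the bottom catches orphaned children the earlier rungs miss.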
Production rollout path (remaining ops work)
  1. Shadow soak: run in shadow mode and verify decision parity against legacy behavior
  2. Per-subsystem activation: promote one subsystem at a time (reactor-core then jarvis-prime)
  3. Final policy cut-wire: fully bypass legacy autonomous monitor decisions when delegation flags are enabled
  4. CI anti-drift: enforce cross-repo parity checks for managed-mode contract files on every PR
Hidden profile bullet packs (copy-ready)

Ultra-short TL;DR

  • Triple Authority Fixed: one root control plane governs restart/readiness/lifecycle
  • Safe by Contract: managed-mode + authenticated lifecycle endpoints + handshake gating
  • Staged Rollout: shadow parity -> subsystem activation -> full active cutover

Recruiter-friendly

  • Architecture leadership: unified three competing supervisors into one production control plane
  • Reliability outcome: removed restart storms and readiness split-brain via centralized lifecycle policy
  • Security hardening: added authenticated lifecycle controls and contract-gated activation
  • Operational rigor: designed staged rollout for safe production adoption

Infra-architect

  • Control-plane convergence: root watcher owns lifecycle state transitions across Body/Prime/Reactor
  • Policy/execution isolation: watcher emits verdicts; orchestrator executes side effects
  • Deterministic escalation: bounded drain -> term -> group-kill with race-safe identity checks
  • Protocol hardening: schema/capability handshake gates + managed-mode health/drain envelopes
  • Progressive activation: shadow validation, per-subsystem enablement, legacy path retirement

Disease 1: God File / Monolith Paradox

unified_supervisor.py grew into a ~96K-line orchestration monolith with multiple high-impact domains in one file. The risk is not just size; it is coupling density: local edits can create non-local regressions.

```mermaid
flowchart TD
    E["Single Entry Point<br/>python3 unified_supervisor.py"] --> S["Kernel Shell (thin)"]
    S --> R["Domain Controller Registry"]

    R --> L["Lifecycle Controller"]
    R --> H["Health Controller"]
    R --> W["Workflow Controller"]
    R --> M["Resource Controller"]
    R --> X["Self-Healing Controller"]
    R --> A["AGI/Training Controller"]

    L --> C["Contract Boundaries<br/>typed interfaces + DTOs"]
    H --> C
    W --> C
    M --> C
    X --> C
    A --> C

    C --> T["Isolated Domain Tests"]
    C --> O["Cross-Domain Observability"]
```
Why this is dangerous
  • Reasoning collapse: too many orthogonal responsibilities in one file
  • Test isolation gap: difficult to unit-test a single subsystem without broad kernel context
  • High merge friction: concentrated edit surface increases conflict rate
  • Refactor risk: tooling and human review quality degrade as coupling grows
  • Mandate conflict: monolith bottleneck violates "no single structural choke point"
Structural cure path
  1. Preserve single boot command while shrinking policy from the shell
  2. Extract domain controllers behind protocol boundaries
  3. Replace direct cross-calls with typed contract interfaces
  4. Enforce isolation tests per domain before integration tests
  5. Ship in waves with parity gates to avoid behavioral drift
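Step 3's typed contract interfaces can be illustrated with a Python `Protocol`. The controller and registry names below are hypothetical, not the actual supervisor classes:

```python
from typing import Protocol

class HealthController(Protocol):
    """Contract boundary: the kernel shell depends on this interface,
    never on a concrete controller implementation."""
    def check(self, subsystem: str) -> bool: ...

class ProcessHealthController:
    """One concrete domain controller, testable in isolation."""
    def __init__(self, live: set[str]):
        self._live = live

    def check(self, subsystem: str) -> bool:
        return subsystem in self._live

class KernelShell:
    """Thin shell: registers controllers by domain and dispatches
    through the contract, never through direct cross-domain calls."""
    def __init__(self) -> None:
        self._controllers: dict[str, HealthController] = {}

    def register(self, domain: str, controller: HealthController) -> None:
        self._controllers[domain] = controller

    def healthy(self, domain: str, subsystem: str) -> bool:
        return self._controllers[domain].check(subsystem)
```

Because the shell only sees the `Protocol`, a domain controller can be extracted, swapped, or mocked without touching the entrypoint, which is exactly the decoupling the cure path targets.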
Hidden profile bullets (copy-ready)

Ultra-short TL;DR

  • Monolith Risk Neutralized (in progress): convert a 96K-line supervisor choke point into contract-bounded controllers
  • Single Entry Point Preserved: one boot command, modular internals
  • Safer Evolution: isolation tests + parity-gated extraction waves

Recruiter-friendly

  • Architecture insight: identified the monolith paradox as the largest systemic reliability and velocity risk
  • Execution strategy: designed a phased decomposition that keeps runtime stable while reducing coupling
  • Engineering rigor: paired extraction with contract boundaries and isolation testing to prevent regressions

Infra-architect

  • Kernel shell model: retain entrypoint authority but move domain policy to controller registry
  • Protocol-first decomposition: typed interfaces replace direct cross-domain invocation
  • Risk-managed migration: parity validation, observability gates, and staged rollout per domain

System Architecture

Purpose, Problem, Challenge, Solution
  • Purpose: Define the three-system operating model (JARVIS, JARVIS-Prime, ReactorCore) under one unified kernel.
  • Problem: Most AI systems stop at a single model endpoint and fail at end-to-end autonomy, coordination, and lifecycle management.
  • Core Challenge: Keep orchestration, inference, and training decoupled enough to scale independently while still behaving like one product.
  • What This Solves: Creates a durable systems contract: JARVIS runs operations, Prime serves intelligence, Reactor continuously improves intelligence.
```mermaid
%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1b27', 'primaryTextColor': '#a9b1d6', 'primaryBorderColor': '#70a5fd', 'lineColor': '#545c7e', 'secondaryColor': '#24283b', 'tertiaryColor': '#1a1b27', 'fontSize': '14px', 'fontFamily': 'JetBrains Mono, monospace' }}}%%

flowchart TD
    KERNEL["<b>UNIFIED SUPERVISOR KERNEL</b><br/>Single Entry Point · 50K+ LOC<br/>7-Zone Parallel Initialization"]

    KERNEL -->|"orchestrates"| JARVIS
    KERNEL -->|"routes inference"| PRIME
    KERNEL -->|"triggers training"| REACTOR

    subgraph JARVIS["<b>JARVIS — The Body</b> &nbsp; Python / Rust / Swift &nbsp; :8010"]
        direction TB
        J1["🕸️ Neural Mesh<br/><i>16+ async agents · capability routing</i>"]
        J2["🎙️ Voice & Auth<br/><i>ECAPA-TDNN · full-duplex · wake word</i>"]
        J3["👁️ Vision & Spatial<br/><i>LLaVA · YOLO · Ghost Display · OCR</i>"]
        J4["🍎 macOS Native<br/><i>Swift 203 files · ObjC · Rust · CoreML</i>"]
        J5["🧠 Intelligence<br/><i>RAG · Ouroboros · Google Workspace</i>"]
    end

    subgraph PRIME["<b>JARVIS-Prime — The Mind</b> &nbsp; Python / GGUF &nbsp; :8000-8001"]
        direction TB
        P1["📡 Task-Type Router<br/><i>11 specialist models · 40.4 GB</i>"]
        P2["⚡ Neural Switchboard<br/><i>v98.1 · WebSocket contracts</i>"]
        P3["👁️ LLaVA Vision Server<br/><i>multimodal · OpenAI-compatible API</i>"]
        P4["💭 Reasoning Engine<br/><i>CoT / ToT / self-reflection</i>"]
        P5["📊 Telemetry Capture<br/><i>JSONL · deployment feedback loop</i>"]
    end

    subgraph REACTOR["<b>ReactorCore — The Forge</b> &nbsp; C++ / Python &nbsp; :8090"]
        direction TB
        R1["🔥 Training Pipeline<br/><i>LoRA · DPO · RLHF · FSDP</i>"]
        R2["🚪 Deployment Gate<br/><i>integrity validation · probation monitor</i>"]
        R3["🧬 Model Lineage<br/><i>full provenance chain · append-only JSONL</i>"]
        R4["☁️ GCP Spot Recovery<br/><i>checkpoint persistence · 60% cost savings</i>"]
        R5["⚙️ C++ Kernels<br/><i>CMake · pybind11 · native performance</i>"]
    end

    PRIME -.->|"telemetry + experiences"| REACTOR
    REACTOR -.->|"improved GGUF models"| PRIME
    JARVIS <-.->|"inference requests / responses"| PRIME
    REACTOR -.->|"training signals"| JARVIS

    style KERNEL fill:#1a1b27,stroke:#70a5fd,stroke-width:2px,color:#70a5fd
    style JARVIS fill:#0d1117,stroke:#70a5fd,stroke-width:2px,color:#a9b1d6
    style PRIME fill:#0d1117,stroke:#bf91f3,stroke-width:2px,color:#a9b1d6
    style REACTOR fill:#0d1117,stroke:#bb9af7,stroke-width:2px,color:#a9b1d6
```

Data Flow

Purpose, Problem, Challenge, Solution
  • Purpose: Show the runtime request path from multimodal inputs to routed inference and back to user-visible action.
  • Problem: Input streams (voice, screen, command) are heterogeneous and require different model strategies and latencies.
  • Core Challenge: Route by task type in real time while capturing high-quality telemetry for future model improvement.
  • What This Solves: Demonstrates a closed execution path where each response both serves the user now and improves the system later.
```mermaid
%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1b27', 'primaryTextColor': '#a9b1d6', 'lineColor': '#545c7e', 'fontSize': '13px', 'fontFamily': 'JetBrains Mono, monospace' }}}%%

flowchart LR
    A["🎤 Voice Input"] --> B["JARVIS Kernel"]
    C["👁️ Screen Capture"] --> B
    D["⌨️ User Command"] --> B
    B --> E["JARVIS-Prime<br/><i>inference routing</i>"]
    E --> F{"Task Type?"}
    F -->|"math"| G["Qwen2.5-7B"]
    F -->|"code"| H["DeepCoder"]
    F -->|"vision"| I["LLaVA"]
    F -->|"simple"| J["Fast 2.2GB"]
    F -->|"complex"| K["Claude API"]
    G & H & I & J & K --> L["Response"]
    L --> B
    E -->|"telemetry"| M["ReactorCore"]
    M -->|"LoRA/DPO training"| N["Improved Model"]
    N -->|"deploy + probation"| E

    style B fill:#1a1b27,stroke:#70a5fd,stroke-width:2px,color:#70a5fd
    style E fill:#1a1b27,stroke:#bf91f3,stroke-width:2px,color:#bf91f3
    style M fill:#1a1b27,stroke:#bb9af7,stroke-width:2px,color:#bb9af7
    style F fill:#24283b,stroke:#545c7e,stroke-width:1px,color:#a9b1d6
```
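A toy sketch of the task-type routing branch in the diagram above. The keyword heuristics and the model table are illustrative only; the real router is model-based and far richer:

```python
def classify_task(prompt: str) -> str:
    """Toy task-type classifier; real routing would use model-based scoring."""
    lowered = prompt.lower()
    if any(k in lowered for k in ("integral", "solve", "equation")):
        return "math"
    if any(k in lowered for k in ("def ", "function", "refactor")):
        return "code"
    if len(prompt) < 40:
        return "simple"
    return "complex"

# Hypothetical model table mirroring the diagram's branches.
ROUTES = {
    "math": "qwen2.5-7b",
    "code": "deepcoder",
    "vision": "llava",
    "simple": "fast-2.2gb",
    "complex": "claude-api",
}

def route(prompt: str) -> str:
    # Map the classified task type to its specialist backend.
    return ROUTES[classify_task(prompt)]
```

The point of the split is latency and cost: short chit-chat never pays for a 7B specialist, and only genuinely hard requests escalate to the API tier.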

Three-Tier Inference Routing

Purpose, Problem, Challenge, Solution
  • Purpose: Define a deterministic fallback ladder for reliability under changing infrastructure and hardware conditions.
  • Problem: A single inference backend is a single point of failure (downtime, cold starts, local resource pressure, API outages).
  • Core Challenge: Preserve quality and uptime while controlling cost and avoiding hard dependency on any one execution tier.
  • What This Solves: Guarantees service continuity through policy-based failover: GCP -> Local Metal -> Claude API.
```mermaid
%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1b27', 'primaryTextColor': '#a9b1d6', 'lineColor': '#545c7e', 'fontSize': '13px', 'fontFamily': 'JetBrains Mono, monospace' }}}%%

flowchart LR
    REQ["Inference Request"] --> T1
    T1["☁️ Tier 1: GCP Golden Image<br/><i>11 models · ~30s cold start</i>"]
    T1 -->|"unavailable"| T2["💻 Tier 2: Local Apple Silicon<br/><i>M1 Metal GPU · on-device</i>"]
    T2 -->|"resource constrained"| T3["🔑 Tier 3: Claude API<br/><i>emergency fallback</i>"]
    T1 -->|"✅ success"| RES["Response"]
    T2 -->|"✅ success"| RES
    T3 -->|"✅ success"| RES

    style T1 fill:#1a1b27,stroke:#70a5fd,stroke-width:2px,color:#70a5fd
    style T2 fill:#1a1b27,stroke:#bf91f3,stroke-width:2px,color:#bf91f3
    style T3 fill:#1a1b27,stroke:#bb9af7,stroke-width:2px,color:#bb9af7
    style REQ fill:#24283b,stroke:#545c7e,stroke-width:1px,color:#a9b1d6
    style RES fill:#24283b,stroke:#545c7e,stroke-width:1px,color:#a9b1d6
```
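The failover ladder reduces to a simple policy loop. This sketch injects the tier backends as callables; the names mirror the diagram rather than the real system:

```python
from collections.abc import Callable

def infer_with_failover(prompt: str,
                        tiers: list[tuple[str, Callable[[str], str]]]) -> tuple[str, str]:
    """Policy-based failover: try each tier in order and return
    (tier_name, response) from the first one that succeeds."""
    last_error: Exception | None = None
    for name, backend in tiers:
        try:
            return name, backend(prompt)
        except Exception as exc:
            # Degraded tier: remember the error and fall through to the next.
            last_error = exc
    raise RuntimeError("all inference tiers exhausted") from last_error
```

Keeping the ladder policy-only (backends injected, no tier logic inline) is what lets the same loop cover GCP, local Metal, and an API fallback without caring how each one fails.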

Trinity Autonomy Wiring (Phase 2)

Purpose, Problem, Challenge, Solution
  • Purpose: Wire autonomy lifecycle events through the Trinity loop so the system can learn from its own autonomous actions.
  • Problem: JARVIS Body performs autonomous actions (Google Workspace agent) but the outcomes are not captured as structured training signals.
  • Core Challenge: Events must be strictly validated, deduplicated, and classified before reaching the training pipeline — malformed or replayed events would corrupt model weights.
  • What This Solves: Creates a closed feedback loop where autonomous actions generate training data, improving future autonomy decisions without manual intervention.
```mermaid
%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1b27', 'primaryTextColor': '#a9b1d6', 'lineColor': '#545c7e', 'fontSize': '13px', 'fontFamily': 'JetBrains Mono, monospace' }}}%%

flowchart TD
    AGENT["🤖 Google Workspace Agent<br/><i>execute_task()</i>"]

    AGENT -->|"7 event types"| EMIT["📡 _emit_autonomy_event()<br/><i>strict metadata schema</i>"]
    EMIT -->|"token-bucket<br/>rate limiter"| FWD["🔀 CrossRepoExperienceForwarder<br/><i>forward_autonomy_event()</i>"]

    FWD -->|"ExperienceEvent<br/>(type=METRIC)"| ING["🔬 AutonomyEventIngestor"]

    ING --> V{"Validate<br/>7 required keys?"}
    V -->|"❌ malformed"| Q["🗃️ Quarantine<br/><i>disk-based · 7d retention</i>"]
    V -->|"✅ valid"| D{"Deduplicate<br/>composite key?"}
    D -->|"duplicate"| SKIP["⏭️ Skip"]
    D -->|"unique"| CLS["🏷️ AutonomyEventClassifier"]

    CLS -->|"committed / failed"| TRAIN["🔥 UnifiedPipeline<br/><i>DPO / LoRA training</i>"]
    CLS -->|"infrastructure /<br/>excluded"| EXCLUDE["📊 Metrics Only<br/><i>no training</i>"]

    AGENT <-.->|"autonomy_policy /<br/>action_plan"| PRIME["💭 JARVIS-Prime<br/><i>policy gate</i>"]

    SUP["🛡️ Supervisor Boot"] -->|"check_autonomy_contracts()"| COMPAT{"Schema<br/>Compatible?"}
    COMPAT -->|"✅ pass"| FULL["Full Autonomy Mode"]
    COMPAT -->|"❌ mismatch"| RO["Read-Only Mode"]

    style AGENT fill:#1a1b27,stroke:#70a5fd,stroke-width:2px,color:#70a5fd
    style PRIME fill:#1a1b27,stroke:#bf91f3,stroke-width:2px,color:#bf91f3
    style ING fill:#1a1b27,stroke:#bb9af7,stroke-width:2px,color:#bb9af7
    style TRAIN fill:#1a1b27,stroke:#9ece6a,stroke-width:2px,color:#9ece6a
    style Q fill:#1a1b27,stroke:#f7768e,stroke-width:2px,color:#f7768e
    style SUP fill:#1a1b27,stroke:#e0af68,stroke-width:2px,color:#e0af68
```

How it works:

  • Body emits 7 canonical events — Every autonomous action (email send, calendar create, doc edit) emits a lifecycle event: intent_written (about to execute), committed (success), failed (error), policy_denied (blocked by Prime), deduplicated (suppressed duplicate), superseded (stale intent), no_journal_lease (fail-closed safety)
  • Strict metadata schema — Each event carries 7 required keys (autonomy_event_type, autonomy_schema_version, idempotency_key, trace_id, correlation_id, action, request_kind). Malformed events are quarantined to disk, never silently coerced
  • Token-bucket rate limiter — Prevents replay storms during startup reconciliation (default: 50 events/second)
  • Effectively-once semantics — Deduplication by composite key (idempotency_key, autonomy_event_type, trace_id) with a 50K sliding window
  • Centralized classification — AutonomyEventClassifier is the single source of truth: only committed and failed are trainable; infrastructure events are excluded from training but retained for observability
  • Boot contract validation — Supervisor checks schema version compatibility across all three repos at startup. Any mismatch degrades to read-only autonomy mode (no autonomous writes)
  • Prime as policy gate — Body attaches autonomy_policy (allowed/denied actions, risk thresholds) to commands; Prime validates and returns structured action_plan with policy_compatible flag
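The validation, quarantine, and dedup steps above can be sketched as a single ingest function. The required keys follow the schema described here, but the class and window handling are illustrative, not the actual AutonomyEventIngestor:

```python
from collections import OrderedDict

REQUIRED_KEYS = {
    "autonomy_event_type", "autonomy_schema_version", "idempotency_key",
    "trace_id", "correlation_id", "action", "request_kind",
}

class AutonomyIngest:
    """Sketch of the ingest path: strict validation, then effectively-once
    dedup on a composite key with a bounded sliding window."""

    def __init__(self, window: int = 50_000):
        self.window = window
        self.seen: OrderedDict = OrderedDict()   # insertion-ordered dedup window
        self.quarantined: list[dict] = []

    def ingest(self, event: dict) -> str:
        if not REQUIRED_KEYS.issubset(event):
            self.quarantined.append(event)       # never silently coerce
            return "quarantined"
        key = (event["idempotency_key"],
               event["autonomy_event_type"],
               event["trace_id"])
        if key in self.seen:
            return "duplicate"
        self.seen[key] = None
        if len(self.seen) > self.window:
            self.seen.popitem(last=False)        # evict the oldest key
        return "accepted"
```

The bounded window is the compromise between true exactly-once (impossible across restarts) and unbounded memory: replays inside the window are suppressed, and everything older has already been classified.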

Ouroboros — Autonomous Self-Development (v262.0 B+)

Purpose, Problem, Challenge, Solution
  • Purpose: Enable JARVIS to autonomously detect, generate, validate, and apply code improvements across all three repos (JARVIS, JARVIS-Prime, Reactor-Core) in real time — without human intervention.
  • Problem: Applying code across repos without isolation is dangerous: partial failures leave repos in inconsistent states, no rollback exists, TARGET_MOVED (another commit landing mid-apply) goes undetected, and forensics branches are lost on failure.
  • Core Challenge: Production-grade saga apply safety across three independent git repos — ephemeral branch isolation, deterministic lock ordering, ff-only promote gates, and bounded passive observability — all without changing the external execution contract.
  • What This Solves (v262.0 B+): Full activation of the autonomous self-development loop with B+ branch-isolated sagas, passive SagaMessageBus observer, TestFailureSensor with real polling watcher, and all 4 P0 config blockers resolved. JARVIS_SAGA_BRANCH_ISOLATION=true + JARVIS_GOVERNANCE_MODE=governed = fully operational.
%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1b27', 'primaryTextColor': '#a9b1d6', 'lineColor': '#545c7e', 'fontSize': '13px', 'fontFamily': 'JetBrains Mono, monospace' }}}%%

flowchart TD
    subgraph INTAKE["Zone 6.9 — Intake Layer (per repo × 3)"]
        B["📋 BacklogSensor<br/><i>polls .jarvis/backlog.json · 30s</i>"]
        T["🧪 TestFailureSensor + TestWatcher<br/><i>pytest subprocess · streak ≥ 2 · 300s</i>"]
        M["⛏️ OpportunityMiner<br/><i>complexity ≥ 10 · 300s</i>"]
        V["🎤 VoiceCommandSensor<br/><i>event-driven · always on</i>"]
    end

    subgraph GLS["Zone 6.8 — Governed Loop Service"]
        Q["📥 UnifiedIntakeRouter<br/><i>dedup · priority · human-ack</i>"]
        FSM["🔄 PreemptionFsmEngine<br/><i>IDLE→ACTIVE→PAUSED→TERMINAL</i>"]
        ORCH["🎯 Orchestrator<br/><i>CLASSIFY→ROUTE→EXPAND→GENERATE→VALIDATE→GATE→APPLY→VERIFY→COMPLETE</i>"]
        BUS["📡 SagaMessageBus<br/><i>passive observer · max 500 msgs · TTL 300s</i>"]
    end

    subgraph SAGA["B+ Saga Apply (branch_isolation=True)"]
        PRE["1. Preflight: assert clean worktree"]
        BR["2. Create ouroboros/saga-&lt;op_id&gt;/&lt;repo&gt;"]
        AP["3. Apply patch + git commit"]
        LOCK["Two-Tier Lock:<br/>asyncio.Lock + fcntl.flock<br/><i>sorted order: jarvis → prime → reactor</i>"]
        PROM["4. promote_all()<br/><i>check_promote_safe → git merge --ff-only</i>"]
        COMP["5. On failure: _bplus_compensate_all()<br/><i>restore original_ref · keep forensics branch</i>"]
    end

    subgraph JPRIME["GCP J-Prime (Golden Image · 136.113.252.164:8000)"]
        GEN["🧠 Code Generation<br/><i>schema 2c.1 · multi-repo patches</i>"]
        NOOP["⚡ Noop Fast-Path<br/><i>2b.1-noop → GENERATE→COMPLETE</i>"]
    end

    B & T & M & V --> Q
    Q --> FSM --> ORCH
    ORCH -->|"GENERATE"| JPRIME
    GEN & NOOP --> ORCH
    ORCH -->|"APPLY"| PRE
    PRE --> LOCK --> BR --> AP
    AP -->|"success"| PROM
    AP -->|"failure"| COMP
    PROM -->|"SAGA_SUCCEEDED"| BUS
    PROM -->|"TARGET_MOVED"| BUS
    PROM -->|"SAGA_PARTIAL_PROMOTE"| BUS
    COMP -->|"SAGA_ROLLED_BACK"| BUS
    ORCH -->|"VERIFY fail"| BUS

    style INTAKE fill:#0d1117,stroke:#70a5fd,stroke-width:2px,color:#a9b1d6
    style GLS fill:#0d1117,stroke:#bf91f3,stroke-width:2px,color:#a9b1d6
    style SAGA fill:#0d1117,stroke:#9ece6a,stroke-width:2px,color:#a9b1d6
    style JPRIME fill:#0d1117,stroke:#e0af68,stroke-width:2px,color:#a9b1d6
    style BUS fill:#1a1b27,stroke:#7dcfff,stroke-width:2px,color:#7dcfff
    style LOCK fill:#1a1b27,stroke:#f7768e,stroke-width:2px,color:#f7768e

How it works:

  • Zone 6.9 sensors fan out per repo — Each of the three repos (JARVIS, JARVIS-Prime, Reactor-Core) gets its own BacklogSensor, TestFailureSensor (with real TestWatcher subprocess poller), and OpportunityMinerSensor. VoiceCommandSensor is always-on and event-driven.
  • TestWatcher polls continuously — Runs pytest in a subprocess every 300s per repo. Emits a stable intent:test_failure envelope only after streak ≥ 2 consecutive failures — preventing false alarms from transient flakes.
  • B+ branch isolation — Every apply creates an ephemeral branch ouroboros/saga-<op_id>/<repo>. Patches are committed there. Promote uses git merge --ff-only — if the target moved (TARGET_MOVED), the gate fails and the saga compensates cleanly.
  • Two-tier lockingasyncio.Lock (in-process) + fcntl.flock (cross-process) acquired in sorted repo name order (jarvis → prime → reactor-core) — deterministic, deadlock-free across concurrent ops.
  • SAGA_PARTIAL_PROMOTE — If promotion succeeds for some repos but fails for others, the new SAGA_PARTIAL_PROMOTE terminal state triggers a scoped pause (cross_repo_saga scope) until the operator reviews the partial state.
  • SagaMessageBus — A passive, fault-isolated observer (zero execution authority) records 8 event types: SAGA_CREATED, SAGA_ADVANCED, SAGA_COMPLETED, SAGA_FAILED, SAGA_ROLLED_BACK, SAGA_PARTIAL_PROMOTE, TARGET_MOVED, ANCESTRY_VIOLATION. Fire-and-forget — a broken bus never blocks an apply.
  • SagaLedgerArtifact — A 15-field frozen dataclass records every saga op: original_ref, saga_branch, promoted_sha, rollback_reason, kept_forensics_branches, and timestamp_ns. Full audit trail in the durable ledger.
  • J-Prime generates patches — GCP golden image at 136.113.252.164:8000 generates schema 2c.1 multi-repo patches. Noop fast-path (2b.1-noop) skips directly to COMPLETE if the change is already present.
  • Voice narrationVoiceNarrator announces intent, decision, and postmortem at each significant phase. OUROBOROS_VOICE_DEBOUNCE_S prevents over-narration (default 60s).
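The two-tier locking scheme described above can be sketched as follows. This is a minimal illustration, not the JARVIS internals: the class and function names are invented, and the apply step is a generic callback.

```python
import asyncio
import fcntl
import os
import tempfile

# Repos are locked in sorted name order so every concurrent operation
# acquires locks in the same sequence -- deterministic, deadlock-free.
REPOS = sorted(["jarvis", "jarvis-prime", "reactor-core"])

# Tier 1: in-process coordination between asyncio tasks.
_async_locks = {repo: asyncio.Lock() for repo in REPOS}


class RepoFileLock:
    """Tier 2: cross-process exclusion via fcntl.flock on a lock file."""

    def __init__(self, repo: str, lock_dir: str):
        self._path = os.path.join(lock_dir, f"{repo}.lock")

    def __enter__(self):
        self._fd = open(self._path, "w")
        fcntl.flock(self._fd, fcntl.LOCK_EX)  # blocks until exclusive
        return self

    def __exit__(self, *exc):
        fcntl.flock(self._fd, fcntl.LOCK_UN)
        self._fd.close()


async def with_all_repo_locks(apply_fn, lock_dir: str = tempfile.gettempdir()):
    """Hold both lock tiers for every repo, then run the apply step."""

    async def acquire(remaining: list):
        if not remaining:
            return apply_fn()  # all locks held at this point
        repo = remaining[0]
        async with _async_locks[repo]:          # tier 1: asyncio.Lock
            with RepoFileLock(repo, lock_dir):  # tier 2: fcntl.flock
                return await acquire(remaining[1:])

    return await acquire(REPOS)
```

Because every path acquires locks in the same sorted order, two concurrent sagas can never hold one lock while waiting on the other in opposite order.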

Activation (v262.0 — all green):

# .env (required for full autonomous operation)
JARVIS_GOVERNANCE_MODE=governed
JARVIS_SAGA_BRANCH_ISOLATION=true
JARVIS_SAGA_KEEP_FORENSICS_BRANCHES=true

# Start
python3 unified_supervisor.py --force

Ouroboros: Honest Capability Assessment

What it does:

  • Detects opportunities across all 3 repos (test failures, backlog, complexity, voice commands)
  • Calls J-Prime on GCP, receives a schema 2c.1 multi-repo patch
  • Applies with B+ saga safety — ephemeral branches, two-tier locks, ff-only promote gates, rollback
  • Narrates every decision in real time via voice + TUI
  • Commits and promotes across jarvis + prime + reactor without human touch

Where it stands vs. Claude Code:

| Capability | Claude Code | Ouroboros v262.0 |
|---|---|---|
| Read arbitrary files during an op | Full Read tool | Partial — TheOracle + context_expander (10 files max) |
| Run bash commands | Yes | No |
| Search the web | Yes | No |
| Edit code iteratively with feedback | Multi-turn, sees results | One-shot patch + apply |
| Test before committing | Runs tests, reads output, fixes | Applies first, verifies after |
| Persistent strategic goal memory | Deep conversation context | Per-op intent only |

Core difference: Claude Code is a full agentic loop with tool use — reads, runs, observes, revises, converges. Ouroboros is a code generation + automated apply pipeline. J-Prime generates a patch once; the B+ saga applies it. No iterative tool-use loop within an operation yet.

What would close the gap:

  1. Tool use in the generation loop — J-Prime calls read_file, run_command, run_tests during generation
  2. Multi-turn op execution — generate → run → observe → revise → converge
  3. Persistent goal memory — accumulates your long-running intent across sessions into every op's context
  4. Sandboxed shell — Ouroboros verifies its own changes before committing

Bottom line: Real, production-grade autonomous code delivery — not a demo. JARVIS will find work, generate patches via J-Prime, and commit across 3 repos without you touching anything. Not yet at Claude Code-level agentic tool use. That is the explicit next evolution.

GCP Hybrid Cloud Spot Architecture

Purpose, Problem, Challenge, Solution
  • Purpose: Run high-throughput inference and training on GCP while preserving local fallback and cost control.
  • Problem: On-demand cloud is expensive at scale, while local-only inference cannot absorb peak load or large-model demand.
  • Core Challenge: Balance latency, uptime, and spend when Spot VMs can be preempted without warning.
  • What This Solves: Introduces hybrid execution with preemption-aware orchestration, checkpoint recovery, and automatic failover to local/API tiers.
%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1b27', 'primaryTextColor': '#a9b1d6', 'lineColor': '#545c7e', 'fontSize': '13px', 'fontFamily': 'JetBrains Mono, monospace' }}}%%

flowchart LR
    REQ["Inference / Training Request"] --> ORCH["Hybrid Orchestrator"]
    ORCH --> SPOT["GCP Spot VM Pool<br/><i>primary cost-optimized execution</i>"]
    ORCH --> LOCAL["Local Apple Silicon Tier<br/><i>low-latency fallback</i>"]
    ORCH --> API["Claude API Tier<br/><i>emergency overflow</i>"]

    SPOT --> PREEMPT{"Preempted?"}
    PREEMPT -->|"no"| RUN["Run Workload"]
    PREEMPT -->|"yes"| RECOVER["Resume From Checkpoint"]
    RECOVER --> RUN

    RUN --> TELE["Telemetry + Cost Signals"]
    TELE --> ORCH
    RUN --> RES["Response / Model Artifact"]
    LOCAL --> RES
    API --> RES

    style ORCH fill:#1a1b27,stroke:#70a5fd,stroke-width:2px,color:#70a5fd
    style SPOT fill:#1a1b27,stroke:#bf91f3,stroke-width:2px,color:#bf91f3
    style LOCAL fill:#1a1b27,stroke:#7dcfff,stroke-width:2px,color:#7dcfff
    style API fill:#1a1b27,stroke:#bb9af7,stroke-width:2px,color:#bb9af7
    style PREEMPT fill:#24283b,stroke:#545c7e,stroke-width:1px,color:#a9b1d6

Golden Image Architecture (Model-Ready Compute)

Purpose, Problem, Challenge, Solution
  • Purpose: Eliminate repeated cold setup by pre-baking model runtimes and dependencies into immutable machine images.
  • Problem: Dynamic provisioning causes long startup times, dependency drift, and inconsistent behavior across nodes.
  • Core Challenge: Keep images reproducible and secure while continuously shipping model/runtime updates.
  • What This Solves: Establishes an immutable golden-image pipeline with validation gates and rollout controls for consistent low-latency boot.
%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1b27', 'primaryTextColor': '#a9b1d6', 'lineColor': '#545c7e', 'fontSize': '13px', 'fontFamily': 'JetBrains Mono, monospace' }}}%%

flowchart LR
    SRC["Model + Runtime Source"] --> BUILD["Image Builder Pipeline"]
    BUILD --> BAKE["Bake Golden Image<br/><i>models + deps + startup contracts</i>"]
    BAKE --> VALIDATE["Validation Gate<br/><i>health, integrity, startup SLA</i>"]
    VALIDATE -->|"pass"| REG["Image Registry"]
    VALIDATE -->|"fail"| REJECT["Reject Build"]

    REG --> SCALE["Autoscaled GCP Inference Nodes"]
    SCALE --> PRIME["JARVIS-Prime Router"]
    PRIME --> MON["Observability + Drift Monitoring"]
    MON --> BUILD

    style BUILD fill:#1a1b27,stroke:#70a5fd,stroke-width:2px,color:#70a5fd
    style BAKE fill:#1a1b27,stroke:#bf91f3,stroke-width:2px,color:#bf91f3
    style VALIDATE fill:#1a1b27,stroke:#7dcfff,stroke-width:2px,color:#7dcfff
    style REG fill:#1a1b27,stroke:#bb9af7,stroke-width:2px,color:#bb9af7
    style REJECT fill:#1a1b27,stroke:#f7768e,stroke-width:2px,color:#f7768e
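The validation gate above checks health, integrity, and the startup SLA before an image reaches the registry. A minimal sketch of that pass/fail decision, with an assumed report shape:

```python
def validate_image(report: dict,
                   startup_sla_s: float = 30.0) -> tuple[bool, list[str]]:
    """Gate a baked image on health, integrity, and startup SLA."""
    failures = []
    if not report.get("health_ok"):
        failures.append("health check failed")
    # Integrity: the baked model must hash to what the build expected.
    if report.get("model_sha256") != report.get("expected_sha256"):
        failures.append("model integrity mismatch")
    # Startup SLA: cold boot must finish within the contract window.
    if report.get("startup_s", float("inf")) > startup_sla_s:
        failures.append("startup SLA exceeded")
    return (not failures, failures)
```

A build is registered only when the failure list is empty; otherwise it is rejected with the specific reasons attached, which keeps the registry free of drifted or slow-booting images.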

Execution Planes (Control / Data / Model)

Purpose, Problem, Challenge, Solution
  • Purpose: Separate operational concerns into control, data, and model planes for clearer ownership and safer evolution.
  • Problem: Without plane separation, policy, state, and model behavior become tightly coupled and brittle during scale-out.
  • Core Challenge: Enforce governance and safety globally while allowing model and data pipelines to move quickly.
  • What This Solves: Makes architecture auditable and composable: control governs, data persists context, models execute decisions.
%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1b27', 'primaryTextColor': '#a9b1d6', 'lineColor': '#545c7e', 'fontSize': '13px', 'fontFamily': 'JetBrains Mono, monospace' }}}%%

flowchart TB
    subgraph CONTROL["🛡️ Control Plane"]
        C1["Policy Engine"]
        C2["Auth + Approval Gates"]
        C3["Secrets + Key Management"]
        C4["Kill Switch + Guardrails"]
    end

    subgraph DATA["📦 Data Plane"]
        D1["JARVIS Runtime Events"]
        D2["Redis + Cloud SQL State"]
        D3["ChromaDB / FAISS Memory"]
        D4["JSONL Telemetry + Lineage"]
    end

    subgraph MODEL["🧠 Model Plane"]
        M1["Prime Inference Router"]
        M2["Tiered Execution (GCP/Local/Claude)"]
        M3["Reactor Training Pipeline"]
        M4["Deployment Gate + Probation"]
    end

    CONTROL -->|"policy constraints"| DATA
    CONTROL -->|"permit / deny"| MODEL
    DATA -->|"context + telemetry"| MODEL
    MODEL -->|"decisions + artifacts"| DATA
    MODEL -->|"health + risk signals"| CONTROL

    style CONTROL fill:#0d1117,stroke:#70a5fd,stroke-width:2px,color:#a9b1d6
    style DATA fill:#0d1117,stroke:#bf91f3,stroke-width:2px,color:#a9b1d6
    style MODEL fill:#0d1117,stroke:#bb9af7,stroke-width:2px,color:#a9b1d6

Memory Control Plane (UMA-Aware Resource Governance)

Purpose, Problem, Challenge, Solution
  • Purpose: Govern shared Apple Silicon UMA memory with explicit, lease-based control across model loads, display surfaces, and agent runtime.
  • Problem: GPU/compositor pressure is often invisible to process-level memory metrics, so systems can appear healthy while heading into swap thrash.
  • Core Challenge: Coordinate memory decisions across heterogeneous consumers while preventing flapping and preserving critical capabilities.
  • What This Solves: Introduces deterministic memory governance with pressure-aware lease grants, stepwise shedding, and crash-safe lease reconciliation.
%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1b27', 'primaryTextColor': '#a9b1d6', 'primaryBorderColor': '#70a5fd', 'lineColor': '#545c7e', 'secondaryColor': '#24283b', 'tertiaryColor': '#1a1b27', 'fontSize': '13px', 'fontFamily': 'JetBrains Mono, monospace' }}}%%

flowchart TB
    subgraph OBS["📊 UMA Observability"]
        Q["MemoryQuantizer<br/><i>system + process sampling</i>"]
        S["Frozen MemorySnapshot<br/><i>headroom, pressure tier, thrash state</i>"]
        Q --> S
    end

    subgraph BROKER["🧠 MemoryBudgetBroker"]
        B1["Lease Manager<br/><i>grant / deny / preempt</i>"]
        B2["Budget Engine<br/><i>tier multipliers + safety reserve</i>"]
        B3["Recovery Ledger<br/><i>epoch fencing + stale lease reclaim</i>"]
    end

    subgraph CONSUMERS["📦 Lease Holders"]
        M["Model Loaders<br/><i>LLM, vision, speaker ID</i>"]
        A["Agent Runtime<br/><i>mesh workers + queues</i>"]
        D["Ghost Display<br/><i>display:ghost@v1</i>"]
    end

    subgraph CONTROL["🖥️ DisplayPressureController"]
        C1["Policy State Machine<br/><i>one-step downgrade invariant</i>"]
        C2["Shedding Ladder<br/><i>1080p -> 900p -> 720p -> 576p -> off</i>"]
        C3["Flap Guards<br/><i>dwell, cooldown, rate limits</i>"]
    end

    S -->|"pressure tier + headroom"| B2
    B2 --> B1
    B3 --> B1
    B1 -->|"lease outcomes"| M
    B1 -->|"lease outcomes"| A
    B1 -->|"lease outcomes"| D

    B1 -->|"pressure signal"| C1
    C1 --> C2
    C2 -->|"resolution action"| D
    C1 --> C3
    C3 -->|"allow / delay"| C2

    D -->|"amend_lease_bytes"| B1
    B1 -->|"events + decisions"| T["Telemetry Pipeline"]
    T -->|"drift + anomaly feedback"| Q

    style OBS fill:#0d1117,stroke:#70a5fd,stroke-width:2px,color:#a9b1d6
    style BROKER fill:#0d1117,stroke:#bf91f3,stroke-width:2px,color:#a9b1d6
    style CONSUMERS fill:#0d1117,stroke:#bb9af7,stroke-width:2px,color:#a9b1d6
    style CONTROL fill:#0d1117,stroke:#7dcfff,stroke-width:2px,color:#a9b1d6
    style T fill:#24283b,stroke:#545c7e,stroke-width:1px,color:#a9b1d6
Key design decisions
  • Lease-first memory policy — Components must request memory leases before expensive allocations; brokered leases are the source of truth.
  • Typed pressure tiers — Budget aggressiveness changes by pressure tier to avoid hardcoded, brittle thresholds.
  • Deterministic shedding — Display degradation follows ordered one-step transitions, preventing abrupt multi-level drops.
  • Flap prevention controls — Dwell windows, cooldowns, and rate limits stop oscillation under noisy pressure signals.
  • Crash-safe reconciliation — Epoch fencing and stale lease recovery reclaim orphaned allocations after process failures.
  • Closed-loop observability — Broker and controller events feed telemetry so memory policy can be calibrated over time.

Safety & Governance Path

Purpose, Problem, Challenge, Solution
  • Purpose: Document the decision policy from risk classification to approval, execution, blocking, and audit.
  • Problem: Autonomous systems can perform high-impact actions where incorrect execution is costly or irreversible.
  • Core Challenge: Balance autonomy and velocity with explicit human control boundaries for high-risk operations.
  • What This Solves: Provides a predictable safety envelope: low-risk auto-exec, medium-risk constrained mode, high-risk human-in-the-loop.
%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1b27', 'primaryTextColor': '#a9b1d6', 'lineColor': '#545c7e', 'fontSize': '13px', 'fontFamily': 'JetBrains Mono, monospace' }}}%%

flowchart LR
    IN["Incoming Action"] --> CLASS["Risk Classifier"]
    CLASS -->|"low risk"| AUTO["Auto Execute"]
    CLASS -->|"medium risk"| SAFE["Safe Mode + Limits"]
    CLASS -->|"high risk"| HITL["Human Approval Required"]

    SAFE --> EXEC["Controlled Execution"]
    HITL -->|"approved"| EXEC
    HITL -->|"denied"| BLOCK["Blocked + Logged"]

    EXEC --> MON["Runtime Monitor"]
    MON -->|"policy violation"| TRIP["Circuit Breaker Trip"]
    TRIP --> FB["Fallback Route / Degrade Gracefully"]
    MON -->|"healthy"| OK["Commit Result"]

    BLOCK --> AUD["Audit Trail"]
    FB --> AUD
    OK --> AUD

    style IN fill:#24283b,stroke:#545c7e,stroke-width:1px,color:#a9b1d6
    style CLASS fill:#1a1b27,stroke:#70a5fd,stroke-width:2px,color:#70a5fd
    style HITL fill:#1a1b27,stroke:#ffb86c,stroke-width:2px,color:#ffb86c
    style TRIP fill:#1a1b27,stroke:#f7768e,stroke-width:2px,color:#f7768e
    style AUD fill:#1a1b27,stroke:#bb9af7,stroke-width:2px,color:#bb9af7
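The risk-routing policy above (low → auto-exec, medium → safe mode, high → human approval) reduces to a small decision function. The verb lists here are invented examples, not the real classifier's rules:

```python
from enum import Enum


class Risk(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


# Illustrative verb buckets; a real classifier would be richer than this.
DESTRUCTIVE = {"delete", "deploy", "shutdown"}
MUTATING = {"write", "commit", "restart"}


def classify(action: str) -> Risk:
    verb = action.split()[0]
    if verb in DESTRUCTIVE:
        return Risk.HIGH
    if verb in MUTATING:
        return Risk.MEDIUM
    return Risk.LOW


def route(action: str, approved: bool = False) -> str:
    """Map classified risk to the execution path in the diagram."""
    risk = classify(action)
    if risk is Risk.LOW:
        return "auto_execute"
    if risk is Risk.MEDIUM:
        return "safe_mode"
    # High risk: human-in-the-loop gate decides execute vs block+log.
    return "execute" if approved else "blocked_logged"
```

The important property is that the high-risk branch defaults to blocked: absence of an explicit approval is always treated as a denial.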

Observability & Closed-Loop Learning

Purpose, Problem, Challenge, Solution
  • Purpose: Show how runtime signals become training data, deployment decisions, and measurable model upgrades.
  • Problem: Teams often collect telemetry but fail to operationalize it into safe, repeatable improvement cycles.
  • Core Challenge: Detect regressions early, gate bad models, and continuously retrain without destabilizing production.
  • What This Solves: Establishes a true learning loop: observe → detect → curate → train → gate/probation → deploy or rollback.
%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1b27', 'primaryTextColor': '#a9b1d6', 'lineColor': '#545c7e', 'fontSize': '13px', 'fontFamily': 'JetBrains Mono, monospace' }}}%%

flowchart LR
    RUN["Live Inference + Agent Runtime"] --> OTEL["OpenTelemetry Traces/Metrics"]
    RUN --> LOGS["Structured JSONL Logs"]
    RUN --> COST["LangFuse + Helicone + PostHog"]

    OTEL --> HUB["Unified Observability Hub"]
    LOGS --> HUB
    COST --> HUB

    HUB --> ALERT["Anomaly/Regression Detection"]
    ALERT -->|"critical"| ROLLBACK["Auto Rollback / Gate Fail"]
    ALERT -->|"acceptable"| CURATE["Telemetry Curation"]

    CURATE --> TRAIN["Reactor Training (LoRA/DPO/RLHF)"]
    TRAIN --> GATE["Deployment Gate + Probation"]
    GATE -->|"pass"| PRIME["Prime Model Registry"]
    GATE -->|"fail"| ROLLBACK

    PRIME --> RUN

    style RUN fill:#1a1b27,stroke:#70a5fd,stroke-width:2px,color:#70a5fd
    style HUB fill:#1a1b27,stroke:#bf91f3,stroke-width:2px,color:#bf91f3
    style TRAIN fill:#1a1b27,stroke:#bb9af7,stroke-width:2px,color:#bb9af7
    style ROLLBACK fill:#1a1b27,stroke:#f7768e,stroke-width:2px,color:#f7768e
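The gate/probation step in the loop above boils down to a commit-or-rollback decision over live quality samples. A hedged sketch, with an assumed scalar quality metric and made-up thresholds:

```python
def probation_decision(samples: list[float],
                       baseline: float,
                       min_samples: int = 10,
                       tolerance: float = 0.05) -> str:
    """Commit a new model only if probation-window quality holds up.

    samples  -- live inference quality scores observed during probation
    baseline -- the previous model's quality on the same metric
    """
    if len(samples) < min_samples:
        return "continue"                 # probation window not over yet
    mean = sum(samples) / len(samples)
    if mean >= baseline - tolerance:
        return "commit"                   # within tolerance of the baseline
    return "rollback"                     # regression: gate fail
```

Requiring a minimum sample count before deciding mirrors the time-boxed probation window: a model is never committed or rolled back on a handful of early requests.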

Repository Breakdown





JARVIS — Port 8010
  • 60+ Agent Neural Mesh
  • Voice Biometrics
  • Ghost Display + Vision
  • macOS Native (203 Swift files)
  • RAG + Ouroboros Self-Programming

JARVIS-Prime — Ports 8000-8001
  • 11 Specialist GGUF Models (40.4 GB)
  • Task-Type Inference Routing
  • LLaVA Vision Server
  • CoT/ToT Reasoning Engine
  • Neural Switchboard v98.1

Reactor-Core — Port 8090
  • LoRA / DPO / RLHF Training
  • Deployment Gate + Probation
  • Model Lineage Tracking
  • GCP Spot VM Auto-Recovery
  • Native C++ Training Kernels

  Deep Dive

Agent Architecture
  • Neural Mesh — 16+ specialized agents (activity recognition, adaptive resource governor, context tracker, error analyzer, goal inference, Google Workspace, health monitor, memory, pattern recognition, predictive planning, spatial awareness, visual monitor, web search, coordinator) with asynchronous message passing, capability-based routing, and cross-agent data flow
  • Autonomous Agent Runtime — multi-step goal decomposition, agentic task execution, tool orchestration, error recovery, and intervention decision engine with human-in-the-loop approval for destructive actions
  • AGI OS Coordinator — proactive event stream, notification bridge, owner identity service, voice approval manager, and intelligent startup announcer
Voice and Authentication
  • Real-time voice biometric authentication via ECAPA-TDNN speaker verification with cloud/local hybrid inference and multi-factor fusion (voice + proximity + behavioral)
  • Real-time voice conversation — full-duplex audio (simultaneous mic + speaker), acoustic echo cancellation (speexdsp), streaming STT (faster-whisper), adaptive turn detection, barge-in control, and sliding 20-turn context window
  • Wake word detection (Porcupine/Picovoice), Apple Watch Bluetooth proximity auth, continuous learning voice profiles
  • Unified speech state management — STT hallucination guard, voice pipeline orchestration, parallel model loading
Vision and Spatial Intelligence
  • Never-skip screen capture — two-phase monitoring (always-capture + conditional-analysis), self-hosted LLaVA multimodal analysis, Claude Vision escalation
  • Ghost Display — virtual macOS display for non-intrusive background automation, Ghost Hands orchestrator for autonomous visual workflows
  • Claude Computer Use — automated mouse, keyboard, and screenshot interaction via Anthropic's Computer Use API
  • OCR / OmniParser — screen text extraction, window analysis, workspace name detection, multi-monitor and multi-space intelligence via yabai window manager
  • YOLO + Claude hybrid vision — object detection with LLM-powered semantic understanding
  • Rust vision core — native performance for fast image processing, bloom filter networks, and sliding window analysis
macOS Native Integration (Swift / Objective-C / Rust)
  • Swift bridge (203 files) — CommandClassifier, SystemControl (preferences, security, clipboard, filesystem), PerformanceCore, ScreenCapture, WeatherKit, CoreLocation GPS
  • Objective-C voice unlock daemon — JARVISVoiceAuthenticator, JARVISVoiceMonitor, permission manager, launchd service integration
  • Rust performance layer — PyO3 bindings for memory pool management, quantized ML inference, vision fast processor, command classifier, health predictor; ARM64 SIMD assembly optimizations
  • CoreML acceleration — on-device intent classification, voice processing
Infrastructure and Reliability
  • Parallel initializer with cooperative cancellation, adaptive EMA-based deadlines, dependency propagation, and atomic state persistence
  • CPU-pressure-aware cloud shifting — automatic workload offload to GCP when local resources are constrained
  • Enterprise hardening — dependency injection container, enterprise process manager, system hardening, governance, Cloud SQL with race-condition-proof proxy management, TLS-safe connection factories, distributed lock manager
  • Three-tier inference routing: GCP Golden Image (primary) → Local Apple Silicon (fallback) → Claude API (emergency)
  • Trinity event bus — cross-repo IPC hub, heartbeat publishing, knowledge graph, state management, process coordination
  • Cost tracking and rate limiting — GCP cost optimization with Bayesian confidence fusion, intelligent rate orchestration
  • File integrity guardian — pre-commit integrity verification across the codebase
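The three-tier routing above (GCP Golden Image → Local Apple Silicon → Claude API) is an ordered fallback chain. A minimal sketch with stubbed backends; the tier names follow the text, the callables are placeholders:

```python
def infer_with_fallback(prompt: str, tiers: list) -> tuple[str, str]:
    """Try each inference tier in order; fall through on failure.

    tiers -- list of (name, backend) pairs, highest priority first,
             e.g. [("gcp", ...), ("local", ...), ("claude_api", ...)]
    """
    errors = []
    for name, backend in tiers:
        try:
            return name, backend(prompt)
        except Exception as e:            # tier unavailable -> try the next
            errors.append(f"{name}: {e}")
    raise RuntimeError("all tiers failed: " + "; ".join(errors))
```

The emergency tier only ever runs when both cheaper tiers have raised, which is what keeps API spend bounded under normal operation.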
Intelligence and Learning
  • Google Workspace Agent — Gmail read/search/draft, Google Calendar, natural language intent routing via tiered command router
  • Proactive intelligence — predictive suggestions, proactive vision monitoring, proactive communication, emotional intelligence module
  • RAG pipeline — ChromaDB vector store, FAISS similarity search, embedding service, long-term memory system
  • Chain-of-thought / reasoning graph engine — LangGraph-based multi-step reasoning with conditional routing and reflection loops
  • Ouroboros (v262.0 B+ — fully activated) — autonomous self-development across JARVIS, JARVIS-Prime, and Reactor-Core: B+ branch-isolated saga applies (ephemeral branches, two-tier locks, ff-only promote gates, rollback-via-branch-delete), SagaMessageBus passive observer, TestFailureSensor with real TestWatcher per repo, GCP J-Prime code generation (schema 2c.1), voice narration at every decision phase
  • Web research service — autonomous web search and information synthesis
  • A/B testing framework — vision pipeline experimentation
  • Repository intelligence — code ownership analysis, dependency analyzer, API contract analyzer, AST transformer, cross-repo refactoring engine

  Deep Dive

Inference and Routing
  • 11 specialist GGUF models (~40.4 GB) pre-baked into a GCP golden image with ~30-second cold starts
  • Task-type routing — math queries hit Qwen2.5-7B, code queries hit DeepCoder, simple queries hit a 2.2 GB fast model, vision hits LLaVA
  • GCP Model Swap Coordinator with intelligent hot-swapping, per-model configuration, and inference validation
  • Neural Switchboard v98.1 — stable public API facade over routing and orchestration with WebSocket integration contracts
  • Hollow Client mode for memory-constrained hardware — strict lazy imports, zero ML dependencies at startup on 16 GB machines
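The task-type routing above can be sketched as a keyword classifier feeding a routing table. The regex heuristics and the "fast-2.2GB" label are placeholders; the model names follow the examples in the text:

```python
import re


def classify_task(prompt: str) -> str:
    """Crude keyword-based task typing (illustrative only)."""
    if re.search(r"\b(solve|integral|derivative)\b|\d\s*[+*/-]\s*\d", prompt):
        return "math"
    if re.search(r"\b(def|class|refactor|bug|compile)\b", prompt):
        return "code"
    if re.search(r"\b(image|screenshot|screen)\b", prompt):
        return "vision"
    return "simple"


# Task type -> specialist model, as in the routing examples above.
MODEL_FOR = {
    "math": "Qwen2.5-7B",
    "code": "DeepCoder",
    "vision": "LLaVA",
    "simple": "fast-2.2GB",   # placeholder name for the 2.2 GB fast model
}


def route_task(prompt: str) -> str:
    return MODEL_FOR[classify_task(prompt)]
```

Defaulting unknown prompts to the smallest model keeps the common path cheap; only queries that match a specialist pattern pay for a larger model.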
Reasoning and Telemetry
  • Continuous learning hook — post-inference experience recording for Elastic Weight Consolidation via ReactorCore
  • Reasoning engine activation — chain-of-thought scaffolding (CoT/ToT/self-reflection) for high-complexity requests above configurable thresholds
  • APARS protocol (Adaptive Progress-Aware Readiness System) — 6-phase startup with real-time health reporting to the supervisor
  • LLaVA vision server — multimodal inference on port 8001 with OpenAI-compatible API, semaphore serialization, queue depth cap
  • Telemetry capture — structured JSONL interaction logging with deployment feedback loop and post-deployment probation monitoring

  Deep Dive

Training Pipeline
  • Full training pipeline: telemetry ingestion → active learning selection → gatekeeper evaluation → LoRA SFT → GGUF export → deployment gate → probation monitoring → feedback loop
  • DeploymentGate validates model integrity before deployment; rejects corrupt or degenerate outputs
  • Post-deployment probation — 30-minute health monitoring window with automatic commit or rollback based on live inference quality
  • Model lineage tracking — full provenance chain (hash, parent model, training method, evaluation scores, gate decision) in append-only JSONL
  • Tier-2/Tier-3 runtime orchestration — curriculum learning, meta-learning (MAML), causal discovery with correlation-based fallback, world model training
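The append-only lineage record above can be sketched as follows. The field names mirror the list in the text (hash, parent, training method, eval scores, gate decision); the function signature itself is an assumption:

```python
import hashlib
import json


def append_lineage(path: str, model_bytes: bytes, parent_hash: str,
                   method: str, scores: dict, gate_decision: str) -> str:
    """Append one provenance record to the JSONL ledger; return the new hash."""
    model_hash = hashlib.sha256(model_bytes).hexdigest()
    record = {
        "hash": model_hash,
        "parent": parent_hash,          # links back to the prior model
        "training_method": method,      # e.g. LoRA SFT, DPO
        "eval_scores": scores,
        "gate_decision": gate_decision,
    }
    # Append-only: history is never rewritten, only extended.
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")
    return model_hash
```

Because each record names its parent's hash, the file forms a verifiable provenance chain: any model can be traced back through every training step that produced it.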
Infrastructure and Integration
  • GCP Spot VM auto-recovery with training checkpoint persistence and 60% cost reduction over on-demand instances
  • Native C++ training kernels via CMake/pybind11/cpp-httplib for performance-critical operations
  • Atomic experience snapshots — buffer drain under async lock, JSONL with DataHash for dataset versioning
  • PrimeConnector — WebSocket path rotation, health polling fallback, contract path discovery for cross-repo communication
  • Cross-repo integration — Ghost Display state reader, cloud mode detection, Trinity Unified Loop Manager, pipeline event logger with correlation IDs

Technical Footprint

| Metric | Value |
|---|---|
| Total commits | 3,900+ across 3 repositories |
| Codebase | ~2.5 million lines across 18+ languages |
| Build duration | 12 months, solo |
| Unified kernel | 50,000+ lines in a single orchestration file |
| Neural Mesh agents | 16+ specialized agents with async message passing |
| Models served | 11 specialist GGUF models via task-type routing |
| Inference tiers | GCP Golden Image → Local Metal GPU → Claude API |
| Training pipeline | Automated: telemetry → active learning → gatekeeper → training → GGUF export → deployment gate → probation → feedback |
| Voice auth | Multi-factor: ECAPA-TDNN biometric + Apple Watch proximity + behavioral analysis |
| Vision pipeline | Never-skip capture, LLaVA self-hosted, Claude escalation, YOLO hybrid, OCR/OmniParser |
| Swift components | 203 files — system control, command classifier, screen capture, GPS, weather |
| Rust crates | 5 Cargo workspaces — memory pool, vision processor, ML inference, SIMD optimizations |
| Terraform modules | 7 modules (compute, network, security, storage, monitoring, budget, Spot templates) |
| Dockerfiles | 6 (backend, backend-slim, frontend, training, cloud, GCP inference) |
| GitHub Actions | 20+ workflows (CI/CD, CodeQL, e2e testing, deployment, database validation, file integrity) |
| macOS integration | Native Swift/ObjC daemons, yabai WM, Ghost Display, multi-space/multi-monitor, launchd services |
| Cloud infrastructure | GCP (Compute Engine, Cloud SQL, Cloud Run, Secret Manager, Monitoring), Spot VM auto-recovery |
| Google Workspace | Gmail read/search/draft, Calendar, natural language routing via tiered command router |

Background

I graduated from Cal Poly San Luis Obispo with a B.S. in Computer Engineering after a 10-year non-traditional academic path that started in remedial algebra at community college. I retook courses, studied through the loss of family, and spent most of my twenties earning a degree that others finish in four years. The path was not conventional. The outcome was.

JARVIS is what happens when that level of persistence meets engineering capability. Twelve months of daily commits, architectural decisions at every layer of the stack, and a refusal to ship anything that is not production-grade.


LinkedIn Article

Pinned

  1. JARVIS (Public)

    Trinity AGI OS -- Autonomous self-evolving AI operating system. Ouroboros + Venom proactive self-development engine, 16 sensors, 10-phase governance pipeline, Trinity consciousness, VLA vision pipe…

    Python · 9 stars · 1 fork

  2. JARVIS-Prime (Public)

    Specialized PRIME models for JARVIS. Production-ready models with quantization, M1 Mac support, and seamless integration. Powered by Reactor Core.

    Python

  3. JARVIS-Reactor (Public)

    High-performance Hybrid C++/Python ML training engine featuring GCP Spot VM auto-recovery, LoRA/DPO/FSDP support, and 60% cost reduction. The training backbone for the JARVIS ecosystem.

    Python

  4. JARVIS-Portfolio (Public)

    Portfolio & technical blog built with Next.js, featuring my JARVIS AI Agent project.

    TypeScript

  5. Ironman-Suit-Simulation (Public)

    An advanced simulation of Iron Man suit engineering, integrating AI, Systems Programming, Robotics, Nanotechnology, Cybersecurity, and more.

    Python · 1 star