
Auto-standby#183

Open
sjmiller609 wants to merge 16 commits into main from codex/auto-standby-e2e


@sjmiller609
Collaborator

@sjmiller609 sjmiller609 commented Apr 4, 2026

Summary

  • add Linux-only auto-standby built around host conntrack state in a new lib/autostandby package
  • persist and expose per-instance auto_standby policy through instance metadata and API surfaces
  • start the auto-standby controller from the API process and add a default-skipped VM-level E2E test for host->guest inbound TCP activity

Testing

  • go test -count=1 ./lib/autostandby
  • go test -count=1 -run "Test(ValidateUpdateInstanceRequest|CloneStoredMetadataForFork_DeepCopiesReferenceFields)$" ./lib/instances
  • go test -count=1 -run "Test(CreateInstance_MapsAutoStandbyPolicy|UpdateInstance_MapsAutoStandbyPatch)$" ./cmd/api/api
  • go test -run "^$" ./cmd/api
  • sudo -n env PATH=/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:$PATH HYPEMAN_RUN_AUTO_STANDBY_E2E=1 go test -count=1 -run ^TestAutoStandbyCloudHypervisorActiveInboundTCP$ ./lib/instances (run on deft-kernel-dev)
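
For readers unfamiliar with default-skipped tests, here is a minimal sketch of how an opt-in gate like HYPEMAN_RUN_AUTO_STANDBY_E2E could work. The helper name e2eEnabled and the exact comparison against "1" are assumptions for illustration; the PR's test may gate differently.

```go
package main

import (
	"fmt"
	"os"
)

// autoStandbyE2EEnv is the opt-in variable shown in the sudo command above.
const autoStandbyE2EEnv = "HYPEMAN_RUN_AUTO_STANDBY_E2E"

// e2eEnabled reports whether the opt-in variable is set to "1".
// It takes a lookup function so the check can be exercised without
// touching the real process environment. (Hypothetical helper.)
func e2eEnabled(getenv func(string) string) bool {
	return getenv(autoStandbyE2EEnv) == "1"
}

func main() {
	// Inside a real test the equivalent gate would look like:
	//   if !e2eEnabled(os.Getenv) {
	//       t.Skip("set HYPEMAN_RUN_AUTO_STANDBY_E2E=1 to run")
	//   }
	fmt.Println("auto-standby E2E enabled:", e2eEnabled(os.Getenv))
}
```

Gating on an explicit variable keeps the root-only, VM-backed test out of ordinary go test runs while leaving it one flag away on a suitable Linux host.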

Integration test coverage

The default-skipped Linux integration test exercised a real Cloud Hypervisor VM with networking enabled and a real conntrack-backed auto-standby controller.

It verified that:

  • a host-to-guest TCP connection to nginx appears as qualifying inbound activity in conntrack
  • the instance stays Running while that inbound TCP connection remains open
  • once the final inbound TCP connection closes, the controller allows the configured idle timeout to elapse and then transitions the instance to Standby
  • the test uses the real Linux conntrack path instead of ingress state or TAP byte counters

Note

Medium Risk
Introduces a new background controller that can automatically transition running instances to Standby based on host conntrack state, plus new persisted metadata fields and API surfaces; incorrect classification or lifecycle wiring could cause unexpected standby behavior.

Overview
Adds Linux-only auto-standby driven by host IPv4 TCP conntrack: a new lib/autostandby controller tracks inbound connections via snapshots + netlink events, persists idle/runtime timestamps, and triggers StandbyInstance after the configured idle timeout (with metrics/tracing).

Extends instance metadata and lifecycle plumbing to support this: instances now persist an auto_standby policy and controller-owned auto_standby_runtime, emit global lifecycle events, and allow UpdateInstance to modify auto_standby without requiring a running instance (env updates keep prior constraints).

Exposes the feature through the API by mapping auto_standby on create/update/list/get, adding GetAutoStandbyStatus for per-instance diagnostics, wiring the controller via Wire, and starting it in cmd/api when available; includes unit tests plus a default-skipped Linux E2E test using real conntrack and Cloud Hypervisor.

Reviewed by Cursor Bugbot for commit 789a08c.

@github-actions

github-actions bot commented Apr 4, 2026

✱ Stainless preview builds

This PR will update the hypeman SDKs with the following commit message.

feat: Add Linux auto-standby controller and E2E coverage

Edit this comment to update it. It will appear in the SDK's changelogs.

hypeman-typescript studio · code · diff

Your SDK build had at least one "note" diagnostic, but this did not represent a regression.
generate ✅ build ✅ lint ✅ test ✅

npm install https://pkg.stainless.com/s/hypeman-typescript/b530c7ff500b8e15f13a2ef313c7b21d5b798a74/dist.tar.gz
hypeman-openapi studio · code · diff

Your SDK build had at least one "note" diagnostic, but this did not represent a regression.
generate ✅

hypeman-go studio · code · diff

Your SDK build had at least one "note" diagnostic, but this did not represent a regression.
generate ✅ build ✅ lint ✅ test ✅

go get github.com/stainless-sdks/hypeman-go@2b39732a7d5f889b40745336f45a9e3a722c8dc8

This comment is auto-generated by GitHub Actions and is automatically kept up to date as you push.
If you push custom code to the preview branch, re-run this workflow to update the comment.
Last updated: 2026-04-06 21:52:19 UTC

@sjmiller609 sjmiller609 changed the title from "Add Linux auto-standby controller and E2E coverage" to "Auto-standby" on Apr 4, 2026
@sjmiller609 sjmiller609 marked this pull request as ready for review April 4, 2026 20:15
@sjmiller609 sjmiller609 requested a review from hiroTamada April 4, 2026 20:15
@sjmiller609 sjmiller609 marked this pull request as draft April 4, 2026 20:16
@sjmiller609 sjmiller609 removed the request for review from hiroTamada April 5, 2026 15:31
sjmiller609

This comment was marked as resolved.

@sjmiller609 sjmiller609 marked this pull request as ready for review April 6, 2026 15:11
@sjmiller609 sjmiller609 requested a review from hiroTamada April 6, 2026 15:11
@sjmiller609 sjmiller609 requested a review from hiroTamada April 6, 2026 15:43

@cursor cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 2 potential issues.


@sjmiller609 sjmiller609 marked this pull request as draft April 6, 2026 17:37
import "sync"

// LifecycleEventAction identifies which instance lifecycle action occurred.
type LifecycleEventAction string
Collaborator Author


I thought we already had a lifecycle subscription system; checking.

@sjmiller609 sjmiller609 requested a review from hiroTamada April 6, 2026 21:53
@sjmiller609 sjmiller609 marked this pull request as ready for review April 6, 2026 21:53
2 participants