raftadmin: add phase 3 native admin surface #472
Conversation
Code Review
This pull request replaces the external raftadmin dependency with a native implementation within the repository. It introduces a new cmd/raftadmin CLI tool, updates the internal raft administration services, and modifies the existing codebase to use these new internal services instead of the external ones. The reviewer suggests improving the CLI tool's usability by providing command-specific usage messages rather than a generic one.
/gemini review
Code Review
This pull request replaces external dependencies for Raft administration and health monitoring with internal implementations. It introduces the internal/raftadmin package, a dedicated cmd/raftadmin CLI tool, and updates the protobuf definitions to include a RaftAdmin service. Feedback focuses on improving the health monitoring efficiency by making the polling interval configurable and addressing security concerns regarding the use of insecure gRPC credentials in the administrative tool.
internal/raftadmin/health.go
Outdated
```go
func observeLeaderHealth(ctx context.Context, engine raftengine.Engine, healthSrv *health.Server, serviceNames []string) {
	services := dedupeServices(serviceNames)
	status := currentHealthStatus(ctx, engine)
	setHealthStatus(healthSrv, services, status)

	ticker := time.NewTicker(defaultHealthPollInterval)
	defer ticker.Stop()

	for {
		select {
		case <-ctx.Done():
			setHealthStatus(healthSrv, services, healthpb.HealthCheckResponse_NOT_SERVING)
			return
		case <-ticker.C:
			next := currentHealthStatus(ctx, engine)
			if next != status {
				setHealthStatus(healthSrv, services, next)
				status = next
			}
		}
	}
}
```
The observeLeaderHealth goroutine uses a ticker to poll the health status. While the interval is small (250ms), it might be more efficient to use a notification mechanism if the underlying raft engine supports it (e.g., a channel for state changes). If polling is necessary, consider making the interval configurable via environment variables or configuration, as 250ms might be too aggressive for some environments or too slow for others.
cmd/raftadmin/main.go
Outdated
```go
conn, err := grpc.NewClient(addr, grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
	return nil, func() {}, errors.Wrap(err, "dial raft admin")
}
```
The dialClient function uses grpc.NewClient with insecure.NewCredentials(). While this matches the current project state, administrative tools should not inadvertently encourage insecure deployments in production. Since this is a limited-scope implementation, document the limitation explicitly, and consider adding runtime validation or a flag for TLS credentials to prevent misuse in a general context.
References
- For prototype or limited-scope implementations, explicitly document the limitations (e.g., single-node only, blocking startup) and add runtime validation to prevent misuse in a general context.
/gemini review
Code Review
This pull request replaces external Raft administration and health-check dependencies with a native internal implementation, including a new RaftAdmin gRPC service and a CLI tool. Review feedback suggests enhancing error reporting in the CLI, enforcing a minimum health poll interval to protect system performance, and ensuring the background health-monitoring goroutine terminates correctly to avoid potential resource leaks.
cmd/raftadmin/main.go
Outdated
```go
case "0", "false", "no", "off":
	return false, nil
default:
	return false, errors.Errorf("parse %s", name)
```
The error message here is quite generic and doesn't provide much context to the user about what went wrong during parsing. Including the invalid value and the environment variable name would be more helpful.
```diff
- return false, errors.Errorf("parse %s", name)
+ return false, errors.Errorf("invalid boolean value for %s: %q", name, raw)
```
| } | ||
| millis, err := strconv.Atoi(raw) |
Allowing a health poll interval as low as 1ms could lead to excessive CPU usage, especially in clusters with many nodes or when the system is under load. It's generally safer to enforce a reasonable minimum interval (e.g., 10ms or 50ms) to prevent accidental performance degradation.
```go
millis, err := strconv.Atoi(raw)
if err != nil || millis <= 0 {
	return defaultHealthPollInterval
}
if millis < 10 {
	millis = 10
}
```
References
- For low-latency systems, a short duration like 10ms is generally the acceptable minimum for repetitive operations or backoffs to prevent performance degradation.
```go
healthSrv := health.NewServer()
healthpb.RegisterHealthServer(gs, healthSrv)
go observeLeaderHealth(ctx, engine, healthSrv, serviceNames, healthPollInterval())
```
The goroutine started here depends on the provided ctx for termination. If the caller passes context.Background() (which is the fallback at line 27), this goroutine will leak if the gRPC server is stopped but the process continues. While the current usage in main.go and tests seems to handle this via application-level contexts, it's a potential pitfall for future integrations.
References
- When designing interfaces and background tasks, use context.Context for managing deadlines and cancellation to ensure proper resource cleanup and prevent goroutine leaks.
/gemini review
Code Review
This pull request replaces external Raft administration and health dependencies with a new internal raftadmin package. It introduces a RaftAdmin gRPC service, a corresponding CLI tool, and integrated health checking. The changes also include updates to the Raft engine interfaces and implementations (Etcd and Hashicorp) to support these administrative operations. Feedback suggests enhancing the CLI's TLS configuration to support custom CA certificates and ensuring consistent context usage in the health monitoring logic to avoid static status states.
cmd/raftadmin/main.go
Outdated
```go
	return nil, err
}
if useTLS {
	return credentials.NewTLS(&tls.Config{MinVersion: tls.VersionTLS12}), nil
```
The TLS configuration uses MinVersion: tls.VersionTLS12 but does not specify any RootCAs. While this works for publicly trusted certificates, internal Raft administration endpoints often use self-signed certificates or internal CAs. Consider adding a flag to allow users to provide a custom CA certificate file for verification.
internal/raftadmin/health.go
Outdated
```go
if ctx == nil {
	registerStaticHealthService(healthSrv, engine, serviceNames)
	return
}
```
When ctx is nil, the health status is set once but never updated. It would be safer to enforce a non-nil context to ensure proper lifecycle management and cancellation, as context.Context should be the primary mechanism for managing such operations in this repository.
References
- When designing interfaces, use context.Context for managing deadlines and cancellation.
```go
	return cfg, nil
}

pemBytes, err := os.ReadFile(caPath)
```
🚫 [golangci] reported by reviewdog 🐶
G703: Path traversal via taint analysis (gosec)
/gemini review
Code Review
This pull request replaces the external raftadmin and leaderhealth dependencies with a custom internal implementation. It introduces a new RaftAdmin gRPC service, a corresponding server implementation in internal/raftadmin, and a new CLI tool in cmd/raftadmin. The raftengine interfaces have been expanded to include administrative operations such as adding voters and transferring leadership, with concrete implementations added for the Hashicorp Raft engine. Jepsen tests and deployment scripts were also updated to integrate these changes. One review comment suggests verifying the Go version or restoring a loop variable capture in adapter/s3_test.go to prevent potential race conditions in parallel tests.
| } | ||
|
|
||
| for _, tc := range cases { | ||
| tc := tc |
The tc := tc line has been removed from this for loop, which contains a parallel test (t.Parallel()). This is safe only if the project is built with Go 1.22 or newer, as loop variable semantics changed in that version to prevent this kind of data race in sub-tests.
If the project's Go version is older than 1.22, this change introduces a race condition where multiple parallel sub-tests might operate on the same test case data from the slice. Please ensure the project's minimum Go version is 1.22, or restore this line to prevent potential test flakiness.
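The idiom the comment refers to can be kept harmlessly on any Go version. This self-contained sketch mirrors the table-driven t.Parallel() shape with plain goroutines:

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// runParallel mirrors the table-driven t.Parallel() pattern: each goroutine
// must see its own test case. The `tc := tc` shadowing is required before
// Go 1.22, where the loop variable was shared across iterations, and is
// harmless on Go 1.22+, where the variable is already per-iteration.
func runParallel(cases []string) []string {
	var (
		mu  sync.Mutex
		wg  sync.WaitGroup
		got []string
	)
	for _, tc := range cases {
		tc := tc // safe on every Go version
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			got = append(got, tc)
			mu.Unlock()
		}()
	}
	wg.Wait()
	sort.Strings(got)
	return got
}

func main() {
	fmt.Println(runParallel([]string{"a", "b", "c"}))
}
```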