
feat: multi-host RAG service provisioning #335

Merged
tsivaprasad merged 6 commits into main from PLAT-493-rag-service-multi-node-provisioning on Apr 12, 2026

@tsivaprasad
Contributor

Summary

This PR provisions independent RAG service instances across multiple hosts defined in host_ids. Each host receives its own container, config file, API key files, and Postgres service user connected to its co-located Patroni primary.

Changes

  • Forward TargetSessionAttrs into the RAG ServiceInstanceSpecResource so all instances correctly use prefer-standby for read-only connections

  • Add explicit case "rag" to resolveTargetSessionAttrs to document intent and guard against future default changes

  • Add TestGenerateRAGInstanceResources_TargetSessionAttrs to verify TargetSessionAttrs is propagated correctly into the spec resource
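
For context on what prefer-standby means at the connection level: libpq-style clients accept a target_session_attrs parameter, and with prefer-standby they look for a standby among the listed hosts first and fall back to any host if none is a standby. The block below is a minimal sketch, not the control plane's code. buildConnURI, the rag_ro user, the storefront database name, and the host names are all hypothetical, and it is an assumption that the RAG server receives this setting through a URI of this shape.

```go
package main

import (
	"fmt"
	"strings"
)

// buildConnURI is a hypothetical helper that assembles a libpq-style
// multi-host connection URI. It is illustrative only and does not
// escape special characters in its inputs.
func buildConnURI(user, dbName string, hosts []string, targetSessionAttrs string) string {
	uri := fmt.Sprintf("postgresql://%s@%s/%s", user, strings.Join(hosts, ","), dbName)
	if targetSessionAttrs != "" {
		// target_session_attrs is the libpq parameter name.
		uri += "?target_session_attrs=" + targetSessionAttrs
	}
	return uri
}

func main() {
	// Host names and credentials are placeholders (assumptions).
	hosts := []string{"host-1:5432", "host-2:5432", "host-3:5432"}
	fmt.Println(buildConnURI("rag_ro", "storefront", hosts, "prefer-standby"))
}
```

If the field were not forwarded, a client built this way would omit the parameter and fall back to the client library's default, which is the situation the first bullet above addresses.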

Testing

Verification:

  1. Created a cluster.
  2. Created a database using the config in rag_create_multi_host_db.json.
  3. The database was created successfully. Each of the three RAG containers reports healthy:
docker ps --filter label=pgedge.database.id=storefrontmulti \
  --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"

NAMES                                                              STATUS                   PORTS
rag-storefrontmulti-rag-9ptayhma.1.ct99czvbhjoc5xrimgg2yve7p       Up 2 minutes (healthy)   0.0.0.0:55005->8080/tcp
rag-storefrontmulti-rag-689qacsi.1.paj022yf1nsqx8tiu24sr8w8d       Up 2 minutes (healthy)   0.0.0.0:55004->8080/tcp
rag-storefrontmulti-rag-ant97dj4.1.6dg5wtzwy5xn0pgkrqwwlo320       Up 2 minutes (healthy)   0.0.0.0:55002->8080/tcp

# For each host container (host-1, host-2, host-3):
for h in host-1 host-2 host-3; do
  echo "=== $h ==="
  docker exec control-plane-dev-${h}-1 \
    ls /Users/sivat/projects/control-plane/control-plane/docker/control-plane-dev/data/${h}/services/storefrontmulti-rag-${h}/
  docker exec control-plane-dev-${h}-1 \
    ls /Users/sivat/projects/control-plane/control-plane/docker/control-plane-dev/data/${h}/services/storefrontmulti-rag-${h}/keys/
done

=== host-1 ===
keys
pgedge-rag-server.yaml
default_embedding.key
default_rag.key
=== host-2 ===
keys
pgedge-rag-server.yaml
default_embedding.key
default_rag.key
=== host-3 ===
keys
pgedge-rag-server.yaml
default_embedding.key
default_rag.key

docker exec control-plane-dev-host-1-1 curl -s http://127.0.0.1:55005/v1/health
docker exec control-plane-dev-host-1-1 curl -s http://127.0.0.1:55004/v1/health
docker exec control-plane-dev-host-1-1 curl -s http://127.0.0.1:55002/v1/health
{"status":"healthy"}
{"status":"healthy"}
{"status":"healthy"}
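
The same health probe could be scripted instead of run by hand. This is a rough sketch under stated assumptions: checkHealth and isHealthy are hypothetical helpers, the response shape is taken from the curl output above, and a local httptest server stands in for a RAG instance so the example runs without a cluster.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// healthResponse mirrors the {"status":"healthy"} body shown above.
type healthResponse struct {
	Status string `json:"status"`
}

// isHealthy reports whether a health response body has status "healthy".
func isHealthy(body []byte) (bool, error) {
	var hr healthResponse
	if err := json.Unmarshal(body, &hr); err != nil {
		return false, err
	}
	return hr.Status == "healthy", nil
}

// checkHealth issues GET <baseURL>/v1/health and interprets the body.
func checkHealth(client *http.Client, baseURL string) (bool, error) {
	resp, err := client.Get(baseURL + "/v1/health")
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return false, fmt.Errorf("unexpected status %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return false, err
	}
	return isHealthy(body)
}

func main() {
	// Stand-in for a RAG instance; a real run would target the published ports.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path != "/v1/health" {
			http.NotFound(w, r)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"status":"healthy"}`)
	}))
	defer srv.Close()

	ok, err := checkHealth(srv.Client(), srv.URL)
	fmt.Println(ok, err)
}
```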


docker exec control-plane-dev-host-1-1 curl -s -X POST \
  http://127.0.0.1:55005/v1/pipelines/default \
  -H "Content-Type: application/json" \
  -d '{"query": "test"}'

{"answer":"No relevant information found in the available documents.","tokens_used":0}

Checklist

  • Tests added

PLAT-493

@coderabbitai

coderabbitai bot commented Apr 8, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 55af2ac2-7bbf-444b-9683-7c9c8663be46

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Walkthrough

The changes propagate TargetSessionAttrs configuration through RAG service instance generation in the orchestrator layer and add logic to resolve RAG services to prefer-standby session mode by default, with corresponding test coverage for validation.

Changes

| Cohort / File(s) | Summary |
|---|---|
| RAG Service Orchestration: server/internal/orchestrator/swarm/orchestrator.go, server/internal/orchestrator/swarm/rag_service_user_role_test.go | Added TargetSessionAttrs field population in generateRAGInstanceResources, with a new unit test validating that the attribute is correctly propagated to generated ServiceInstanceSpecResource instances. |
| RAG Service Attribute Resolution: server/internal/workflows/plan_update.go, server/internal/workflows/plan_update_test.go | Extended resolveTargetSessionAttrs function to explicitly handle RAG service types by returning TargetSessionAttrsPreferStandby as default, with two new test cases covering both default and explicit override scenarios. |

Poem

🐰 A rabbit hops through sessions fine,
RAG attributes now align,
With prefer-standby as the base,
Tests ensure there's no lost trace! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description check | ❓ Inconclusive | The description includes Summary, Changes, and Testing sections with detailed verification steps, but the Checklist and Notes for Reviewers sections are incomplete or missing most items, and no Changelog entry is mentioned. | Complete the Checklist by checking the remaining items (Documentation, Issue link, Changelog entry, Breaking changes), and add a Notes for Reviewers section if needed for context. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately describes the main change: adding multi-host RAG service provisioning capability, which aligns with the core objective of provisioning independent RAG instances across multiple hosts. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


@tsivaprasad tsivaprasad requested a review from jason-lynch April 8, 2026 11:36
@codacy-production

codacy-production bot commented Apr 8, 2026

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues


🟢 Metrics 0 complexity · -2 duplication

| Metric | Results |
|---|---|
| Complexity | 0 |
| Duplication | -2 |


@tsivaprasad
Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai bot commented Apr 8, 2026

@tsivaprasad I'll review the changes in this PR now.


🧠 Learnings used
Learnt from: tsivaprasad
Repo: pgEdge/control-plane PR: 302
File: server/internal/orchestrator/swarm/orchestrator.go:686-704
Timestamp: 2026-04-04T13:55:26.501Z
Learning: In `server/internal/orchestrator/swarm/orchestrator.go` (Go), the `generateRAGInstanceResources` function intentionally omits the `ServiceInstanceSpecResource`/`ServiceInstanceResource` pair (and the corresponding `KeysPath` mount wiring). The deployable service spec/instance resources for RAG are deferred to a subsequent ticket. Do not flag the absence of `ServiceInstanceSpecResource`/`ServiceInstanceResource` in `generateRAGInstanceResources` as a missing implementation during code review.
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.


@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (2)
server/internal/workflows/plan_update.go (1)

169-183: Make postgrest explicit in resolveTargetSessionAttrs.

postgrest is a supported service type but currently falls through default. Adding an explicit case keeps intent stable if the fallback behavior changes later.

♻️ Suggested refactor
 	switch serviceSpec.ServiceType {
 	case "mcp":
 		// MCP maps allow_writes → primary/prefer-standby
 		if allowWrites, ok := serviceSpec.Config["allow_writes"].(bool); ok && allowWrites {
 			return database.TargetSessionAttrsPrimary
 		}
 		return database.TargetSessionAttrsPreferStandby
+	case "postgrest":
+		// PostgREST defaults to read-only routing unless explicitly overridden.
+		return database.TargetSessionAttrsPreferStandby
 	case "rag":
 		// RAG is read-only; always prefer a standby when available.
 		return database.TargetSessionAttrsPreferStandby
 	default:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/internal/workflows/plan_update.go` around lines 169 - 183, The switch
on serviceSpec.ServiceType inside resolveTargetSessionAttrs currently treats
"postgrest" via the default branch; add an explicit case for "postgrest" so its
intent is preserved if the default behavior changes. Modify
resolveTargetSessionAttrs to include case "postgrest": returning
database.TargetSessionAttrsPreferStandby (same behavior as current default),
keeping "mcp" and "rag" cases unchanged and ensuring serviceSpec.ServiceType is
compared against the literal "postgrest".
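
The suggested shape of the switch can be exercised in isolation. What follows is a self-contained sketch, not the code in plan_update.go: ServiceSpec and the two string constants are stand-ins for the real serviceSpec type and the database package constants, their string values are assumptions, and the default branch returns prefer-standby only because the review text above describes that as the current default behavior. Any explicit override handling in the real function is omitted.

```go
package main

import "fmt"

// Stand-ins for database.TargetSessionAttrsPrimary and
// database.TargetSessionAttrsPreferStandby; the string values are assumed.
const (
	TargetSessionAttrsPrimary       = "primary"
	TargetSessionAttrsPreferStandby = "prefer-standby"
)

// ServiceSpec is a hypothetical, reduced version of the real service spec.
type ServiceSpec struct {
	ServiceType string
	Config      map[string]any
}

// resolveTargetSessionAttrs mirrors the switch discussed in the review.
func resolveTargetSessionAttrs(spec ServiceSpec) string {
	switch spec.ServiceType {
	case "mcp":
		// MCP maps allow_writes to primary, otherwise prefer-standby.
		if allowWrites, ok := spec.Config["allow_writes"].(bool); ok && allowWrites {
			return TargetSessionAttrsPrimary
		}
		return TargetSessionAttrsPreferStandby
	case "postgrest":
		// Explicit case so the intent survives a future default change.
		return TargetSessionAttrsPreferStandby
	case "rag":
		// RAG is read-only; prefer a standby when available.
		return TargetSessionAttrsPreferStandby
	default:
		// Assumed to match the current fallback described in the review.
		return TargetSessionAttrsPreferStandby
	}
}

func main() {
	specs := []ServiceSpec{
		{ServiceType: "mcp", Config: map[string]any{"allow_writes": true}},
		{ServiceType: "mcp"},
		{ServiceType: "postgrest"},
		{ServiceType: "rag"},
	}
	for _, s := range specs {
		fmt.Printf("%s -> %s\n", s.ServiceType, resolveTargetSessionAttrs(s))
	}
}
```

With explicit cases, changing the default branch later would affect only service types that are not named in the switch.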
server/internal/orchestrator/swarm/rag_service_user_role_test.go (1)

370-385: Avoid index-coupled lookup in this test.

Using result.Resources[5] makes the test brittle to unrelated resource-order changes. Locate ServiceInstanceSpecResource by identifier type, and prefer the constant over a string literal.

♻️ Suggested refactor
-		TargetSessionAttrs: "prefer-standby",
+		TargetSessionAttrs: database.TargetSessionAttrsPreferStandby,
 	}
@@
-	// ServiceInstanceSpecResource is at index 5 (single node: Network + RO + dir + keys + config + instanceSpec + instance).
-	sis, err := resource.ToResource[*ServiceInstanceSpecResource](result.Resources[5])
-	if err != nil {
-		t.Fatalf("ToResource ServiceInstanceSpecResource: %v", err)
-	}
-	if sis.TargetSessionAttrs != "prefer-standby" {
-		t.Errorf("TargetSessionAttrs = %q, want %q", sis.TargetSessionAttrs, "prefer-standby")
-	}
+	var sis *ServiceInstanceSpecResource
+	for _, rd := range result.Resources {
+		if rd.Identifier.Type != ResourceTypeServiceInstanceSpec {
+			continue
+		}
+		sis, err = resource.ToResource[*ServiceInstanceSpecResource](rd)
+		if err != nil {
+			t.Fatalf("ToResource ServiceInstanceSpecResource: %v", err)
+		}
+		break
+	}
+	if sis == nil {
+		t.Fatal("ServiceInstanceSpecResource not found")
+	}
+	if sis.TargetSessionAttrs != database.TargetSessionAttrsPreferStandby {
+		t.Errorf("TargetSessionAttrs = %q, want %q", sis.TargetSessionAttrs, database.TargetSessionAttrsPreferStandby)
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/internal/orchestrator/swarm/rag_service_user_role_test.go` around
lines 370 - 385, The test is brittle because it picks
ServiceInstanceSpecResource by fixed index; change it to find the resource by
type/identifier instead: iterate over result.Resources, use
resource.ToResource[*ServiceInstanceSpecResource] (or check the resource's
Type()/Kind() constant) to locate the ServiceInstanceSpecResource instance,
prefer the package constant for the resource identifier rather than a string
literal, then assert on sis.TargetSessionAttrs as before; update the test around
generateRAGInstanceResources/result.Resources to perform this type-based lookup.
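
To illustrate why a type-based lookup is less brittle than result.Resources[5], here is a small standalone sketch. ResourceData, findByType, and the identifier string are hypothetical stand-ins rather than the project's resource package; only the idea of matching on the resource type instead of a slice position comes from the comment above.

```go
package main

import "fmt"

// ResourceData is a hypothetical stand-in for an entry in result.Resources.
type ResourceData struct {
	Type               string
	TargetSessionAttrs string
}

// Assumed identifier value; the real constant lives in the swarm package.
const resourceTypeServiceInstanceSpec = "service_instance_spec"

// findByType returns the first resource with the given type, if any.
func findByType(resources []ResourceData, typ string) (*ResourceData, bool) {
	for i := range resources {
		if resources[i].Type == typ {
			return &resources[i], true
		}
	}
	return nil, false
}

func main() {
	resources := []ResourceData{
		{Type: "network"},
		{Type: "config"},
		{Type: resourceTypeServiceInstanceSpec, TargetSessionAttrs: "prefer-standby"},
		{Type: "instance"},
	}

	// Simulate an unrelated ordering change; an index-based lookup would now
	// read the wrong element, while the type-based lookup is unaffected.
	resources[0], resources[2] = resources[2], resources[0]

	sis, ok := findByType(resources, resourceTypeServiceInstanceSpec)
	if !ok {
		fmt.Println("ServiceInstanceSpecResource not found")
		return
	}
	fmt.Println(sis.TargetSessionAttrs)
}
```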

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f6781a41-3554-4b5e-bf79-53c0c3aaa128

📥 Commits

Reviewing files that changed from the base of the PR and between aebe98f and 59fecc6.

📒 Files selected for processing (4)
  • server/internal/orchestrator/swarm/orchestrator.go
  • server/internal/orchestrator/swarm/rag_service_user_role_test.go
  • server/internal/workflows/plan_update.go
  • server/internal/workflows/plan_update_test.go

// Future service types add cases here.
case "rag":
// RAG is read-only; always prefer a standby when available.
return database.TargetSessionAttrsPreferStandby
Member


Question since I'm unfamiliar with how RAG is used: is the RAG server typically used as a standalone application or is it a component in a larger application? The reason I'm asking is that if it's used alongside something that writes to the database, then this would introduce a delay between when the data is written and when it's available for the RAG server. That can cause odd behavior if you're expecting "read-your-writes" consistency. But, none of that is an issue if the RAG server is just a standalone service.

Contributor Author


The RAG server is read-only: it queries embeddings and document chunks to answer prompts but never writes to Postgres. Writes (embedding) happen separately through a manual process, and a separate ticket covers documenting how to do that. So there's no read-your-writes concern, and prefer-standby is the right choice.

Base automatically changed from PLAT-492-rag-service-container-deployment to main April 12, 2026 11:50
Resolve conflict in orchestrator.go by retaining the
`TargetSessionAttrs` field added by this branch.

Also cleans up the test file rename (rag_service_user_role_test.go
→ rag_instance_resources_test.go) introduced earlier in this branch.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@tsivaprasad tsivaprasad merged commit 8b87c6b into main Apr 12, 2026
3 checks passed
@tsivaprasad tsivaprasad deleted the PLAT-493-rag-service-multi-node-provisioning branch April 12, 2026 12:35