Skip to content

fix: add allowlist validation for RAG pipeline names#337

Merged
tsivaprasad merged 3 commits intomainfrom
PLAT-536-add-allowlist-validation-for-rag-pipeline-names
Apr 12, 2026
Merged

fix: add allowlist validation for RAG pipeline names#337
tsivaprasad merged 3 commits intomainfrom
PLAT-536-add-allowlist-validation-for-rag-pipeline-names

Conversation

@tsivaprasad
Copy link
Copy Markdown
Contributor

@tsivaprasad tsivaprasad commented Apr 9, 2026

Summary

This PR adds allowlist validation for RAG pipeline names, rejecting names that don't match ^[a-z0-9_-]+$ at the API layer before any resources are provisioned.

Changes

  • Add ragPipelineNamePattern regex (^[a-z0-9_-]+$) to server/internal/database/rag_service_config.go
  • Validate pipeline names against the allowlist in validateRAGPipeline, producing a clear error message on rejection
  • Add TestParseRAGServiceConfig_PipelineNameAllowlist covering 7 valid and 7 invalid name variants (spaces, uppercase, slashes, path traversal, dots, Unicode)

Testing

  1. Create cluster
  2. Create DB with invalid pipeline name
restish control-plane-local-1 create-database '{
    "spec": {
      "database_name": "test_invalid_name",
      "database_users": [{"username":"admin","password":"pw","db_owner":true,"attributes":["LOGIN","SUPERUSER"]}],
      "port": 0,
      "nodes": [{"name":"n1","host_ids":["host-1"]}],
      "services": [{
        "service_id": "rag", "service_type": "rag", "version": "latest",
        "host_ids": ["host-1"], "port": 0,
        "config": {
          "pipelines": [{
            "name": "My Pipeline",
            "tables": [{"table":"docs","text_column":"content","vector_column":"embedding"}],
            "embedding_llm": {"provider":"openai","model":"text-embedding-3-small","api_key":"sk-test"},
            "rag_llm": {"provider":"anthropic","model":"claude-haiku-4-5-20251001","api_key":"sk-ant-test"}
          }]
        }
      }]
    }
  }' | jq .

{
  "message": "services[0].config: pipelines[0].name \"My Pipeline\" is invalid: must match ^[a-z0-9_-]+$",
  "name": "invalid_input"
}

Checklist

  • Tests added

PLAT-536

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 9, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 99d3dc05-08dd-45c0-9a77-b084e8d370ce

📥 Commits

Reviewing files that changed from the base of the PR and between fe33b5a and 90d25b9.

📒 Files selected for processing (2)
  • server/internal/database/rag_service_config.go
  • server/internal/database/rag_service_config_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • server/internal/database/rag_service_config_test.go

📝 Walkthrough

Walkthrough

A compiled regex allowlist (ragPipelineNamePattern) and its source text (ragPipelineNamePatternText) were added to enforce pipeline names matching ^[a-z0-9_][a-z0-9_-]*$ (lowercase/number/underscore, first char not a hyphen). validateRAGPipeline now checks the name against this regex (in addition to required and uniqueness checks). A unit test was added for accepted and rejected names.

Changes

Cohort / File(s) Summary
Pipeline Name Validation
server/internal/database/rag_service_config.go
Added ragPipelineNamePatternText and compiled ragPipelineNamePattern (^[a-z0-9_][a-z0-9_-]*$). Updated validateRAGPipeline to emit an explicit regex-mismatch error when a name fails the allowlist before the duplicate-name uniqueness check.
Validation Tests
server/internal/database/rag_service_config_test.go
Added TestParseRAGServiceConfig_PipelineNameAllowlist. Verifies a set of valid names pass and a variety of invalid inputs (uppercase, spaces, slashes, path traversal, unicode/emoji, dots, leading hyphen, empty string handled separately) produce errors containing the regex constraint.

Poem

🐰 I hop through tests and tidy names,
A tiny regex guards the lanes.
Lowercase, digits, underscore, dash—
No stray caps, emoji, or slash.
Pipelines safe, I munch my greens.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The PR title clearly and concisely describes the main change: adding allowlist validation for RAG pipeline names, which is the primary focus of the changeset.
Description check ✅ Passed The description includes all key template sections (Summary, Changes, Testing, Checklist) with substantive content, though Documentation and Breaking Changes sections are not explicitly addressed (not applicable here).

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch PLAT-536-add-allowlist-validation-for-rag-pipeline-names

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codacy-production
Copy link
Copy Markdown

codacy-production bot commented Apr 9, 2026

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

View in Codacy

🟢 Metrics 0 duplication

Metric Results
Duplication 0

View in Codacy

TIP This summary will be updated as you push new changes. Give us feedback

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
server/internal/database/rag_service_config.go (1)

13-17: Single-source the regex text to avoid drift between validation and error output.

The pattern is currently duplicated in MustCompile(...) and in the error string. If one changes later, tests/messages can silently diverge.

♻️ Proposed refactor
+const ragPipelineNamePatternText = `^[a-z0-9_-]+$`
-var ragPipelineNamePattern = regexp.MustCompile(`^[a-z0-9_-]+$`)
+var ragPipelineNamePattern = regexp.MustCompile(ragPipelineNamePatternText)
...
-		errs = append(errs, fmt.Errorf("%s.name %q is invalid: must match ^[a-z0-9_-]+$", prefix, p.Name))
+		errs = append(errs, fmt.Errorf("%s.name %q is invalid: must match %s", prefix, p.Name, ragPipelineNamePatternText))

Also applies to: 138-140

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/internal/database/rag_service_config.go` around lines 13 - 17, The
regex text is duplicated; extract the pattern string into a single const (e.g.,
ragPipelineNamePatternText) and use that constant when creating
ragPipelineNamePattern with regexp.MustCompile(ragPipelineNamePatternText) and
when building any validation error messages so they remain in sync (also update
the other duplicated occurrences that reference the same literal around the rag
pipeline validation). Ensure you reference ragPipelineNamePatternText in all
places instead of retyping the literal.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@server/internal/database/rag_service_config.go`:
- Around line 13-17: The regex text is duplicated; extract the pattern string
into a single const (e.g., ragPipelineNamePatternText) and use that constant
when creating ragPipelineNamePattern with
regexp.MustCompile(ragPipelineNamePatternText) and when building any validation
error messages so they remain in sync (also update the other duplicated
occurrences that reference the same literal around the rag pipeline validation).
Ensure you reference ragPipelineNamePatternText in all places instead of
retyping the literal.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 47336a12-65a1-4da9-a25c-d2e9cab89f4a

📥 Commits

Reviewing files that changed from the base of the PR and between f26aa71 and 3d9ee78.

📒 Files selected for processing (2)
  • server/internal/database/rag_service_config.go
  • server/internal/database/rag_service_config_test.go

Copy link
Copy Markdown
Member

@jason-lynch jason-lynch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I do have one suggestion, but it's non-blocking if you're certain that these filenames won't be used as a command line argument.

@tsivaprasad tsivaprasad merged commit b79ea7b into main Apr 12, 2026
3 checks passed
@tsivaprasad tsivaprasad deleted the PLAT-536-add-allowlist-validation-for-rag-pipeline-names branch April 12, 2026 17:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants