Skip to content

feat(proxy): move chat completion to new provider integration#37

Merged
bzp2010 merged 3 commits intomainfrom
bzp/feat-move-chat-to-new-gateway
Apr 13, 2026
Merged

feat(proxy): move chat completion to new provider integration#37
bzp2010 merged 3 commits intomainfrom
bzp/feat-move-chat-to-new-gateway

Conversation

@bzp2010
Copy link
Copy Markdown
Collaborator

@bzp2010 bzp2010 commented Apr 13, 2026

Summary by CodeRabbit

  • Bug Fixes

    • Fixed duplicate version prefixes when building provider URLs.
    • Added clearer internal error mapping for gateway failures.
  • Improvements

    • Consistent token-total derivation and propagation for requests and streams.
    • Simplified streaming flow with a dedicated usage observer for more reliable accounting.
    • Router now routes through the gateway for unified provider handling.
  • Tests

    • Added unit tests for token-total derivation and URL behavior; updated CI test script.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 13, 2026

📝 Walkthrough

Walkthrough

Switched Usage total-token derivation to centralized helpers and updated callers to call with_derived_total(); adjusted provider URL construction; threaded a shared Gateway into proxy state and reworked provider-instance creation; replaced ResponseData/TokenUsage plumbing with direct Usage usage across hooks and handlers; added related tests.

Changes

Cohort / File(s) Summary
Usage core & converters
src/gateway/types/common.rs, src/gateway/formats/anthropic_messages.rs, src/gateway/streams/bridged.rs
Added Usage::resolved_total_tokens() and Usage::with_derived_total(); removed local total-token arithmetic in Anthropic converters and bridged stream and now call .with_derived_total(); added unit tests ensuring cached input tokens are included.
Provider meta & instance creation
src/gateway/traits/provider.rs, src/proxy/provider.rs
ProviderMeta::build_url now parses base_url into reqwest::Url and merges path segments to avoid duplicated version prefixes with fallback to prior concat behavior; added create_legacy_provider() and create_provider_instance() to build provider/provider-instance with auth and optional base URL parsing/validation.
Proxy app state & wiring
src/main.rs, src/proxy/mod.rs
Constructed shared Arc<Gateway> from default provider registry, threaded it into AppState::new(), added AppState::gateway() accessor, and introduced provider module.
Chat completions handler & types
src/proxy/handlers/chat_completions/mod.rs, src/proxy/handlers/chat_completions/types.rs, src/proxy/types.rs
Routed chat requests through gateway/provider-instance resolution; removed inline handler DTOs and consolidated legacy DTOs into src/proxy/types.rs; changed error mapping to use GatewayError and added MissingModelInContext.
Embeddings handler changes
src/proxy/handlers/embeddings/mod.rs
Switched to create_legacy_provider(); added embedding_usage() helper to produce a Usage from embedding responses; updated rate-limit and observability calls to accept &Usage.
Hooks: observability & rate limiting
src/proxy/hooks/mod.rs, src/proxy/hooks/observability/mod.rs, src/proxy/hooks/rate_limit/mod.rs
Removed ResponseData/TokenUsage wrappers; updated record_usage/record_streaming_usage and post_check/post_check_streaming to accept &Usage and use resolved_total_tokens() for totals; updated call sites accordingly.
Gateway error
src/gateway/error.rs
Added GatewayError::Internal(String) variant mapped to HTTP 500 and marked non-retryable; added tests.
Tests & tooling
tests/package.json
Changed test script to vitest run and added test:dev script.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Handler as Chat Handler
    participant Gateway
    participant Provider as Provider Instance
    participant RateLimit as Rate Limiter
    participant Observability

    Client->>Handler: POST /chat/completions
    Handler->>Gateway: resolve provider instance (model)
    Gateway-->>Handler: ProviderInstance
    Handler->>Provider: chat(request, provider_instance)
    alt Streaming
        Provider-->>Handler: ChatResponseStream
        Handler->>Handler: spawn_stream_usage_observer()
        Handler-->>Client: SSE stream chunks
        Handler->>RateLimit: post_check_streaming(usage)
        Handler->>Observability: record_streaming_usage(usage)
    else Non-Streaming
        Provider-->>Handler: Complete response (includes Usage)
        Handler->>RateLimit: post_check(usage)
        Handler->>Observability: record_usage(usage)
        Handler-->>Client: JSON response
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs


Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error, 1 warning)

Check name Status Explanation Resolution
Security Check ❌ Error Critical credential leakage vulnerability: error messages in provider_auth_and_base_url expose full api_base URLs with embedded credentials to API clients Sanitize error messages by stripping userinfo from URLs before inclusion; implement custom Debug for ProviderAuth to redact ApiKey values
E2e Test Quality Review ⚠️ Warning PR introduces silent error handling defect in handle_regular_request() where post_check failures are logged but ignored, returning HTTP 200 instead of error responses. Return error responses (ChatCompletionError::Internal) on post_check failures instead of logging and continuing; add E2E tests for incomplete usage data.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title accurately describes the main objective: migrating chat completion handling to use the new gateway provider integration instead of direct provider construction.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch bzp/feat-move-chat-to-new-gateway

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (4)
src/gateway/traits/provider.rs (1)

324-332: Test covers the primary use case.

Consider adding edge case tests for robustness (optional):

  • build_url("https://example.com", "ignored") → no overlap
  • build_url("https://example.com/v1/", "ignored") → trailing slash handling
  • build_url("not-a-valid-url", "ignored") → fallback path
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/gateway/traits/provider.rs` around lines 324 - 332, Add additional unit
tests for build_url on DummyProvider to cover edge cases: assert
build_url("https://example.com", "ignored") returns
"https://example.com/chat/completions" (no overlapping version), assert
build_url("https://example.com/v1/", "ignored") handles trailing slash and
returns "https://example.com/v1/chat/completions", and assert
build_url("not-a-valid-url", "ignored") exercises the fallback path behavior;
add these alongside the existing build_url_avoids_duplicate_version_prefixes
test and ensure the tests reference DummyProvider::build_url to locate the
implementation.
src/proxy/hooks/observability/mod.rs (1)

61-90: Trace the public observability hooks.

record_start_time, record_usage, record_first_token_latency, and record_streaming_usage are public entrypoints under src/, but none has #[fastrace::trace].

As per coding guidelines, src/**/*.rs: Use #[fastrace::trace] decorator for distributed tracing on public functions.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/proxy/hooks/observability/mod.rs` around lines 61 - 90, Add the
#[fastrace::trace] attribute to each public observability hook to enable
distributed tracing: place #[fastrace::trace] directly above the async fn
declarations for record_start_time, record_usage, record_first_token_latency,
and record_streaming_usage; ensure the fastrace macro is in scope (import if
necessary) so the attribute compiles cleanly.
src/proxy/provider.rs (1)

14-21: Document and trace the new public provider helpers.

create_legacy_provider and create_provider_instance are exported from src/, but they are missing /// docs and #[fastrace::trace].

As per coding guidelines, src/**/*.rs: Use #[fastrace::trace] decorator for distributed tracing on public functions, and **/*.rs: Use /// for doc comments on public items in Rust.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/proxy/provider.rs` around lines 14 - 21, Add missing documentation
comments and fastrace decorators to the public helper functions: annotate
create_legacy_provider and create_provider_instance with a brief triple-slash
doc comment describing purpose, parameters (e.g., model, gateway) and return
value, and add the #[fastrace::trace] attribute above each function to enable
distributed tracing; ensure the docs mention that create_legacy_provider returns
a Box<dyn Provider> created from model.provider_config and that
create_provider_instance returns Result<ProviderInstance> for the given Gateway
and Model so callers know expected behavior.
src/proxy/types.rs (1)

7-122: Add /// docs to the exported proxy types.

This module now exposes a fairly large public request/response surface via crate::proxy::types, but none of the new structs is documented. A short doc comment per exported type would make the legacy boundary much easier to consume and maintain.

As per coding guidelines, **/*.rs: Use /// for doc comments on public items in Rust.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/proxy/types.rs` around lines 7 - 122, Add concise /// doc comments for
each exported public type to document the proxy request/response surface: add
one-line summaries (and optional brief notes about important fields) using ///
above ChatMessage, ChatCompletionResponseFormat, ChatCompletionRequest,
ChatCompletionChoice, ChatCompletionUsage, ChatCompletionResponse,
ChatCompletionChunkDelta, ChatCompletionChunkToolCall,
ChatCompletionChunkToolCallFunction, ChatCompletionChunkChoice, and
ChatCompletionChunk; keep comments short (one sentence) describing the role of
the struct (e.g., "Represents a chat message with role and content"), place them
immediately above the pub struct declarations, and preserve existing derives
(Serialize/Deserialize/Debug/Clone/Default).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/proxy/handlers/chat_completions/types.rs`:
- Around line 33-39: The current mapping in ChatCompletionError::GatewayError
(in types.rs) leaks raw upstream/transport text via err.to_string() into the
public JSON; change the produced "message" so that for server errors
(err.status_code().is_server_error() / 5xx) you return a generic message like
"Upstream provider error" and for non-server/client-safe cases only include a
sanitized/whitelisted excerpt (e.g., truncated or stripped of URLs/headers)
while preserving the existing "type" and "code" fields; update the Json
construction where ChatCompletionError::GatewayError is matched to conditionally
choose the generic message for 5xx and a sanitized err string for safe cases
(use err.status_code() to branch and a small sanitizer helper to truncate/remove
sensitive patterns).

In `@src/proxy/provider.rs`:
- Around line 23-28: The provider lookup and stored config parsing are
server-side/config errors, not client validation failures; replace the
GatewayError::Validation construction in the provider registry lookup (the let
def = gateway.registry().get(provider_name)... block) with the gateway/internal
error variant (e.g., GatewayError::Internal or the project’s internal/gateway
error variant) and do the same for the similar occurrence around the api_base
parsing (the code at the other occurrence referenced in the comment). Ensure the
new error includes the same contextual message (provider name or malformed
api_base) so logs retain useful info while returning a 5xx/internal error
classification.

---

Nitpick comments:
In `@src/gateway/traits/provider.rs`:
- Around line 324-332: Add additional unit tests for build_url on DummyProvider
to cover edge cases: assert build_url("https://example.com", "ignored") returns
"https://example.com/chat/completions" (no overlapping version), assert
build_url("https://example.com/v1/", "ignored") handles trailing slash and
returns "https://example.com/v1/chat/completions", and assert
build_url("not-a-valid-url", "ignored") exercises the fallback path behavior;
add these alongside the existing build_url_avoids_duplicate_version_prefixes
test and ensure the tests reference DummyProvider::build_url to locate the
implementation.

In `@src/proxy/hooks/observability/mod.rs`:
- Around line 61-90: Add the #[fastrace::trace] attribute to each public
observability hook to enable distributed tracing: place #[fastrace::trace]
directly above the async fn declarations for record_start_time, record_usage,
record_first_token_latency, and record_streaming_usage; ensure the fastrace
macro is in scope (import if necessary) so the attribute compiles cleanly.

In `@src/proxy/provider.rs`:
- Around line 14-21: Add missing documentation comments and fastrace decorators
to the public helper functions: annotate create_legacy_provider and
create_provider_instance with a brief triple-slash doc comment describing
purpose, parameters (e.g., model, gateway) and return value, and add the
#[fastrace::trace] attribute above each function to enable distributed tracing;
ensure the docs mention that create_legacy_provider returns a Box<dyn Provider>
created from model.provider_config and that create_provider_instance returns
Result<ProviderInstance> for the given Gateway and Model so callers know
expected behavior.

In `@src/proxy/types.rs`:
- Around line 7-122: Add concise /// doc comments for each exported public type
to document the proxy request/response surface: add one-line summaries (and
optional brief notes about important fields) using /// above ChatMessage,
ChatCompletionResponseFormat, ChatCompletionRequest, ChatCompletionChoice,
ChatCompletionUsage, ChatCompletionResponse, ChatCompletionChunkDelta,
ChatCompletionChunkToolCall, ChatCompletionChunkToolCallFunction,
ChatCompletionChunkChoice, and ChatCompletionChunk; keep comments short (one
sentence) describing the role of the struct (e.g., "Represents a chat message
with role and content"), place them immediately above the pub struct
declarations, and preserve existing derives
(Serialize/Deserialize/Debug/Clone/Default).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6ec5da92-480e-40ed-8e96-ad18704b8348

📥 Commits

Reviewing files that changed from the base of the PR and between 7de3d35 and bc2f386.

📒 Files selected for processing (15)
  • src/gateway/formats/anthropic_messages.rs
  • src/gateway/streams/bridged.rs
  • src/gateway/traits/provider.rs
  • src/gateway/types/common.rs
  • src/main.rs
  • src/proxy/handlers/chat_completions/mod.rs
  • src/proxy/handlers/chat_completions/types.rs
  • src/proxy/handlers/embeddings/mod.rs
  • src/proxy/hooks/mod.rs
  • src/proxy/hooks/observability/mod.rs
  • src/proxy/hooks/rate_limit/mod.rs
  • src/proxy/mod.rs
  • src/proxy/provider.rs
  • src/proxy/types.rs
  • tests/package.json
💤 Files with no reviewable changes (1)
  • src/proxy/hooks/mod.rs

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
src/gateway/error.rs (1)

168-179: Use pretty_assertions::assert_eq in these new unit tests.

These assertions are in src/**, so the repo convention is to import the pretty-assertions macro for clearer diffs on failures.

As per coding guidelines, "Use pretty_assertions::assert_eq and assert_matches::assert_matches for better test output".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/gateway/error.rs` around lines 168 - 179, Replace the standard assert_eq
usage in the new unit tests with the repo-preferred pretty_assertions::assert_eq
to get better diffs: either add a use pretty_assertions::assert_eq; at the top
of the test module or change the two assert_eq calls in the tests (including the
one in internal_error_is_not_retryable_and_maps_to_500 and the preceding test
that asserts GatewayError::Internal(...).to_string()) to call
pretty_assertions::assert_eq!(...) so the tests follow the project's testing
convention.
src/gateway/traits/provider.rs (1)

226-229: Switch these tests to pretty_assertions::assert_eq.

The new URL-construction coverage is useful; it should use the repo’s preferred assertion macro so failures show an actual diff.

As per coding guidelines, "Use pretty_assertions::assert_eq and assert_matches::assert_matches for better test output".

Also applies to: 355-392

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/gateway/traits/provider.rs` around lines 226 - 229, Replace the standard
assert_eq usages in the tests covering the URL-construction sections with the
repo-preferred pretty_assertions::assert_eq: add use
pretty_assertions::assert_eq; inside the test module (near the existing use
statements like std::borrow::Cow, http::HeaderMap, serde_json::json) and change
the assertion calls in the blocks around the provided ranges (the
URL-construction tests and the tests at 355-392) from assert_eq!(...) to
assert_eq!(...) from pretty_assertions so test failures produce diffs; ensure
the import is only in cfg(test) scope if the module is shared with non-test
code.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/proxy/handlers/chat_completions/types.rs`:
- Around line 35-43: The match arm in the error mapping in types.rs is
incorrectly classifying GatewayError::Internal as a provider_error; update the
match in the error handling (the tuple assignment for (message, error_type,
code) where GatewayError::Provider, GatewayError::Http, GatewayError::Stream are
grouped) to remove GatewayError::Internal from that provider_error branch and
add a separate branch for GatewayError::Internal that returns a distinct
message/type/code (e.g., indicating an internal/gateway configuration error) so
create_provider_instance() errors are not reported as provider errors; locate
the mapping near the error match and adjust the arms accordingly.

In `@src/proxy/provider.rs`:
- Around line 52-55: The parsed api_base value currently only checks URL syntax
via Url::parse in the base_url_override construction; add a scheme validation
after parsing to reject non-HTTP schemes by returning a GatewayError::Internal
(e.g., check url.scheme() and ensure it is "http" or "https"), so when
Some(api_base) is provided the code parses the URL and then validates scheme
(rejecting file:, mailto:, etc.) before wrapping it in Some(...).

---

Nitpick comments:
In `@src/gateway/error.rs`:
- Around line 168-179: Replace the standard assert_eq usage in the new unit
tests with the repo-preferred pretty_assertions::assert_eq to get better diffs:
either add a use pretty_assertions::assert_eq; at the top of the test module or
change the two assert_eq calls in the tests (including the one in
internal_error_is_not_retryable_and_maps_to_500 and the preceding test that
asserts GatewayError::Internal(...).to_string()) to call
pretty_assertions::assert_eq!(...) so the tests follow the project's testing
convention.

In `@src/gateway/traits/provider.rs`:
- Around line 226-229: Replace the standard assert_eq usages in the tests
covering the URL-construction sections with the repo-preferred
pretty_assertions::assert_eq: add use pretty_assertions::assert_eq; inside the
test module (near the existing use statements like std::borrow::Cow,
http::HeaderMap, serde_json::json) and change the assertion calls in the blocks
around the provided ranges (the URL-construction tests and the tests at 355-392)
from assert_eq!(...) to assert_eq!(...) from pretty_assertions so test failures
produce diffs; ensure the import is only in cfg(test) scope if the module is
shared with non-test code.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0cdb7906-a566-4119-a657-9600076b6b55

📥 Commits

Reviewing files that changed from the base of the PR and between bc2f386 and 11208d9.

📒 Files selected for processing (5)
  • src/gateway/error.rs
  • src/gateway/traits/provider.rs
  • src/proxy/handlers/chat_completions/types.rs
  • src/proxy/provider.rs
  • src/proxy/types.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/proxy/types.rs

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/proxy/provider.rs (1)

16-18: Consider unifying config normalization between legacy and gateway paths.

create_provider_instance uses centralized auth/base-url parsing + validation, while create_legacy_provider delegates directly to create_provider. A shared normalization path would reduce long-term behavior drift between embeddings/chat integrations.

Also applies to: 44-72

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/proxy/provider.rs` around lines 16 - 18, The legacy path currently calls
create_provider(&model.provider_config) directly causing divergence from
create_provider_instance's centralized auth/base-url parsing and validation;
update create_legacy_provider to reuse the same normalization/validation logic
used by create_provider_instance (or call a shared helper that both
create_provider_instance and create_legacy_provider use) so provider config is
parsed, validated and normalized consistently; specifically, refactor or call
the common normalization routine used by create_provider_instance and then pass
the normalized config into create_provider (or into the unified provider
factory) to avoid behavior drift between embeddings/chat integrations.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/proxy/provider.rs`:
- Around line 16-18: The legacy path currently calls
create_provider(&model.provider_config) directly causing divergence from
create_provider_instance's centralized auth/base-url parsing and validation;
update create_legacy_provider to reuse the same normalization/validation logic
used by create_provider_instance (or call a shared helper that both
create_provider_instance and create_legacy_provider use) so provider config is
parsed, validated and normalized consistently; specifically, refactor or call
the common normalization routine used by create_provider_instance and then pass
the normalized config into create_provider (or into the unified provider
factory) to avoid behavior drift between embeddings/chat integrations.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9fe94502-a13b-4eee-863d-bb3a9b0f29a4

📥 Commits

Reviewing files that changed from the base of the PR and between 11208d9 and 597c7d1.

📒 Files selected for processing (2)
  • src/proxy/handlers/chat_completions/types.rs
  • src/proxy/provider.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/proxy/handlers/chat_completions/types.rs

@bzp2010 bzp2010 merged commit 7e7fb5f into main Apr 13, 2026
10 checks passed
@bzp2010 bzp2010 deleted the bzp/feat-move-chat-to-new-gateway branch April 13, 2026 06:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant