RSPEED-2827: validate model exists before rlsapi_v1 inference#1471
RSPEED-2827: validate model exists before rlsapi_v1 inference#1471major wants to merge 1 commit intolightspeed-core:mainfrom
Conversation
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 20 minutes and 17 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (3)
WalkthroughThis pull request introduces model validation to the inference endpoint. A new helper function Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
✨ Simplify code
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/app/endpoints/rlsapi_v1.py`:
- Line 695: The OpenAPI responses for infer_endpoint are missing the 404 path
that can be returned by _resolve_validated_model_id(); update the
infer_responses declaration to include a 404 response (e.g., status code "404"
with an appropriate description and error schema or reference used by other
endpoints) so the infer_endpoint contract advertises "model not found" responses
returned by _resolve_validated_model_id().
In `@tests/unit/app/endpoints/test_rlsapi_v1.py`:
- Around line 563-565: The test currently asserts on a lowercased string of the
exception detail which is brittle; update the assertions in test_rlsapi_v1
(around the failing block using exc_info) to treat exc_info.value.detail as a
structured dict and assert on specific fields: check that detail["response"]
equals status.HTTP_404_NOT_FOUND (or the expected response text) and assert
detail["cause"] contains or equals the expected cause message (use
case-insensitive comparison if needed). Reference the existing exc_info variable
and the detail attribute to locate and replace the two string-based asserts with
direct assertions on detail["response"] and detail["cause"].
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: ca91af11-5b3a-416c-8a7b-84a0c1291f51
📒 Files selected for processing (2)
src/app/endpoints/rlsapi_v1.pytests/unit/app/endpoints/test_rlsapi_v1.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: E2E Tests for Lightspeed Evaluation job
- GitHub Check: build-pr
- GitHub Check: E2E: server mode / ci
- GitHub Check: E2E: library mode / ci
🧰 Additional context used
📓 Path-based instructions (4)
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Usepytest.mark.asynciomarker for async unit tests
Files:
tests/unit/app/endpoints/test_rlsapi_v1.py
tests/unit/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Use
pytest-mockfor AsyncMock objects in unit tests
Files:
tests/unit/app/endpoints/test_rlsapi_v1.py
src/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.py: Use absolute imports for internal modules (e.g.,from authentication import get_auth_dependency)
Usefrom llama_stack_client import AsyncLlamaStackClientfor Llama Stack imports
Checkconstants.pyfor shared constants before defining new ones
All modules start with descriptive docstrings explaining purpose
Uselogger = get_logger(__name__)fromlog.pyfor module logging
Type aliases defined at module level for clarity
All functions require docstrings with brief descriptions
Use complete type annotations for function parameters and return types
Use union types with modern syntax:str | intinstead ofUnion[str, int]
UseOptional[Type]for optional types in type annotations
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns: return new data structures instead of modifying parameters
Useasync deffor I/O operations and external API calls
HandleAPIConnectionErrorfrom Llama Stack in error handling
Uselogger.debug()for detailed diagnostic information
Uselogger.info()for general information about program execution
Uselogger.warning()for unexpected events or potential problems
Uselogger.error()for serious problems that prevented function execution
All classes require descriptive docstrings explaining purpose
Use PascalCase for class names with descriptive names and standard suffixes:Configuration,Error/Exception,Resolver,Interface
Use complete type annotations for all class attributes; avoid usingAny
Follow Google Python docstring conventions for all modules, classes, and functions
IncludeParameters:,Returns:,Raises:sections in function docstrings as needed
Files:
src/app/endpoints/rlsapi_v1.py
src/app/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/app/**/*.py: Usefrom fastapi import APIRouter, HTTPException, Request, status, Dependsfor FastAPI dependencies
Use FastAPIHTTPExceptionwith appropriate status codes for API endpoint error handling
Files:
src/app/endpoints/rlsapi_v1.py
🧠 Learnings (3)
📓 Common learnings
Learnt from: major
Repo: lightspeed-core/lightspeed-stack PR: 1469
File: src/models/config.py:1928-1933
Timestamp: 2026-04-07T14:44:42.022Z
Learning: In lightspeed-core/lightspeed-stack, `allow_verbose_infer` (previously `customization.allow_verbose_infer`, now `rlsapi_v1.allow_verbose_infer`) is only used internally by the `rlsapi_v1` `/infer` endpoint and has a single known consumer (the PR author). Backward compatibility for this config field relocation is intentionally not required and should not be flagged in future reviews.
📚 Learning: 2026-04-05T12:19:36.009Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-05T12:19:36.009Z
Learning: Applies to tests/unit/**/*.py : Use `pytest-mock` for AsyncMock objects in unit tests
Applied to files:
tests/unit/app/endpoints/test_rlsapi_v1.py
📚 Learning: 2026-04-06T20:18:07.852Z
Learnt from: major
Repo: lightspeed-core/lightspeed-stack PR: 1463
File: src/app/endpoints/rlsapi_v1.py:266-271
Timestamp: 2026-04-06T20:18:07.852Z
Learning: In the lightspeed-stack codebase, within `src/app/endpoints/` inference/MCP endpoints, treat `tools: Optional[list[Any]]` in MCP tool definitions as an intentional, consistent typing pattern (used across `query`, `responses`, `streaming_query`, `rlsapi_v1`). Do not raise or suggest this as a typing issue during code review; changing it in isolation could break endpoint typing consistency across the codebase.
Applied to files:
src/app/endpoints/rlsapi_v1.py
🪛 GitHub Actions: Integration tests
src/app/endpoints/rlsapi_v1.py
[error] 256-256: infer_endpoint failed while resolving validated model id: 'if not await check_model_configured(client, model_id)' raised TypeError from mocked client.
[warning] authentication.noop: No-op authentication dependency in insecure mode (development-only).
🔇 Additional comments (2)
tests/unit/app/endpoints/test_rlsapi_v1.py (1)
149-160: Good default isolation for model validation path.The autouse fixture is a solid baseline and correctly uses
pytest-mockAsyncMock, making endpoint tests deterministic after the new validation call.src/app/endpoints/rlsapi_v1.py (1)
254-257: No changes needed. The code already properly handles validation-time connection failures.
check_model_configured()explicitly catchesAPIConnectionErrorand converts it toHTTPExceptionwithServiceUnavailableResponse(503). When_resolve_validated_model_id()calls this function, the exception is already handled withincheck_model_configured(), so anyAPIConnectionErrorwill propagate as anHTTPExceptionbefore reaching the caller. The docstring correctly documents thatHTTPException: 503can be raised during validation, and the endpoint's response schema includes the 503 status code.The implementation is correct and compliant with the coding guidelines for handling
APIConnectionErrorfrom Llama Stack.
153d6ae to
6c6a083
Compare
Add check_model_configured() call to the /infer handler so a misconfigured default_model/default_provider gets a clear 404 instead of an opaque 500 from the inference call. Matches the existing pattern in responses.py. Extract validation into _resolve_validated_model_id() to keep infer_endpoint complexity at B(9). Signed-off-by: Major Hayden <major@redhat.com>
6c6a083 to
0984fb5
Compare
Description
The rlsapi_v1
/inferendpoint skips model existence validation whendefault_modelanddefault_providerare both configured. If eithervalue has a typo or the model was removed from Llama Stack, inference
fails with an opaque 500 instead of a useful error.
This adds
check_model_configured()before inference, returning aclean 404 with
NotFoundResponsewhen the model isn't found. Matchesthe existing pattern in
responses.py(line 208).Validation is extracted into
_resolve_validated_model_id()soinfer_endpointcomplexity stays at B(9).Type of change
Tools used to create PR
Related Tickets & Documents
Checklist before requesting a review
Testing
uv run pytest tests/unit/app/endpoints/test_rlsapi_v1.py- 64 tests passuv run make verify- all linters cleanuv run radon cc src/app/endpoints/rlsapi_v1.py -s -n C- no C+ ratingstest_infer_model_not_found_returns_404verifies the 404 pathSummary by CodeRabbit
Release Notes