feat: GPU flavour preferences (interconnect, region, spot)#380
Conversation
… SDK and CLI Add filtering and scheduling support for GPU interconnect type (SXM/PCIe), geographic region (US/EU/CA/APAC), spot instance preference, and InfiniBand networking across the full stack: - Rust SDK: GpuPriceQuery with region filter, GpuRequirementsSpec with interconnect/geo/spot/infiniband fields - Python SDK: expose flavour params on create_deployment, deploy, deploy_vllm, deploy_sglang, and the @deployment decorator - CLI: --interconnect, --region, --spot, --exclude-spot flags on gpu list, gpu up, and deploy commands - Updated .pyi type stubs, README, examples, and tests
WalkthroughThis PR adds GPU flavour filtering capabilities (interconnect, region, spot instance preferences) across the CLI and Python SDK by introducing new Changes
Sequence Diagram(s)sequenceDiagram
participant CLI as CLI User
participant CliCmd as CLI Commands
participant Handler as GPU Rental Handler
participant Helpers as Rental Helpers
participant SDK as Basilica SDK
participant API as Backend API
CLI->>CliCmd: Parse options (interconnect, region, spot, etc.)
CliCmd->>Handler: Call handle_up() with UpOptions
Handler->>Helpers: Create FlavourFilters from options
Handler->>Helpers: Call resolve_offering_unified(filters)
Helpers->>Helpers: Build GpuPriceQuery from FlavourFilters
Helpers->>SDK: Call list_secure_cloud_gpus_filtered(query)
SDK->>API: GET /gpu/prices?interconnect=...®ion=...&spot_only=...
API-->>SDK: ListSecureCloudGpusResponse with filtered offerings
SDK-->>Helpers: Vec<GpuOffering> (filtered)
Helpers-->>Handler: Selected offering
Handler-->>CLI: Deployment created with flavour preferences
CLI-->>CLI User: Success confirmation
sequenceDiagram
participant Python as Python User
participant Client as BasilicaClient
participant PyTypes as Python Types Layer
participant RustBind as Rust Bindings
participant SDK as SDK Client
Python->>Client: create GpuPriceQuery(interconnect, region, spot_only)
Python->>Client: list_secure_cloud_gpus(query)
Client->>PyTypes: Convert query to Rust GpuPriceQuery
PyTypes->>RustBind: Pass GpuPriceQuery to binding
RustBind->>SDK: Call list_secure_cloud_gpus_filtered(query)
SDK-->>RustBind: Vec<GpuOffering>
RustBind->>PyTypes: Convert to Python GpuOffering list
PyTypes-->>Client: List of filtered offerings
Client-->>Python: Return offerings with matching flavour
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Tip Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs). Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/basilica-cli/src/cli/handlers/gpu_rental.rs (1)
1114-1131:⚠️ Potential issue | 🟡 MinorPrevent silent no-op flavour flags for community-only
upflows.When
--compute bourseis selected, Line 1115-1131 still accepts/passes--interconnect/--region/--spot/--exclude-spot, but those flags are not applied to community offerings. This creates confusing UX because user intent is silently ignored.🔧 Proposed fix
// Convert compute arg to cloud filter let cloud_filter = compute.map(|c| match c { ComputeCategoryArg::SecureCloud => ComputeCategory::SecureCloud, ComputeCategoryArg::CommunityCloud => ComputeCategory::CommunityCloud, }); + + if matches!(cloud_filter, Some(ComputeCategory::CommunityCloud)) + && (options.interconnect.is_some() + || options.region.is_some() + || options.spot + || options.exclude_spot) + { + return Err(CliError::Internal( + eyre!("--interconnect, --region, --spot, and --exclude-spot are only supported for The Citadel") + .suggestion("Use --compute citadel, or remove flavour flags for Bourse"), + )); + } // Build flavour filters from CLI options let flavour = crate::cli::handlers::gpu_rental_helpers::FlavourFilters { interconnect: options.interconnect.clone(), region: options.region.clone(),🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/basilica-cli/src/cli/handlers/gpu_rental.rs` around lines 1114 - 1131, If the CLI was invoked for community-only "bourse" flows, don't silently accept user flavour flags: detect the community-only path by checking options.compute (match the enum/string used for "bourse"), and when it is a community-only run, clear or neutralize the flavour fields built for FlavourFilters (options.interconnect, options.region -> None; options.spot/options.exclude_spot -> false) before calling resolve_offering_unified, and emit a short warning that those flags are ignored for community offerings; update the code around the FlavourFilters construction and the resolve_offering_unified call to use the neutralized flavour when options.compute indicates bourse.
🧹 Nitpick comments (2)
crates/basilica-sdk-python/src/types.rs (1)
1852-1896:GpuPriceQuery: Consider validating mutually exclusivespot_onlyandexclude_spot.A user could construct
GpuPriceQuery(spot_only=True, exclude_spot=True), which is contradictory and would silently return zero results. Consider adding a validation check in the constructor or documenting the expected behavior.Optional: constructor validation
fn new( interconnect: Option<String>, region: Option<String>, spot_only: Option<bool>, exclude_spot: Option<bool>, - ) -> Self { + ) -> PyResult<Self> { + if spot_only == Some(true) && exclude_spot == Some(true) { + return Err(pyo3::exceptions::PyValueError::new_err( + "spot_only and exclude_spot cannot both be true", + )); + } - Self { + Ok(Self { interconnect, region, spot_only, exclude_spot, - } + }) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/basilica-sdk-python/src/types.rs` around lines 1852 - 1896, GpuPriceQuery currently allows contradictory options (spot_only and exclude_spot) which can silently produce no results; update the constructor GpuPriceQuery::new (and/or implement a validator before conversion in impl From<GpuPriceQuery> for SdkGpuPriceQuery) to detect when both spot_only and exclude_spot are Some(true) and return a Python error (or panic with a clear message) instead of constructing an invalid query; ensure the error provides which fields are contradictory and stops creation when both are set.crates/basilica-sdk-python/tests/test_gpu_flavour_preferences.py (1)
105-159: Hardcoded deployment names risk collisions in parallel CI.
"pytest-flavour","pytest-no-flavour", and"pytest-no-spot"are static names. If CI runs tests in parallel or a previous run fails to clean up, these will conflict. Consider appending a short random suffix (e.g., UUID fragment) to each name.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/basilica-sdk-python/tests/test_gpu_flavour_preferences.py` around lines 105 - 159, Replace the hardcoded instance_name values in the tests test_create_deployment_with_flavour, test_create_deployment_no_flavour, and test_create_deployment_exclude_spot by appending a short random suffix to avoid CI collisions: generate a small unique token (e.g., uuid.uuid4().hex[:8] or similar) and concatenate it to the instance_name passed into client.create_deployment so each call (the instance_name parameter) becomes unique per run, ensuring cleanup and assertions still use the returned resp.instance_name for teardown.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/basilica-sdk-python/README.md`:
- Line 607: Update the table row for `35_deploy_with_flavour.py` so the
human-readable label uses "geo" instead of "region" to match the deployment API
parameter naming; specifically change the label text "Deploy with interconnect,
region, spot preferences" to "Deploy with interconnect, geo, spot preferences"
(reference the example script name `35_deploy_with_flavour.py` and ensure any
other nearby deployment example labels use `geo` consistently).
In `@crates/basilica-sdk-python/tests/test_gpu_flavour_preferences.py`:
- Around line 72-82: The test test_region_filter_reduces_results is brittle
because it assumes non-US offerings exist; change it to call
client.list_secure_cloud_gpus() and
client.list_secure_cloud_gpus(query=GpuPriceQuery(region="US")), then if
len(us_gpus) == len(all_gpus) call pytest.skip("no non-US offerings to verify
region filter") otherwise assert len(us_gpus) < len(all_gpus); update references
to the functions GpuPriceQuery, client.list_secure_cloud_gpus, and the test name
test_region_filter_reduces_results accordingly so the test soft-skips when
equality occurs instead of failing.
In `@crates/basilica-sdk/src/client.rs`:
- Around line 433-446: The URL is built by concatenating raw query values (see
the local variable url and the fields query.interconnect, query.region,
query.spot_only, query.exclude_spot) which can introduce unencoded reserved
characters; change the code to construct the request with proper URL-encoding
instead of string interpolation—either build a Url with
url::Url::parse_with_params or use the HTTP client's .query(...) mechanism and
pass a serializable struct/map of params, then call self.get(...) with the
encoded URL/Request so the ListSecureCloudGpusResponse call remains unchanged.
In `@examples/35_deploy_with_flavour.py`:
- Around line 30-45: The example "deploy with flavour" calls client.deploy for
"flavour-hello" but omits flavour-related parameters; update the client.deploy
call (function client.deploy, deployment name "flavour-hello") to include
flavour fields such as interconnect, geo, and spot (and gpu fields if
applicable) to demonstrate flavour usage—add appropriate keys like
interconnect=<value>, geo=<value>, spot=<bool> (and gpu_type/gpu_count if
supported) so the example actually exercises the flavour parameters.
---
Outside diff comments:
In `@crates/basilica-cli/src/cli/handlers/gpu_rental.rs`:
- Around line 1114-1131: If the CLI was invoked for community-only "bourse"
flows, don't silently accept user flavour flags: detect the community-only path
by checking options.compute (match the enum/string used for "bourse"), and when
it is a community-only run, clear or neutralize the flavour fields built for
FlavourFilters (options.interconnect, options.region -> None;
options.spot/options.exclude_spot -> false) before calling
resolve_offering_unified, and emit a short warning that those flags are ignored
for community offerings; update the code around the FlavourFilters construction
and the resolve_offering_unified call to use the neutralized flavour when
options.compute indicates bourse.
---
Nitpick comments:
In `@crates/basilica-sdk-python/src/types.rs`:
- Around line 1852-1896: GpuPriceQuery currently allows contradictory options
(spot_only and exclude_spot) which can silently produce no results; update the
constructor GpuPriceQuery::new (and/or implement a validator before conversion
in impl From<GpuPriceQuery> for SdkGpuPriceQuery) to detect when both spot_only
and exclude_spot are Some(true) and return a Python error (or panic with a clear
message) instead of constructing an invalid query; ensure the error provides
which fields are contradictory and stops creation when both are set.
In `@crates/basilica-sdk-python/tests/test_gpu_flavour_preferences.py`:
- Around line 105-159: Replace the hardcoded instance_name values in the tests
test_create_deployment_with_flavour, test_create_deployment_no_flavour, and
test_create_deployment_exclude_spot by appending a short random suffix to avoid
CI collisions: generate a small unique token (e.g., uuid.uuid4().hex[:8] or
similar) and concatenate it to the instance_name passed into
client.create_deployment so each call (the instance_name parameter) becomes
unique per run, ensuring cleanup and assertions still use the returned
resp.instance_name for teardown.
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (18)
crates/basilica-cli/src/cli/commands.rscrates/basilica-cli/src/cli/handlers/deploy/create.rscrates/basilica-cli/src/cli/handlers/deploy/templates/sglang.rscrates/basilica-cli/src/cli/handlers/deploy/templates/vllm.rscrates/basilica-cli/src/cli/handlers/gpu_rental.rscrates/basilica-cli/src/cli/handlers/gpu_rental_helpers.rscrates/basilica-sdk-python/README.mdcrates/basilica-sdk-python/python/basilica/__init__.pycrates/basilica-sdk-python/python/basilica/_basilica.pyicrates/basilica-sdk-python/python/basilica/decorators.pycrates/basilica-sdk-python/python/basilica/spec.pycrates/basilica-sdk-python/src/lib.rscrates/basilica-sdk-python/src/types.rscrates/basilica-sdk-python/tests/test_gpu_flavour_preferences.pycrates/basilica-sdk/src/client.rscrates/basilica-sdk/src/types.rsexamples/34_gpu_flavour_preferences.pyexamples/35_deploy_with_flavour.py
| | `13_lobe_chat_vllm.py` | Full AI stack (LobeChat + vLLM) | | ||
| | `14_streamlit.py` | Interactive Streamlit app | | ||
| | `34_gpu_flavour_preferences.py` | Query GPU offerings with flavour filters | | ||
| | `35_deploy_with_flavour.py` | Deploy with interconnect, region, spot preferences | |
There was a problem hiding this comment.
Use geo wording for deployment example label to match API parameters.
At Line [607], “region” is ambiguous in deployment context because deployment methods use geo while listing filters use region.
📝 Suggested wording change
-| `35_deploy_with_flavour.py` | Deploy with interconnect, region, spot preferences |
+| `35_deploy_with_flavour.py` | Deploy with interconnect, geo, spot preferences |📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| | `35_deploy_with_flavour.py` | Deploy with interconnect, region, spot preferences | | |
| | `35_deploy_with_flavour.py` | Deploy with interconnect, geo, spot preferences | |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/basilica-sdk-python/README.md` at line 607, Update the table row for
`35_deploy_with_flavour.py` so the human-readable label uses "geo" instead of
"region" to match the deployment API parameter naming; specifically change the
label text "Deploy with interconnect, region, spot preferences" to "Deploy with
interconnect, geo, spot preferences" (reference the example script name
`35_deploy_with_flavour.py` and ensure any other nearby deployment example
labels use `geo` consistently).
| def test_region_filter_reduces_results(self, client, can_list_gpus): | ||
| """Verify region=US actually filters out non-US offerings.""" | ||
| if not can_list_gpus: | ||
| pytest.skip("token lacks secure-cloud permission") | ||
| all_gpus = client.list_secure_cloud_gpus() | ||
| us_gpus = client.list_secure_cloud_gpus( | ||
| query=GpuPriceQuery(region="US") | ||
| ) | ||
| assert len(us_gpus) < len(all_gpus), ( | ||
| f"region=US returned {len(us_gpus)} offerings, same as unfiltered {len(all_gpus)}" | ||
| ) |
There was a problem hiding this comment.
Brittle assertion: test_region_filter_reduces_results assumes non-US offerings always exist.
If the API ever returns only US-based offerings (e.g., during a provider outage or inventory change), len(us_gpus) < len(all_gpus) fails with a confusing message. Consider using <= with a pytest.skip when counts are equal, or at minimum softening the assertion.
Suggested fix
all_gpus = client.list_secure_cloud_gpus()
us_gpus = client.list_secure_cloud_gpus(
query=GpuPriceQuery(region="US")
)
- assert len(us_gpus) < len(all_gpus), (
- f"region=US returned {len(us_gpus)} offerings, same as unfiltered {len(all_gpus)}"
- )
+ if len(us_gpus) == len(all_gpus):
+ pytest.skip(
+ f"All {len(all_gpus)} offerings are US-based; cannot verify region filter reduces results"
+ )
+ assert len(us_gpus) < len(all_gpus)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def test_region_filter_reduces_results(self, client, can_list_gpus): | |
| """Verify region=US actually filters out non-US offerings.""" | |
| if not can_list_gpus: | |
| pytest.skip("token lacks secure-cloud permission") | |
| all_gpus = client.list_secure_cloud_gpus() | |
| us_gpus = client.list_secure_cloud_gpus( | |
| query=GpuPriceQuery(region="US") | |
| ) | |
| assert len(us_gpus) < len(all_gpus), ( | |
| f"region=US returned {len(us_gpus)} offerings, same as unfiltered {len(all_gpus)}" | |
| ) | |
| def test_region_filter_reduces_results(self, client, can_list_gpus): | |
| """Verify region=US actually filters out non-US offerings.""" | |
| if not can_list_gpus: | |
| pytest.skip("token lacks secure-cloud permission") | |
| all_gpus = client.list_secure_cloud_gpus() | |
| us_gpus = client.list_secure_cloud_gpus( | |
| query=GpuPriceQuery(region="US") | |
| ) | |
| if len(us_gpus) == len(all_gpus): | |
| pytest.skip( | |
| f"All {len(all_gpus)} offerings are US-based; cannot verify region filter reduces results" | |
| ) | |
| assert len(us_gpus) < len(all_gpus) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/basilica-sdk-python/tests/test_gpu_flavour_preferences.py` around
lines 72 - 82, The test test_region_filter_reduces_results is brittle because it
assumes non-US offerings exist; change it to call
client.list_secure_cloud_gpus() and
client.list_secure_cloud_gpus(query=GpuPriceQuery(region="US")), then if
len(us_gpus) == len(all_gpus) call pytest.skip("no non-US offerings to verify
region filter") otherwise assert len(us_gpus) < len(all_gpus); update references
to the functions GpuPriceQuery, client.list_secure_cloud_gpus, and the test name
test_region_filter_reduces_results accordingly so the test soft-skips when
equality occurs instead of failing.
| let mut url = String::from("/secure-cloud/gpu-prices?available_only=true"); | ||
| if let Some(ref interconnect) = query.interconnect { | ||
| url.push_str(&format!("&interconnect={}", interconnect)); | ||
| } | ||
| if let Some(ref region) = query.region { | ||
| url.push_str(&format!("®ion={}", region)); | ||
| } | ||
| if let Some(true) = query.spot_only { | ||
| url.push_str("&spot_only=true"); | ||
| } | ||
| if let Some(true) = query.exclude_spot { | ||
| url.push_str("&exclude_spot=true"); | ||
| } | ||
| let response: crate::types::ListSecureCloudGpusResponse = self.get(&url).await?; |
There was a problem hiding this comment.
URL query values are not encoded before request construction.
Line 435 and Line 438 interpolate raw values into the URL, so filter values containing reserved characters can corrupt query parsing and change request behavior. Build the request with .query(...) (or encode values) instead of manual concatenation.
🔧 Proposed fix
pub async fn list_secure_cloud_gpus_filtered(
&self,
query: &crate::types::GpuPriceQuery,
) -> Result<Vec<crate::types::GpuOffering>> {
- let mut url = String::from("/secure-cloud/gpu-prices?available_only=true");
- if let Some(ref interconnect) = query.interconnect {
- url.push_str(&format!("&interconnect={}", interconnect));
- }
- if let Some(ref region) = query.region {
- url.push_str(&format!("®ion={}", region));
- }
- if let Some(true) = query.spot_only {
- url.push_str("&spot_only=true");
- }
- if let Some(true) = query.exclude_spot {
- url.push_str("&exclude_spot=true");
- }
- let response: crate::types::ListSecureCloudGpusResponse = self.get(&url).await?;
+ let url = format!("{}/secure-cloud/gpu-prices", self.base_url);
+ let mut params: Vec<(&str, String)> = vec![("available_only", "true".to_string())];
+
+ if let Some(interconnect) = &query.interconnect {
+ params.push(("interconnect", interconnect.clone()));
+ }
+ if let Some(region) = &query.region {
+ params.push(("region", region.clone()));
+ }
+ if let Some(true) = query.spot_only {
+ params.push(("spot_only", "true".to_string()));
+ }
+ if let Some(true) = query.exclude_spot {
+ params.push(("exclude_spot", "true".to_string()));
+ }
+
+ let request = self.http_client.get(&url).query(¶ms);
+ let request = self.apply_auth(request).await?;
+ let response = request.send().await.map_err(ApiError::HttpClient)?;
+ let response: crate::types::ListSecureCloudGpusResponse = self.handle_response(response).await?;
Ok(response.nodes)
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/basilica-sdk/src/client.rs` around lines 433 - 446, The URL is built
by concatenating raw query values (see the local variable url and the fields
query.interconnect, query.region, query.spot_only, query.exclude_spot) which can
introduce unencoded reserved characters; change the code to construct the
request with proper URL-encoding instead of string interpolation—either build a
Url with url::Url::parse_with_params or use the HTTP client's .query(...)
mechanism and pass a serializable struct/map of params, then call self.get(...)
with the encoded URL/Request so the ListSecureCloudGpusResponse call remains
unchanged.
| # --- deploy with flavour (waits until ready) --- | ||
| print("\ndeploy with flavour...") | ||
| deployment = client.deploy( | ||
| name="flavour-hello", | ||
| source=""" | ||
| from http.server import HTTPServer, BaseHTTPRequestHandler | ||
| class H(BaseHTTPRequestHandler): | ||
| def do_GET(self): | ||
| self.send_response(200) | ||
| self.end_headers() | ||
| self.wfile.write(b'hello from flavour deploy') | ||
| HTTPServer(('', 8000), H).serve_forever() | ||
| """, | ||
| port=8000, | ||
| ttl_seconds=120, | ||
| ) |
There was a problem hiding this comment.
deploy with flavour example currently omits flavour parameters.
At Line [30], the section title says “deploy with flavour,” but the call at Line [32] doesn’t pass interconnect/geo/spot (or GPU fields), so it doesn’t actually demonstrate flavour usage.
✏️ Proposed fix
deployment = client.deploy(
name="flavour-hello",
source="""
from http.server import HTTPServer, BaseHTTPRequestHandler
class H(BaseHTTPRequestHandler):
@@
HTTPServer(('', 8000), H).serve_forever()
""",
port=8000,
+ gpu_count=1,
+ gpu_models=["H100"],
+ interconnect="SXM",
+ geo="US",
+ spot=False,
ttl_seconds=120,
)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@examples/35_deploy_with_flavour.py` around lines 30 - 45, The example "deploy
with flavour" calls client.deploy for "flavour-hello" but omits flavour-related
parameters; update the client.deploy call (function client.deploy, deployment
name "flavour-hello") to include flavour fields such as interconnect, geo, and
spot (and gpu fields if applicable) to demonstrate flavour usage—add appropriate
keys like interconnect=<value>, geo=<value>, spot=<bool> (and gpu_type/gpu_count
if supported) so the example actually exercises the flavour parameters.
Summary
GpuPriceQuerywith region filter for listing offerings,GpuRequirementsSpecwith interconnect/geo/spot/infiniband for deploymentscreate_deployment,deploy,deploy_vllm,deploy_sglang,@deploymentdecorator, and async variants--interconnect,--region,--spot,--exclude-spotflags ongpu list,gpu up, anddeploycommands.pyitype stubs, README documentation, examples, and testsTest plan
test_gpu_flavour_preferences.pybasilica.ai/interconnect=SXM5label via kubectlSummary by CodeRabbit
Release Notes
New Features
Documentation