
Feature/week3 #2

Open · srues2 wants to merge 15 commits into main

Conversation

@srues2 srues2 (Collaborator) commented Nov 18, 2024

Release Notes

New Features

  • Introduced A/B testing framework for credit default prediction.
  • Added functionality for creating and serving feature tables in Databricks.
  • Implemented model serving endpoints for real-time predictions.

Improvements

  • Enhanced error handling and logging in data preprocessing and cleaning processes.
  • Updated configuration files to reflect new catalog and schema names.
  • Improved README documentation to clarify recent changes and new features.

Bug Fixes

  • Resolved compatibility issues with pyarrow and mlflow.

Chores

  • Updated project dependencies and versioning in pyproject.toml.

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced functionality for serving feature tables and machine learning models in Databricks.
    • Added A/B testing framework for machine learning models, allowing for performance comparison.
  • Enhancements

    • Improved feature engineering by including "AverageTemperature" in datasets.
    • Updated model registration to reflect a focus on sleep efficiency metrics.
    • Enhanced error handling and performance measurement in model serving scripts.
  • Documentation

    • Comprehensive README added for Week 3, detailing feature and model serving processes.
    • Updated project configuration to include A/B testing parameters.
  • Bug Fixes

    • Simplified installation commands and environment restart processes for clarity.

@srues2 srues2 requested a review from a team as a code owner November 18, 2024 12:54

coderabbitai bot commented Nov 18, 2024

Walkthrough

This pull request introduces several enhancements across multiple scripts in a Databricks environment. Key changes include the addition of a feature engineering model for sleep efficiency, the creation of feature and model serving scripts, and the implementation of A/B testing for machine learning models. The modifications also encompass updates to configuration files to support new features and parameters, improving the overall functionality and clarity of the codebase.

Changes

File Path Change Summary
notebooks/week1&2/05.log_and_register_fe_model.py Updated model registration name from "house_prices_model_fe" to "sleep_efficiencies_model_fe"; added "AverageTemperature" feature to datasets.
notebooks/week3/01.feature_serving.py Introduced a script for creating and serving a feature table using Delta tables; added functionality for predictions and serving endpoints.
notebooks/week3/02.model_serving.py Created a script for serving a machine learning model; includes endpoint creation and load testing functionality.
notebooks/week3/03.model_serving_feature_lookup.py Added functionality for creating an online table and serving a model endpoint; includes real-time updates and API request handling.
notebooks/week3/04.AB_test_model_serving.py Implemented A/B testing framework for models; includes model training, logging, and serving endpoint creation.
notebooks/week3/README.md Updated documentation to provide an overview of Week 3 materials, covering feature serving, model serving, and A/B testing.
project_config.yml Added A/B testing parameters under the ab_test key; existing parameters remain unchanged.
src/sleep_efficiency/config.py Introduced a new ab_test attribute in the ProjectConfig class to hold A/B test parameters.

Possibly related PRs

Suggested reviewers

  • basakeskili

🐰 In the garden, where data flows,
A model for sleep efficiency grows.
With features and tests, we hop with delight,
Serving predictions, both day and night.
A/B testing brings joy to our quest,
In the world of data, we strive for the best! 🌼



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 9

🧹 Outside diff range and nitpick comments (22)
src/sleep_efficiency/config.py (2)

Line range hint 7-16: Add validation for A/B test parameters.

Consider adding specific validation rules for A/B test parameters to ensure required fields are present and values are within expected ranges.

Example implementation:

from typing import Any, Dict, List
from pydantic import BaseModel, Field

class ABTestConfig(BaseModel):
    learning_rate_a: float = Field(gt=0, lt=1)
    learning_rate_b: float = Field(gt=0, lt=1)
    max_depth_a: int = Field(gt=0)
    max_depth_b: int = Field(gt=0)

class ProjectConfig(BaseModel):
    # ... existing fields ...
    ab_test: ABTestConfig

Line range hint 7-24: Consider adding configuration versioning and documentation.

As this configuration is critical for the A/B testing framework and used across multiple notebooks, consider:

  1. Adding a version field to track configuration schema changes
  2. Including a docstring describing the expected structure of the YAML configuration
  3. Adding validation for backward compatibility

Example implementation:

class ProjectConfig(BaseModel):
    """Project configuration for sleep efficiency model.
    
    Expected YAML structure:
    ```yaml
    version: "1.0"
    ab_test:
      learning_rate_a: float  # Learning rate for model A (0-1)
      learning_rate_b: float  # Learning rate for model B (0-1)
      max_depth_a: int       # Max depth for model A (>0)
      max_depth_b: int       # Max depth for model B (>0)
    ```
    """
    version: str = "1.0"  # Configuration schema version
    # ... rest of the fields ...
notebooks/week3/README.md (6)

9-10: Fix grammatical issues in the overview section.

Add a comma after "Last week" and consider rephrasing for better clarity:

-Last week we demonstrated model training and registering for different use cases.
+Last week, we demonstrated model training and registering for different use cases.
🧰 Tools
🪛 LanguageTool

[uncategorized] ~9-~9: Possible missing comma found.
Context: ...red in this lecture. ## Overview Last week we demonstrated model training and regi...

(AI_HYDRA_LEO_MISSING_COMMA)


15-17: Specify language for code blocks.

Add Python language specification to the code block:

-```
+```python
01.feature_serving.py
🧰 Tools
🪛 Markdownlint

15-15: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


27-28: Fix grammar in feature serving explanation.

Correct the subject-verb agreement:

-The subsequent code examples shows how to invoke this endpoint and get responses.
+The subsequent code examples show how to invoke this endpoint and get responses.
🧰 Tools
🪛 LanguageTool

[grammar] ~27-~27: The verb form ‘shows’ does not seem to match the subject ‘examples’.
Context: ...e lookups. The subsequent code examples shows how to invoke this endpoint and get res...

(SUBJECT_VERB_AGREEMENT_PLURAL)


31-32: Fix formatting and grammar in model serving section.

  1. Add Python language specification to the code block
  2. Add missing articles:
-Model serving is a process of creating a model serving endpoint that can be used for inference. Endpoint creation process is similar to feature serving
+Model serving is a process of creating a model serving endpoint that can be used for inference. The endpoint creation process is similar to feature serving

-We also added an example piece of code for simple load test to get average latency.
+We also added an example piece of code for a simple load test to get average latency.

Also applies to: 33-33, 40-40


44-45: Fix formatting and spelling in feature lookup section.

  1. Add Python language specification to the code block
  2. Use consistent list style (asterisks instead of dashes)
  3. Fix spelling errors:
-- We start with creating an online table for existing offline feature table
+* We start with creating an online table for existing offline feature table

-Next is the same as in the previous notebook, we create an endpoint using the model we registred
+Next is the same as in the previous notebook, we create an endpoint using the model we registered

Also applies to: 50-54


1-70: Consider adding additional sections to enhance documentation.

The documentation provides good coverage of the implementation details. Consider adding:

  1. A section on error handling and logging improvements mentioned in PR objectives
  2. Compatibility notes regarding pyarrow and mlflow integration
  3. A troubleshooting guide for common issues

Would you like me to help draft these additional sections?

🧰 Tools
🪛 LanguageTool

[uncategorized] ~9-~9: Possible missing comma found.
Context: ...red in this lecture. ## Overview Last week we demonstrated model training and regi...

(AI_HYDRA_LEO_MISSING_COMMA)


[grammar] ~27-~27: The verb form ‘shows’ does not seem to match the subject ‘examples’.
Context: ...e lookups. The subsequent code examples shows how to invoke this endpoint and get res...

(SUBJECT_VERB_AGREEMENT_PLURAL)


[uncategorized] ~33-~33: You might be missing the article “the” here.
Context: ...ndpoint that can be used for inference. Endpoint creation process is similar to feature ...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~37-~37: You might be missing the article “the” here.
Context: ... the model. It's important to note that entity name we pass is a registered model name...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~40-~40: You might be missing the article “a” here.
Context: ...also added an example piece of code for simple load test to get average latency. ### ...

(AI_EN_LECTOR_MISSING_DETERMINER_A)


[uncategorized] ~47-~47: The preposition ‘to’ seems more likely in this position.
Context: ...rained model and create a feature table for look up. Then we create a model serving...

(AI_HYDRA_LEO_REPLACE_FOR_TO)


[typographical] ~50-~50: If you want to indicate numerical ranges or time ranges, consider using an en dash.
Context: ...he table we created last week on week 2 - 05.log_and_register_fe_model.py noteboo...

(DASH_RULE)


[uncategorized] ~51-~51: You might be missing the article “the” here.
Context: ...ed for our model to look up features at serving endpoint. - Next is the same as in the ...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[typographical] ~52-~52: If you want to indicate numerical ranges or time ranges, consider using an en dash.
Context: ...e registred in the same notebook week 2 - 05.log_and_register_fe_model.p. This is...

(DASH_RULE)


[uncategorized] ~53-~53: You might be missing the article “a” here.
Context: ...lookup and feature func. - When we send request to the model endpoint, this time, we wo...

(AI_EN_LECTOR_MISSING_DETERMINER_A)


[typographical] ~64-~64: If you want to indicate numerical ranges or time ranges, consider using an en dash.
Context: ...e the same approach as we did in week 2 - 03.log_and_register_model.py. - We trai...

(DASH_RULE)


[uncategorized] ~70-~70: This verb may not be in the correct form. Consider using a different form for this context.
Context: ..., similar to our previous examples, and shows how to invoke it.

(AI_EN_LECTOR_REPLACEMENT_VERB_FORM)

🪛 Markdownlint

50-50: Expected: asterisk; Actual: dash
Unordered list style

(MD004, ul-style)


51-51: Expected: asterisk; Actual: dash
Unordered list style

(MD004, ul-style)


52-52: Expected: asterisk; Actual: dash
Unordered list style

(MD004, ul-style)


53-53: Expected: asterisk; Actual: dash
Unordered list style

(MD004, ul-style)


63-63: Expected: asterisk; Actual: dash
Unordered list style

(MD004, ul-style)


64-64: Expected: asterisk; Actual: dash
Unordered list style

(MD004, ul-style)


65-65: Expected: asterisk; Actual: dash
Unordered list style

(MD004, ul-style)


66-66: Expected: asterisk; Actual: dash
Unordered list style

(MD004, ul-style)


67-67: Expected: asterisk; Actual: dash
Unordered list style

(MD004, ul-style)


68-68: Expected: asterisk; Actual: dash
Unordered list style

(MD004, ul-style)


69-69: Expected: asterisk; Actual: dash
Unordered list style

(MD004, ul-style)


70-70: Expected: asterisk; Actual: dash
Unordered list style

(MD004, ul-style)


15-15: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


30-30: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


43-43: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


57-57: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

notebooks/week1&2/05.log_and_register_fe_model.py (3)

179-187: Document feature selection rationale.

The addition of "AverageTemperature" as a feature lacks documentation explaining its significance for sleep efficiency prediction. Additionally, there's commented code about sleep_hours_duration that should be cleaned up.

Consider:

  1. Adding a comment explaining why temperature affects sleep efficiency
  2. Removing the commented code about sleep_hours_duration since it's not being used

229-229: Consider adding version information to the model name.

While the model name change from house prices to sleep efficiencies is appropriate, consider incorporating version information in the model name for better tracking and A/B testing support.

Example:

-    model_uri=f"runs:/{run_id}/lightgbm-pipeline-model-fe", name=f"{catalog_name}.{schema_name}.sleep_efficiencies_model_fe"
+    model_uri=f"runs:/{run_id}/lightgbm-pipeline-model-fe", name=f"{catalog_name}.{schema_name}.sleep_efficiencies_model_fe_v1"

Line range hint 1-230: Consider splitting responsibilities and adding error handling.

The script currently handles multiple concerns (feature engineering, model training, and registration) which could be separated for better maintainability. Additionally, error handling should be added for feature lookups and model training.

Consider:

  1. Splitting into separate notebooks for feature engineering and model training
  2. Adding try-catch blocks around feature lookups and model training
  3. Adding logging for better observability
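
As a rough illustration of points 2 and 3, the training step could be wrapped with standard-library logging and exception handling along these lines (a minimal sketch assuming an sklearn-style LightGBM pipeline, which the registered model name "lightgbm-pipeline-model-fe" suggests; the pipeline and training data here are stand-ins, not the notebook's actual objects):

import logging

from lightgbm import LGBMRegressor
from sklearn.pipeline import Pipeline

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Stand-in pipeline; in the notebook this would be the feature-engineering pipeline.
pipeline = Pipeline([("regressor", LGBMRegressor())])

def train_pipeline(pipeline, X_train, y_train):
    """Fit the pipeline with basic logging; re-raise failures so the run fails visibly."""
    try:
        logger.info("Fitting pipeline on %d rows", len(X_train))
        pipeline.fit(X_train, y_train)
    except Exception:
        logger.exception("Model training failed")
        raise
    logger.info("Model training completed")
    return pipeline

The same pattern could wrap the feature lookup and training-set creation steps.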
notebooks/week3/03.model_serving_feature_lookup.py (5)

94-94: Inconsistent feature names in the comment

The comment mentions excluding "OverallQual", "GrLivArea", and "GarageCars", which seem unrelated to sleep efficiency. These features appear to be from a different dataset, possibly housing data. Please update the comment to reflect the correct features relevant to sleep efficiency.


120-120: Useless expression train_set.dtypes

The expression train_set.dtypes is evaluated but not assigned or used elsewhere in the code. If this was intended for debugging, consider removing it or assigning the result to a variable for further use.

🧰 Tools
🪛 Ruff

120-120: Found useless expression. Either assign it to a variable or remove it.

(B018)


124-124: Redundant evaluation of dataframe_records[0]

The expression dataframe_records[0] is evaluated but not used or assigned. If this was meant for debugging purposes, consider removing it or utilizing it meaningfully within the code.


128-128: Module-level import not at the top of the file

The import statement import pandas as pd should be placed at the top of the file according to PEP 8 guidelines. This improves code readability and maintains consistency.

Apply this diff to move the import statement:

17a18
+ import pandas as pd
128d128
- import pandas as pd
🧰 Tools
🪛 Ruff

128-128: Module level import not at top of file

(E402)


158-158: Useless expression sleep_features.dtypes

The expression sleep_features.dtypes is evaluated but not used or assigned to any variable. If this was intended for debugging, consider removing it or using the result appropriately.

🧰 Tools
🪛 Ruff

158-158: Found useless expression. Either assign it to a variable or remove it.

(B018)

notebooks/week3/02.model_serving.py (3)

133-133: Correct typo in print statement.

There's a typo in the print statement: "Reponse" should be "Response".

Apply this diff to fix the typo:

-print("Reponse text:", response.text)
+print("Response text:", response.text)

93-109: Remove irrelevant example payload in the docstring.

The multiline string contains an example payload unrelated to the current context, which may cause confusion.

Apply this diff to remove the unnecessary code:

-"""
-Each body should be list of json with columns
-
-[{'LotFrontage': 78.0,
-  'LotArea': 9317,
-  'OverallQual': 6,
-  'OverallCond': 5,
-  'YearBuilt': 2006,
-  'Exterior1st': 'VinylSd',
-  'Exterior2nd': 'VinylSd',
-  'MasVnrType': 'None',
-  'Foundation': 'PConc',
-  'Heating': 'GasA',
-  'CentralAir': 'Y',
-  'SaleType': 'WD',
-  'SaleCondition': 'Normal'}]
-"""

42-45: Parameterize the model version instead of hardcoding.

Hardcoding entity_version=3 and the model name with version can lead to maintenance issues. Parameterizing the model version makes the code more adaptable to changes.

Introduce a variable for the model version:

model_version = 3  # Update as necessary or retrieve dynamically

Update the code to use this variable:

             entity_name=f"{catalog_name}.{schema_name}.sleep_efficiency_model_basic",
             scale_to_zero_enabled=True,
             workload_size="Small",
-            entity_version=3,
+            entity_version=model_version,
         traffic_config=TrafficConfig(
-            routes=[Route(served_model_name="sleep_efficiency_model_basic-3", traffic_percentage=100)]
+            routes=[Route(served_model_name=f"sleep_efficiency_model_basic-{model_version}", traffic_percentage=100)]
         ),

Also applies to: 50-50

notebooks/week3/04.AB_test_model_serving.py (2)

269-269: Remove or utilize the unnecessary expression at line 269

The standalone expression predictions doesn't affect the program's execution. If the intention is to display the predictions, consider using a print statement or logging.

Apply this change to display the predictions:

-predictions
+print(predictions)
🧰 Tools
🪛 Ruff

269-269: Found useless expression. Either assign it to a variable or remove it.

(B018)


340-344: Add error handling for the HTTP request to the serving endpoint

Currently, the requests.post call lacks error handling, which may lead to unhandled exceptions in case of network failures or unexpected server responses. Implementing error handling will make the code more robust.

Here's a suggested refactor:

+try:
     response = requests.post(
         f"{model_serving_endpoint}",
         headers={"Authorization": f"Bearer {token}"},
         json={"dataframe_records": dataframe_records[0]},
     )
+    response.raise_for_status()
+except requests.exceptions.HTTPError as http_err:
+    print(f"HTTP error occurred: {http_err}")
+except Exception as err:
+    print(f"An error occurred: {err}")
notebooks/week3/01.feature_serving.py (1)

253-254: Verify the response status codes during load testing

In the load test, the status codes of the responses are retrieved but not checked. It's recommended to verify that all requests were successful and handle any failed requests to ensure accurate performance measurements.

Consider updating the code to check for successful status codes:

for future in as_completed(futures):
    status_code, latency = future.result()
    if status_code == 200:
        latencies.append(latency)
    else:
        print(f"Request failed with status code: {status_code}")
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 2d7c333 and 50c6938.

📒 Files selected for processing (8)
  • notebooks/week1&2/05.log_and_register_fe_model.py (3 hunks)
  • notebooks/week3/01.feature_serving.py (1 hunks)
  • notebooks/week3/02.model_serving.py (1 hunks)
  • notebooks/week3/03.model_serving_feature_lookup.py (1 hunks)
  • notebooks/week3/04.AB_test_model_serving.py (1 hunks)
  • notebooks/week3/README.md (1 hunks)
  • project_config.yml (1 hunks)
  • src/sleep_efficiency/config.py (1 hunks)
🧰 Additional context used
🪛 Ruff
notebooks/week1&2/05.log_and_register_fe_model.py

3-3: SyntaxError: Expected a statement


3-3: SyntaxError: Simple statements must be separated by newlines or semicolons


3-3: SyntaxError: Expected an identifier


3-3: SyntaxError: Expected an identifier


3-3: SyntaxError: Simple statements must be separated by newlines or semicolons

notebooks/week3/01.feature_serving.py

174-174: Undefined name dbutils

(F821)


183-183: Undefined name display

(F821)

notebooks/week3/02.model_serving.py

62-62: Undefined name dbutils

(F821)

notebooks/week3/03.model_serving_feature_lookup.py

89-89: Undefined name dbutils

(F821)


120-120: Found useless expression. Either assign it to a variable or remove it.

(B018)


128-128: Module level import not at top of file

(E402)


158-158: Found useless expression. Either assign it to a variable or remove it.

(B018)

notebooks/week3/04.AB_test_model_serving.py

269-269: Found useless expression. Either assign it to a variable or remove it.

(B018)


301-301: Undefined name dbutils

(F821)


🔇 Additional comments (10)
project_config.yml (3)

10-11: Review A/B test parameter values

The learning rates for both variants are identical (0.02), which defeats the purpose of A/B testing for this parameter. Consider using different values to effectively test the impact of learning rate on model performance.


13-14: Verify max_depth parameter implications

The significant difference in max_depth values (6 vs 10) could lead to notably different model complexities:

  • Variant A (depth=6): More regularized, less prone to overfitting
  • Variant B (depth=10): More complex, potentially better at capturing intricate patterns but higher risk of overfitting

Ensure this difference aligns with your experimental objectives and consider monitoring validation metrics closely.


12-12: 🛠️ Refactor suggestion

Restructure A/B test parameters for consistency

The n_estimators parameter in the ab_test section should follow the same naming pattern as other parameters with variant suffixes (a/b). Additionally, consider removing the duplicate definition as it's already defined in the main parameters section.

Apply this change:

ab_test:
  learning_rate_a: 0.02
  learning_rate_b: 0.02
- n_estimators: 1000
+ n_estimators_a: 1000
+ n_estimators_b: 1000
  max_depth_a: 6
  max_depth_b: 10

Likely invalid or redundant comment.

src/sleep_efficiency/config.py (1)

15-15: LGTM! Verify A/B test parameters in config files.

The new ab_test attribute is well-defined and properly typed.

Let's verify the A/B test parameters are properly defined in the YAML config files:

✅ Verification successful

A/B test parameters are properly configured in YAML

The ab_test configuration in project_config.yml is well-structured with:

  • Variant-specific parameters (A/B) for learning rate and max depth
  • Common parameters shared across variants (n_estimators)
  • Values that align with the base model parameters
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if A/B test parameters are defined in YAML config files
# Expected: Find YAML files containing ab_test configuration

# Search for YAML files containing ab_test configuration
rg -l "ab_test:" -g "*.yml" -g "*.yaml"

Length of output: 58


Script:

#!/bin/bash
# Let's check the content of project_config.yml to verify the A/B test parameters
cat project_config.yml

Length of output: 608

notebooks/week3/README.md (1)

58-59: Fix formatting and verify A/B testing implementation.

  1. Add Python language specification to the code block
  2. Use consistent list style (asterisks instead of dashes)

The A/B testing implementation using a hash function needs verification to ensure fair traffic distribution.

Also applies to: 63-70

✅ Verification successful

A/B testing implementation is well-implemented with fair traffic distribution

The implementation in notebooks/week3/04.AB_test_model_serving.py shows a robust A/B testing setup:

  • Uses MD5 hash of user ID for consistent model assignment
  • Modulo 2 operation ensures 50-50 split between models
  • Wrapper class properly handles input validation and model selection

The original review comment's concerns about traffic distribution are addressed by the implementation.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify A/B testing implementation in the codebase

# Search for hash function implementation in A/B testing
rg -A 10 "hash.*function" "04.AB_test_model_serving.py"

# Look for model wrapper class implementation
ast-grep --pattern 'class $_ {
  $$$
  predict($_, $_) {
    $$$
  }
}'

Length of output: 197


Script:

#!/bin/bash
# Search for the A/B testing related Python files in notebooks directory
fd -e py "AB.*test" notebooks/

# Check content of any A/B testing related notebooks
fd -e ipynb "AB.*test" notebooks/ -x cat {}

# Search for hash function implementation in Python files under notebooks
rg -A 10 "hash.*function" notebooks/

Length of output: 587


Script:

#!/bin/bash
# Check the content of the A/B testing implementation file
cat notebooks/week3/04.AB_test_model_serving.py

# Search for any other files that might reference A/B testing implementation
rg -l "hash|wrapper.*class|AB.*test" notebooks/week3/

Length of output: 11528

notebooks/week3/02.model_serving.py (1)

35-35: Verify dataset size before converting to Pandas DataFrame.

Converting a large Spark DataFrame to a Pandas DataFrame using .toPandas() can cause memory issues if the dataset is substantial.

Ensure that the train_set is small enough to fit into memory. If it's large, consider sampling or processing the data in Spark before conversion.
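
A sketch of Spark-side sampling before the conversion (a minimal illustration; catalog_name, schema_name and the fraction/limit values are placeholders, not project settings):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
catalog_name, schema_name = "my_catalog", "my_schema"  # placeholders

# Sample on the Spark side first, then convert only the small sample to pandas.
train_set = (
    spark.table(f"{catalog_name}.{schema_name}.train_set")
    .sample(withReplacement=False, fraction=0.01, seed=42)  # tune fraction to the table size
    .limit(1000)
    .toPandas()
)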

notebooks/week3/04.AB_test_model_serving.py (1)

301-301: Verify the availability of dbutils in the execution environment

The variable dbutils might be undefined if this script runs outside a Databricks notebook environment. Ensure that dbutils is available or import it appropriately.

If necessary, you can import dbutils using:

from pyspark.dbutils import DBUtils
dbutils = DBUtils(spark)
🧰 Tools
🪛 Ruff

301-301: Undefined name dbutils

(F821)

notebooks/week3/01.feature_serving.py (3)

174-174: Undefined name dbutils outside of notebook context

The variable dbutils is used without being defined or imported. In Databricks notebooks, dbutils is provided implicitly. However, if this code is intended to run outside of a notebook environment, this will result in a NameError.

Ensure that dbutils is appropriately defined or imported if the code will be executed outside of a notebook. If running in a notebook, you can disregard the static analysis warning.

🧰 Tools
🪛 Ruff

174-174: Undefined name dbutils

(F821)


183-183: Undefined name display outside of notebook context

The function display() is used without being imported or defined. In Databricks notebooks, display() is available by default. If this script is intended to run outside of a notebook environment, consider replacing display(id_list) with print(id_list) or importing the necessary module.

Ensure compatibility with the intended execution environment.

🧰 Tools
🪛 Ruff

183-183: Undefined name display

(F821)


90-90: Confirm preprocessing of categorical features for prediction

The line pipeline.predict(df[cat_features + num_features]) uses both categorical and numerical features directly for prediction. Ensure that the pipeline includes preprocessing steps to handle categorical variables appropriately (e.g., encoding). If the model expects preprocessed inputs, verify that the data matches the expected format.

Confirm that the pipeline can handle the provided categorical features without additional preprocessing.
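
For reference, a pipeline that accepts raw categorical columns at predict time usually bakes the encoding into the estimator itself, roughly as below (an illustrative sketch only, not the project's actual pipeline; the column names are placeholders):

from lightgbm import LGBMRegressor
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

cat_features = ["gender", "bedtime_category"]    # placeholder categorical columns
num_features = ["sleep_duration", "awakenings"]  # placeholder numerical columns

preprocessor = ColumnTransformer(
    transformers=[("cat", OneHotEncoder(handle_unknown="ignore"), cat_features)],
    remainder="passthrough",  # numerical columns pass through unchanged
)

pipeline = Pipeline([("preprocessor", preprocessor), ("regressor", LGBMRegressor())])
# After pipeline.fit(X, y), pipeline.predict(df[cat_features + num_features]) can take raw columns.

If the registered model was trained without such a step, the inputs must be encoded before calling predict.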

@@ -1,10 +1,10 @@
# Databricks notebook source
# The 2 cells below is only when you are running from databricks UI, because of 'possible' not working locally in VS
# %pip install mlops_with_databricks-0.0.1-py3-none-any.whl
%pip install ../mlops_with_databricks-0.0.1-py3-none-any.whl

💡 Codebase verification

The relative path is incorrect and should be updated to ./mlops_with_databricks-0.0.1-py3-none-any.whl

The wheel file is actually located in the same directory as the notebook (notebooks/), not in the parent directory. The current relative path ../mlops_with_databricks-0.0.1-py3-none-any.whl is incorrect and would fail to find the package.

🔗 Analysis chain

Consider using an absolute path for package installation.

The relative path ../mlops_with_databricks-0.0.1-py3-none-any.whl might break if the notebook is run from a different working directory.

Let's verify the package location:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Find the wheel file location
fd -e whl

Length of output: 66

🧰 Tools
🪛 Ruff

3-3: SyntaxError: Expected a statement


3-3: SyntaxError: Simple statements must be separated by newlines or semicolons


3-3: SyntaxError: Expected an identifier


3-3: SyntaxError: Expected an identifier


3-3: SyntaxError: Simple statements must be separated by newlines or semicolons


# COMMAND ----------

token = dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiToken().get()

⚠️ Potential issue

Undefined name dbutils

The variable dbutils is not defined in this script. This will result in a NameError when attempting to retrieve the API token. Ensure that dbutils is available in the execution environment or properly imported.

🧰 Tools
🪛 Ruff

89-89: Undefined name dbutils

(F821)

Comment on lines +134 to +137
dataframe_records[0] = [
{k: (v.isoformat() if isinstance(v, pd.Timestamp) else v) for k, v in record.items()}
for record in dataframe_records[0]
]

⚠️ Potential issue

Processing only the first record in dataframe_records

Currently, you're modifying only the first element of dataframe_records, which means only the first record will have the pd.Timestamp objects formatted correctly. If your intent is to process all records before sending them to the endpoint, you should iterate over the entire list.

Consider updating the code as follows to process all records:

Committable suggestion skipped: line range outside the PR's diff.
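
Since the committable suggestion was skipped, here is a rough sketch of the idea (assuming dataframe_records keeps the list-of-single-record-lists structure built earlier in the notebook; the sample data is a placeholder):

import pandas as pd

def convert_timestamps(record: dict) -> dict:
    """Return a copy of the record with pd.Timestamp values serialized to ISO strings."""
    return {k: (v.isoformat() if isinstance(v, pd.Timestamp) else v) for k, v in record.items()}

# Placeholder illustrating the structure produced earlier in the notebook:
dataframe_records = [[{"id": 1, "bedtime": pd.Timestamp("2021-06-01 23:00")}]]

# Convert every batch, not just dataframe_records[0].
dataframe_records = [[convert_timestamps(record) for record in batch] for batch in dataframe_records]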

Comment on lines +88 to +89
sampled_records = train_set[required_columns].sample(n=1000, replace=True).to_dict(orient="records")
dataframe_records = [[record] for record in sampled_records]

🛠️ Refactor suggestion

Refactor to eliminate code duplication when converting timestamps.

The logic for converting pd.Timestamp objects to ISO format appears multiple times. Refactoring this into a helper function improves maintainability and reduces repetition.

Define a helper function:

def convert_timestamps(record):
    return {k: (v.isoformat() if isinstance(v, pd.Timestamp) else v) for k, v in record.items()}

Update the code to use this function:

-sampled_records = train_set[required_columns].sample(n=1000, replace=True).to_dict(orient="records")
-dataframe_records = [[record] for record in sampled_records]
+sampled_records = train_set[required_columns].sample(n=1000, replace=True).to_dict(orient="records")
+dataframe_records = [convert_timestamps(record) for record in sampled_records]

In the send_request function:

-def send_request():
-    random_record = [
-        {k: (v.isoformat() if isinstance(v, pd.Timestamp) else v) for k, v in record.items()}
-        for record in random.choice(dataframe_records)
-    ]
+def send_request():
+    random_record = random.choice(dataframe_records)

Ensure that when sending the request, random_record is properly formatted:

 response = requests.post(
     model_serving_endpoint,
     headers=headers,
-    json={"dataframe_records": random_record},
+    json={"dataframe_records": [random_record]},
 )

Also applies to: 118-121, 153-155


# COMMAND ----------

token = dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiToken().get()

⚠️ Potential issue

Fix undefined name dbutils.

The variable dbutils is not defined in this context, which will lead to a NameError. To access dbutils in a Python script, you need to import it appropriately.

Apply this diff to fix the issue:

+from pyspark.dbutils import DBUtils

 spark = SparkSession.builder.getOrCreate()
+dbutils = DBUtils(spark)

 token = dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiToken().get()

Committable suggestion skipped: line range outside the PR's diff.

🧰 Tools
🪛 Ruff

62-62: Undefined name dbutils

(F821)

Comment on lines +152 to +165
random_record = [
{k: (v.isoformat() if isinstance(v, pd.Timestamp) else v) for k, v in record.items()}
for record in random.choice(dataframe_records)
]
start_time = time.time()
response = requests.post(
model_serving_endpoint,
headers=headers,
json={"dataframe_records": random_record},
)
end_time = time.time()
latency = end_time - start_time
return response.status_code, latency


🛠️ Refactor suggestion

Add error handling in the send_request function.

Currently, if a request fails, the exception isn't handled, which might disrupt the load test.

Add try-except blocks to handle exceptions gracefully:

 def send_request():
     try:
         random_record = random.choice(dataframe_records)
         start_time = time.time()
         response = requests.post(
             model_serving_endpoint,
             headers=headers,
             json={"dataframe_records": [random_record]},
         )
         end_time = time.time()
         latency = end_time - start_time
+        if response.status_code != 200:
+            print(f"Request failed with status code {response.status_code}")
+        return response.status_code, latency
+    except Exception as e:
+        print(f"An error occurred: {e}")
+        return None, None
-
-        return response.status_code, latency

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +215 to +228
def predict(self, context, model_input):
if isinstance(model_input, pd.DataFrame):
sleep_person_id = str(model_input["id"].values[0])
hashed_id = hashlib.md5(sleep_person_id.encode(encoding="UTF-8")).hexdigest()
# convert a hexadecimal (base-16) string into an integer
if int(hashed_id, 16) % 2:
predictions = self.model_a.predict(model_input.drop(["id"], axis=1))
return {"Prediction": predictions[0], "model": "Model A"}
else:
predictions = self.model_b.predict(model_input.drop(["id"], axis=1))
return {"Prediction": predictions[0], "model": "Model B"}
else:
raise ValueError("Input must be a pandas DataFrame.")


⚠️ Potential issue

Ensure proper handling of batch inputs in the predict method

The current implementation of the predict method assumes that model_input contains only a single record, accessing it via values[0]. This approach may lead to errors or incorrect predictions when multiple records are passed. To make the model robust and capable of handling batch predictions, consider iterating over all records.

Here's a suggested refactor:

 def predict(self, context, model_input):
     if isinstance(model_input, pd.DataFrame):
-        sleep_person_id = str(model_input["id"].values[0])
-        hashed_id = hashlib.md5(sleep_person_id.encode(encoding="UTF-8")).hexdigest()
-        if int(hashed_id, 16) % 2:
-            predictions = self.model_a.predict(model_input.drop(["id"], axis=1))
-            return {"Prediction": predictions[0], "model": "Model A"}
-        else:
-            predictions = self.model_b.predict(model_input.drop(["id"], axis=1))
-            return {"Prediction": predictions[0], "model": "Model B"}
+        predictions = []
+        models_used = []
+        for idx, row in model_input.iterrows():
+            sleep_person_id = str(row["id"])
+            hashed_id = hashlib.md5(sleep_person_id.encode("UTF-8")).hexdigest()
+            model_features = row.drop("id").to_frame().T
+            if int(hashed_id, 16) % 2:
+                prediction = self.model_a.predict(model_features)
+                models_used.append("Model A")
+            else:
+                prediction = self.model_b.predict(model_features)
+                models_used.append("Model B")
+            predictions.append(prediction[0])
+        return pd.DataFrame({"Prediction": predictions, "model": models_used})
     else:
         raise ValueError("Input must be a pandas DataFrame.")

FeatureLookup(
table_name=feature_table_name,
lookup_key="Id",
feature_names=["id", "sleep_duration", "awakenings", "sleep_month", "Predicted_SleepEfficiency"],

⚠️ Potential issue

Inconsistent use of 'Id' vs 'id' could lead to errors

There is inconsistency in the use of 'Id' (uppercase I) and 'id' (lowercase i) as key identifiers in your code. This can cause KeyError exceptions or unexpected behavior when accessing DataFrame columns or performing lookups. Ensure consistent use of 'id' (lowercase) throughout the code to match the DataFrame column names.

Apply the following changes to resolve the inconsistency:

At line 132:

-            lookup_key="Id",
+            lookup_key="id",

At line 179:

- id_list = preds_df["Id"]
+ id_list = preds_df["id"]

At line 197:

-        json={"dataframe_records": [{"Id": "2"}]},
+        json={"dataframe_records": [{"id": "2"}]},

At line 215:

-        json={"dataframe_split": {"columns": ["Id"], "data": [["2"]]}},
+        json={"dataframe_split": {"columns": ["id"], "data": [["2"]]}},

At line 237:

-            json={"dataframe_records": [{"Id": random_id}]},
+            json={"dataframe_records": [{"id": random_id}]},

Also applies to: 179-179, 197-197, 215-215, 237-237

Comment on lines +234 to +241
response = requests.post(
serving_endpoint,
headers=headers,
json={"dataframe_records": [{"Id": random_id}]},
)
end_time = time.time()
latency = end_time - start_time # Calculate latency for this request
return response.status_code, latency

🛠️ Refactor suggestion

Add exception handling to API requests in send_request function

The send_request function lacks error handling for potential exceptions during the API request. This could cause the program to crash during the load test if a network error occurs.

Consider adding exception handling to make the code more robust:

def send_request():
    random_id = random.choice(id_list)
    start_time = time.time()
    try:
        response = requests.post(
            serving_endpoint,
            headers=headers,
            json={"dataframe_records": [{"id": random_id}]},
        )
        response.raise_for_status()
        status_code = response.status_code
    except requests.exceptions.RequestException as e:
        status_code = None
        print(f"Request failed: {e}")
    end_time = time.time()
    latency = end_time - start_time  # Calculate latency for this request
    return status_code, latency

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🧹 Outside diff range and nitpick comments (9)
notebooks/week3/03.model_serving_feature_lookup.py (3)

1-2: Consider using a package registry for distribution

Installing from a local wheel file may cause versioning and distribution issues. Consider publishing the package to a private PyPI repository or using Databricks' artifact storage.
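
For illustration only (the volume path and index URL below are placeholders, not project settings), the install cell could point at an uploaded copy of the wheel or a private index instead of a repo-relative path:

# Databricks notebook cell -- pick one option; paths and URLs are placeholders
# Option 1: wheel uploaded to a Unity Catalog volume
# %pip install /Volumes/<catalog>/<schema>/packages/mlops_with_databricks-0.0.1-py3-none-any.whl
# Option 2: package published to a private index
# %pip install --index-url https://<private-pypi>/simple mlops_with_databricks==0.0.1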


69-81: Consider workload size based on performance requirements

The endpoint is configured with a "Small" workload size and scale-to-zero enabled. While this is cost-effective, verify that:

  1. The small instance type can handle your expected request volume
  2. The cold start latency from scale-to-zero is acceptable for your use case

149-149: Remove or utilize unused data loading

The temperature features table is loaded but never used. Either remove this line if it's not needed or add the intended usage of this data.

notebooks/week3/01.feature_serving.py (5)

96-101: Add error handling for table creation

The table creation should handle cases where the table might already exist.

+try:
     fe.create_table(
         name=feature_table_name,
         primary_keys=["id"],
         df=preds_df,
         description="Sleep efficiencies predictions feature table",
     )
+except Exception as e:
+    print(f"Error creating table: {e}")
+    # Consider whether to proceed or raise the exception

151-162: Consider adding monitoring and alerting

While the endpoint configuration looks good, consider adding monitoring and alerting for production readiness. This could include metrics for latency, error rates, and resource utilization.


174-174: Add type hint for dbutils

The dbutils variable is flagged as undefined. Add a type hint to clarify it's a Databricks utility.

+from typing import TYPE_CHECKING
+if TYPE_CHECKING:
+    from databricks.sdk.runtime import dbutils
🧰 Tools
🪛 Ruff

174-174: Undefined name dbutils

(F821)


189-202: Add response content validation

The endpoint test should validate the response content structure and values, not just the status code.

 response = requests.post(
     f"{serving_endpoint}",
     headers={"Authorization": f"Bearer {token}"},
     json={"dataframe_records": [{"Id": "2"}]},
 )
 
 end_time = time.time()
 execution_time = end_time - start_time
 
 print("Response status:", response.status_code)
+if response.status_code == 200:
+    response_data = response.json()
+    # Validate response structure
+    if "predictions" not in response_data:
+        print("Warning: Unexpected response structure")
+    # Validate prediction values
+    predictions = response_data.get("predictions", [])
+    if not predictions or not all(isinstance(p, (int, float)) for p in predictions):
+        print("Warning: Invalid prediction values")
 print("Reponse text:", response.text)
 print("Execution time:", execution_time, "seconds")

221-225: Consider parameterizing load test configuration

Load test parameters like num_requests and max_workers should be configurable to support different test scenarios.

+# Load test configuration
+load_test_config = {
+    "num_requests": 10,
+    "max_workers": 100,
+    "timeout_seconds": 30
+}

 serving_endpoint = f"https://{host}/serving-endpoints/sleep-efficiencies-feature-serving/invocations"
 id_list = preds_df.select("Id").rdd.flatMap(lambda x: x).collect()
 headers = {"Authorization": f"Bearer {token}"}
-num_requests = 10
+num_requests = load_test_config["num_requests"]
notebooks/week3/04.AB_test_model_serving.py (1)

103-103: Consider retrieving git_sha programmatically

Currently, git_sha is hardcoded as "week3". To enhance reproducibility and ensure accurate tracking of model versions, consider retrieving the actual Git commit hash programmatically.

Apply this diff to implement the change:

- git_sha = "week3"
+ import subprocess
+ git_sha = subprocess.check_output(['git', 'rev-parse', 'HEAD']).decode('ascii').strip()
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 50c6938 and 8c0feba.

📒 Files selected for processing (4)
  • notebooks/week1&2/05.log_and_register_fe_model.py (3 hunks)
  • notebooks/week3/01.feature_serving.py (1 hunks)
  • notebooks/week3/03.model_serving_feature_lookup.py (1 hunks)
  • notebooks/week3/04.AB_test_model_serving.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • notebooks/week1&2/05.log_and_register_fe_model.py
🧰 Additional context used
🪛 Ruff
notebooks/week3/01.feature_serving.py

174-174: Undefined name dbutils

(F821)

notebooks/week3/03.model_serving_feature_lookup.py

90-90: Undefined name dbutils

(F821)

notebooks/week3/04.AB_test_model_serving.py

299-299: Undefined name dbutils

(F821)

🔇 Additional comments (4)
notebooks/week3/03.model_serving_feature_lookup.py (2)

90-91: Skip comment about undefined dbutils as it was already reported in past reviews

🧰 Tools
🪛 Ruff

90-90: Undefined name dbutils

(F821)


44-52: Verify primary key uniqueness for temperature features

Using only "Month" as the primary key assumes that temperature features are unique per month. This might lead to data conflicts if there are multiple temperature readings per month.

✅ Verification successful

Let me try a different approach to verify the data structure and primary key usage.


Let me try one more search without file type restriction to find the feature engineering code.


Based on the search results, I can now provide a final response since I found the table creation code that shows the complete schema and constraints.

Primary key on "Month" is correctly configured

The table schema in notebooks/week1&2/05.log_and_register_fe_model.py shows that:

  • The table has Id, Month, and AverageTemperature columns
  • Month is explicitly defined as a PRIMARY KEY constraint
  • Data is sourced from temperatures_netherlands_2021 table with a direct month-to-month mapping

The primary key choice is appropriate as the table stores aggregated temperature data per month, ensuring uniqueness.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for potential duplicate months in the source table
echo "Checking for duplicate months in temperature_features table..."
ast-grep --pattern 'SELECT Month, COUNT(*) as count 
  FROM $catalog.$schema.temperature_features 
  GROUP BY Month 
  HAVING COUNT(*) > 1'

Length of output: 276


Script:

#!/bin/bash
# Search for table creation or feature engineering code
rg -A 10 "temperature_features" --type python

Length of output: 76


Script:

#!/bin/bash
# Search for temperature features table creation and transformation
rg -A 10 "temperature_features"

Length of output: 6625

notebooks/week3/01.feature_serving.py (1)

2-2: Verify the wheel file installation path

The wheel file is being installed from a relative path. This could break if the notebook is moved to a different location.

✅ Verification successful

The relative wheel file path is correct and follows project conventions

Based on the repository structure and README documentation, the relative wheel file path ../mlops_with_databricks-0.0.1-py3-none-any.whl in notebooks/week3/01.feature_serving.py is correct because:

  1. The wheel file is consistently referenced across multiple notebooks in the same way
  2. The README explicitly documents the process of building and uploading the wheel file to Databricks volumes
  3. The notebook structure follows a weekly organization pattern where notebooks are placed in week-specific folders

The relative path ensures the wheel file can be accessed from any week's notebook directory, making it a reliable approach for this project's structure.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if the wheel file exists and get its details
fd -e whl . | while read -r wheel; do
    echo "Found wheel: $wheel"
    unzip -l "$wheel" | grep -i "METADATA\|RECORD"
done

Length of output: 358


Script:

#!/bin/bash
# Let's try to understand the notebook structure and wheel file references
rg -l "mlops_with_databricks.*\.whl" --type py

# Check if there are any setup or installation instructions in README
fd "README" --type f --exec cat {}

Length of output: 7357

notebooks/week3/04.AB_test_model_serving.py (1)

215-228: Ensure proper handling of batch inputs in the predict method
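
As a hedged illustration only (the wrapper class name, the Id column, and the md5-mod-2 split below are assumptions, not code from the notebook), a batch-aware predict can bucket incoming rows by hashed Id and score each bucket with a single vectorized call:

import hashlib

import mlflow.pyfunc
import pandas as pd


class ABTestWrapper(mlflow.pyfunc.PythonModel):
    """Sketch of an A/B wrapper that routes each row to model A or B by hashed Id."""

    def __init__(self, model_a, model_b):
        self.model_a = model_a
        self.model_b = model_b

    def predict(self, context, model_input: pd.DataFrame) -> pd.DataFrame:
        # Deterministic per-row assignment: hash the Id and take it modulo 2
        bucket = model_input["Id"].astype(str).map(
            lambda x: int(hashlib.md5(x.encode()).hexdigest(), 16) % 2
        )
        features = model_input.drop(columns=["Id"])

        # Score each bucket once instead of calling the models row by row
        preds = pd.Series(index=model_input.index, dtype=float)
        if (bucket == 0).any():
            preds[bucket == 0] = self.model_a.predict(features[bucket == 0])
        if (bucket == 1).any():
            preds[bucket == 1] = self.model_b.predict(features[bucket == 1])

        return pd.DataFrame(
            {"Prediction": preds, "model": bucket.map({0: "Model A", 1: "Model B"})}
        )

Scoring per bucket keeps the assignment deterministic for repeated requests with the same Id while avoiding per-row model calls.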

Comment on lines +57 to +61
config = ProjectConfig.from_yaml(config_path="../../project_config.yml")

catalog_name = config.catalog_name
schema_name = config.schema_name


🛠️ Refactor suggestion

Remove duplicate configuration loading

This configuration loading block is identical to the one at lines 38-40. Consider removing this duplicate block to improve maintainability.

-config = ProjectConfig.from_yaml(config_path="../../project_config.yml")
-
-catalog_name = config.catalog_name
-schema_name = config.schema_name
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-config = ProjectConfig.from_yaml(config_path="../../project_config.yml")
-catalog_name = config.catalog_name
-schema_name = config.schema_name

Comment on lines +114 to +117
train_set = spark.table(f"{catalog_name}.{schema_name}.train_set").toPandas()

sampled_records = train_set[required_columns].sample(n=1000, replace=True).to_dict(orient="records")
dataframe_records = [[record] for record in sampled_records]

🛠️ Refactor suggestion

Consider batching for large datasets

Loading the entire training set into memory with .toPandas() could cause OOM issues with large datasets. Consider using:

  1. Sampling at Spark level before conversion to Pandas
  2. Processing in batches for production workloads
-train_set = spark.table(f"{catalog_name}.{schema_name}.train_set").toPandas()
-sampled_records = train_set[required_columns].sample(n=1000, replace=True)
+train_set = spark.table(f"{catalog_name}.{schema_name}.train_set").select(*required_columns)
+sampled_records = train_set.sample(n=1000, withReplacement=True).toPandas()
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-train_set = spark.table(f"{catalog_name}.{schema_name}.train_set").toPandas()
-sampled_records = train_set[required_columns].sample(n=1000, replace=True).to_dict(orient="records")
-dataframe_records = [[record] for record in sampled_records]
+train_set = spark.table(f"{catalog_name}.{schema_name}.train_set").select(*required_columns)
+sampled_records = train_set.sample(n=1000, withReplacement=True).toPandas()
+dataframe_records = [[record] for record in sampled_records.to_dict(orient="records")]

Comment on lines +134 to +138
response = requests.post(
    f"{model_serving_endpoint}",
    headers={"Authorization": f"Bearer {token}"},
    json={"dataframe_records": dataframe_records[0]},
)

⚠️ Potential issue

Add error handling for API requests

The endpoint call lacks error handling. Consider adding try-catch blocks and proper error handling for:

  • Connection errors
  • Timeout errors
  • HTTP error responses
+try:
     response = requests.post(
         f"{model_serving_endpoint}",
         headers={"Authorization": f"Bearer {token}"},
         json={"dataframe_records": dataframe_records[0]},
+        timeout=30  # Add appropriate timeout
     )
+    response.raise_for_status()
+except requests.exceptions.RequestException as e:
+    print(f"Error calling endpoint: {str(e)}")
+    raise
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-response = requests.post(
-    f"{model_serving_endpoint}",
-    headers={"Authorization": f"Bearer {token}"},
-    json={"dataframe_records": dataframe_records[0]},
-)
+try:
+    response = requests.post(
+        f"{model_serving_endpoint}",
+        headers={"Authorization": f"Bearer {token}"},
+        json={"dataframe_records": dataframe_records[0]},
+        timeout=30,  # Add appropriate timeout
+    )
+    response.raise_for_status()
+except requests.exceptions.RequestException as e:
+    print(f"Error calling endpoint: {str(e)}")
+    raise

# COMMAND ----------

# Load the MLflow model for predictions
pipeline = mlflow.sklearn.load_model(f"models:/{catalog_name}.{schema_name}.sleep_efficiency_model_basic/3")

🛠️ Refactor suggestion

Consider parameterizing the model version

The model version is hardcoded to "3". Consider moving this to configuration to make it more maintainable and flexible.

-pipeline = mlflow.sklearn.load_model(f"models:/{catalog_name}.{schema_name}.sleep_efficiency_model_basic/3")
+model_version = config.model_version  # Add this to your config
+pipeline = mlflow.sklearn.load_model(f"models:/{catalog_name}.{schema_name}.sleep_efficiency_model_basic/{model_version}")

Committable suggestion skipped: line range outside the PR's diff.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 6

🧹 Outside diff range and nitpick comments (6)
notebooks/week3/02.model_serving.py (4)

72-86: Move required columns to configuration file.

The list of required columns should be maintained in the configuration for better maintainability and reusability.

+# Add to ProjectConfig class
+required_columns: List[str] = [
+    "age", "sleep_duration", "rem_sleep_percentage",
+    "deep_sleep_percentage", "light_sleep_percentage",
+    "awakenings", "caffeine_consumption",
+    "alcohol_consumption", "exercise_frequency",
+    "gender", "smoking_status", "bedtime",
+    "wakeup_time",
+]

-required_columns = [
-    "age",
-    "sleep_duration",
-    ...
-]
+required_columns = config.required_columns

93-109: Remove or update misleading documentation.

The example request body format shows house pricing features (LotFrontage, LotArea, etc.) which don't match the actual sleep efficiency model features being used.
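
For illustration, a corrected example payload could use the sleep efficiency features instead; the field names below follow the required columns listed just above, and the values are placeholders:

# Placeholder payload for the sleep efficiency model (values are illustrative only)
example_record = {
    "age": 30,
    "sleep_duration": 7.5,
    "rem_sleep_percentage": 22,
    "deep_sleep_percentage": 55,
    "light_sleep_percentage": 23,
    "awakenings": 1,
    "caffeine_consumption": 50,
    "alcohol_consumption": 0,
    "exercise_frequency": 3,
    "gender": "Male",
    "smoking_status": "No",
    "bedtime": "23:00:00",
    "wakeup_time": "06:30:00",
}
payload = {"dataframe_records": [example_record]}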


132-134: Improve response logging security and format.

The current logging might expose sensitive data and lacks proper formatting:

  1. Avoid logging full response text
  2. Add log levels (INFO/ERROR)
  3. Format the output consistently
-print("Response status:", response.status_code)
-print("Reponse text:", response.text)
-print("Execution time:", execution_time, "seconds")
+print(f"Response status: {response.status_code}")
+if response.status_code != 200:
+    print(f"Error: {response.status_code}")
+    print(f"Error message: {response.json().get('message', 'Unknown error')}")
+print(f"Execution time: {execution_time:.2f} seconds")

147-148: Make load test parameters configurable.

The number of requests and concurrent workers should be configurable parameters.

+# Add to ProjectConfig class
+load_test_config: Dict[str, int] = {
+    "num_requests": 1000,
+    "max_workers": 100
+}

-num_requests = 1000
+num_requests = config.load_test_config["num_requests"]
notebooks/week3/04.AB_test_model_serving.py (2)

96-96: Consider adding cross-validation for hyperparameter tuning

The current implementation uses fixed hyperparameters from config. Adding cross-validation would help optimize model performance.

Consider implementing k-fold cross-validation:

from sklearn.model_selection import GridSearchCV

param_grid = {
    'regressor__learning_rate': [parameters_a['learning_rate'], 0.01, 0.1],
    'regressor__max_depth': [parameters_a['max_depth'], 5, 7]
}
cv_pipeline = GridSearchCV(pipeline, param_grid, cv=5)
cv_pipeline.fit(X_train, y_train)

337-348: Improve response handling and logging

The current implementation prints raw response text without proper error handling or structured logging.

Improve response handling:

-response = requests.post(
-    f"{model_serving_endpoint}",
-    headers={"Authorization": f"Bearer {token}"},
-    json={"dataframe_records": dataframe_records[0]},
-)
-
-end_time = time.time()
-execution_time = end_time - start_time
-
-print("Response status:", response.status_code)
-print("Reponse text:", response.text)
-print("Execution time:", execution_time, "seconds")
+try:
+    response = requests.post(
+        f"{model_serving_endpoint}",
+        headers={"Authorization": f"Bearer {token}"},
+        json={"dataframe_records": dataframe_records[0]},
+        timeout=30
+    )
+    response.raise_for_status()
+    
+    end_time = time.time()
+    execution_time = end_time - start_time
+    
+    result = response.json()
+    print(f"Success! Execution time: {execution_time:.2f} seconds")
+    print(f"Predictions: {json.dumps(result, indent=2)}")
+except requests.exceptions.RequestException as e:
+    print(f"Error calling endpoint: {e}")
+    if hasattr(e, 'response'):
+        print(f"Response status: {e.response.status_code}")
+        print(f"Response body: {e.response.text}")
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 8c0feba and 7c2e3ac.

📒 Files selected for processing (4)
  • notebooks/week3/01.feature_serving.py (1 hunks)
  • notebooks/week3/02.model_serving.py (1 hunks)
  • notebooks/week3/03.model_serving_feature_lookup.py (1 hunks)
  • notebooks/week3/04.AB_test_model_serving.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • notebooks/week3/01.feature_serving.py
  • notebooks/week3/03.model_serving_feature_lookup.py
🔇 Additional comments (2)
notebooks/week3/02.model_serving.py (1)

2-2: Verify the integrity of the local wheel file installation.

Installing from a local wheel file could pose security risks. Consider:

  1. Publishing the package to a private PyPI repository
  2. Adding integrity checks (e.g., SHA-256 hash verification; see the sketch after this list)
  3. Documenting the wheel file's source and build process
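
A minimal sketch of the hash check from point 2, assuming the expected SHA-256 is published alongside the built wheel (the expected value below is a placeholder):

import hashlib

EXPECTED_SHA256 = "<published-sha256-of-the-wheel>"  # placeholder, not a real hash
wheel_path = "../mlops_with_databricks-0.0.1-py3-none-any.whl"

# Stream the wheel in chunks so large files do not need to fit in memory
sha256 = hashlib.sha256()
with open(wheel_path, "rb") as f:
    for chunk in iter(lambda: f.read(8192), b""):
        sha256.update(chunk)

if sha256.hexdigest() != EXPECTED_SHA256:
    raise ValueError(f"Wheel hash mismatch: got {sha256.hexdigest()}")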
notebooks/week3/04.AB_test_model_serving.py (1)

215-228: LGTM with existing review comment

The implementation of the A/B testing logic using hashed IDs is sound, but please address the batch prediction handling issue raised in the previous review.

Comment on lines +38 to +46
name="sleep-efficiencies-model-serving",
config=EndpointCoreConfigInput(
served_entities=[
ServedEntityInput(
entity_name=f"{catalog_name}.{schema_name}.sleep_efficiency_model_basic",
scale_to_zero_enabled=True,
workload_size="Small",
entity_version=3,
)

🛠️ Refactor suggestion

Avoid hardcoding endpoint name and version number.

The endpoint name and version number should be configurable parameters:

  1. Move "sleep-efficiencies-model-serving" to configuration
  2. Make the entity version (3) configurable
+# Add to ProjectConfig class
+model_endpoint_name: str = "sleep-efficiencies-model-serving"
+model_version: int = 3

-    name="sleep-efficiencies-model-serving",
+    name=config.model_endpoint_name,
     config=EndpointCoreConfigInput(
         served_entities=[
             ServedEntityInput(
                 entity_name=f"{catalog_name}.{schema_name}.sleep_efficiency_model_basic",
                 scale_to_zero_enabled=True,
                 workload_size="Small",
-                entity_version=3,
+                entity_version=config.model_version,

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +171 to +177
with ThreadPoolExecutor(max_workers=100) as executor:
    futures = [executor.submit(send_request) for _ in range(num_requests)]

    for future in as_completed(futures):
        status_code, latency = future.result()
        latencies.append(latency)


🛠️ Refactor suggestion

Improve load test results collection and error handling.

The current implementation doesn't track failed requests or provide detailed statistics.

+results = {"success": 0, "failed": 0, "latencies": [], "errors": {}}

 with ThreadPoolExecutor(max_workers=100) as executor:
     futures = [executor.submit(send_request) for _ in range(num_requests)]

     for future in as_completed(futures):
-        status_code, latency = future.result()
-        latencies.append(latency)
+        try:
+            status_code, latency = future.result()
+            if status_code == 200:
+                results["success"] += 1
+                results["latencies"].append(latency)
+            else:
+                results["failed"] += 1
+                results["errors"][status_code] = results["errors"].get(status_code, 0) + 1
+        except Exception as e:
+            results["failed"] += 1
+            error_type = type(e).__name__
+            results["errors"][error_type] = results["errors"].get(error_type, 0) + 1

+print(f"\nResults:")
+print(f"Success: {results['success']}/{num_requests} ({results['success']/num_requests*100:.2f}%)")
+print(f"Failed: {results['failed']}/{num_requests} ({results['failed']/num_requests*100:.2f}%)")
+if results["errors"]:
+    print("\nErrors:")
+    for error, count in results["errors"].items():
+        print(f"- {error}: {count}")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-with ThreadPoolExecutor(max_workers=100) as executor:
-    futures = [executor.submit(send_request) for _ in range(num_requests)]
-
-    for future in as_completed(futures):
-        status_code, latency = future.result()
-        latencies.append(latency)
+results = {"success": 0, "failed": 0, "latencies": [], "errors": {}}
+
+with ThreadPoolExecutor(max_workers=100) as executor:
+    futures = [executor.submit(send_request) for _ in range(num_requests)]
+
+    for future in as_completed(futures):
+        try:
+            status_code, latency = future.result()
+            if status_code == 200:
+                results["success"] += 1
+                results["latencies"].append(latency)
+            else:
+                results["failed"] += 1
+                results["errors"][status_code] = results["errors"].get(status_code, 0) + 1
+        except Exception as e:
+            results["failed"] += 1
+            error_type = type(e).__name__
+            results["errors"][error_type] = results["errors"].get(error_type, 0) + 1
+
+print(f"\nResults:")
+print(f"Success: {results['success']}/{num_requests} ({results['success']/num_requests*100:.2f}%)")
+print(f"Failed: {results['failed']}/{num_requests} ({results['failed']/num_requests*100:.2f}%)")
+if results["errors"]:
+    print("\nErrors:")
+    for error, count in results["errors"].items():
+        print(f"- {error}: {count}")

Comment on lines +158 to +185
# Repeat the training and logging steps for Model B using parameters for B
pipeline = Pipeline(steps=[("preprocessor", preprocessor), ("regressor", LGBMRegressor(**parameters_b))])

# Start MLflow run for Model B
with mlflow.start_run(tags={"model_class": "B", "git_sha": git_sha}) as run:
    run_id = run.info.run_id

    pipeline.fit(X_train, y_train)
    y_pred = pipeline.predict(X_test)

    mse = mean_squared_error(y_test, y_pred)
    mae = mean_absolute_error(y_test, y_pred)
    r2 = r2_score(y_test, y_pred)

    mlflow.log_param("model_type", "LightGBM with preprocessing")
    mlflow.log_params(parameters_b)
    mlflow.log_metric("mse", mse)
    mlflow.log_metric("mae", mae)
    mlflow.log_metric("r2_score", r2)
    signature = infer_signature(model_input=X_train, model_output=y_pred)

    dataset = mlflow.data.from_spark(train_set_spark, table_name=f"{catalog_name}.{schema_name}.train_set", version="0")
    mlflow.log_input(dataset, context="training")
    mlflow.sklearn.log_model(sk_model=pipeline, artifact_path="lightgbm-pipeline-model", signature=signature)

model_version = mlflow.register_model(
    model_uri=f"runs:/{run_id}/lightgbm-pipeline-model", name=model_name, tags={"git_sha": f"{git_sha}"}
)

🛠️ Refactor suggestion

Refactor duplicate model training code

The Model B training code is duplicated from Model A. Consider extracting the training logic into a reusable function.

Suggested refactor:

def train_and_log_model(X_train, y_train, parameters, model_class, git_sha):
    pipeline = Pipeline(steps=[
        ("preprocessor", preprocessor),
        ("regressor", LGBMRegressor(**parameters))
    ])
    
    with mlflow.start_run(tags={"model_class": model_class, "git_sha": git_sha}) as run:
        run_id = run.info.run_id
        pipeline.fit(X_train, y_train)
        y_pred = pipeline.predict(X_test)
        
        metrics = {
            "mse": mean_squared_error(y_test, y_pred),
            "mae": mean_absolute_error(y_test, y_pred),
            "r2": r2_score(y_test, y_pred)
        }
        
        mlflow.log_param("model_type", "LightGBM with preprocessing")
        mlflow.log_params(parameters)
        for metric_name, value in metrics.items():
            mlflow.log_metric(metric_name, value)
            
        signature = infer_signature(model_input=X_train, model_output=y_pred)
        dataset = mlflow.data.from_spark(
            train_set_spark,
            table_name=f"{catalog_name}.{schema_name}.train_set",
            version="0"
        )
        mlflow.log_input(dataset, context="training")
        mlflow.sklearn.log_model(
            sk_model=pipeline,
            artifact_path="lightgbm-pipeline-model",
            signature=signature
        )
        
        return run_id, pipeline

# Usage:
run_id_a, model_a = train_and_log_model(X_train, y_train, parameters_a, "A", git_sha)
run_id_b, model_b = train_and_log_model(X_train, y_train, parameters_b, "B", git_sha)

model_name = f"{catalog_name}.{schema_name}.sleep_efficiencies_model_ab"

# Git commit hash for tracking model version
git_sha = "week3"

⚠️ Potential issue

Replace hardcoded git SHA with actual commit hash

Using a hardcoded value "week3" as git SHA defeats the purpose of version tracking. This should be dynamically retrieved from the git repository.

Replace with actual git SHA using:

-git_sha = "week3"
+import subprocess
+git_sha = subprocess.check_output(['git', 'rev-parse', 'HEAD']).decode('ascii').strip()
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
git_sha = "week3"
import subprocess
git_sha = subprocess.check_output(['git', 'rev-parse', 'HEAD']).decode('ascii').strip()

Comment on lines +263 to +267
model = mlflow.pyfunc.load_model(model_uri=f"models:/{model_name}/{model_version.version}")

# Run prediction
predictions = model.predict(X_test.iloc[0:1])


⚠️ Potential issue

Add error handling for model loading and prediction

The current implementation lacks error handling for model loading and prediction operations.

Add error handling:

-model = mlflow.pyfunc.load_model(model_uri=f"models:/{model_name}/{model_version.version}")
+try:
+    model = mlflow.pyfunc.load_model(model_uri=f"models:/{model_name}/{model_version.version}")
+except Exception as e:
+    print(f"Error loading model: {e}")
+    raise
 
-# Run prediction
-predictions = model.predict(X_test.iloc[0:1])
+try:
+    predictions = model.predict(X_test.iloc[0:1])
+except Exception as e:
+    print(f"Error during prediction: {e}")
+    raise
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-model = mlflow.pyfunc.load_model(model_uri=f"models:/{model_name}/{model_version.version}")
-
-# Run prediction
-predictions = model.predict(X_test.iloc[0:1])
+try:
+    model = mlflow.pyfunc.load_model(model_uri=f"models:/{model_name}/{model_version.version}")
+except Exception as e:
+    print(f"Error loading model: {e}")
+    raise
+
+try:
+    predictions = model.predict(X_test.iloc[0:1])
+except Exception as e:
+    print(f"Error during prediction: {e}")
+    raise

Comment on lines +278 to +290
workspace.serving_endpoints.create(
    name="sleep-efficiencies-model-serving-ab-test",
    config=EndpointCoreConfigInput(
        served_entities=[
            ServedEntityInput(
                entity_name=f"{catalog_name}.{schema_name}.sleep_efficiencies_model_pyfunc_ab_test",
                scale_to_zero_enabled=True,
                workload_size="Small",
                entity_version=model_version.version,
            )
        ]
    ),
)

🛠️ Refactor suggestion

Add retry logic and validation for endpoint creation

The endpoint creation lacks retry logic and validation, which could lead to failures in unstable network conditions.

Add retry logic:

+from tenacity import retry, stop_after_attempt, wait_exponential
+
+@retry(stop=stop_after_attempt(3), wait=wait_exponential(multiplier=1, min=4, max=10))
+def create_endpoint(workspace, name, config):
+    try:
+        endpoint = workspace.serving_endpoints.create(name=name, config=config)
+        # Wait for endpoint to be ready
+        while endpoint.state.config_update is not None:
+            time.sleep(10)
+            endpoint = workspace.serving_endpoints.get(name)
+        return endpoint
+    except Exception as e:
+        print(f"Error creating endpoint: {e}")
+        raise

-workspace.serving_endpoints.create(
+create_endpoint(
+    workspace,
     name="sleep-efficiencies-model-serving-ab-test",
     config=EndpointCoreConfigInput(...)
)

Committable suggestion skipped: line range outside the PR's diff.

@basakeskili basakeskili left a comment

Looks nice!
