Feature/week3 #2
base: main
Conversation
Walkthrough

This pull request introduces several enhancements across multiple scripts in a Databricks environment. Key changes include the addition of a feature engineering model for sleep efficiency, the creation of feature and model serving scripts, and the implementation of A/B testing for machine learning models. The modifications also encompass updates to configuration files to support new features and parameters, improving the overall functionality and clarity of the codebase.

Changes
Possibly related PRs
Suggested reviewers
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Note: Be mindful of the bot's finite context window. It is strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
Actionable comments posted: 9
🧹 Outside diff range and nitpick comments (22)
src/sleep_efficiency/config.py (2)
Line range hint 7-16: Add validation for A/B test parameters.

Consider adding specific validation rules for A/B test parameters to ensure required fields are present and values are within expected ranges.

Example implementation:

```python
from typing import Any, Dict, List

from pydantic import BaseModel, Field


class ABTestConfig(BaseModel):
    learning_rate_a: float = Field(gt=0, lt=1)
    learning_rate_b: float = Field(gt=0, lt=1)
    max_depth_a: int = Field(gt=0)
    max_depth_b: int = Field(gt=0)


class ProjectConfig(BaseModel):
    # ... existing fields ...
    ab_test: ABTestConfig
```
Line range hint 7-24: Consider adding configuration versioning and documentation.

As this configuration is critical for the A/B testing framework and used across multiple notebooks, consider:
- Adding a version field to track configuration schema changes
- Including a docstring describing the expected structure of the YAML configuration
- Adding validation for backward compatibility
Example implementation:
````python
class ProjectConfig(BaseModel):
    """Project configuration for sleep efficiency model.

    Expected YAML structure:
    ```yaml
    version: "1.0"
    ab_test:
      learning_rate_a: float  # Learning rate for model A (0-1)
      learning_rate_b: float  # Learning rate for model B (0-1)
      max_depth_a: int  # Max depth for model A (>0)
      max_depth_b: int  # Max depth for model B (>0)
    ```
    """

    version: str = "1.0"  # Configuration schema version
    # ... rest of the fields ...
````

notebooks/week3/README.md (6)
9-10: Fix grammatical issues in the overview section.

Add a comma after "Last week" and consider rephrasing for better clarity:

```diff
-Last week we demonstrated model training and registering for different use cases.
+Last week, we demonstrated model training and registering for different use cases.
```

🧰 Tools
🪛 LanguageTool
[uncategorized] ~9-~9: Possible missing comma found.
Context: ...red in this lecture. ## Overview Last week we demonstrated model training and regi...(AI_HYDRA_LEO_MISSING_COMMA)
15-17: Specify language for code blocks.

Add a Python language specification to the code block:

````diff
-```
+```python
 01.feature_serving.py
````

🧰 Tools
🪛 Markdownlint
15-15: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
27-28: Fix grammar in feature serving explanation.

Correct the subject-verb agreement:

```diff
-The subsequent code examples shows how to invoke this endpoint and get responses.
+The subsequent code examples show how to invoke this endpoint and get responses.
```

🧰 Tools
🪛 LanguageTool
[grammar] ~27-~27: The verb form ‘shows’ does not seem to match the subject ‘examples’.
Context: ...e lookups. The subsequent code examples shows how to invoke this endpoint and get res...(SUBJECT_VERB_AGREEMENT_PLURAL)
31-32: Fix formatting and grammar in model serving section.

- Add Python language specification to the code block
- Add missing articles:

```diff
-Model serving is a process of creating a model serving endpoint that can be used for inference. Endpoint creation process is similar to feature serving
+Model serving is a process of creating a model serving endpoint that can be used for inference. The endpoint creation process is similar to feature serving

-We also added an example piece of code for simple load test to get average latency.
+We also added an example piece of code for a simple load test to get average latency.
```

Also applies to: 33-33, 40-40
44-45: Fix formatting and spelling in feature lookup section.

- Add Python language specification to the code block
- Use consistent list style (asterisks instead of dashes)
- Fix spelling errors:

```diff
-- We start with creating an online table for existing offline feature table
+* We start with creating an online table for existing offline feature table

-Next is the same as in the previous notebook, we create an endpoint using the model we registred
+Next is the same as in the previous notebook, we create an endpoint using the model we registered
```

Also applies to: 50-54
1-70: Consider adding additional sections to enhance documentation.

The documentation provides good coverage of the implementation details. Consider adding:
- A section on error handling and logging improvements mentioned in PR objectives
- Compatibility notes regarding pyarrow and mlflow integration
- A troubleshooting guide for common issues
Would you like me to help draft these additional sections?
notebooks/week1&2/05.log_and_register_fe_model.py (3)
179-187: Document feature selection rationale.

The addition of "AverageTemperature" as a feature lacks documentation explaining its significance for sleep efficiency prediction. Additionally, there is commented code about `sleep_hours_duration` that should be cleaned up.

Consider:

- Adding a comment explaining why temperature affects sleep efficiency
- Removing the commented code about `sleep_hours_duration` since it is not being used
229-229: Consider adding version information to the model name.

While the model name change from house prices to sleep efficiencies is appropriate, consider incorporating version information in the model name for better tracking and A/B testing support.

Example:

```diff
-model_uri=f"runs:/{run_id}/lightgbm-pipeline-model-fe", name=f"{catalog_name}.{schema_name}.sleep_efficiencies_model_fe"
+model_uri=f"runs:/{run_id}/lightgbm-pipeline-model-fe", name=f"{catalog_name}.{schema_name}.sleep_efficiencies_model_fe_v1"
```
Line range hint 1-230: Consider splitting responsibilities and adding error handling.

The script currently handles multiple concerns (feature engineering, model training, and registration) that could be separated for better maintainability. Additionally, error handling should be added for feature lookups and model training.
Consider:
- Splitting into separate notebooks for feature engineering and model training
- Adding try-catch blocks around feature lookups and model training (see the sketch after this list)
- Adding logging for better observability
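For illustration, here is a minimal sketch of the suggested error handling and logging around the training step. The names `training_set`, `pipeline`, `num_features`, `cat_features`, and `target` are assumed to be the objects already defined in the notebook and may need adjusting:

```python
# Hedged sketch: wrap feature lookup and training in try/except with logging.
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("sleep_efficiency_fe")

try:
    logger.info("Loading training set with feature lookups")
    training_df = training_set.load_df().toPandas()  # assumed FE training set object
    logger.info("Training pipeline on %d rows", len(training_df))
    pipeline.fit(training_df[num_features + cat_features], training_df[target])
except Exception:
    logger.exception("Feature lookup or model training failed")
    raise
```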
notebooks/week3/03.model_serving_feature_lookup.py (5)
94-94: Inconsistent feature names in the comment

The comment mentions excluding `"OverallQual"`, `"GrLivArea"`, and `"GarageCars"`, which seem unrelated to sleep efficiency. These features appear to be from a different dataset, possibly housing data. Please update the comment to reflect the correct features relevant to sleep efficiency.
120-120: Useless expression `train_set.dtypes`

The expression `train_set.dtypes` is evaluated but not assigned or used elsewhere in the code. If this was intended for debugging, consider removing it or assigning the result to a variable for further use.

🧰 Tools
🪛 Ruff
120-120: Found useless expression. Either assign it to a variable or remove it.
(B018)
124-124: Redundant evaluation of `dataframe_records[0]`

The expression `dataframe_records[0]` is evaluated but not used or assigned. If this was meant for debugging purposes, consider removing it or utilizing it meaningfully within the code.
128-128: Module-level import not at top of file

The import statement `import pandas as pd` should be placed at the top of the file according to PEP 8 guidelines. This improves code readability and maintains consistency.

Apply this diff to move the import statement:

```diff
 17a18
+import pandas as pd
 128d128
-import pandas as pd
```

🧰 Tools
🪛 Ruff
128-128: Module level import not at top of file
(E402)
158-158: Useless expression `sleep_features.dtypes`

The expression `sleep_features.dtypes` is evaluated but not used or assigned to any variable. If this was intended for debugging, consider removing it or using the result appropriately.

🧰 Tools
🪛 Ruff
158-158: Found useless expression. Either assign it to a variable or remove it.
(B018)
notebooks/week3/02.model_serving.py (3)
133-133: Correct typo in print statement.

There's a typo in the print statement: "Reponse" should be "Response".

Apply this diff to fix the typo:

```diff
-print("Reponse text:", response.text)
+print("Response text:", response.text)
```
93-109: Remove irrelevant example payload in the docstring.

The multiline string contains an example payload unrelated to the current context, which may cause confusion.

Apply this diff to remove the unnecessary code:

```diff
-"""
-Each body should be list of json with columns
-
-[{'LotFrontage': 78.0,
-  'LotArea': 9317,
-  'OverallQual': 6,
-  'OverallCond': 5,
-  'YearBuilt': 2006,
-  'Exterior1st': 'VinylSd',
-  'Exterior2nd': 'VinylSd',
-  'MasVnrType': 'None',
-  'Foundation': 'PConc',
-  'Heating': 'GasA',
-  'CentralAir': 'Y',
-  'SaleType': 'WD',
-  'SaleCondition': 'Normal'}]
-"""
```
42-45: Parameterize the model version instead of hardcoding.

Hardcoding `entity_version=3` and the model name with version can lead to maintenance issues. Parameterizing the model version makes the code more adaptable to changes.

Introduce a variable for the model version:

```python
model_version = 3  # Update as necessary or retrieve dynamically
```

Update the code to use this variable:

```diff
 entity_name=f"{catalog_name}.{schema_name}.sleep_efficiency_model_basic",
 scale_to_zero_enabled=True,
 workload_size="Small",
-entity_version=3,
+entity_version=model_version,
 traffic_config=TrafficConfig(
-    routes=[Route(served_model_name="sleep_efficiency_model_basic-3", traffic_percentage=100)]
+    routes=[Route(served_model_name=f"sleep_efficiency_model_basic-{model_version}", traffic_percentage=100)]
 ),
```

Also applies to: 50-50
notebooks/week3/04.AB_test_model_serving.py (2)
269-269: Remove or utilize the unnecessary expression at line 269

The standalone expression `predictions` doesn't affect the program's execution. If the intention is to display the predictions, consider using a print statement or logging.

Apply this change to display the predictions:

```diff
-predictions
+print(predictions)
```

🧰 Tools
🪛 Ruff
269-269: Found useless expression. Either assign it to a variable or remove it.
(B018)
340-344: Add error handling for the HTTP request to the serving endpoint

Currently, the `requests.post` call lacks error handling, which may lead to unhandled exceptions in case of network failures or unexpected server responses. Implementing error handling will make the code more robust.

Here's a suggested refactor:

```diff
+try:
     response = requests.post(
         f"{model_serving_endpoint}",
         headers={"Authorization": f"Bearer {token}"},
         json={"dataframe_records": dataframe_records[0]},
     )
+    response.raise_for_status()
+except requests.exceptions.HTTPError as http_err:
+    print(f"HTTP error occurred: {http_err}")
+except Exception as err:
+    print(f"An error occurred: {err}")
```

notebooks/week3/01.feature_serving.py (1)
253-254: Verify the response status codes during load testing

In the load test, the status codes of the responses are retrieved but not checked. It's recommended to verify that all requests were successful and handle any failed requests to ensure accurate performance measurements.

Consider updating the code to check for successful status codes:

```python
for future in as_completed(futures):
    status_code, latency = future.result()
    if status_code == 200:
        latencies.append(latency)
    else:
        print(f"Request failed with status code: {status_code}")
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (8)
- notebooks/week1&2/05.log_and_register_fe_model.py (3 hunks)
- notebooks/week3/01.feature_serving.py (1 hunks)
- notebooks/week3/02.model_serving.py (1 hunks)
- notebooks/week3/03.model_serving_feature_lookup.py (1 hunks)
- notebooks/week3/04.AB_test_model_serving.py (1 hunks)
- notebooks/week3/README.md (1 hunks)
- project_config.yml (1 hunks)
- src/sleep_efficiency/config.py (1 hunks)
🧰 Additional context used
🪛 Ruff
notebooks/week1&2/05.log_and_register_fe_model.py
3-3: SyntaxError: Expected a statement
3-3: SyntaxError: Simple statements must be separated by newlines or semicolons
3-3: SyntaxError: Expected an identifier
3-3: SyntaxError: Expected an identifier
3-3: SyntaxError: Simple statements must be separated by newlines or semicolons
notebooks/week3/01.feature_serving.py
174-174: Undefined name dbutils
(F821)
183-183: Undefined name display
(F821)
notebooks/week3/02.model_serving.py
62-62: Undefined name dbutils
(F821)
notebooks/week3/03.model_serving_feature_lookup.py
89-89: Undefined name dbutils
(F821)
120-120: Found useless expression. Either assign it to a variable or remove it.
(B018)
128-128: Module level import not at top of file
(E402)
158-158: Found useless expression. Either assign it to a variable or remove it.
(B018)
notebooks/week3/04.AB_test_model_serving.py
269-269: Found useless expression. Either assign it to a variable or remove it.
(B018)
301-301: Undefined name dbutils
(F821)
🪛 LanguageTool
notebooks/week3/README.md
[uncategorized] ~9-~9: Possible missing comma found.
Context: ...red in this lecture. ## Overview Last week we demonstrated model training and regi...
(AI_HYDRA_LEO_MISSING_COMMA)
[grammar] ~27-~27: The verb form ‘shows’ does not seem to match the subject ‘examples’.
Context: ...e lookups. The subsequent code examples shows how to invoke this endpoint and get res...
(SUBJECT_VERB_AGREEMENT_PLURAL)
[uncategorized] ~33-~33: You might be missing the article “the” here.
Context: ...ndpoint that can be used for inference. Endpoint creation process is similar to feature ...
(AI_EN_LECTOR_MISSING_DETERMINER_THE)
[uncategorized] ~37-~37: You might be missing the article “the” here.
Context: ... the model. It's important to note that entity name we pass is a registered model name...
(AI_EN_LECTOR_MISSING_DETERMINER_THE)
[uncategorized] ~40-~40: You might be missing the article “a” here.
Context: ...also added an example piece of code for simple load test to get average latency. ### ...
(AI_EN_LECTOR_MISSING_DETERMINER_A)
[uncategorized] ~47-~47: The preposition ‘to’ seems more likely in this position.
Context: ...rained model and create a feature table for look up. Then we create a model serving...
(AI_HYDRA_LEO_REPLACE_FOR_TO)
[typographical] ~50-~50: If you want to indicate numerical ranges or time ranges, consider using an en dash.
Context: ...he table we created last week on week 2 - 05.log_and_register_fe_model.py noteboo...
(DASH_RULE)
[uncategorized] ~51-~51: You might be missing the article “the” here.
Context: ...ed for our model to look up features at serving endpoint. - Next is the same as in the ...
(AI_EN_LECTOR_MISSING_DETERMINER_THE)
[typographical] ~52-~52: If you want to indicate numerical ranges or time ranges, consider using an en dash.
Context: ...e registred in the same notebook week 2 - 05.log_and_register_fe_model.p. This is...
(DASH_RULE)
[uncategorized] ~53-~53: You might be missing the article “a” here.
Context: ...lookup and feature func. - When we send request to the model endpoint, this time, we wo...
(AI_EN_LECTOR_MISSING_DETERMINER_A)
[typographical] ~64-~64: If you want to indicate numerical ranges or time ranges, consider using an en dash.
Context: ...e the same approach as we did in week 2 - 03.log_and_register_model.py. - We trai...
(DASH_RULE)
[uncategorized] ~70-~70: This verb may not be in the correct form. Consider using a different form for this context.
Context: ..., similar to our previous examples, and shows how to invoke it.
(AI_EN_LECTOR_REPLACEMENT_VERB_FORM)
🪛 Markdownlint
notebooks/week3/README.md
50-50: Expected: asterisk; Actual: dash
Unordered list style
(MD004, ul-style)
51-51: Expected: asterisk; Actual: dash
Unordered list style
(MD004, ul-style)
52-52: Expected: asterisk; Actual: dash
Unordered list style
(MD004, ul-style)
53-53: Expected: asterisk; Actual: dash
Unordered list style
(MD004, ul-style)
63-63: Expected: asterisk; Actual: dash
Unordered list style
(MD004, ul-style)
64-64: Expected: asterisk; Actual: dash
Unordered list style
(MD004, ul-style)
65-65: Expected: asterisk; Actual: dash
Unordered list style
(MD004, ul-style)
66-66: Expected: asterisk; Actual: dash
Unordered list style
(MD004, ul-style)
67-67: Expected: asterisk; Actual: dash
Unordered list style
(MD004, ul-style)
68-68: Expected: asterisk; Actual: dash
Unordered list style
(MD004, ul-style)
69-69: Expected: asterisk; Actual: dash
Unordered list style
(MD004, ul-style)
70-70: Expected: asterisk; Actual: dash
Unordered list style
(MD004, ul-style)
15-15: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
30-30: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
43-43: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
57-57: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🔇 Additional comments (10)
project_config.yml (3)
10-11: Review A/B test parameter values
The learning rates for both variants are identical (0.02), which defeats the purpose of A/B testing for this parameter. Consider using different values to effectively test the impact of learning rate on model performance.
13-14: Verify max_depth parameter implications
The significant difference in max_depth values (6 vs 10) could lead to notably different model complexities:
- Variant A (depth=6): More regularized, less prone to overfitting
- Variant B (depth=10): More complex, potentially better at capturing intricate patterns but higher risk of overfitting
Ensure this difference aligns with your experimental objectives and consider monitoring validation metrics closely.
12-12: 🛠️ Refactor suggestion

Restructure A/B test parameters for consistency

The `n_estimators` parameter in the `ab_test` section should follow the same naming pattern as other parameters with variant suffixes (a/b). Additionally, consider removing the duplicate definition as it's already defined in the main parameters section.
Apply this change:
ab_test:
learning_rate_a: 0.02
learning_rate_b: 0.02
- n_estimators: 1000
+ n_estimators_a: 1000
+ n_estimators_b: 1000
max_depth_a: 6
max_depth_b: 10
Likely invalid or redundant comment.
src/sleep_efficiency/config.py (1)
15-15: LGTM! Verify A/B test parameters in config files.

The new `ab_test` attribute is well-defined and properly typed.
Let's verify the A/B test parameters are properly defined in the YAML config files:
✅ Verification successful
A/B test parameters are properly configured in YAML
The `ab_test` configuration in `project_config.yml` is well-structured with:
- Variant-specific parameters (A/B) for learning rate and max depth
- Common parameters shared across variants (n_estimators)
- Values that align with the base model parameters
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check if A/B test parameters are defined in YAML config files
# Expected: Find YAML files containing ab_test configuration
# Search for YAML files containing ab_test configuration
rg -l "ab_test:" -g "*.yml" -g "*.yaml"
Length of output: 58
Script:
#!/bin/bash
# Let's check the content of project_config.yml to verify the A/B test parameters
cat project_config.yml
Length of output: 608
notebooks/week3/README.md (1)
58-59: Fix formatting and verify A/B testing implementation.
- Add Python language specification to the code block
- Use consistent list style (asterisks instead of dashes)
The A/B testing implementation using a hash function needs verification to ensure fair traffic distribution.
Also applies to: 63-70
✅ Verification successful
A/B testing is implemented with fair traffic distribution

The implementation in `notebooks/week3/04.AB_test_model_serving.py` shows a robust A/B testing setup (a minimal sketch of the routing rule follows the list below):
- Uses MD5 hash of user ID for consistent model assignment
- Modulo 2 operation ensures 50-50 split between models
- Wrapper class properly handles input validation and model selection
The original review comment's concerns about traffic distribution are addressed by the implementation.
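To make the routing rule concrete, here is a minimal, illustrative sketch of the hash-based assignment described above (the function and variable names are ours, not the notebook's):

```python
# Hedged sketch: MD5-hash the user id and take it modulo 2 to pick a variant.
# The same id always hashes to the same value, so assignment is deterministic.
import hashlib


def assign_variant(user_id: str) -> str:
    hashed_id = hashlib.md5(user_id.encode("UTF-8")).hexdigest()
    return "Model A" if int(hashed_id, 16) % 2 else "Model B"


assert assign_variant("42") == assign_variant("42")  # consistent per-user assignment
```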
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify A/B testing implementation in the codebase
# Search for hash function implementation in A/B testing
rg -A 10 "hash.*function" "04.AB_test_model_serving.py"
# Look for model wrapper class implementation
ast-grep --pattern 'class $_ {
$$$
predict($_, $_) {
$$$
}
}'
Length of output: 197
Script:
#!/bin/bash
# Search for the A/B testing related Python files in notebooks directory
fd -e py "AB.*test" notebooks/
# Check content of any A/B testing related notebooks
fd -e ipynb "AB.*test" notebooks/ -x cat {}
# Search for hash function implementation in Python files under notebooks
rg -A 10 "hash.*function" notebooks/
Length of output: 587
Script:
#!/bin/bash
# Check the content of the A/B testing implementation file
cat notebooks/week3/04.AB_test_model_serving.py
# Search for any other files that might reference A/B testing implementation
rg -l "hash|wrapper.*class|AB.*test" notebooks/week3/
Length of output: 11528
notebooks/week3/02.model_serving.py (1)
35-35: Verify dataset size before converting to Pandas DataFrame.

Converting a large Spark DataFrame to a Pandas DataFrame using `.toPandas()` can cause memory issues if the dataset is substantial.

Ensure that the `train_set` is small enough to fit into memory. If it's large, consider sampling or processing the data in Spark before conversion (a sketch follows below).
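As an illustration only, here is a minimal sketch of Spark-side sampling before conversion; the fraction and seed are arbitrary placeholder values, and `spark`, `catalog_name`, and `schema_name` are assumed to exist as in the notebook:

```python
# Hedged sketch: sample in Spark first, then collect only the subset to the driver.
sampled_train_set = (
    spark.table(f"{catalog_name}.{schema_name}.train_set")
    .sample(fraction=0.1, seed=42)  # keep roughly 10% of rows
    .toPandas()
)
```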
notebooks/week3/04.AB_test_model_serving.py (1)
301-301: Verify the availability of `dbutils` in the execution environment

The variable `dbutils` might be undefined if this script runs outside a Databricks notebook environment. Ensure that `dbutils` is available or import it appropriately.

If necessary, you can import `dbutils` using:

```python
from pyspark.dbutils import DBUtils

dbutils = DBUtils(spark)
```
🧰 Tools
🪛 Ruff
301-301: Undefined name dbutils
(F821)
notebooks/week3/01.feature_serving.py (3)
174-174: Undefined name `dbutils` outside of notebook context

The variable `dbutils` is used without being defined or imported. In Databricks notebooks, `dbutils` is provided implicitly. However, if this code is intended to run outside of a notebook environment, this will result in a `NameError`.

Ensure that `dbutils` is appropriately defined or imported if the code will be executed outside of a notebook. If running in a notebook, you can disregard the static analysis warning.
🧰 Tools
🪛 Ruff
174-174: Undefined name dbutils
(F821)
183-183: Undefined name `display` outside of notebook context

The function `display()` is used without being imported or defined. In Databricks notebooks, `display()` is available by default. If this script is intended to run outside of a notebook environment, consider replacing `display(id_list)` with `print(id_list)` or importing the necessary module.
Ensure compatibility with the intended execution environment.
🧰 Tools
🪛 Ruff
183-183: Undefined name display
(F821)
90-90: Confirm preprocessing of categorical features for prediction

The line `pipeline.predict(df[cat_features + num_features])` uses both categorical and numerical features directly for prediction. Ensure that the pipeline includes preprocessing steps to handle categorical variables appropriately (e.g., encoding); a sketch follows below. If the model expects preprocessed inputs, verify that the data matches the expected format.

Confirm that the `pipeline` can handle the provided categorical features without additional preprocessing.
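For reference, here is a hypothetical sketch of what such preprocessing inside the pipeline could look like; the column names below are borrowed from the project's feature lists only as examples, and the estimator choice is an assumption:

```python
# Hedged sketch: encode categorical columns inside the pipeline so that
# pipeline.predict(df[cat_features + num_features]) can accept raw categories.
from lightgbm import LGBMRegressor
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

cat_features = ["gender", "smoking_status"]  # assumed categorical columns
num_features = ["age", "sleep_duration"]     # assumed numerical columns

preprocessor = ColumnTransformer(
    transformers=[("cat", OneHotEncoder(handle_unknown="ignore"), cat_features)],
    remainder="passthrough",  # numerical columns pass through unchanged
)

pipeline = Pipeline(steps=[("preprocessor", preprocessor), ("regressor", LGBMRegressor())])
```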
```diff
@@ -1,10 +1,10 @@
 # Databricks notebook source
 # The 2 cells below is only when you are running from databricks UI, because of 'possible' not working locally in VS
 # %pip install mlops_with_databricks-0.0.1-py3-none-any.whl
 %pip install ../mlops_with_databricks-0.0.1-py3-none-any.whl
```
💡 Codebase verification
The relative path is incorrect and should be updated to `./mlops_with_databricks-0.0.1-py3-none-any.whl`

The wheel file is actually located in the same directory as the notebook (`notebooks/`), not in the parent directory. The current relative path `../mlops_with_databricks-0.0.1-py3-none-any.whl` is incorrect and would fail to find the package.
🔗 Analysis chain
Consider using an absolute path for package installation.
The relative path `../mlops_with_databricks-0.0.1-py3-none-any.whl` might break if the notebook is run from a different working directory.
Let's verify the package location:
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Find the wheel file location
fd -e whl
Length of output: 66
🧰 Tools
🪛 Ruff
3-3: SyntaxError: Expected a statement
3-3: SyntaxError: Simple statements must be separated by newlines or semicolons
3-3: SyntaxError: Expected an identifier
3-3: SyntaxError: Expected an identifier
3-3: SyntaxError: Simple statements must be separated by newlines or semicolons
```python
# COMMAND ----------

token = dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiToken().get()
```
Undefined name `dbutils`

The variable `dbutils` is not defined in this script. This will result in a `NameError` when attempting to retrieve the API token. Ensure that `dbutils` is available in the execution environment or properly imported.
🧰 Tools
🪛 Ruff
89-89: Undefined name dbutils
(F821)
```python
dataframe_records[0] = [
    {k: (v.isoformat() if isinstance(v, pd.Timestamp) else v) for k, v in record.items()}
    for record in dataframe_records[0]
]
```
Processing only the first record in `dataframe_records`

Currently, you're modifying only the first element of `dataframe_records`, which means only the first record will have the `pd.Timestamp` objects formatted correctly. If your intent is to process all records before sending them to the endpoint, you should iterate over the entire list (see the sketch below).

Consider updating the code as follows to process all records:

Committable suggestion skipped: line range outside the PR's diff.
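Since the committable suggestion was skipped, here is one possible version of that change (illustrative only), formatting the `pd.Timestamp` values in every batch of `dataframe_records` rather than just the first one:

```python
# Hedged sketch: apply the timestamp conversion to all batches, not only index 0.
dataframe_records = [
    [
        {k: (v.isoformat() if isinstance(v, pd.Timestamp) else v) for k, v in record.items()}
        for record in batch
    ]
    for batch in dataframe_records
]
```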
```python
sampled_records = train_set[required_columns].sample(n=1000, replace=True).to_dict(orient="records")
dataframe_records = [[record] for record in sampled_records]
```
🛠️ Refactor suggestion
Refactor to eliminate code duplication when converting timestamps.
The logic for converting `pd.Timestamp` objects to ISO format appears multiple times. Refactoring this into a helper function improves maintainability and reduces repetition.

Define a helper function:

```python
def convert_timestamps(record):
    return {k: (v.isoformat() if isinstance(v, pd.Timestamp) else v) for k, v in record.items()}
```
Update the code to use this function:
-sampled_records = train_set[required_columns].sample(n=1000, replace=True).to_dict(orient="records")
-dataframe_records = [[record] for record in sampled_records]
+sampled_records = train_set[required_columns].sample(n=1000, replace=True).to_dict(orient="records")
+dataframe_records = [convert_timestamps(record) for record in sampled_records]
In the `send_request` function:
-def send_request():
- random_record = [
- {k: (v.isoformat() if isinstance(v, pd.Timestamp) else v) for k, v in record.items()}
- for record in random.choice(dataframe_records)
- ]
+def send_request():
+ random_record = random.choice(dataframe_records)
Ensure that when sending the request, `random_record` is properly formatted:
response = requests.post(
model_serving_endpoint,
headers=headers,
- json={"dataframe_records": random_record},
+ json={"dataframe_records": [random_record]},
)
Also applies to: 118-121, 153-155
notebooks/week3/02.model_serving.py (Outdated)

```python
# COMMAND ----------

token = dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiToken().get()
```
Fix undefined name `dbutils`.

The variable `dbutils` is not defined in this context, which will lead to a `NameError`. To access `dbutils` in a Python script, you need to import it appropriately.

Apply this diff to fix the issue:

```diff
+from pyspark.dbutils import DBUtils
 spark = SparkSession.builder.getOrCreate()
+dbutils = DBUtils(spark)
 token = dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiToken().get()
```
Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 Ruff
62-62: Undefined name dbutils
(F821)
```python
random_record = [
    {k: (v.isoformat() if isinstance(v, pd.Timestamp) else v) for k, v in record.items()}
    for record in random.choice(dataframe_records)
]
start_time = time.time()
response = requests.post(
    model_serving_endpoint,
    headers=headers,
    json={"dataframe_records": random_record},
)
end_time = time.time()
latency = end_time - start_time
return response.status_code, latency
```
🛠️ Refactor suggestion
Add error handling in the `send_request` function.
Currently, if a request fails, the exception isn't handled, which might disrupt the load test.
Add try-except blocks to handle exceptions gracefully:
```diff
 def send_request():
     try:
         random_record = random.choice(dataframe_records)
         start_time = time.time()
         response = requests.post(
             model_serving_endpoint,
             headers=headers,
             json={"dataframe_records": [random_record]},
         )
         end_time = time.time()
         latency = end_time - start_time
+        if response.status_code != 200:
+            print(f"Request failed with status code {response.status_code}")
+        return response.status_code, latency
+    except Exception as e:
+        print(f"An error occurred: {e}")
+        return None, None
-
-    return response.status_code, latency
```
Committable suggestion skipped: line range outside the PR's diff.
```python
def predict(self, context, model_input):
    if isinstance(model_input, pd.DataFrame):
        sleep_person_id = str(model_input["id"].values[0])
        hashed_id = hashlib.md5(sleep_person_id.encode(encoding="UTF-8")).hexdigest()
        # convert a hexadecimal (base-16) string into an integer
        if int(hashed_id, 16) % 2:
            predictions = self.model_a.predict(model_input.drop(["id"], axis=1))
            return {"Prediction": predictions[0], "model": "Model A"}
        else:
            predictions = self.model_b.predict(model_input.drop(["id"], axis=1))
            return {"Prediction": predictions[0], "model": "Model B"}
    else:
        raise ValueError("Input must be a pandas DataFrame.")
```
Ensure proper handling of batch inputs in the `predict` method

The current implementation of the `predict` method assumes that `model_input` contains only a single record, accessing it via `values[0]`. This approach may lead to errors or incorrect predictions when multiple records are passed. To make the model robust and capable of handling batch predictions, consider iterating over all records.
Here's a suggested refactor:
```diff
 def predict(self, context, model_input):
     if isinstance(model_input, pd.DataFrame):
-        sleep_person_id = str(model_input["id"].values[0])
-        hashed_id = hashlib.md5(sleep_person_id.encode(encoding="UTF-8")).hexdigest()
-        if int(hashed_id, 16) % 2:
-            predictions = self.model_a.predict(model_input.drop(["id"], axis=1))
-            return {"Prediction": predictions[0], "model": "Model A"}
-        else:
-            predictions = self.model_b.predict(model_input.drop(["id"], axis=1))
-            return {"Prediction": predictions[0], "model": "Model B"}
+        predictions = []
+        models_used = []
+        for idx, row in model_input.iterrows():
+            sleep_person_id = str(row["id"])
+            hashed_id = hashlib.md5(sleep_person_id.encode("UTF-8")).hexdigest()
+            model_features = row.drop("id").to_frame().T
+            if int(hashed_id, 16) % 2:
+                prediction = self.model_a.predict(model_features)
+                models_used.append("Model A")
+            else:
+                prediction = self.model_b.predict(model_features)
+                models_used.append("Model B")
+            predictions.append(prediction[0])
+        return pd.DataFrame({"Prediction": predictions, "model": models_used})
     else:
         raise ValueError("Input must be a pandas DataFrame.")
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
def predict(self, context, model_input):
    if isinstance(model_input, pd.DataFrame):
        predictions = []
        models_used = []
        for idx, row in model_input.iterrows():
            sleep_person_id = str(row["id"])
            hashed_id = hashlib.md5(sleep_person_id.encode("UTF-8")).hexdigest()
            model_features = row.drop("id").to_frame().T
            if int(hashed_id, 16) % 2:
                prediction = self.model_a.predict(model_features)
                models_used.append("Model A")
            else:
                prediction = self.model_b.predict(model_features)
                models_used.append("Model B")
            predictions.append(prediction[0])
        return pd.DataFrame({"Prediction": predictions, "model": models_used})
    else:
        raise ValueError("Input must be a pandas DataFrame.")
```
```python
FeatureLookup(
    table_name=feature_table_name,
    lookup_key="Id",
    feature_names=["id", "sleep_duration", "awakenings", "sleep_month", "Predicted_SleepEfficiency"],
```
Inconsistent use of 'Id' vs 'id' could lead to errors
There is inconsistency in the use of `'Id'` (uppercase `I`) and `'id'` (lowercase `i`) as key identifiers in your code. This can cause `KeyError` exceptions or unexpected behavior when accessing DataFrame columns or performing lookups. Ensure consistent use of `'id'` (lowercase) throughout the code to match the DataFrame column names.
Apply the following changes to resolve the inconsistency:
At line 132:
- lookup_key="Id",
+ lookup_key="id",
At line 179:
- id_list = preds_df["Id"]
+ id_list = preds_df["id"]
At line 197:
- json={"dataframe_records": [{"Id": "2"}]},
+ json={"dataframe_records": [{"id": "2"}]},
At line 215:
- json={"dataframe_split": {"columns": ["Id"], "data": [["2"]]}},
+ json={"dataframe_split": {"columns": ["id"], "data": [["2"]]}},
At line 237:
- json={"dataframe_records": [{"Id": random_id}]},
+ json={"dataframe_records": [{"id": random_id}]},
Also applies to: 179-179, 197-197, 215-215, 237-237
```python
response = requests.post(
    serving_endpoint,
    headers=headers,
    json={"dataframe_records": [{"Id": random_id}]},
)
end_time = time.time()
latency = end_time - start_time  # Calculate latency for this request
return response.status_code, latency
```
🛠️ Refactor suggestion
Add exception handling to API requests in the `send_request` function

The `send_request` function lacks error handling for potential exceptions during the API request. This could cause the program to crash during the load test if a network error occurs.
Consider adding exception handling to make the code more robust:
```python
def send_request():
    random_id = random.choice(id_list)
    start_time = time.time()
    try:
        response = requests.post(
            serving_endpoint,
            headers=headers,
            json={"dataframe_records": [{"id": random_id}]},
        )
        response.raise_for_status()
        status_code = response.status_code
    except requests.exceptions.RequestException as e:
        status_code = None
        print(f"Request failed: {e}")
    end_time = time.time()
    latency = end_time - start_time  # Calculate latency for this request
    return status_code, latency
```
Actionable comments posted: 4
🧹 Outside diff range and nitpick comments (9)
notebooks/week3/03.model_serving_feature_lookup.py (3)
1-2: Consider using a package registry for distribution

Installing from a local wheel file may cause versioning and distribution issues. Consider publishing the package to a private PyPI repository or using Databricks' artifact storage.
69-81: Consider workload size based on performance requirements

The endpoint is configured with a "Small" workload size and scale-to-zero enabled. While this is cost-effective, verify that:
- The small instance type can handle your expected request volume
- The cold start latency from scale-to-zero is acceptable for your use case
149-149: Remove or utilize unused data loading

The temperature features table is loaded but never used. Either remove this line if it's not needed or add the intended usage of this data.
notebooks/week3/01.feature_serving.py (5)
96-101: Add error handling for table creation

The table creation should handle cases where the table might already exist.

```diff
+try:
     fe.create_table(
         name=feature_table_name,
         primary_keys=["id"],
         df=preds_df,
         description="Sleep efficiencies predictions feature table",
     )
+except Exception as e:
+    print(f"Error creating table: {e}")
+    # Consider whether to proceed or raise the exception
```
151-162: Consider adding monitoring and alerting

While the endpoint configuration looks good, consider adding monitoring and alerting for production readiness. This could include metrics for latency, error rates, and resource utilization.
174-174: Add type hint for `dbutils`

The `dbutils` variable is flagged as undefined. Add a type hint to clarify it's a Databricks utility.

```diff
+from typing import TYPE_CHECKING
+if TYPE_CHECKING:
+    from databricks.sdk.runtime import dbutils
```

🧰 Tools
🪛 Ruff
174-174: Undefined name
dbutils
(F821)
189-202: Add response content validation

The endpoint test should validate the response content structure and values, not just the status code.

```diff
 response = requests.post(
     f"{serving_endpoint}",
     headers={"Authorization": f"Bearer {token}"},
     json={"dataframe_records": [{"Id": "2"}]},
 )
 end_time = time.time()
 execution_time = end_time - start_time
 print("Response status:", response.status_code)
+if response.status_code == 200:
+    response_data = response.json()
+    # Validate response structure
+    if "predictions" not in response_data:
+        print("Warning: Unexpected response structure")
+    # Validate prediction values
+    predictions = response_data.get("predictions", [])
+    if not predictions or not all(isinstance(p, (int, float)) for p in predictions):
+        print("Warning: Invalid prediction values")
 print("Reponse text:", response.text)
 print("Execution time:", execution_time, "seconds")
```

221-225: Consider parameterizing load test configuration

Load test parameters like `num_requests` and `max_workers` should be configurable to support different test scenarios.

```diff
+# Load test configuration
+load_test_config = {
+    "num_requests": 10,
+    "max_workers": 100,
+    "timeout_seconds": 30
+}
 serving_endpoint = f"https://{host}/serving-endpoints/sleep-efficiencies-feature-serving/invocations"
 id_list = preds_df.select("Id").rdd.flatMap(lambda x: x).collect()
 headers = {"Authorization": f"Bearer {token}"}
-num_requests = 10
+num_requests = load_test_config["num_requests"]
```

notebooks/week3/04.AB_test_model_serving.py (1)
103-103: Consider retrieving `git_sha` programmatically

Currently, `git_sha` is hardcoded as "week3". To enhance reproducibility and ensure accurate tracking of model versions, consider retrieving the actual Git commit hash programmatically.

Apply this diff to implement the change:

```diff
-    git_sha = "week3"
+    import subprocess
+    git_sha = subprocess.check_output(['git', 'rev-parse', 'HEAD']).decode('ascii').strip()
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (4)
- notebooks/week1&2/05.log_and_register_fe_model.py (3 hunks)
- notebooks/week3/01.feature_serving.py (1 hunks)
- notebooks/week3/03.model_serving_feature_lookup.py (1 hunks)
- notebooks/week3/04.AB_test_model_serving.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- notebooks/week1&2/05.log_and_register_fe_model.py
🧰 Additional context used
🪛 Ruff
notebooks/week3/01.feature_serving.py
174-174: Undefined name dbutils
(F821)
notebooks/week3/03.model_serving_feature_lookup.py
90-90: Undefined name dbutils
(F821)
notebooks/week3/04.AB_test_model_serving.py
299-299: Undefined name dbutils
(F821)
🔇 Additional comments (4)
notebooks/week3/03.model_serving_feature_lookup.py (2)
90-91: Skip comment about undefined `dbutils` as it was already reported in past reviews
🧰 Tools
🪛 Ruff
90-90: Undefined name dbutils
(F821)
44-52: Verify primary key uniqueness for temperature features
Using only "Month" as the primary key assumes that temperature features are unique per month. This might lead to data conflicts if there are multiple temperature readings per month.
✅ Verification successful
Let me try a different approach to verify the data structure and primary key usage.
Let me try one more search without file type restriction to find the feature engineering code.
Based on the search results, I can now provide a final response since I found the table creation code that shows the complete schema and constraints.
Primary key on "Month" is correctly configured
The table schema in notebooks/week1&2/05.log_and_register_fe_model.py
shows that:
- The table has `Id`, `Month`, and `AverageTemperature` columns
- `Month` is explicitly defined as a PRIMARY KEY constraint
- Data is sourced from the `temperatures_netherlands_2021` table with a direct month-to-month mapping

The primary key choice is appropriate as the table stores aggregated temperature data per month, ensuring uniqueness.
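As a purely illustrative follow-up (not executed as part of this review), the one-row-per-month assumption could also be checked directly against the feature table; `spark`, `catalog_name`, and `schema_name` are assumed to come from the project config as in the notebooks:

```python
# Hedged sketch: count rows per Month and surface any duplicates.
from pyspark.sql import functions as F

duplicate_months = (
    spark.table(f"{catalog_name}.{schema_name}.temperature_features")
    .groupBy("Month")
    .count()
    .filter(F.col("count") > 1)
)
duplicate_months.show()  # an empty result confirms one row per Month
```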
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check for potential duplicate months in the source table
echo "Checking for duplicate months in temperature_features table..."
ast-grep --pattern 'SELECT Month, COUNT(*) as count
FROM $catalog.$schema.temperature_features
GROUP BY Month
HAVING COUNT(*) > 1'
Length of output: 276
Script:
#!/bin/bash
# Search for table creation or feature engineering code
rg -A 10 "temperature_features" --type python
Length of output: 76
Script:
#!/bin/bash
# Search for temperature features table creation and transformation
rg -A 10 "temperature_features"
Length of output: 6625
notebooks/week3/01.feature_serving.py (1)
2-2: Verify the wheel file installation path
The wheel file is being installed from a relative path. This could break if the notebook is moved to a different location.
✅ Verification successful
The relative wheel file path is correct and follows project conventions
Based on the repository structure and README documentation, the relative wheel file path `../mlops_with_databricks-0.0.1-py3-none-any.whl` in `notebooks/week3/01.feature_serving.py` is correct because:
- The wheel file is consistently referenced across multiple notebooks in the same way
- The README explicitly documents the process of building and uploading the wheel file to Databricks volumes
- The notebook structure follows a weekly organization pattern where notebooks are placed in week-specific folders
The relative path ensures the wheel file can be accessed from any week's notebook directory, making it a reliable approach for this project's structure.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check if the wheel file exists and get its details
fd -e whl . | while read -r wheel; do
echo "Found wheel: $wheel"
unzip -l "$wheel" | grep -i "METADATA\|RECORD"
done
Length of output: 358
Script:
#!/bin/bash
# Let's try to understand the notebook structure and wheel file references
rg -l "mlops_with_databricks.*\.whl" --type py
# Check if there are any setup or installation instructions in README
fd "README" --type f --exec cat {}
Length of output: 7357
notebooks/week3/04.AB_test_model_serving.py (1)
215-228: Ensure proper handling of batch inputs in the `predict` method
```python
config = ProjectConfig.from_yaml(config_path="../../project_config.yml")

catalog_name = config.catalog_name
schema_name = config.schema_name
```
🛠️ Refactor suggestion
Remove duplicate configuration loading
This configuration loading block is identical to the one at lines 38-40. Consider removing this duplicate block to improve maintainability.
-config = ProjectConfig.from_yaml(config_path="../../project_config.yml")
-
-catalog_name = config.catalog_name
-schema_name = config.schema_name
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
config = ProjectConfig.from_yaml(config_path="../../project_config.yml")

catalog_name = config.catalog_name
schema_name = config.schema_name
```
```python
train_set = spark.table(f"{catalog_name}.{schema_name}.train_set").toPandas()

sampled_records = train_set[required_columns].sample(n=1000, replace=True).to_dict(orient="records")
dataframe_records = [[record] for record in sampled_records]
```
🛠️ Refactor suggestion
Consider batching for large datasets
Loading the entire training set into memory with .toPandas()
could cause OOM issues with large datasets. Consider using:
- Sampling at Spark level before conversion to Pandas
- Processing in batches for production workloads
-train_set = spark.table(f"{catalog_name}.{schema_name}.train_set").toPandas()
-sampled_records = train_set[required_columns].sample(n=1000, replace=True)
+train_set = spark.table(f"{catalog_name}.{schema_name}.train_set").select(*required_columns)
+sampled_records = train_set.sample(n=1000, withReplacement=True).toPandas()
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
train_set = spark.table(f"{catalog_name}.{schema_name}.train_set").select(*required_columns)
sampled_records = train_set.sample(n=1000, withReplacement=True).toPandas()
dataframe_records = [[record] for record in sampled_records.to_dict(orient="records")]
```
```python
response = requests.post(
    f"{model_serving_endpoint}",
    headers={"Authorization": f"Bearer {token}"},
    json={"dataframe_records": dataframe_records[0]},
)
```
Add error handling for API requests
The endpoint call lacks error handling. Consider adding try-catch blocks and proper error handling for:
- Connection errors
- Timeout errors
- HTTP error responses
+try:
response = requests.post(
f"{model_serving_endpoint}",
headers={"Authorization": f"Bearer {token}"},
json={"dataframe_records": dataframe_records[0]},
+ timeout=30 # Add appropriate timeout
)
+ response.raise_for_status()
+except requests.exceptions.RequestException as e:
+ print(f"Error calling endpoint: {str(e)}")
+ raise
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
try:
    response = requests.post(
        f"{model_serving_endpoint}",
        headers={"Authorization": f"Bearer {token}"},
        json={"dataframe_records": dataframe_records[0]},
        timeout=30  # Add appropriate timeout
    )
    response.raise_for_status()
except requests.exceptions.RequestException as e:
    print(f"Error calling endpoint: {str(e)}")
    raise
```
```python
# COMMAND ----------

# Load the MLflow model for predictions
pipeline = mlflow.sklearn.load_model(f"models:/{catalog_name}.{schema_name}.sleep_efficiency_model_basic/3")
```
🛠️ Refactor suggestion
Consider parameterizing the model version
The model version is hardcoded to "3". Consider moving this to configuration to make it more maintainable and flexible.
-pipeline = mlflow.sklearn.load_model(f"models:/{catalog_name}.{schema_name}.sleep_efficiency_model_basic/3")
+model_version = config.model_version # Add this to your config
+pipeline = mlflow.sklearn.load_model(f"models:/{catalog_name}.{schema_name}.sleep_efficiency_model_basic/{model_version}")
Committable suggestion skipped: line range outside the PR's diff.
Actionable comments posted: 6
🧹 Outside diff range and nitpick comments (6)
notebooks/week3/02.model_serving.py (4)
72-86: Move required columns to configuration file.

The list of required columns should be maintained in the configuration for better maintainability and reusability.

```diff
+# Add to ProjectConfig class
+required_columns: List[str] = [
+    "age", "sleep_duration", "rem_sleep_percentage",
+    "deep_sleep_percentage", "light_sleep_percentage",
+    "awakenings", "caffeine_consumption",
+    "alcohol_consumption", "exercise_frequency",
+    "gender", "smoking_status", "bedtime",
+    "wakeup_time",
+]

-required_columns = [
-    "age",
-    "sleep_duration",
-    ...
-]
+required_columns = config.required_columns
```
93-109: Remove or update misleading documentation.

The example request body format shows house pricing features (LotFrontage, LotArea, etc.) which don't match the actual sleep efficiency model features being used.
132-134: Improve response logging security and format.

The current logging might expose sensitive data and lacks proper formatting:

- Avoid logging full response text
- Add log levels (INFO/ERROR)
- Format the output consistently

-print("Response status:", response.status_code)
-print("Reponse text:", response.text)
-print("Execution time:", execution_time, "seconds")
+print(f"Response status: {response.status_code}")
+if response.status_code != 200:
+    print(f"Error: {response.status_code}")
+    print(f"Error message: {response.json().get('message', 'Unknown error')}")
+print(f"Execution time: {execution_time:.2f} seconds")
147-148: Make load test parameters configurable.

The number of requests and concurrent workers should be configurable parameters.

+# Add to ProjectConfig class
+load_test_config: Dict[str, int] = {
+    "num_requests": 1000,
+    "max_workers": 100
+}

-num_requests = 1000
+num_requests = config.load_test_config["num_requests"]

notebooks/week3/04.AB_test_model_serving.py (2)
96-96: Consider adding cross-validation for hyperparameter tuning.

The current implementation uses fixed hyperparameters from config. Adding cross-validation would help optimize model performance.
Consider implementing k-fold cross-validation:
from sklearn.model_selection import GridSearchCV

param_grid = {
    'regressor__learning_rate': [parameters_a['learning_rate'], 0.01, 0.1],
    'regressor__max_depth': [parameters_a['max_depth'], 5, 7]
}
cv_pipeline = GridSearchCV(pipeline, param_grid, cv=5)
cv_pipeline.fit(X_train, y_train)
337-348: Improve response handling and logging.

The current implementation prints raw response text without proper error handling or structured logging.
Improve response handling:
-response = requests.post(
-    f"{model_serving_endpoint}",
-    headers={"Authorization": f"Bearer {token}"},
-    json={"dataframe_records": dataframe_records[0]},
-)
-
-end_time = time.time()
-execution_time = end_time - start_time
-
-print("Response status:", response.status_code)
-print("Reponse text:", response.text)
-print("Execution time:", execution_time, "seconds")
+try:
+    response = requests.post(
+        f"{model_serving_endpoint}",
+        headers={"Authorization": f"Bearer {token}"},
+        json={"dataframe_records": dataframe_records[0]},
+        timeout=30
+    )
+    response.raise_for_status()
+
+    end_time = time.time()
+    execution_time = end_time - start_time
+
+    result = response.json()
+    print(f"Success! Execution time: {execution_time:.2f} seconds")
+    print(f"Predictions: {json.dumps(result, indent=2)}")
+except requests.exceptions.RequestException as e:
+    print(f"Error calling endpoint: {e}")
+    if hasattr(e, 'response'):
+        print(f"Response status: {e.response.status_code}")
+        print(f"Response body: {e.response.text}")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (4)
- notebooks/week3/01.feature_serving.py (1 hunks)
- notebooks/week3/02.model_serving.py (1 hunks)
- notebooks/week3/03.model_serving_feature_lookup.py (1 hunks)
- notebooks/week3/04.AB_test_model_serving.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- notebooks/week3/01.feature_serving.py
- notebooks/week3/03.model_serving_feature_lookup.py
🔇 Additional comments (2)
notebooks/week3/02.model_serving.py (1)
2-2: Verify the integrity of the local wheel file installation.
Installing from a local wheel file could pose security risks. Consider:
- Publishing the package to a private PyPI repository
Adding integrity checks (e.g., SHA-256 hash verification; a sketch follows this list)
- Documenting the wheel file's source and build process
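A minimal sketch of the SHA-256 check mentioned above might look like this; the expected hash and wheel path are placeholders, not values from this repository.

# Hypothetical integrity check before installing a local wheel (hash and path are placeholders)
import hashlib
from pathlib import Path

EXPECTED_SHA256 = "<expected-sha256-of-the-wheel>"  # placeholder
wheel_path = Path("dist/sleep_efficiency-0.0.1-py3-none-any.whl")  # placeholder path

digest = hashlib.sha256(wheel_path.read_bytes()).hexdigest()
if digest != EXPECTED_SHA256:
    raise ValueError(f"Wheel hash mismatch: {digest}")
# Install only after the hash matches, e.g.
# %pip install dist/sleep_efficiency-0.0.1-py3-none-any.whl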
notebooks/week3/04.AB_test_model_serving.py (1)
215-228: LGTM with existing review comment
The implementation of the A/B testing logic using hashed IDs is sound, but please address the batch prediction handling issue raised in the previous review.
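For context, the hashed-ID routing pattern this refers to can be sketched roughly as follows; the column name "id", the hashing scheme, and the model handles are assumptions for illustration, not code from the notebook.

# Rough sketch of deterministic hash-based A/B routing (identifiers and model handles are assumptions)
import hashlib
import pandas as pd

def route_model(record_id: str) -> str:
    """Assign a record to model A or B based on the parity of its hashed ID."""
    digest = hashlib.md5(record_id.encode("utf-8")).hexdigest()
    return "A" if int(digest, 16) % 2 == 0 else "B"

def predict_ab(record: dict, model_a, model_b):
    # model_a / model_b stand in for the two registered pipelines
    chosen = route_model(str(record["id"]))
    model = model_a if chosen == "A" else model_b
    features = pd.DataFrame([{k: v for k, v in record.items() if k != "id"}])
    return chosen, model.predict(features)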
name="sleep-efficiencies-model-serving", | ||
config=EndpointCoreConfigInput( | ||
served_entities=[ | ||
ServedEntityInput( | ||
entity_name=f"{catalog_name}.{schema_name}.sleep_efficiency_model_basic", | ||
scale_to_zero_enabled=True, | ||
workload_size="Small", | ||
entity_version=3, | ||
) |
🛠️ Refactor suggestion
Avoid hardcoding endpoint name and version number.
The endpoint name and version number should be configurable parameters:
- Move "sleep-efficiencies-model-serving" to configuration
- Make the entity version (3) configurable
+# Add to ProjectConfig class
+model_endpoint_name: str = "sleep-efficiencies-model-serving"
+model_version: int = 3

-    name="sleep-efficiencies-model-serving",
+    name=config.model_endpoint_name,
     config=EndpointCoreConfigInput(
         served_entities=[
             ServedEntityInput(
                 entity_name=f"{catalog_name}.{schema_name}.sleep_efficiency_model_basic",
                 scale_to_zero_enabled=True,
                 workload_size="Small",
-                entity_version=3,
+                entity_version=config.model_version,
Committable suggestion skipped: line range outside the PR's diff.
with ThreadPoolExecutor(max_workers=100) as executor:
    futures = [executor.submit(send_request) for _ in range(num_requests)]

    for future in as_completed(futures):
        status_code, latency = future.result()
        latencies.append(latency)
🛠️ Refactor suggestion
Improve load test results collection and error handling.
The current implementation doesn't track failed requests or provide detailed statistics.
+results = {"success": 0, "failed": 0, "latencies": [], "errors": {}}
 with ThreadPoolExecutor(max_workers=100) as executor:
     futures = [executor.submit(send_request) for _ in range(num_requests)]
     for future in as_completed(futures):
-        status_code, latency = future.result()
-        latencies.append(latency)
+        try:
+            status_code, latency = future.result()
+            if status_code == 200:
+                results["success"] += 1
+                results["latencies"].append(latency)
+            else:
+                results["failed"] += 1
+                results["errors"][status_code] = results["errors"].get(status_code, 0) + 1
+        except Exception as e:
+            results["failed"] += 1
+            error_type = type(e).__name__
+            results["errors"][error_type] = results["errors"].get(error_type, 0) + 1
+
+print(f"\nResults:")
+print(f"Success: {results['success']}/{num_requests} ({results['success']/num_requests*100:.2f}%)")
+print(f"Failed: {results['failed']}/{num_requests} ({results['failed']/num_requests*100:.2f}%)")
+if results["errors"]:
+    print("\nErrors:")
+    for error, count in results["errors"].items():
+        print(f"- {error}: {count}")
# Repeat the training and logging steps for Model B using parameters for B
pipeline = Pipeline(steps=[("preprocessor", preprocessor), ("regressor", LGBMRegressor(**parameters_b))])

# Start MLflow run for Model B
with mlflow.start_run(tags={"model_class": "B", "git_sha": git_sha}) as run:
    run_id = run.info.run_id

    pipeline.fit(X_train, y_train)
    y_pred = pipeline.predict(X_test)

    mse = mean_squared_error(y_test, y_pred)
    mae = mean_absolute_error(y_test, y_pred)
    r2 = r2_score(y_test, y_pred)

    mlflow.log_param("model_type", "LightGBM with preprocessing")
    mlflow.log_params(parameters_b)
    mlflow.log_metric("mse", mse)
    mlflow.log_metric("mae", mae)
    mlflow.log_metric("r2_score", r2)
    signature = infer_signature(model_input=X_train, model_output=y_pred)

    dataset = mlflow.data.from_spark(train_set_spark, table_name=f"{catalog_name}.{schema_name}.train_set", version="0")
    mlflow.log_input(dataset, context="training")
    mlflow.sklearn.log_model(sk_model=pipeline, artifact_path="lightgbm-pipeline-model", signature=signature)

model_version = mlflow.register_model(
    model_uri=f"runs:/{run_id}/lightgbm-pipeline-model", name=model_name, tags={"git_sha": f"{git_sha}"}
)
🛠️ Refactor suggestion
Refactor duplicate model training code
The Model B training code is duplicated from Model A. Consider extracting the training logic into a reusable function.
Suggested refactor:
def train_and_log_model(X_train, y_train, parameters, model_class, git_sha):
    pipeline = Pipeline(steps=[
        ("preprocessor", preprocessor),
        ("regressor", LGBMRegressor(**parameters))
    ])

    with mlflow.start_run(tags={"model_class": model_class, "git_sha": git_sha}) as run:
        run_id = run.info.run_id
        pipeline.fit(X_train, y_train)
        y_pred = pipeline.predict(X_test)

        metrics = {
            "mse": mean_squared_error(y_test, y_pred),
            "mae": mean_absolute_error(y_test, y_pred),
            "r2": r2_score(y_test, y_pred)
        }

        mlflow.log_param("model_type", "LightGBM with preprocessing")
        mlflow.log_params(parameters)
        for metric_name, value in metrics.items():
            mlflow.log_metric(metric_name, value)

        signature = infer_signature(model_input=X_train, model_output=y_pred)
        dataset = mlflow.data.from_spark(
            train_set_spark,
            table_name=f"{catalog_name}.{schema_name}.train_set",
            version="0"
        )
        mlflow.log_input(dataset, context="training")
        mlflow.sklearn.log_model(
            sk_model=pipeline,
            artifact_path="lightgbm-pipeline-model",
            signature=signature
        )

    return run_id, pipeline

# Usage:
run_id_a, model_a = train_and_log_model(X_train, y_train, parameters_a, "A", git_sha)
run_id_b, model_b = train_and_log_model(X_train, y_train, parameters_b, "B", git_sha)
model_name = f"{catalog_name}.{schema_name}.sleep_efficiencies_model_ab" | ||
|
||
# Git commit hash for tracking model version | ||
git_sha = "week3" |
Replace hardcoded git SHA with actual commit hash
Using a hardcoded value "week3" as git SHA defeats the purpose of version tracking. This should be dynamically retrieved from the git repository.
Replace with actual git SHA using:
-git_sha = "week3"
+import subprocess
+git_sha = subprocess.check_output(['git', 'rev-parse', 'HEAD']).decode('ascii').strip()
model = mlflow.pyfunc.load_model(model_uri=f"models:/{model_name}/{model_version.version}")

# Run prediction
predictions = model.predict(X_test.iloc[0:1])
Add error handling for model loading and prediction
The current implementation lacks error handling for model loading and prediction operations.
Add error handling:
-model = mlflow.pyfunc.load_model(model_uri=f"models:/{model_name}/{model_version.version}")
+try:
+    model = mlflow.pyfunc.load_model(model_uri=f"models:/{model_name}/{model_version.version}")
+except Exception as e:
+    print(f"Error loading model: {e}")
+    raise

-# Run prediction
-predictions = model.predict(X_test.iloc[0:1])
+try:
+    predictions = model.predict(X_test.iloc[0:1])
+except Exception as e:
+    print(f"Error during prediction: {e}")
+    raise
workspace.serving_endpoints.create(
    name="sleep-efficiencies-model-serving-ab-test",
    config=EndpointCoreConfigInput(
        served_entities=[
            ServedEntityInput(
                entity_name=f"{catalog_name}.{schema_name}.sleep_efficiencies_model_pyfunc_ab_test",
                scale_to_zero_enabled=True,
                workload_size="Small",
                entity_version=model_version.version,
            )
        ]
    ),
)
🛠️ Refactor suggestion
Add retry logic and validation for endpoint creation
The endpoint creation lacks retry logic and validation, which could lead to failures in unstable network conditions.
Add retry logic:
+from tenacity import retry, stop_after_attempt, wait_exponential
+
+@retry(stop=stop_after_attempt(3), wait=wait_exponential(multiplier=1, min=4, max=10))
+def create_endpoint(workspace, name, config):
+    try:
+        endpoint = workspace.serving_endpoints.create(name=name, config=config)
+        # Wait for endpoint to be ready
+        while endpoint.state.config_update is not None:
+            time.sleep(10)
+            endpoint = workspace.serving_endpoints.get(name)
+        return endpoint
+    except Exception as e:
+        print(f"Error creating endpoint: {e}")
+        raise

-workspace.serving_endpoints.create(
+create_endpoint(
+    workspace,
     name="sleep-efficiencies-model-serving-ab-test",
     config=EndpointCoreConfigInput(...)
 )
Committable suggestion skipped: line range outside the PR's diff.
Looks nice!
Release Notes
New Features
Improvements
Bug Fixes
Resolved compatibility issues with pyarrow and mlflow.
Chores
Updated project dependencies and versioning in pyproject.toml.