Commit 628e87e

Feature group sample (#46)
Added a feature group sample subdirectory, README, and sample code. This PR assumes that the feature group testing code from PR #50 is already in place.
1 parent 260ed83 commit 628e87e

4 files changed: +114 -0 lines changed

samples/feature_group/README.md

+62
@@ -0,0 +1,62 @@
# Feature Group Sample

This sample demonstrates how to create a feature group using the AWS Controllers for Kubernetes (ACK) service controller for Amazon SageMaker.

This sample is inspired by the notebook on [Fraud Detection with Amazon SageMaker FeatureStore](https://sagemaker-examples.readthedocs.io/en/latest/sagemaker-featurestore/sagemaker_featurestore_fraud_detection_python_sdk.html).

## Prerequisites

This sample assumes that you have completed the [common prerequisites](https://github.com/aws-controllers-k8s/sagemaker-controller/blob/main/samples/README.md).

### Create an S3 bucket:

Because this example uses the offline store, you need an S3 bucket to hold the feature group data. [Follow these directions](https://docs.aws.amazon.com/AmazonS3/latest/userguide/create-bucket-overview.html) to create the bucket through the S3 console, an AWS SDK, or the AWS CLI.

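If you would rather create the bucket from Python, the following is a minimal sketch using boto3. The bucket name and Region are placeholders, and `build_create_bucket_args` and `create_bucket` are hypothetical helpers that are not part of this sample. The sketch accounts for one S3 detail: `us-east-1` must not be passed as a `LocationConstraint`, while every other Region must be named.

```python
def build_create_bucket_args(bucket_name, region):
    """Build keyword arguments for the S3 CreateBucket call."""
    args = {"Bucket": bucket_name}
    # S3 rejects an explicit LocationConstraint of us-east-1; any other
    # Region has to be named in the bucket configuration.
    if region != "us-east-1":
        args["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return args


def create_bucket(bucket_name, region):
    """Create the bucket. Requires AWS credentials that allow s3:CreateBucket."""
    import boto3  # imported lazily so the helper above has no dependencies

    s3_client = boto3.client("s3", region_name=region)
    return s3_client.create_bucket(**build_create_bucket_args(bucket_name, region))
```

For example, `create_bucket("<YOUR BUCKET>", "us-west-2")` would create the bucket that the `s3URI` field of the specification points to.
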
### Updating the Feature Group Specification:

In the `my-feature-group.yaml` file, replace the placeholder values with those for your account and feature group. The commands in the rest of this sample assume that the feature group is named `my-feature-group`.

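Before applying the specification, it can help to confirm that no placeholder was missed. The following is a minimal sketch that assumes placeholders keep the `<YOUR ...>` form used in the sample file; `find_placeholders` is a hypothetical helper, not part of this sample.

```python
import re

# Placeholders in the sample specification look like <YOUR BUCKET>.
PLACEHOLDER_PATTERN = re.compile(r"<YOUR [^>]*>")


def find_placeholders(spec_text):
    """Return the distinct placeholders still present, in order of appearance."""
    found = []
    for match in PLACEHOLDER_PATTERN.findall(spec_text):
        if match not in found:
            found.append(match)
    return found
```

To use it, read `my-feature-group.yaml` into a string and pass it to `find_placeholders`; an empty list means every placeholder has been replaced.
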
## Creating your Feature Group

### Create a Feature Group:

To submit your prepared feature group specification, apply it to your Kubernetes cluster as follows:

```
$ kubectl apply -f my-feature-group.yaml
featuregroup.sagemaker.services.k8s.aws/my-feature-group created
```

### List Feature Groups:

To list all feature groups created using the ACK controller, use the following command:

```
$ kubectl get featuregroup
```

### Describe a Feature Group:

To see more details about a submitted feature group, such as its status, errors, or parameters, use the following command:

```
$ kubectl describe featuregroup my-feature-group
```

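As an alternative to re-running `kubectl describe` while the feature group is being created, the status can be polled from Python. This is a minimal sketch, assuming the SageMaker `DescribeFeatureGroup` API, whose response includes a `FeatureGroupStatus` field. `wait_until_created` and its parameters are hypothetical and not part of this sample.

```python
import time


def wait_until_created(describe, feature_group_name, poll_seconds=5.0, max_attempts=60):
    """Poll until the feature group reports the Created status.

    `describe` is any callable shaped like the boto3 SageMaker client's
    describe_feature_group method.
    """
    for _ in range(max_attempts):
        status = describe(FeatureGroupName=feature_group_name)["FeatureGroupStatus"]
        if status == "Created":
            return status
        if status != "Creating":
            # For example CreateFailed, Deleting, or DeleteFailed.
            raise RuntimeError(f"Feature group is in unexpected status: {status}")
        time.sleep(poll_seconds)
    raise TimeoutError(f"{feature_group_name} was not created after {max_attempts} attempts")
```

With AWS credentials configured, a call would look like `wait_until_created(boto3.client("sagemaker").describe_feature_group, "my-feature-group")`. Passing the describe function as an argument keeps the polling logic testable without an AWS account.
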
## Ingesting Data into your Feature Group

The controller does **not** support data ingestion.
To ingest data from the `my-sample-data.csv` file into your feature group, run the provided `data_ingestion.py` script:

```
$ python3 data_ingestion.py -i my-sample-data.csv -fg my-feature-group
```

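To check that ingestion worked, a record can be read back from the online store, which the sample specification enables. The following is a minimal sketch using the `get_record` call of the `sagemaker-featurestore-runtime` client; `record_to_dict` and `fetch_record` are hypothetical helpers, not part of this sample.

```python
def record_to_dict(record):
    """Convert a Feature Store record (a list of name/value pairs) to a dict."""
    return {feature["FeatureName"]: feature["ValueAsString"] for feature in record}


def fetch_record(feature_group_name, record_id):
    """Fetch one record from the online store. Requires AWS credentials."""
    import boto3  # imported lazily so record_to_dict has no dependencies

    runtime_client = boto3.client("sagemaker-featurestore-runtime")
    response = runtime_client.get_record(
        FeatureGroupName=feature_group_name,
        RecordIdentifierValueAsString=record_id,
    )
    # Treat a response without a Record key as "no matching record".
    return record_to_dict(response.get("Record", []))
```

For the sample data, `fetch_record("my-feature-group", "1")` should return the `TransactionID` and `EventTime` values of the first row, assuming that row was ingested.
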
## Deleting your Feature Group

To delete the feature group, use the following command:

```
$ kubectl delete featuregroup my-feature-group
featuregroup.sagemaker.services.k8s.aws "my-feature-group" deleted
```
data_ingestion.py

+29
@@ -0,0 +1,29 @@
#!/usr/bin/env python3

import argparse
import boto3
import csv

sagemaker_featurestore_runtime_client = boto3.Session().client(
    service_name="sagemaker-featurestore-runtime")

# Initialize the parser. Both arguments are required.
parser = argparse.ArgumentParser()
parser.add_argument("-i", "--input_file", required=True, help="Path to a csv file containing data for ingestion.")
parser.add_argument("-fg", "--feature_group_name", required=True, help="Name of the feature group to write data to.")

# Read arguments from the command line.
args = parser.parse_args()

# Write each row of the csv file to the feature group as one record.
with open(args.input_file, newline="") as file_handle:
    for row in csv.DictReader(file_handle, skipinitialspace=True):
        record = []
        for featureName, valueAsString in row.items():
            record.append({
                'FeatureName': featureName,
                'ValueAsString': valueAsString
            })
        sagemaker_featurestore_runtime_client.put_record(
            FeatureGroupName=args.feature_group_name,
            Record=record)
my-feature-group.yaml

+19
@@ -0,0 +1,19 @@
apiVersion: sagemaker.services.k8s.aws/v1alpha1
kind: FeatureGroup
metadata:
  name: <YOUR FEATURE GROUP NAME>
spec:
  eventTimeFeatureName: EventTime
  featureDefinitions:
  - featureName: TransactionID
    featureType: Integral
  - featureName: EventTime
    featureType: Fractional
  featureGroupName: <YOUR FEATURE GROUP NAME>
  recordIdentifierFeatureName: TransactionID
  offlineStoreConfig:
    s3StorageConfig:
      s3URI: s3://<YOUR BUCKET>/feature-group-data
  onlineStoreConfig:
    enableOnlineStore: True
  roleARN: <YOUR SAGEMAKER ROLE ARN>
my-sample-data.csv

+4
@@ -0,0 +1,4 @@
TransactionID,EventTime
1,1623434915
2,1623435267
3,1623435284
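
The CSV header has to line up with the feature names in the specification, because the ingestion script sends every column as a feature. The following is a minimal sketch of a pre-ingestion check; `unexpected_columns` and `EXPECTED_FEATURES` are hypothetical names, and the feature list is copied by hand from the `featureDefinitions` in the sample specification.

```python
import csv
import io

# Feature names copied from featureDefinitions in my-feature-group.yaml.
EXPECTED_FEATURES = ["TransactionID", "EventTime"]


def unexpected_columns(csv_text, expected_features=EXPECTED_FEATURES):
    """Return CSV header columns that are not defined features."""
    # skipinitialspace mirrors the setting used by the ingestion script.
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    return [name for name in (reader.fieldnames or []) if name not in expected_features]
```

An empty result means that every column in the file corresponds to a defined feature.
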
