-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(lmeval): Add support for S3 offline assets #399
feat(lmeval): Add support for S3 offline assets #399
Conversation
Skipping CI for Draft Pull Request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/LGTM
PR image build and manifest generation completed successfully! 📦 PR image: 📦 LMES driver image: 📦 LMES job image: 🗂️ CI manifests
|
@ruivieira one question about the |
@yhwang I'll do the PR for lm-evaluation-harness shortly.
Good point, it's is possible IMO, I think the above PR (apart from the obvious download methods) would have to be changed so that:
However, if you agree we could have this method now (decoupled from the driver) and investigate moving it to the driver in a future iteration (this would be a non-breaking change, since the download mechanism would be purely internal, the CR API would be the same) |
@ruivieira totally agree! I only wanted to know the possibility when I asked the question. |
If the SSL Verify field was omitted, the controller would crash trying to convert it to a boolean.
/retest |
/test images |
@ruivieira: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
…upport # Conflicts: # cmd/lmes_driver/main.go # controllers/lmes/driver/driver.go
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: RobGeada, yhwang The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
…ices/konflux/component-updates/component-update-ta-lmes-driver-v2-18 chore(deps): update ta-lmes-driver-v2-18 to 0caea4f
This PR adds support for using offline assets (i.e. models, datasets) in S3-compatible storage from LMEval.
An example of the new CRD is:
We assume the following in this example:
s3-secret
exists in the same namespace and contains the keysAWS_ACCESS_KEY_ID
,AWS_SECRET_ACCESS_KEY
,AWS_S3_BUCKET
,AWS_S3_ENDPOINT
andAWS_DEFAULT_REGION
.spec.offline.storage.s3.endpoint
can be called anything, as long as its values is the value for the endpoint.AWS_S3_BUCKET
exists (as an example a bucket calledmodels
)s3://models/myassets/model1
ands3://models/myassets/dataset2
) they can be specified inpath
(e.g.myexperiment
)This will allow the LMEval Job to download all assets from
s3://models/myassets
locally and use them for offline evaluation.This PR has a corresponding PR on https://github.com/opendatahub-io/lm-evaluation-harness with the script which downloads the assets given the information on the CR.
The path from LMEval's POV will be relative to the local storage
/opt/app-root/src/hf_home
. i.e.s3://models/myassets/model1
will be available at/opt/app-root/src/hf_home/model1
s3://models/myassets/another/model2
will be available at/opt/app-root/src/hf_home/another/model2