Comparing changes

base repository: project-codeflare/codeflare-operator
base: v0.0.4
...
head repository: project-codeflare/codeflare-operator
compare: v0.0.5
Showing 93 changed files with 5,751 additions and 173 deletions.
  1. +157 −0 .github/workflows/e2e_tests.yaml
  2. +2 −2 .github/workflows/precommit.yml
  3. +101 −23 .github/workflows/tag-and-build.yml
  4. +2 −2 .github/workflows/unit_tests.yml
  5. +0 −1 .pre-commit-config.yaml
  6. +9 −2 CONTRIBUTING.md
  7. +162 −16 Makefile
  8. +67 −4 README.md
  9. +2 −17 api/{v1alpha1/groupversion_info.go → codeflare/v1alpha1/doc.go}
  10. +1 −4 api/{ → codeflare}/v1alpha1/instascale_types.go
  11. +16 −8 api/{ → codeflare}/v1alpha1/mcad_types.go
  12. +51 −0 api/codeflare/v1alpha1/register.go
  13. +1 −1 api/{ → codeflare}/v1alpha1/zz_generated.deepcopy.go
  14. +219 −0 client/applyconfiguration/codeflare/v1alpha1/instascale.go
  15. +79 −0 client/applyconfiguration/codeflare/v1alpha1/instascalespec.go
  16. +39 −0 client/applyconfiguration/codeflare/v1alpha1/instascalestatus.go
  17. +219 −0 client/applyconfiguration/codeflare/v1alpha1/mcad.go
  18. +106 −0 client/applyconfiguration/codeflare/v1alpha1/mcadspec.go
  19. +39 −0 client/applyconfiguration/codeflare/v1alpha1/mcadstatus.go
  20. +62 −0 client/applyconfiguration/internal/internal.go
  21. +47 −0 client/applyconfiguration/utils.go
  22. +120 −0 client/clientset/versioned/clientset.go
  23. +85 −0 client/clientset/versioned/fake/clientset_generated.go
  24. +20 −0 client/clientset/versioned/fake/doc.go
  25. +56 −0 client/clientset/versioned/fake/register.go
  26. +20 −0 client/clientset/versioned/scheme/doc.go
  27. +56 −0 client/clientset/versioned/scheme/register.go
  28. +112 −0 client/clientset/versioned/typed/codeflare/v1alpha1/codeflare_client.go
  29. +20 −0 client/clientset/versioned/typed/codeflare/v1alpha1/doc.go
  30. +20 −0 client/clientset/versioned/typed/codeflare/v1alpha1/fake/doc.go
  31. +44 −0 client/clientset/versioned/typed/codeflare/v1alpha1/fake/fake_codeflare_client.go
  32. +189 −0 client/clientset/versioned/typed/codeflare/v1alpha1/fake/fake_instascale.go
  33. +189 −0 client/clientset/versioned/typed/codeflare/v1alpha1/fake/fake_mcad.go
  34. +23 −0 client/clientset/versioned/typed/codeflare/v1alpha1/generated_expansion.go
  35. +256 −0 client/clientset/versioned/typed/codeflare/v1alpha1/instascale.go
  36. +256 −0 client/clientset/versioned/typed/codeflare/v1alpha1/mcad.go
  37. +46 −0 client/informer/externalversions/codeflare/interface.go
  38. +90 −0 client/informer/externalversions/codeflare/v1alpha1/instascale.go
  39. +52 −0 client/informer/externalversions/codeflare/v1alpha1/interface.go
  40. +90 −0 client/informer/externalversions/codeflare/v1alpha1/mcad.go
  41. +251 −0 client/informer/externalversions/factory.go
  42. +64 −0 client/informer/externalversions/generic.go
  43. +40 −0 client/informer/externalversions/internalinterfaces/factory_interfaces.go
  44. +35 −0 client/listers/codeflare/v1alpha1/expansion_generated.go
  45. +99 −0 client/listers/codeflare/v1alpha1/instascale.go
  46. +99 −0 client/listers/codeflare/v1alpha1/mcad.go
  47. +11 −0 config/crd/bases/codeflare.codeflare.dev_mcads.yaml
  48. +3 −1 config/crd/mcad/kustomization.yaml
  49. +17 −0 config/internal/instascale/clusterrole.yaml.tmpl
  50. +11 −0 config/internal/instascale/deployment.yaml.tmpl
  51. +14 −2 config/internal/mcad/deployment.yaml.tmpl
  52. +0 −4 config/manager/kustomization.yaml
  53. +0 −7 config/manifests/kustomization.yaml
  54. +2 −1 controllers/defaults.go
  55. +1 −1 controllers/instascale.go
  56. +5 −5 controllers/instascale_controller.go
  57. +1 −1 controllers/instascale_controller_test.go
  58. +1 −1 controllers/instascale_params.go
  59. +35 −37 controllers/mcad_controller.go
  60. +15 −3 controllers/mcad_controller_test.go
  61. +23 −8 controllers/mcad_params.go
  62. +17 −1 controllers/multi_cluster_app_dispatcher.go
  63. +16 −14 controllers/suite_test.go
  64. +18 −0 controllers/testdata/instascale_test_results/case_1/clusterrole.yaml
  65. +1 −1 controllers/testdata/instascale_test_results/case_1/deployment.yaml
  66. +1 −1 controllers/testdata/instascale_test_results/case_2/deployment.yaml
  67. +6 −0 controllers/testdata/mcad_test_cases/case_3.yaml
  68. +45 −0 controllers/testdata/mcad_test_results/case_3/deployment.yaml
  69. +3 −1 go.mod
  70. +4 −0 go.sum
  71. +5 −4 main.go
  72. +31 −0 test/e2e/kind.yaml
  73. +160 −0 test/e2e/mnist.py
  74. +3 −0 test/e2e/mnist_pip_requirements.txt
  75. +161 −0 test/e2e/mnist_pytorch_mcad_job_test.go
  76. +65 −0 test/e2e/mnist_raycluster_sdk.py
  77. +210 −0 test/e2e/mnist_raycluster_sdk_test.go
  78. +265 −0 test/e2e/mnist_rayjob_mcad_raycluster_test.go
  79. +73 −0 test/e2e/setup.sh
  80. +35 −0 test/e2e/support.go
  81. +58 −0 test/support/batch.go
  82. +99 −0 test/support/client.go
  83. +58 −0 test/support/codeflare.go
  84. +53 −0 test/support/conditions.go
  85. +59 −0 test/support/core.go
  86. +11 −0 test/support/defaults.go
  87. +38 −0 test/support/mcad.go
  88. +55 −0 test/support/namespace.go
  89. +33 −0 test/support/openshift.go
  90. +70 −0 test/support/ray.go
  91. +68 −0 test/support/support.go
  92. +137 −0 test/support/test.go
  93. +41 −0 test/support/utils.go
157 changes: 157 additions & 0 deletions .github/workflows/e2e_tests.yaml
@@ -0,0 +1,157 @@
name: e2e

on:
  pull_request:
    branches:
      - main
      - 'release-*'
    paths-ignore:
      - 'docs/**'
      - '**.adoc'
      - '**.md'
      - 'LICENSE'
  push:
    branches:
      - main
      - 'release-*'
    paths-ignore:
      - 'docs/**'
      - '**.adoc'
      - '**.md'
      - 'LICENSE'

concurrency:
  group: ${{ github.head_ref }}-${{ github.workflow }}
  cancel-in-progress: true

jobs:
  kubernetes:

    runs-on: ubuntu-20.04

    steps:
      - name: Cleanup
        run: |
          ls -lart
          echo "Initial status:"
          df -h
          echo "Cleaning up resources:"
          sudo swapoff -a
          sudo rm -f /swapfile
          sudo apt clean
          sudo rm -rf /usr/share/dotnet
          sudo rm -rf /opt/ghc
          sudo rm -rf "/usr/local/share/boost"
          sudo rm -rf "$AGENT_TOOLSDIRECTORY"
          docker rmi $(docker image ls -aq)
          echo "Final status:"
          df -h
      - name: Checkout code
        uses: actions/checkout@v3
        with:
          submodules: recursive

      - name: Init directories
        run: |
          TEMP_DIR="$(pwd)/tmp"
          mkdir -p "${TEMP_DIR}"
          echo "TEMP_DIR=${TEMP_DIR}" >> $GITHUB_ENV
          mkdir -p "$(pwd)/bin"
          echo "$(pwd)/bin" >> $GITHUB_PATH
      - name: Set Go
        uses: actions/setup-go@v3
        with:
          go-version: v1.18

      - name: Set up gotestfmt
        uses: gotesttools/gotestfmt-action@v2
        with:
          token: ${{ secrets.GITHUB_TOKEN }}

      - name: Container image registry
        run: |
          podman run -d -p 5000:5000 --name registry registry:2.8.1
          export REGISTRY_ADDRESS=$(hostname -i):5000
          echo "REGISTRY_ADDRESS=${REGISTRY_ADDRESS}" >> $GITHUB_ENV
          echo "Container image registry started at ${REGISTRY_ADDRESS}"
          KIND_CONFIG_FILE=${{ env.TEMP_DIR }}/kind.yaml
          echo "KIND_CONFIG_FILE=${KIND_CONFIG_FILE}" >> $GITHUB_ENV
          envsubst < ./test/e2e/kind.yaml > ${KIND_CONFIG_FILE}
          sudo --preserve-env=REGISTRY_ADDRESS sh -c 'cat > /etc/containers/registries.conf.d/local.conf <<EOF
          [[registry]]
          prefix = "$REGISTRY_ADDRESS"
          insecure = true
          location = "$REGISTRY_ADDRESS"
          EOF'
      - name: Setup KinD cluster
        uses: helm/kind-action@v1.5.0
        with:
          cluster_name: cluster
          version: v0.17.0
          config: ${{ env.KIND_CONFIG_FILE }}

      - name: Print cluster info
        run: |
          echo "KinD cluster:"
          kubectl cluster-info
          kubectl describe nodes
      - name: Deploy CodeFlare stack
        id: deploy
        run: |
          echo Deploying CodeFlare operator
          IMG="${REGISTRY_ADDRESS}"/codeflare-operator
          make image-push -e IMG="${IMG}"
          make deploy -e IMG="${IMG}"
          kubectl wait --timeout=120s --for=condition=Available=true deployment -n openshift-operators codeflare-operator-manager
          echo Setting up CodeFlare stack
          make setup-e2e
      - name: Run e2e tests
        run: |
          export CODEFLARE_TEST_TIMEOUT_SHORT=1m
          export CODEFLARE_TEST_TIMEOUT_MEDIUM=3m
          export CODEFLARE_TEST_TIMEOUT_LONG=8m
          export CODEFLARE_TEST_OUTPUT_DIR=${{ env.TEMP_DIR }}
          echo "CODEFLARE_TEST_OUTPUT_DIR=${CODEFLARE_TEST_OUTPUT_DIR}" >> $GITHUB_ENV
          set -euo pipefail
          go test -timeout 30m -v ./test/e2e -json 2>&1 | tee ${CODEFLARE_TEST_OUTPUT_DIR}/gotest.log | gotestfmt
      - name: Print CodeFlare operator logs
        if: always() && steps.deploy.outcome == 'success'
        run: |
          echo "Printing CodeFlare operator logs"
          kubectl logs -n openshift-operators --tail -1 -l app.kubernetes.io/name=codeflare-operator | tee ${CODEFLARE_TEST_OUTPUT_DIR}/codeflare-operator.log
      - name: Print MCAD controller logs
        if: always() && steps.deploy.outcome == 'success'
        run: |
          echo "Printing MCAD controller logs"
          kubectl logs -n codeflare-system --tail -1 -l component=multi-cluster-application-dispatcher | tee ${CODEFLARE_TEST_OUTPUT_DIR}/mcad.log
      - name: Print KubeRay operator logs
        if: always() && steps.deploy.outcome == 'success'
        run: |
          echo "Printing KubeRay operator logs"
          kubectl logs -n ray-system --tail -1 -l app.kubernetes.io/name=kuberay | tee ${CODEFLARE_TEST_OUTPUT_DIR}/kuberay.log
      - name: Upload logs
        uses: actions/upload-artifact@v3
        if: always() && steps.deploy.outcome == 'success'
        with:
          name: logs
          retention-days: 10
          path: |
            ${{ env.CODEFLARE_TEST_OUTPUT_DIR }}/**/*.log
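
Note on the "Run e2e tests" step: the `set -euo pipefail` line matters because `go test` is piped through `tee` and `gotestfmt`. Without `pipefail`, a pipeline's exit status is that of its last command, so a failing test run could be reported as a passing step. A minimal sketch of the difference, where `false` stands in for a failing `go test` and `cat` for the downstream formatter (both stand-ins are assumptions made for illustration, not commands from the workflow):

```shell
#!/usr/bin/env bash
# Stand-ins: `false` plays a failing `go test`, `cat` plays `tee`/`gotestfmt`.

# Without pipefail, the pipeline reports the status of its last command.
set +o pipefail
if false | cat; then status_without=0; else status_without=$?; fi

# With pipefail, a failing command earlier in the pipeline sets the status.
set -o pipefail
if false | cat; then status_with=0; else status_with=$?; fi
set +o pipefail

echo "without pipefail: ${status_without}"
echo "with pipefail: ${status_with}"
```

The sketch uses `if` to capture the status so that it also runs under `set -e`, which would otherwise stop the script at the first failing pipeline.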
4 changes: 2 additions & 2 deletions .github/workflows/precommit.yml
@@ -21,10 +21,10 @@ jobs:
       volumes:
         - /cache
     steps:
-      - uses: actions/checkout@v2
+      - uses: actions/checkout@v3
 
       - name: Activate cache
-        uses: actions/cache@v2
+        uses: actions/cache@v3
         with:
           path: /cache
           key: ${{ runner.os }}-cache-${{ hashFiles('**/go.sum', '.pre-commit-config.yaml') }}
124 changes: 101 additions & 23 deletions .github/workflows/tag-and-build.yml
@@ -7,35 +7,69 @@ on:
version:
description: 'Tag to be used for operator image'
required: true
default: '0.0.0-dev'
default: 'v0.0.0-dev'
replaces:
description: 'The previous semantic version that this tag replaces.'
required: true
default: '0.0.0-dev'
default: 'v0.0.0-dev'
mcad-version:
description: 'Published version of multi-cluster-app-dispatcher'
required: true
default: 'v0.0.0-dev'
codeflare-sdk-version:
description: 'Published version of CodeFlare-SDK'
required: true
default: 'v0.0.0-dev'
instascale-version:
description: 'Published version of InstaScale'
required: true
default: 'v0.0.0-dev'
is-stable:
description: 'Select if the built image should be tagged as stable'
required: true
type: boolean
quay-organization:
description: 'Quay organization used to push the built images to'
required: true
default: 'project-codeflare'
community-operators-prod-fork-organization:
description: 'Owner of forked community-operators-prod repository used to push bundle files to'
required: true
default: 'project-codeflare'
community-operators-prod-organization:
description: 'Owner of target community-operators-prod repository used to open a PR against'
required: true
default: 'redhat-openshift-ecosystem'

jobs:
push:
runs-on: ubuntu-latest

# Permission required to create a release
permissions:
contents: write

steps:
- uses: actions/checkout@v3

- name: Verify that release doesn't exist yet
shell: bash {0}
run: |
gh release view ${{ github.event.inputs.version }}
status=$?
if [[ $status -eq 0 ]]; then
echo "Release ${{ github.event.inputs.version }} already exists."
exit 1
fi
env:
GITHUB_TOKEN: ${{ github.TOKEN }}

- name: Activate cache
uses: actions/cache@v2
uses: actions/cache@v3
with:
path: /cache
key: ${{ runner.os }}-cache-${{ hashFiles('**/go.sum', '.pre-commit-config.yaml') }}

- name: Create tag
uses: actions/github-script@v6
with:
script: |
github.rest.git.createRef({
owner: context.repo.owner,
repo: context.repo.repo,
ref: 'refs/tags/${{ github.event.inputs.version }}',
sha: context.sha
})
- name: Install operator-sdk
run: make install-operator-sdk
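
Note on the "Verify that release doesn't exist yet" step in the hunk above: it runs with `shell: bash {0}`, which replaces the default fail-fast bash invocation, so the non-zero exit of `gh release view` for a missing release does not abort the step and can be inspected through `$?`. The same decision can be sketched with an `if`, which is also safe under `set -e`. In the sketch below, `view_release` is a hypothetical stand-in for `gh release view`, and the tag list is hard-coded so the example runs without network access or a token:

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for `gh release view <tag>`: succeeds only for the
# tags listed here, so the sketch needs no GitHub access.
view_release() {
  case "$1" in
    v0.0.4) return 0 ;;
    *) return 1 ;;
  esac
}

# Returns 1 when the release already exists, 0 when it is safe to proceed.
check_release_absent() {
  if view_release "$1"; then
    echo "Release $1 already exists."
    return 1
  fi
  echo "Release $1 not found, continuing."
  return 0
}

check_release_absent v0.0.5
check_release_absent v0.0.4 || echo "The workflow would stop here."
```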

@@ -46,18 +80,62 @@ jobs:
password: ${{ secrets.QUAY_TOKEN }}
registry: quay.io

- name: Image Build
- name: Image Build and Push
run: |
make build
make bundle
make image-build -e IMG=quay.io/project-codeflare/codeflare-operator:${SOURCE_TAG}
podman tag quay.io/project-codeflare/codeflare-operator:${SOURCE_TAG} quay.io/project-codeflare/codeflare-operator:latest
make image-build -e IMG=quay.io/${{ github.event.inputs.quay-organization }}/codeflare-operator:${{ github.event.inputs.version }}
make image-push -e IMG=quay.io/${{ github.event.inputs.quay-organization }}/codeflare-operator:${{ github.event.inputs.version }}
- name: Image Push as stable tag
if: ${{ inputs.is-stable }}
run: |
podman tag quay.io/${{ github.event.inputs.quay-organization }}/codeflare-operator:${{ github.event.inputs.version }} quay.io/${{ github.event.inputs.quay-organization }}/codeflare-operator:stable
make image-push -e IMG=quay.io/${{ github.event.inputs.quay-organization }}/codeflare-operator:stable
- name: Build bundle and create PR in OpenShift community operators repository
run: |
git config --global user.email "138894154+codeflare-machine-account@users.noreply.github.com"
git config --global user.name "codeflare-machine-account"
make openshift-community-operator-release
env:
SOURCE_TAG: ${{ github.event.inputs.version }}
VERSION: ${{ github.event.inputs.version }}
PREVIOUS_VERSION: ${{ github.event.inputs.replaces }}
INSTASCALE_VERSION: ${{ github.event.inputs.instascale-version }}
MCAD_VERSION: ${{ github.event.inputs.mcad-version }}
GH_TOKEN: ${{ secrets.CODEFLARE_MACHINE_ACCOUNT_TOKEN }}
IMAGE_ORG_BASE: quay.io/${{ github.event.inputs.quay-organization }}
OPERATORS_REPO_FORK_ORG: ${{ github.event.inputs.community-operators-prod-fork-organization }}
OPERATORS_REPO_ORG: ${{ github.event.inputs.community-operators-prod-organization }}

- name: Adjust Compatibility Matrix in readme
run: |
sed -i -E "s/(.*CodeFlare Operator.*)v[0-9]+\.[0-9]+\.[0-9]+(.*)/\1${{ github.event.inputs.version }}\2/" README.md
sed -i -E "s/(.*Multi-Cluster App Dispatcher.*)v[0-9]+\.[0-9]+\.[0-9]+(.*)/\1${{ github.event.inputs.mcad-version }}\2/" README.md
sed -i -E "s/(.*CodeFlare-SDK.*)v[0-9]+\.[0-9]+\.[0-9]+(.*)/\1${{ github.event.inputs.codeflare-sdk-version }}\2/" README.md
sed -i -E "s/(.*InstaScale.*)v[0-9]+\.[0-9]+\.[0-9]+(.*)/\1${{ github.event.inputs.instascale-version }}\2/" README.md
- name: Adjust MCAD and InstaScale dependencies in the code
run: |
sed -i -E "s/(.*MCAD_VERSION \?= )v[0-9]+\.[0-9]+\.[0-9]+(.*)/\1${{ github.event.inputs.mcad-version }}\2/" Makefile
sed -i -E "s/(.*INSTASCALE_VERSION \?= )v[0-9]+\.[0-9]+\.[0-9]+(.*)/\1${{ github.event.inputs.instascale-version }}\2/" Makefile
sed -i -E "s/(.*instascale-controller:)v[0-9]+\.[0-9]+\.[0-9]+(.*)/\1${{ github.event.inputs.instascale-version }}\2/" controllers/testdata/instascale_test_results/case_1/deployment.yaml
sed -i -E "s/(.*instascale-controller:)v[0-9]+\.[0-9]+\.[0-9]+(.*)/\1${{ github.event.inputs.instascale-version }}\2/" controllers/testdata/instascale_test_results/case_2/deployment.yaml
- name: Commit readme changes back to repository
uses: stefanzweifel/git-auto-commit-action@v4
with:
commit_message: Update dependency versions for release ${{ github.event.inputs.version }}
file_pattern: 'README.md controllers/defaults.go *.yaml Makefile'

- name: Image Push
- name: Creates a release in GitHub
run: |
make image-push -e IMG=quay.io/project-codeflare/codeflare-operator:${SOURCE_TAG}
make image-push -e IMG=quay.io/project-codeflare/codeflare-operator:latest
gh release create ${{ github.event.inputs.version }} --target ${{ github.ref }} --generate-notes
# Edit notes to add there compatibility matrix
sed --null-data -E "s/(.*<\!-- Compatibility Matrix start -->)(.*)(<\!-- Compatibility Matrix end -->.*)/\2/" README.md > release-notes.md
echo "" >> release-notes.md
echo "$(gh release view --json body --jq .body)" >> release-notes.md
gh release edit ${{ github.event.inputs.version }} --notes-file release-notes.md
rm release-notes.md
env:
SOURCE_TAG: ${{ github.event.inputs.version }}
GITHUB_TOKEN: ${{ github.TOKEN }}
shell: bash
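
Note on the "Adjust Compatibility Matrix in readme" step: each `sed` expression looks for a line that names a component and replaces the `vX.Y.Z` token on that line, keeping the surrounding text through the two capture groups. A minimal sketch on a throwaway file follows. The table rows and version numbers are made up for illustration, `new_version` plays the role of the workflow's `version` input, and the sketch writes to a variable rather than using `sed -i` so that it does not depend on GNU-specific in-place editing:

```shell
#!/usr/bin/env bash
# Throwaway input; the rows and versions are illustrative only.
readme="$(mktemp)"
printf '%s\n' \
  '| CodeFlare Operator | v0.0.4 |' \
  '| InstaScale | v0.0.3 |' > "${readme}"

new_version="v0.0.5"  # hypothetical value standing in for inputs.version

# Group 1 keeps everything up to the version token and group 2 everything
# after it. Only the line mentioning "CodeFlare Operator" matches.
result="$(sed -E "s/(.*CodeFlare Operator.*)v[0-9]+\.[0-9]+\.[0-9]+(.*)/\1${new_version}\2/" "${readme}")"
rm -f "${readme}"

echo "${result}"
```

Because the pattern only matches a bare `vX.Y.Z` token, a row that carries no version in that form is left unchanged.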
4 changes: 2 additions & 2 deletions .github/workflows/unit_tests.yml
@@ -22,10 +22,10 @@ jobs:
       volumes:
         - /cache
     steps:
-      - uses: actions/checkout@v2
+      - uses: actions/checkout@v3
 
       - name: Activate cache
-        uses: actions/cache@v2
+        uses: actions/cache@v3
         with:
           path: /cache
           key: ${{ runner.os }}-cache-${{ hashFiles('**/go.sum', '.pre-commit-config.yaml') }}
1 change: 0 additions & 1 deletion .pre-commit-config.yaml
@@ -25,5 +25,4 @@ repos:
hooks:
- id: go-fmt
- id: golangci-lint
- id: go-build
- id: go-mod-tidy
11 changes: 9 additions & 2 deletions CONTRIBUTING.md
@@ -31,14 +31,21 @@ If changes are made to any Go code (like in the `controllers` dir for example),
 - This will check and build/compile the modified code
 
 For building and pushing a new version of the operator image:
-- `make image-build -e IMG=<image-repo/image-name>`
-- `make image-push -e IMG<image-repo/image-name>`
+- `make image-build -e IMAGE_TAG_BASE=<image-repo/image-name> VERSION=<semver>`
+- `make image-push -e IMAGE_TAG_BASE=<image-repo/image-name> VERSION=<semver>`
 
 For deploying onto a cluster:
 - First, either set `KUBECONFIG` or ensure you are logged into a cluster in your environment
 - `make install`
 - `make deploy -e IMG=<image-repo/image-name>`
 
+For building and pushing a new version of the bundled operator image:
+- `make bundle-build -e IMAGE_TAG_BASE=<image-repo/image-name> VERSION=<new semver> PREVIOUS_VERSION=<semver to replace>`
+- `make bundle-push -e IMAGE_TAG_BASE=<image-repo/image-name> VERSION=<new semver> PREVIOUS_VERSION=<semver to replace>`
+
+To create a new openshift-community-operator-release:
+- `make openshift-community-operator-release -e IMAGE_TAG_BASE=<image-repo/image-name> VERSION=<new semver> PREVIOUS_VERSION=<semver to replace> GH_TOKEN=<GitHub token for pushing bundle content to forked repository>`
+
 ## Testing
 The CodeFlare Operator currently has unit tests and pre-commit checks
 - To enable and view pre-commit checks: `pre-commit install`