AppWrapper

An AppWrapper contains a collection of Kubernetes resources that a user wants to manage as a single logical workload. AppWrapper is designed to interoperate smoothly with Kueue: it provides a flexible, workload-agnostic mechanism that lets Kueue manage a group of Kubernetes resources as a single logical unit without requiring any Kueue-specific support from the controllers of those resources. Beginning in Kueue 0.11 (and AppWrapper v1.1), AppWrapper is a built-in Kueue integration and is enabled by default. In older versions, Kueue supported AppWrapper as an external framework that had to be explicitly enabled via a custom Kueue configuration.

An AppWrapper can be used to harden workloads by providing an additional level of automatic fault detection and recovery. The AppWrapper controller monitors the health of the workload. If the primary resource controllers do not take corrective action within specified deadlines, the AppWrapper controller orchestrates workload-level retries and resource deletion, so that the workload either returns to a healthy state or is cleanly removed from the cluster and its quota freed for use by other workloads. If Autopilot is also in use on the cluster, the AppWrapper controller can be configured to automatically inject Node anti-affinities into Pods and to trigger retries when Pods in already running workloads are using resources that Autopilot has tagged as unhealthy. For details on customizing and configuring these fault-tolerance capabilities, see the Fault Tolerance section of our website.

AppWrapper is designed to be used as part of a fully open source software stack to run production batch workloads on Kubernetes and OpenShift. The MLBatch project leverages Kueue, the Kubeflow Training Operator, KubeRay, and the Codeflare Operator from Red Hat OpenShift AI. MLBatch enables AppWrapper and adds Coscheduler. MLBatch also includes a number of configuration steps that help these components work together and support large workloads on large clusters.

Installation

To install the latest release of AppWrapper in a Kubernetes cluster with Kueue already installed and configured, simply run the command:

kubectl apply --server-side -f https://github.com/project-codeflare/appwrapper/releases/download/v1.1.1/install.yaml

The controller runs in the appwrapper-system namespace.
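
One way to confirm the installation is sketched below, assuming `kubectl` is configured for the target cluster. The helper name `check_appwrapper_install` is hypothetical, and the CRD name `appwrappers.workload.codeflare.dev` is an assumption inferred from the project's API group; verify it against the release you installed. Only the `appwrapper-system` namespace comes from this README.

```shell
# Hypothetical helper: reports whether AppWrapper appears to be installed.
check_appwrapper_install() {
  # Assumed CRD name; adjust if your release registers a different one.
  kubectl get crd appwrappers.workload.codeflare.dev >/dev/null 2>&1 || {
    echo "AppWrapper CRD not found"
    return 1
  }
  # List the controller pods in the namespace named in this README.
  kubectl get pods -n appwrapper-system
}
```

Calling `check_appwrapper_install` after the `kubectl apply` step should list the controller pod once it has been scheduled.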

Read the Quick Start Guide to learn more.

If you have modified the default configuration of Kueue to set manageJobsWithoutQueueName to true, then you must also apply this patch to your Kueue installation.

Usage

For examples of AppWrapper usage, browse our Samples directory or see the Samples section of the project website.
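
As a rough illustration of what such a sample looks like, the sketch below writes a minimal AppWrapper that wraps a single Pod. It is not copied from the Samples directory. The API version `workload.codeflare.dev/v1beta2`, the queue name `default-queue`, the file name, and the container image are assumptions for illustration; check them against your installed release and your Kueue LocalQueue.

```shell
# Write a minimal AppWrapper manifest that wraps a single Pod.
cat > sample-appwrapper.yaml <<'EOF'
apiVersion: workload.codeflare.dev/v1beta2   # assumed API version; check your release
kind: AppWrapper
metadata:
  name: sample-pod
  labels:
    kueue.x-k8s.io/queue-name: default-queue   # hypothetical LocalQueue name
spec:
  components:
  - template:
      apiVersion: v1
      kind: Pod
      metadata:
        name: sample-pod
      spec:
        restartPolicy: Never
        containers:
        - name: main
          image: busybox:1.36
          command: ["sh", "-c", "sleep 30"]
          resources:
            requests:
              cpu: "1"
EOF

# Submit it once the file looks right (requires a cluster with AppWrapper and Kueue):
# kubectl apply -f sample-appwrapper.yaml
# kubectl get appwrappers
```

Each entry under `spec.components` carries one complete Kubernetes resource in its `template` field, which is how several resources can be queued and managed as one workload.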

Development

To contribute to the AppWrapper project and for detailed instructions on how to build and deploy the project from source, see the Development Setup section of the project website.

License

Copyright 2024 IBM Corporation.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.