Apply PodSetInfo to PipelineRun taskRunTemplate.podTemplate #84

@gbenhaim

Description

What would you like to be added:

Enhance the `RunWithPodSetsInfo` method in the PipelineRun controller to apply the node selector (labelSelector) and tolerations from the `PodSetInfo` to the `pipelinerun.spec.taskRunTemplate.podTemplate` field.

Currently, the `RunWithPodSetsInfo` method in `internal/controller/pipelinerun_controller.go` only clears the spec status (un-suspending the run) and returns:

```go
func (p *PipelineRun) RunWithPodSetsInfo(podSetsInfo []podset.PodSetInfo) error {
	p.Spec.Status = ""
	return nil
}
```

The enhancement should:

1. **Extract node scheduling information from PodSetInfo:** parse the `podSetsInfo` parameter to extract:
   - node selectors from the resource flavor
   - tolerations for tainted nodes
   - any additional pod template specifications
2. **Apply to the PipelineRun taskRunTemplate:** update the PipelineRun's `spec.taskRunTemplate.podTemplate` field with:
   - `nodeSelector` from the `podset.PodSetInfo`
   - `tolerations` from the `podset.PodSetInfo`
3. **Handle multiple PodSetInfo entries:** when multiple `PodSetInfo` entries exist, apply the appropriate scheduling constraints so that all TaskRuns in the PipelineRun are scheduled according to the resource flavor requirements.

Why is this needed:

Currently, when Kueue admits a PipelineRun workload and assigns it to a specific resource flavor, the node scheduling information (labelSelector and tolerations) from the resource flavor is not propagated to the actual TaskRun pods. This creates a disconnect between Kueue's resource management and Tekton's pod scheduling.

This enhancement is critical for several reasons:

  1. Resource Flavor Enforcement: When administrators configure ClusterQueues with specific resource flavors (e.g., GPU nodes, high-memory nodes), the PipelineRun's TaskRuns should actually run on those designated nodes. Without this, workloads might be scheduled on inappropriate nodes despite Kueue's resource allocation.

  2. Node Affinity and Tolerations: Resource flavors often include node selectors and tolerations to target specific node pools (e.g., node-type=gpu, workload-type=build). These constraints must be applied to TaskRun pods to ensure proper scheduling.

  3. Multi-tenant Isolation: In multi-tenant environments, resource flavors provide isolation by directing workloads to specific node pools. This isolation is only effective if TaskRun pods respect these constraints.

  4. Compliance with Kueue Design: The RunWithPodSetsInfo method exists specifically to allow job controllers to apply Kueue-determined scheduling constraints. The current no-op implementation defeats this purpose.

Example Impact:

```yaml
# ResourceFlavor targeting GPU nodes (referenced by a ClusterQueue)
apiVersion: kueue.x-k8s.io/v1beta1
kind: ResourceFlavor
metadata:
  name: gpu-flavor
spec:
  nodeLabels:
    node-type: gpu
  tolerations:
  - key: nvidia.com/gpu
    operator: Equal
    value: "true"
    effect: NoSchedule
```

Without this enhancement, a PipelineRun admitted with the gpu-flavor would not have its TaskRuns scheduled on GPU nodes, leading to resource misallocation and potential workload failures.
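Conversely, with the enhancement in place, an admitted PipelineRun would end up carrying the flavor's constraints in its shared pod template. A sketch of the expected mutation (resource names are illustrative, not actual controller output):

```yaml
apiVersion: tekton.dev/v1
kind: PipelineRun
metadata:
  name: build-on-gpu   # hypothetical PipelineRun
spec:
  taskRunTemplate:
    podTemplate:
      nodeSelector:
        node-type: gpu          # from the flavor's nodeLabels
      tolerations:              # from the flavor's tolerations
      - key: nvidia.com/gpu
        operator: Equal
        value: "true"
        effect: NoSchedule
```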

Completion requirements:

This enhancement requires the following artifacts:

  • Design doc
  • API change
  • Docs update

The artifacts should be linked in subsequent comments.
