[Feature]: FMA actuation path classification, hit rates, and per-path timing (Phases 1-2) #1422

### Component

Harnesses

### Desired use case or feature

The FMA benchmarking doc ([llm-d-fast-model-actuation PR #454](https://github.com/llm-d-incubation/llm-d-fast-model-actuation/pull/454)) defines new metrics that the nop harness should collect for FMA actuation paths:

- **Hot_hit_rate**: Fraction of server-requesting Pods satisfied by waking a sleeping vLLM instance
- **Warm_hit_rate**: Fraction satisfied by an existing launcher pod (no new launcher creation needed)
- **T_wake**: Hot-start timing (requester creation to ready)
- **T_instance_create**: Warm-start timing (launcher receiving create request to DPC readiness relay)
- **T_cold_launcher**: Cold-start-with-launcher timing (launcher pod creation to DPC readiness relay)

The current `fma_functions.py` classifies actuation paths with a pod-name heuristic (flagged in the code as `TODO: Improve the warm/luke_warm check`). That heuristic should be replaced with timestamp-based classification.

### Proposed solution

This work is a single combined PR covering Phase 1 (classification + hit rates) and Phase 2 (per-path timing). The PR will be opened once the FMA repo logging prerequisites ([#495](https://github.com/llm-d-incubation/llm-d-fast-model-actuation/issues/495), [#497](https://github.com/llm-d-incubation/llm-d-fast-model-actuation/issues/497)) are resolved.

**Phase 1: Classification and hit rates**

1. Rename `FMAActuationCondition.T_LUKE_WARM` to `T_COLD_LAUNCHER` (aligns with updated FMA terminology)
2. Store `launcher_creation_timestamp` in `FMALauncherInfo`
3. Replace pod-name heuristic with timestamp comparison:
   - HOT: sleep/wake metrics indicate a wake event
   - WARM: launcher pod `creationTimestamp` < requester pod `creationTimestamp` (pre-existing launcher)
   - COLD_LAUNCHER: launcher created after requester (DPC had to create a new launcher)
4. Compute `hot_hit_rate`, `warm_hit_rate`, `cold_launcher_rate` per iteration in `FMAMetricsIteration`
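
The Phase 1 steps above could be sketched roughly as follows. This is a minimal illustration, not the harness code: the enum member names `T_HOT` and `T_WARM`, the helper names `parse_k8s_timestamp`, `classify_actuation`, and `hit_rates`, and the `woke_sleeping_instance` flag (standing in for whatever the sleep/wake metrics report) are all assumptions. The issue does not say how equal timestamps should be classified; the sketch treats a tie as COLD_LAUNCHER, which is also an assumption.

```python
from collections import Counter
from datetime import datetime
from enum import Enum


class FMAActuationCondition(Enum):
    # Values mirror the strings named in the success criteria.
    # Member names other than T_COLD_LAUNCHER are assumed.
    T_HOT = "T_hot"
    T_WARM = "T_warm"
    T_COLD_LAUNCHER = "T_cold_launcher"


def parse_k8s_timestamp(value: str) -> datetime:
    """Parse a Kubernetes creationTimestamp such as '2025-01-01T12:00:00Z'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def classify_actuation(woke_sleeping_instance: bool,
                       launcher_creation: datetime,
                       requester_creation: datetime) -> FMAActuationCondition:
    """Classify one server-requesting Pod by actuation path."""
    if woke_sleeping_instance:
        return FMAActuationCondition.T_HOT
    if launcher_creation < requester_creation:
        # Launcher existed before the requester: warm start.
        return FMAActuationCondition.T_WARM
    # Launcher created at or after the requester (tie handling is assumed).
    return FMAActuationCondition.T_COLD_LAUNCHER


def hit_rates(conditions: list) -> dict:
    """Per-iteration fraction of requesters on each path."""
    total = len(conditions)
    counts = Counter(conditions)

    def rate(cond: FMAActuationCondition) -> float:
        return counts[cond] / total if total else 0.0

    return {
        "hot_hit_rate": rate(FMAActuationCondition.T_HOT),
        "warm_hit_rate": rate(FMAActuationCondition.T_WARM),
        "cold_launcher_rate": rate(FMAActuationCondition.T_COLD_LAUNCHER),
    }
```

Because Kubernetes `creationTimestamp` values have one-second resolution, ties are possible in practice, so the real implementation will need an explicit rule for that case.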

**Phase 2: Per-path timing (upper bound via Kube timestamps)**

5. Add `t_wake`, `t_instance_create`, `t_cold_launcher` fields to `FMALauncherInfo`
6. Compute upper-bound timing:
   - T_wake: `requester_ready - requester_creation` (hot path)
   - T_instance_create: `requester_ready - requester_creation` (warm path)
   - T_cold_launcher: `requester_ready - launcher_creation` (cold path)

These values are upper bounds because they include kubelet polling delay. Tighter measurements via DPC log parsing are a follow-up, to be done once the #495/#497 logs are deployed and available on test clusters.
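
As a rough illustration of the upper-bound formulas in step 6, the sketch below picks the start timestamp and the output field from the actuation path. The function name `upper_bound_timing`, the use of plain strings for the path, and reporting the result in seconds are assumptions made for this example only.

```python
from datetime import datetime, timezone

# Path value -> timing field that should be populated for it.
TIMING_FIELD = {
    "T_hot": "t_wake",
    "T_warm": "t_instance_create",
    "T_cold_launcher": "t_cold_launcher",
}


def upper_bound_timing(path: str,
                       requester_creation: datetime,
                       requester_ready: datetime,
                       launcher_creation: datetime) -> dict:
    """Return {field: seconds} for the timing metric that applies to `path`."""
    # The cold path is measured from launcher creation; hot and warm paths
    # are measured from requester creation.
    start = launcher_creation if path == "T_cold_launcher" else requester_creation
    return {TIMING_FIELD[path]: (requester_ready - start).total_seconds()}


# Usage with made-up timestamps (not measured data).
def at(second: int) -> datetime:
    return datetime(2025, 1, 1, 12, 0, second, tzinfo=timezone.utc)


example = upper_bound_timing("T_cold_launcher",
                             requester_creation=at(10),
                             requester_ready=at(40),
                             launcher_creation=at(11))
# example == {"t_cold_launcher": 29.0}
```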

**Success criteria:**
- JSON output includes `actuation_condition` with values `T_cold_launcher`, `T_warm`, or `T_hot`
- Each iteration reports `hot_hit_rate`, `warm_hit_rate`, `cold_launcher_rate`
- Per-launcher info includes the relevant timing metric (`t_wake`, `t_instance_create`, or `t_cold_launcher`)
- Classification is verified against each actuation path config (hot, warm, cold with launcher)
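
One way to check the first three criteria mechanically is a small validator over the parsed JSON output. The sketch below is hypothetical: the function name `check_iteration` and the assumption that per-launcher entries live under a `launchers` key are not taken from the harness, whose actual output layout may differ.

```python
# Expected timing field for each actuation_condition value.
EXPECTED_TIMING_FIELD = {
    "T_hot": "t_wake",
    "T_warm": "t_instance_create",
    "T_cold_launcher": "t_cold_launcher",
}
RATE_FIELDS = ("hot_hit_rate", "warm_hit_rate", "cold_launcher_rate")


def check_iteration(record: dict) -> list:
    """Return a list of problems found in one iteration record (empty if none)."""
    problems = []
    for name in RATE_FIELDS:
        if name not in record:
            problems.append(f"missing {name}")
    # "launchers" is an assumed key for the per-launcher info list.
    for i, info in enumerate(record.get("launchers", [])):
        cond = info.get("actuation_condition")
        if cond not in EXPECTED_TIMING_FIELD:
            problems.append(f"launcher {i}: unexpected actuation_condition {cond!r}")
            continue
        field = EXPECTED_TIMING_FIELD[cond]
        if info.get(field) is None:
            problems.append(f"launcher {i}: missing {field}")
    return problems
```

Running such a check once per actuation path config (hot, warm, cold with launcher) would cover the last criterion as well, provided each config is expected to produce a known classification.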

### Alternatives

- Pod-name heuristic (current approach): fragile, breaks if naming conventions change
- Separate Phase 1 and Phase 2 into two PRs: unnecessary complexity for reviewers given the shared data structures

### Additional context or screenshots

- Depends on FMA repo issues [#495](https://github.com/llm-d-incubation/llm-d-fast-model-actuation/issues/495) and [#497](https://github.com/llm-d-incubation/llm-d-fast-model-actuation/issues/497) being resolved first
- Related PR: [llm-d-fast-model-actuation #454](https://github.com/llm-d-incubation/llm-d-fast-model-actuation/pull/454) (benchmarking doc defining these metrics)
