Implement orchestration management of lab runners #395

Today, our lab uses Hyper-V machine checkpoints to revert each VM to a known good state.

This is how we achieve (semi-)stateless runners. The Hyper-V revert runs on a schedule: once every 3 hours.

At the very least, if a PR causes a crash/bugcheck, we can be confident the system will heal without manual intervention.

Still, a 3-hour wait may be sub-optimal. We would like to revert before every run. That requires some form of orchestration so that concurrent jobs do not step on each other's toes. We may need to accept some trade-offs: either added complexity, or the fact that not every job gets a fresh state when multiple jobs run concurrently.
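As a rough illustration of the revert-before-every-run idea, here is a minimal sketch of one possible orchestration shape: a pool of idle VMs where each job checks out a VM, reverts it, runs, and returns it. Everything named here (`RunnerPool`, `run_job`, the `revert` callable, the VM names) is hypothetical and not part of any existing tooling in this repo. In a real lab the `revert` callable would presumably wrap a Hyper-V checkpoint restore (for example via the `Restore-VMSnapshot` PowerShell cmdlet), which is stubbed out below.

```python
import queue
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


class RunnerPool:
    """Hands out idle VMs one job at a time and reverts each VM before use.

    Hypothetical sketch: `revert` is supplied by the caller and is assumed
    to restore the named VM to its known good checkpoint.
    """

    def __init__(self, vm_names: Iterable[str], revert: Callable[[str], None]) -> None:
        self._revert = revert
        # Thread-safe queue of VMs that are not currently running a job.
        self._idle: "queue.Queue[str]" = queue.Queue()
        for name in vm_names:
            self._idle.put(name)

    def run_job(self, job: Callable[[str], T]) -> T:
        # Block until some VM is idle, so two jobs never share a VM.
        vm = self._idle.get()
        try:
            # Revert first, so the job always starts from a fresh state.
            self._revert(vm)
            return job(vm)
        finally:
            # Return the VM to the pool even if the revert or the job failed.
            self._idle.put(vm)


if __name__ == "__main__":
    # Stub revert for demonstration only; a real one would call into Hyper-V.
    pool = RunnerPool(["lab-vm-1", "lab-vm-2"], revert=lambda vm: print(f"reverting {vm}"))
    pool.run_job(lambda vm: print(f"running job on {vm}"))
```

This sketch takes the "every job gets a fresh state" side of the trade-off: when all VMs are busy, a new job waits instead of sharing a machine. The alternative mentioned above, letting concurrent jobs share a VM without a fresh revert, would need a different policy than this simple checkout queue.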
