Update ui-meta for new quota checks #40

Draft
wants to merge 5 commits into
base: main
3 changes: 0 additions & 3 deletions group_vars/openstack.yml
@@ -17,9 +17,6 @@ terraform_project_path: "{{ playbook_dir }}/terraform"
  terraform_state: "{{ cluster_state | default('present') }}"
  cluster_ssh_user: rocky

- # Set the size of the state volume to metrics_db_maximum_size + 10
- state_volume_size: "{{ metrics_db_maximum_size + 10 }}"
-
  # Provision a single "standard" compute partition using the supplied
  # node count and flavor
  openhpc_slurm_partitions:
3 changes: 2 additions & 1 deletion group_vars/prometheus.yml
@@ -8,4 +8,5 @@ openondemand_address: "{{ hostvars[groups['openondemand'].0].api_address if 'ope
  prometheus_scrape_configs: "{{ prometheus_scrape_configs_default + (openondemand_scrape_configs if ( 'openondemand' in groups ) else [] ) }}"

  # Set Prometheus storage retention size
- prometheus_storage_retention_size: "{{ metrics_db_maximum_size }}GB"
+ # We reserve 10GB of the state volume for cluster state, the rest is for metrics
+ prometheus_storage_retention_size: "{{ state_volume_size - 10 }}GB"
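
The arithmetic behind the new retention setting can be sketched as a small stand-alone playbook. This is a minimal sketch, not part of this PR: the playbook itself, its task names and the `| int` cast are assumptions added for illustration, and only `state_volume_size` and `prometheus_storage_retention_size` come from the diff. It assumes `state_volume_size` is supplied in GB, as the UI label suggests.

```yaml
# Hypothetical stand-alone check playbook; not part of this PR.
- hosts: localhost
  gather_facts: false
  vars:
    # Assumed input, matching the new UI default and minimum of 20 (GB).
    state_volume_size: 20
    # Same expression as group_vars/prometheus.yml, plus an `| int` cast
    # (added here for illustration) in case the value arrives as a string.
    prometheus_storage_retention_size: "{{ (state_volume_size | int) - 10 }}GB"
  tasks:
    - name: Check that space remains for metrics after the 10GB reservation
      ansible.builtin.assert:
        that:
          - (state_volume_size | int) > 10
        fail_msg: "state_volume_size must exceed the 10GB reserved for cluster state"

    - name: Show the derived retention size
      ansible.builtin.debug:
        var: prometheus_storage_retention_size
```

With `state_volume_size: 20` the expression evaluates to `10GB`, so the UI minimum of 20 leaves at least 10GB for metrics.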
43 changes: 31 additions & 12 deletions ui-meta/slurm-infra-fast-volume-type.yml
@@ -12,6 +12,24 @@ parameters:
  kind: cloud.ip
  immutable: true

+ - name: login_flavor
+ label: Login node size
+ description: The size to use for the login node.
+ kind: cloud.size
+ immutable: true
+ options:
+ min_ram: 2048
+ min_disk: 20
+
+ - name: control_flavor
+ label: Control node size
+ description: The size to use for the control node.
+ kind: cloud.size
+ immutable: true
+ options:
+ min_ram: 2048
+ min_disk: 20
+
  - name: compute_count
  label: Compute node count
  description: The number of compute nodes in the cluster.
@@ -23,16 +41,16 @@ parameters:
  - name: compute_flavor
  label: Compute node size
  description: The size to use for the compute node.
- kind: "cloud.size"
+ kind: cloud.size
  immutable: true
  options:
  min_ram: 2048
  min_disk: 20

  - name: home_volume_size
  label: Home volume size (GB)
- description: The size of the cloud volume to use for home directories
- kind: integer
+ description: The size of the cloud volume to use for home directories.
+ kind: cloud.volume_size
  immutable: true
  options:
  min: 10
@@ -51,19 +69,20 @@ parameters:
  options:
  checkboxLabel: Put home directories on high-performance storage?

- - name: metrics_db_maximum_size
- label: Metrics database size (GB)
+ - name: state_volume_size
+ label: State volume size (GB)
  description: |
+ The size of the state volume, used to hold and persist important files and data. Of
+ this volume, 10GB is set aside for cluster state and the remaining space is used
+ to store cluster metrics.
+
  The oldest metrics records in the [Prometheus](https://prometheus.io/) database will be
- discarded to ensure that the database does not grow larger than this size.
-
- **A cloud volume of this size +10GB will be created to hold and persist the metrics
- database and important Slurm files.**
- kind: integer
+ discarded to ensure that the database does not grow larger than this volume.
+ kind: cloud.volume_size
  immutable: true
  options:
- min: 10
- default: 10
+ min: 20
+ default: 20

  - name: cluster_run_validation
  label: Post-configuration validation
44 changes: 32 additions & 12 deletions ui-meta/slurm-infra.yml
@@ -12,6 +12,24 @@ parameters:
  kind: cloud.ip
  immutable: true

+ - name: login_flavor
+ label: Login node size
+ description: The size to use for the login node.
+ kind: cloud.size
+ immutable: true
+ options:
+ min_ram: 2048
+ min_disk: 20
+
+ - name: control_flavor
+ label: Control node size
+ description: The size to use for the control node.
+ kind: cloud.size
+ immutable: true
+ options:
+ min_ram: 2048
+ min_disk: 20
+
  - name: compute_count
  label: Compute node count
  description: The number of compute nodes in the cluster.
@@ -23,34 +41,36 @@ parameters:
  - name: compute_flavor
  label: Compute node size
  description: The size to use for the compute node.
- kind: "cloud.size"
+ kind: cloud.size
  immutable: true
  options:
+ count_parameter: compute_count
  min_ram: 2048
  min_disk: 20

  - name: home_volume_size
  label: Home volume size (GB)
- description: The size of the cloud volume to use for home directories
- kind: integer
+ description: The size of the cloud volume to use for home directories.
+ kind: cloud.volume_size
  immutable: true
  options:
  min: 10
  default: 100

- - name: metrics_db_maximum_size
- label: Metrics database size (GB)
+ - name: state_volume_size
+ label: State volume size (GB)
  description: |
+ The size of the state volume, used to hold and persist important files and data. Of
+ this volume, 10GB is set aside for cluster state and the remaining space is used
+ to store cluster metrics.
+
  The oldest metrics records in the [Prometheus](https://prometheus.io/) database will be
- discarded to ensure that the database does not grow larger than this size.
-
- **A cloud volume of this size +10GB will be created to hold and persist the metrics
- database and important Slurm files.**
- kind: integer
+ discarded to ensure that the database does not grow larger than this volume.
+ kind: cloud.volume_size
  immutable: true
  options:
- min: 10
- default: 10
+ min: 20
+ default: 20

  - name: cluster_run_validation
  label: Post-configuration validation