
Define availability set and make all instances part of it #128

Open
cmd-ntrf opened this issue Dec 15, 2020 · 5 comments

@cmd-ntrf
Member

cmd-ntrf commented Dec 15, 2020

I would like to configure Azure to use their InfiniBand nodes. Someone from Azure gave me a custom CentOS 7 image with InfiniBand support baked in. I tested it and things seem to work fine. One piece of advice I was given, though, was:

 In order for the two VMs to be in the same IB fabric you have to first create an availability set with:

$ az vm availability-set create --name <as_name> --resource-group <rg-name> --location eastus --platform-fault-domain-count 1 --platform-update-domain-count 1

Then, when you create each VM, assign them to the availability set with the following option:

$ az vm create […] --availability-set <as_name>

Also do not forget to put them on the same network and subnet.

It doesn't look like availability_set_id is currently being used/configured. I'm not sure what the impact of this would be, but it seems like a reasonable thing to do by default for the execution nodes.

Originally posted by @ocaisa in #127 (comment)

@cmd-ntrf cmd-ntrf self-assigned this Dec 15, 2020
@cmd-ntrf cmd-ntrf added azure enhancement New feature or request labels Dec 15, 2020
@ocaisa
Collaborator

ocaisa commented Dec 16, 2020

I implemented this for some (successful) InfiniBand tests on Azure; it required a simple change in azure/infrastructure.tf:

+# Create an availability set for the execution nodes
+resource "azurerm_availability_set" "avset" {
+  name                = "${var.cluster_name}_availability_set"
+  location            = var.location
+  resource_group_name = local.resource_group_name
+  platform_update_domain_count = 1
+  platform_fault_domain_count  = 1
+}
+
...
@@ -326,15 +341,19 @@ resource "azurerm_linux_virtual_machine" "node" {
   location              = each.value["location"]
   resource_group_name   = local.resource_group_name
   network_interface_ids = [azurerm_network_interface.nodeNIC[each.key].id]
+  availability_set_id   = azurerm_availability_set.avset.id

@cmd-ntrf
Member Author

cmd-ntrf commented Dec 16, 2020

Virtual machine scale sets - In a virtual machine scale set, ensure that you limit the deployment to a single placement group for InfiniBand communication within the scale set.

For example, in a Resource Manager template, set the singlePlacementGroup property to true. Note that the maximum scale set size that can be spun up with the singlePlacementGroup property set to true is capped at 100 VMs by default. If your HPC job scale needs are higher than 100 VMs in a single tenant, you may request an increase by opening an online customer support request at no charge. The limit on the number of VMs in a single scale set can be increased to 300.

Note that when deploying VMs using Availability Sets, the maximum limit is 200 VMs per Availability Set.

Ref: https://docs.microsoft.com/en-us/azure/virtual-machines/sizes-hpc#cluster-configuration-options
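For reference, if the execution nodes were ever deployed as a virtual machine scale set rather than as individual VMs, the singlePlacementGroup property quoted above corresponds to the `single_placement_group` argument of the azurerm provider. A sketch only, not how the nodes are currently deployed; the SKU and `var.node_count` are hypothetical:

```hcl
resource "azurerm_linux_virtual_machine_scale_set" "node" {
  name                = "${var.cluster_name}-vmss"
  location            = var.location
  resource_group_name = local.resource_group_name
  sku                 = "Standard_HB120rs_v2" # illustrative IB-capable size
  instances           = var.node_count        # hypothetical variable

  # Maps to the singlePlacementGroup ARM property: keeps all instances in
  # one placement group so they share an InfiniBand fabric (100 VMs by
  # default, up to 300 via a support request).
  single_placement_group = true

  # admin credentials, os_disk, source_image_reference and
  # network_interface blocks omitted for brevity
}
```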

@cmd-ntrf
Member Author

Because of the maximum limit of VMs per availability set, we will want to define a count for the availability set resource that is a function of the number of compute nodes. This raises the question of what to do with heterogeneous clusters: should all compute instances be part of the same availability set, or should the availability set be defined per instance type?
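One way the count could be derived (a sketch, assuming the execution-node count is available as a hypothetical `var.node_count`; note that nodes split across different availability sets are no longer guaranteed to share an InfiniBand fabric):

```hcl
# Enough availability sets to respect the 200-VM-per-set limit.
locals {
  avset_count = ceil(var.node_count / 200) # var.node_count is hypothetical
}

resource "azurerm_availability_set" "avset" {
  count                        = local.avset_count
  name                         = "${var.cluster_name}_availability_set_${count.index}"
  location                     = var.location
  resource_group_name          = local.resource_group_name
  platform_update_domain_count = 1
  platform_fault_domain_count  = 1
}
```

Each node would then pick its set with something like `availability_set_id = azurerm_availability_set.avset[floor(node_index / 200)].id`, where `node_index` is a per-node ordinal (also hypothetical).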

@ocaisa
Collaborator

ocaisa commented Dec 17, 2020

I would say that by default they should be separate availability sets per instance type, since it's not unlikely that you may run into restrictions on the Azure side, but the user should be able to override this (for example, by explicitly naming the availability sets and providing the same name for different instance types). The use case I imagine is GPU and non-GPU nodes.
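A sketch of what per-type sets with an override could look like (the shape of `var.instances` and the `avset_overrides` variable are hypothetical, not existing Magic Castle variables):

```hcl
# Optional map letting the user force several instance types into the
# same availability set by giving them the same name.
variable "avset_overrides" {
  type    = map(string)
  default = {}
}

# One availability set per distinct instance type.
resource "azurerm_availability_set" "avset" {
  for_each                     = toset([for key, values in var.instances : values.type])
  name                         = lookup(var.avset_overrides, each.key, "${var.cluster_name}_${each.key}_avset")
  location                     = var.location
  resource_group_name          = local.resource_group_name
  platform_update_domain_count = 1
  platform_fault_domain_count  = 1
}
```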

@cmd-ntrf
Member Author

AWS has the same concept under a different name: cluster placement groups.
Ref: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/efa-start.html#efa-start-instances

It would be worth looking at all cloud providers supported by MC and implement it at once for all clouds.
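For AWS, the Terraform equivalent would look something like this (a sketch; the AMI variable and instance type are illustrative):

```hcl
# A cluster placement group packs instances close together for
# low-latency networking (required for EFA).
resource "aws_placement_group" "cluster" {
  name     = "${var.cluster_name}_placement_group"
  strategy = "cluster"
}

resource "aws_instance" "node" {
  ami             = var.image_id   # hypothetical variable
  instance_type   = "c5n.18xlarge" # illustrative EFA-capable type
  placement_group = aws_placement_group.cluster.name
}
```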

cmd-ntrf added a commit that referenced this issue Mar 15, 2022
Make profile::base initialize the sudoer account with ssh keys