From 89d54b2b21b517700d4e6dc81c0859515cf8487c Mon Sep 17 00:00:00 2001 From: adamrtalbot <12817534+adamrtalbot@users.noreply.github.com> Date: Thu, 15 Aug 2024 11:17:50 +0100 Subject: [PATCH 01/42] Update Azure Documentation to include Service Principals --- .../compute-envs/azure-batch.mdx | 46 +++++++++++++++++-- 1 file changed, 41 insertions(+), 5 deletions(-) diff --git a/platform_versioned_docs/version-24.1.0/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1.0/compute-envs/azure-batch.mdx index 4df7ff754..027c5b2c3 100644 --- a/platform_versioned_docs/version-24.1.0/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1.0/compute-envs/azure-batch.mdx @@ -21,15 +21,15 @@ Azure uses 'accounts' for each service. For example, an [Azure Storage account][ ### Resource group -To create Azure Batch and Azure Storage accounts, first create a [resource group][az-learn-rg] in your preferred region. +An Azure Resource Group is a logical container that holds related Azure resources such as virtual machines, storage accounts, databases, and more. It serves as a management boundary, allowing you to organize, deploy, monitor, and manage all the resources within it as a single entity. Resources in a Resource Group share the same lifecycle, meaning they can be deployed, updated, and deleted together. This grouping also enables easier access control, monitoring, and cost management, making it a foundational element in organizing and managing cloud infrastructure in Azure. -:::note -A resource group can be created while creating an Azure Storage Account or Azure Batch account. -::: +### Service Principal + +An Azure Service Principal is an identity created for use with applications, hosted services, or automated tools to access Azure resources. It acts like a "user identity" with a specific set of permissions assigned to it. Seqera Platform can use an Azure Service Principal to authenticate and authorize access to Azure Batch for job execution and Azure Storage for data management. By assigning the necessary roles to the Service Principal, Seqera can securely interact with these Azure services, ensuring that only authorized operations are performed during pipeline execution. ### Regions -Azure resources can operate across regions, but this incurs additional costs and security requirements. It is recommended to place all resources in the same region. See the [Azure product page on data residency][az-data-residency] for more information. +Azure regions are distinct geographic areas that contain multiple data centers, strategically located around the world, to provide high availability, fault tolerance, and low latency for cloud services. Each region offers a wide range of Azure services, and by choosing a specific region, users can optimize performance, ensure data residency compliance, and meet regulatory requirements. Azure regions also enable redundancy and disaster recovery options by allowing resources to be replicated across different regions, enhancing the resilience of applications and data. ## Resource group @@ -37,6 +37,12 @@ A resource group in Azure is a unit of related resources in Azure. As a rule of ### Create a resource group +An Azure Batch and Azure Storage account need to be linked to a resource group in Azure, so it is necessary to create one. + +:::note +A resource group can be created while creating an Azure Storage Account or Azure Batch account. +::: + 1. 
Log in to your Azure account, go to the [Create Resource group][az-create-rg] page, and select **Create new resource group**. 2. Enter a name for the resource group, e.g., _towerrg_. 3. Choose the preferred region. @@ -117,6 +123,36 @@ After you have created a resource group and storage account, create a [Batch acc - **Spot/low-priority vCPUs**: Platform does not support spot or low-priority machines when using Forge, so when using Forge this number can be zero. When manually setting up a pool, select an appropriate number of concurrent vCPUs here. - **Total Dedicated vCPUs per VM series**: See the Azure documentation for [virtual machine sizes][az-vm-sizes] to help determine the machine size you need. We recommend the latest version of the ED series available in your region as a cost-effective and appropriately-sized machine for running Nextflow. However, you will need to select alternative machine series that have additional requirements, such as those with additional GPUs or faster storage. Increase the quota by the number of required concurrent CPUs. In Azure, machines are charged per cpu minute so there is no additional cost for a higher number. +### Credentials + +There are two Azure credential options available, primary Access Keys and a Service Principal. Primary access keys are simple to use but provide full access to the storage and batch account. Furthermore there can only be two per account so they are a single point of failure. A Service Principal provides an account which can be granted access to the Azure Batch and Storage resources, providing role based access which can be precisely controlled. Furthermore, some features of Azure Batch are only available to a Service Principal instead of primary access keys. + +:::note +The two Azure Credentials types follow entirely different authenetication methods. You can add more than one credential to a workspace but only one can be used at a time. While they can be used concurrently they are not cross compatible and access granted by one will not be conferred to another. +::: + +#### Access Keys + +1. Navigate to the Azure Portal and sign in. +2. Locate the Azure Batch account and select "Keys" under "Account management". Here you will see the Primary and Secondary keys. Copy one of the keys and save it in a secure location to be used later. +3. Locate the Azure Storage account and under the "Security and Networking" section select "Access keys". Here you will see Key1 and Key2 options. Copy one of them and save it to a secure location to be used later. Be sure to delete them after saving them in Seqera Platform. +4. On Seqera Platform, go to your workspace and select add a new credential and select type "Azure Credentials" +5. Enter a name for the credentials, e.g., _Azure Credentials_. +6. Add the **Batch account** and **Blob Storage** account names and access keys. + +#### Service Princiapl + +1. In the Azure Portal, navigate to "Microsoft Entra ID" and under "App registrations" click "New registration". See [Azure documentation](https://learn.microsoft.com/en-us/entra/identity-platform/howto-create-service-principal-portal) for more details. +2. Provide a name for the application. The application will have a service principal automatically associated with it. +3. We now need to assign roles to the Service Prinicipal. Go to the Azure Storage account and under "Access Control (IAM)", click "Add role assignment". +4. 
Choose the roles "Storage Blob Data Reader" and "Storage Blob Data Contributor", then select "Members", "Select Members" and and search for your newly created Service Principal and assign the role. +5. Repeat the same process for the Azure Batch account but with the role "Azure Batch Contributor". +6. Seqera Platform will need credentials to authenticate as the Service Principal. Navigate to the app registration again, on the "Overview" page save the "Application (client) ID" value to be used on Seqera Platform. +7. From there, click "Certificates & secrets". Click "New client secret". A new secret will be created containing a value and secret ID. Save both of these somewhere secure to be used in Seqera Platform. Be sure to delete them after saving them in Seqera Platform. +4. On Seqera Platform, go to your workspace and select add a new credential and select type "Azure Credentials" then select the "Service Prinicipal" tab. +5. Enter a name for the credentials, e.g., _Azure Credentials_. +6. Add the Application ID, the Secret ID, Secret, **Batch account** and **Blob Storage** account names to the relevant fields. + ### Compute environment There are two ways to create an Azure Batch compute environment in Seqera Platform: From b4746c71cb1abb26c5096aed526d29ee096e6afb Mon Sep 17 00:00:00 2001 From: adamrtalbot <12817534+adamrtalbot@users.noreply.github.com> Date: Thu, 15 Aug 2024 11:23:57 +0100 Subject: [PATCH 02/42] Language tidy --- .../compute-envs/azure-batch.mdx | 39 +++++++++---------- 1 file changed, 18 insertions(+), 21 deletions(-) diff --git a/platform_versioned_docs/version-24.1.0/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1.0/compute-envs/azure-batch.mdx index 027c5b2c3..c4cb039c9 100644 --- a/platform_versioned_docs/version-24.1.0/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1.0/compute-envs/azure-batch.mdx @@ -31,10 +31,6 @@ An Azure Service Principal is an identity created for use with applications, hos Azure regions are distinct geographic areas that contain multiple data centers, strategically located around the world, to provide high availability, fault tolerance, and low latency for cloud services. Each region offers a wide range of Azure services, and by choosing a specific region, users can optimize performance, ensure data residency compliance, and meet regulatory requirements. Azure regions also enable redundancy and disaster recovery options by allowing resources to be replicated across different regions, enhancing the resilience of applications and data. -## Resource group - -A resource group in Azure is a unit of related resources in Azure. As a rule of thumb, resources that have a similar lifecycle should be within the same resource group. You can delete a resource group and all associated components together. We recommend placing all platform compute resources in the same resource group, but this is not necessary. - ### Create a resource group An Azure Batch and Azure Storage account need to be linked to a resource group in Azure, so it is necessary to create one. @@ -125,33 +121,33 @@ After you have created a resource group and storage account, create a [Batch acc ### Credentials -There are two Azure credential options available, primary Access Keys and a Service Principal. Primary access keys are simple to use but provide full access to the storage and batch account. Furthermore there can only be two per account so they are a single point of failure. 
A Service Principal provides an account which can be granted access to the Azure Batch and Storage resources, providing role based access which can be precisely controlled. Furthermore, some features of Azure Batch are only available to a Service Principal instead of primary access keys. +There are two Azure credential options available: primary Access Keys and a Service Principal. Primary Access Keys are simple to use but provide full access to the storage and batch accounts. Additionally, there can only be two keys per account, making them a single point of failure. A Service Principal, on the other hand, provides an account that can be granted access to Azure Batch and Storage resources, allowing for role-based access control with more precise permissions. Moreover, some Azure Batch features are only available when using a Service Principal instead of primary Access Keys. :::note -The two Azure Credentials types follow entirely different authenetication methods. You can add more than one credential to a workspace but only one can be used at a time. While they can be used concurrently they are not cross compatible and access granted by one will not be conferred to another. +The two Azure credential types use entirely different authentication methods. You can add more than one credential to a workspace, but only one can be used at a time. While they can be used concurrently, they are not cross-compatible, and access granted by one will not be shared with the other. ::: #### Access Keys 1. Navigate to the Azure Portal and sign in. -2. Locate the Azure Batch account and select "Keys" under "Account management". Here you will see the Primary and Secondary keys. Copy one of the keys and save it in a secure location to be used later. -3. Locate the Azure Storage account and under the "Security and Networking" section select "Access keys". Here you will see Key1 and Key2 options. Copy one of them and save it to a secure location to be used later. Be sure to delete them after saving them in Seqera Platform. -4. On Seqera Platform, go to your workspace and select add a new credential and select type "Azure Credentials" -5. Enter a name for the credentials, e.g., _Azure Credentials_. +2. Locate the Azure Batch account and select "Keys" under "Account management." Here, you will see the Primary and Secondary keys. Copy one of the keys and save it in a secure location for later use. +3. Locate the Azure Storage account and, under the "Security and Networking" section, select "Access keys". Here, you will see Key1 and Key2 options. Copy one of them and save it in a secure location for later use. Be sure to delete them after saving them in Seqera Platform. +4. In Seqera Platform, go to your workspace, select "Add a new credential," and choose the "Azure Credentials" type. +5. Enter a name for the credentials, such as _Azure Credentials_. 6. Add the **Batch account** and **Blob Storage** account names and access keys. -#### Service Princiapl +#### Service Principal -1. In the Azure Portal, navigate to "Microsoft Entra ID" and under "App registrations" click "New registration". See [Azure documentation](https://learn.microsoft.com/en-us/entra/identity-platform/howto-create-service-principal-portal) for more details. -2. Provide a name for the application. The application will have a service principal automatically associated with it. -3. We now need to assign roles to the Service Prinicipal. Go to the Azure Storage account and under "Access Control (IAM)", click "Add role assignment". -4. 
Choose the roles "Storage Blob Data Reader" and "Storage Blob Data Contributor", then select "Members", "Select Members" and and search for your newly created Service Principal and assign the role. -5. Repeat the same process for the Azure Batch account but with the role "Azure Batch Contributor". -6. Seqera Platform will need credentials to authenticate as the Service Principal. Navigate to the app registration again, on the "Overview" page save the "Application (client) ID" value to be used on Seqera Platform. -7. From there, click "Certificates & secrets". Click "New client secret". A new secret will be created containing a value and secret ID. Save both of these somewhere secure to be used in Seqera Platform. Be sure to delete them after saving them in Seqera Platform. -4. On Seqera Platform, go to your workspace and select add a new credential and select type "Azure Credentials" then select the "Service Prinicipal" tab. -5. Enter a name for the credentials, e.g., _Azure Credentials_. -6. Add the Application ID, the Secret ID, Secret, **Batch account** and **Blob Storage** account names to the relevant fields. +1. In the Azure Portal, navigate to "Microsoft Entra ID," and under "App registrations," click "New registration." See the [Azure documentation][az-create-sp] for more details. +2. Provide a name for the application. The application will automatically have a Service Principal associated with it. +3. Assign roles to the Service Principal. Go to the Azure Storage account, and under "Access Control (IAM)," click "Add role assignment." +4. Choose the roles "Storage Blob Data Reader" and "Storage Blob Data Contributor", then select "Members", click "Select Members", search for your newly created Service Principal, and assign the role. +5. Repeat the same process for the Azure Batch account, but use the "Azure Batch Contributor" role. +6. Seqera Platform will need credentials to authenticate as the Service Principal. Navigate back to the app registration, and on the "Overview" page, save the "Application (client) ID" value for use in Seqera Platform. +7. Then, click "Certificates & secrets" and select "New client secret." A new secret will be created containing a value and secret ID. Save both of these securely for use in Seqera Platform. Be sure to delete them after saving them in Seqera Platform. +8. In Seqera Platform, go to your workspace, select "Add a new credential," choose the "Azure Credentials" type, then select the "Service Principal" tab. +9. Enter a name for the credentials, such as _Azure Credentials_. +10. Add the Application ID, Secret ID, Secret, **Batch account**, and **Blob Storage** account names to the relevant fields. 
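
If you prefer to script this setup rather than click through the portal, the same result can be approximated with the Azure CLI. The sketch below is illustrative only: the resource group and storage account names reuse placeholders from earlier in this guide, the Batch account name `towerrgbatch` and the service principal display name are assumptions, and the role names are copied from the steps above; substitute the values used in your own subscription.

```bash
# Placeholder names; towerrgbatch is an assumed Batch account name.
RESOURCE_GROUP="towerrg"
STORAGE_ACCOUNT="towerrgstorage"
BATCH_ACCOUNT="towerrgbatch"

# Create the app registration, its service principal, and a client secret.
# The JSON output contains the appId (Application/client ID), password
# (client secret), and tenant values that Seqera Platform asks for later.
az ad sp create-for-rbac --name "seqera-platform-sp"

# Copy the appId printed by the command above.
APP_ID="<appId value from the output above>"

# Resource IDs used to scope the role assignments.
STORAGE_ID=$(az storage account show --name "$STORAGE_ACCOUNT" \
  --resource-group "$RESOURCE_GROUP" --query id --output tsv)
BATCH_ID=$(az batch account show --name "$BATCH_ACCOUNT" \
  --resource-group "$RESOURCE_GROUP" --query id --output tsv)

# Storage roles named in the steps above.
az role assignment create --assignee "$APP_ID" \
  --role "Storage Blob Data Reader" --scope "$STORAGE_ID"
az role assignment create --assignee "$APP_ID" \
  --role "Storage Blob Data Contributor" --scope "$STORAGE_ID"

# Batch role as named in the steps above; adjust if your organization
# standardizes on a different built-in role.
az role assignment create --assignee "$APP_ID" \
  --role "Azure Batch Contributor" --scope "$BATCH_ID"
```
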
### Compute environment @@ -262,6 +258,7 @@ Your Seqera compute environment uses resources that you may be charged for in yo [az-learn-jobs]: https://learn.microsoft.com/en-us/azure/batch/jobs-and-tasks [az-create-rg]: https://portal.azure.com/#create/Microsoft.ResourceGroup [az-create-storage]: https://portal.azure.com/#create/Microsoft.StorageAccount-ARM +[az-create-sp](https://learn.microsoft.com/en-us/entra/identity-platform/howto-create-service-principal-portal) [wave-docs]: https://docs.seqera.io/wave [nf-fusion-docs]: https://www.nextflow.io/docs/latest/fusion.html From 2db492ce48a7556044283064c66d1c59e8038910 Mon Sep 17 00:00:00 2001 From: adamrtalbot <12817534+adamrtalbot@users.noreply.github.com> Date: Thu, 15 Aug 2024 11:28:23 +0100 Subject: [PATCH 03/42] Remove previous reference to credentials that are no longer relevant or 100% correct --- .../compute-envs/azure-batch.mdx | 62 +++++++------------ 1 file changed, 24 insertions(+), 38 deletions(-) diff --git a/platform_versioned_docs/version-24.1.0/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1.0/compute-envs/azure-batch.mdx index c4cb039c9..8c8146f96 100644 --- a/platform_versioned_docs/version-24.1.0/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1.0/compute-envs/azure-batch.mdx @@ -167,38 +167,31 @@ Create a Batch Forge Azure Batch compute environment: 1. In a workspace, select **Compute Environments > New Environment**. 2. Enter a descriptive name, e.g., _Azure Batch (east-us)_. 3. Select **Azure Batch** as the target platform. -4. Choose existing Azure credentials or add a new credential. If you are using existing credentials, skip to step 7. - - :::tip - You can create multiple credentials in your Seqera environment. - ::: - -5. Enter a name for the credentials, e.g., _Azure Credentials_. -6. Add the **Batch account** and **Blob Storage** account names and access keys. -7. Select a **Region**, e.g., _eastus_. -8. In the **Pipeline work directory** field, enter the Azure blob container created previously, e.g., `az://towerrgstorage-container/work`. +4. Choose existing Azure credentials or add a new credential. +5. Select a **Region**, e.g., _eastus_. +6. In the **Pipeline work directory** field, enter the Azure blob container created previously, e.g., `az://towerrgstorage-container/work`. :::note When you specify a Blob Storage bucket as your work directory, this bucket is used for the Nextflow [cloud cache](https://www.nextflow.io/docs/latest/cache-and-resume.html#cache-stores) by default. You can specify an alternative cache location with the **Nextflow config file** field on the pipeline [launch](../launch/launchpad.mdx#launch-form) form. ::: -9. Select **Enable Wave containers** to facilitate access to private container repositories and provision containers in your pipelines using the Wave containers service. See [Wave containers][wave-docs] for more information. -10. Select **Enable Fusion v2** to allow access to your Azure Blob Storage data via the [Fusion v2][nf-fusion-docs] virtual distributed file system. This speeds up most data operations. The Fusion v2 file system requires Wave containers to be enabled. See [Fusion file system](../supported_software/fusion/fusion.mdx) for configuration details. -11. Set the **Config mode** to **Batch Forge**. -12. Enter the default **VMs type**, depending on your quota limits set previously. The default is _Standard_D4_v3_. -13. Enter the **VMs count**. 
If autoscaling is enabled (default), this is the maximum number of VMs you wish the pool to scale up to. If autoscaling is disabled, this is the fixed number of virtual machines in the pool. -14. Enable **Autoscale** to scale up and down automatically, based on the number of pipeline tasks. The number of VMs will vary from **0** to **VMs count**. -15. Enable **Dispose resources** for Seqera to automatically delete the Batch pool if the compute environment is deleted on the platform. -16. Select or create [**Container registry credentials**](../credentials/azure_registry_credentials.mdx) to authenticate a registry (used by the [Wave containers](https://www.nextflow.io/docs/latest/wave.html) service). It is recommended to use an [Azure Container registry](https://azure.microsoft.com/en-gb/products/container-registry) within the same region for maximum performance. -17. Apply [**Resource labels**](../resource-labels/overview.mdx). This will populate the **Metadata** fields of the Azure Batch pool. -18. Expand **Staging options** to include optional [pre- or post-run Bash scripts](../launch/advanced.mdx#pre-and-post-run-scripts) that execute before or after the Nextflow pipeline execution in your environment. -19. Specify custom **Environment variables** for the **Head job** and/or **Compute jobs**. -20. Configure any advanced options you need: +7. Select **Enable Wave containers** to facilitate access to private container repositories and provision containers in your pipelines using the Wave containers service. See [Wave containers][wave-docs] for more information. +8. Select **Enable Fusion v2** to allow access to your Azure Blob Storage data via the [Fusion v2][nf-fusion-docs] virtual distributed file system. This speeds up most data operations. The Fusion v2 file system requires Wave containers to be enabled. See [Fusion file system](../supported_software/fusion/fusion.mdx) for configuration details. +9. Set the **Config mode** to **Batch Forge**. +10. Enter the default **VMs type**, depending on your quota limits set previously. The default is _Standard_D4_v3_. +11. Enter the **VMs count**. If autoscaling is enabled (default), this is the maximum number of VMs you wish the pool to scale up to. If autoscaling is disabled, this is the fixed number of virtual machines in the pool. +12. Enable **Autoscale** to scale up and down automatically, based on the number of pipeline tasks. The number of VMs will vary from **0** to **VMs count**. +13. Enable **Dispose resources** for Seqera to automatically delete the Batch pool if the compute environment is deleted on the platform. +14. Select or create [**Container registry credentials**](../credentials/azure_registry_credentials.mdx) to authenticate a registry (used by the [Wave containers](https://www.nextflow.io/docs/latest/wave.html) service). It is recommended to use an [Azure Container registry](https://azure.microsoft.com/en-gb/products/container-registry) within the same region for maximum performance. +15. Apply [**Resource labels**](../resource-labels/overview.mdx). This will populate the **Metadata** fields of the Azure Batch pool. +16. Expand **Staging options** to include optional [pre- or post-run Bash scripts](../launch/advanced.mdx#pre-and-post-run-scripts) that execute before or after the Nextflow pipeline execution in your environment. +17. Specify custom **Environment variables** for the **Head job** and/or **Compute jobs**. +18. 
Configure any advanced options you need: - Use **Jobs cleanup policy** to control how Nextflow process jobs are deleted on completion. Active jobs consume the quota of the Azure Batch account. By default, jobs are terminated by Nextflow and removed from the quota when all tasks succesfully complete. If set to _Always_, all jobs are deleted by Nextflow after pipeline completion. If set to _Never_, jobs are never deleted. If set to _On success_, successful tasks are removed but failed tasks will be left for debugging purposes. - Use **Token duration** to control the duration of the SAS token generated by Nextflow. This must be as long as the longest period of time the pipeline will run. -21. Select **Add** to finalize the compute environment setup. It will take a few seconds for all the resources to be created before the compute environment is ready to launch pipelines. +19. Select **Add** to finalize the compute environment setup. It will take a few seconds for all the resources to be created before the compute environment is ready to launch pipelines. **See [Launch pipelines](../launch/launchpad.mdx) to start executing workflows in your Azure Batch compute environment.** @@ -215,35 +208,28 @@ Your Seqera compute environment uses resources that you may be charged for in yo 1. In a workspace, select **Compute Environments > New Environment**. 2. Enter a descriptive name for this environment, e.g., _Azure Batch (east-us)_. 3. Select **Azure Batch** as the target platform. -4. Select your existing Azure credentials or select **+** to add new credentials. If you choose to use existing credentials, skip to step 7. - - :::tip - You can create multiple credentials in your Seqera environment. - ::: - -5. Enter a name, e.g., _Azure Credentials_. -6. Add the **Batch account** and **Blob Storage** credentials you created previously. -7. Select a **Region**, e.g., _eastus (East US)_. -8. In the **Pipeline work directory** field, add the Azure blob container created previously, e.g., `az://towerrgstorage-container/work`. +4. Select your existing Azure credentials or select **+** to add new credentials. +5. Select a **Region**, e.g., _eastus (East US)_. +6. In the **Pipeline work directory** field, add the Azure blob container created previously, e.g., `az://towerrgstorage-container/work`. :::note When you specify a Blob Storage bucket as your work directory, this bucket is used for the Nextflow [cloud cache](https://www.nextflow.io/docs/latest/cache-and-resume.html#cache-stores) by default. You can specify an alternative cache location with the **Nextflow config file** field on the pipeline [launch](../launch/launchpad.mdx#launch-form) form. ::: -9. Set the **Config mode** to **Manual**. -10. Enter the **Compute Pool name**. This is the name of the Azure Batch pool you created previously in the Azure Batch account. +7. Set the **Config mode** to **Manual**. +8. Enter the **Compute Pool name**. This is the name of the Azure Batch pool you created previously in the Azure Batch account. :::note The default Azure Batch implementation uses a single pool for head and compute nodes. To use separate pools for head and compute nodes (e.g., to use low-priority VMs for compute jobs), see [this FAQ entry](../faqs.mdx#azure). ::: -11. Specify custom **Environment variables** for the **Head job** and/or **Compute jobs**. -12. Configure any advanced options you need: +9. Specify custom **Environment variables** for the **Head job** and/or **Compute jobs**. +10. 
Configure any advanced options you need: - Use **Jobs cleanup policy** to control how Nextflow process jobs are deleted on completion. Active jobs consume the quota of the Azure Batch account. By default, jobs are terminated by Nextflow and removed from the quota when all tasks succesfully complete. If set to _Always_, all jobs are deleted by Nextflow after pipeline completion. If set to _Never_, jobs are never deleted. If set to _On success_, successful tasks are removed but failed tasks will be left for debugging purposes. - Use **Token duration** to control the duration of the SAS token generated by Nextflow. This must be as long as the longest period of time the pipeline will run. -13. Select **Add** to finalize the compute environment setup. It will take a few seconds for all the resources to be created before you are ready to launch pipelines. +11. Select **Add** to finalize the compute environment setup. It will take a few seconds for all the resources to be created before you are ready to launch pipelines. **See [Launch pipelines](../launch/launchpad.mdx) to start executing workflows in your Azure Batch compute environment.** From 74eeaae986f2fd1fe301e33e27971688b69ce488 Mon Sep 17 00:00:00 2001 From: adamrtalbot <12817534+adamrtalbot@users.noreply.github.com> Date: Thu, 15 Aug 2024 11:34:38 +0100 Subject: [PATCH 04/42] Section headers --- .../compute-envs/azure-batch.mdx | 26 ++++++++++--------- 1 file changed, 14 insertions(+), 12 deletions(-) diff --git a/platform_versioned_docs/version-24.1.0/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1.0/compute-envs/azure-batch.mdx index 8c8146f96..b87a8bfd1 100644 --- a/platform_versioned_docs/version-24.1.0/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1.0/compute-envs/azure-batch.mdx @@ -13,25 +13,27 @@ Ensure you have sufficient permissions to create resource groups, an Azure Stora ## Concepts -### Accounts - -Seqera Platform relies on an existing Azure Storage and Azure Batch account. You need at least 1 valid Azure Storage account and Azure Batch account within your subscription. +### Regions -Azure uses 'accounts' for each service. For example, an [Azure Storage account][az-learn-storage] will house a collection of blob containers, file shares, queues, and tables. While you can have multiple Azure Storage and Azure Batch accounts in an Azure subscription, each compute environment on the platform can only use one of each (one storage and one Batch account). You can set up multiple compute environments on the platform with different credentials, storage accounts, and Batch accounts. +Azure regions are distinct geographic areas that contain multiple data centers, strategically located around the world, to provide high availability, fault tolerance, and low latency for cloud services. Each region offers a wide range of Azure services, and by choosing a specific region, users can optimize performance, ensure data residency compliance, and meet regulatory requirements. Azure regions also enable redundancy and disaster recovery options by allowing resources to be replicated across different regions, enhancing the resilience of applications and data. ### Resource group An Azure Resource Group is a logical container that holds related Azure resources such as virtual machines, storage accounts, databases, and more. It serves as a management boundary, allowing you to organize, deploy, monitor, and manage all the resources within it as a single entity. 
Resources in a Resource Group share the same lifecycle, meaning they can be deployed, updated, and deleted together. This grouping also enables easier access control, monitoring, and cost management, making it a foundational element in organizing and managing cloud infrastructure in Azure. +### Accounts + +Seqera Platform relies on an existing Azure Storage and Azure Batch account. You need at least 1 valid Azure Storage account and Azure Batch account within your subscription. + +Azure uses 'accounts' for each service. For example, an [Azure Storage account][az-learn-storage] will house a collection of blob containers, file shares, queues, and tables. While you can have multiple Azure Storage and Azure Batch accounts in an Azure subscription, each compute environment on the platform can only use one of each (one storage and one Batch account). You can set up multiple compute environments on the platform with different credentials, storage accounts, and Batch accounts. + ### Service Principal An Azure Service Principal is an identity created for use with applications, hosted services, or automated tools to access Azure resources. It acts like a "user identity" with a specific set of permissions assigned to it. Seqera Platform can use an Azure Service Principal to authenticate and authorize access to Azure Batch for job execution and Azure Storage for data management. By assigning the necessary roles to the Service Principal, Seqera can securely interact with these Azure services, ensuring that only authorized operations are performed during pipeline execution. -### Regions - -Azure regions are distinct geographic areas that contain multiple data centers, strategically located around the world, to provide high availability, fault tolerance, and low latency for cloud services. Each region offers a wide range of Azure services, and by choosing a specific region, users can optimize performance, ensure data residency compliance, and meet regulatory requirements. Azure regions also enable redundancy and disaster recovery options by allowing resources to be replicated across different regions, enhancing the resilience of applications and data. +## Azure Resources -### Create a resource group +### Resource group An Azure Batch and Azure Storage account need to be linked to a resource group in Azure, so it is necessary to create one. @@ -45,7 +47,7 @@ A resource group can be created while creating an Azure Storage Account or Azure 4. Select **Review and Create** to proceed. 5. Select **Create**. -## Storage account +### Storage account After creating a resource group, set up an [Azure storage account][az-learn-storage]. @@ -80,7 +82,7 @@ After creating a resource group, set up an [Azure storage account][az-learn-stor Blob container storage credentials are associated with the Batch pool configuration. Avoid changing these credentials in your Seqera instance after you have created the compute environment. ::: -## Batch account +### Batch account After you have created a resource group and storage account, create a [Batch account][az-learn-batch]. @@ -149,7 +151,7 @@ The two Azure credential types use entirely different authentication methods. Yo 9. Enter a name for the credentials, such as _Azure Credentials_. 10. Add the Application ID, Secret ID, Secret, **Batch account**, and **Blob Storage** account names to the relevant fields. 
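
Before adding the credential in Seqera Platform, it can be worth confirming that the role assignments landed on the intended resources. The check below is a sketch rather than part of the original instructions; `APP_ID` stands for the Application (client) ID saved from the app registration's Overview page.

```bash
# APP_ID is the Application (client) ID of the service principal.
APP_ID="<application-client-id>"

# List every role assignment held by the service principal, with the scope
# each role applies to, and confirm the storage and Batch roles are present.
az role assignment list --assignee "$APP_ID" --all --output table
```

Newly created role assignments can take a few minutes to propagate, so an assignment missing from the list may simply not be visible yet.
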
-### Compute environment +## Seqera Platform There are two ways to create an Azure Batch compute environment in Seqera Platform: @@ -195,7 +197,7 @@ Create a Batch Forge Azure Batch compute environment: **See [Launch pipelines](../launch/launchpad.mdx) to start executing workflows in your Azure Batch compute environment.** -## Manual +### Manual This section is for users with a pre-configured Azure Batch pool. This requires an existing Azure Batch account with an existing pool. From dcc64cc5509e801faecf82f9d3c76c1593a13ce3 Mon Sep 17 00:00:00 2001 From: adamrtalbot <12817534+adamrtalbot@users.noreply.github.com> Date: Thu, 15 Aug 2024 11:40:48 +0100 Subject: [PATCH 05/42] Put changes on correct platform version --- .../version-23.4/compute-envs/azure-batch.mdx | 127 ++++++-------- .../version-24.1/compute-envs/azure-batch.mdx | 157 +++++++++--------- 2 files changed, 133 insertions(+), 151 deletions(-) diff --git a/platform_versioned_docs/version-23.4/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-23.4/compute-envs/azure-batch.mdx index b87a8bfd1..4df7ff754 100644 --- a/platform_versioned_docs/version-23.4/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-23.4/compute-envs/azure-batch.mdx @@ -13,41 +13,37 @@ Ensure you have sufficient permissions to create resource groups, an Azure Stora ## Concepts -### Regions - -Azure regions are distinct geographic areas that contain multiple data centers, strategically located around the world, to provide high availability, fault tolerance, and low latency for cloud services. Each region offers a wide range of Azure services, and by choosing a specific region, users can optimize performance, ensure data residency compliance, and meet regulatory requirements. Azure regions also enable redundancy and disaster recovery options by allowing resources to be replicated across different regions, enhancing the resilience of applications and data. - -### Resource group - -An Azure Resource Group is a logical container that holds related Azure resources such as virtual machines, storage accounts, databases, and more. It serves as a management boundary, allowing you to organize, deploy, monitor, and manage all the resources within it as a single entity. Resources in a Resource Group share the same lifecycle, meaning they can be deployed, updated, and deleted together. This grouping also enables easier access control, monitoring, and cost management, making it a foundational element in organizing and managing cloud infrastructure in Azure. - ### Accounts Seqera Platform relies on an existing Azure Storage and Azure Batch account. You need at least 1 valid Azure Storage account and Azure Batch account within your subscription. Azure uses 'accounts' for each service. For example, an [Azure Storage account][az-learn-storage] will house a collection of blob containers, file shares, queues, and tables. While you can have multiple Azure Storage and Azure Batch accounts in an Azure subscription, each compute environment on the platform can only use one of each (one storage and one Batch account). You can set up multiple compute environments on the platform with different credentials, storage accounts, and Batch accounts. -### Service Principal - -An Azure Service Principal is an identity created for use with applications, hosted services, or automated tools to access Azure resources. It acts like a "user identity" with a specific set of permissions assigned to it. 
Seqera Platform can use an Azure Service Principal to authenticate and authorize access to Azure Batch for job execution and Azure Storage for data management. By assigning the necessary roles to the Service Principal, Seqera can securely interact with these Azure services, ensuring that only authorized operations are performed during pipeline execution. - -## Azure Resources - ### Resource group -An Azure Batch and Azure Storage account need to be linked to a resource group in Azure, so it is necessary to create one. +To create Azure Batch and Azure Storage accounts, first create a [resource group][az-learn-rg] in your preferred region. :::note A resource group can be created while creating an Azure Storage Account or Azure Batch account. ::: +### Regions + +Azure resources can operate across regions, but this incurs additional costs and security requirements. It is recommended to place all resources in the same region. See the [Azure product page on data residency][az-data-residency] for more information. + +## Resource group + +A resource group in Azure is a unit of related resources in Azure. As a rule of thumb, resources that have a similar lifecycle should be within the same resource group. You can delete a resource group and all associated components together. We recommend placing all platform compute resources in the same resource group, but this is not necessary. + +### Create a resource group + 1. Log in to your Azure account, go to the [Create Resource group][az-create-rg] page, and select **Create new resource group**. 2. Enter a name for the resource group, e.g., _towerrg_. 3. Choose the preferred region. 4. Select **Review and Create** to proceed. 5. Select **Create**. -### Storage account +## Storage account After creating a resource group, set up an [Azure storage account][az-learn-storage]. @@ -82,7 +78,7 @@ After creating a resource group, set up an [Azure storage account][az-learn-stor Blob container storage credentials are associated with the Batch pool configuration. Avoid changing these credentials in your Seqera instance after you have created the compute environment. ::: -### Batch account +## Batch account After you have created a resource group and storage account, create a [Batch account][az-learn-batch]. @@ -121,37 +117,7 @@ After you have created a resource group and storage account, create a [Batch acc - **Spot/low-priority vCPUs**: Platform does not support spot or low-priority machines when using Forge, so when using Forge this number can be zero. When manually setting up a pool, select an appropriate number of concurrent vCPUs here. - **Total Dedicated vCPUs per VM series**: See the Azure documentation for [virtual machine sizes][az-vm-sizes] to help determine the machine size you need. We recommend the latest version of the ED series available in your region as a cost-effective and appropriately-sized machine for running Nextflow. However, you will need to select alternative machine series that have additional requirements, such as those with additional GPUs or faster storage. Increase the quota by the number of required concurrent CPUs. In Azure, machines are charged per cpu minute so there is no additional cost for a higher number. -### Credentials - -There are two Azure credential options available: primary Access Keys and a Service Principal. Primary Access Keys are simple to use but provide full access to the storage and batch accounts. Additionally, there can only be two keys per account, making them a single point of failure. 
A Service Principal, on the other hand, provides an account that can be granted access to Azure Batch and Storage resources, allowing for role-based access control with more precise permissions. Moreover, some Azure Batch features are only available when using a Service Principal instead of primary Access Keys. - -:::note -The two Azure credential types use entirely different authentication methods. You can add more than one credential to a workspace, but only one can be used at a time. While they can be used concurrently, they are not cross-compatible, and access granted by one will not be shared with the other. -::: - -#### Access Keys - -1. Navigate to the Azure Portal and sign in. -2. Locate the Azure Batch account and select "Keys" under "Account management." Here, you will see the Primary and Secondary keys. Copy one of the keys and save it in a secure location for later use. -3. Locate the Azure Storage account and, under the "Security and Networking" section, select "Access keys". Here, you will see Key1 and Key2 options. Copy one of them and save it in a secure location for later use. Be sure to delete them after saving them in Seqera Platform. -4. In Seqera Platform, go to your workspace, select "Add a new credential," and choose the "Azure Credentials" type. -5. Enter a name for the credentials, such as _Azure Credentials_. -6. Add the **Batch account** and **Blob Storage** account names and access keys. - -#### Service Principal - -1. In the Azure Portal, navigate to "Microsoft Entra ID," and under "App registrations," click "New registration." See the [Azure documentation][az-create-sp] for more details. -2. Provide a name for the application. The application will automatically have a Service Principal associated with it. -3. Assign roles to the Service Principal. Go to the Azure Storage account, and under "Access Control (IAM)," click "Add role assignment." -4. Choose the roles "Storage Blob Data Reader" and "Storage Blob Data Contributor", then select "Members", click "Select Members", search for your newly created Service Principal, and assign the role. -5. Repeat the same process for the Azure Batch account, but use the "Azure Batch Contributor" role. -6. Seqera Platform will need credentials to authenticate as the Service Principal. Navigate back to the app registration, and on the "Overview" page, save the "Application (client) ID" value for use in Seqera Platform. -7. Then, click "Certificates & secrets" and select "New client secret." A new secret will be created containing a value and secret ID. Save both of these securely for use in Seqera Platform. Be sure to delete them after saving them in Seqera Platform. -8. In Seqera Platform, go to your workspace, select "Add a new credential," choose the "Azure Credentials" type, then select the "Service Principal" tab. -9. Enter a name for the credentials, such as _Azure Credentials_. -10. Add the Application ID, Secret ID, Secret, **Batch account**, and **Blob Storage** account names to the relevant fields. - -## Seqera Platform +### Compute environment There are two ways to create an Azure Batch compute environment in Seqera Platform: @@ -169,35 +135,42 @@ Create a Batch Forge Azure Batch compute environment: 1. In a workspace, select **Compute Environments > New Environment**. 2. Enter a descriptive name, e.g., _Azure Batch (east-us)_. 3. Select **Azure Batch** as the target platform. -4. Choose existing Azure credentials or add a new credential. -5. Select a **Region**, e.g., _eastus_. -6. 
In the **Pipeline work directory** field, enter the Azure blob container created previously, e.g., `az://towerrgstorage-container/work`. +4. Choose existing Azure credentials or add a new credential. If you are using existing credentials, skip to step 7. + + :::tip + You can create multiple credentials in your Seqera environment. + ::: + +5. Enter a name for the credentials, e.g., _Azure Credentials_. +6. Add the **Batch account** and **Blob Storage** account names and access keys. +7. Select a **Region**, e.g., _eastus_. +8. In the **Pipeline work directory** field, enter the Azure blob container created previously, e.g., `az://towerrgstorage-container/work`. :::note When you specify a Blob Storage bucket as your work directory, this bucket is used for the Nextflow [cloud cache](https://www.nextflow.io/docs/latest/cache-and-resume.html#cache-stores) by default. You can specify an alternative cache location with the **Nextflow config file** field on the pipeline [launch](../launch/launchpad.mdx#launch-form) form. ::: -7. Select **Enable Wave containers** to facilitate access to private container repositories and provision containers in your pipelines using the Wave containers service. See [Wave containers][wave-docs] for more information. -8. Select **Enable Fusion v2** to allow access to your Azure Blob Storage data via the [Fusion v2][nf-fusion-docs] virtual distributed file system. This speeds up most data operations. The Fusion v2 file system requires Wave containers to be enabled. See [Fusion file system](../supported_software/fusion/fusion.mdx) for configuration details. -9. Set the **Config mode** to **Batch Forge**. -10. Enter the default **VMs type**, depending on your quota limits set previously. The default is _Standard_D4_v3_. -11. Enter the **VMs count**. If autoscaling is enabled (default), this is the maximum number of VMs you wish the pool to scale up to. If autoscaling is disabled, this is the fixed number of virtual machines in the pool. -12. Enable **Autoscale** to scale up and down automatically, based on the number of pipeline tasks. The number of VMs will vary from **0** to **VMs count**. -13. Enable **Dispose resources** for Seqera to automatically delete the Batch pool if the compute environment is deleted on the platform. -14. Select or create [**Container registry credentials**](../credentials/azure_registry_credentials.mdx) to authenticate a registry (used by the [Wave containers](https://www.nextflow.io/docs/latest/wave.html) service). It is recommended to use an [Azure Container registry](https://azure.microsoft.com/en-gb/products/container-registry) within the same region for maximum performance. -15. Apply [**Resource labels**](../resource-labels/overview.mdx). This will populate the **Metadata** fields of the Azure Batch pool. -16. Expand **Staging options** to include optional [pre- or post-run Bash scripts](../launch/advanced.mdx#pre-and-post-run-scripts) that execute before or after the Nextflow pipeline execution in your environment. -17. Specify custom **Environment variables** for the **Head job** and/or **Compute jobs**. -18. Configure any advanced options you need: +9. Select **Enable Wave containers** to facilitate access to private container repositories and provision containers in your pipelines using the Wave containers service. See [Wave containers][wave-docs] for more information. +10. Select **Enable Fusion v2** to allow access to your Azure Blob Storage data via the [Fusion v2][nf-fusion-docs] virtual distributed file system. 
This speeds up most data operations. The Fusion v2 file system requires Wave containers to be enabled. See [Fusion file system](../supported_software/fusion/fusion.mdx) for configuration details. +11. Set the **Config mode** to **Batch Forge**. +12. Enter the default **VMs type**, depending on your quota limits set previously. The default is _Standard_D4_v3_. +13. Enter the **VMs count**. If autoscaling is enabled (default), this is the maximum number of VMs you wish the pool to scale up to. If autoscaling is disabled, this is the fixed number of virtual machines in the pool. +14. Enable **Autoscale** to scale up and down automatically, based on the number of pipeline tasks. The number of VMs will vary from **0** to **VMs count**. +15. Enable **Dispose resources** for Seqera to automatically delete the Batch pool if the compute environment is deleted on the platform. +16. Select or create [**Container registry credentials**](../credentials/azure_registry_credentials.mdx) to authenticate a registry (used by the [Wave containers](https://www.nextflow.io/docs/latest/wave.html) service). It is recommended to use an [Azure Container registry](https://azure.microsoft.com/en-gb/products/container-registry) within the same region for maximum performance. +17. Apply [**Resource labels**](../resource-labels/overview.mdx). This will populate the **Metadata** fields of the Azure Batch pool. +18. Expand **Staging options** to include optional [pre- or post-run Bash scripts](../launch/advanced.mdx#pre-and-post-run-scripts) that execute before or after the Nextflow pipeline execution in your environment. +19. Specify custom **Environment variables** for the **Head job** and/or **Compute jobs**. +20. Configure any advanced options you need: - Use **Jobs cleanup policy** to control how Nextflow process jobs are deleted on completion. Active jobs consume the quota of the Azure Batch account. By default, jobs are terminated by Nextflow and removed from the quota when all tasks succesfully complete. If set to _Always_, all jobs are deleted by Nextflow after pipeline completion. If set to _Never_, jobs are never deleted. If set to _On success_, successful tasks are removed but failed tasks will be left for debugging purposes. - Use **Token duration** to control the duration of the SAS token generated by Nextflow. This must be as long as the longest period of time the pipeline will run. -19. Select **Add** to finalize the compute environment setup. It will take a few seconds for all the resources to be created before the compute environment is ready to launch pipelines. +21. Select **Add** to finalize the compute environment setup. It will take a few seconds for all the resources to be created before the compute environment is ready to launch pipelines. **See [Launch pipelines](../launch/launchpad.mdx) to start executing workflows in your Azure Batch compute environment.** -### Manual +## Manual This section is for users with a pre-configured Azure Batch pool. This requires an existing Azure Batch account with an existing pool. @@ -210,28 +183,35 @@ Your Seqera compute environment uses resources that you may be charged for in yo 1. In a workspace, select **Compute Environments > New Environment**. 2. Enter a descriptive name for this environment, e.g., _Azure Batch (east-us)_. 3. Select **Azure Batch** as the target platform. -4. Select your existing Azure credentials or select **+** to add new credentials. -5. Select a **Region**, e.g., _eastus (East US)_. -6. 
In the **Pipeline work directory** field, add the Azure blob container created previously, e.g., `az://towerrgstorage-container/work`. +4. Select your existing Azure credentials or select **+** to add new credentials. If you choose to use existing credentials, skip to step 7. + + :::tip + You can create multiple credentials in your Seqera environment. + ::: + +5. Enter a name, e.g., _Azure Credentials_. +6. Add the **Batch account** and **Blob Storage** credentials you created previously. +7. Select a **Region**, e.g., _eastus (East US)_. +8. In the **Pipeline work directory** field, add the Azure blob container created previously, e.g., `az://towerrgstorage-container/work`. :::note When you specify a Blob Storage bucket as your work directory, this bucket is used for the Nextflow [cloud cache](https://www.nextflow.io/docs/latest/cache-and-resume.html#cache-stores) by default. You can specify an alternative cache location with the **Nextflow config file** field on the pipeline [launch](../launch/launchpad.mdx#launch-form) form. ::: -7. Set the **Config mode** to **Manual**. -8. Enter the **Compute Pool name**. This is the name of the Azure Batch pool you created previously in the Azure Batch account. +9. Set the **Config mode** to **Manual**. +10. Enter the **Compute Pool name**. This is the name of the Azure Batch pool you created previously in the Azure Batch account. :::note The default Azure Batch implementation uses a single pool for head and compute nodes. To use separate pools for head and compute nodes (e.g., to use low-priority VMs for compute jobs), see [this FAQ entry](../faqs.mdx#azure). ::: -9. Specify custom **Environment variables** for the **Head job** and/or **Compute jobs**. -10. Configure any advanced options you need: +11. Specify custom **Environment variables** for the **Head job** and/or **Compute jobs**. +12. Configure any advanced options you need: - Use **Jobs cleanup policy** to control how Nextflow process jobs are deleted on completion. Active jobs consume the quota of the Azure Batch account. By default, jobs are terminated by Nextflow and removed from the quota when all tasks succesfully complete. If set to _Always_, all jobs are deleted by Nextflow after pipeline completion. If set to _Never_, jobs are never deleted. If set to _On success_, successful tasks are removed but failed tasks will be left for debugging purposes. - Use **Token duration** to control the duration of the SAS token generated by Nextflow. This must be as long as the longest period of time the pipeline will run. -11. Select **Add** to finalize the compute environment setup. It will take a few seconds for all the resources to be created before you are ready to launch pipelines. +13. Select **Add** to finalize the compute environment setup. It will take a few seconds for all the resources to be created before you are ready to launch pipelines. 
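
If you are unsure which value to enter for **Compute Pool name**, you can list the pools that already exist in the Batch account. This is an optional sketch rather than part of the original instructions; it assumes the placeholder resource group name from earlier in this guide and an assumed Batch account name (`towerrgbatch`).

```bash
# Authenticate the CLI against the Batch account using shared-key auth;
# the account and resource group names here are placeholders.
az batch account login --name towerrgbatch --resource-group towerrg --shared-key-auth

# Print the ID of each existing pool; the pool ID is the value that goes in
# the "Compute Pool name" field.
az batch pool list --query "[].id" --output tsv
```
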
**See [Launch pipelines](../launch/launchpad.mdx) to start executing workflows in your Azure Batch compute environment.** @@ -246,7 +226,6 @@ Your Seqera compute environment uses resources that you may be charged for in yo [az-learn-jobs]: https://learn.microsoft.com/en-us/azure/batch/jobs-and-tasks [az-create-rg]: https://portal.azure.com/#create/Microsoft.ResourceGroup [az-create-storage]: https://portal.azure.com/#create/Microsoft.StorageAccount-ARM -[az-create-sp](https://learn.microsoft.com/en-us/entra/identity-platform/howto-create-service-principal-portal) [wave-docs]: https://docs.seqera.io/wave [nf-fusion-docs]: https://www.nextflow.io/docs/latest/fusion.html diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index 0024f9c15..c4743269e 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -13,37 +13,41 @@ Ensure you have sufficient permissions to create resource groups, an Azure Stora ## Concepts +### Regions + +Azure regions are distinct geographic areas that contain multiple data centers, strategically located around the world, to provide high availability, fault tolerance, and low latency for cloud services. Each region offers a wide range of Azure services, and by choosing a specific region, users can optimize performance, ensure data residency compliance, and meet regulatory requirements. Azure regions also enable redundancy and disaster recovery options by allowing resources to be replicated across different regions, enhancing the resilience of applications and data. + +### Resource group + +An Azure Resource Group is a logical container that holds related Azure resources such as virtual machines, storage accounts, databases, and more. It serves as a management boundary, allowing you to organize, deploy, monitor, and manage all the resources within it as a single entity. Resources in a Resource Group share the same lifecycle, meaning they can be deployed, updated, and deleted together. This grouping also enables easier access control, monitoring, and cost management, making it a foundational element in organizing and managing cloud infrastructure in Azure. + ### Accounts Seqera Platform relies on an existing Azure Storage and Azure Batch account. You need at least 1 valid Azure Storage account and Azure Batch account within your subscription. Azure uses 'accounts' for each service. For example, an [Azure Storage account][az-learn-storage] will house a collection of blob containers, file shares, queues, and tables. While you can have multiple Azure Storage and Azure Batch accounts in an Azure subscription, each compute environment on the platform can only use one of each (one storage and one Batch account). You can set up multiple compute environments on the platform with different credentials, storage accounts, and Batch accounts. +### Service Principal + +An Azure Service Principal is an identity created for use with applications, hosted services, or automated tools to access Azure resources. It acts like a "user identity" with a specific set of permissions assigned to it. Seqera Platform can use an Azure Service Principal to authenticate and authorize access to Azure Batch for job execution and Azure Storage for data management. 
By assigning the necessary roles to the Service Principal, Seqera can securely interact with these Azure services, ensuring that only authorized operations are performed during pipeline execution. + +## Azure Resources + ### Resource group -To create Azure Batch and Azure Storage accounts, first create a [resource group][az-learn-rg] in your preferred region. +An Azure Batch and Azure Storage account need to be linked to a resource group in Azure, so it is necessary to create one. :::note A resource group can be created while creating an Azure Storage Account or Azure Batch account. ::: -### Regions - -Azure resources can operate across regions, but this incurs additional costs and security requirements. It is recommended to place all resources in the same region. See the [Azure product page on data residency][az-data-residency] for more information. - -## Resource group - -A resource group in Azure is a unit of related resources in Azure. As a rule of thumb, resources that have a similar lifecycle should be within the same resource group. You can delete a resource group and all associated components together. We recommend placing all platform compute resources in the same resource group, but this is not necessary. - -### Create a resource group - 1. Log in to your Azure account, go to the [Create Resource group][az-create-rg] page, and select **Create new resource group**. 2. Enter a name for the resource group, e.g., _towerrg_. 3. Choose the preferred region. 4. Select **Review and Create** to proceed. 5. Select **Create**. -## Storage account +### Storage account After creating a resource group, set up an [Azure storage account][az-learn-storage]. @@ -78,7 +82,7 @@ After creating a resource group, set up an [Azure storage account][az-learn-stor Blob container storage credentials are associated with the Batch pool configuration. Avoid changing these credentials in your Seqera instance after you have created the compute environment. ::: -## Batch account +### Batch account After you have created a resource group and storage account, create a [Batch account][az-learn-batch]. @@ -117,7 +121,37 @@ After you have created a resource group and storage account, create a [Batch acc - **Spot/low-priority vCPUs**: Platform does not support spot or low-priority machines when using Forge, so when using Forge this number can be zero. When manually setting up a pool, select an appropriate number of concurrent vCPUs here. - **Total Dedicated vCPUs per VM series**: See the Azure documentation for [virtual machine sizes][az-vm-sizes] to help determine the machine size you need. We recommend the latest version of the ED series available in your region as a cost-effective and appropriately-sized machine for running Nextflow. However, you will need to select alternative machine series that have additional requirements, such as those with additional GPUs or faster storage. Increase the quota by the number of required concurrent CPUs. In Azure, machines are charged per cpu minute so there is no additional cost for a higher number. -### Compute environment +### Credentials + +There are two Azure credential options available: primary Access Keys and a Service Principal. Primary Access Keys are simple to use but provide full access to the storage and batch accounts. Additionally, there can only be two keys per account, making them a single point of failure. 
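To make this concrete: outside of the Platform UI, Nextflow itself can also authenticate to Azure with a service principal. The snippet below is a hypothetical sketch; the IDs and secret are placeholders, and the option names reflect the Nextflow Azure documentation at the time of writing, so verify them against your Nextflow version.

```groovy
// Hypothetical sketch: service principal authentication for Azure Batch and Azure Storage.
// All IDs and secrets are placeholders; the account names reuse this guide's examples.
azure {
    activeDirectory {
        servicePrincipalId     = '<application (client) ID>'
        servicePrincipalSecret = '<client secret value>'
        tenantId               = '<Microsoft Entra tenant ID>'
    }
    batch {
        accountName = 'towerrgbatch'
        location    = 'eastus'
    }
    storage {
        accountName = 'towerrgstorage'
    }
}
```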
A Service Principal, on the other hand, provides an account that can be granted access to Azure Batch and Storage resources, allowing for role-based access control with more precise permissions. Moreover, some Azure Batch features are only available when using a Service Principal instead of primary Access Keys. + +:::note +The two Azure credential types use entirely different authentication methods. You can add more than one credential to a workspace, but only one can be used at a time. While they can be used concurrently, they are not cross-compatible, and access granted by one will not be shared with the other. +::: + +#### Access Keys + +1. Navigate to the Azure Portal and sign in. +2. Locate the Azure Batch account and select "Keys" under "Account management." Here, you will see the Primary and Secondary keys. Copy one of the keys and save it in a secure location for later use. +3. Locate the Azure Storage account and, under the "Security and Networking" section, select "Access keys". Here, you will see Key1 and Key2 options. Copy one of them and save it in a secure location for later use. Be sure to delete them after saving them in Seqera Platform. +4. In Seqera Platform, go to your workspace, select "Add a new credential," and choose the "Azure Credentials" type. +5. Enter a name for the credentials, such as _Azure Credentials_. +6. Add the **Batch account** and **Blob Storage** account names and access keys. + +#### Service Principal + +1. In the Azure Portal, navigate to "Microsoft Entra ID," and under "App registrations," click "New registration." See the [Azure documentation][az-create-sp] for more details. +2. Provide a name for the application. The application will automatically have a Service Principal associated with it. +3. Assign roles to the Service Principal. Go to the Azure Storage account, and under "Access Control (IAM)," click "Add role assignment." +4. Choose the roles "Storage Blob Data Reader" and "Storage Blob Data Contributor", then select "Members", click "Select Members", search for your newly created Service Principal, and assign the role. +5. Repeat the same process for the Azure Batch account, but use the "Azure Batch Contributor" role. +6. Seqera Platform will need credentials to authenticate as the Service Principal. Navigate back to the app registration, and on the "Overview" page, save the "Application (client) ID" value for use in Seqera Platform. +7. Then, click "Certificates & secrets" and select "New client secret." A new secret will be created containing a value and secret ID. Save both of these securely for use in Seqera Platform. Be sure to delete them after saving them in Seqera Platform. +8. In Seqera Platform, go to your workspace, select "Add a new credential," choose the "Azure Credentials" type, then select the "Service Principal" tab. +9. Enter a name for the credentials, such as _Azure Credentials_. +10. Add the Application ID, Secret ID, Secret, **Batch account**, and **Blob Storage** account names to the relevant fields. + +## Seqera Platform There are two ways to create an Azure Batch compute environment in Seqera Platform: @@ -133,46 +167,37 @@ Batch Forge automatically creates resources that you may be charged for in your Create a Batch Forge Azure Batch compute environment: 1. In a workspace, select **Compute Environments > New Environment**. -1. Enter a descriptive name, e.g., _Azure Batch (east-us)_. -1. Select **Azure Batch** as the target platform. -1. Choose existing Azure credentials or add a new credential. 
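To make the trade-off concrete, shared-key authentication amounts to handing Nextflow the account keys themselves, which grant full access to each account, whereas a Service Principal is scoped by the roles you assign to it (see the service principal sketch in the concepts section above). A minimal, hypothetical shared-key sketch with placeholder values:

```groovy
// Hypothetical sketch: shared-key authentication. Each key grants full access to its account.
azure {
    batch {
        accountName = 'towerrgbatch'
        accountKey  = '<primary or secondary Batch account key>'   // placeholder
    }
    storage {
        accountName = 'towerrgstorage'
        accountKey  = '<storage account key1 or key2>'             // placeholder
    }
}
```

Whichever credential type you use, store the values as Platform credentials rather than in pipeline code or config files.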
If you are using existing credentials, skip to step 7. - - :::tip - You can create multiple credentials in your Seqera environment. - ::: - -1. Enter a name for the credentials, e.g., _Azure Credentials_. -1. Add the **Batch account** and **Blob Storage** account names and access keys. -1. Select a **Region**, e.g., _eastus_. -1. In the **Pipeline work directory** field, enter the Azure blob container created previously, e.g., `az://towerrgstorage-container/work`. +2. Enter a descriptive name, e.g., _Azure Batch (east-us)_. +3. Select **Azure Batch** as the target platform. +4. Choose existing Azure credentials or add a new credential. +5. Select a **Region**, e.g., _eastus_. +6. In the **Pipeline work directory** field, enter the Azure blob container created previously, e.g., `az://towerrgstorage-container/work`. :::note When you specify a Blob Storage bucket as your work directory, this bucket is used for the Nextflow [cloud cache](https://www.nextflow.io/docs/latest/cache-and-resume.html#cache-stores) by default. You can specify an alternative cache location with the **Nextflow config file** field on the pipeline [launch](../launch/launchpad.mdx#launch-form) form. ::: -1. Select **Enable Wave containers** to facilitate access to private container repositories and provision containers in your pipelines using the Wave containers service. See [Wave containers][wave-docs] for more information. -1. Select **Enable Fusion v2** to allow access to your Azure Blob Storage data via the [Fusion v2][nf-fusion-docs] virtual distributed file system. This speeds up most data operations. The Fusion v2 file system requires Wave containers to be enabled. See [Fusion file system](../supported_software/fusion/fusion.mdx) for configuration details. -1. Set the **Config mode** to **Batch Forge**. -1. Enter the default **VMs type**, depending on your quota limits set previously. The default is _Standard_D4_v3_. -1. Enter the **VMs count**. If autoscaling is enabled (default), this is the maximum number of VMs you wish the pool to scale up to. If autoscaling is disabled, this is the fixed number of virtual machines in the pool. -1. Enable **Autoscale** to scale up and down automatically, based on the number of pipeline tasks. The number of VMs will vary from **0** to **VMs count**. -1. Enable **Dispose resources** for Seqera to automatically delete the Batch pool if the compute environment is deleted on the platform. -1. Select or create [**Container registry credentials**](../credentials/azure_registry_credentials.mdx) to authenticate a registry (used by the [Wave containers](https://www.nextflow.io/docs/latest/wave.html) service). It is recommended to use an [Azure Container registry](https://azure.microsoft.com/en-gb/products/container-registry) within the same region for maximum performance. -1. Apply [**Resource labels**](../resource-labels/overview.mdx). This will populate the **Metadata** fields of the Azure Batch pool. -1. Expand **Staging options** to include: - - Optional [pre- or post-run Bash scripts](../launch/advanced.mdx#pre--post-run-scripts) that execute before or after the Nextflow pipeline execution in your environment. - - Global Nextflow configuration settings for all pipeline runs launched with this compute environment. Configuration settings in this field override the same values in the pipeline Nextflow config file. -1. Specify custom **Environment variables** for the **Head job** and/or **Compute jobs**. -1. Configure any advanced options you need: +7. 
Select **Enable Wave containers** to facilitate access to private container repositories and provision containers in your pipelines using the Wave containers service. See [Wave containers][wave-docs] for more information. +8. Select **Enable Fusion v2** to allow access to your Azure Blob Storage data via the [Fusion v2][nf-fusion-docs] virtual distributed file system. This speeds up most data operations. The Fusion v2 file system requires Wave containers to be enabled. See [Fusion file system](../supported_software/fusion/fusion.mdx) for configuration details. +9. Set the **Config mode** to **Batch Forge**. +10. Enter the default **VMs type**, depending on your quota limits set previously. The default is _Standard_D4_v3_. +11. Enter the **VMs count**. If autoscaling is enabled (default), this is the maximum number of VMs you wish the pool to scale up to. If autoscaling is disabled, this is the fixed number of virtual machines in the pool. +12. Enable **Autoscale** to scale up and down automatically, based on the number of pipeline tasks. The number of VMs will vary from **0** to **VMs count**. +13. Enable **Dispose resources** for Seqera to automatically delete the Batch pool if the compute environment is deleted on the platform. +14. Select or create [**Container registry credentials**](../credentials/azure_registry_credentials.mdx) to authenticate a registry (used by the [Wave containers](https://www.nextflow.io/docs/latest/wave.html) service). It is recommended to use an [Azure Container registry](https://azure.microsoft.com/en-gb/products/container-registry) within the same region for maximum performance. +15. Apply [**Resource labels**](../resource-labels/overview.mdx). This will populate the **Metadata** fields of the Azure Batch pool. +16. Expand **Staging options** to include optional [pre- or post-run Bash scripts](../launch/advanced.mdx#pre-and-post-run-scripts) that execute before or after the Nextflow pipeline execution in your environment. +17. Specify custom **Environment variables** for the **Head job** and/or **Compute jobs**. +18. Configure any advanced options you need: - Use **Jobs cleanup policy** to control how Nextflow process jobs are deleted on completion. Active jobs consume the quota of the Azure Batch account. By default, jobs are terminated by Nextflow and removed from the quota when all tasks succesfully complete. If set to _Always_, all jobs are deleted by Nextflow after pipeline completion. If set to _Never_, jobs are never deleted. If set to _On success_, successful tasks are removed but failed tasks will be left for debugging purposes. - Use **Token duration** to control the duration of the SAS token generated by Nextflow. This must be as long as the longest period of time the pipeline will run. -1. Select **Add** to finalize the compute environment setup. It will take a few seconds for all the resources to be created before the compute environment is ready to launch pipelines. +19. Select **Add** to finalize the compute environment setup. It will take a few seconds for all the resources to be created before the compute environment is ready to launch pipelines. **See [Launch pipelines](../launch/launchpad.mdx) to start executing workflows in your Azure Batch compute environment.** -## Manual +### Manual This section is for users with a pre-configured Azure Batch pool. This requires an existing Azure Batch account with an existing pool. 
@@ -183,53 +208,30 @@ Your Seqera compute environment uses resources that you may be charged for in yo **Create a manual Seqera Azure Batch compute environment** 1. In a workspace, select **Compute Environments > New Environment**. -1. Enter a descriptive name for this environment, e.g., _Azure Batch (east-us)_. -1. Select **Azure Batch** as the target platform. -1. Select your existing Azure credentials or select **+** to add new credentials. If you choose to use existing credentials, skip to step 7. - - :::tip - You can create multiple credentials in your Seqera environment. - ::: - -1. Enter a name, e.g., _Azure Credentials_. -1. Add the **Batch account** and **Blob Storage** credentials you created previously. -1. Select a **Region**, e.g., _eastus (East US)_. -1. In the **Pipeline work directory** field, add the Azure blob container created previously, e.g., `az://towerrgstorage-container/work`. +2. Enter a descriptive name for this environment, e.g., _Azure Batch (east-us)_. +3. Select **Azure Batch** as the target platform. +4. Select your existing Azure credentials or select **+** to add new credentials. +5. Select a **Region**, e.g., _eastus (East US)_. +6. In the **Pipeline work directory** field, add the Azure blob container created previously, e.g., `az://towerrgstorage-container/work`. :::note When you specify a Blob Storage bucket as your work directory, this bucket is used for the Nextflow [cloud cache](https://www.nextflow.io/docs/latest/cache-and-resume.html#cache-stores) by default. You can specify an alternative cache location with the **Nextflow config file** field on the pipeline [launch](../launch/launchpad.mdx#launch-form) form. ::: -1. Set the **Config mode** to **Manual**. -1. Enter the **Compute Pool name**. This is the name of the Azure Batch pool you created previously in the Azure Batch account. +7. Set the **Config mode** to **Manual**. +8. Enter the **Compute Pool name**. This is the name of the Azure Batch pool you created previously in the Azure Batch account. :::note The default Azure Batch implementation uses a single pool for head and compute nodes. To use separate pools for head and compute nodes (e.g., to use low-priority VMs for compute jobs), see [this FAQ entry](../faqs.mdx#azure). ::: -1. Enter a user-assigned **Managed identity client ID**, if one is attached to your Azure Batch pool. See [Managed Identity](#managed-identity) below. -1. Apply [**Resource labels**](../resource-labels/overview.mdx). This will populate the **Metadata** fields of the Azure Batch pool. -1. Expand **Staging options** to include: - - Optional [pre- or post-run Bash scripts](../launch/advanced.mdx#pre--post-run-scripts) that execute before or after the Nextflow pipeline execution in your environment. - - Global Nextflow configuration settings for all pipeline runs launched with this compute environment. Configuration settings in this field override the same values in the pipeline Nextflow config file. -1. Define custom **Environment Variables** for the **Head Job** and/or **Compute Jobs**. -1. Configure any necessary advanced options: +9. Specify custom **Environment variables** for the **Head job** and/or **Compute jobs**. +10. Configure any advanced options you need: + - Use **Jobs cleanup policy** to control how Nextflow process jobs are deleted on completion. Active jobs consume the quota of the Azure Batch account. By default, jobs are terminated by Nextflow and removed from the quota when all tasks succesfully complete. 
If set to _Always_, all jobs are deleted by Nextflow after pipeline completion. If set to _Never_, jobs are never deleted. If set to _On success_, successful tasks are removed but failed tasks will be left for debugging purposes. - Use **Token duration** to control the duration of the SAS token generated by Nextflow. This must be as long as the longest period of time the pipeline will run. -1. Select **Add** to complete the compute environment setup. The creation of resources will take a few seconds, after which you can launch pipelines. - -### Managed identity - -Nextflow can authenticate to Azure services using a managed identity. This method offers enhanced security compared to access keys, but must run on Azure infrastructure. - -When you use a manually configured compute environment with a managed identity attached to the Azure Batch Pool, Nextflow can use this managed identity for authentication. However, Platform still needs to use access keys to submit the initial task to Azure Batch to run Nextflow, which will then proceed with the managed identity for subsequent authentication. - -1. In Azure, create a user-assigned managed identity. See [Manage user-assigned managed identities](https://learn.microsoft.com/en-us/entra/identity/managed-identities-azure-resources/how-manage-user-assigned-managed-identities) for detailed steps. After creation, record the Client ID of the managed identity. -2. The user-assigned managed identity must have the necessary access roles for Nextflow. See [Required role assignments](https://www.nextflow.io/docs/latest/azure.html#required-role-assignments) for more information. -3. Associate the user-assigned managed identity with the Azure Batch Pool. See [Set up managed identity in your batch pool](https://learn.microsoft.com/en-us/troubleshoot/azure/hpc/batch/use-managed-identities-azure-batch-account-pool#set-up-managed-identity-in-your-batch-pool) for more information. -4. When you set up the Platform compute environment, select the Azure Batch pool by name and enter the managed identity client ID in the specified field as instructed above. -When you submit a pipeline to this compute environment, Nextflow will authenticate using the managed identity associated with the Azure Batch node it runs on, rather than relying on access keys. +11. Select **Add** to finalize the compute environment setup. It will take a few seconds for all the resources to be created before you are ready to launch pipelines. 
**See [Launch pipelines](../launch/launchpad.mdx) to start executing workflows in your Azure Batch compute environment.** @@ -244,6 +246,7 @@ When you submit a pipeline to this compute environment, Nextflow will authentica [az-learn-jobs]: https://learn.microsoft.com/en-us/azure/batch/jobs-and-tasks [az-create-rg]: https://portal.azure.com/#create/Microsoft.ResourceGroup [az-create-storage]: https://portal.azure.com/#create/Microsoft.StorageAccount-ARM +[az-create-sp](https://learn.microsoft.com/en-us/entra/identity-platform/howto-create-service-principal-portal) [wave-docs]: https://docs.seqera.io/wave -[nf-fusion-docs]: https://www.nextflow.io/docs/latest/fusion.html +[nf-fusion-docs]: https://www.nextflow.io/docs/latest/fusion.html \ No newline at end of file From 9215c0f1701beb273ab2d70bf4ac5179c1c11359 Mon Sep 17 00:00:00 2001 From: adamrtalbot <12817534+adamrtalbot@users.noreply.github.com> Date: Thu, 15 Aug 2024 11:43:05 +0100 Subject: [PATCH 06/42] Add back the MI stuff --- .../version-24.1/compute-envs/azure-batch.mdx | 48 ++++++++++++++----- 1 file changed, 35 insertions(+), 13 deletions(-) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index c4743269e..7e92b4367 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -197,41 +197,63 @@ Create a Batch Forge Azure Batch compute environment: **See [Launch pipelines](../launch/launchpad.mdx) to start executing workflows in your Azure Batch compute environment.** -### Manual +## Manual This section is for users with a pre-configured Azure Batch pool. This requires an existing Azure Batch account with an existing pool. :::caution Your Seqera compute environment uses resources that you may be charged for in your Azure account. See [Cloud costs](../monitoring/cloud-costs.mdx) for guidelines to manage cloud resources effectively and prevent unexpected costs. ::: - **Create a manual Seqera Azure Batch compute environment** 1. In a workspace, select **Compute Environments > New Environment**. -2. Enter a descriptive name for this environment, e.g., _Azure Batch (east-us)_. -3. Select **Azure Batch** as the target platform. -4. Select your existing Azure credentials or select **+** to add new credentials. -5. Select a **Region**, e.g., _eastus (East US)_. -6. In the **Pipeline work directory** field, add the Azure blob container created previously, e.g., `az://towerrgstorage-container/work`. +1. Enter a descriptive name for this environment, e.g., _Azure Batch (east-us)_. +1. Select **Azure Batch** as the target platform. +1. Select your existing Azure credentials or select **+** to add new credentials. If you choose to use existing credentials, skip to step 7. + + :::tip + You can create multiple credentials in your Seqera environment. + ::: + +1. Enter a name, e.g., _Azure Credentials_. +1. Add the **Batch account** and **Blob Storage** credentials you created previously. +1. Select a **Region**, e.g., _eastus (East US)_. +1. In the **Pipeline work directory** field, add the Azure blob container created previously, e.g., `az://towerrgstorage-container/work`. :::note When you specify a Blob Storage bucket as your work directory, this bucket is used for the Nextflow [cloud cache](https://www.nextflow.io/docs/latest/cache-and-resume.html#cache-stores) by default. 
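The advanced options above have rough counterparts in Nextflow's own Azure configuration. The sketch below is an approximation with placeholder values; check the exact option names and defaults against the Nextflow Azure documentation for your Nextflow version.

```groovy
// Approximate sketch of the advanced options above (values are placeholders).
azure {
    batch {
        deleteJobsOnCompletion = true   // roughly, the "Jobs cleanup policy" behaviour
    }
    storage {
        tokenDuration = '48h'           // roughly, the "Token duration": lifetime of the SAS token generated by Nextflow
    }
}
```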
You can specify an alternative cache location with the **Nextflow config file** field on the pipeline [launch](../launch/launchpad.mdx#launch-form) form. ::: -7. Set the **Config mode** to **Manual**. -8. Enter the **Compute Pool name**. This is the name of the Azure Batch pool you created previously in the Azure Batch account. +1. Set the **Config mode** to **Manual**. +1. Enter the **Compute Pool name**. This is the name of the Azure Batch pool you created previously in the Azure Batch account. :::note The default Azure Batch implementation uses a single pool for head and compute nodes. To use separate pools for head and compute nodes (e.g., to use low-priority VMs for compute jobs), see [this FAQ entry](../faqs.mdx#azure). ::: -9. Specify custom **Environment variables** for the **Head job** and/or **Compute jobs**. -10. Configure any advanced options you need: - +1. Enter a user-assigned **Managed identity client ID**, if one is attached to your Azure Batch pool. See [Managed Identity](#managed-identity) below. +1. Apply [**Resource labels**](../resource-labels/overview.mdx). This will populate the **Metadata** fields of the Azure Batch pool. +1. Expand **Staging options** to include: + - Optional [pre- or post-run Bash scripts](../launch/advanced.mdx#pre--post-run-scripts) that execute before or after the Nextflow pipeline execution in your environment. + - Global Nextflow configuration settings for all pipeline runs launched with this compute environment. Configuration settings in this field override the same values in the pipeline Nextflow config file. +1. Define custom **Environment Variables** for the **Head Job** and/or **Compute Jobs**. +1. Configure any necessary advanced options: - Use **Jobs cleanup policy** to control how Nextflow process jobs are deleted on completion. Active jobs consume the quota of the Azure Batch account. By default, jobs are terminated by Nextflow and removed from the quota when all tasks succesfully complete. If set to _Always_, all jobs are deleted by Nextflow after pipeline completion. If set to _Never_, jobs are never deleted. If set to _On success_, successful tasks are removed but failed tasks will be left for debugging purposes. - Use **Token duration** to control the duration of the SAS token generated by Nextflow. This must be as long as the longest period of time the pipeline will run. +1. Select **Add** to complete the compute environment setup. The creation of resources will take a few seconds, after which you can launch pipelines. + +### Managed identity + +Nextflow can authenticate to Azure services using a managed identity. This method offers enhanced security compared to access keys, but must run on Azure infrastructure. + +When you use a manually configured compute environment with a managed identity attached to the Azure Batch Pool, Nextflow can use this managed identity for authentication. However, Platform still needs to use access keys to submit the initial task to Azure Batch to run Nextflow, which will then proceed with the managed identity for subsequent authentication. + +1. In Azure, create a user-assigned managed identity. See [Manage user-assigned managed identities](https://learn.microsoft.com/en-us/entra/identity/managed-identities-azure-resources/how-manage-user-assigned-managed-identities) for detailed steps. After creation, record the Client ID of the managed identity. +2. The user-assigned managed identity must have the necessary access roles for Nextflow. 
See [Required role assignments](https://www.nextflow.io/docs/latest/azure.html#required-role-assignments) for more information. +3. Associate the user-assigned managed identity with the Azure Batch Pool. See [Set up managed identity in your batch pool](https://learn.microsoft.com/en-us/troubleshoot/azure/hpc/batch/use-managed-identities-azure-batch-account-pool#set-up-managed-identity-in-your-batch-pool) for more information. +4. When you set up the Platform compute environment, select the Azure Batch pool by name and enter the managed identity client ID in the specified field as instructed above. -11. Select **Add** to finalize the compute environment setup. It will take a few seconds for all the resources to be created before you are ready to launch pipelines. +When you submit a pipeline to this compute environment, Nextflow will authenticate using the managed identity associated with the Azure Batch node it runs on, rather than relying on access keys. **See [Launch pipelines](../launch/launchpad.mdx) to start executing workflows in your Azure Batch compute environment.** From 64eac541f840498081f20dc4fa67fe6b9ca8843d Mon Sep 17 00:00:00 2001 From: adamrtalbot <12817534+adamrtalbot@users.noreply.github.com> Date: Thu, 15 Aug 2024 15:01:29 +0100 Subject: [PATCH 07/42] De-number everything --- .../version-24.1/compute-envs/azure-batch.mdx | 136 +++++++++--------- 1 file changed, 68 insertions(+), 68 deletions(-) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index 7e92b4367..81fc070d3 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -42,10 +42,10 @@ A resource group can be created while creating an Azure Storage Account or Azure ::: 1. Log in to your Azure account, go to the [Create Resource group][az-create-rg] page, and select **Create new resource group**. -2. Enter a name for the resource group, e.g., _towerrg_. -3. Choose the preferred region. -4. Select **Review and Create** to proceed. -5. Select **Create**. +1. Enter a name for the resource group, e.g., _towerrg_. +1. Choose the preferred region. +1. Select **Review and Create** to proceed. +1. Select **Create**. ### Storage account @@ -59,24 +59,24 @@ After creating a resource group, set up an [Azure storage account][az-learn-stor If you haven't created a resource group, you can do so now. ::: -2. Enter a name for the storage account (e.g., _towerrgstorage_). -3. Choose the preferred region (same as the Batch account). -4. The platform supports any performance or redundancy settings — select the most appropriate settings for your use case. -5. Select **Next: Advanced**. -6. Enable _storage account key access_. -7. Select **Next: Networking**. +1. Enter a name for the storage account (e.g., _towerrgstorage_). +1. Choose the preferred region (same as the Batch account). +1. The platform supports any performance or redundancy settings — select the most appropriate settings for your use case. +1. Select **Next: Advanced**. +1. Enable _storage account key access_. +1. Select **Next: Networking**. - Enable public access from all networks. You can enable public access from selected virtual networks and IP addresses, but you will be unable to use Forge to create compute resources. Disabling public access is not supported. -8. Select **Data protection**. +1. Select **Data protection**. - Configure appropriate settings. 
All settings are supported by the platform. -9. Select **Encryption**. +1. Select **Encryption**. - Only Microsoft-managed keys (MMK) are supported. -10. In **tags**, add any required tags for the storage account. -11. Select **Review and Create**. -12. Select **Create** to create the Azure Storage account. -13. You will need at least one blob storage container to act as a working directory for Nextflow. -14. Go to your new storage account and select **+ Container** to create a new Blob storage container. A new container dialogue will open. Enter a suitable name, e.g., _towerrgstorage-container_. -15. Go to the **Access Keys** section of your new storage account (_towerrgstorage_ in this example). -16. Store the access keys for your Azure Storage account, to be used when you create a Seqera compute environment. +1. In **tags**, add any required tags for the storage account. +1. Select **Review and Create**. +1. Select **Create** to create the Azure Storage account. +1. You will need at least one blob storage container to act as a working directory for Nextflow. +1. Go to your new storage account and select **+ Container** to create a new Blob storage container. A new container dialogue will open. Enter a suitable name, e.g., _towerrgstorage-container_. +1. Go to the **Access Keys** section of your new storage account (_towerrgstorage_ in this example). +1. Store the access keys for your Azure Storage account, to be used when you create a Seqera compute environment. :::caution Blob container storage credentials are associated with the Batch pool configuration. Avoid changing these credentials in your Seqera instance after you have created the compute environment. @@ -89,26 +89,26 @@ After you have created a resource group and storage account, create a [Batch acc ### Create a Batch account 1. Log in to your Azure account and select **Create a batch account** on [this page][az-create-batch]. -2. Select the existing resource group or create a new one. -3. Enter a name for the Batch account, e.g., _towerrgbatch_. -4. Choose the preferred region (same as the storage account). -5. Select **Advanced**. -6. For _Pool allocation mode_, select Batch service. -7. For _Authentication mode_, ensure _Shared Key_ is selected. -8. Select **Networking**. Ensure networking access is sufficient for the platform and any additional required resources. -9. In **tags**, add any required tags for the Batch account. -10. Select **Review and Create**. -11. Select **Create**. -12. Go to your new Batch account, then select **Access Keys**. -13. Store the access keys for your Azure Batch account, to be used when you create a Seqera compute environment. +1. Select the existing resource group or create a new one. +1. Enter a name for the Batch account, e.g., _towerrgbatch_. +1. Choose the preferred region (same as the storage account). +1. Select **Advanced**. +1. For _Pool allocation mode_, select Batch service. +1. For _Authentication mode_, ensure _Shared Key_ is selected. +1. Select **Networking**. Ensure networking access is sufficient for the platform and any additional required resources. +1. In **tags**, add any required tags for the Batch account. +1. Select **Review and Create**. +1. Select **Create**. +1. Go to your new Batch account, then select **Access Keys**. +1. Store the access keys for your Azure Batch account, to be used when you create a Seqera compute environment. :::caution A newly-created Azure Batch account may not be entitled to create virtual machines without making a service request to Azure. 
See [Azure Batch service quotas and limits][az-batch-quotas] for more information. ::: -14. Select the **+ Quotas** tab of the Azure Batch account to check and increase existing quotas if necessary. -15. Select **+ Request quota increase** and add the quantity of resources you require. Here is a brief guideline: +1. Select the **+ Quotas** tab of the Azure Batch account to check and increase existing quotas if necessary. +1. Select **+ Request quota increase** and add the quantity of resources you require. Here is a brief guideline: - **Active jobs and schedules**: Each Nextflow process will require an active Azure Batch job per pipeline while running, so increase this number to a high level. See [here][az-learn-jobs] to learn more about jobs in Azure Batch. - **Pools**: Each platform compute environment requires one Azure Batch pool. Each pool is composed of multiple machines of one virtual machine size. @@ -132,24 +132,24 @@ The two Azure credential types use entirely different authentication methods. Yo #### Access Keys 1. Navigate to the Azure Portal and sign in. -2. Locate the Azure Batch account and select "Keys" under "Account management." Here, you will see the Primary and Secondary keys. Copy one of the keys and save it in a secure location for later use. -3. Locate the Azure Storage account and, under the "Security and Networking" section, select "Access keys". Here, you will see Key1 and Key2 options. Copy one of them and save it in a secure location for later use. Be sure to delete them after saving them in Seqera Platform. -4. In Seqera Platform, go to your workspace, select "Add a new credential," and choose the "Azure Credentials" type. -5. Enter a name for the credentials, such as _Azure Credentials_. -6. Add the **Batch account** and **Blob Storage** account names and access keys. +1. Locate the Azure Batch account and select "Keys" under "Account management." Here, you will see the Primary and Secondary keys. Copy one of the keys and save it in a secure location for later use. +1. Locate the Azure Storage account and, under the "Security and Networking" section, select "Access keys". Here, you will see Key1 and Key2 options. Copy one of them and save it in a secure location for later use. Be sure to delete them after saving them in Seqera Platform. +1. In Seqera Platform, go to your workspace, select "Add a new credential," and choose the "Azure Credentials" type. +1. Enter a name for the credentials, such as _Azure Credentials_. +1. Add the **Batch account** and **Blob Storage** account names and access keys. #### Service Principal 1. In the Azure Portal, navigate to "Microsoft Entra ID," and under "App registrations," click "New registration." See the [Azure documentation][az-create-sp] for more details. -2. Provide a name for the application. The application will automatically have a Service Principal associated with it. -3. Assign roles to the Service Principal. Go to the Azure Storage account, and under "Access Control (IAM)," click "Add role assignment." -4. Choose the roles "Storage Blob Data Reader" and "Storage Blob Data Contributor", then select "Members", click "Select Members", search for your newly created Service Principal, and assign the role. -5. Repeat the same process for the Azure Batch account, but use the "Azure Batch Contributor" role. -6. Seqera Platform will need credentials to authenticate as the Service Principal. 
Navigate back to the app registration, and on the "Overview" page, save the "Application (client) ID" value for use in Seqera Platform. -7. Then, click "Certificates & secrets" and select "New client secret." A new secret will be created containing a value and secret ID. Save both of these securely for use in Seqera Platform. Be sure to delete them after saving them in Seqera Platform. -8. In Seqera Platform, go to your workspace, select "Add a new credential," choose the "Azure Credentials" type, then select the "Service Principal" tab. -9. Enter a name for the credentials, such as _Azure Credentials_. -10. Add the Application ID, Secret ID, Secret, **Batch account**, and **Blob Storage** account names to the relevant fields. +1. Provide a name for the application. The application will automatically have a Service Principal associated with it. +1. Assign roles to the Service Principal. Go to the Azure Storage account, and under "Access Control (IAM)," click "Add role assignment." +1. Choose the roles "Storage Blob Data Reader" and "Storage Blob Data Contributor", then select "Members", click "Select Members", search for your newly created Service Principal, and assign the role. +1. Repeat the same process for the Azure Batch account, but use the "Azure Batch Contributor" role. +1. Seqera Platform will need credentials to authenticate as the Service Principal. Navigate back to the app registration, and on the "Overview" page, save the "Application (client) ID" value for use in Seqera Platform. +1. Then, click "Certificates & secrets" and select "New client secret." A new secret will be created containing a value and secret ID. Save both of these securely for use in Seqera Platform. Be sure to delete them after saving them in Seqera Platform. +1. In Seqera Platform, go to your workspace, select "Add a new credential," choose the "Azure Credentials" type, then select the "Service Principal" tab. +1. Enter a name for the credentials, such as _Azure Credentials_. +1. Add the Application ID, Secret ID, Secret, **Batch account**, and **Blob Storage** account names to the relevant fields. ## Seqera Platform @@ -167,33 +167,33 @@ Batch Forge automatically creates resources that you may be charged for in your Create a Batch Forge Azure Batch compute environment: 1. In a workspace, select **Compute Environments > New Environment**. -2. Enter a descriptive name, e.g., _Azure Batch (east-us)_. -3. Select **Azure Batch** as the target platform. -4. Choose existing Azure credentials or add a new credential. -5. Select a **Region**, e.g., _eastus_. -6. In the **Pipeline work directory** field, enter the Azure blob container created previously, e.g., `az://towerrgstorage-container/work`. +1. Enter a descriptive name, e.g., _Azure Batch (east-us)_. +1. Select **Azure Batch** as the target platform. +1. Choose existing Azure credentials or add a new credential. +1. Select a **Region**, e.g., _eastus_. +1. In the **Pipeline work directory** field, enter the Azure blob container created previously, e.g., `az://towerrgstorage-container/work`. :::note When you specify a Blob Storage bucket as your work directory, this bucket is used for the Nextflow [cloud cache](https://www.nextflow.io/docs/latest/cache-and-resume.html#cache-stores) by default. You can specify an alternative cache location with the **Nextflow config file** field on the pipeline [launch](../launch/launchpad.mdx#launch-form) form. ::: -7. 
Select **Enable Wave containers** to facilitate access to private container repositories and provision containers in your pipelines using the Wave containers service. See [Wave containers][wave-docs] for more information. -8. Select **Enable Fusion v2** to allow access to your Azure Blob Storage data via the [Fusion v2][nf-fusion-docs] virtual distributed file system. This speeds up most data operations. The Fusion v2 file system requires Wave containers to be enabled. See [Fusion file system](../supported_software/fusion/fusion.mdx) for configuration details. -9. Set the **Config mode** to **Batch Forge**. -10. Enter the default **VMs type**, depending on your quota limits set previously. The default is _Standard_D4_v3_. -11. Enter the **VMs count**. If autoscaling is enabled (default), this is the maximum number of VMs you wish the pool to scale up to. If autoscaling is disabled, this is the fixed number of virtual machines in the pool. -12. Enable **Autoscale** to scale up and down automatically, based on the number of pipeline tasks. The number of VMs will vary from **0** to **VMs count**. -13. Enable **Dispose resources** for Seqera to automatically delete the Batch pool if the compute environment is deleted on the platform. -14. Select or create [**Container registry credentials**](../credentials/azure_registry_credentials.mdx) to authenticate a registry (used by the [Wave containers](https://www.nextflow.io/docs/latest/wave.html) service). It is recommended to use an [Azure Container registry](https://azure.microsoft.com/en-gb/products/container-registry) within the same region for maximum performance. -15. Apply [**Resource labels**](../resource-labels/overview.mdx). This will populate the **Metadata** fields of the Azure Batch pool. -16. Expand **Staging options** to include optional [pre- or post-run Bash scripts](../launch/advanced.mdx#pre-and-post-run-scripts) that execute before or after the Nextflow pipeline execution in your environment. -17. Specify custom **Environment variables** for the **Head job** and/or **Compute jobs**. -18. Configure any advanced options you need: +1. Select **Enable Wave containers** to facilitate access to private container repositories and provision containers in your pipelines using the Wave containers service. See [Wave containers][wave-docs] for more information. +1. Select **Enable Fusion v2** to allow access to your Azure Blob Storage data via the [Fusion v2][nf-fusion-docs] virtual distributed file system. This speeds up most data operations. The Fusion v2 file system requires Wave containers to be enabled. See [Fusion file system](../supported_software/fusion/fusion.mdx) for configuration details. +1. Set the **Config mode** to **Batch Forge**. +1. Enter the default **VMs type**, depending on your quota limits set previously. The default is _Standard_D4_v3_. +1. Enter the **VMs count**. If autoscaling is enabled (default), this is the maximum number of VMs you wish the pool to scale up to. If autoscaling is disabled, this is the fixed number of virtual machines in the pool. +1. Enable **Autoscale** to scale up and down automatically, based on the number of pipeline tasks. The number of VMs will vary from **0** to **VMs count**. +1. Enable **Dispose resources** for Seqera to automatically delete the Batch pool if the compute environment is deleted on the platform. +1. 
Select or create [**Container registry credentials**](../credentials/azure_registry_credentials.mdx) to authenticate a registry (used by the [Wave containers](https://www.nextflow.io/docs/latest/wave.html) service). It is recommended to use an [Azure Container Registry](https://azure.microsoft.com/en-gb/products/container-registry) within the same region for maximum performance.
The user-assigned managed identity must have the necessary access roles for Nextflow. See [Required role assignments](https://www.nextflow.io/docs/latest/azure.html#required-role-assignments) for more information. +1. Associate the user-assigned managed identity with the Azure Batch Pool. See [Set up managed identity in your batch pool](https://learn.microsoft.com/en-us/troubleshoot/azure/hpc/batch/use-managed-identities-azure-batch-account-pool#set-up-managed-identity-in-your-batch-pool) for more information. +1. When you set up the Platform compute environment, select the Azure Batch pool by name and enter the managed identity client ID in the specified field as instructed above. When you submit a pipeline to this compute environment, Nextflow will authenticate using the managed identity associated with the Azure Batch node it runs on, rather than relying on access keys. From 63346e7f42c2e0f74b5b2d4e931d1c7813f6a371 Mon Sep 17 00:00:00 2001 From: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> Date: Tue, 20 Aug 2024 17:44:36 +0200 Subject: [PATCH 08/42] Update platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx Co-authored-by: Adam Talbot <12817534+adamrtalbot@users.noreply.github.com> Signed-off-by: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> --- .../version-24.1/compute-envs/azure-batch.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index 81fc070d3..cee303088 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -123,7 +123,7 @@ After you have created a resource group and storage account, create a [Batch acc ### Credentials -There are two Azure credential options available: primary Access Keys and a Service Principal. Primary Access Keys are simple to use but provide full access to the storage and batch accounts. Additionally, there can only be two keys per account, making them a single point of failure. A Service Principal, on the other hand, provides an account that can be granted access to Azure Batch and Storage resources, allowing for role-based access control with more precise permissions. Moreover, some Azure Batch features are only available when using a Service Principal instead of primary Access Keys. +There are two types of Azure credentials available: primary Access Keys and a Service Principal. Primary Access Keys are simple to use but provide full access to the storage and batch accounts. Additionally, there can only be two keys per account, making them a single point of failure. A Service Principal, on the other hand, provides an account that can be granted access to Azure Batch and Storage resources, allowing for role-based access control with more precise permissions. Moreover, some Azure Batch features are only available when using a Service Principal instead of primary Access Keys. :::note The two Azure credential types use entirely different authentication methods. You can add more than one credential to a workspace, but only one can be used at a time. While they can be used concurrently, they are not cross-compatible, and access granted by one will not be shared with the other. 
From 6d2ccfd198332380bec1fcf5a5e789f0fae6492e Mon Sep 17 00:00:00 2001 From: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> Date: Tue, 20 Aug 2024 17:45:05 +0200 Subject: [PATCH 09/42] Update platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx Co-authored-by: Adam Talbot <12817534+adamrtalbot@users.noreply.github.com> Signed-off-by: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> --- .../version-24.1/compute-envs/azure-batch.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index cee303088..9ff8a8e47 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -134,7 +134,7 @@ The two Azure credential types use entirely different authentication methods. Yo 1. Navigate to the Azure Portal and sign in. 1. Locate the Azure Batch account and select "Keys" under "Account management." Here, you will see the Primary and Secondary keys. Copy one of the keys and save it in a secure location for later use. 1. Locate the Azure Storage account and, under the "Security and Networking" section, select "Access keys". Here, you will see Key1 and Key2 options. Copy one of them and save it in a secure location for later use. Be sure to delete them after saving them in Seqera Platform. -1. In Seqera Platform, go to your workspace, select "Add a new credential," and choose the "Azure Credentials" type. +1. In Seqera Platform, go to your workspace, select "Add a new credential," and choose the "Azure Credentials" type. Use keys as the authentication method. 1. Enter a name for the credentials, such as _Azure Credentials_. 1. Add the **Batch account** and **Blob Storage** account names and access keys. From 6238d2f58fd7e007f8bf9b459ec2b71882018c11 Mon Sep 17 00:00:00 2001 From: llewellyn-sl Date: Tue, 20 Aug 2024 21:21:36 +0200 Subject: [PATCH 10/42] Formatting and language review --- .../version-24.1/compute-envs/azure-batch.mdx | 142 +++++++++--------- 1 file changed, 74 insertions(+), 68 deletions(-) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index 9ff8a8e47..ad9e503eb 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -11,56 +11,61 @@ For details, visit [Azure Free Account][az-create-account]. Ensure you have sufficient permissions to create resource groups, an Azure Storage account, and a Batch account. ::: -## Concepts +## Azure concepts -### Regions +
+ **Regions** -Azure regions are distinct geographic areas that contain multiple data centers, strategically located around the world, to provide high availability, fault tolerance, and low latency for cloud services. Each region offers a wide range of Azure services, and by choosing a specific region, users can optimize performance, ensure data residency compliance, and meet regulatory requirements. Azure regions also enable redundancy and disaster recovery options by allowing resources to be replicated across different regions, enhancing the resilience of applications and data. + Azure regions are distinct geographic areas that contain multiple data centers strategically located around the world to provide high availability, fault tolerance, and low latency for cloud services. Each region offers a wide range of Azure services — choose a specific region to optimize performance, ensure data residency compliance, and meet regulatory requirements. Azure regions also enable redundancy and disaster recovery options by allowing resources to be replicated across different regions, enhancing the resilience of applications and data. -### Resource group +
+
+ **Resource groups** + + An Azure resource group is a logical container that holds related Azure resources such as virtual machines, storage accounts, databases, and more. A resource group serves as a management boundary to organize, deploy, monitor, and manage the resources within it as a single entity. Resources in a resource group share the same lifecycle, meaning they can be deployed, updated, and deleted together. This also enables easier access control, monitoring, and cost management, making resource groups a foundational element in organizing and managing cloud infrastructure in Azure. -An Azure Resource Group is a logical container that holds related Azure resources such as virtual machines, storage accounts, databases, and more. It serves as a management boundary, allowing you to organize, deploy, monitor, and manage all the resources within it as a single entity. Resources in a Resource Group share the same lifecycle, meaning they can be deployed, updated, and deleted together. This grouping also enables easier access control, monitoring, and cost management, making it a foundational element in organizing and managing cloud infrastructure in Azure. +
+
+ **Accounts** -### Accounts + Seqera Platform relies on an existing Azure Storage and Azure Batch account. You need at least one valid Azure Storage and Azure Batch account in your Azure subscription. -Seqera Platform relies on an existing Azure Storage and Azure Batch account. You need at least 1 valid Azure Storage account and Azure Batch account within your subscription. + Azure uses accounts for each service. For example, an [Azure Storage account][az-learn-storage] will house a collection of blob containers, file shares, queues, and tables. While you can have multiple Azure Storage and Azure Batch accounts in an Azure subscription, a Platform compute environment can only use one of each (one Storage and one Batch account). You can set up multiple Platform compute environments to use different credentials, Storage accounts, and Batch accounts. -Azure uses 'accounts' for each service. For example, an [Azure Storage account][az-learn-storage] will house a collection of blob containers, file shares, queues, and tables. While you can have multiple Azure Storage and Azure Batch accounts in an Azure subscription, each compute environment on the platform can only use one of each (one storage and one Batch account). You can set up multiple compute environments on the platform with different credentials, storage accounts, and Batch accounts. +
+
+ **Service principals** -### Service Principal + An Azure service principal is an identity created for use with applications, hosted services, or automated tools to access Azure resources. A service principal is effectively a user identity with a set of permissions assigned to it. Platform can use an Azure service principal to authenticate and authorize access to Azure Batch for job execution and Azure Storage for data management. By assigning the necessary roles to the service principal, Platform can securely interact with these Azure services. This ensures that only authorized operations are performed during pipeline execution. -An Azure Service Principal is an identity created for use with applications, hosted services, or automated tools to access Azure resources. It acts like a "user identity" with a specific set of permissions assigned to it. Seqera Platform can use an Azure Service Principal to authenticate and authorize access to Azure Batch for job execution and Azure Storage for data management. By assigning the necessary roles to the Service Principal, Seqera can securely interact with these Azure services, ensuring that only authorized operations are performed during pipeline execution. +
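For example, an application registration and its associated service principal can be created from the CLI as well as from the portal. The name below is a placeholder; the command prints the application (client) ID and an initial client secret, which should be stored securely:

```bash
# Create an app registration plus service principal for Platform to authenticate with
az ad sp create-for-rbac --name seqera-platform-sp
```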
-## Azure Resources +## Create Azure resources ### Resource group -An Azure Batch and Azure Storage account need to be linked to a resource group in Azure, so it is necessary to create one. +Create a resource group to link your Azure Batch and Azure Storage account: :::note A resource group can be created while creating an Azure Storage Account or Azure Batch account. ::: 1. Log in to your Azure account, go to the [Create Resource group][az-create-rg] page, and select **Create new resource group**. -1. Enter a name for the resource group, e.g., _towerrg_. +1. Enter a name for the resource group, such as _towerrg_. 1. Choose the preferred region. 1. Select **Review and Create** to proceed. 1. Select **Create**. ### Storage account -After creating a resource group, set up an [Azure storage account][az-learn-storage]. - -### Create a storage account +After creating a resource group, set up an [Azure storage account][az-learn-storage]: 1. Log in to your Azure account, go to the [Create storage account][az-create-storage] page, and select **Create a storage account**. - :::note If you haven't created a resource group, you can do so now. ::: - -1. Enter a name for the storage account (e.g., _towerrgstorage_). -1. Choose the preferred region (same as the Batch account). +1. Enter a name for the storage account, such as _towerrgstorage_. +1. Choose the preferred region. This must be the same region as the Batch account. 1. The platform supports any performance or redundancy settings — select the most appropriate settings for your use case. 1. Select **Next: Advanced**. 1. Enable _storage account key access_. @@ -84,74 +89,82 @@ Blob container storage credentials are associated with the Batch pool configurat ### Batch account -After you have created a resource group and storage account, create a [Batch account][az-learn-batch]. - -### Create a Batch account +After you have created a resource group and storage account, create a [Batch account][az-learn-batch]: 1. Log in to your Azure account and select **Create a batch account** on [this page][az-create-batch]. 1. Select the existing resource group or create a new one. -1. Enter a name for the Batch account, e.g., _towerrgbatch_. -1. Choose the preferred region (same as the storage account). +1. Enter a name for the Batch account, such as _towerrgbatch_. +1. Choose the preferred region. This must be the same region as the Storage account. 1. Select **Advanced**. -1. For _Pool allocation mode_, select Batch service. -1. For _Authentication mode_, ensure _Shared Key_ is selected. -1. Select **Networking**. Ensure networking access is sufficient for the platform and any additional required resources. +1. For **Pool allocation mode**, select Batch service. +1. For **Authentication mode**, select _Shared Key_. +1. Select **Networking**. Ensure networking access is sufficient for Platform and any additional required resources. 1. In **tags**, add any required tags for the Batch account. 1. Select **Review and Create**. 1. Select **Create**. 1. Go to your new Batch account, then select **Access Keys**. 1. Store the access keys for your Azure Batch account, to be used when you create a Seqera compute environment. - :::caution A newly-created Azure Batch account may not be entitled to create virtual machines without making a service request to Azure. See [Azure Batch service quotas and limits][az-batch-quotas] for more information. ::: - 1. Select the **+ Quotas** tab of the Azure Batch account to check and increase existing quotas if necessary. 1. 
Select **+ Request quota increase** and add the quantity of resources you require. Here is a brief guideline: - - **Active jobs and schedules**: Each Nextflow process will require an active Azure Batch job per pipeline while running, so increase this number to a high level. See [here][az-learn-jobs] to learn more about jobs in Azure Batch. - **Pools**: Each platform compute environment requires one Azure Batch pool. Each pool is composed of multiple machines of one virtual machine size. - :::note To use separate pools for head and compute nodes, see [this FAQ entry](../faqs.mdx#azure). ::: - - **Batch accounts per region per subscription**: Set this to the number of Azure Batch accounts per region per subscription. Only one is required. - **Spot/low-priority vCPUs**: Platform does not support spot or low-priority machines when using Forge, so when using Forge this number can be zero. When manually setting up a pool, select an appropriate number of concurrent vCPUs here. - **Total Dedicated vCPUs per VM series**: See the Azure documentation for [virtual machine sizes][az-vm-sizes] to help determine the machine size you need. We recommend the latest version of the ED series available in your region as a cost-effective and appropriately-sized machine for running Nextflow. However, you will need to select alternative machine series that have additional requirements, such as those with additional GPUs or faster storage. Increase the quota by the number of required concurrent CPUs. In Azure, machines are charged per cpu minute so there is no additional cost for a higher number. ### Credentials -There are two types of Azure credentials available: primary Access Keys and a Service Principal. Primary Access Keys are simple to use but provide full access to the storage and batch accounts. Additionally, there can only be two keys per account, making them a single point of failure. A Service Principal, on the other hand, provides an account that can be granted access to Azure Batch and Storage resources, allowing for role-based access control with more precise permissions. Moreover, some Azure Batch features are only available when using a Service Principal instead of primary Access Keys. +There are two types of Azure credentials available: primary access keys and service principals. Primary access keys are simple to use but provide full access to the Storage and Batch accounts. Azure allows only two access keys per account, making them a single point of failure. A service principal provides an account which can be granted access to Azure Batch and Storage resources. Service principals therefore enable role-based access control with more precise permissions. Moreover, some Azure Batch features are only available when using a service principal. :::note -The two Azure credential types use entirely different authentication methods. You can add more than one credential to a workspace, but only one can be used at a time. While they can be used concurrently, they are not cross-compatible, and access granted by one will not be shared with the other. +The two Azure credential types use entirely different authentication methods. You can add more than one credential to a workspace, but only one can be used at a time. While credentials can be used concurrently, they are not cross-compatible — access granted by one credential will not be shared with the other. ::: -#### Access Keys +#### Access keys + +To create a primary access key: 1. Navigate to the Azure Portal and sign in. -1. 
Locate the Azure Batch account and select "Keys" under "Account management." Here, you will see the Primary and Secondary keys. Copy one of the keys and save it in a secure location for later use. -1. Locate the Azure Storage account and, under the "Security and Networking" section, select "Access keys". Here, you will see Key1 and Key2 options. Copy one of them and save it in a secure location for later use. Be sure to delete them after saving them in Seqera Platform. -1. In Seqera Platform, go to your workspace, select "Add a new credential," and choose the "Azure Credentials" type. Use keys as the authentication method. -1. Enter a name for the credentials, such as _Azure Credentials_. -1. Add the **Batch account** and **Blob Storage** account names and access keys. - -#### Service Principal - -1. In the Azure Portal, navigate to "Microsoft Entra ID," and under "App registrations," click "New registration." See the [Azure documentation][az-create-sp] for more details. -1. Provide a name for the application. The application will automatically have a Service Principal associated with it. -1. Assign roles to the Service Principal. Go to the Azure Storage account, and under "Access Control (IAM)," click "Add role assignment." -1. Choose the roles "Storage Blob Data Reader" and "Storage Blob Data Contributor", then select "Members", click "Select Members", search for your newly created Service Principal, and assign the role. -1. Repeat the same process for the Azure Batch account, but use the "Azure Batch Contributor" role. -1. Seqera Platform will need credentials to authenticate as the Service Principal. Navigate back to the app registration, and on the "Overview" page, save the "Application (client) ID" value for use in Seqera Platform. -1. Then, click "Certificates & secrets" and select "New client secret." A new secret will be created containing a value and secret ID. Save both of these securely for use in Seqera Platform. Be sure to delete them after saving them in Seqera Platform. -1. In Seqera Platform, go to your workspace, select "Add a new credential," choose the "Azure Credentials" type, then select the "Service Principal" tab. -1. Enter a name for the credentials, such as _Azure Credentials_. -1. Add the Application ID, Secret ID, Secret, **Batch account**, and **Blob Storage** account names to the relevant fields. - -## Seqera Platform +1. Locate the Azure Batch account and select **Keys** under **Account management**. The Primary and Secondary keys are listed here. Copy one of the keys and save it in a secure location for later use. +1. Locate the Azure Storage account and, under the **Security and Networking** section, select **Access keys**. Key1 and Key2 options are listed here. Copy one of them and save it in a secure location for later use. Delete the stored key after adding it to a credential in Platform. +1. In your Platform workspace **Credentials** tab, select the **Add credentials** button and complete the following fields: + - Enter a **Name** for the credentials + - **Provider**: Azure + - Select the **Shared key** tab + - Add the **Batch account** and **Blob Storage account** names and access keys to the relevant fields. + +#### Service principal + +:::note +See [Create a service principal][az-create-sp] for more details. +::: + +To create a service principal: + +1. In the Azure Portal, navigate to **Microsoft Entra ID**. Under **App registrations**, select **New registration**. +1. Provide a name for the application. 
The application will automatically have a service principal associated with it. +1. Assign roles to the service principal: + 1. Go to the Azure Storage account. Under **Access Control (IAM)**, select **Add role assignment**. + 1. Select the **Storage Blob Data Reader** and **Storage Blob Data Contributor** roles. + 1. Select **Members**, then **Select Members**. Search for your newly created service principal and assign the role. + 1. Repeat the same process for the Azure Batch account, using the **Azure Batch Contributor** role. +1. Platform will need credentials to authenticate as the service principal: + 1. Navigate back to the app registration. On the **Overview** page, save the **Application (client) ID** value for use in Platform. + 1. Select **Certificates & secrets**, then **New client secret**. A new secret is created containing a value and secret ID. Save both values securely for use in Platform. Delete these stored values after saving them in Platform. +1. In your Platform workspace **Credentials** tab, select the **Add credentials** button and complete the following fields: + - Enter a **Name** for the credentials + - **Provider**: Azure + - Select the **Service principal** tab + - Add the Application ID, Secret ID, Secret, **Batch account**, and **Blob Storage account** names to the relevant fields. + +## Platform compute environment There are two ways to create an Azure Batch compute environment in Seqera Platform: @@ -172,11 +185,9 @@ Create a Batch Forge Azure Batch compute environment: 1. Choose existing Azure credentials or add a new credential. 1. Select a **Region**, e.g., _eastus_. 1. In the **Pipeline work directory** field, enter the Azure blob container created previously, e.g., `az://towerrgstorage-container/work`. - :::note When you specify a Blob Storage bucket as your work directory, this bucket is used for the Nextflow [cloud cache](https://www.nextflow.io/docs/latest/cache-and-resume.html#cache-stores) by default. You can specify an alternative cache location with the **Nextflow config file** field on the pipeline [launch](../launch/launchpad.mdx#launch-form) form. ::: - 1. Select **Enable Wave containers** to facilitate access to private container repositories and provision containers in your pipelines using the Wave containers service. See [Wave containers][wave-docs] for more information. 1. Select **Enable Fusion v2** to allow access to your Azure Blob Storage data via the [Fusion v2][nf-fusion-docs] virtual distributed file system. This speeds up most data operations. The Fusion v2 file system requires Wave containers to be enabled. See [Fusion file system](../supported_software/fusion/fusion.mdx) for configuration details. 1. Set the **Config mode** to **Batch Forge**. @@ -186,13 +197,13 @@ Create a Batch Forge Azure Batch compute environment: 1. Enable **Dispose resources** for Seqera to automatically delete the Batch pool if the compute environment is deleted on the platform. 1. Select or create [**Container registry credentials**](../credentials/azure_registry_credentials.mdx) to authenticate a registry (used by the [Wave containers](https://www.nextflow.io/docs/latest/wave.html) service). It is recommended to use an [Azure Container registry](https://azure.microsoft.com/en-gb/products/container-registry) within the same region for maximum performance. 1. Apply [**Resource labels**](../resource-labels/overview.mdx). This will populate the **Metadata** fields of the Azure Batch pool. -1. 
Expand **Staging options** to include optional [pre- or post-run Bash scripts](../launch/advanced.mdx#pre-and-post-run-scripts) that execute before or after the Nextflow pipeline execution in your environment. +1. Expand **Staging options** to include: + - Optional [pre- or post-run Bash scripts](../launch/advanced.mdx#pre--post-run-scripts) that execute before or after the Nextflow pipeline execution in your environment. + - Global Nextflow configuration settings for all pipeline runs launched with this compute environment. Configuration settings in this field override the same values in the pipeline Nextflow config file. 1. Specify custom **Environment variables** for the **Head job** and/or **Compute jobs**. 1. Configure any advanced options you need: - - Use **Jobs cleanup policy** to control how Nextflow process jobs are deleted on completion. Active jobs consume the quota of the Azure Batch account. By default, jobs are terminated by Nextflow and removed from the quota when all tasks succesfully complete. If set to _Always_, all jobs are deleted by Nextflow after pipeline completion. If set to _Never_, jobs are never deleted. If set to _On success_, successful tasks are removed but failed tasks will be left for debugging purposes. - Use **Token duration** to control the duration of the SAS token generated by Nextflow. This must be as long as the longest period of time the pipeline will run. - 1. Select **Add** to finalize the compute environment setup. It will take a few seconds for all the resources to be created before the compute environment is ready to launch pipelines. **See [Launch pipelines](../launch/launchpad.mdx) to start executing workflows in your Azure Batch compute environment.** @@ -204,33 +215,28 @@ This section is for users with a pre-configured Azure Batch pool. This requires :::caution Your Seqera compute environment uses resources that you may be charged for in your Azure account. See [Cloud costs](../monitoring/cloud-costs.mdx) for guidelines to manage cloud resources effectively and prevent unexpected costs. ::: + **Create a manual Seqera Azure Batch compute environment** 1. In a workspace, select **Compute Environments > New Environment**. 1. Enter a descriptive name for this environment, e.g., _Azure Batch (east-us)_. 1. Select **Azure Batch** as the target platform. 1. Select your existing Azure credentials or select **+** to add new credentials. If you choose to use existing credentials, skip to step 7. - :::tip You can create multiple credentials in your Seqera environment. ::: - 1. Enter a name, e.g., _Azure Credentials_. 1. Add the **Batch account** and **Blob Storage** credentials you created previously. 1. Select a **Region**, e.g., _eastus (East US)_. 1. In the **Pipeline work directory** field, add the Azure blob container created previously, e.g., `az://towerrgstorage-container/work`. - :::note When you specify a Blob Storage bucket as your work directory, this bucket is used for the Nextflow [cloud cache](https://www.nextflow.io/docs/latest/cache-and-resume.html#cache-stores) by default. You can specify an alternative cache location with the **Nextflow config file** field on the pipeline [launch](../launch/launchpad.mdx#launch-form) form. ::: - 1. Set the **Config mode** to **Manual**. 1. Enter the **Compute Pool name**. This is the name of the Azure Batch pool you created previously in the Azure Batch account. - :::note The default Azure Batch implementation uses a single pool for head and compute nodes. 
To use separate pools for head and compute nodes (e.g., to use low-priority VMs for compute jobs), see [this FAQ entry](../faqs.mdx#azure). ::: - 1. Enter a user-assigned **Managed identity client ID**, if one is attached to your Azure Batch pool. See [Managed Identity](#managed-identity) below. 1. Apply [**Resource labels**](../resource-labels/overview.mdx). This will populate the **Metadata** fields of the Azure Batch pool. 1. Expand **Staging options** to include: @@ -268,7 +274,7 @@ When you submit a pipeline to this compute environment, Nextflow will authentica [az-learn-jobs]: https://learn.microsoft.com/en-us/azure/batch/jobs-and-tasks [az-create-rg]: https://portal.azure.com/#create/Microsoft.ResourceGroup [az-create-storage]: https://portal.azure.com/#create/Microsoft.StorageAccount-ARM -[az-create-sp](https://learn.microsoft.com/en-us/entra/identity-platform/howto-create-service-principal-portal) +[az-create-sp]: (https://learn.microsoft.com/en-us/entra/identity-platform/howto-create-service-principal-portal) [wave-docs]: https://docs.seqera.io/wave [nf-fusion-docs]: https://www.nextflow.io/docs/latest/fusion.html \ No newline at end of file From 9a96c32843b20949abcbfc871458c6dbb7f45285 Mon Sep 17 00:00:00 2001 From: llewellyn-sl Date: Wed, 21 Aug 2024 12:52:12 +0200 Subject: [PATCH 11/42] Align credential steps with UI copy --- .../version-24.1/compute-envs/azure-batch.mdx | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index ad9e503eb..287364a3d 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -121,24 +121,25 @@ After you have created a resource group and storage account, create a [Batch acc ### Credentials -There are two types of Azure credentials available: primary access keys and service principals. Primary access keys are simple to use but provide full access to the Storage and Batch accounts. Azure allows only two access keys per account, making them a single point of failure. A service principal provides an account which can be granted access to Azure Batch and Storage resources. Service principals therefore enable role-based access control with more precise permissions. Moreover, some Azure Batch features are only available when using a service principal. +There are two types of Azure credentials available: access keys and service principals. Access keys are simple to use but provide full access to the Storage and Batch accounts. Azure allows only two access keys per account, making them a single point of failure. A service principal provides an account which can be granted access to Azure Batch and Storage resources. Service principals therefore enable role-based access control with more precise permissions. Moreover, some Azure Batch features are only available when using a service principal. :::note -The two Azure credential types use entirely different authentication methods. You can add more than one credential to a workspace, but only one can be used at a time. While credentials can be used concurrently, they are not cross-compatible — access granted by one credential will not be shared with the other. +The two Azure credential types use entirely different authentication methods. 
You can add more than one credential to a workspace, but Platform compute environments use only one credential at any given time. While separate credentials can be used by separate compute environments concurrently, they are not cross-compatible — access granted by one credential will not be shared with the other. ::: #### Access keys -To create a primary access key: +To create an access key: 1. Navigate to the Azure Portal and sign in. 1. Locate the Azure Batch account and select **Keys** under **Account management**. The Primary and Secondary keys are listed here. Copy one of the keys and save it in a secure location for later use. -1. Locate the Azure Storage account and, under the **Security and Networking** section, select **Access keys**. Key1 and Key2 options are listed here. Copy one of them and save it in a secure location for later use. Delete the stored key after adding it to a credential in Platform. +1. Locate the Azure Storage account and, under the **Security and Networking** section, select **Access keys**. Key1 and Key2 options are listed here. Copy one of them and save it in a secure location for later use. 1. In your Platform workspace **Credentials** tab, select the **Add credentials** button and complete the following fields: - Enter a **Name** for the credentials - **Provider**: Azure - Select the **Shared key** tab - Add the **Batch account** and **Blob Storage account** names and access keys to the relevant fields. +1. Delete the copied keys from their temporary location after they have been added to a credential in Platform. #### Service principal @@ -157,12 +158,13 @@ To create a service principal: 1. Repeat the same process for the Azure Batch account, using the **Azure Batch Contributor** role. 1. Platform will need credentials to authenticate as the service principal: 1. Navigate back to the app registration. On the **Overview** page, save the **Application (client) ID** value for use in Platform. - 1. Select **Certificates & secrets**, then **New client secret**. A new secret is created containing a value and secret ID. Save both values securely for use in Platform. Delete these stored values after saving them in Platform. + 1. Select **Certificates & secrets**, then **New client secret**. A new secret is created containing a value and secret ID. Save both values securely for use in Platform. 1. In your Platform workspace **Credentials** tab, select the **Add credentials** button and complete the following fields: - Enter a **Name** for the credentials - **Provider**: Azure - - Select the **Service principal** tab - - Add the Application ID, Secret ID, Secret, **Batch account**, and **Blob Storage account** names to the relevant fields. + - Select the **Entra** tab + - Complete the remaining fields: **Batch account name**, **Blob Storage account name**, **Tenant ID** (Application (client) ID in Azure), **Client ID** (Client secret ID in Azure), **Client secret** (Client secret value in Azure). +1. Delete the ID and secret values from their temporary location after they have been added to a credential in Platform. 
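The role assignments and client secret described in the steps above can also be scripted. The sketch below uses placeholder resource names and mirrors the role names from those steps; adjust the scopes and roles to your own setup:

```bash
# Application (client) ID of the service principal created earlier
APP_ID=$(az ad sp list --display-name seqera-platform-sp --query "[0].appId" --output tsv)

# Grant blob access on the Storage account
STORAGE_ID=$(az storage account show --name seqeracomputestorage --resource-group seqeracompute --query id --output tsv)
az role assignment create --assignee "$APP_ID" --role "Storage Blob Data Contributor" --scope "$STORAGE_ID"

# Grant access on the Batch account
BATCH_ID=$(az batch account show --name seqeracomputebatch --resource-group seqeracompute --query id --output tsv)
az role assignment create --assignee "$APP_ID" --role "Azure Batch Contributor" --scope "$BATCH_ID"

# Create a client secret for the app registration; record the output securely
az ad app credential reset --id "$APP_ID"
```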
## Platform compute environment From 9dd8764d003611dfc89fc86fcb435404b4f99c47 Mon Sep 17 00:00:00 2001 From: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> Date: Mon, 26 Aug 2024 15:06:44 +0200 Subject: [PATCH 12/42] Update platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx Co-authored-by: Christopher Hakkaart Signed-off-by: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> --- .../version-24.1/compute-envs/azure-batch.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index 287364a3d..8b0939376 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -58,7 +58,7 @@ A resource group can be created while creating an Azure Storage Account or Azure ### Storage account -After creating a resource group, set up an [Azure storage account][az-learn-storage]: +After creating a resource group, set up an [Azure Storage account][az-learn-storage]: 1. Log in to your Azure account, go to the [Create storage account][az-create-storage] page, and select **Create a storage account**. :::note From 763fbb18463e3c38af12fb7e510ff778495b1ca7 Mon Sep 17 00:00:00 2001 From: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> Date: Mon, 26 Aug 2024 15:07:02 +0200 Subject: [PATCH 13/42] Update platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx Co-authored-by: Christopher Hakkaart Signed-off-by: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> --- .../version-24.1/compute-envs/azure-batch.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index 8b0939376..3683eba49 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -28,7 +28,7 @@ Ensure you have sufficient permissions to create resource groups, an Azure Stora
**Accounts** - Seqera Platform relies on an existing Azure Storage and Azure Batch account. You need at least one valid Azure Storage and Azure Batch account in your Azure subscription. + Seqera Platform relies on an existing Azure Storage and Azure Batch account. At least one valid Azure Storage and Azure Batch account is required per Azure subscription. Azure uses accounts for each service. For example, an [Azure Storage account][az-learn-storage] will house a collection of blob containers, file shares, queues, and tables. While you can have multiple Azure Storage and Azure Batch accounts in an Azure subscription, a Platform compute environment can only use one of each (one Storage and one Batch account). You can set up multiple Platform compute environments to use different credentials, Storage accounts, and Batch accounts. From 1791337f820dc2dd95cb9dfbc654743f208438dd Mon Sep 17 00:00:00 2001 From: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> Date: Mon, 26 Aug 2024 15:07:24 +0200 Subject: [PATCH 14/42] Update platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx Co-authored-by: Christopher Hakkaart Signed-off-by: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> --- .../version-24.1/compute-envs/azure-batch.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index 3683eba49..50fec3672 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -30,7 +30,7 @@ Ensure you have sufficient permissions to create resource groups, an Azure Stora Seqera Platform relies on an existing Azure Storage and Azure Batch account. At least one valid Azure Storage and Azure Batch account is required per Azure subscription. - Azure uses accounts for each service. For example, an [Azure Storage account][az-learn-storage] will house a collection of blob containers, file shares, queues, and tables. While you can have multiple Azure Storage and Azure Batch accounts in an Azure subscription, a Platform compute environment can only use one of each (one Storage and one Batch account). You can set up multiple Platform compute environments to use different credentials, Storage accounts, and Batch accounts. + Azure uses accounts for each service. For example, an [Azure Storage account][az-learn-storage] will house a collection of blob containers, file shares, queues, and tables. An Azure subscription can have multiple Azure Storage and Azure Batch accounts. However, a Platform compute environment can only use one of each (one Storage and one Batch account). Multiple Platform compute environments can be set to use different credentials, Storage accounts, and Batch accounts.
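If it is unclear which accounts already exist in a subscription, they can be listed with the Azure CLI (assuming it is installed and signed in):

```bash
# List existing Storage and Batch accounts in the active subscription
az storage account list --output table
az batch account list --output table
```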
From 603c333e985d5ecf6bef0312e2e24384c0f86726 Mon Sep 17 00:00:00 2001 From: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> Date: Mon, 26 Aug 2024 17:11:30 +0200 Subject: [PATCH 15/42] Update platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx Co-authored-by: Christopher Hakkaart Signed-off-by: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> --- .../version-24.1/compute-envs/azure-batch.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index 50fec3672..9c9160019 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -16,7 +16,7 @@ Ensure you have sufficient permissions to create resource groups, an Azure Stora
**Regions** - Azure regions are distinct geographic areas that contain multiple data centers strategically located around the world to provide high availability, fault tolerance, and low latency for cloud services. Each region offers a wide range of Azure services — choose a specific region to optimize performance, ensure data residency compliance, and meet regulatory requirements. Azure regions also enable redundancy and disaster recovery options by allowing resources to be replicated across different regions, enhancing the resilience of applications and data. + Azure regions are specific geographic locations around the world where Microsoft has established data centers to host its cloud services. Each Azure region is a collection of data centers that provide users with high availability, fault tolerance, and low latency for cloud services. Each region offers a wide range of Azure services that can be chosen to optimize performance, ensure data residency compliance, and meet regulatory requirements. Azure regions also enable redundancy and disaster recovery options by allowing resources to be replicated across different regions, enhancing the resilience of applications and data.
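Because not every VM series or service is offered in every region, it can help to confirm availability before settling on a region. As an example, the Azure CLI can list the VM SKUs offered in a candidate region; the region and size filter below are placeholders:

```bash
# Check which E-series VM sizes are available in eastus
az vm list-skus --location eastus --size Standard_E --output table
```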
From f63d1e9eda23675bb588e05dea45ea929a844bf1 Mon Sep 17 00:00:00 2001 From: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> Date: Mon, 26 Aug 2024 17:12:17 +0200 Subject: [PATCH 16/42] Update platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx Co-authored-by: Christopher Hakkaart Signed-off-by: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> --- .../version-24.1/compute-envs/azure-batch.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index 9c9160019..7ef4e71c7 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -124,7 +124,7 @@ After you have created a resource group and storage account, create a [Batch acc There are two types of Azure credentials available: access keys and service principals. Access keys are simple to use but provide full access to the Storage and Batch accounts. Azure allows only two access keys per account, making them a single point of failure. A service principal provides an account which can be granted access to Azure Batch and Storage resources. Service principals therefore enable role-based access control with more precise permissions. Moreover, some Azure Batch features are only available when using a service principal. :::note -The two Azure credential types use entirely different authentication methods. You can add more than one credential to a workspace, but Platform compute environments use only one credential at any given time. While separate credentials can be used by separate compute environments concurrently, they are not cross-compatible — access granted by one credential will not be shared with the other. +The two Azure credential types use different authentication methods. You can add more than one credential to a workspace, but Platform compute environments use only one credential at any given time. While separate credentials can be used by separate compute environments concurrently, they are not cross-compatible — access granted by one credential will not be shared with the other. ::: #### Access keys From 1d588cca0cb56431a7a61b1e7357437ed9e67918 Mon Sep 17 00:00:00 2001 From: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> Date: Tue, 27 Aug 2024 17:50:25 +0200 Subject: [PATCH 17/42] Update platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx Co-authored-by: Christopher Hakkaart Signed-off-by: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> --- .../version-24.1/compute-envs/azure-batch.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index 7ef4e71c7..1705e724c 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -78,7 +78,7 @@ After creating a resource group, set up an [Azure Storage account][az-learn-stor 1. In **tags**, add any required tags for the storage account. 1. Select **Review and Create**. 1. Select **Create** to create the Azure Storage account. -1. You will need at least one blob storage container to act as a working directory for Nextflow. 
+ - You will need at least one blob storage container to act as a working directory for Nextflow. 1. Go to your new storage account and select **+ Container** to create a new Blob storage container. A new container dialogue will open. Enter a suitable name, e.g., _towerrgstorage-container_. 1. Go to the **Access Keys** section of your new storage account (_towerrgstorage_ in this example). 1. Store the access keys for your Azure Storage account, to be used when you create a Seqera compute environment. From ce99f5402c16d1e137467186531eabc9328eca0e Mon Sep 17 00:00:00 2001 From: llewellyn-sl Date: Wed, 28 Aug 2024 11:25:51 +0200 Subject: [PATCH 18/42] Update azure-batch.mdx --- .../version-24.1/compute-envs/azure-batch.mdx | 84 ++++++++++--------- 1 file changed, 46 insertions(+), 38 deletions(-) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index 1705e724c..2c5d0d4f6 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -13,32 +13,23 @@ Ensure you have sufficient permissions to create resource groups, an Azure Stora ## Azure concepts -
- **Regions** +#### Regions - Azure regions are specific geographic locations around the world where Microsoft has established data centers to host its cloud services. Each Azure region is a collection of data centers that provide users with high availability, fault tolerance, and low latency for cloud services. Each region offers a wide range of Azure services that can be chosen to optimize performance, ensure data residency compliance, and meet regulatory requirements. Azure regions also enable redundancy and disaster recovery options by allowing resources to be replicated across different regions, enhancing the resilience of applications and data. +Azure regions are specific geographic locations around the world where Microsoft has established data centers to host its cloud services. Each Azure region is a collection of data centers that provide users with high availability, fault tolerance, and low latency for cloud services. Each region offers a wide range of Azure services that can be chosen to optimize performance, ensure data residency compliance, and meet regulatory requirements. Azure regions also enable redundancy and disaster recovery options by allowing resources to be replicated across different regions, enhancing the resilience of applications and data. -
-
- **Resource groups** +#### Resource groups - An Azure resource group is a logical container that holds related Azure resources such as virtual machines, storage accounts, databases, and more. A resource group serves as a management boundary to organize, deploy, monitor, and manage the resources within it as a single entity. Resources in a resource group share the same lifecycle, meaning they can be deployed, updated, and deleted together. This also enables easier access control, monitoring, and cost management, making resource groups a foundational element in organizing and managing cloud infrastructure in Azure. +An Azure resource group is a logical container that holds related Azure resources such as virtual machines, storage accounts, databases, and more. A resource group serves as a management boundary to organize, deploy, monitor, and manage the resources within it as a single entity. Resources in a resource group share the same lifecycle, meaning they can be deployed, updated, and deleted together. This also enables easier access control, monitoring, and cost management, making resource groups a foundational element in organizing and managing cloud infrastructure in Azure. -
-
- **Accounts** +#### Accounts - Seqera Platform relies on an existing Azure Storage and Azure Batch account. At least one valid Azure Storage and Azure Batch account is required per Azure subscription. +Seqera Platform relies on an existing Azure Storage and Azure Batch account. At least one valid Azure Storage and Azure Batch account is required per Azure subscription. - Azure uses accounts for each service. For example, an [Azure Storage account][az-learn-storage] will house a collection of blob containers, file shares, queues, and tables. An Azure subscription can have multiple Azure Storage and Azure Batch accounts. However, a Platform compute environment can only use one of each (one Storage and one Batch account). Multiple Platform compute environments can be set to use different credentials, Storage accounts, and Batch accounts. +Azure uses accounts for each service. For example, an [Azure Storage account][az-learn-storage] will house a collection of blob containers, file shares, queues, and tables. An Azure subscription can have multiple Azure Storage and Azure Batch accounts. However, a Platform compute environment can only use one of each (one Storage and one Batch account). Multiple Platform compute environments can be set to use different credentials, Storage accounts, and Batch accounts. -
-
- **Service principals** +#### Service principals - An Azure service principal is an identity created for use with applications, hosted services, or automated tools to access Azure resources. A service principal is effectively a user identity with a set of permissions assigned to it. Platform can use an Azure service principal to authenticate and authorize access to Azure Batch for job execution and Azure Storage for data management. By assigning the necessary roles to the service principal, Platform can securely interact with these Azure services. This ensures that only authorized operations are performed during pipeline execution. - -
+An Azure service principal is an identity created for use with applications, hosted services, or automated tools to access Azure resources. A service principal is effectively a user identity with a set of permissions assigned to it. Platform can use an Azure service principal to authenticate and authorize access to Azure Batch for job execution and Azure Storage for data management. By assigning the necessary roles to the service principal, Platform can securely interact with these Azure services. This ensures that only authorized operations are performed during pipeline execution. ## Create Azure resources @@ -121,7 +112,17 @@ After you have created a resource group and storage account, create a [Batch acc ### Credentials -There are two types of Azure credentials available: access keys and service principals. Access keys are simple to use but provide full access to the Storage and Batch accounts. Azure allows only two access keys per account, making them a single point of failure. A service principal provides an account which can be granted access to Azure Batch and Storage resources. Service principals therefore enable role-based access control with more precise permissions. Moreover, some Azure Batch features are only available when using a service principal. +There are two types of Azure credentials available: access keys and Entra service principals. + +Access keys are simple to use but have several limitations: +- Access keys are long-lived. +- Access keys provide full access to the Storage and Batch accounts. +- Azure allows only two access keys per account, making them a single point of failure. + +Entra service principals are accounts which can be granted access to Azure Batch and Storage resources: +- Service principals enable role-based access control with more precise permissions. +- Service principals map to a many-to-many relationship with Batch and Storage accounts. +- Some Azure Batch features are only available when using a service principal. :::note The two Azure credential types use different authentication methods. You can add more than one credential to a workspace, but Platform compute environments use only one credential at any given time. While separate credentials can be used by separate compute environments concurrently, they are not cross-compatible — access granted by one credential will not be shared with the other. @@ -129,6 +130,10 @@ The two Azure credential types use different authentication methods. You can add #### Access keys +:::info +Batch Forge compute environments must use access keys for authentication. Service principals are only supported in manual compute environments. +::: + To create an access key: 1. Navigate to the Azure Portal and sign in. @@ -141,13 +146,15 @@ To create an access key: - Add the **Batch account** and **Blob Storage account** names and access keys to the relevant fields. 1. Delete the copied keys from their temporary location after they have been added to a credential in Platform. -#### Service principal +#### Entra service principal -:::note -See [Create a service principal][az-create-sp] for more details. +:::info +Batch Forge compute environments must use access keys for authentication. Service principals are only supported in manual compute environments. ::: -To create a service principal: +See [Create a service principal][az-create-sp] for more details. + +To create an Entra service principal: 1. In the Azure Portal, navigate to **Microsoft Entra ID**. Under **App registrations**, select **New registration**. 1. 
Provide a name for the application. The application will automatically have a service principal associated with it. @@ -182,11 +189,14 @@ Batch Forge automatically creates resources that you may be charged for in your Create a Batch Forge Azure Batch compute environment: 1. In a workspace, select **Compute Environments > New Environment**. -1. Enter a descriptive name, e.g., _Azure Batch (east-us)_. +1. Enter a descriptive name, such as _Azure Batch (east-us)_. 1. Select **Azure Batch** as the target platform. 1. Choose existing Azure credentials or add a new credential. -1. Select a **Region**, e.g., _eastus_. -1. In the **Pipeline work directory** field, enter the Azure blob container created previously, e.g., `az://towerrgstorage-container/work`. + :::info + Batch Forge compute environments must use access keys for authentication. Entra service principals are only supported in manual compute environments. + ::: +1. Select a **Region**, such as _eastus_. +1. In the **Pipeline work directory** field, enter the Azure blob container created previously. For example, `az://towerrgstorage-container/work`. :::note When you specify a Blob Storage bucket as your work directory, this bucket is used for the Nextflow [cloud cache](https://www.nextflow.io/docs/latest/cache-and-resume.html#cache-stores) by default. You can specify an alternative cache location with the **Nextflow config file** field on the pipeline [launch](../launch/launchpad.mdx#launch-form) form. ::: @@ -218,26 +228,24 @@ This section is for users with a pre-configured Azure Batch pool. This requires Your Seqera compute environment uses resources that you may be charged for in your Azure account. See [Cloud costs](../monitoring/cloud-costs.mdx) for guidelines to manage cloud resources effectively and prevent unexpected costs. ::: -**Create a manual Seqera Azure Batch compute environment** +Create a manual Seqera Azure Batch compute environment: 1. In a workspace, select **Compute Environments > New Environment**. -1. Enter a descriptive name for this environment, e.g., _Azure Batch (east-us)_. -1. Select **Azure Batch** as the target platform. -1. Select your existing Azure credentials or select **+** to add new credentials. If you choose to use existing credentials, skip to step 7. - :::tip - You can create multiple credentials in your Seqera environment. +1. Enter a descriptive name for this environment, such as _Azure Batch (east-us)_. +1. For **Provider**, select **Azure Batch**. +1. Select your existing Azure credentials (access keys or Entra service principal) or select **+** to add new credentials. + :::note + To authenticate using an Entra service principal, you must include a user-assigned managed identity. See [Managed identity](#managed-identity) below. ::: -1. Enter a name, e.g., _Azure Credentials_. -1. Add the **Batch account** and **Blob Storage** credentials you created previously. -1. Select a **Region**, e.g., _eastus (East US)_. -1. In the **Pipeline work directory** field, add the Azure blob container created previously, e.g., `az://towerrgstorage-container/work`. +1. Select a **Region**, such as _eastus (East US)_. +1. In the **Pipeline work directory** field, add the Azure blob container created previously. For example, `az://towerrgstorage-container/work`. :::note When you specify a Blob Storage bucket as your work directory, this bucket is used for the Nextflow [cloud cache](https://www.nextflow.io/docs/latest/cache-and-resume.html#cache-stores) by default. 
You can specify an alternative cache location with the **Nextflow config file** field on the pipeline [launch](../launch/launchpad.mdx#launch-form) form. ::: 1. Set the **Config mode** to **Manual**. 1. Enter the **Compute Pool name**. This is the name of the Azure Batch pool you created previously in the Azure Batch account. :::note - The default Azure Batch implementation uses a single pool for head and compute nodes. To use separate pools for head and compute nodes (e.g., to use low-priority VMs for compute jobs), see [this FAQ entry](../faqs.mdx#azure). + The default Azure Batch implementation uses a single pool for head and compute nodes. To use separate pools for head and compute nodes (for example, to use low-priority VMs for compute jobs), see [this FAQ entry](../faqs.mdx#azure). ::: 1. Enter a user-assigned **Managed identity client ID**, if one is attached to your Azure Batch pool. See [Managed Identity](#managed-identity) below. 1. Apply [**Resource labels**](../resource-labels/overview.mdx). This will populate the **Metadata** fields of the Azure Batch pool. @@ -258,7 +266,7 @@ When you use a manually configured compute environment with a managed identity a 1. In Azure, create a user-assigned managed identity. See [Manage user-assigned managed identities](https://learn.microsoft.com/en-us/entra/identity/managed-identities-azure-resources/how-manage-user-assigned-managed-identities) for detailed steps. After creation, record the Client ID of the managed identity. 1. The user-assigned managed identity must have the necessary access roles for Nextflow. See [Required role assignments](https://www.nextflow.io/docs/latest/azure.html#required-role-assignments) for more information. -1. Associate the user-assigned managed identity with the Azure Batch Pool. See [Set up managed identity in your batch pool](https://learn.microsoft.com/en-us/troubleshoot/azure/hpc/batch/use-managed-identities-azure-batch-account-pool#set-up-managed-identity-in-your-batch-pool) for more information. +1. Associate the user-assigned managed identity with the Azure Batch Pool. See [Set up managed identity in your Batch pool](https://learn.microsoft.com/en-us/troubleshoot/azure/hpc/batch/use-managed-identities-azure-batch-account-pool#set-up-managed-identity-in-your-batch-pool) for more information. 1. When you set up the Platform compute environment, select the Azure Batch pool by name and enter the managed identity client ID in the specified field as instructed above. When you submit a pipeline to this compute environment, Nextflow will authenticate using the managed identity associated with the Azure Batch node it runs on, rather than relying on access keys. 
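As a rough CLI sketch of the managed identity setup described above, using placeholder names (the exact roles to grant are listed in the Nextflow documentation linked above):

```bash
# Create a user-assigned managed identity and capture its client ID for the Platform form
az identity create --name seqera-batch-identity --resource-group seqeracompute
az identity show --name seqera-batch-identity --resource-group seqeracompute --query clientId --output tsv

# Example role assignment: allow the identity to read and write the pipeline work directory
PRINCIPAL_ID=$(az identity show --name seqera-batch-identity --resource-group seqeracompute --query principalId --output tsv)
STORAGE_ID=$(az storage account show --name seqeracomputestorage --resource-group seqeracompute --query id --output tsv)
az role assignment create --assignee "$PRINCIPAL_ID" --role "Storage Blob Data Contributor" --scope "$STORAGE_ID"
```

Attaching the identity to the Batch pool itself is configured on the pool, as described in the Azure guide linked in the steps above.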
From 02fe628a03e11942975acbd249d2cb1a6a9d38db Mon Sep 17 00:00:00 2001 From: adamrtalbot <12817534+adamrtalbot@users.noreply.github.com> Date: Wed, 28 Aug 2024 10:33:06 +0100 Subject: [PATCH 19/42] Replace all towerrg instances with seqeraplatform Signed-off-by: adamrtalbot <12817534+adamrtalbot@users.noreply.github.com> --- .../version-24.1/compute-envs/azure-batch.mdx | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index 1705e724c..342c19d59 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -51,7 +51,7 @@ A resource group can be created while creating an Azure Storage Account or Azure ::: 1. Log in to your Azure account, go to the [Create Resource group][az-create-rg] page, and select **Create new resource group**. -1. Enter a name for the resource group, such as _towerrg_. +1. Enter a name for the resource group, such as _seqeracompute_. 1. Choose the preferred region. 1. Select **Review and Create** to proceed. 1. Select **Create**. @@ -64,7 +64,7 @@ After creating a resource group, set up an [Azure Storage account][az-learn-stor :::note If you haven't created a resource group, you can do so now. ::: -1. Enter a name for the storage account, such as _towerrgstorage_. +1. Enter a name for the storage account, such as _seqeracomputestorage_. 1. Choose the preferred region. This must be the same region as the Batch account. 1. The platform supports any performance or redundancy settings — select the most appropriate settings for your use case. 1. Select **Next: Advanced**. @@ -79,8 +79,8 @@ After creating a resource group, set up an [Azure Storage account][az-learn-stor 1. Select **Review and Create**. 1. Select **Create** to create the Azure Storage account. - You will need at least one blob storage container to act as a working directory for Nextflow. -1. Go to your new storage account and select **+ Container** to create a new Blob storage container. A new container dialogue will open. Enter a suitable name, e.g., _towerrgstorage-container_. -1. Go to the **Access Keys** section of your new storage account (_towerrgstorage_ in this example). +1. Go to your new storage account and select **+ Container** to create a new Blob storage container. A new container dialogue will open. Enter a suitable name, e.g., _seqeracomputestorage-container_. +1. Go to the **Access Keys** section of your new storage account (_seqeracomputestorage_ in this example). 1. Store the access keys for your Azure Storage account, to be used when you create a Seqera compute environment. :::caution @@ -93,7 +93,7 @@ After you have created a resource group and storage account, create a [Batch acc 1. Log in to your Azure account and select **Create a batch account** on [this page][az-create-batch]. 1. Select the existing resource group or create a new one. -1. Enter a name for the Batch account, such as _towerrgbatch_. +1. Enter a name for the Batch account, such as _seqeracomputebatch_. 1. Choose the preferred region. This must be the same region as the Storage account. 1. Select **Advanced**. 1. For **Pool allocation mode**, select Batch service. @@ -186,7 +186,7 @@ Create a Batch Forge Azure Batch compute environment: 1. Select **Azure Batch** as the target platform. 1. Choose existing Azure credentials or add a new credential. 1. 
Select a **Region**, e.g., _eastus_. -1. In the **Pipeline work directory** field, enter the Azure blob container created previously, e.g., `az://towerrgstorage-container/work`. +1. In the **Pipeline work directory** field, enter the Azure blob container created previously, e.g., `az://seqeracomputestorage-container/work`. :::note When you specify a Blob Storage bucket as your work directory, this bucket is used for the Nextflow [cloud cache](https://www.nextflow.io/docs/latest/cache-and-resume.html#cache-stores) by default. You can specify an alternative cache location with the **Nextflow config file** field on the pipeline [launch](../launch/launchpad.mdx#launch-form) form. ::: @@ -230,7 +230,7 @@ Your Seqera compute environment uses resources that you may be charged for in yo 1. Enter a name, e.g., _Azure Credentials_. 1. Add the **Batch account** and **Blob Storage** credentials you created previously. 1. Select a **Region**, e.g., _eastus (East US)_. -1. In the **Pipeline work directory** field, add the Azure blob container created previously, e.g., `az://towerrgstorage-container/work`. +1. In the **Pipeline work directory** field, add the Azure blob container created previously, e.g., `az://seqeracomputestorage-container/work`. :::note When you specify a Blob Storage bucket as your work directory, this bucket is used for the Nextflow [cloud cache](https://www.nextflow.io/docs/latest/cache-and-resume.html#cache-stores) by default. You can specify an alternative cache location with the **Nextflow config file** field on the pipeline [launch](../launch/launchpad.mdx#launch-form) form. ::: From 17e12a05b19605f8cebca1d380a2c403af54ef84 Mon Sep 17 00:00:00 2001 From: llewellyn-sl Date: Wed, 28 Aug 2024 12:43:43 +0200 Subject: [PATCH 20/42] Add Entra, clarify service principal requiring managed identity --- .../version-24.1/compute-envs/azure-batch.mdx | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index 2c5d0d4f6..a87b1418b 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -149,7 +149,9 @@ To create an access key: #### Entra service principal :::info -Batch Forge compute environments must use access keys for authentication. Service principals are only supported in manual compute environments. +Batch Forge compute environments must use access keys for authentication. Service principals are only supported in manual compute environments. + +The use of Entra service principals in manual compute environments requires the use of a [managed identity](#managed-identity). ::: See [Create a service principal][az-create-sp] for more details. @@ -262,7 +264,7 @@ Create a manual Seqera Azure Batch compute environment: Nextflow can authenticate to Azure services using a managed identity. This method offers enhanced security compared to access keys, but must run on Azure infrastructure. -When you use a manually configured compute environment with a managed identity attached to the Azure Batch Pool, Nextflow can use this managed identity for authentication. However, Platform still needs to use access keys to submit the initial task to Azure Batch to run Nextflow, which will then proceed with the managed identity for subsequent authentication. 
+When you use a manually configured compute environment with a managed identity attached to the Azure Batch Pool, Nextflow can use this managed identity for authentication. However, Platform still needs to use access keys or an Entra service principal to submit the initial task to Azure Batch to run Nextflow, which will then proceed with the managed identity for subsequent authentication. 1. In Azure, create a user-assigned managed identity. See [Manage user-assigned managed identities](https://learn.microsoft.com/en-us/entra/identity/managed-identities-azure-resources/how-manage-user-assigned-managed-identities) for detailed steps. After creation, record the Client ID of the managed identity. 1. The user-assigned managed identity must have the necessary access roles for Nextflow. See [Required role assignments](https://www.nextflow.io/docs/latest/azure.html#required-role-assignments) for more information. From 7d01e024fba33ea51c594f9b50fb65d1ef6b2d40 Mon Sep 17 00:00:00 2001 From: llewellyn-sl Date: Wed, 28 Aug 2024 14:45:35 +0200 Subject: [PATCH 21/42] Update azure-batch.mdx --- .../version-24.1/compute-envs/azure-batch.mdx | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index 60eb56811..7b0f3ac19 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -254,6 +254,9 @@ Create a manual Seqera Azure Batch compute environment: 1. Expand **Staging options** to include: - Optional [pre- or post-run Bash scripts](../launch/advanced.mdx#pre--post-run-scripts) that execute before or after the Nextflow pipeline execution in your environment. - Global Nextflow configuration settings for all pipeline runs launched with this compute environment. Configuration settings in this field override the same values in the pipeline Nextflow config file. + :::info + To use managed identities, Platform requires requires Nextflow version 24.06.0-edge or later. Add `export NXF_VER=24.06.0-edge` to the **Global Nextflow config** field for your compute environment to use this Nextflow version by default. + ::: 1. Define custom **Environment Variables** for the **Head Job** and/or **Compute Jobs**. 1. Configure any necessary advanced options: - Use **Jobs cleanup policy** to control how Nextflow process jobs are deleted on completion. Active jobs consume the quota of the Azure Batch account. By default, jobs are terminated by Nextflow and removed from the quota when all tasks succesfully complete. If set to _Always_, all jobs are deleted by Nextflow after pipeline completion. If set to _Never_, jobs are never deleted. If set to _On success_, successful tasks are removed but failed tasks will be left for debugging purposes. @@ -262,6 +265,10 @@ Create a manual Seqera Azure Batch compute environment: ### Managed identity +:::info +To use managed identities, Platform requires requires Nextflow version 24.06.0-edge or later. Add `export NXF_VER=24.06.0-edge` to the **Global Nextflow config** field in advanced options for your compute environment to use this Nextflow version by default. +::: + Nextflow can authenticate to Azure services using a managed identity. This method offers enhanced security compared to access keys, but must run on Azure infrastructure. 
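For reference, the Nextflow version pin referred to in the admonitions above is a single line placed in the **Global Nextflow config** field of the compute environment. This is a minimal sketch; `24.06.0-edge` is the minimum version stated in the text above.

```bash
# Pin the Nextflow version launched by the head job so that managed identity
# support is available; place this line in the Global Nextflow config field.
export NXF_VER=24.06.0-edge
```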
When you use a manually configured compute environment with a managed identity attached to the Azure Batch Pool, Nextflow can use this managed identity for authentication. However, Platform still needs to use access keys or an Entra service principal to submit the initial task to Azure Batch to run Nextflow, which will then proceed with the managed identity for subsequent authentication. From aa21ad3d333e7abdbfcf5954e0811a20c38c49b3 Mon Sep 17 00:00:00 2001 From: llewellyn-sl Date: Wed, 28 Aug 2024 15:06:36 +0200 Subject: [PATCH 22/42] Update azure-batch.mdx --- .../version-24.1/compute-envs/azure-batch.mdx | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index 7b0f3ac19..83bb36955 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -255,7 +255,7 @@ Create a manual Seqera Azure Batch compute environment: - Optional [pre- or post-run Bash scripts](../launch/advanced.mdx#pre--post-run-scripts) that execute before or after the Nextflow pipeline execution in your environment. - Global Nextflow configuration settings for all pipeline runs launched with this compute environment. Configuration settings in this field override the same values in the pipeline Nextflow config file. :::info - To use managed identities, Platform requires requires Nextflow version 24.06.0-edge or later. Add `export NXF_VER=24.06.0-edge` to the **Global Nextflow config** field for your compute environment to use this Nextflow version by default. + To use managed identities, Platform requires Nextflow version 24.06.0-edge or later. Add `export NXF_VER=24.06.0-edge` to the **Global Nextflow config** field for your compute environment to use this Nextflow version by default. ::: 1. Define custom **Environment Variables** for the **Head Job** and/or **Compute Jobs**. 1. Configure any necessary advanced options: @@ -266,7 +266,7 @@ Create a manual Seqera Azure Batch compute environment: ### Managed identity :::info -To use managed identities, Platform requires requires Nextflow version 24.06.0-edge or later. Add `export NXF_VER=24.06.0-edge` to the **Global Nextflow config** field in advanced options for your compute environment to use this Nextflow version by default. +To use managed identities, Platform requires requires Nextflow version 24.06.0-edge or later. Add `export NXF_VER=24.06.0-edge` to the **Global Nextflow config** field in advanced options for your compute environment to use this Nextflow version by default (see manual instructions above). ::: Nextflow can authenticate to Azure services using a managed identity. This method offers enhanced security compared to access keys, but must run on Azure infrastructure. 
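As an illustration of the managed identity steps referenced above, the following is a minimal Azure CLI sketch for creating a user-assigned managed identity and granting it access to the Storage and Batch accounts. The identity name and the role choices are assumptions for illustration only; the authoritative list of role assignments Nextflow needs is in the Nextflow documentation linked above.

```bash
# Assumes the resource group, Storage account, and Batch account names used as examples in this guide.
RG=seqeracompute
STORAGE=seqeracomputestorage
BATCH=seqeracomputebatch

# Create a user-assigned managed identity and record its client ID
# (the client ID is entered in the Platform compute environment form).
az identity create --resource-group "$RG" --name seqera-nextflow-identity
CLIENT_ID=$(az identity show --resource-group "$RG" --name seqera-nextflow-identity --query clientId -o tsv)
PRINCIPAL_ID=$(az identity show --resource-group "$RG" --name seqera-nextflow-identity --query principalId -o tsv)

# Grant the identity access to the Storage and Batch accounts.
# The roles below are placeholders; use the role assignments required by Nextflow.
STORAGE_ID=$(az storage account show --resource-group "$RG" --name "$STORAGE" --query id -o tsv)
BATCH_ID=$(az batch account show --resource-group "$RG" --name "$BATCH" --query id -o tsv)
az role assignment create --assignee "$PRINCIPAL_ID" --role "Storage Blob Data Contributor" --scope "$STORAGE_ID"
az role assignment create --assignee "$PRINCIPAL_ID" --role "Contributor" --scope "$BATCH_ID"

echo "Managed identity client ID: $CLIENT_ID"
```

The identity must then be attached to the Azure Batch pool used by the compute environment, as described in the steps above.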
From e5144af8fa3cf472c8027f686167288df41ba543 Mon Sep 17 00:00:00 2001 From: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> Date: Wed, 28 Aug 2024 15:10:33 +0200 Subject: [PATCH 23/42] Update platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx Co-authored-by: Adam Talbot <12817534+adamrtalbot@users.noreply.github.com> Signed-off-by: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> --- .../version-24.1/compute-envs/azure-batch.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index 83bb36955..aed71cf10 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -90,7 +90,7 @@ After you have created a resource group and storage account, create a [Batch acc 1. For **Pool allocation mode**, select Batch service. 1. For **Authentication mode**, select _Shared Key_. 1. Select **Networking**. Ensure networking access is sufficient for Platform and any additional required resources. -1. In **tags**, add any required tags for the Batch account. +1. In **tags**, if necessary, you can add any tags to the Batch account 1. Select **Review and Create**. 1. Select **Create**. 1. Go to your new Batch account, then select **Access Keys**. From 70ac0a5d977915126307625f9e0f7cea20ad6e2e Mon Sep 17 00:00:00 2001 From: llewellyn-sl Date: Wed, 28 Aug 2024 15:22:06 +0200 Subject: [PATCH 24/42] Update azure-batch.mdx --- .../version-24.1/compute-envs/azure-batch.mdx | 2 -- 1 file changed, 2 deletions(-) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index 83bb36955..ec0ebee8b 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -23,8 +23,6 @@ An Azure resource group is a logical container that holds related Azure resource #### Accounts -Seqera Platform relies on an existing Azure Storage and Azure Batch account. At least one valid Azure Storage and Azure Batch account is required per Azure subscription. - Azure uses accounts for each service. For example, an [Azure Storage account][az-learn-storage] will house a collection of blob containers, file shares, queues, and tables. An Azure subscription can have multiple Azure Storage and Azure Batch accounts. However, a Platform compute environment can only use one of each (one Storage and one Batch account). Multiple Platform compute environments can be set to use different credentials, Storage accounts, and Batch accounts. 
#### Service principals From 7848704e2b267436754c09138c592f0356309816 Mon Sep 17 00:00:00 2001 From: llewellyn-sl Date: Wed, 28 Aug 2024 15:26:48 +0200 Subject: [PATCH 25/42] Fix link --- .../version-24.1/compute-envs/azure-batch.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index 4d2519b68..fb9d5d382 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -291,7 +291,7 @@ When you submit a pipeline to this compute environment, Nextflow will authentica [az-learn-jobs]: https://learn.microsoft.com/en-us/azure/batch/jobs-and-tasks [az-create-rg]: https://portal.azure.com/#create/Microsoft.ResourceGroup [az-create-storage]: https://portal.azure.com/#create/Microsoft.StorageAccount-ARM -[az-create-sp]: (https://learn.microsoft.com/en-us/entra/identity-platform/howto-create-service-principal-portal) +[az-create-sp]: https://learn.microsoft.com/en-us/entra/identity-platform/howto-create-service-principal-portal [wave-docs]: https://docs.seqera.io/wave [nf-fusion-docs]: https://www.nextflow.io/docs/latest/fusion.html \ No newline at end of file From 546d170644f482629cfb2df49a4177e459cfd50a Mon Sep 17 00:00:00 2001 From: Mattia Bosio Date: Thu, 29 Aug 2024 13:43:59 +0200 Subject: [PATCH 26/42] Update platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx Co-authored-by: Adam Talbot <12817534+adamrtalbot@users.noreply.github.com> Signed-off-by: Mattia Bosio --- .../version-24.1/compute-envs/azure-batch.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index fb9d5d382..a062f1c09 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -27,7 +27,7 @@ Azure uses accounts for each service. For example, an [Azure Storage account][az #### Service principals -An Azure service principal is an identity created for use with applications, hosted services, or automated tools to access Azure resources. A service principal is effectively a user identity with a set of permissions assigned to it. Platform can use an Azure service principal to authenticate and authorize access to Azure Batch for job execution and Azure Storage for data management. By assigning the necessary roles to the service principal, Platform can securely interact with these Azure services. This ensures that only authorized operations are performed during pipeline execution. +An Azure service principal is an identity created specifically for applications, hosted services, or automated tools to access Azure resources. It acts like a user identity with a defined set of permissions, enabling resources authenticated through the service principal to perform actions within the Azure account. The platform can utilize an Azure service principal to authenticate and access Azure Batch for job execution and Azure Storage for data management. 
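As a sketch of how such a service principal can be created and granted access from the command line, assuming illustrative names and role choices that are not taken from the patches above:

```bash
# Register an application and create its service principal in one step.
# The JSON output includes appId (client ID), password (client secret), and tenant (tenant ID).
az ad sp create-for-rbac --name seqera-platform-sp

# Grant the service principal access to the example Storage and Batch accounts.
RG=seqeracompute
APP_ID=<appId-from-the-output-above>   # placeholder
STORAGE_ID=$(az storage account show --resource-group "$RG" --name seqeracomputestorage --query id -o tsv)
BATCH_ID=$(az batch account show --resource-group "$RG" --name seqeracomputebatch --query id -o tsv)
az role assignment create --assignee "$APP_ID" --role "Storage Blob Data Contributor" --scope "$STORAGE_ID"
az role assignment create --assignee "$APP_ID" --role "Contributor" --scope "$BATCH_ID"
```

Store the client secret securely; it is displayed only when the service principal is created.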
## Create Azure resources From 44526e6ac4e9937eb02f09f69304cf531e33196f Mon Sep 17 00:00:00 2001 From: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> Date: Fri, 11 Oct 2024 19:14:36 +0200 Subject: [PATCH 27/42] Update platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx Co-authored-by: Justine Geffen Signed-off-by: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> --- .../version-24.1/compute-envs/azure-batch.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index cf38ab36a..b929b8f7a 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -23,7 +23,7 @@ An Azure resource group is a logical container that holds related Azure resource #### Accounts -Azure uses accounts for each service. For example, an [Azure Storage account][az-learn-storage] will house a collection of blob containers, file shares, queues, and tables. An Azure subscription can have multiple Azure Storage and Azure Batch accounts. However, a Platform compute environment can only use one of each (one Storage and one Batch account). Multiple Platform compute environments can be set to use different credentials, Storage accounts, and Batch accounts. +Azure uses accounts for each service. For example, an [Azure Storage account][az-learn-storage] will house a collection of blob containers, file shares, queues, and tables. An Azure subscription can have multiple Azure Storage and Azure Batch accounts - however, a Platform compute environment can only use one of each. Multiple Platform compute environments can be set to use different credentials, Azure Storage accounts, and Azure Batch accounts. #### Service principals From 919d58c886125d576252651b2bd707a8d9f6ea56 Mon Sep 17 00:00:00 2001 From: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> Date: Fri, 11 Oct 2024 19:14:53 +0200 Subject: [PATCH 28/42] Update platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx Co-authored-by: Justine Geffen Signed-off-by: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> --- .../version-24.1/compute-envs/azure-batch.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index b929b8f7a..2647bb9ab 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -85,7 +85,7 @@ After you have created a resource group and storage account, create a [Batch acc 1. Enter a name for the Batch account, such as _seqeracomputebatch_. 1. Choose the preferred region. This must be the same region as the Storage account. 1. Select **Advanced**. -1. For **Pool allocation mode**, select Batch service. +1. For **Pool allocation mode**, select **Batch service**. 1. For **Authentication mode**, select _Shared Key_. 1. Select **Networking**. Ensure networking access is sufficient for Platform and any additional required resources. 1. 
In **tags**, if necessary, you can add any tags to the Batch account From 88c2dbfa448936e5417cb9692e4039e8ead0ded6 Mon Sep 17 00:00:00 2001 From: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> Date: Fri, 11 Oct 2024 19:15:09 +0200 Subject: [PATCH 29/42] Update platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx Co-authored-by: Justine Geffen Signed-off-by: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> --- .../version-24.1/compute-envs/azure-batch.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index 2647bb9ab..418a6131f 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -114,7 +114,7 @@ There are two types of Azure credentials available: access keys and Entra servic Access keys are simple to use but have several limitations: - Access keys are long-lived. -- Access keys provide full access to the Storage and Batch accounts. +- Access keys provide full access to the Azure Storage and Azure Batch accounts. - Azure allows only two access keys per account, making them a single point of failure. Entra service principals are accounts which can be granted access to Azure Batch and Storage resources: From 6fe50eb88a66213940cbf0b6c517d5a6c50ac055 Mon Sep 17 00:00:00 2001 From: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> Date: Fri, 11 Oct 2024 19:15:24 +0200 Subject: [PATCH 30/42] Update platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx Co-authored-by: Justine Geffen Signed-off-by: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> --- .../version-24.1/compute-envs/azure-batch.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index 418a6131f..448fe017e 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -117,7 +117,7 @@ Access keys are simple to use but have several limitations: - Access keys provide full access to the Azure Storage and Azure Batch accounts. - Azure allows only two access keys per account, making them a single point of failure. -Entra service principals are accounts which can be granted access to Azure Batch and Storage resources: +Entra service principals are accounts which can be granted access to Azure Batch and Azure Storage resources: - Service principals enable role-based access control with more precise permissions. - Service principals map to a many-to-many relationship with Batch and Storage accounts. - Some Azure Batch features are only available when using a service principal. 
From 105939efbe8d2b55adcaec71c5d17103276feab1 Mon Sep 17 00:00:00 2001 From: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> Date: Fri, 11 Oct 2024 19:15:33 +0200 Subject: [PATCH 31/42] Update platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx Co-authored-by: Justine Geffen Signed-off-by: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> --- .../version-24.1/compute-envs/azure-batch.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index 448fe017e..d8ae23378 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -119,7 +119,7 @@ Access keys are simple to use but have several limitations: Entra service principals are accounts which can be granted access to Azure Batch and Azure Storage resources: - Service principals enable role-based access control with more precise permissions. -- Service principals map to a many-to-many relationship with Batch and Storage accounts. +- Service principals map to a many-to-many relationship with Azure Batch and Azure Storage accounts. - Some Azure Batch features are only available when using a service principal. :::note From a9b8d32e8807d7f2224355e242eb8c35bfc6710f Mon Sep 17 00:00:00 2001 From: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> Date: Fri, 11 Oct 2024 19:35:41 +0200 Subject: [PATCH 32/42] Update platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx Signed-off-by: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> --- .../version-24.1/compute-envs/azure-batch.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index d8ae23378..3510300ee 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -8,7 +8,7 @@ tags: [azure, batch, compute environment] :::note This guide assumes you already have an Azure account with a valid Azure Subscription. For details, visit [Azure Free Account][az-create-account]. -Ensure you have sufficient permissions to create resource groups, an Azure Storage account, and a Batch account. +Ensure you have sufficient permissions to create resource groups, an Azure Storage account, and an Azure Batch account. 
::: ## Azure concepts From 10094895d86adc9a32ab7f0d679faa3d74deebd7 Mon Sep 17 00:00:00 2001 From: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> Date: Fri, 11 Oct 2024 19:35:51 +0200 Subject: [PATCH 33/42] Update platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx Signed-off-by: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> --- .../version-24.1/compute-envs/azure-batch.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index 3510300ee..84e3d2161 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -23,7 +23,7 @@ An Azure resource group is a logical container that holds related Azure resource #### Accounts -Azure uses accounts for each service. For example, an [Azure Storage account][az-learn-storage] will house a collection of blob containers, file shares, queues, and tables. An Azure subscription can have multiple Azure Storage and Azure Batch accounts - however, a Platform compute environment can only use one of each. Multiple Platform compute environments can be set to use different credentials, Azure Storage accounts, and Azure Batch accounts. +Azure uses accounts for each service. For example, an [Azure Storage account][az-learn-storage] will house a collection of blob containers, file shares, queues, and tables. An Azure subscription can have multiple Azure Storage and Azure Batch accounts - however, a Platform compute environment can only use one of each. Multiple Platform compute environments can be created to use separate credentials, Azure Storage accounts, and Azure Batch accounts. #### Service principals From cbe5950e6125da9ae2766ed6da966de9730ef4ca Mon Sep 17 00:00:00 2001 From: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> Date: Fri, 11 Oct 2024 19:36:00 +0200 Subject: [PATCH 34/42] Update platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx Signed-off-by: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> --- .../version-24.1/compute-envs/azure-batch.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index 84e3d2161..98f1d635d 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -36,7 +36,7 @@ An Azure service principal is an identity created specifically for applications, Create a resource group to link your Azure Batch and Azure Storage account: :::note -A resource group can be created while creating an Azure Storage Account or Azure Batch account. +A resource group can be created while creating an Azure Storage account or Azure Batch account. ::: 1. Log in to your Azure account, go to the [Create Resource group][az-create-rg] page, and select **Create new resource group**. 
From e09bd1f48c0bc2b8d69200d3692358d2d953a2b2 Mon Sep 17 00:00:00 2001 From: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> Date: Fri, 11 Oct 2024 19:36:09 +0200 Subject: [PATCH 35/42] Update platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx Signed-off-by: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> --- .../version-24.1/compute-envs/azure-batch.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index 98f1d635d..6ef83f764 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -55,7 +55,7 @@ After creating a resource group, set up an [Azure Storage account][az-learn-stor ::: 1. Enter a name for the storage account, such as _seqeracomputestorage_. 1. Choose the preferred region. This must be the same region as the Batch account. -1. The platform supports any performance or redundancy settings — select the most appropriate settings for your use case. +1. Platform supports all performance or redundancy settings — select the most appropriate settings for your use case. 1. Select **Next: Advanced**. 1. Enable _storage account key access_. 1. Select **Next: Networking**. From 7ea37db03e099a6e1a6833c2285c4197a270855c Mon Sep 17 00:00:00 2001 From: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> Date: Fri, 11 Oct 2024 19:36:18 +0200 Subject: [PATCH 36/42] Update platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx Signed-off-by: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> --- .../version-24.1/compute-envs/azure-batch.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index 6ef83f764..f14e69d38 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -67,7 +67,7 @@ After creating a resource group, set up an [Azure Storage account][az-learn-stor 1. In **tags**, add any required tags for the storage account. 1. Select **Review and Create**. 1. Select **Create** to create the Azure Storage account. - - You will need at least one blob storage container to act as a working directory for Nextflow. + - You will need at least one Blob Storage container to act as a working directory for Nextflow. 1. Go to your new storage account and select **+ Container** to create a new Blob storage container. A new container dialogue will open. Enter a suitable name, e.g., _seqeracomputestorage-container_. 1. Go to the **Access Keys** section of your new storage account (_seqeracomputestorage_ in this example). 1. Store the access keys for your Azure Storage account, to be used when you create a Seqera compute environment. 
From 7b2dc3042be0313556e14a2a95f7a54034c94944 Mon Sep 17 00:00:00 2001 From: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> Date: Fri, 11 Oct 2024 19:36:56 +0200 Subject: [PATCH 37/42] Update platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx Signed-off-by: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> --- .../version-24.1/compute-envs/azure-batch.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index f14e69d38..40c7a0587 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -68,7 +68,7 @@ After creating a resource group, set up an [Azure Storage account][az-learn-stor 1. Select **Review and Create**. 1. Select **Create** to create the Azure Storage account. - You will need at least one Blob Storage container to act as a working directory for Nextflow. -1. Go to your new storage account and select **+ Container** to create a new Blob storage container. A new container dialogue will open. Enter a suitable name, e.g., _seqeracomputestorage-container_. +1. Go to your new storage account and select **+ Container** to create a new Blob Storage container. A new container dialogue will open. Enter a suitable name, such as _seqeracomputestorage-container_. 1. Go to the **Access Keys** section of your new storage account (_seqeracomputestorage_ in this example). 1. Store the access keys for your Azure Storage account, to be used when you create a Seqera compute environment. From 390e5885b3496b529d3d99bdd5be0a160d55d0de Mon Sep 17 00:00:00 2001 From: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> Date: Fri, 11 Oct 2024 19:37:20 +0200 Subject: [PATCH 38/42] Update platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx Signed-off-by: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> --- .../version-24.1/compute-envs/azure-batch.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index 40c7a0587..aaba5314e 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -78,7 +78,7 @@ Blob container storage credentials are associated with the Batch pool configurat ### Batch account -After you have created a resource group and storage account, create a [Batch account][az-learn-batch]: +After you have created a resource group and Storage account, create a [Batch account][az-learn-batch]: 1. Log in to your Azure account and select **Create a batch account** on [this page][az-create-batch]. 1. Select the existing resource group or create a new one. 
From 51c82665b0ec44762f33b5167a5b740d35391111 Mon Sep 17 00:00:00 2001 From: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> Date: Fri, 11 Oct 2024 19:37:43 +0200 Subject: [PATCH 39/42] Update platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx Signed-off-by: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> --- .../version-24.1/compute-envs/azure-batch.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index aaba5314e..ae9578889 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -88,7 +88,7 @@ After you have created a resource group and Storage account, create a [Batch acc 1. For **Pool allocation mode**, select **Batch service**. 1. For **Authentication mode**, select _Shared Key_. 1. Select **Networking**. Ensure networking access is sufficient for Platform and any additional required resources. -1. In **tags**, if necessary, you can add any tags to the Batch account +1. Add any **Tags** to the Batch account, if needed. 1. Select **Review and Create**. 1. Select **Create**. 1. Go to your new Batch account, then select **Access Keys**. From d946b67dcb17d0bebd0bda5ba5581acde6419fba Mon Sep 17 00:00:00 2001 From: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> Date: Fri, 11 Oct 2024 19:38:10 +0200 Subject: [PATCH 40/42] Update platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx Signed-off-by: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> --- .../version-24.1/compute-envs/azure-batch.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index ae9578889..59153c0f7 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -197,7 +197,7 @@ Create a Batch Forge Azure Batch compute environment: ::: 1. Add the **Batch account** and **Blob Storage** account names and access keys. 1. Select a **Region**, such as _eastus_. -1. In the **Pipeline work directory** field, enter the Azure blob container created previously. For example, `az://towerrgstorage-container/work`. +1. In the **Pipeline work directory** field, enter the Azure blob container created previously. For example, `az://seqeracomputestorage-container/work`. :::note When you specify a Blob Storage bucket as your work directory, this bucket is used for the Nextflow [cloud cache](https://www.nextflow.io/docs/latest/cache-and-resume.html#cache-stores) by default. You can specify an alternative cache location with the **Nextflow config file** field on the pipeline [launch](../launch/launchpad.mdx#launch-form) form. 
::: From 9bf74c0fea3947ee03ce04b463d2d5fdd888c45e Mon Sep 17 00:00:00 2001 From: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> Date: Fri, 11 Oct 2024 19:38:20 +0200 Subject: [PATCH 41/42] Update platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx Signed-off-by: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> --- .../version-24.1/compute-envs/azure-batch.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index 59153c0f7..fc561e074 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -263,7 +263,7 @@ Create a manual Seqera Azure Batch compute environment: To authenticate using an Entra service principal, you must include a user-assigned managed identity. See [Managed identity](#managed-identity) below. ::: 1. Select a **Region**, such as _eastus (East US)_. -1. In the **Pipeline work directory** field, add the Azure blob container created previously. For example, `az://towerrgstorage-container/work`. +1. In the **Pipeline work directory** field, add the Azure blob container created previously. For example, `az://seqeracomputestorage-container/work`. :::note When you specify a Blob Storage bucket as your work directory, this bucket is used for the Nextflow [cloud cache](https://www.nextflow.io/docs/latest/cache-and-resume.html#cache-stores) by default. You can specify an alternative cache location with the **Nextflow config file** field on the pipeline [launch](../launch/launchpad.mdx#launch-form) form. ::: From 3de8eea4529884ff31ccdc311457ca3aa110d463 Mon Sep 17 00:00:00 2001 From: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> Date: Fri, 11 Oct 2024 19:38:31 +0200 Subject: [PATCH 42/42] Update platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx Signed-off-by: Llewellyn vd Berg <113503285+llewellyn-sl@users.noreply.github.com> --- .../version-24.1/compute-envs/azure-batch.mdx | 1 - 1 file changed, 1 deletion(-) diff --git a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx index fc561e074..e52e967ac 100644 --- a/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx +++ b/platform_versioned_docs/version-24.1/compute-envs/azure-batch.mdx @@ -292,7 +292,6 @@ Create a manual Seqera Azure Batch compute environment:
->>>>>>> master 1. Set the **Config mode** to **Manual**. 1. Enter the **Compute Pool name**. This is the name of the Azure Batch pool you created previously in the Azure Batch account. :::note