title | description | author | ms.author | ms.reviewer | ms.date | ms.service | ms.subservice | ms.topic | ms.custom | |
---|---|---|---|---|---|---|---|---|---|---|
Configure FCI - SQL Server on Linux (RHEL) |
Learn to configure a failover cluster instance (FCI) on Red Hat Enterprise Linux (RHEL) for SQL Server. |
rwestMSFT |
randolphwest |
vanto |
11/18/2024 |
sql |
linux |
install-set-up-deploy |
|
[!INCLUDE SQL Server - Linux]
A [!INCLUDE ssnoversion-md] two-node shared disk failover cluster instance provides server-level redundancy for high availability. In this tutorial, you learn how to create a two-node failover cluster instance of [!INCLUDE ssnoversion-md] on Linux. The specific steps that you'll complete include:
[!div class="checklist"]
- Set up and configure Linux
- Install and configure [!INCLUDE ssnoversion-md]
- Configure the hosts file
- Configure shared storage and move the database files
- Install and configure Pacemaker on each cluster node
- Configure the failover cluster instance
This article explains how to create a two-node shared disk failover cluster instance (FCI) for [!INCLUDE ssnoversion-md]. The article includes instructions and script examples for Red Hat Enterprise Linux (RHEL). Ubuntu distributions are similar to RHEL so the script examples will normally also work on Ubuntu.
For conceptual information, see Failover Cluster Instances - SQL Server on Linux.
To complete the following end-to-end scenario, you need two machines to deploy the two nodes cluster and another server for storage. Below steps outline how these servers will be configured.
The first step is to configure the operating system on the cluster nodes. On each node in the cluster, configure a linux distribution. Use the same distribution and version on both nodes. Use either one or the other of the following distributions:
- RHEL with a valid subscription for the HA add-on
-
Install and set up [!INCLUDE ssnoversion-md] on both nodes. For detailed instructions, see Installation guidance for SQL Server on Linux.
-
Designate one node as primary and the other as secondary, for purposes of configuration. Use these terms for the following this guide.
-
On the secondary node, stop and disable [!INCLUDE ssnoversion-md]. The following example stops and disables [!INCLUDE ssnoversion-md]:
sudo systemctl stop mssql-server sudo systemctl disable mssql-server
[!NOTE]
At set up time, a Server Master Key is generated for the [!INCLUDE ssnoversion-md] instance and placed atvar/opt/mssql/secrets/machine-key
. On Linux, [!INCLUDE ssnoversion-md], always runs as a local account calledmssql
. Because it's a local account, its identity isn't shared across nodes. Therefore, you need to copy the encryption key from primary node to each secondary node so each localmssql
account can access it to decrypt the Server Master Key. -
On the primary node, create a [!INCLUDE ssnoversion-md] login for Pacemaker and grant the login permission to run
sp_server_diagnostics
. Pacemaker uses this account to verify which node is running [!INCLUDE ssnoversion-md].sudo systemctl start mssql-server
Connect to the [!INCLUDE ssnoversion-md]
master
database with thesa
account and run the following:USE [master]; GO CREATE LOGIN [<loginName>] WITH PASSWORD = N'<password>'; ALTER SERVER ROLE [sysadmin] ADD MEMBER [<loginName>];
[!CAUTION]
[!INCLUDE password-complexity]Alternatively, you can set the permissions at a more granular level. The Pacemaker login requires
VIEW SERVER STATE
to query health status withsp_server_diagnostics
, setupadmin, and ALTER ANY LINKED SERVER to update the FCI instance name with the resource name, by runningsp_dropserver
andsp_addserver
. -
On the primary node, stop and disable [!INCLUDE ssnoversion-md].
On each cluster node, configure the hosts file. The hosts file must include the IP address and name of every cluster node.
-
Check the IP address for each node. The following script shows the IP address of your current node.
sudo ip addr show
-
Set the computer name on each node. Give each node a unique name that is 15 characters or less. Set the computer name by adding it to
/etc/hosts
. The following script lets you edit/etc/hosts
withvi
.sudo vi /etc/hosts
The following example shows
/etc/hosts
with additions for two nodes namedsqlfcivm1
andsqlfcivm2
.127.0.0.1 localhost localhost4 localhost4.localdomain4 ::1 localhost localhost6 localhost6.localdomain6 10.128.18.128 sqlfcivm1 10.128.16.77 sqlfcivm2
You need to provide storage that both nodes can access. You can use iSCSI, NFS, or SMB. Configure storage, present the storage to the cluster nodes, and then move the database files to the new storage. The following articles explain the steps for each storage type:
- Configure failover cluster instance - iSCSI - SQL Server on Linux
- Configure failover cluster instance - NFS - SQL Server on Linux
- Configure SMB storage failover cluster instance - SQL Server on Linux
-
On both cluster nodes, create a file to store the [!INCLUDE ssnoversion-md] username and password for the Pacemaker login.
The following command creates and populates this file:
sudo touch /var/opt/mssql/secrets/passwd sudo echo '<loginName>' >> /var/opt/mssql/secrets/passwd sudo echo '<loginPassword>' >> /var/opt/mssql/secrets/passwd sudo chown root:root /var/opt/mssql/secrets/passwd sudo chmod 600 /var/opt/mssql/secrets/passwd
-
On both cluster nodes, open the Pacemaker firewall ports. To open these ports with
firewalld
, run the following command:sudo firewall-cmd --permanent --add-service=high-availability sudo firewall-cmd --reload
If you're using another firewall that doesn't have a built-in high-availability configuration, the following ports need to be opened for Pacemaker to be able to communicate with other nodes in the cluster:
- TCP: Ports 2224, 3121, 21064
- UDP: Port 5405
-
Install Pacemaker packages on each node.
sudo yum install pacemaker pcs fence-agents-all resource-agents
-
Set the password for the default user that is created when installing Pacemaker and Corosync packages. Use the same password on both nodes.
sudo passwd hacluster
-
Enable and start
pcsd
service and Pacemaker. This will allow nodes to rejoin the cluster after the reboot. Run the following command on both nodes.sudo systemctl enable pcsd sudo systemctl start pcsd sudo systemctl enable pacemaker
-
Install the FCI resource agent for [!INCLUDE ssnoversion-md]. Run the following commands on both nodes.
sudo yum install mssql-server-ha
The FCI will be created in a resource group. This is slightly easier since the resource group alleviates the need for constraints. However, add the resources into the resource group in the order they should start. The order they should start is:
- Storage resource
- Network resource
- Application resource
This example creates an FCI in the group NewLinFCIGrp. The name of the resource group must be unique from any resource created in Pacemaker.
-
Create the disk resource. You get no response back if there isn't a problem. The way to create the disk resource depends on the storage type. The following section shows examples for each storage type (iSCSI, NFS, and SMB). Use the example that applies to the storage type for your clustered storage.
sudo pcs resource create <iSCSIDiskResourceName> Filesystem device="/dev/<VolumeGroupName>/<LogicalVolumeName>" directory="<FolderToMountiSCSIDisk>" fstype="<FileSystemType>" --group RGName
<iSCSIDIskResourceName>
is the name of the resource associated with the iSCSI disk<VolumeGroupName>
is the name of the volume group<LogicalVolumeName>
is the name of the logical volume that was created<FolderToMountiSCSIDIsk>
is the folder to mount the disk (for system databases and the default location, it would be/var/opt/mssql/data
)<FileSystemType>
would be EXT4 or XFS, depending on how things were formatted and what the distribution supports.
sudo pcs resource create <NFSDiskResourceName> Filesystem device="<IPAddressOfNFSServer>:<FolderOnNFSServer>" directory="<FolderToMountNFSShare>" fstype=nfs4 options=" nfsvers=4.2,timeo=14,intr" --group RGName mount -t nfs4 IPAddressOfNFSServer:FolderOnNFSServer /var/opt/mssql/data -o
<NFSDIskResourceName>
is the name of the resource associated with the NFS share<IPAddressOfNFSServer>
is the IP address of the NFS server that you're going to use<FolderOnNFSServer>
is the name of the NFS share<FolderToMountNFSShare>
is the folder to mount the disk (for system databases and the default location, it would be /var/opt/mssql/data)
An example is shown here:
mount -t nfs4 200.201.202.63:/var/nfs/fci1 /var/opt/mssql/data -o nfsvers=4.2,timeo=14,intr
sudo pcs resource create SMBDiskResourceName Filesystem device="//<ServerName>/<ShareName>" directory="<FolderName>" fstype=cifs options="vers=3.0,username=<UserName>,password=<Password>,domain=<ADDomain>,uid=<mssqlUID>,gid=<mssqlGID>,file_mode=0777,dir_mode=0777" --group <RGName>
<ServerName>
is the name of the server with the SMB share<ShareName>
is the name of the share<FolderName>
is the name of the folder created in the last step<UserName>
is the name of the user to access the share<Password>
is the password for the user<ADDomain>
is the Active Directory DS domain (if applicable when using a Windows Server-based SMB share)<mssqlUID>
is the UID of themssql
user<mssqlGID>
is the GID of themssql
user<RGName>
is the name of the resource group
-
Create the IP address that will be used by the FCI. You get no response back if there isn't a problem.
sudo pcs resource create <IPResourceName> ocf:heartbeat:IPaddr2 ip=<IPAddress> nic=<NetworkCard> cidr_netmask=<NetMask> --group <RGName>
<IPResourceName>
is the name of the resource associated with the IP address<IPAddress>
is the IP address for the FCI<NetworkCard>
is the network card associated with the subnet (that is, eth0)<NetMask>
is the netmask of the subnet (that is, 24)<RGName>
is the name of the resource group
-
Create the FCI resource. You get no response back if there isn't a problem.
sudo pcs resource create FCIResourceName ocf:mssql:fci op defaults timeout=60s --group RGName
<FCIResourceName>
isn't only the name of the resource, but the friendly name that is associated with the FCI. This is what users and applications use to connect.<RGName>
is the name of the resource group.
-
Run the command
sudo pcs resource
. The FCI should be online. -
Connect to the FCI with SSMS or sqlcmd using the DNS/resource name of the FCI.
-
Issue the statement
SELECT @@SERVERNAME
. It should return the name of the FCI. -
Issue the statement
SELECT SERVERPROPERTY('ComputerNamePhysicalNetBIOS')
. It should return the name of the node that the FCI is running on. -
Manually fail the FCI to the other node(s). See the instructions under Operate failover cluster instance - SQL Server on Linux.
-
Finally, fail the FCI back to the original node and remove the colocation constraint.
In this tutorial, you completed the following tasks.
[!div class="checklist"]
- Set up and configure Linux
- Install and configure [!INCLUDE ssnoversion-md]
- Configure the hosts file
- Configure shared storage and move the database files
- Install and configure Pacemaker on each cluster node
- Configure the failover cluster instance