Microsoft Multi-Site Failover Cluster for DR & Business Continuity

Not every organisation loses millions of dollars for every second of downtime, but some do. Others may not lose millions, yet they consider customer service and reputation their number-one priority. These types of business want their workflow to be seamless and downtime free. This article is for those who consider business continuity money well spent. Here is how it is done:

Multi-Site Failover Cluster

A Microsoft Multi-Site Failover Cluster is a group of clustered nodes distributed across multiple sites within a region, or across separate regions, connected by low-latency network and storage links. As the diagram below illustrates, the Data Center A cluster nodes are connected to a local SAN storage, which is replicated to a SAN storage in Data Center B. Replication is handled by identical software-defined storage at each site, which replicates volumes, or Logical Unit Numbers (LUNs), from the primary site (Data Center A in this example) to the disaster recovery site (Data Center B). The Microsoft Failover Cluster is configured with pass-through storage, i.e. volumes, and these volumes are replicated to the DR site.

In both the primary and DR sites, the physical network is built on Cisco Nexus 7000 switches. The data network and the virtual machine network are logically segregated in Microsoft System Center VMM and on the physical switch using virtual local area networks (VLANs). A separate Storage Area Network (SAN) with low-latency storage is created in each site, and the pass-through volumes are replicated to identically sized volumes at the DR site.

image

                                     Figure: Highly Available Multi-site Cluster

image

                           Figure: Software Defined Storage in Each Site

 Design Components of Storage:

  • SAN-to-SAN replication must be configured correctly
  • The initial replication must be complete before the Failover Cluster is configured
  • MPIO software must be installed on the cluster nodes (N1, N2…N6); see the PowerShell sketch after this list
  • Physical and logical multipathing must be configured
  • If storage is presented directly to virtual machines or cluster nodes, then NPIV must be configured on the fabric zones
  • All storage and fabric firmware must be up to date with the manufacturer's latest releases
  • Identical software-defined storage must be used at both sites
  • If third-party software is used to replicate storage between sites, the storage vendor must be consulted before configuring replication
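
A minimal PowerShell sketch of the MPIO points above, run on each cluster node. The vendor and product IDs shown are typical for IBM SVC/Storwize arrays but are an assumption here; confirm them with your storage vendor before claiming devices.

# Install the Multipath I/O feature on each cluster node (N1…N6); a reboot may be required
Install-WindowsFeature -Name Multipath-IO -IncludeManagementTools

# Add the array's hardware ID so the Microsoft DSM claims its Fibre Channel LUNs
# ("IBM"/"2145" is an assumption; check the vendor and product IDs for your array)
New-MSDSMSupportedHW -VendorId "IBM" -ProductId "2145"

# Use round robin as the default load-balancing policy
Set-MSDSMGlobalDefaultLoadBalancePolicy -Policy RR

# Verify the claimed hardware and available multipath devices
Get-MSDSMSupportedHW
Get-MPIOAvailableHW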

Further Reading:

Understanding Software Defined Storage (SDS)

How to configure SAN replication between IBM Storwize V3700 systems

Install and Configure IBM V3700, Brocade 300B Fabric and ESXi Host Step by Step

Application Scale-out File Systems

Design Components of Network:

  • Isolate the management, virtual machine and data networks using VLANs (a PowerShell sketch follows this list)
  • Use a reliable IPVPN or fibre-optic provider for replication over the network
  • Eliminate all single points of failure from every network component
  • Consider a stretched VLAN across the sites
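
A minimal PowerShell sketch of VLAN isolation on a Hyper-V host. The adapter, team, switch and VM names and the VLAN IDs are examples only; substitute the values used in your environment.

# Team two physical NICs for VM traffic (adapter and team names are examples)
New-NetLbfoTeam -Name "VMTeam" -TeamMembers "NIC1","NIC2" -TeamingMode SwitchIndependent -LoadBalancingAlgorithm Dynamic

# Bind a Hyper-V virtual switch to the team and keep the management OS off it
New-VMSwitch -Name "VMSwitch01" -NetAdapterName "VMTeam" -AllowManagementOS $false

# Dedicated management and live-migration vNICs, each on its own VLAN
Add-VMNetworkAdapter -ManagementOS -Name "Management" -SwitchName "VMSwitch01"
Set-VMNetworkAdapterVlan -ManagementOS -VMNetworkAdapterName "Management" -Access -VlanId 10

Add-VMNetworkAdapter -ManagementOS -Name "LiveMigration" -SwitchName "VMSwitch01"
Set-VMNetworkAdapterVlan -ManagementOS -VMNetworkAdapterName "LiveMigration" -Access -VlanId 20

# Tag a virtual machine's adapter with the VM network VLAN
Set-VMNetworkAdapterVlan -VMName "App01" -Access -VlanId 30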

Further Reading:

Understanding Network Virtualization in SCVMM 2012 R2

Understanding VLAN, Trunk, NIC Teaming, Virtual Switch Configuration in Hyper-v Server 2012 R2

Design Failover Cluster Quorum

  • Use Node & File Share Witness (FSW) Quorum for even number of Cluster Nodes
  • Connect File Share Witness on to the third Site
  • Do not host File Share Witness on a virtual machine on same site
  • Alternatively use Dynamic Quorum
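
A minimal PowerShell sketch of the quorum design above. The cluster name and witness share path are examples; the share should live on a server at the third site.

# Node and File Share Witness quorum (witness path is an example)
Set-ClusterQuorum -Cluster "MSCluster01" -NodeAndFileShareMajority "\\DC3-FS01\FSW-Cluster01"

# Verify the quorum mode and node weights; dynamic quorum adjusts the weights
# automatically on Windows Server 2012 and later
Get-ClusterQuorum -Cluster "MSCluster01"
Get-ClusterNode -Cluster "MSCluster01" | Format-Table Name, State, DynamicWeight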

Further Reading:

Understanding Dynamic Quorum in a Microsoft Failover Cluster

Design of Compute

  • Use a reputable vendor to supply compute hardware that is compatible with Microsoft Hyper-V
  • Make sure the latest firmware updates are applied to each Hyper-V host
  • Make sure the manufacturer provides you with the latest HBA driver and firmware to install on the Hyper-V hosts (the sketch after this list shows how to record the HBA WWPNs you will need later for zoning)
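
A short PowerShell sketch for checking the host roles and recording the Fibre Channel HBA WWPNs, which are needed later when creating fabric zones. Run it on each Hyper-V host.

# Confirm the Hyper-V and failover clustering features are present on the host
Get-WindowsFeature -Name Hyper-V, Failover-Clustering | Format-Table Name, InstallState

# List the Fibre Channel HBA ports and their WWPNs; record these for fabric zoning
Get-InitiatorPort | Where-Object ConnectionType -eq "Fibre Channel" |
    Format-Table InstanceName, PortAddress, NodeAddress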

Further Reading:

Windows Server 2012: Failover Clustering Deep Dive Part II

Implementing a Multi-Site Failover Cluster

Step1: Prepare Network, Storage and Compute

Understanding Network Virtualization in SCVMM 2012 R2

Understanding VLAN, Trunk, NIC Teaming, Virtual Switch Configuration in Hyper-v Server 2012 R2

Install and Configure IBM V3700, Brocade 300B Fabric and ESXi Host Step by Step

Step2: Configure Failover Cluster on Each Site

Windows Server 2012: Failover Clustering Deep Dive Part II

Understanding Dynamic Quorum in a Microsoft Failover Cluster

Multi-Site Clustering & Disaster Recovery

Step3: Replicate Volumes

How to configure SAN replication between IBM Storwize V3700 systems

How to create a Microsoft Multi-Site cluster with IBM Storwize replication

Use Cases:

The right use case is determined by your current workloads, your projected future workloads, and your business continuity requirements. Deploy Veeam ONE to measure the current workload on your infrastructure and to model the future workload plus business continuity needs. Here is a list of use cases for a multi-site cluster.

  • Scale-Out File Server for application data – To store server application data, such as Hyper-V virtual machine files, on file shares, and obtain a level of reliability, availability, manageability, and performance similar to what you would expect from a storage area network. All file shares are simultaneously online on all nodes. File shares associated with this type of clustered file server are called scale-out file shares. This is sometimes referred to as active-active. (A PowerShell sketch follows this list.)

  • File Server for general use – This type of clustered file server, and therefore all the shares associated with the clustered file server, is online on one node at a time. This is sometimes referred to as active-passive or dual-active. File shares associated with this type of clustered file server are called clustered file shares.

  • Business Continuity Plan

  • Disaster Recovery Plan

  • DFS Replication Namespace for Unstructured Data i.e. user profile, home drive, Citrix profile

  • Highly Available File Server Replication 
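
A minimal PowerShell sketch of the Scale-Out File Server use case above, assuming the cluster and a Cluster Shared Volume already exist. The role name, share name, path and security group are examples.

# Add the Scale-Out File Server role to the existing cluster
Add-ClusterScaleOutFileServerRole -Cluster "MSCluster01" -Name "SOFS01"

# Create a folder on a Cluster Shared Volume and publish it as a
# continuously available share for Hyper-V virtual machine files
New-Item -Path "C:\ClusterStorage\Volume1\VMStore" -ItemType Directory
New-SmbShare -Name "VMStore" -Path "C:\ClusterStorage\Volume1\VMStore" -FullAccess "DOMAIN\Hyper-V-Hosts" -ContinuouslyAvailable $true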

Install and Configure IBM V3700, Brocade 300B Fabric and ESXi Host Step by Step

Step1: Hardware Installation

Follow the official IBM video tutorial to rack and stack IBM V3700.

 

image

image

image

Cabling V3700, ESX Host and Fabric.

  • Connect each canister of the V3700 storage to both fabrics: Canister 1 FC Port 1—>Fabric 1 and Canister 1 FC Port 2—>Fabric 2; Canister 2 FC Port 1—>Fabric 1 and Canister 2 FC Port 2—>Fabric 2.
  • Connect the two HBAs of each host to the two fabrics.
  • Connect the disk enclosure to both canisters.
  • Connect both power cables of each device to separate power feeds.

Step2: Initial Setup

Management IP Connection: Once you have racked the storage, connect Canister 1 Port 1 and Canister 2 Port 1 to two Gigabit Ethernet ports in the same VLAN. Make sure LACP is configured on both ports of your switch.

image

Follow the official IBM video tutorial to set up the management IP, or download the Redbook implementation guide and follow section 2.4.1.

 

Once initial setup is complete, log on to the V3700 storage using the management IP.

Click Monitoring>System>Enclosure>Canister. Record serial number, part number and machine signature.

image

Step3: Configure Event Notification

Click Settings>Event Notification>Edit and add the correct email address. You must add the management IP address of the V3700 to your Exchange relay connector. Now click the Test button to confirm that you receive the email.

image

image

Step4: Licensing Features

Click Settings>General>Licensing>Actions>Automatic>Add>Type Activation Key>Activate. Repeat the step for all licensed features.

image

IBM V3700 At a Glance

image

Simple Definition:

MDisk: A bunch of disks configured as a preset or user-defined RAID array. For example, say you have 4 SSDs and 20 SAS hard drives. You can choose automatic configuration and let the wizard configure the storage, or you can choose to have one RAID 5 MDisk for the SSDs and two RAID 5 MDisks for the 20 SAS drives, in which case you will have three MDisks in your controller. Simply put, an MDisk is a RAID group.

Pool: A storage pool acts as a container for MDisks and represents capacity ready to be provisioned.

Volume: Volumes are logical slices of a pool. A volume is presented as a LUN and is ready to be mapped to an ESXi or Hyper-V host.

Step5: Configure Storage

Click Pools>Internal Storage>Configure Storage and select recommended or user-defined. Follow the wizard to complete the MDisk configuration.

image

image

Step6: Configure Pool

image

Click Pools>Click Volumes by Pool>Click New Volume>Generic>Select Quantity, Size, Pool then Click Create.

image

image

image

Repeat the steps to create multiple volumes for the Hyper-V or ESXi hosts.

Important! If you configure storage in the IBM V3700 using the GUI, the console automatically selects default settings, such as the number of MDisks; you will not have the option to choose the disks and MDisks you want. For example, you may want to create one large MDisk containing 15 disks, but the GUI will create 3 MDisks instead, which wastes a lot of capacity. To avoid this, create the MDisks, RAID arrays and pool using the command line.

Connect to the storage over SSH using PuTTY. Log in with the username superuser and your password, then run the following commands.

Creating a pool (Pool01) with a 1024 MB extent size and an 80% capacity warning

svctask mkmdiskgrp -ext 1024 -name Pool01 -warning 80%

Marking drive 6 as a spare

svctask chdrive -use spare 6

Creating a RAID 5 array from the SAS drives in pool 0

svctask mkarray -drive 4:11:5:9:3:7:8:10:12:13:14 -level raid5 -sparegoal 1 0

Creating an SSD RAID 5 array for Easy Tier (drives 0, 1 and 2 are SSDs)

svctask mkarray -drive 1:2:0 -level raid5 -sparegoal 0 Pool01

Now go back to the GUI and check that you have two MDisks and a pool with Easy Tier activated.

Step7: Configure Fabric

Once you have racked and stacked the Brocade 300B switches, connect both of them to your Ethernet switch using CAT6 cables. This could be your core switch or an access switch. Connect your laptop to the network as well.

References are here. Quick Start Guide and Setup Guide

Default Passwords

username: root password: fibranne
username: admin password: password

Connect the console cable provided in the Brocade box to your laptop, insert the EZSetup CD into your laptop's CD-ROM drive, and run the EZSetup wizard.

Alternatively, you can connect the switch to the network and browse to http://10.77.77.77 (give your laptop an IP address of 10.77.77.1/24). Use the username root and the password fibranne.

If you use EZSetup: at the Welcome screen click Next > Next > accept the EULA > Install > Done > select Serial Cable > Next > Next (make sure HyperTerminal is NOT running or it will fail). The wizard should find the switch; set its IP details.

image

Next > Follow Instructions.

Install Java on your laptop. Open a browser and type the IP address of the Brocade switch to connect to its web console. Log in using the default username root and password fibranne.

image

Click Switch Admin>Type DNS Server, Domain Name, Apply. Click License>Add new License, Apply.

image

image

Click Zone Admin>Alias>New Alias, type the name of the alias and click OK. Expand Switch Ports, select the WWN and click Add Members. Repeat for Canister Node 1, Canister Node 2, ESX Host 1, ESX Host 2, ESX Host 3…

image

image

Select the Zone tab>New Zone and type the name of the zone, for example vSphere or Hyper_V. Select the aliases and click Add Members to add all aliases.

image

Select Zone Config Tab>New Zone Config>Type the name of Zone Config>Select vSphere or Hyper_V Zone>Add Members

image

Click Save Config. Click Enable Config.

image

image

Repeat the above steps for each fabric.

Step8: Configure Hosts in V3700

Click Hosts>New Host>Fibre Channel Host>Generic>Type the Name of the Host>Select Port>Create Host.

image

image

image

image

Right Click on each host>Map Volumes.

image

Step9: Configure ESXi Host

Right-click the ESX cluster>Rescan for Datastores.

image

image

Click the ESX host>Configuration>Storage>Add Storage>Next, select the device with the correct, matching UID, and type a name that matches the volume name in the IBM V3700. Click Next>Select VMFS-5>Finish.

image

Rescan for datastores on all hosts again; you will see the same datastore appear on every host.

Finding HBA, WWNN

Select ESX Host>Configuration>Storage Adapters

image

image

Verifying Multipathing

Right-click the storage>Properties>Manage Paths.

image

image

Finding and Matching UID

Log on to IBM V3700 Storage>Volume>Volumes

image

Click ESX Host>Configuration>Storage>Devices

image