NUTANIX ACROPOLIS BLOCK SERVICES – PART 02

I hope you had a chance to review the Acropolis Block Services introduction covered in Part 01 of this series. If not, please consider reading it first, as it is the foundation of Part 02, i.e. this post. In this post, I will explain the prerequisites, terminology, and some considerations to keep in mind.

Prerequisites

  1. Ports 3260 and 3205 are open between the iSCSI initiator and the iSCSI target (a quick reachability check is sketched after this list).
  2. An IP address for the iSCSI target. Acropolis Block Services (ABS) refers to this IP address as the External Data Services IP address.
  3. AOS 4.7 or above.
  4. The iSCSI initiator name, also known as the IQN (iSCSI Qualified Name); for example, a Windows initiator typically defaults to iqn.1991-05.com.microsoft:<hostname>.
  5. A Volume Group.
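
To quickly verify the first prerequisite, a simple TCP check from the initiator toward the External Data Services IP address is enough. This is a minimal sketch; the address 10.0.0.50 is a placeholder you would replace with your cluster's value.

```python
import socket

# Placeholder External Data Services IP address; replace with your cluster's value.
DATA_SERVICES_IP = "10.0.0.50"
ISCSI_PORTS = [3260, 3205]

def port_open(host, port, timeout=3):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

for port in ISCSI_PORTS:
    state = "open" if port_open(DATA_SERVICES_IP, port) else "closed/filtered"
    print(f"{DATA_SERVICES_IP}:{port} is {state}")
```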

What are the iSCSI Initiator and iSCSI Target?

The iSCSI initiator is the client sitting on the operating system, whereas the iSCSI target is on the storage side, waiting to be discovered and to respond to the initiator's queries.

Basics of iSCSI (Initiator & Target)

What is a Volume Group?

A Volume Group (VG), as the name denotes, is a group of virtual disks (vDisks). VGs are created on containers. A vDisk can also be referred to as a LUN (old school).

A Few Considerations in Acropolis Block Services

  • A vDisk is owned by a single Controller Virtual Machine (CVM). As a result, vDisks are always reached through this CVM, which is referred to as the preferred CVM and therefore the preferred path to reach the VG/vDisks. To put it another way, each iSCSI target has a preferred CVM. The preferred CVM is automatically selected by a load-balancing algorithm, although in some particular cases you can select a CVM of your choice (a conceptual sketch of this selection follows this list).
  • Starting with AOS 4.7, the iSCSI initiator no longer initiates a direct connection to a CVM. Instead, the initiator discovers the VG using the External Data Services IP address, which is configured at the cluster level. Hint: cluster design consideration.
  • The External Data Services IP address acts as the discovery portal and is responsible for path management and load balancing; native OS MPIO is not required.
  • Login redirection occurs on a per-target basis. A target can be multiple VGs, or a single VG with multiple vDisks, also referred to as virtual targets.
  • A total of 32 virtual targets can be configured per VG. In other words, if a VG contains 32 or more vDisks, an iSCSI client will see only 32 vDisks.
  • A CVM's CPU utilization is determined not only by the number of vDisks accessed but also by the number of VMs accessing them. Consequently, one CVM may consume more CPU than others. However, no additional sizing/design consideration is required, since ABS and ADS (Acropolis Dynamic Scheduling) are tightly integrated and an 85% threshold is configured by default on CVM CPU utilization. Let me reiterate: ADS is enabled by default.
  • By default, vDisks are thinly provisioned. (Design consideration: right-sizing for storage.)
  • Online expansion of vDisks is possible if you are on AOS 5.0. (Forget the design consideration for right-sizing for storage!)
  • The name of a VG target starts with the VG name and ends with the virtual target number; virtual target numbers start at 0.
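
Nutanix does not expose the internals of the load-balancing algorithm, so the following is only a conceptual illustration of what "preferred CVM selected by load balancing" could look like, assuming the load metric is simply the number of virtual targets each CVM already serves. It is not the actual ABS/ADS logic, and the CVM names and counts are made up.

```python
# Purely conceptual sketch of preferred-CVM selection; NOT Nutanix's actual algorithm.
# Assumes each CVM's "load" is the number of virtual targets it already serves.
cvm_target_count = {"CVM01": 14, "CVM02": 9, "CVM03": 11}

def pick_preferred_cvm(load_by_cvm):
    """Pick the CVM currently serving the fewest virtual targets."""
    return min(load_by_cvm, key=load_by_cvm.get)

preferred = pick_preferred_cvm(cvm_target_count)
print(f"New virtual target would be placed on {preferred}")  # CVM02 in this example
```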
CVM Failure Scenario

Whenever a CVM fails or becomes unavailable, there is zero impact to storage connectivity between the iSCSI initiator (VM/physical server) and the iSCSI target (VG). In practice, there is an interruption of only 15-20 seconds, which is well within the usual disk timeout sustained by most operating systems. Let me explain this using some simple pictures.

The iSCSI initiator sends a discovery request to the External Data Services IP address, which responds with the discovered target (VG1).

Acropolis Block Services

The iSCSI initiator sends a login request (with CHAP credentials) to access VG1. The External Data Services IP address redirects the login request to CVM01.

CVM01 responds to VM1 with a login success. From here on, all requests to access storage go via CVM01 until CVM01 fails or ADS intervenes, whichever occurs first.
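
To make the discovery and login steps concrete, here is a minimal sketch of the same flow from a Linux initiator, assuming open-iscsi is installed. The portal address and the target IQN are placeholders; in practice you log in to whatever target name the discovery step returns.

```python
import subprocess

# Placeholder values for illustration only.
PORTAL = "10.0.0.50:3260"                               # External Data Services IP + iSCSI port
TARGET_IQN = "iqn.2010-06.com.nutanix:example-vg-tgt0"  # hypothetical target name

# Step 1: discovery request against the External Data Services IP (the discovery portal).
subprocess.run(["iscsiadm", "-m", "discovery", "-t", "sendtargets", "-p", PORTAL], check=True)

# Step 2: login to the discovered target; the portal redirects the session to the preferred CVM.
subprocess.run(["iscsiadm", "-m", "node", "-T", TARGET_IQN, "-p", PORTAL, "--login"], check=True)
```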

When a CVM Fails

Step 1: CVM01 goes down and the TCP session is lost. Since CVM01 is unreachable, disk timeout errors are observed inside the guest OS until a new iSCSI session is established (less than 20 seconds). A login request is sent to the External Data Services IP address.

Step 2: This time, the External Data Services IP address redirects the login request to CVM02.

Acropolis Block Services CVM Failure Scenario

Step 3: CVM02 acknowledges the request and responds with target VG1 upon successful login.

The path failover (steps 1 to 3) completes in less than 20 seconds, which is well within the typical 60-second disk timeout.
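
If you want to confirm what disk timeout your guest OS actually tolerates, the following minimal sketch reads the SCSI command timeout from sysfs on a Linux guest; sda is a placeholder device name, and 60 seconds is the common default.

```python
from pathlib import Path

# Placeholder device; adjust to the iSCSI-backed disk in your guest (e.g. sdb, sdc).
DEVICE = "sda"
timeout_file = Path(f"/sys/block/{DEVICE}/device/timeout")

if timeout_file.exists():
    print(f"{DEVICE} SCSI command timeout: {timeout_file.read_text().strip()} seconds")
else:
    print(f"{timeout_file} not found; is {DEVICE} a SCSI/iSCSI device on this host?")
```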

If VMs share vDisks, then both VMs are directed to the preferred CVM. Automatic failback is configured; for example, if a CVM goes down and comes back up, the path fails back to the preferred CVM. When an iSCSI target is shared between VMs, especially when configuring WSFC, both nodes of the cluster are redirected to the preferred CVM.

Recommendations

The External Data Services IP address must be on the same subnet as the CVM IP addresses to avoid any delays in path failover. To clarify, no routing should occur between the iSCSI initiator and the iSCSI target. I think this is better illustrated in the figures above.

Since the iSCSI initiator and the iSCSI target can establish only a single iSCSI connection per target (as there is only one External Data Services IP address), Nutanix strongly recommends configuring NIC teaming (bonding), especially for iSCSI initiators on physical servers.

Receive Side Scaling (RSS) recommendations are no different from those previously made by VMware and Microsoft. I will state them here for the sake of completeness:

  1. VMware ESXi – the VMXNET3 driver must be installed in the VM to leverage RSS.
  2. Hyper-V – enable VMQ to take advantage of RSS.

Likewise, the jumbo frames recommendation remains unchanged. If you wish to enable jumbo frames, configure them end to end, i.e. VM – CVM – virtual switch – physical switch – physical server.
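
As a quick check of the guest-side piece of an end-to-end jumbo frame configuration, you can read the interface MTU from sysfs on a Linux VM. eth0 is a placeholder interface name, and the CVM, virtual switch, and physical switch hops still have to be verified separately.

```python
from pathlib import Path

# Placeholder interface name; replace with the NIC carrying iSCSI traffic.
IFACE = "eth0"
mtu = int(Path(f"/sys/class/net/{IFACE}/mtu").read_text())

# A jumbo-frame configuration typically uses an MTU of 9000 end to end.
print(f"{IFACE} MTU is {mtu}: {'jumbo frames' if mtu >= 9000 else 'standard frames'}")
```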

Lastly, set up one-way CHAP at a minimum.
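
For reference, here is a minimal sketch of enabling one-way CHAP on a Linux initiator with open-iscsi. The target IQN, portal, username, and secret are all placeholders, and the matching CHAP secret must also be configured on the Volume Group side.

```python
import subprocess

# Placeholder values for illustration only.
TARGET_IQN = "iqn.2010-06.com.nutanix:example-vg-tgt0"
PORTAL = "10.0.0.50:3260"
CHAP_USER = "abs-initiator"
CHAP_SECRET = "use-a-strong-secret-here"

def set_node_param(name, value):
    """Update a node-record parameter for the target (open-iscsi)."""
    subprocess.run(
        ["iscsiadm", "-m", "node", "-T", TARGET_IQN, "-p", PORTAL,
         "-o", "update", "-n", name, "-v", value],
        check=True,
    )

# One-way CHAP: the target authenticates the initiator.
set_node_param("node.session.auth.authmethod", "CHAP")
set_node_param("node.session.auth.username", CHAP_USER)
set_node_param("node.session.auth.password", CHAP_SECRET)
```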