July 20, 2015

View Path Selection Policy

View Path Selection Policy In Use

via vSphere Client:
Procedure
  • Connect to vSphere Client
  • Select a Host
  • Click Configuration
  • Select Storage
  • Select a Datastore
  • View "Datastore Details" for Path Selection
    The Path Selection field under Datastore Details shows the PSP in use, for example "Fixed (VMware)".
via vSphere Web Client:

Browse to Datastores in the vSphere Web Client navigator.
Procedure
  • Select the datastore.
  • Click the Manage tab, and click Settings.
  • Click Connectivity and Multipathing.
  • If the datastore is shared, select a host to view multipathing details for its devices.
  • Under Multipathing Details, review the multipathing policies and paths for the storage device that backs your datastore.
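The PSP in use can also be checked from the ESXi shell; a brief sketch (the device identifier is a placeholder, substitute one from your environment):

  # List all devices with their SATP and PSP; the "Path Selection Policy" field shows the PSP in use
  esxcli storage nmp device list

  # Limit the output to a single device
  esxcli storage nmp device list --device naa.60060160a62a2c004cd46d7cdb46e011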

Display Storage Devices for a Host

Display Storage Devices for a Host

Display Storage Devices for a Host in the vSphere Web Client

Display all storage devices available to a host. If you use any third-party multipathing plug-ins, the storage devices available through the plug-ins also appear on the list.

The Storage Devices view allows you to list the host's storage devices, analyze their information, and modify properties.

Procedure
  • Browse to the host in the vSphere Web Client navigator.
  • Click the Manage tab, and click Storage.
  • Click Storage Devices.
    All storage devices available to the host are listed under Storage Devices.
  • To view details for a specific device, select the device from the list.
  • Use tabs under Device Details to access additional information and modify properties for the selected device.
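The same list is available from the ESXi shell; a brief sketch (the device identifier in the second command is a placeholder):

  # List all storage devices visible to the host, with size, type, and identifiers
  esxcli storage core device list

  # Show the details for a single device
  esxcli storage core device list --device naa.60060160a62a2c004cd46d7cdb46e011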

Migrate VM with svMotion

Migrate VM with svMotion

Migrate a Virtual Machine with Storage vMotion in the vSphere Web Client
Use migration with Storage vMotion to relocate a virtual machine’s configuration file and virtual disks while the virtual machine is powered on.

You cannot change the virtual machine’s execution host during a migration with Storage vMotion; to change both the host and the datastore, use a combined vMotion and Storage vMotion migration.


Prerequisites:
  • Ensure that you are familiar with the requirements for Storage vMotion.
  • Required privilege: Resource.Migrate powered on virtual machine
Procedure:
  • To locate a virtual machine, select a datacenter, folder, cluster, resource pool, host, or vApp.
  • Click the Related Objects tab and click Virtual Machines.
  • Right-click the virtual machine and select Migrate.
  • Select Change datastore and click Next.
  • Select the format for the virtual machine's disks.
  • Select a virtual machine storage policy from the VM Storage Policy drop-down menu.
    Storage policies specify storage requirements for applications that run on the virtual machine.
  • Select the datastore location where you want to store the virtual machine files.
  • Review the information on the Review Selections page and click Finish.
vCenter Server moves the virtual machine to the new storage location. Names of migrated virtual machine files on the destination datastore match the inventory name of the virtual machine. Event messages appear in the Events tab.

The data displayed on the Summary tab shows the status and state throughout the migration. If errors occur during migration, the virtual machines revert to their original states and locations.

July 15, 2015

Storage I/O Control (SIOC)

SIOC

VMware vSphere Storage I/O Control (SIOC) provides I/O prioritization of virtual machines running on a cluster of ESXi hosts with access to shared storage. It extends the constructs of shares and limits, which existed for CPU and memory, to manage storage utilization.

Use SIOC to configure rules and policies to specify the business priority of each virtual machine using shares and limits. When I/O congestion is detected, Storage I/O Control dynamically allocates the available I/O resources to virtual machines according to the rules and policies, improving service levels and consolidation ratios.

At a basic level SIOC is monitoring the end to end latency of a datastore. When there is congestion, SIOC reduces the latency by throttling back virtual machines that are using excessive I/O. SIOC will use the share values assigned to the virtual machine’s VMDKs to prioritize access to the datastore.

The purpose of SIOC is to address the noisy neighbor problem, i.e. a low priority virtual machine impacting other higher priority virtual machines due to the nature of the application and the I/O characteristics of the low priority VM.

SIOC allows vSphere administrators to assign relative priority to storage I/O as well as assign storage I/O limits to VMs.

Once enabled for a specific datastore, SIOC will monitor the datastore, summing up the disk shares for each of its VMDK files. When an ESXi host detects storage congestion through an increase of latency beyond a user-configured threshold, it will apply the settings configured for that VM.


SIOC resolves imbalances by limiting the amount of I/O operations a host can issue for a datastore. The result is that VMware administrators can ensure that VMs that need priority access to storage resources get the resources they need.

Storage I/O Control (SIOC) requires Enterprise Plus licensing.

Reference:

Universally Unique Identifier (UUID)

UUID

A Universally Unique IDentifier (UUID) is a 16-octet (128-bit) number. In its canonical form, a UUID is represented by 32 lowercase hexadecimal digits, displayed in five groups separated by hyphens, in the form 8-4-4-4-12. In general, a UUID is used to uniquely identify an object or entity on the Internet.

VMware storage architecture has multiple, unique identifiers:
  • NAA & EUI (most common):
    • Network Address Authority  & Extended Unique Identifier
    • Guaranteed to be unique to the LUN
    • The preferred method of identifying LUNs
    • Generated by the storage device
  • MPX  (local datastores):
    • For devices that do not provide an NAA number, ESXi generates an MPX identifier
    • Represents the local LUN or disk
    • Takes the form mpx.vmhba<Adapter>:C<Channel>:T<Target>:L<LUN>, e.g. mpx.vmhba33:C0:T1:L0
    • Can be used in the exact same way as the NAA identifier
  • VML: Can be used interchangeably with the NAA identifier and the MPX identifier
    • Generally used for operations with utilities such as vmkfstools
  • Path identifiers, e.g. vmhba1:C0:T0:L1
    • Used exclusively to identify a path to the LUN
    • Generally used for operations with utilities such as vmkfstools
  • Target identifiers, e.g. fc.200100e08ba5ff63:210100e08ba5ff63
We are now adding UUID – Universally Unique IDentifier to the list.

"In addition to being universally unique, no centralized authority is required to administer them."

In vSphere, the UUID is a 16-octet (128-bit) number. It is represented as 16 pairs of hexadecimal digits. The pairs are separated by spaces, except for a dash between the eighth and ninth pairs.

An example UUID looks like this:
uuid.bios = "00 11 22 33 44 55 66 77-88 99 aa bb cc dd ee ff"

VMware uses the UUID to generate a MAC address for a VM.

The UUID value is based on the physical host’s System Management BIOS (SMBIOS) and the path to the virtual machine’s configuration (.vmx) file.

The UUID is stored in the SMBIOS system information (the BIOS of the VM) descriptor and in the file system superblock.

Within the VMX configuration file the UUID information is stored in three variables: uuid.bios, uuid.location and vc.uuid.
  • uuid.bios
    • globally unique identifier
    • generated when a VM is powered on or reset
  • uuid.location
    • hash based on the current path of the VM
    • generated whenever the VM is migrated
  • vc.uuid
    • used by vCenter to identify VM
    • generated when you add the VM to inventory (or create the VM)
The UUID value must be surrounded by quotation marks. A sample configuration is shown below:
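The snippet below is illustrative only; the hexadecimal values are placeholders rather than values taken from a real virtual machine, but the quoting and formatting follow the uuid.bios example above:

uuid.bios = "56 4d 2e 88 f3 84 72 9b-43 16 0f 2d 57 ca 3c 5a"
uuid.location = "56 4d 2e 88 f3 84 72 9b-43 16 0f 2d 57 ca 3c 5a"
vc.uuid = "52 1d 8e 9f 30 6c 4a 7e-92 bb 01 84 c3 aa 10 77"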
Associating a UUID with a virtual machine allows that virtual machine to be uniquely identified even if its network configuration is changed.

Note: VMware uses the form 8-8-4-12 instead of the formal 8-4-4-4-12 form.

References:

SCSI Reservations & Atomic Test and Set (ATS)

SCSI Reservations

SCSI reservations are used to control access to a shared SCSI device such as a LUN. An initiator or host sets a reservation/lock on a LUN in order to prevent another host from making changes to it. This is similar to the file-locking concept.

A SCSI reservation conflict occurs if a host tries to access a datastore that was locked by another host.

A Logical Unit Number (LUN) is an individual, unique, block-based storage device. The term LUN is often used interchangeably with disk and datastore, depending on the context.

In a shared storage environment, when multiple hosts access the same Virtual Machine File System (VMFS) datastore, specific locking mechanisms are used. These locking mechanisms prevent multiple hosts from concurrently writing to the metadata and ensure that no data corruption occurs.

SCSI Reservations
SCSI reservation is a technique that manages disk contention by preventing I/O on an entire LUN for any ESXi host or VMs (other than the host/VM issuing the SCSI reservation). It is important not to have frequent SCSI reservations, as this could hinder performance.

VMFS uses SCSI reservations to give it exclusive access when it performs certain VMFS operations that modify the metadata of a datastore. Examples of VMFS operations include creating a virtual machine or template, turning a VM on, or creating or deleting a file. The host releases the lock once the operation is complete.

SCSI reservations lock an entire storage device while an operation that requires metadata protection is performed. After the operation completes, VMFS releases the reservation and other operations can continue. Because this lock is exclusive, excessive SCSI reservations by a host can cause performance degradation on other hosts that are accessing the same VMFS.

There are two main categories of operation under which VMFS makes use of SCSI reservations.
  • Category 1 covers VMFS datastore-level operations. These include opening, creating, resignaturing, and expanding/extending a VMFS datastore.
  • Category 2 involves the acquisition of locks. These are locks related to VMFS-specific metadata (called cluster locks) and locks related to files (including directories).
These are some examples of VMFS operations that require locking metadata:
  • Creating a VMFS datastore
  • Expanding a VMFS datastore onto additional extents
  • Powering on a virtual machine
  • Acquiring a lock on a file
  • Creating or deleting a file
  • Creating a template
  • Deploying a virtual machine from a template
  • Creating a new virtual machine
  • Migrating a virtual machine with vMotion
  • Growing a file, for example, a snapshot file or a thin provisioned virtual disk
  • For the zeroed thick type of virtual disk the reservation is required only when zeroing the blocks.
VMFS uses SCSI reservations on storage devices that do not support hardware acceleration.

Atomic Test and Set (ATS)
The Atomic Test and Set (ATS) primitive is used for locking on VMFS datastores for VMware vSphere Storage APIs for Array Integration (VAAI) compatible storage arrays. It is superior to the SCSI Reservation locking technique.

VMFS Locking Mechanisms
VMFS supports SCSI reservations and ATS locking. In vSphere 6, depending on its configuration and the type of underlying storage, a VMFS datastore can use the ATS locking mechanism exclusively (ATS-only) or use a combination of ATS and SCSI reservations (ATS+SCSI).

For storage devices that support hardware acceleration, VMFS uses the ATS algorithm, also called hardware assisted locking. This reduces SCSI reservations and enables the ESXi host to integrate with compliant storage arrays and offload locking activities directly to the storage array controller. Unlike with SCSI reservations, ATS supports discrete locking per disk sector.
  • ATS-Only Mechanism
    For compliant storage devices (i.e. those that support T10 standard-based VAAI specifications), VMFS provides ATS locking, also called hardware assisted locking.

    The ATS algorithm supports discrete locking per disk sector. This is in contrast with SCSI reservation which locks an entire storage device while an operation that requires metadata protection is performed.

    After the operation completes, VMFS releases the reservation and other operations can continue.

    All newly formatted VMFS5 datastores use the ATS-only mechanism if the underlying storage supports it, and never use SCSI reservations.

  • ATS+SCSI Mechanism
    A VMFS datastore that supports the ATS+SCSI mechanism is configured to use ATS and attempts to use it when possible. If ATS fails, the VMFS datastore reverts to SCSI reservations.

    Datastores that use the ATS+SCSI mechanism include VMFS5 datastores that were upgraded from VMFS3 and new VMFS5 datastores on storage devices that do not support ATS.
With the storage hardware assistance, your host performs these operations faster and consumes less CPU, memory, and storage fabric bandwidth.

References:

July 09, 2015

Asymmetric Logical Unit Access (ALUA)

ALUA

“A storage controller manages the flow of data between the server and the LUN, assigning two paths, in case one of the paths becomes unavailable.”

An Active controller is available at all times. A passive controller sits idle until the active controller becomes unavailable.

A dictionary definition of asymmetric is “not identical on both sides of a central line”. Asymmetric Logical Unit Access (ALUA) suggests unequal paths between the server and the LUN.
ALUA is implemented on active/active controllers. There are two types of active/active controllers:
  • Asymmetric Active Active
  • Symmetric Active Active
In an Asymmetric Active/Active storage controller architecture (also known as ALUA compliant devices), there is a path to the LUN via either controller and both controllers are defined as “active”, however only one of the paths is defined as an optimal (direct) path. This controller is also referred to as the preferred controller.

I/O requests arriving at the preferred controller are sent directly to the LUN. The other path is non-optimized (indirect) and is normally used only if the optimized path becomes unavailable: I/O requests arriving at the non-preferred controller are first forwarded to the preferred controller before being sent to the LUN.

In a Symmetric Active/Active storage controller architecture, there are no preferred or primary controllers.

ALUA
  • ALUA (Asymmetric Logical Unit Access)
  • Is a SCSI standard.
  • Typically implemented on mid-range storage arrays.
  • The LUN is reachable across both storage processors at the same time
  • All controllers are defined as “active”, however only one controller provides an optimal path to the LUN, this is the controller that owns the LUN
  • Rebalancing across the controllers as workloads change is a manual task
  • ALUA compliance required at the array and at the host multipathing layer
  • Multipathing software can query ALUA compliant arrays to load balance and failover
With ALUA, any given LUN is accessible via both storage processors however only one of these storage processors owns the LUN. This owning controller provides the optimal path to the LUN. The other controller provides the unoptimized path to the LUN. Paths for the non-owning storage processor transit data across the internal interconnect architecture of the mid-range arrays.

Terms to consider: Optimized vs. unoptimized paths; Direct vs. indirect path; Storage processor; Owned; asymmetric active-active architecture vs. symmetric active-active architecture.

Reference:

VMDirectPath I/O

VMDirectPath I/O

VMDirectPath I/O allows guest operating systems to directly access an I/O device, bypassing the virtualization layer. This direct connection frees up CPU cycles and improves performance for VMware ESXi hosts that utilize high-speed I/O devices, such as 10 GbE devices.

A single VM can connect to up to four VMDirectPath PCI/PCIe devices.

The disadvantages of using VMDirectPath I/O on a VM include:
  • Unavailability or restrictions of vSphere features such as vMotion and DRS
  • The adapter can no longer be used by any other virtual machine on the ESXi host
A known exception to this is when ESXi is running on Cisco Unified Computing Systems (UCS) through Cisco Virtual Machine Fabric Extender (VM-FEX) distributed switches.

DirectPath I/O allows virtual machine access to physical PCI functions on platforms with an I/O Memory Management Unit.

The following features are unavailable for virtual machines configured with DirectPath:
  • Hot adding and removing of virtual devices
  • Suspend and resume
  • Record and replay
  • Fault tolerance
  • High availability
  • DRS (limited availability. The virtual machine can be part of a cluster, but cannot migrate across hosts)
  • Snapshots
"VMDirectPath I/O enables a virtual machine to directly connect to a physical device such as a network card or storage adapter."

In the case of networking, instead of using an emulated network device such as the E1000, VMDirectPath I/O enables the virtual machine to bypass the hypervisor and directly access a physical NIC.

Reference:

July 05, 2015

Claim Rule

Claim Rules

Multiple Multipath Plugins (MPPs) cannot manage the same storage device. As such, claim rules allow you to designate which MPP is assigned to which storage device. Each claim rule identifies the following parameters:

  • Vendor/model strings
  • Transport, i.e. SATA, IDE, FC
  • Adapter, target, or LUN location
  • Device driver
Claim rules are defined within /etc/vmware/esx.conf on each ESX/ESXi host and can be managed via the vSphere CLI.

Multipath policies (Fixed, MRU, RR) can be changed within vSphere, however any claim changes are conducted at the command line.

"The PSA uses claim rules to determine which multipathing module should claim the paths to a particular device and to manage the device. esxcli storage core claimrule manages claim rules.
Claim rule modification commands do not operate on the VMkernel directly. Instead they operate on the configuration file by adding and removing rules."

Claim Rule commands:

 List Claim Rules:
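For example, to display the rules defined on a host:

  # List the claim rules currently defined on the host (class, rule number, plugin, and matching criteria)
  esxcli storage core claimrule list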

Claim rules are numbered as follows.
  • Rules 0 – 100 are reserved for internal use by VMware.
  • Rules 101 – 65435 are available for general use. Any third party multipathing plugins installed on your system use claim rules in this range.
  • Rules 65436 – 65535 are reserved for internal use by VMware.
When claiming a path, the PSA runs through the rules starting from the lowest number and determines if a path matches the claim rule specification. If the PSA finds a match, it gives the path to the corresponding plugin.

A given path might match several claim rules.

Reference:

Fixed PSP

Fixed – VMW_PSP_FIXED

With the Fixed (VMW_PSP_FIXED) path selection policy, the host always uses the preferred path to the LUN when that path is available. If the host cannot access the LUN through the preferred path, it tries one of the alternative paths. The host automatically returns to the previously defined preferred path as soon as it becomes available again.

A preferred path is a setting that NMP honors for devices claimed by the VMW_PSP_FIXED path selection policy.

The first path discovered and claimed by the PSP is set as the preferred path.

This is the default policy for LUNs presented from an Active/Active storage array.

Fixed
  • The default policy used with a SAN that is set to Active/Active
  • Uses the designated preferred path whenever available
  • If the preferred path should fail, another path is used until the preferred path is restored
  • Once the preferred path is restored the data moves back onto the preferred path
If you want the host to use a particular preferred path, specify it through the vSphere Web Client, or by using esxcli storage nmp psp fixed deviceconfig set.
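A minimal sketch of setting and verifying a preferred path from the ESXi shell (the device and path names are placeholders; substitute your own):

  # Designate a preferred path for a device claimed by VMW_PSP_FIXED
  esxcli storage nmp psp fixed deviceconfig set --device naa.60060160a62a2c004cd46d7cdb46e011 --path vmhba2:C0:T0:L1

  # Confirm the preferred path setting for the device
  esxcli storage nmp psp fixed deviceconfig get --device naa.60060160a62a2c004cd46d7cdb46e011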

"NOTE If the host uses a default preferred path and the path's status turns to Dead, a new path is selected as preferred. However, if you manually designate the preferred path, it will remain preferred even when it becomes inaccessible."


Path failover behavior by policy and array type:
  • Round Robin
    • Active/Active array: No fail back
    • Active/Passive array: Next path in round robin scheduling is selected
  • Most Recently Used
    • Active/Active array: Administrator action is required to fail back after path failure
    • Active/Passive array: Administrator action is required to fail back after path failure
  • Fixed
    • Active/Active array: VMkernel resumes using the preferred path when connectivity is restored
    • Active/Passive array: VMkernel attempts to resume by using the preferred path. This action can cause path thrashing or failure when another SP now owns the LUN

When using this policy you can maximize the utilization of your bandwidth to the storage array by designating preferred paths to each LUN through different storage controllers. For optimal performance with these arrays you might also consider the Round Robin path policy.

Reference:

Most Recently Used (MRU) PSP

Most Recently Used (MRU) – VMW_PSP_MRU

“The VMW_PSP_MRU policy selects the first working path, discovered at system boot time. If this path becomes unavailable, the ESXi/ESX host switches to an alternative path and continues to use the new path while it is available.

This is the default policy for LUNs presented from an Active/Passive array.
ESXi/ESX does not return to the previous path if, or when, it returns; it remains on the working path until it, for any reason, fails.”

"If the active path fails, then an alternative path will take over, becoming active. When the original path comes back online, it will now be the alternative path."

MRU
  • The ESXi host selects the path that it most recently used
  • This is the default used with a SAN that is set to Active/Passive
  • With this policy, a path is chosen and continues to be used so long as it does not fail
  • If it fails, another path is used, and it continues to be used so long as it does not fail, even if the previous path becomes available again.
  • The host does not revert back to the original path when that path becomes available again
  • There is no preferred path setting with the MRU policy
  • The policy is displayed in the client as the Most Recently Used (VMware) path selection policy
The VMW_PSP_MRU ranking capability allows you to assign ranks to individual paths. To set ranks to individual paths, use the esxcli storage nmp psp generic pathconfig set command.
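A sketch of assigning a rank to a path (the path name and rank value are placeholders; consult the VMware documentation for how rank values are interpreted on your array):

  # Assign a rank value to a specific path claimed by VMW_PSP_MRU
  esxcli storage nmp psp generic pathconfig set --path vmhba2:C0:T1:L1 --config "rank=1"

  # Verify the per-path configuration
  esxcli storage nmp psp generic pathconfig get --path vmhba2:C0:T1:L1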

MRU does not honor a preferred path setting. To change the path selection policy for a device, use the vSphere Web Client or the esxcli storage nmp device set command.


Path failover behavior by policy and array type:
  • Round Robin
    • Active/Active array: No fail back
    • Active/Passive array: Next path in round robin scheduling is selected
  • Most Recently Used
    • Active/Active array: Administrator action is required to fail back after path failure
    • Active/Passive array: Administrator action is required to fail back after path failure
  • Fixed
    • Active/Active array: VMkernel resumes using the preferred path when connectivity is restored
    • Active/Passive array: VMkernel attempts to resume by using the preferred path. This action can cause path thrashing or failure when another SP now owns the LUN

For optimal performance with the arrays for which ESXi defaults to MRU you might also consider the Round Robin path policy.

Reference:

Round Robin PSP

Round Robin - VMW_PSP_RR

The ESXi host uses an automatic path selection algorithm rotating through all active paths when connecting to active-passive arrays, or through all available paths when connecting to active-active arrays. On supported arrays multiple paths can be active simultaneously; otherwise, the default is to rotate between the paths.

"This is the only path selection policy that uses more than one path during a data transfer session. Data is divided into multiple paths, and the paths are alternated to send data. Even though data is sent on only one path at a time, this increases the size of the pipe and  allows more data transfer in the same period of time."

"Round Robin rotates the path selection among all available optimized paths and enables basic load balancing across the paths and fabrics."

Round Robin policy provides load balancing by cycling I/O requests through all Active paths, sending a fixed (but configurable) number of I/O requests through each one in turn.

To adjust how Round Robin cycles through paths, for example the number of I/O operations sent down each path before switching to the next, use the vSphere Web Client or esxcli storage nmp psp roundrobin deviceconfig set, as shown in the sketch below.
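A minimal sketch (the device identifier is a placeholder; the default of 1,000 I/Os per path is reduced here to 1 purely as an illustration):

  # Switch to the next path after every I/O instead of every 1,000 I/Os
  esxcli storage nmp psp roundrobin deviceconfig set --device naa.60060160a62a2c004cd46d7cdb46e011 --type iops --iops 1

  # Review the Round Robin settings for the device
  esxcli storage nmp psp roundrobin deviceconfig get --device naa.60060160a62a2c004cd46d7cdb46e011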

Path failover behavior by policy and array type:
  • Round Robin
    • Active/Active array: No fail back
    • Active/Passive array: Next path in round robin scheduling is selected
  • Most Recently Used
    • Active/Active array: Administrator action is required to fail back after path failure
    • Active/Passive array: Administrator action is required to fail back after path failure
  • Fixed
    • Active/Active array: VMkernel resumes using the preferred path when connectivity is restored
    • Active/Passive array: VMkernel attempts to resume by using the preferred path. This action can cause path thrashing or failure when another SP now owns the LUN

Reference:

Path Selection Plug-In (PSP)

PSP

Path Selection Plug-Ins (PSPs) are subplug-ins of the VMware NMP and are responsible for choosing a physical path for I/O requests.

“The VMware NMP assigns a default PSP for each logical device based on the SATP associated with the physical paths for that device."

"The Path Selection Plug-in (PSP) performs the task of selecting which physical path to use for storage transport. The NMP assigns a default PSP from the claim rules based on the SATP associated with the physical device."

Since multiple MPPs cannot manage the same storage device, claim rules allow you to designate which MPP is assigned to which storage device.

One way to think of PSP is which multipathing solution you are using to load balance.
There are three Path Selection Plug-ins (PSPs) pathing policies included in vSphere:
  • Fixed
    • VMW_PSP_FIXED
      • The host will use a fixed path that is either set as the preferred path by the administrator or is the first path discovered by the host during the boot process
      • Default for active/active arrays
  • MRU
    • VMW_PSP_MRU
      • The host will use the path that is most recently used (MRU). When a path fails and another one is activated, the host will continue to use this new active path even when the original path comes back up
      • Default for active/passive arrays
      • Default for ALUA devices
  • Round Robin
    • VMW_PSP_RR
      • The host will use all active paths in a round robin (RR) fashion. It uses an algorithm to iterate through all active paths. The default number of I/Os that are issued to a particular path is 1000 before moving on to the next active/available path
No default array types are listed for this PSP.
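The PSP can be changed per device, or the default PSP can be changed for an entire SATP. A brief sketch with placeholder device and SATP names:

  # Change the PSP for one device to Round Robin
  esxcli storage nmp device set --device naa.60060160a62a2c004cd46d7cdb46e011 --psp VMW_PSP_RR

  # Change the default PSP that a given SATP assigns to the devices it claims
  esxcli storage nmp satp set --satp VMW_SATP_ALUA --default-psp VMW_PSP_RR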

Reference:

Storage Array Type Plug-In (SATP)

SATP

Storage Array Type Plug-Ins (SATPs) run in conjunction with the VMware NMP and are responsible for array-specific operations.

ESXi offers a Storage Array Type Plug-in (SATP) for every type of array that VMware supports in the VMware Hardware Compatibility List (HCL). It also provides default SATPs that support non-specific active-active and ALUA storage arrays, and the local SATP for direct-attached devices.

Each SATP accommodates special characteristics of a certain class of storage arrays and can perform the array-specific operations required to detect path state and to activate an inactive path. As a result, the NMP module itself can work with multiple storage arrays without having to be aware of the storage device specifics.

The SATP monitors the health of each physical path and can respond to error messages from the storage array to handle path failover. There are third-party SATPs that the storage vendor can provide to take advantage of unique storage properties.

After the NMP determines which SATP to use for a specific storage device and associates the SATP with the physical paths for that storage device, the SATP implements the tasks that include the following:
  • Monitors the health of each physical path.
  • Reports changes in the state of each physical path.
  • Activates an inactive path
  • Performs array-specific actions necessary for storage fail-over, e.g. activates passive paths.
Managing SATPs

The esxcli storage nmp satp list command lists the SATPs that are currently available to the NMP system and displays information about those SATPs:
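Illustrative output (the exact set of SATPs varies by ESXi version and any installed vendor plug-ins):

  esxcli storage nmp satp list
  Name                 Default PSP    Description
  VMW_SATP_ALUA        VMW_PSP_MRU    Supports non-specific arrays that use the ALUA protocol
  VMW_SATP_DEFAULT_AA  VMW_PSP_FIXED  Supports non-specific active/active arrays
  VMW_SATP_LOCAL       VMW_PSP_FIXED  Supports direct attached devices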

Reference:

Native Multipathing Plugin (NMP)

Native Multipathing Plugin (NMP)

Native Multipathing Plugin (NMP) is a generic VMkernel Multipathing Plugin (MPP) provided by default from VMware and built into ESX/ESXi. NMP is used when the storage array does not have a third-party MPP solution.

VMware provides a generic Multipathing Plugin (MPP) called Native Multipathing Plugin (NMP).

What does NMP do?
  • Manages physical path claiming and unclaiming
  • Registers and de-registers logical storage devices
  • Associates a set of physical paths with a specific logical storage device, or LUN
  • Processes I/O requests to storage devices:
    • Selects an optimal physical path for the request (load balance)
    • Performs actions necessary to handle failures and request retries
  • Supports management tasks such as abort or reset of logical storage devices.
NMP is an extensible module that manages subplugins: Storage Array Type Plugins (SATPs) and Path Selection Plugins (PSPs).

A Storage Array Type Plugin (SATP) is responsible for handling path failover for a given storage array. The appropriate SATP for the arrays you use is installed automatically.

A Path Selection Plugin (PSP) determines which physical path is used to issue an I/O request to a storage device. SATPs and PSPs are sub plug-ins within the NMP module.

A Storage Array Type Plugin (SATP) determines how path failover is handled for a specific storage array.

SATPs and PSPs can be built-in and provided by VMware, or can be provided by a third party.

If more multipathing functionality is required, a third party can also provide an MPP to run in addition to, or as a replacement for, the default NMP.

The VMware NMP supports all storage arrays listed on the VMware Hardware Compatibility List (HCL).

You can use esxcli storage nmp to manage devices associated with NMP and to set path policies.

"VMware NMP" namespace:
esxcli storage nmp device list
The list command lists the devices controlled by VMware NMP and shows the SATP and PSP information associated with each device:
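Illustrative output for a single device (the identifier, display name, array type, and paths are placeholders and will differ in your environment):

  naa.60060160a62a2c004cd46d7cdb46e011
     Device Display Name: DGC Fibre Channel Disk (naa.60060160a62a2c004cd46d7cdb46e011)
     Storage Array Type: VMW_SATP_ALUA
     Path Selection Policy: VMW_PSP_MRU
     Path Selection Policy Device Config: Current Path=vmhba2:C0:T0:L1
     Working Paths: vmhba2:C0:T0:L1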
“One of the tasks of NMP is to associate physical storage paths with an SATP and associate a PSP that chooses the best available path. NMP provides a path selection algorithm based on the array type.“

VMware NMP Flow of I/O
When a virtual machine issues an I/O request to a storage device managed by the NMP, the following process takes place.
  • The NMP calls the PSP assigned to this storage device
  • The PSP selects an appropriate physical path on which to issue the I/O
  • The NMP issues the I/O request on the path selected by the PSP
  • If the I/O operation is successful, the NMP reports its completion
  • If the I/O operation reports an error, the NMP calls the appropriate SATP
  • The SATP interprets the I/O command errors and, when appropriate, activates the inactive paths
  • The PSP is called to select a new path on which to issue the I/O
  • If the I/O operation is successful, the NMP reports its completion
Reference:

MPP - Multipathing Plug-in

MPP - Multipathing Plug-in

The top-level plug-in in Pluggable Storage Architecture (PSA) is the Multipathing Plug-in (MPP). The MPP can be either the internal MPP, which is called the Native Multipathing Plug-in (NMP), or a third-party MPP supplied by a storage vendor. In ESXi storage is accessed through a VMware built-in MPP (NMP) or a third-party MPP.

VMware’s Native Multipathing Plug-in is also a MPP.

The process for connecting a storage array to VMware includes:
  • Check the VMware Hardware Compatibility List (HCL) to determine if it is a supported array
  • Use the built-in NMP to handle multipathing and load balancing, if in the HCL
  • Use a supported third-party MPP, if there is need for additional functionality  provided by the MPP
Third-party MPP solutions, such as Symantec DMP and EMC PowerPath/VE, replace the behavior of the NMP, SATP, and PSP. They take control of the path failover and the load-balancing operations for specified storage devices. Third-party MPPs might provide better load-balancing performance than the built-in NMP solution.

The MPP module replaces the NMP, SATP, and PSP.

“The MPP can change the path selection normally handled by the PSP. As a result, it can provide more sophisticated path selection, notable performance increases or other new functionality not present in vSphere by default."

MPPs can coexist with NMP.

A storage path is a possible route from an initiator to a given LUN through which the I/O may travel. A path can be in one of the following states:

Path states and descriptions:
  • Active: A path via an Active storage controller. The path can be used for I/O. Operational and in use.
  • Standby: A path via a Passive or Standby storage controller. The path can be used for I/O if active paths fail. Operational but not in use.
  • Disabled: The path has been administratively disabled, usually by the vSphere administrator.
  • Dead: The path cannot be used for I/O.
  • Unknown: The path is in an unknown error state.
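The state of each path can be inspected from the ESXi shell; a brief sketch (the device identifier is a placeholder):

  # List all paths with their runtime names, state, and the device they lead to
  esxcli storage core path list

  # List only the paths to a specific device
  esxcli storage core path list --device naa.60060160a62a2c004cd46d7cdb46e011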

Reference:

July 03, 2015

Pluggable Storage Architecture (PSA)

PSA - (Storage API - Multipathing)

With a SAN, to improve availability, the administrator can create multiple, redundant paths between hosts and storage targets or LUNs. However, multiple, redundant paths can introduce confusion if not properly managed. Multipathing software was created to minimize the confusion. Multipathing software takes control of all I/O requests; it chooses the best path for data transmission and manages path failover if an active path becomes unavailable.

The multipathing software in ESXi is known collectively as the Pluggable Storage Architecture (PSA). The PSA is used to manage storage multipathing in ESXi. It is an open, modular framework that coordinates the simultaneous operation of multipathing plug-ins (MPPs) created by VMware and/or third-party software developers.

Pluggable Storage Architecture (PSA)
vSphere Pluggable Storage Architecture (PSA) Framework is a special VMkernel layer in vSphere that manages storage multi-pathing. It is a collection of VMkernel APIs that allow storage partners or third-party software developers to enable and certify their arrays asynchronous to ESXi release schedules. Using PSA, partners can design their own load balancing techniques and failover mechanisms for a particular storage array.

When a virtual machine sends a SCSI command to access data on a SAN, the VMkernel needs to know how to access the storage and which path it should choose. PSA manages this function and defines how multipathing works within vSphere.


PSA enables the delivery of performance-enhancing, multipathing and load-balancing behaviors that are optimized per array.

“The PSA acts as a base for two types of storage plug-ins: VMware’s Native Multipathing Plug-in (NMP) and a third-party vendor’s Multipathing Plug-in (MPP). NMP itself acts as a management layer for additional sub-plug-ins: Storage Array Type Plug-in (SATP) and Path Selection Plug-ins (PSP).”

The PSA performs two primary tasks:
  1. Discovers available storage and the physical paths to that storage.
  2. Assigns each storage device a Multipathing Plug-in (MPP) by using predefined claim rules
Since multiple MPPs cannot manage the same storage device, claim rules allow you to designate which MPP is assigned to which storage device.

“PSA discovers the storage and figures out which multipathing driver will be in charge of communicating with that storage. All the error codes, I/O requests, and I/O queuing to the HBA will be handled by the MPP."


Third-party vendors can create and integrate SATPs and PSPs that run alongside VMware’s SATP and PSP.

PSA
  • Pluggable Storage Architecture (PSA)
  • introduced in vSphere 4.0
  • a set of APIs
  • manages storage multipathing
  • enables the function of MPPs, NMPs, SATPs and PSPs
  • allows third-party storage vendor to add code for managing multipathing and access in ESXi
  • sits in the SCSI middle layer of the VMKernel I/O Stack
  • coordinates the operation of the VMware NMP and third-party MPP
Before PSA, the only multipathing options were VMware’s Fixed or Most Recently Used (MRU) policies. The Pluggable Storage Architecture gave third-party storage vendors a means to add policies and to recognize the type of storage deployed.

VMware PSA namespace:
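A few commands in this namespace (a representative sketch, not an exhaustive list):

  # List the loaded PSA multipathing plug-ins (NMP plus any third-party MPPs)
  esxcli storage core plugin list

  # List the claim rules that determine which plug-in claims which paths
  esxcli storage core claimrule list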

PSA Overview
The PSA consists of plug-ins (MPP and NMP) and sub plug-ins (SATP and PSP):
  • Multipathing Plug-in (MPP)
    • These are provided by third-party vendors. An example of an MPP is EMC's PowerPath/VE. VMware’s Native Multipathing Plug-in is also an MPP.
  • VMware Native Multipathing Plug-in (NMP) – Handles overall MPIO (multipath I/O) behavior and array identification
    • Storage Array Type Plug-ins (SATP) – also called Storage Array Type Policy
      • Determines and monitors the physical path states to the storage array
      • Activates new physical paths when the active path(s) has failed
      • Handles path failover for a given array and determines the failover type for a LUN
      • Allows the NMP to be modular by hosting rules on how to handle array-specific actions or behavior as well as any specific operations needed to manage array paths
        • Additional modules for new arrays can be added in the SATP without changing the NMP
      • Perform any other necessary array specific actions required during a storage fail-over
    • Path Selection Plug-in (PSP) – also called Path Selection Policy
      • Handles path selection for a given storage device
      • If the active path fails, PSP determines the next path to use for I/O requests
"PSA selects an MPP, and if that MPP is the NMP, the NMP selects an SATP and PSP."

Reference:

June 22, 2015

LUN Masking

LUN Masking

“If you only implement SAN zoning, a host could gain access to every LUN that is available on any storage array in the zone.”

Beyond zoning, LUN masking allows the administrator to further lock down access to the storage unit. LUN Masking is done on the storage controller and it hides specific LUNs from specific servers.

LUN masking defines relationships between LUNs and individual servers and is used to further limit what LUNs are presented to a host.

"Zoning is controlling which HBAs can see which array service processors through the switch fabric. LUN masking is controlling what the service processors tell the host with regard to the LUNs that they can provide. In other words, the storage administrator can configure the service processor to lie about which LUNs are accessible.

"LUN masking is the ability of a host or an array to intentionally ignore WWNs that it can actively see (in other words, that are zoned to it)."

Reference

Zoning

Zoning

Zoning is a logical separation of traffic between hosts and resources. A SAN zone is similar to an Ethernet VLAN. It creates a logical, exclusive path between nodes on the SAN.

The SAN makes storage available to servers in the form of LUNs. A LUN is potentially accessible by every server on the SAN. In almost every case, having a LUN accessible by multiple servers can create problems such as data corruption as multiple servers contend for the same disk resources. To minimize such issues, zoning and/or LUN masking can be employed to isolate and protect SAN storage devices. Zoning and LUN masking allow the administrator to dedicate storage devices on the SAN to specific servers.

A SAN is populated by nodes. Nodes can be either servers or storage devices. Servers are typically referred to as initiators; storage devices are typically the targets. Zoning creates a relationship between initiator and target nodes on the SAN. With zoning, you create relationships that map initiators to targets.


“Zoning lets you isolate a single server to a group of storage devices or a single storage device, or associate a grouping of multiple servers with one or more storage devices, as might be needed in a server cluster deployment.”

“Zoning is implemented at the SAN switch level either on a port basis (hard-zoning) or on a World-Wide Name (WWN) basis (soft-zoning).”

Soft-zoning controls which WWNs can see which other WWNs through the switch.
Hard-zones are port-based and determine which ports of the switch will be connected to storage processors.

Soft-Zoning
  • Also known as name server zoning
  • Implemented at the switch level
  • Allows access to the node via any port on the switch
SAN zones have been described with analogies such as:

  • A zone is a container into which you place a set of SCSI initiators (HBAs) and a set of SCSI targets (array ports).
  • A zone is like “laying out the roads on a map: it defines where traffic is permitted to flow.”
Zoning and masking are two different methods of accomplishing the same goal: to prevent or minimize the chance that a LUN is accessed by unauthorized hosts and to protect the data on it.

Reference:

June 16, 2015

Disk Shares

Disk shares

"Proportional share" method.

If multiple VMs access the same VMFS datastore and the same logical unit number (LUN), there may be contention as they try to access the same virtual disk resource at the same time. Under certain conditions, the administrator may need to prioritize disk access for specific virtual machines; this can be done using disk shares.

If you want to give priority to specific VMs when there is access contention, you can do so using disk shares.

Using disk shares, the administrator can ensure that the more important virtual machines get preference over less important virtual machines for I/O bandwidth allocation.

“Shares is a value that represents the relative metric for controlling disk bandwidth to all virtual machines. The values are compared to the sum of all shares of all virtual machines on the server.”

“Disk shares are relevant only within a given host. The shares assigned to virtual machines on one host have no effect on virtual machines on other hosts.”

Disk shares control the relative priority with which a virtual disk can draw I/O bandwidth from the physical resource; a separate Limit - IOPs setting caps the I/O operations per second. The share values are Low, Normal, High, or Custom, with the following relative share values:
  • High:  2,000 shares
  • Normal:  1,000 shares
  • Low:  500 shares
The more “shares” a virtual machine is assigned, the more priority it will have for access to the physical disk resource, when there is contention.
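As a simple illustration (the values are assumed for the example): if two virtual machines contend for the same datastore on one host, and VM A's disk has 2,000 shares while VM B's disk has 1,000 shares, VM A is entitled to 2000 / (2000 + 1000), or roughly two thirds, of the available I/O bandwidth during contention, and VM B to the remaining one third.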

To allocate the host disk’s I/O bandwidth to the virtual hard disk of a virtual machine via vSphere Web Client:

1 Right-click a virtual machine in the inventory and select Edit Settings.
2 On the Virtual Hardware tab, expand Hard disk to view the disk options.
3 In the Shares drop-down menu, select a value for the shares to allocate to the virtual machine.
4 If you selected Custom, enter a number of shares in the text box.
5 In the Limit - IOPs box, enter the upper limit of storage resources to allocate to the virtual machine, or select Unlimited.
6 Click OK.

Reference:

Virtual Machine Storage Policies

VM Storage Policies

Virtual machine storage policies enable the administrator to define storage requirements for the virtual machine and determine:
  • Which storage/datastore is provided for the virtual machine
  • How the virtual machine is placed within the storage
  • Which data services are offered to the virtual machine
Storage policies define the storage requirements for the virtual machine, or more specifically, they define the storage requirements for the applications running in the virtual machine. Applying a storage policy to a virtual machine determines whether or not the datastore meets all the requirements of the VM as defined by the storage policy.

Storage policies identify the appropriate storage to use for a given virtual machine.

“In software-defined storage environments, such as Virtual SAN and Virtual Volumes, the storage policy also determines how the virtual machine storage objects are provisioned and allocated within the storage resource to guarantee the required level of service.”

“The virtual machine home files (.vmx, .vmsd, .nvram, .log, and so on) and the virtual disks (.vmdk) can have separate storage policies.”

As of vSphere 5.5 virtual machine storage profiles are called virtual machine storage policies.
Storage policies are based on SLA, performance, and other metrics, which are used during provisioning, cloning, Storage vMotion, and Storage DRS. They leverage VASA for metrics and characterization and support all arrays on the Hardware Compatibility List, regardless of whether they are NFS, iSCSI, or FC. They also enable compliance status reporting in vCenter.

VMware vStorage APIs – Storage Awareness (VASA) is an API that allows vSphere visibility into the storage array to query storage configurations and to set storage properties for compliant arrays.
VASA is a set of APIs that a storage vendor can use to advertise information about their storage array. VASA can advertise that an array volume is a RAID-5 disk set, its health status, and whether any disks in the LUN have failed.

“Storage vendors use VASA to provide vSphere with information about specific disk arrays for tighter integration between storage and the virtual infrastructure. The shared information includes details on storage virtualization, including health status, configuration, capacity and thin provisioning.”

The goal of VASA is to enable the storage array to advertise its capabilities. Applications such as vCenter can query the APIs and use the results to define VM storage policies. VASA provides vCenter visibility into the storage array and exposes information on array features such as:
  • Snapshot
  • Deduplication
  • Replication state
  • RAID levels
  • Disk provisioning
And status information such as:
  • Health
  • Alerts
Using storage policies, storage capabilities can be described in terms of capacity, performance, fault tolerance, replication, etc. This capability information is provided either by the storage vendor via vSphere Storage APIs - Storage Awareness (VASA) or is manually defined by the administrator as user-defined tags.

Capabilities are attributes of the storage. They are accessed via VASA or from user defined tags.
Profiles describe what capabilities a virtual machine requires for storage. If a datastore meets all the requirements of a VM or virtual disk, it is said to be compliant with the VM.

Using Policy-Driven Storage, various storage characteristics can be specified in a virtual machine storage policy. These policies are used during provisioning, cloning, and Storage vMotion to ensure that only those datastores or datastore clusters that meet the VM's storage policy requirements are used. Policy-Driven Storage leverages VASA for array characterization and requires an Enterprise Plus license.

Policy-Driven Storage allows storage tiers to be defined in vCenter.

Policy-driven storage allows a vSphere administrator to create storage policies that describe the storage requirements for a VM in terms of storage attributes and capabilities. Storage policies simplify the process of choosing datastores that meet the storage requirements of a virtual machine.

“Once a VM is up and running, vCenter monitors and sends an alert if a VM happens to be in breach of the assigned storage policy."

Reference:

Virtual Disk Alignment

Align Virtual Disks

In a properly aligned storage architecture, the units of data in the various storage layers are aligned in such a way as to maximize I/O efficiency. In an unaligned storage architecture, accessing a file from the OS layer results in extra I/O operations at the other storage layers.

In a properly aligned structure, Windows guest OS clusters, VMFS blocks, and SAN chunks line up on even boundaries.


I/O access at the guest OS layer results in a minimum amount of I/O access at the other layers: VMFS and SAN.

I/O operation on Cluster 1 results in I/O operations of one block at the VMFS layer and one chunk at the SAN layer. No extra SAN data access required.
In an unaligned structure the units of data at the other layers are not laid out on even boundaries.

I/O operation on a single cluster may result in many additional I/O operations at the other storage layers.

I/O access (Cluster 2) from the guest OS layer results in extra I/O access at the other layers: VMFS (two Blocks) and SAN (two Chunks). Reading one cluster at the guest OS layer results in reading multiple blocks and chunks.

"An unaligned architecture incurs latency and throughput penalties. The additional I/O (especially if small) can impact system resources significantly on some host types."

Virtual disk alignment issues occur when the starting offset of the VMFS partition does not align with the physical segmentation of the underlying disks.

"Using the vSphere Client to create VMFS partitions avoids this problem since, beginning with ESXi 5.0, it automatically aligns VMFS3 or VMFS5 partitions along the 1MB boundary."

The purpose of alignment is to minimize extraneous internal array operations. All arrays have internal constructs that are generally a function of the RAID model (and also the filesystem alignment, and in some cases logical page table constructs in virtually provisioned models).

"The improper alignment of VMFS file system partitions may impact performance. The recommended practice is to add VMFS storage to ESXi hosts using the vSphere Client, as it automatically aligns VMFS partitions when it creates them. For ESXi 5, VMFS3 and VMFS5 file systems that are created using the vSphere Client are automatically aligned on a 1 MB boundary.

VMFS3 file systems created with a previous version of ESX/ESXi used 64 KB alignment. Partitions that are created using vmkfstools may be aligned manually using the partedUtil tool from the command line. For detailed instructions on using partedUtil, refer to the VMware Knowledge Base entry: http://kb.vmware.com/kb/1036609."
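Alignment of an existing partition can be checked from the ESXi shell with partedUtil; a brief sketch (the device identifier is a placeholder):

  # Show the partition table for a device; each partition line shows the partition number followed by its starting and ending sectors
  partedUtil getptbl /vmfs/devices/disks/naa.60060160a62a2c004cd46d7cdb46e011

A VMFS partition that starts at sector 2048 (2048 x 512-byte sectors = 1 MB) is aligned on the 1 MB boundary.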

Reference:

June 10, 2015

VSAN - Virtual SAN

VSAN

VMware Virtual SAN abstracts and pools server-side flash and disk into shared pools of capacity with policy-based management and application-centric data services.

Virtual SAN pools server-attached hard disk drives and flash (SSDs, and PCI-e) drives to create a distributed shared datastore that abstracts the storage hardware and provides a Software-Defined Storage (SDS) tier for virtual machines.

VSAN leverages local storage from a number of ESXi hosts in a cluster and creates a distributed shared datastore. This shared datastore can be used for VM placement and core vSphere technologies such as vMotion, DRS, VMware Site Recovery Manager, etc.

“VSAN leverages the power of any solid state drives (SSDs) on the hosts in the cluster for read caching and write buffering to improve performance."

In a hybrid architecture VSAN has both flash (SSD and PCIe) and hard disk drive (HDD) devices. The flash devices are utilized as a read cache and the HDDs are pooled to create a distributed shared datastore.

For applications that require high performance, Virtual SAN 6.0 can be deployed in an All-Flash storage architecture in which flash-based devices are intelligently utilized only as a write cache while other flash-based devices provide high endurance for data persistence.
VSAN
  • Introduced with vSphere 5.5
  • Creates Virtual SAN shared datastore
  • Embedded in the hypervisor kernel
  • Hybrid or all-Flash architecture
  • Flash provides caching
  • Managed via the vSphere Web Client
  • Per-VM policies
  • Minimum of three hosts per VSAN cluster
  • Maximum of eight hosts per VSAN cluster
  • Requires at least one VMkernel port per host
  • One or more SSDs per host
  • One or more HDDs per host
  • Uses a proprietary protocol of IP over a VMKernel port to communicate between the nodes
  • 1 Gbps network between hosts (10 Gbps recommended)
VSAN uses the SSD as a read/write cache; the capacity of the SSD is not added to the overall usable space of the VSAN datastore.
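From the ESXi shell, cluster membership and the disks claimed by Virtual SAN can be inspected with the esxcli vsan namespace (available on hosts running vSphere 5.5 and later); a brief sketch:

  # Show whether the host is part of a Virtual SAN cluster and list the cluster membership details
  esxcli vsan cluster get

  # List the SSDs and HDDs that this host contributes to the Virtual SAN datastore
  esxcli vsan storage list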

Caching: When blocks are written to the underlying datastore, they are first written to the SSDs, and later the data can be relocated to the HDDs.

VSAN uses "RAIN, or Reliable Array of Independent Nodes" instead of RAID. With RAIN, all the disks across all the hosts are combined into a large pool of storage. This storage is accessible across all the hosts in the cluster. RAIN enables HA across disk or host failures.

VSA (vSphere Storage Appliance) runs in a VM while VMware’s Virtual SAN (VSAN) is integrated into vSphere hypervisor (ESXi).

vSphere Storage Appliance vs. VMware Virtual SAN
  • Description
    • vSphere Storage Appliance: Low cost, simple shared storage for small deployments
    • VMware Virtual SAN: Scale-out distributed storage designed for virtualized/cloud environments
  • Form Factor
    • vSphere Storage Appliance: Virtual appliance
    • VMware Virtual SAN: Built into the vSphere kernel
  • Scalability
    • vSphere Storage Appliance: 2 to 3 vSphere servers; does not scale beyond 3 hosts
    • VMware Virtual SAN: Minimum 3 node deployment; scales out to 8 nodes
  • Performance
    • vSphere Storage Appliance: No SSD requirement
    • VMware Virtual SAN: SSD caching requirement

"VSAN employs algorithms to help protect against data loss, such as ensuring that the data exists on multiple participating VSAN nodes at the same time."

VSAN is conceptually similar to the VSA; however, VSAN provides better scaling, does not require NFS mounts, and is embedded into the hypervisor, eliminating the complexity of deploying and managing a virtual appliance.

Reference:

VSA - vSphere Storage Appliance

VSA

"Shared Storage for Everyone"

The vSphere Storage Appliance (VSA) allows local storage on ESXi hosts to be used as shared storage, enabling the use of many storage-dependent virtualization features, such as vMotion, distributed resource scheduling (DRS), and high availability (HA), without the need for a SAN.

VSA also functions as a storage cluster, providing continued availability of all the data it stores even if one node in the cluster fails. The VSA is a VM on each host in the cluster. If one host fails, the VSA fails over automatically to one of the other hosts.

vSphere Storage Appliance (VSA) is VMware software that transforms existing, local server storage into shared storage that can be shared by up to three vSphere hosts.

vSphere Storage Appliance vs. VMware Virtual SAN
  • Description
    • vSphere Storage Appliance: Low cost, simple shared storage for small deployments
    • VMware Virtual SAN: Scale-out distributed storage designed for virtualized/cloud environments
  • Form Factor
    • vSphere Storage Appliance: Virtual appliance
    • VMware Virtual SAN: Built into the vSphere kernel
  • Scalability
    • vSphere Storage Appliance: 2 to 3 vSphere servers; does not scale beyond 3 hosts
    • VMware Virtual SAN: Minimum 3 host deployment; scales out to vSphere cluster size
  • Performance
    • vSphere Storage Appliance: No SSD requirement
    • VMware Virtual SAN: SSD caching (high performance)

vSphere Storage Appliance (VSA) takes local storage and presents it back to ESXi hosts as a shared NFS mount.

A VSA cluster with 3 members has 3 VSA datastores and maintains a replica of each datastore.
VMware announced that the End of Availability of all vSphere Storage Appliance versions was April 1, 2014.


Product lifecycle dates:
  • VMware vSphere Storage Appliance 1.0: General Availability 2011/08/24; End of General Support 2013/08/24; End of Technical Guidance N/A; End of Availability 2014/04/01
  • VMware vSphere Storage Appliance 5.1: General Availability 2012/09/10; End of General Support 2014/09/10; End of Technical Guidance N/A; End of Availability 2014/04/01
  • VMware vSphere Storage Appliance 5.5: General Availability 2013/09/19; End of General Support 2018/09/19; End of Technical Guidance 2020/09/19; End of Availability 2014/04/01

VSA
  • A VM deployed from an OVF
  • Leverages existing internal local direct attached storage (DAS) to present an NFS volume as shared storage to the VSA cluster
  • Uses RAID 1 across different nodes and RAID 10 within each node
  • The VSA is run on each host in the cluster
  • Clusters storage across server nodes
  • Designed for small environments without shared storage
  • End of Availability for vSphere Storage Appliance reached April 1, 2014
Reference:

Array & Virtual Disk Thin Provisioning

Array and Virtual Disk Thin Provisioning

Array Thin Provisioning enables the creation of a datastore that is logically larger than what the array can support. The consequence is there may not be enough physical space available when needed.
Array thin provisioning is done in the storage array, before and independently of the virtualization layer.

Array thin provisioning allows the organization to maximize space utilization and delay the purchase of additional capacity. It minimizes CAPEX.

Array Thin Provisioning:
  • You can overallocate or oversubscribe the storage by allowing a server to claim more storage than has physically been set aside for it.

    This increases flexibility when you don’t know which hosts will grow, yet you are sure there will be growth
  • Physical storage capacity is dedicated to each host only when data is actually written to the disk blocks

Virtual Disk Thin Provisioning

"Virtual disk thin provisioning controls how much of the datastore capacity will actually be dedicated to the VM’s disk. If you elect to thin provision a VM’s disk, the size of the disk will indicate how much of the datastore is dedicated to it, but only the amount that is written to it will be subtracted from the datastore capacity.”

The VMDKs on NFS datastores in vSphere are always thin provisioned.

VMware vs. Array?

Thin Provisioning - Should it be done in the Array or in VMware? The general answer is that both are right.

If your array supports thin provisioning, it's generally more efficient to use array-level thin provisioning in most operational models.

Thin provisioning tends to be more efficient the larger the thin pool.
On an array, the pool is typically larger than a single datastore, so array-level thin provisioning usually makes better use of capacity.

Is there a downside to thin on thin? Thin-on-thin is the concept of thin provisioning at both the array and the virtualization layer. One consideration is the need for increased monitoring, because the consequence of running out of space without notice can be disastrous. Careful monitoring of disk usage at both the vSphere layer and the storage layer is a requirement.

VMware oversubscribes guest memory; however, it also provides the ability to use VM swap as a backdoor. There is no backdoor with thin-provisioned disks.

Thin Provisioning Concerns

There are a few concerns with Thin Provisioning.
  • Space Management: Running out of space on a device that is Thin Provisioned at the back-end is a major concern. Starting at vSphere 5.0 a number of enhancements were made through VAAI to provide notification of space issues:
    • "VAAI will now automatically raise an alarm in vSphere if a Thin Provisioned datastore becomes 75% full"
    • Only VMs which require additional disk space on datastores with no more space are paused, unlike before when all VMs on the datastore were paused
    • Storage DRS is effectively disabled on a datastore if the 75% alarm is triggered on it
  • Dead space reclamation: The inability to reuse space on a thin-provisioned datastore (see the example after this list).

    "Prior to vSphere 5.0, if a VM's files are deleted or if a VM is Storage vMotioned, there was no way to inform the array that the disk space was no longer being used. At 5.0, a new VAAI primitive called UNMAP was introduced. It informs the array about blocks that are no longer used."
  • Metadata updates:

    "If the VMDK is provisioned as thin, then each time the VMDK grows (new blocks added), the VMFS datastore would have to be locked so that it's metadata could be updated with the new size information. Historically this was done with SCSI reservations, and could cause some performance related issues if a lot of thinly provisioned VMs were growing at the same time. VAAI and the Atomic Test & Set primitive (ATS) replaced SCSI reservations for metadata updates, minimizing the performance concerns with metadata updates."
Reference:

Thin Provisioning - Provisioning - Storage Features

Thin Provisioning

Array Thin Provisioning allows you to create a datastore that is logically larger than what the array can actually support.

In a general sense, thin provisioning of disks allows you to overpromise what you can possibly deliver.

"Space required for thin-provisioned virtual disk is allocated and zeroed on demand as the space is used. Unused space is available for use by other virtual machines."

For example, if an administrator allocates 200 GB to a new virtual machine, and if the virtual machine uses only 40 GB, the remaining 160 GB are available for allocation to other virtual machines. As a virtual machine requires more space, vSphere provides additional blocks (if available) to it up to the originally allocated size, 200 GB in this case.

By using thin provisioning, administrators can create virtual machines with virtual disks of a size that is necessary in the long-term without having to immediately commit the total disk space that is necessary to support that allocation.

“The thin provisioning format is similar to the lazy-zeroed format in that the blocks and pointers are not zeroed or formatted on the storage area at the time of creation. In addition, the blocks used by the virtual disk are not preallocated for the VMFS datastore at the time of creation. When storage capacity is required by the virtual disk, the VMDK allocates storage in chunks equal to the size of the file system block.”

"As I/O occurs in the guest, the VMkernel zeroes out the space needed right before the guest I/O is committed and grows the VMDK file similarly. Sometimes, this is referred to as a sparse file. Note that space deleted from the guest OS's file system won't necessarily be released from the VMDK; if you added 50 GB of data and then deleted 50 GB of data, the space wouldn't necessarily be released to the hypervisor so that the VMDK can shrink in size."

The T10 SCSI UNMAP command is needed to address this situation.

Thin provisioned:
  • Supported at VMware vSphere 4.0 and later
  • Space required for a thin-provisioned virtual disk is allocated and zeroed upon first write, as opposed to upon creation.
  • There is a higher I/O cost (similar to that of lazy-zeroed thick disks) during the first write to an unwritten file block.
  • The use of VAAI-capable SAN storage can improve thin-provisioned disk first-time-write performance by improving file locking capability and offloading zeroing operations to the storage array.
  • If necessary, the disk can be manually converted to a thick disk later (see the vmkfstools example after this list).
  • Thin provisioning provides storage on demand, and the amount of space consumed by the virtual disk on the VMFS datastore grows as data is written to the disk.
  • Thin-provisioning must be carefully managed, as multiple virtual machines may be using thin provisioned disks on the same VMFS datastore.
  • Results in CAPEX savings (no need to purchase additional disk space)
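The sketch below shows creating a thin-provisioned disk and later inflating it to an eager-zeroed thick disk with vmkfstools (the datastore path, file name, and size are placeholders):

  # Create a 10 GB thin-provisioned virtual disk
  vmkfstools -c 10G -d thin /vmfs/volumes/Datastore01/vm01/vm01_1.vmdk

  # Inflate the thin disk to eagerzeroedthick; all blocks are allocated and zeroed
  vmkfstools --inflatedisk /vmfs/volumes/Datastore01/vm01/vm01_1.vmdk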
Reference: