June 10, 2015

VSAN - Virtual SAN


VMware Virtual SAN abstracts and pools server-side flash and disk into shared pools of capacity with policy-based management and application-centric data services.

Virtual SAN pools server-attached hard disk drives and flash (SSDs, and PCI-e) drives to create a distributed shared datastore that abstracts the storage hardware and provides a Software-Defined Storage (SDS) tier for virtual machines.

VSAN leverages local storage from a number of ESXi hosts in a cluster and creates a distributed shared datastore. This shared datastore can be used for VM placement and core vSphere technologies such as vMotion, DRS, VMware Site Recovery Manager, etc.

“VSAN leverages the power of any solid state drives (SSDs) on the hosts in the cluster for read caching and write buffering to improve performance."

In a hybrid architecture VSAN has both Flash (SSDs and PCI-e) and hard disk drive (HDD) devices. The Flash devices are utilized as a read cache and write buffer, and the HDDs are pooled to create a distributed shared datastore.

For applications that require high performance, Virtual SAN 6.0 can be deployed in an All-Flash storage architecture in which flash-based devices are intelligently utilized only as a write cache while other flash-based devices provide high endurance for data persistence.
  • Introduced with vSphere 5.5
  • Creates Virtual SAN shared datastore
  • Embedded in the hypervisor kernel
  • Hybrid or all-Flash architecture
  • Flash provides caching
  • Managed via the vSphere Web Client
  • Per-VM policies
  • Minimum of three hosts per VSAN cluster
  • Maximum of eight hosts per VSAN cluster in the beta (32 hosts in the vSphere 5.5 release, 64 hosts in Virtual SAN 6.0)
  • Requires at least one VMkernel port per host
  • One or more SSDs per host
  • One or more HDDs per host
  • Uses a proprietary protocol over IP on a VMkernel port to communicate between the nodes
  • 1 Gbps network between hosts (10 Gbps recommended)
VSAN uses the SSD as a read/write cache; the capacity of the SSD is not added to the overall usable space of the VSAN datastore.
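
As a rough illustration of the point above, the following sketch adds up the raw capacity of a hybrid cluster by counting only the HDDs. The `Host` class, the function name, and the disk sizes are hypothetical and chosen for illustration; the result is raw capacity before any replication overhead.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Host:
    """One ESXi host contributing local disks (all sizes in GB)."""
    ssd_gb: List[int]  # flash devices: cache tier only
    hdd_gb: List[int]  # magnetic disks: capacity tier


def raw_datastore_capacity_gb(hosts: List[Host]) -> int:
    """Sum only the HDD capacity; SSDs act as cache and add no usable space."""
    return sum(sum(host.hdd_gb) for host in hosts)


# Hypothetical three-host hybrid cluster: one 200 GB SSD and two 1000 GB HDDs per host.
cluster = [Host(ssd_gb=[200], hdd_gb=[1000, 1000]) for _ in range(3)]
print(raw_datastore_capacity_gb(cluster))  # 6000
```

The 600 GB of flash in this example does not appear in the total, which is the behaviour the paragraph describes.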

Caching: When blocks are written to the underlying datastore, they are first written to the SSDs, and later the data can be relocated to the HDDs.

VSAN uses "RAIN, or Reliable Array of Independent Nodes" instead of RAID. With RAIN, all the disks across all the hosts are combined into a large pool of storage. This storage is accessible across all the hosts in the cluster. RAIN enables HA across disk or host failures.

VSA (vSphere Storage Appliance) runs in a VM while VMware’s Virtual SAN (VSAN) is integrated into vSphere hypervisor (ESXi).

 | vSphere Storage Appliance | VMware Virtual SAN
Description | Low cost, simple shared storage for small deployments | Scale-out distributed storage designed for virtualized/cloud environments
Form Factor | Virtual Appliance | Built into vSphere kernel
Scalability | 2 to 3 vSphere servers; does not scale beyond 3 hosts | Minimum 3 node deployment; scale out to 8 nodes
Performance | No SSD requirement | SSD Caching requirement

"VSAN employs algorithms to help protect against data loss, such as ensuring that the data exists on multiple participating VSAN nodes at the same time."

VSAN is conceptually similar to the VSA; however, VSAN scales better, does not require NFS mounts, and is embedded in the hypervisor, which eliminates the complexity of deploying and managing a virtual appliance.


VSA - vSphere Storage Appliance


"Shared Storage for Everyone"

The vSphere Storage Appliance (VSA) allows local storage on ESXi hosts to be used as shared storage, enabling the use of many storage-dependent virtualization features, such as vMotion, distributed resource scheduling (DRS), and high availability (HA), without the need for a SAN.

VSA also functions as a storage cluster, providing continued availability of all the data it stores even if one node in the cluster fails. The VSA is a VM on each host in the cluster. If one host fails, the VSA fails over automatically to one of the other hosts.

vSphere Storage Appliance (VSA) is VMware software that transforms existing, local server storage into shared storage that can be shared by up to three vSphere hosts.

 | vSphere Storage Appliance | VMware Virtual SAN
Description | Low cost, simple shared storage for small deployments | Scale-out distributed storage designed for virtualized/cloud environments
Form Factor | Virtual Appliance | Built into vSphere kernel
Scalability | 2 to 3 vSphere servers; does not scale beyond 3 hosts | Minimum 3 host deployment; scale out to vSphere cluster size
Performance | No SSD requirement | SSD Caching (high performance)

vSphere Storage Appliance (VSA) takes local storage and presents it back to ESXi hosts as a shared NFS mount.

A VSA cluster with 3 members has 3 VSA datastores and maintains a replica of each datastore.
VMware announced that the End of Availability of all vSphere Storage Appliance versions was April 1, 2014.

Product Release | General Availability | End of General Support | End of Technical Guidance | End of Availability
VMware vSphere Storage Appliance 1.0 | 2011/08/24 | 2013/08/24 | N/A | 2014/04/01
VMware vSphere Storage Appliance 5.1 | 2012/09/10 | 2014/09/10 | N/A | 2014/04/01
VMware vSphere Storage Appliance 5.5 | 2013/09/19 | 2018/09/19 | 2020/09/19 | 2014/04/01

  • A VM deployed from an OVF
  • Leverages existing internal local direct attached storage (DAS) to present an NFS volume as shared storage to the VSA cluster
  • Uses RAID 1 across different nodes and RAID 10 within each node
  • The VSA is run on each host in the cluster
  • Clusters storage across server nodes
  • Designed for small environments without shared storage
  • End of Availability for vSphere Storage Appliance reached April 1, 2014

Array & Virtual Disk Thin Provisioning

Array Thin Provisioning enables the creation of a datastore that is logically larger than what the array can support. The consequence is there may not be enough physical space available when needed.
Array thin provisioning is done in the storage arrays before and/or independent of the virtualization layer.

Array thin provisioning allows the organization to maximize space utilization and delay the purchase of additional capacity. It minimizes CAPEX.

Array Thin Provisioning:
  • You can overallocate or oversubscribe the storage by allowing a server to claim more storage than has physically been set aside for it.

    This increases flexibility when you don’t know which hosts will grow, yet you are sure there will be growth
  • Physical storage capacity is dedicated to each host only when data is actually written to the disk blocks

Virtual Disk Thin Provisioning

"Virtual disk thin provisioning controls how much of the datastore capacity will actually be dedicated to the VM’s disk. If you elect to thin provision a VM’s disk, the size of the disk will indicate how much of the datastore is dedicated to it, but only the amount that is written to it will be subtracted from the datastore capacity.”

The VMDKs on NFS datastores in vSphere are always thin provisioned.

VMware vs. Array?

Thin Provisioning - Should it be done in the Array or in VMware? The general answer is that both are right.

If your array supports thin provisioning, it's generally more efficient to use array-level thin provisioning in most operational models.

Thin provisioning tends to be more efficient the larger the thin pool. On an array, the pool tends to be larger than a single datastore, so array-level thin provisioning is typically the more efficient of the two.

Is there a downside to thin on thin? Thin-on-thin is the concept of thin provisioning at both the array and the virtualization layer. One consideration is the need for increased monitoring, as the consequences of running out of space without notice can be disastrous. Careful monitoring of disk usage at both the vSphere layer and the storage layer is a requirement.
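
The need to watch both layers can be sketched as a small check that is run once against datastore figures and once against array pool figures. This is an assumed, stand-alone script rather than a vSphere or array API: the function name, the 75% threshold, and all usage numbers are illustrative.

```python
from typing import Dict, List, Tuple

ALERT_THRESHOLD = 0.75  # illustrative assumption, tune to your environment


def over_threshold(usage: Dict[str, Tuple[float, float]],
                   threshold: float = ALERT_THRESHOLD) -> List[str]:
    """Return the names whose used/capacity ratio has reached the threshold.

    usage maps a name to (used_gb, capacity_gb).
    """
    return sorted(
        name
        for name, (used_gb, capacity_gb) in usage.items()
        if capacity_gb > 0 and used_gb / capacity_gb >= threshold
    )


# Hypothetical figures in GB, gathered separately at each layer.
datastores = {"datastore1": (600, 1000), "datastore2": (450, 500)}
array_pools = {"pool-a": (7600, 10000)}

print(over_threshold(datastores))   # ['datastore2']
print(over_threshold(array_pools))  # ['pool-a']
```

Running the same check at both layers matters because a datastore can look healthy while the array pool behind it is nearly full, and vice versa.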

VMware oversubscribes guest memory; however, it also provides VM swap as a backdoor. There is no such backdoor with thin-provisioned disks.

Thin Provisioning Concerns

There are a few concerns with Thin Provisioning.
  • Space Management: Running out of space on a device that is Thin Provisioned at the back-end is a major concern. Starting at vSphere 5.0 a number of enhancements were made through VAAI to provide notification of space issues:
    • "VAAI will now automatically raise an alarm in vSphere if a Thin Provisioned datastore becomes 75% full"
    • Only VMs which require additional disk space on datastores with no more space are paused, unlike before when all VMs on the datastore were paused
    • Storage DRS is effectively disabled on a datastore if the 75% alarm is triggered on it
  • Dead space reclamation: The inability to reuse space on a Thin Provisioned datastore.

    "Prior to vSphere 5.0, if a VM's file are deleted or if a VM is Storage vMotioned, there was no way to  inform the array that the disk space was no longer being used. At 5.0, a new VAAI primitive called UNMAP was introduced. It informs the array about blocks that are no longer used."
  • Metadata updates:

    "If the VMDK is provisioned as thin, then each time the VMDK grows (new blocks added), the VMFS datastore would have to be locked so that it's metadata could be updated with the new size information. Historically this was done with SCSI reservations, and could cause some performance related issues if a lot of thinly provisioned VMs were growing at the same time. VAAI and the Atomic Test & Set primitive (ATS) replaced SCSI reservations for metadata updates, minimizing the performance concerns with metadata updates."

Thin Provisioning - Provisioning - Storage Features

Thin Provisioning

Array Thin Provisioning allows you to create a datastore that is logically larger than what the array can actually support.

In a general sense, thin provisioning of disks allows you to overpromise what you can possibly deliver.

"Space required for thin-provisioned virtual disk is allocated and zeroed on demand as the space is used. Unused space is available for use by other virtual machines."

For example, if an administrator allocates 200 GB to a new virtual machine, and if the virtual machine uses only 40 GB, the remaining 160 GB are available for allocation to other virtual machines. As a virtual machine requires more space, vSphere provides additional blocks (if available) to it up to the originally allocated size, 200 GB in this case.
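
The arithmetic in this example can be written out as a short sketch. The function names and the 500 GB of free datastore space are assumptions made for illustration; only the 200 GB and 40 GB figures come from the example above.

```python
def remaining_for_others_gb(allocated_gb: int, used_gb: int) -> int:
    """Space promised to a thin disk that it has not consumed yet."""
    if not 0 <= used_gb <= allocated_gb:
        raise ValueError("used_gb must be between 0 and allocated_gb")
    return allocated_gb - used_gb


def grow(used_gb: int, request_gb: int, allocated_gb: int,
         datastore_free_gb: int) -> int:
    """Return the new used size after a growth request.

    Growth is capped by the originally allocated size and by the
    space that is actually free on the datastore.
    """
    grant = min(request_gb, allocated_gb - used_gb, datastore_free_gb)
    return used_gb + max(grant, 0)


print(remaining_for_others_gb(200, 40))  # 160
print(grow(40, 100, 200, 500))           # 140
print(grow(140, 100, 200, 500))          # 200
```

The last call shows the cap: the disk asks for 100 GB more but stops at its allocated size of 200 GB.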

By using thin provisioning, administrators can create virtual machines with virtual disks of a size that is necessary in the long-term without having to immediately commit the total disk space that is necessary to support that allocation.

“The thin provisioning format is similar to the lazy-zeroed format in that the blocks and pointers are not zeroed or formatted on the storage area at the time of creation. In addition, the blocks used by the virtual disk are not preallocated for the VMFS datastore at the time of creation. When storage capacity is required by the virtual disk, the VMDK allocates storage in chunks equal to the size of the file system block.”

"As I/O occurs in the guest, the VMkernel zeroes out the space needed right before the guest I/O is committed and grows the VMDK file similarly. Sometimes, this is referred to as a sparse file. Note that space deleted from the guest OS's file system won't necessarily be released from the VMDK; if you added 50 GB of data and then deleted 50 GB of data, the space wouldn't necessarily be released to the hypervisor so that the VMDK can shrink in size."

The T10 SCSI UNMAP command is needed to address this situation.
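
A toy bookkeeping model can make the behaviour described above concrete. It is a sketch under stated assumptions: the `ThinDisk` class and its methods are hypothetical, the guest is assumed to reuse freed blocks before asking for new ones, and `reclaim()` is only a stand-in for a dead space reclamation step; it does not model how UNMAP is issued or processed.

```python
class ThinDisk:
    """Tracks guest usage and VMDK size separately (sizes in GB)."""

    def __init__(self, provisioned_gb: int) -> None:
        self.provisioned_gb = provisioned_gb
        self.guest_used_gb = 0  # what the guest file system holds
        self.vmdk_size_gb = 0   # what the VMDK occupies on the datastore

    def guest_write(self, gb: int) -> None:
        if self.guest_used_gb + gb > self.provisioned_gb:
            raise ValueError("write exceeds the provisioned size")
        self.guest_used_gb += gb
        # The VMDK grows only when the guest needs more than it already holds.
        self.vmdk_size_gb = max(self.vmdk_size_gb, self.guest_used_gb)

    def guest_delete(self, gb: int) -> None:
        # Deleting in the guest does not shrink the VMDK.
        self.guest_used_gb = max(0, self.guest_used_gb - gb)

    def reclaim(self) -> int:
        """Release space the guest no longer uses; return the GB freed."""
        freed = self.vmdk_size_gb - self.guest_used_gb
        self.vmdk_size_gb = self.guest_used_gb
        return freed


disk = ThinDisk(provisioned_gb=200)
disk.guest_write(50)
disk.guest_delete(50)
print(disk.guest_used_gb, disk.vmdk_size_gb)  # 0 50
print(disk.reclaim())                         # 50
print(disk.vmdk_size_gb)                      # 0
```

After the delete, the guest reports no data but the VMDK still occupies 50 GB until a reclamation step runs.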

Thin provisioned:
  • Supported at VMware vSphere 4.0 and later
  • Space required for a thin-provisioned virtual disk is allocated and zeroed upon first write, as opposed to upon creation.
  • There is a higher I/O cost (similar to that of lazy-zeroed thick disks) during the first write to an unwritten file block.
  • The use of VAAI-capable SAN storage can improve thin-provisioned disk first-time-write performance by improving file locking capability and offloading zeroing operations to the storage array.
  • If necessary, the disk can be manually converted to a Thick Disk later.
  • Thin provisioning provides storage on demand, and the amount of space consumed by the virtual disk on the VMFS datastore grows as data is written to the disk.
  • Thin-provisioning must be carefully managed, as multiple virtual machines may be using thin provisioned disks on the same VMFS datastore.
  • Results in CAPEX savings (the purchase of additional disk space can be deferred)

Thick Provisioning - Provisioning - Storage Features

Thick Provisioning

Thick provisioned disks
Thick virtual disks, which have all their space allocated at creation time, are further divided into two types: eager zeroed and lazy zeroed.

Lazy Zeroed

Thick (aka LazyZeroedThick)
A thick disk has all space allocated at creation time. This space may contain stale data on the physical media. Before writing to a new block a zero has to be written. The entire disk space is reserved and unavailable for use by other virtual machines.

"Disk blocks are only used on the back-end (array) when they get written to inside in the VM/Guest OS. Again, the Guest OS inside this VM thinks it has this maximum size from the start."

The blocks and pointers are allocated in the VMFS, and the blocks are allocated on the array at creation time. Also, the blocks are not zeroed or formatted on the array. This results in a fast creation time.

"At a later point in time when data needs to be written to the disk, the write process must pause while the blocks required to store the data on the storage array are zeroed out and allocated on the storage array.

This operation occurs every time a first-time-write needs to occur on any area of the disk that has not been written."

Lazy-Zeroed Thick disk format is the default VMFS datastore virtual disk format.

"With Lazy-Zeroed Thick, the size of the VMDK file on the datastore is the size of the virtual disk that you create, but within the file, it is not pre-zeroed at the time of initial creation. As I/O occurs in the guest, the VMkernel zeroes out the space needed right before the guest I/O is committed, but the VMDK file size does not grow."

Lazy-Zeroed Thick disk:
  • Sometimes referred to as a flat disk
  • Has all space allocated at the time of creation, but each block is zeroed only on first write.
  • Results in a shorter creation time, but reduced performance the first time a block is written to.
  • Use of VAAI-capable SAN or NAS storage can improve lazy-zeroed thick disk first-time-write performance by offloading zeroing operations to the storage array.

Eager Zeroed

Thick (aka EagerZeroedThick)
Eager-Zeroed Thick disks reserve space on the VMFS filesystem and zero out (wipe clean) the disk blocks at creation time.

"This disk type may take a little longer to create as it zeroes out the blocks. However, if the array supports the VAAI Zero primitive which offloads the zero operation to the array, then the additional time to create the zeroed out VMDK should be minimal."

"The eager-zeroed thick virtual disk type is capable of providing better performance than a lazy-zeroed thick disk.

Like lazy-zeroed thick, space required for the virtual disk is allocated at creation time. However, the blocks and pointers on the virtual disk are preallocated and zeroed out when the virtual disk is created. Although this increases the virtual disk creation time, it improves the performance of the virtual disk during regular use.”

“If the array supports VAAI, vSphere can offload the up-front task of zeroing all the blocks and reduce the initial I/O and time requirements."

Eager zeroed:
  • An eager zeroed thick disk has all space allocated and wiped clean of any previous contents on the physical media at creation time.
  • This increases the time it takes to create the disk, but results in the best performance, even on the first write to each block.
  • The entire disk space is reserved and unavailable for use by other virtual machines.
  • The use of VAAI-capable SAN storage can speed up eager-zeroed thick disk creation by offloading zeroing operations to the storage array.

June 08, 2015

Virtual Disk Provisioning

VMware vSphere virtual disks or VMDKs (virtual machine disks) can be provisioned in three different formats: Thin, Lazy-Zeroed Thick, or Eager-Zeroed Thick. The differences are in the way space is preallocated and whether blocks are zeroed at creation time or at run time.

With the exception of products such as VMware FT, Microsoft Clustering Service, certain appliances, etc., the choice of virtual disk formats is left to the administrator.

 | Lazy-zeroed | Eager-zeroed | Thin
Creation time | Fast | Slow (faster with VAAI) |
Zeroing file blocks | File block is zeroed on write | File block is zeroed when disk first created | File block zeroed on write
Block allocation | Fully preallocated on datastore | Fully preallocated on datastore | File block allocated on write
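
The differences between the three formats can be sketched as a small model that records which blocks are allocated and zeroed at creation and which extra steps a first write incurs. The class and method names are hypothetical, and the model only mirrors the comparison above; it says nothing about actual timings.

```python
from enum import Enum
from typing import List


class DiskFormat(Enum):
    THIN = "thin"
    LAZY_ZEROED_THICK = "lazy-zeroed thick"
    EAGER_ZEROED_THICK = "eager-zeroed thick"


class VirtualDisk:
    def __init__(self, fmt: DiskFormat, size_blocks: int) -> None:
        self.fmt = fmt
        self.size_blocks = size_blocks
        every_block = set(range(size_blocks))
        # Thick formats reserve every block at creation; thin reserves none.
        self.allocated = set() if fmt is DiskFormat.THIN else set(every_block)
        # Only eager-zeroed thick zeroes every block at creation.
        self.zeroed = (set(every_block)
                       if fmt is DiskFormat.EAGER_ZEROED_THICK else set())

    def write(self, block: int) -> List[str]:
        """Write one block and return the extra steps this write needed."""
        if not 0 <= block < self.size_blocks:
            raise IndexError("block out of range")
        extra = []
        if block not in self.allocated:
            self.allocated.add(block)
            extra.append("allocate")
        if block not in self.zeroed:
            self.zeroed.add(block)
            extra.append("zero")
        return extra


for fmt in DiskFormat:
    disk = VirtualDisk(fmt, size_blocks=4)
    print(fmt.value, disk.write(0), disk.write(0))
# thin ['allocate', 'zero'] []
# lazy-zeroed thick ['zero'] []
# eager-zeroed thick [] []
```

The second write to the same block needs no extra steps in any format, which is why the penalty applies only to first-time writes.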

Thin provisioning is a solution where the storage provider reports that the entire volume is available for use but allocates physical space only in small increments, as needed by the storage consumer. With thick provisioning, by contrast, the entire volume is allocated and dedicated to the consumer at creation time.

Thin provisioning can be done at the storage array, at the virtualization layer, or both. Using both array and VMDK thin provisioning is not a best practice. Array Thin Provisioning allows you to create a datastore and report that all the requested space is available for use when that may not be the case.

With array thin-provisioning, the provisioned space may be logically larger than what the array can actually support, a consequence being that it may not have the physical space to write to when needed. If you add virtual disk thin provisioning on top of that, and if you manage many datastores and LUNs, this could become very confusing and unmanageable over time.

"The three virtual machines (one thick and two thin) have disk sizes that together total 140 GB. The actual storage space available, however, is 100 GB. This size mismatch or overallocation is possible because the thin-provisioned disks take only (20 + 40) = 60 GB of actual disk space."

The 20 GB thick-provisioned disk has the entire 20 GB preallocated to it. The array reports 20 GB as available to the host and allocates/reserves the entire 20 GB for just that host, whether or not it is actually used. The thin-provisioned disks, in contrast, are allocated only what they have actually used/written (20 GB and 40 GB).
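
The overallocation in this example can be checked with a few lines of arithmetic. The quoted text gives the 140 GB total, the 100 GB capacity, the 20 GB thick disk, and the 20 GB and 40 GB written to the thin disks; the individual thin disk sizes (40 GB and 80 GB), the disk names, and the amount written to the thick disk are assumptions picked so that the totals match.

```python
from dataclasses import dataclass


@dataclass
class Disk:
    name: str
    provisioned_gb: int
    written_gb: int
    thin: bool

    @property
    def consumed_gb(self) -> int:
        """Thin disks consume what is written; thick disks consume it all."""
        return self.written_gb if self.thin else self.provisioned_gb


CAPACITY_GB = 100
disks = [
    Disk("vm1-thick", provisioned_gb=20, written_gb=5, thin=False),  # written amount assumed
    Disk("vm2-thin", provisioned_gb=40, written_gb=20, thin=True),   # size assumed
    Disk("vm3-thin", provisioned_gb=80, written_gb=40, thin=True),   # size assumed
]

provisioned = sum(d.provisioned_gb for d in disks)
consumed = sum(d.consumed_gb for d in disks)

print(provisioned)                # 140
print(consumed)                   # 80
print(CAPACITY_GB - consumed)     # 20
print(provisioned / CAPACITY_GB)  # 1.4
```

The datastore is overallocated by a factor of 1.4, yet 20 GB of physical space is still free because the thin disks consume only what has been written.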

  1. VM Performance on Flash Part 2: Thin vs. Thick Provisioning – Does it Matter?
  2. Performance Study of VMware vStorage Thin Provisioning – vSphere 4.0

Storage DRS

"Storage DRS (SDRS) provides smart virtual machine placement and load balancing mechanisms based on I/O and space capacity. In other words, where Storage I/O Control (SIOC), introduced at vSphere 4.1, reactively throttles hosts and virtual machines to ensure fairness, SDRS proactively makes recommendations to prevent imbalances from both a space utilization and latency perspective. More simply, SDRS does for storage what DRS does for compute resources."

"SDRS offers five key features: [1]
  • Resource aggregation - grouping of multiple datastores, into a single, flexible pool of storage called a Datastore Cluster
  • Initial Placement - speed up the provisioning process by automating the selection of an individual datastore
  • Load Balancing - addresses imbalances within a datastore cluster
  • Datastore Maintenance - when enabled, all registered virtual machines on a datastore, are migrated to other datastores in the datastore cluster.
  • Affinity Rules - controls which virtual disks should or should not be placed on the same datastore within a datastore cluster"
Storage DRS enables more-efficient use of storage resources via VM placement and load balancing based on I/O and space usage. SDRS includes Datastore Clusters (grouping of datastores), Placement Recommendations and storage maintenance mode.
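
To give a feel for initial placement based on space and I/O, here is a toy heuristic. It is not the Storage DRS algorithm: the `Datastore` class, the function name, the 80% utilization and 15 ms latency limits, and the sample figures are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Datastore:
    name: str
    capacity_gb: float
    used_gb: float
    latency_ms: float

    def utilization_after(self, disk_gb: float) -> float:
        """Space utilization if a disk of disk_gb were placed here."""
        return (self.used_gb + disk_gb) / self.capacity_gb


def recommend_datastore(cluster: List[Datastore], disk_gb: float,
                        max_utilization: float = 0.80,
                        max_latency_ms: float = 15.0) -> Optional[Datastore]:
    """Pick the least-utilized datastore that stays within both limits."""
    candidates = [
        ds for ds in cluster
        if ds.utilization_after(disk_gb) <= max_utilization
        and ds.latency_ms <= max_latency_ms
    ]
    if not candidates:
        return None
    return min(candidates,
               key=lambda ds: (ds.utilization_after(disk_gb), ds.latency_ms))


# Hypothetical datastore cluster.
cluster = [
    Datastore("ds-a", capacity_gb=1000, used_gb=700, latency_ms=5),
    Datastore("ds-b", capacity_gb=1000, used_gb=400, latency_ms=8),
    Datastore("ds-c", capacity_gb=1000, used_gb=200, latency_ms=30),
]
choice = recommend_datastore(cluster, disk_gb=100)
print(choice.name if choice else None)  # ds-b
```

In this example ds-c has the most free space but is excluded for its latency, which shows why both dimensions are considered together.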

The recommendation is to use Storage DRS in manual mode initially, and to monitor and evaluate its placement and capacity recommendations.

Load Balancing: VMware vSphere Storage DRS continuously balances storage space usage and storage I/O load while avoiding resource bottlenecks to meet application service levels. SDRS can operate in two distinct load-balancing modes: No Automation (manual mode) or Fully Automated.

Continuous Monitoring: Storage DRS monitors storage space and I/O utilization across a pre-assigned datastore pool and aligns storage resources to meet your business growth needs.

Non-Disruptive Maintenance: When a vSphere administrator places a Storage DRS-enabled datastore cluster in Maintenance Mode, Storage DRS moves virtual machine disk files to another datastore. Typical use cases are data migration to a new storage array or maintenance on a LUN, such as migration to another RAID group.

Placement recommendations: Storage DRS recommends the datastores to which virtual disks can be migrated.


Storage vMotion

VMware vSphere Storage vMotion facilitates the live migration of virtual machine files from one datastore to another without service interruption.

Storage vMotion is to virtual machine files as standard vMotion is to virtual machine running instances. 

Using Storage vMotion, a virtual machine can be migrated from one datastore to another while the virtual machine is running, i.e. without downtime.

A virtual machine and all its virtual disks may be stored in a single location, or in separate locations.  Storage vMotion offers an attractive method for migrating virtual machine data from one set of storage to another, as the virtual machine can remain running while the data movement happens in the background without involvement or even awareness of the virtual machine’s OS.

The virtual machine itself does not change hosts during a migration with Storage vMotion; only the VM's files and virtual disks are migrated.

VMware Storage vMotion describes the process by which the files that make up the virtual machine (VM), e.g. metadata files (.vmx, .log, .vswp, etc.) and virtual disk files (.vmdk), are relocated from one datastore to another without powering off the VM and without disconnecting users.

I.e. there is no downtime or service disruption on the running virtual machine. The process is completely transparent to the virtual machine or the end user.

VMware Storage vMotion:
  • provides an intuitive interface for live migration of virtual machine disk files
  • facilitates migration of VM files within and across storage arrays with no downtime or disruption in service
  • relocates virtual machine disk files from one shared storage location to another shared storage location with zero downtime, continuous service availability and complete transaction integrity
  • enables organizations to perform proactive storage migrations, simplify array migrations, improve virtual machine storage performance and free up valuable storage capacity.
Storage vMotion use cases include:
  • array maintenance – move virtual machine files off of arrays for array maintenance or upgrade
  • datastore maintenance – transform virtual disks from thick-provisioned to thin-provisioned or vice-versa, reclaim space, etc.
Over the different vSphere releases, VMware has deployed various mechanisms to migrate blocks from the source to the destination datastore. At vSphere 3.5, Storage vMotion used VM snapshots to migrate disks. In vSphere 4, VMware enhanced Storage vMotion by using the Changed Block Tracking (CBT) feature to track block changes during a Storage vMotion. At vSphere 5.0, VMware introduced the Mirror Mode mechanism.

CBT used an iterative approach. After the initial copy, it checked whether blocks on the source datastore had been modified and, if so, copied over the “changed blocks”. It repeated this iterative copy until it found no new modified blocks, at which point vSphere would “stun” the VM and perform a switch-over from the source to the target datastore.

CBT kept track of blocks that were changed during the migration process and then applied those changes to the destination disk. Mirror Mode mirrors every write from the source to destination disk while the migration is taking place.
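
The iterative CBT approach described above can be sketched as a loop over a set of changed blocks. This is a simplified model, not VMware's implementation: disks are Python lists, the function name and the `writes_per_pass` schedule are hypothetical, and every guest write during a pass is treated as landing after that pass has copied its blocks.

```python
from typing import Dict, List


def iterative_copy(source: List[str], dest: List[str],
                   writes_per_pass: List[Dict[int, str]]) -> int:
    """Copy source to dest in passes and return the number of passes.

    writes_per_pass[i] maps block -> data for guest writes that arrive
    while pass i is running.
    """
    dirty = set(range(len(source)))  # the first pass copies every block
    passes = 0
    while dirty:
        to_copy, dirty = dirty, set()
        for block in sorted(to_copy):
            dest[block] = source[block]
        # Guest writes during the pass reach the source only; the
        # affected blocks are tracked as changed for the next pass.
        incoming = writes_per_pass[passes] if passes < len(writes_per_pass) else {}
        for block, data in incoming.items():
            source[block] = data
            dirty.add(block)
        passes += 1
    # No changed blocks remain: this is where the switch-over would happen.
    return passes


source = list("abcdef")
dest = [""] * len(source)
passes = iterative_copy(source, dest, [{1: "B", 4: "E"}, {4: "X"}])
print(passes)          # 3
print(dest == source)  # True
```

Each pass that sees new writes forces another pass, which is the cost that Mirror Mode was introduced to remove.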

With the Mirror Mode mechanism, the Mirror Driver is enabled on a VM that is being Storage vMotioned, and then the “Datamover” process is used to perform a single-pass block copy of the source disk to the destination disk. Mirror Driver mirrors (forks) any writes to the source AND destination disks. I.e. for any writes that go to the source disk, the same data is also sent to the destination disk. This removes the need for any iterative block copies.

Mirror Mode enables a single-pass block copy of the source disk to the destination disk by mirroring I/Os of copied blocks.

When both the source and the destination have acknowledged the write, the write is acknowledged to the virtual machine. Because of this, iterative copies are unnecessary and the Storage vMotion process is more efficient with Mirror Mode.

vSphere Storage vMotion performs up to four parallel disk migrations per vSphere Storage vMotion operation.

Storage vMotion uses a synchronous mirroring approach to migrate a virtual disk from one datastore to another datastore on the same physical host. This is implemented by using two concurrent processes: a bulk copy process and an I/O mirroring process.
  • Bulk copy process
    • Clone
    • Single pass bulk copy of the disk blocks from the source to destination datastore
  • I/O mirroring process
    • Runs concurrently with the bulk copy process
    • Any virtual disk changes are copied/mirrored to both source and destination datastores
Storage vMotion mirrors I/O only to the disk region that has already been copied by the bulk copy process. Guest writes to a disk region that the bulk copy process has not yet copied are not mirrored because changes to this disk region will be copied by the bulk copy process eventually.
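
The interplay of the bulk copy and the I/O mirroring process can be sketched as a single pass over the disk. This is a simplified model under stated assumptions: disks are Python lists, the function name and the `writes_at_step` schedule are hypothetical, and guest writes are assumed to arrive just before the bulk copy handles the block with the same index.

```python
from typing import Dict, List, Tuple


def mirror_mode_copy(source: List[str], dest: List[str],
                     writes_at_step: Dict[int, List[Tuple[int, str]]]) -> int:
    """Copy source to dest in one pass and return the number of mirrored writes.

    writes_at_step maps a copy step to the (block, data) guest writes
    that arrive just before that step.
    """
    copied_up_to = 0  # blocks [0, copied_up_to) are already on the destination
    mirrored = 0
    for block in range(len(source)):
        for target, data in writes_at_step.get(block, []):
            source[target] = data
            if target < copied_up_to:
                # Region already copied: mirror the write to the destination.
                dest[target] = data
                mirrored += 1
            # Otherwise the bulk copy will pick the change up later.
        dest[block] = source[block]
        copied_up_to = block + 1
    return mirrored


source = list("abcdef")
dest = [""] * len(source)
mirrored = mirror_mode_copy(source, dest, {3: [(1, "B"), (5, "F")]})
print(mirrored)        # 1
print(dest == source)  # True
```

The write to block 1 is mirrored because that block was already copied, while the write to block 5 is not mirrored and still reaches the destination through the bulk copy, so one pass is enough.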

vSphere 5.1 (and later) vMotion follows an almost identical model for migrating a virtual disk, but uses a network transport for migrating the data.

Enhanced vMotion:
VMware vSphere 5.1 and later versions combine standard vMotion with VMware vSphere Storage vMotion in a single migration. This means you can live-migrate an entire virtual machine between hosts, between clusters or between physical data centers—without disruption or shared storage between the involved hosts. This capability to migrate virtual machines between hosts that do not share datastores can only be done using the vSphere Web Client.