Virtual Ascetic

everything about virtualization, storage, and technology


vMotion enhancements in vSphere 6

September 2, 2015 By asceticadmin

 

Long distance vMotion – now supported over network latency of up to 150 ms – support for geographic distances

In vSphere 5.5 the supported latency was 10 ms

 

Standard vMotion guarantee – 1 sec execution switchover time

Batched RPCs, TCP congestion window handoff

 

vMotion – had to find a way to bypass TCP delays

  • Used the control channel to send all content; the data channel was used to send the VM data

 

Very large bandwidth-delay product – TCP performs poorly

  • Revised the TCP algorithm – the default congestion control algorithm is not suited for a large bandwidth-delay product
  • Changed the congestion control algorithm to HSTCP (High Speed TCP)

 

Packet loss was another concern –

  • Avoided packet drops within the ESX host by adding mechanisms like flow control
  • Helped reduce the target packet loss rate to 0.01%

 

What if the ESX hosts are in different subnets and you want to do long distance vMotion?

vMotion across L3 network

  • Uses ESX TCP/IP network stack virtualization – an ESX host can have multiple default gateways, in addition to the default gateway of the management network
  • vMotion gets its own network stack instance (separate from the management network)
  • The vMotion stack's own default gateway is used to go across L3 (and across subnets) – see the sketch below

VM network still requires L2 network stretching
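As a concrete illustration of the dedicated vMotion network stack, here is a minimal pyVmomi sketch (not from the session) that adds a VMkernel NIC bound to the "vmotion" TCP/IP stack instance on a host. The host name, port group name, and IP addresses are placeholder assumptions, and the stack's own default gateway is assumed to be configured separately.

```python
# Hedged sketch: bind a new VMkernel NIC to the dedicated "vmotion" TCP/IP stack
# so vMotion traffic gets its own routing table and default gateway.
# Host name, port group name, and addresses below are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()   # lab only; verify certificates in production
si = SmartConnect(host="esx01.example.com", user="root", pwd="***", sslContext=ctx)

# Connected directly to the host: rootFolder -> datacenter -> compute resource -> host
host = si.content.rootFolder.childEntity[0].hostFolder.childEntity[0].host[0]
net_sys = host.configManager.networkSystem

# Static IP settings for the new vMotion VMkernel interface
ip_cfg = vim.host.IpConfig(dhcp=False,
                           ipAddress="192.168.50.11",
                           subnetMask="255.255.255.0")

# netStackInstanceKey="vmotion" places the vmknic on the vMotion stack instance,
# which carries a default gateway separate from the management network's.
nic_spec = vim.host.VirtualNic.Specification(ip=ip_cfg,
                                             netStackInstanceKey="vmotion")

# "vMotion-PG" is an assumed standard-switch port group that already exists.
vmk = net_sys.AddVirtualNic(portgroup="vMotion-PG", nic=nic_spec)
print("Created", vmk, "on the vmotion netstack")

Disconnect(si)
```

With the vMotion vmknic on its own stack, its default gateway can route migration traffic across subnets while the management network keeps its own gateway.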

DRS cluster requirements

  • All ESX hosts on L2 or all on L3 vMotion network
  • Mixed mode of L2 and L3 not supported

L2 adjacency limitations

  • Fault tolerance not supported
  • DPM using Wake-on-LAN requires subnet-directed broadcast (when hosts are on different subnets, routers filter out normal broadcast traffic).

What if the ESX hosts have different virtual switches?

  • vSphere Standard Switch (VSS)
  • vSphere Distributed Switch (VDS)

Ability to go from one VDS to another VDS – transfers all properties (e.g. Network I/O Control settings, bandwidth restrictions, etc.)

Downgrade from VDS to standard switch is not supported, since the standard switch does not have VDS-type capabilities

 

What if the ESX hosts are managed by different vCenters?

In vSphere 6 there is the ability to vMotion across vCenter Servers (see the API sketch at the end of this section)

  • Simultaneously change compute, storage, networks, and vCenter servers
  • Leverage vMotion without shared storage

Works with local, metro, and long distance vMotion

Preserve instance UUID and BIOS UUID (VM UUID)

Preserve VM historical data

  • Events, alarms, task history (pulled from the original vCenter – not moved to the new vCenter)
  • Preserve all HA and DRS properties, affinity/anti-affinity rules

SSO domain support

  • vSphere Web Client requires the same SSO domain
  • API support across SSO domains
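A hedged sketch of what a cross-vCenter move looks like through the public API, using pyVmomi's RelocateVM_Task with a ServiceLocator that points at the destination vCenter. All host names, credentials, and inventory names below are placeholder assumptions.

```python
# Hedged sketch of a cross-vCenter vMotion: RelocateVM_Task with a ServiceLocator
# describing the destination vCenter. Names and credentials are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

def find_by_name(content, vimtype, name):
    """Return the first managed object of a given type with a given name."""
    view = content.viewManager.CreateContainerView(content.rootFolder, [vimtype], True)
    try:
        return next(obj for obj in view.view if obj.name == name)
    finally:
        view.DestroyView()

ctx = ssl._create_unverified_context()    # lab only; verify certificates in production
src = SmartConnect(host="vc-src.example.com", user="administrator@vsphere.local",
                   pwd="***", sslContext=ctx)
dst = SmartConnect(host="vc-dst.example.com", user="administrator@vsphere.local",
                   pwd="***", sslContext=ctx)

vm        = find_by_name(src.content, vim.VirtualMachine, "app-vm-01")
dest_host = find_by_name(dst.content, vim.HostSystem, "esx-dst-01.example.com")
dest_ds   = find_by_name(dst.content, vim.Datastore, "dst-datastore-01")
dest_pool = dest_host.parent.resourcePool

# The ServiceLocator tells the source vCenter how to reach the destination vCenter.
locator = vim.ServiceLocator(
    instanceUuid=dst.content.about.instanceUuid,
    url="https://vc-dst.example.com",
    credential=vim.ServiceLocatorNamePassword(username="administrator@vsphere.local",
                                              password="***"),
    sslThumbprint="<destination vCenter SHA-1 thumbprint>",   # placeholder
)

# One call changes compute, storage, and vCenter at the same time.
spec = vim.vm.RelocateSpec(host=dest_host, pool=dest_pool,
                           datastore=dest_ds, service=locator)
task = vm.RelocateVM_Task(spec=spec,
                          priority=vim.VirtualMachine.MovePriority.highPriority)
print("Cross-vCenter relocate task:", task.info.key)

Disconnect(src)
Disconnect(dst)
```

After the task completes, reading vm.config.instanceUuid and vm.config.uuid from the destination vCenter should return the same values as before the move, in line with the UUID preservation noted above.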

 

What if you can take advantage of replication to the other site?

Replication assisted vMotion

vSphere 6 supports Active/Active Async storage replication

  • Disk copy takes majority of migration time
  • Use replication to avoid disk copy
  • Leverage virtual volume (VVOL) technology

VVOLs are the primary unit of data management going forward – the storage array knows the VVOL-to-VM mapping

Secondary site storage array promotes LUN containing replicated data

Active/Active async replication

  • Switch replication mode to sync
  • Migration start
    • Prepare the destination ESX host for VVOL binding
    • Switch from async to sync replication
  • Migration end
    • Complete VVOL binding on the ESX hosts
    • Switch back to async replication

vSphere 6.0 vMotion features interop

 

What if you have 40GbE NICs for the vMotion network? (vMotion performance and scalability improvements)

vMotion scalability

Re-architected vMotion to saturate a 40GbE NIC

  • Zero copy transmission
  • New threading model for better CPU utilization on the receive side
  • Reduce locking

For example, in maintenance mode we may have to move 400 GB of memory from one host to another; the entire maintenance mode operation takes only about 5 minutes in a 40GbE environment.

Reduce vMotion execution switchover time (improved)

  • Constant time VM memory metadata updates
  • Not a function of VM memory size

Reduce stack overhead

  • Improved VM power-on time (the power-on path has been further optimized)

Performance and Debugging

How to Gauge vMotion performance

  • Migration time (memory, disk, total)
  • Switchover time
  • Impact on guest applications
    • Application latency and throughput during vMotion
    • Time to resume to normal level of performance after migration
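To capture the first two of these in practice, the task object returned by the API (for example the `task` from the cross-vCenter sketch earlier) carries queue/start/complete timestamps. A rough sketch; switchover time and guest impact still have to come from logs and application-side monitoring.

```python
# Rough sketch: derive migration time from vSphere task timestamps.
# Assumes "task" is the vim.Task returned by RelocateVM_Task in the earlier sketch.
from pyVim.task import WaitForTask

WaitForTask(task)                       # block until the migration finishes
info = task.info                        # vim.TaskInfo

queued   = (info.startTime - info.queueTime).total_seconds()
duration = (info.completeTime - info.startTime).total_seconds()
print(f"Queued for {queued:.1f}s, migrated in {duration:.1f}s (state: {info.state})")

# Switchover time and guest impact are not exposed here; those come from the
# host logs (grep the migration ID, see the debugging tips below) and from
# application-level latency/throughput monitoring during the move.
```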

 

Monster VM vMotion performance

  • In vSphere 5.5 each vMotion was by default assigned 2 helper threads running on 2 cores, and could only reach about 20 Gbps
  • By increasing the number of helper threads (a tuned ESX configuration), throughput increased slightly
  • In the 60 Gb scenario, increasing the number of helper threads does not help further
  • In vSphere 6.0 the locking has been removed; the appropriate number of TCP channels is created dynamically and helper threads are created automatically, so performance improves significantly without any tuning

 

Debugging tips

Each vMotion has a unique ID associated with it, called the migration ID

Grep for that migration ID, since the same ID appears on both the source and destination hosts

From the Web Client, select the VM and go to Tasks

See the high-level details in the task info – VMware is adding the migration ID there

VPXD – find the operation ID of the vMotion (in the VPXD logs)
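As a small illustration of that workflow, here is a hypothetical Python helper that pulls every line mentioning a given migration ID out of logs copied off the source and destination hosts; the paths and the migration ID shown are placeholders.

```python
# Hypothetical helper: collect all log lines mentioning a given vMotion migration ID
# across logs copied from the source and destination hosts. Paths and ID are placeholders.
from pathlib import Path

MIGRATION_ID = "1441213469306126"   # example only; take the real ID from the task details
LOG_FILES = [Path("logs/source/vmkernel.log"),
             Path("logs/source/hostd.log"),
             Path("logs/dest/vmkernel.log"),
             Path("logs/dest/hostd.log")]

for log in LOG_FILES:
    for lineno, line in enumerate(log.read_text(errors="replace").splitlines(), 1):
        if MIGRATION_ID in line:
            print(f"{log}:{lineno}: {line}")
```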

 

What’s next for vMotion

Cross cloud vMotion

  • vMotion between vCloud Air (vCA) and the on-premises datacenter
  • No vendor lock-in; vMotion to vCA and from vCA back to on-premises
  • Support for vSphere 5.5 as the on-premises version (will be backwards compatible)

 

Non-volatile Memory (NVM) – disks, SSDs

  • NVM resides in a Dual Inline Memory Module (DIMM)
  • Exposed as Memory and Virtual disk to VMs (persistent memory and disks through SCSI card)
  • Enable vMotion for VMs and NVM
  • Explore NVM to improve vMotion performance and scalability

 

Active/Passive Storage Replication

  • Leverage broad partner ecosystem to optimize disk copy
  • VVOL required to reverse replication direction after vMotion
  • vMotion support for RDMA (the CPU does not have to be used; RDMA can be used instead for a performance improvement)

 

Conclusions

 

vSphere 6 vMotion is a big step towards vMotion anywhere

  • Cross geo boundaries
  • Cross management boundaries
  • Cross cloud vMotion

 

