Virtual Ascetic

everything about virtualization, storage, and technology


Monitoring and Troubleshooting Virtual SAN, Current and Future

September 3, 2015 By asceticadmin

Health Check Plugin in vSphere 6

Used for Virtual SAN in vSphere 6 – improved

Runs 30 health checks and explains the nature of each failure

vRealize Operations Management Pack for Storage Devices (MPSD) latest update released last week

vROps MPSD – Custom Dashboards

Build bespoke VSAN cluster info

Disk Group throughput

Log Insight – super tool for looking at logs generated by ESXi hosts and VSAN. Allows us to run analytics on logs to find patterns, behaviours, and intermittent issues

VSAN extension pack available with Log Insight

Real Life troubleshooting scenarios

We have realized there are gaps in Management story around VSAN – VMware has been working to provide right tools at the right places.

Low level – esxcli, rvc, observer

Need vSphere web client and vRealize as a standard way of consuming things. PowerCLI is also a tool in use

Things you probably want to check

  • components must be on HCL
  • Confirm network is good – e.g. Multicast
  • Make sure VMs can be deployed successfully
  • Test underlying storage components with a “stress test”
  • Inject failures, ensuring that VMs remain available
  • Test performance of Virtual SAN (VSAN)

 

Scenario#1

Assume HCL check fails.

Download the latest HCL file – maybe the device/driver/firmware support status has changed

 

Scenario#2

Run a storage performance test. This doubles as a stress test. VMware has released a tool called HCIBench to test the performance of the infrastructure

To test a bad drive, don't just remove it. Run the special error injector with health check to simulate a drive going bad (see POC guide)

 

Scenario#3

  • Injecting failures in Virtual SAN can be done quite simply
    • Hosts (reboot or power off)
    • Network (disconnect uplinks or disable VSAN traffic service)
    • Disks (special error injector with health check – simulate drive going bad – see POC guide)
  • Use the health check to understand failures

 

When you pull out a drive – VSAN knows it – VSAN will wait 60 minutes before remediation

When a drive fails – VSAN knows that and puts it in Degraded mode. VSAN will perform immediate remediation
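The difference between the two cases can be sketched as a small decision function. This is only an illustration of the behaviour described in these notes, not VSAN code: the state names, the function name, and the constant are made up for the example, and the 60 minute figure is simply the one quoted above.

```python
from enum import Enum

# Wait quoted in the notes for a pulled drive (assumed fixed for this sketch).
REPAIR_DELAY_MINUTES = 60


class ComponentState(Enum):
    """Hypothetical labels for the two cases in the notes."""
    ABSENT = "absent"      # drive was pulled and might come back
    DEGRADED = "degraded"  # drive failed and is not expected to come back


def minutes_until_rebuild(state: ComponentState, minutes_since_event: int) -> int:
    """Return how many more minutes pass before remediation starts."""
    if state is ComponentState.DEGRADED:
        # A failed drive is remediated immediately.
        return 0
    # A pulled drive gets the remainder of the waiting period.
    return max(0, REPAIR_DELAY_MINUTES - minutes_since_event)


if __name__ == "__main__":
    print(minutes_until_rebuild(ComponentState.ABSENT, 15))
    print(minutes_until_rebuild(ComponentState.DEGRADED, 15))
```

The point of the wait is that a pulled drive may simply be put back, in which case a full rebuild would have been wasted work.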

VSAN requires a minimum of three nodes, which lets it tolerate the failure of one node. With four nodes it still tolerates one node failure, but after that failure there are enough nodes left to rebuild, so we are protected against one more node failure

So it is preferred that you start with 4 nodes for VSAN.
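The arithmetic behind the four node preference can be sketched as follows. The sketch assumes mirrored objects that need 2n+1 hosts to tolerate n failures, which is my assumption and was not spelled out in the session; the function names are made up for the example.

```python
def min_hosts(failures_to_tolerate: int) -> int:
    """Hosts needed to tolerate the given number of failures (assumes 2n+1)."""
    return 2 * failures_to_tolerate + 1


def can_rebuild_after_failure(total_hosts: int, failures_to_tolerate: int = 1) -> bool:
    """True if, after losing one host, enough hosts remain to restore full protection."""
    surviving_hosts = total_hosts - 1
    return surviving_hosts >= min_hosts(failures_to_tolerate)


if __name__ == "__main__":
    for hosts in (3, 4):
        print(hosts, "hosts, can rebuild after one failure:",
              can_rebuild_after_failure(hosts))
```

With three hosts the cluster survives one failure but has nowhere to rebuild until the host returns. With four hosts the data can be rebuilt on the remaining three, so protection is restored.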

Multicast – Multicast configuration is the most common issue. When the multicast check in Proactive Tests shows a status of failed, there is an issue with multicast. This proactive test verifies whether multicast performance is acceptable for the VSAN cluster

Sneak peek of ‘Performance Service’ – new tool. To be released, hopefully in the near future

Stores historical VSAN performance statistics

Stored on VSAN itself

Always-on

Fully Integrated – no need to install anything

Exposed via vSphere Web Client (and API)

Distributed architecture, built directly into ESXi

  • No network traffic going outside the cluster
  • No CPU/Memory usage in VC
  • Tiny impact on ESXi hosts

Benchmarked 50K IOPS (see image) using Performance Service

VSAN – When performing benchmark tests note that read cache in VSAN needs to warm up

Outstanding I/Os (OIO) – a VM that does not issue enough outstanding I/O cannot push performance to its limits

You need to have enough VMs to perform a true performance test. We don’t run one giant VM but we run multiple VMs to ensure enough parallel I/O – that is what VSAN is built for.
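Why outstanding I/O limits a benchmark can be shown with Little's Law (throughput = outstanding I/Os / latency). This is a general queueing relationship, not something presented in the session, and the VM counts, queue depths, and latency below are made-up numbers for illustration only.

```python
def max_iops(outstanding_ios: int, latency_ms: float) -> float:
    """Upper bound on IOPS for a given number of outstanding I/Os (Little's Law)."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    # Each outstanding I/O completes once per latency period.
    return outstanding_ios * 1000.0 / latency_ms


if __name__ == "__main__":
    latency_ms = 2.0  # assumed average latency per I/O
    per_vm_oio = 4    # assumed outstanding I/Os per VM

    # One VM alone cannot go past this, however fast the cluster is.
    print("1 VM :", max_iops(per_vm_oio, latency_ms), "IOPS")

    # Eight VMs in parallel raise the ceiling eight times.
    print("8 VMs:", max_iops(8 * per_vm_oio, latency_ms), "IOPS")
```

At the same latency, the only way to raise the ceiling is more I/O in flight, which is why the test should use many VMs rather than one giant one.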

Two new metrics in performance graphs – Delayed I/O percentage and Delayed I/O average latency

  • How many IOs were delayed
  • How many IOs did not make it to the pipeline
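As a rough illustration of what the two metrics might summarise, here is a sketch that derives a delayed percentage and an average delay from a list of per-I/O delays. How the Performance Service actually calculates these values was not shown, so the definitions, the function name, and the sample data are all assumptions.

```python
from typing import Sequence, Tuple


def delayed_io_summary(delays_ms: Sequence[float]) -> Tuple[float, float]:
    """Return (percentage of I/Os delayed, average delay of the delayed I/Os).

    delays_ms holds one entry per I/O; 0 means the I/O was not delayed.
    """
    if not delays_ms:
        return 0.0, 0.0
    delayed = [d for d in delays_ms if d > 0]
    percentage = 100.0 * len(delayed) / len(delays_ms)
    average_ms = sum(delayed) / len(delayed) if delayed else 0.0
    return percentage, average_ms


if __name__ == "__main__":
    sample = [0, 0, 4, 0, 2, 0, 0, 0]  # made-up delays in milliseconds
    pct, avg = delayed_io_summary(sample)
    print(f"{pct}% of I/Os delayed, {avg} ms average delay")
```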

Filed Under: VMWorld Tagged With: and future, and troubleshooting, anil sedha, current, EMCElect, health check, IOs, monitoring, MPSD, plugin, SAN, sto6228, vExpert, virtual, vmware, vrops, vSAN, vsphere
