A healthier cluster begins with OPS: the Open vStorage Health Check

With more and more large Open vStorage clusters being deployed, the Open vStorage Operations (OPS) team is tasked with monitoring more servers. In the rare case that there is an issue with a cluster, the OPS team wants to get a quick idea of how serious the problem is. That is why the Open vStorage OPS team added another project to the GitHub repo: openvstorage-health-check.

The Open vStorage health check is a quick diagnostic tool that verifies whether all components on an Open vStorage node are working correctly. For example, it checks whether all services and Arakoon databases are up and running, whether Memcache, RabbitMQ and Celery are behaving, and whether presets and backends are still operational.

Note that the health check is only a diagnostic tool. Hence it will not take any action to repair the cluster.

Get Started:

To install the Open vStorage health check on a node, execute:

apt-get install openvstorage-health-check

Next, run the health check by executing:

ovs healthcheck
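When many nodes have to be checked, it can help to run the command from a script and keep its output for later inspection. The following is a minimal sketch, not part of the health check itself: the function name `run_health_check` and the 600-second timeout are assumptions made for illustration, and the sketch makes no assumption about what exit code `ovs healthcheck` returns.

```python
import subprocess


def run_health_check(command=("ovs", "healthcheck"), timeout=600):
    """Run the health check and return (exit_code, output_lines).

    `command` defaults to the invocation shown above. The timeout
    value is an arbitrary assumption, not a documented limit.
    """
    completed = subprocess.run(
        list(command),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # keep errors in the same stream
        universal_newlines=True,    # decode the output as text
        timeout=timeout,
        check=False,                # do not raise on a non-zero exit code
    )
    return completed.returncode, completed.stdout.splitlines()
```

Calling `run_health_check()` on a node where the package is installed returns the exit code and the output as a list of lines, which can then be stored or searched for `[FAILED]` style entries.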

As always, this is a work in progress, so feel free to file a bug or a feature request for missing functionality. Pull requests are welcome and will be accepted after careful review by the Open vStorage OPS team.

An example output of the Open vStorage health check:

root@perf-roub-04:~# ovs healthcheck
[INFO] Starting Open vStorage Health Check!
[INFO] ====================================
[INFO] Fetching LOCAL information of node:
[SUCCESS] Cluster ID: 3vvwuO9dd1S2sNIi
[SUCCESS] Hostname: perf-roub-04
[SUCCESS] Storagerouter ID: 6Y6uerfmfZaoZOCu
[SUCCESS] Storagerouter TYPE: EXTRA
[SUCCESS] Environment RELEASE: Fargo
[SUCCESS] Environment BRANCH: Unstable
[INFO] Checking LOCAL OVS services:
[SUCCESS] Service 'ovs-albaproxy_geo-accel-alba' is running!
[SUCCESS] Service 'ovs-workers' is running!
[SUCCESS] Service 'ovs-watcher-framework' is running!
[SUCCESS] Service 'ovs-dtl_local-flash-roub' is running!
[SUCCESS] Service 'ovs-dtl_local-hdd-roub' is running!

[INFO] Checking ALBA proxy 'albaproxy_local-flash-roub':
[SUCCESS] Namespace successfully created or already existed on proxy 'albaproxy_local-flash-roub' with preset 'default'!
[SUCCESS] Creation of a object in namespace 'ovs-healthcheck-ns-default' on proxy 'albaproxy_local-flash-roub' with preset 'default' succeeded!
[SUCCESS] Namespace successfully created or already existed on proxy 'albaproxy_local-flash-roub' with preset 'high'!
[SUCCESS] Creation of a object in namespace 'ovs-healthcheck-ns-high' on proxy 'albaproxy_local-flash-roub' with preset 'high' succeeded!
[SUCCESS] Namespace successfully created or already existed on proxy 'albaproxy_local-flash-roub' with preset 'low'!
[SUCCESS] Creation of a object in namespace 'ovs-healthcheck-ns-low' on proxy 'albaproxy_local-flash-roub' with preset 'low' succeeded!
[INFO] Checking the ALBA ASDs ...
[SKIPPED] Skipping ASD check because this is a EXTRA node ...
[INFO] Recap of Health Check!
[INFO] ======================
[SUCCESS] SUCCESS=154 FAILED=0 SKIPPED=20 WARNING=0 EXCEPTION=0
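The recap line at the end of the output is convenient for monitoring, since it summarizes the whole run in five counters. Below is a minimal sketch of how such a line could be parsed, assuming the format stays as shown in the example above. The names `parse_recap` and `is_healthy` are hypothetical helpers, and treating any FAILED or EXCEPTION count above zero as unhealthy is an assumption about how one might interpret the counters, not a rule defined by the tool.

```python
import re

# Matches counters such as "FAILED=0"; the leading "[SUCCESS]" tag has
# no "=<digits>" after it, so it is not picked up.
RECAP_PATTERN = re.compile(r"\b(SUCCESS|FAILED|SKIPPED|WARNING|EXCEPTION)=(\d+)")


def parse_recap(line):
    """Return the recap counters of a line as a dict of name -> int."""
    return {name: int(value) for name, value in RECAP_PATTERN.findall(line)}


def is_healthy(counters):
    """Assumed policy: healthy means counters were found and none failed."""
    return (
        bool(counters)
        and counters.get("FAILED", 0) == 0
        and counters.get("EXCEPTION", 0) == 0
    )


recap = parse_recap(
    "[SUCCESS] SUCCESS=154 FAILED=0 SKIPPED=20 WARNING=0 EXCEPTION=0"
)
print(recap["SUCCESS"], recap["SKIPPED"], is_healthy(recap))
```

A line without any counters yields an empty dict, which `is_healthy` deliberately reports as not healthy, so a truncated or missing recap is not mistaken for a clean run.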
