Toronto Metropolitan University

Heatmap Visualization for Monitoring Health of a Large-scale Cloud System

thesis
posted on 2024-06-18, 19:11 authored by Sarah Sohana
The complex infrastructure, growing scale, and variety of a large-scale Cloud System (LCS) pose many challenges for monitoring the health of its components. Existing advanced monitoring systems often fail to give the Cloud Operation Team meaningful insights into their system and its underlying components. In this thesis, we propose a near-real-time, interactive visual monitoring tool based on heatmaps that helps developers and maintainers of an LCS perform exploratory analysis of its health and aids decision-making on resource planning and provisioning, configuration design, and problem identification. We validated the tool in a real-world setting by monitoring IBM Cloud Console (an LCS used by IBM to monitor IBM Cloud). The results show that our heatmaps can provide actionable insights. In particular, the tool has helped the team diagnose anomalous behaviour of components, identify periods of heavy or low traffic, find latency issues, and make critical business decisions. Our tool is of interest to practitioners because it can be used to monitor the health of an arbitrary LCS. Moreover, it can serve as a building block for a theory of monitoring complex software systems, which is of interest to academics.

Language

eng

Degree

  • Master of Science

Program

  • Computer Science

Granting Institution

Ryerson University

LAC Thesis Type

  • Thesis

Thesis Advisor

Andriy Miranskyy

Year

2022
