posted on 2021-05-23, 09:41 authored by Ghazal Zamani
Increasingly, application providers are using a separate fault management system that offers out-of-the-box monitoring and alarm support for application instances. A fault management system usually consists of a set of management components that both detect faults and trigger actions, for example, the automatic restart of monitored components. Such a distributed structure supports scalability and helps to ensure that an application meets its quality requirements. However, successful recovery of an application now depends on the fault management architecture and the status of the management components. This thesis presents a model that accounts for the effect of management-architecture-based coverage on the mean throughput of an application. Such a model would help application providers choose the right fault management architecture for their applications. A comparison of five sample fault management architectures shows that, under higher workloads, the architecture with the highest number of detection paths achieves the maximum throughput.