Please use this identifier to cite or link to this item: http://hdl.handle.net/123456789/2818

Title: Understanding Error Log Event Sequence for Failure Analysis
Authors: Gurumdimma, Nentawe; Bisandu, Desmond Bala
Keywords: Failure Sequences; HPC; Similarity; Cluster
Issue Date: 2018
Publisher: Science World Journal
Series/Report no.: Vol. 13; No. 4; pp. 8-15
Abstract: As large-scale parallel systems have evolved, they have come to be employed mostly for mission-critical applications. Anticipating and accommodating failures is therefore crucial to their design. Failure is a commonplace feature of these large-scale systems and cannot be treated as an exception. The system state is mostly captured through logs, so a proper understanding of these error logs is extremely important for failure analysis, because the logs contain the "health" information of the system. In this paper we design an approach that seeks to find similarities in the patterns of log events that lead to failures. Our experiment shows that several root causes of soft lockup failures can be traced through the logs. We capture the behavior of failure-inducing patterns and find that the log patterns of failure and non-failure sequences are dissimilar.
URI: http://hdl.handle.net/123456789/2818
ISSN: 1597-6343
Appears in Collections:Computer Science

Files in This Item:

File: 18872-Article Text-75965-1-10-20181224.pdf
Size: 688.48 kB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

 
