Data Pre-processing of Hard Disk Drive Data for Failure Prediction in the Context of Industry 4.0

Authors: Balogun, K. and Xu, L.

Journal: IFIP Advances in Information and Communication Technology

Volume: 759

Pages: 77-100

eISSN: 1868-422X

ISSN: 1868-4238

DOI: 10.1007/978-3-031-97051-1_6

Abstract:

Hard disk drive failure prediction is one of the crucial application domains of predictive maintenance in Industry 4.0. While a vast amount of hard disk drive (HDD) monitoring data from SMART attributes and sensors is available, effective techniques for pre-processing these data for failure prediction are still lagging in the context of Industry 4.0, where data originates from multiple, diverse sources. Traditionally, HDD data pre-processing focuses on a single data source and uses a static, predefined approach, which is unsuitable in the Industry 4.0 context. This paper addresses this research gap by developing a unique and iterative data pre-processing framework that is specifically designed to cater to varying HDD data from different data sources in the context of service-oriented or component-based systems. Our research objectives include identifying the unique challenges of HDD data pre-processing in industrial settings, developing a unified and adaptive data pre-processing workflow, and creating a framework that supports future automation through automated AI. Methodologically, we adopt the FIWARE smart architecture to harness HDD failure prediction capabilities in Industry 4.0 settings. Our framework implements a specialized technique to handle SMART data problems such as missing values, feature dependencies, class imbalance, and SMART attribute inconsistencies, which are intrinsic to industrial HDD operations. We validate the effectiveness of the framework via a series of experiments with machine learning models such as Gradient-Boosted Decision Trees (GBDT) and Decision Trees (DT), achieving a Failure Detection Rate (FDR) of over 90% across several performance metrics. The key contributions of this study include a validated and iterative data pre-processing pipeline that is adaptable to service-oriented industrial systems, and implementation strategies for HDD predictive maintenance in the Industry 4.0 ecosystem. It also contributes to the theoretical understanding and practical application of HDD failure prediction in modern manufacturing environments.

Source: Scopus

Data Pre-processing of Hard Disk Drive Data for Failure Prediction in the Context of Industry 4.0

Authors: Balogun, K. and Xu, L.

Journal: Technological Innovation for AI-Powered Cyber-Physical Systems, DoCEIS 2025

Volume: 759

Pages: 77-100

eISSN: 1868-422X

ISBN: 978-3-031-97053-5

ISSN: 1868-4238

DOI: 10.1007/978-3-031-97051-1_6

Source: Web of Science (Lite)