Concept drift detection and adaption in big imbalance industrial IoT data using an ensemble learning method of offline classifiers

Chun Cheng Lin, Der-Jiunn Deng, Chin Hung Kuo, Linnan Chen

Research output: Contribution to journal › Article

Abstract

In a smart factory, thousands of industrial Internet of Things (IIoT) devices or sensors are installed in production machines to collect big data on machine conditions and transmit it to a cyber-physical system in the factory's cloud center. The system then employs a variety of condition-based maintenance (CBM) methods to predict when machines will start to operate abnormally, so that they can be maintained or their components replaced in advance, avoiding the manufacture of large numbers of defective products. CBM suffers from concept drift (i.e., the distribution of fault patterns may change over time) and imbalanced data (i.e., faulty samples account for a minority of all data). Ensemble learning, which integrates the diversity of multiple classifiers, provides a high-performance solution to these problems. In practice, most companies may not have sufficient budget to establish a sound infrastructure for real-time online classifiers, but may already have off-the-shelf offline classifiers in their existing systems. However, most previous work on ensemble learning has focused only on online classifiers. Consequently, this work proposes an ensemble learning algorithm that supports offline classifiers for three-stage CBM with concept drift and imbalanced data: Stages 1 (training an ensemble classifier) and 3 (creating a new ensemble) employ an improved Dynamic AdaBoost.NC classifier and the SMOTE method to address imbalanced data, and Stage 2 (detecting concept drift in imbalanced data) employs an improved LFR (Linear Four Rates) method. Experimental results on datasets with different degrees of imbalance show that the proposed method successfully detects all concept drifts and achieves an accuracy of over 94% in detecting minority-class data.
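
As a rough illustration of the Stage 2 idea, the sketch below computes the four rates that the LFR family of detectors monitors (true positive rate, true negative rate, positive predictive value, negative predictive value) on a reference batch and on a newer batch, and flags possible drift when any rate moves by more than a fixed tolerance. This is not the improved LFR method of the paper, which the abstract does not specify. The function names, the batch-wise comparison, the fixed `tolerance` value, and the convention of reporting an undefined rate as 1.0 are all assumptions made for illustration. Tracking all four rates rather than overall accuracy matters for imbalanced data, because accuracy is dominated by the majority (normal) class.

```python
from typing import Dict, Sequence


def four_rates(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return TPR, TNR, PPV and NPV for binary labels (1 = fault, 0 = normal)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num: int, den: int) -> float:
        # Assumption of this sketch: a rate with an empty denominator is reported as 1.0.
        return num / den if den else 1.0

    return {
        "TPR": ratio(tp, tp + fn),  # recall on the minority (fault) class
        "TNR": ratio(tn, tn + fp),  # recall on the majority (normal) class
        "PPV": ratio(tp, tp + fp),  # precision of fault predictions
        "NPV": ratio(tn, tn + fn),  # precision of normal predictions
    }


def drift_suspected(reference: Dict[str, float],
                    current: Dict[str, float],
                    tolerance: float = 0.1) -> bool:
    """Flag possible drift if any rate changed by more than `tolerance`.

    The fixed tolerance is a hypothetical stand-in for a statistically
    derived detection bound.
    """
    return any(abs(current[name] - reference[name]) > tolerance for name in reference)


if __name__ == "__main__":
    labels = [1, 0, 0, 0, 1, 0]
    reference = four_rates(labels, [1, 0, 0, 0, 1, 0])  # classifier still accurate
    current = four_rates(labels, [0, 0, 0, 0, 1, 0])    # one fault is now missed
    print(current)
    print(drift_suspected(reference, current))
```

In this toy run the overall accuracy only falls from 6/6 to 5/6, while the true positive rate halves, which is the kind of minority-class change that a per-rate monitor is meant to expose.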

Original language: English
Article number: 8694986
Pages (from-to): 56198-56207
Number of pages: 10
Journal: IEEE Access
Volume: 7
DOI: 10.1109/ACCESS.2019.2912631
Publication status: Published - 2019 Jan 1

Fingerprint

Classifiers
Industrial plants
Adaptive boosting
Internet of things
Learning algorithms
Acoustic waves
Sensors
Industry

All Science Journal Classification (ASJC) codes

  • Computer Science (all)
  • Materials Science (all)
  • Engineering (all)

Cite this

@article{7789e6d6732443a09a12d2ca2e59aeed,
title = "Concept drift detection and adaption in big imbalance industrial IoT data using an ensemble learning method of offline classifiers",
abstract = "In a smart factory, thousands of industrial Internet of Things (IIoT) devices or sensors are installed in production machines to collect big data on machine conditions and transmit it to a cyber-physical system in the factory's cloud center. The system then employs a variety of condition-based maintenance (CBM) methods to predict when machines will start to operate abnormally, so that they can be maintained or their components replaced in advance, avoiding the manufacture of large numbers of defective products. CBM suffers from concept drift (i.e., the distribution of fault patterns may change over time) and imbalanced data (i.e., faulty samples account for a minority of all data). Ensemble learning, which integrates the diversity of multiple classifiers, provides a high-performance solution to these problems. In practice, most companies may not have sufficient budget to establish a sound infrastructure for real-time online classifiers, but may already have off-the-shelf offline classifiers in their existing systems. However, most previous work on ensemble learning has focused only on online classifiers. Consequently, this work proposes an ensemble learning algorithm that supports offline classifiers for three-stage CBM with concept drift and imbalanced data: Stages 1 (training an ensemble classifier) and 3 (creating a new ensemble) employ an improved Dynamic AdaBoost.NC classifier and the SMOTE method to address imbalanced data, and Stage 2 (detecting concept drift in imbalanced data) employs an improved LFR (Linear Four Rates) method. Experimental results on datasets with different degrees of imbalance show that the proposed method successfully detects all concept drifts and achieves an accuracy of over 94{\%} in detecting minority-class data.",
author = "Lin, Chun Cheng and Deng, Der-Jiunn and Kuo, Chin Hung and Chen, Linnan",
year = "2019",
month = "1",
day = "1",
doi = "10.1109/ACCESS.2019.2912631",
language = "English",
volume = "7",
pages = "56198--56207",
journal = "IEEE Access",
issn = "2169-3536",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

Concept drift detection and adaption in big imbalance industrial IoT data using an ensemble learning method of offline classifiers. / Lin, Chun Cheng; Deng, Der-Jiunn; Kuo, Chin Hung; Chen, Linnan.

In: IEEE Access, Vol. 7, 8694986, 01.01.2019, p. 56198-56207.

TY - JOUR

T1 - Concept drift detection and adaption in big imbalance industrial IoT data using an ensemble learning method of offline classifiers

AU - Lin, Chun Cheng

AU - Deng, Der-Jiunn

AU - Kuo, Chin Hung

AU - Chen, Linnan

PY - 2019/1/1

Y1 - 2019/1/1

N2 - In a smart factory, thousands of industrial Internet of Things (IIoT) devices or sensors are installed in production machines to collect big data on machine conditions and transmit it to a cyber-physical system in the factory's cloud center. The system then employs a variety of condition-based maintenance (CBM) methods to predict when machines will start to operate abnormally, so that they can be maintained or their components replaced in advance, avoiding the manufacture of large numbers of defective products. CBM suffers from concept drift (i.e., the distribution of fault patterns may change over time) and imbalanced data (i.e., faulty samples account for a minority of all data). Ensemble learning, which integrates the diversity of multiple classifiers, provides a high-performance solution to these problems. In practice, most companies may not have sufficient budget to establish a sound infrastructure for real-time online classifiers, but may already have off-the-shelf offline classifiers in their existing systems. However, most previous work on ensemble learning has focused only on online classifiers. Consequently, this work proposes an ensemble learning algorithm that supports offline classifiers for three-stage CBM with concept drift and imbalanced data: Stages 1 (training an ensemble classifier) and 3 (creating a new ensemble) employ an improved Dynamic AdaBoost.NC classifier and the SMOTE method to address imbalanced data, and Stage 2 (detecting concept drift in imbalanced data) employs an improved LFR (Linear Four Rates) method. Experimental results on datasets with different degrees of imbalance show that the proposed method successfully detects all concept drifts and achieves an accuracy of over 94% in detecting minority-class data.

AB - In a smart factory, thousands of industrial Internet of Things (IIoT) devices or sensors are installed in production machines to collect big data on machine conditions and transmit it to a cyber-physical system in the factory's cloud center. The system then employs a variety of condition-based maintenance (CBM) methods to predict when machines will start to operate abnormally, so that they can be maintained or their components replaced in advance, avoiding the manufacture of large numbers of defective products. CBM suffers from concept drift (i.e., the distribution of fault patterns may change over time) and imbalanced data (i.e., faulty samples account for a minority of all data). Ensemble learning, which integrates the diversity of multiple classifiers, provides a high-performance solution to these problems. In practice, most companies may not have sufficient budget to establish a sound infrastructure for real-time online classifiers, but may already have off-the-shelf offline classifiers in their existing systems. However, most previous work on ensemble learning has focused only on online classifiers. Consequently, this work proposes an ensemble learning algorithm that supports offline classifiers for three-stage CBM with concept drift and imbalanced data: Stages 1 (training an ensemble classifier) and 3 (creating a new ensemble) employ an improved Dynamic AdaBoost.NC classifier and the SMOTE method to address imbalanced data, and Stage 2 (detecting concept drift in imbalanced data) employs an improved LFR (Linear Four Rates) method. Experimental results on datasets with different degrees of imbalance show that the proposed method successfully detects all concept drifts and achieves an accuracy of over 94% in detecting minority-class data.

UR - http://www.scopus.com/inward/record.url?scp=85067017907&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85067017907&partnerID=8YFLogxK

U2 - 10.1109/ACCESS.2019.2912631

DO - 10.1109/ACCESS.2019.2912631

M3 - Article

VL - 7

SP - 56198

EP - 56207

JO - IEEE Access

JF - IEEE Access

SN - 2169-3536

M1 - 8694986

ER -