An approach to mining the multi-relational imbalanced database

Chien I. Lee, Cheng Jung Tsai, Tong Qin Wu, Wei Pang Yang

Research output: Contribution to journal › Article

15 Citations (Scopus)

Abstract

The class imbalance problem is an important issue in classification in data mining. For example, in applications such as fraudulent telephone call detection, telecommunications management, and rare diagnoses, users are more interested in the minority class than in the majority class. Although many algorithms have been proposed to address class imbalance, they cannot be applied directly to a multi-relational database. Yet much of today's data, such as financial transactions and medical histories, is stored in multi-relational databases rather than in a single table. Meanwhile, the widely used multi-relational classification approaches, such as TILDE, FOIL, and CrossMine, are insensitive to class imbalance. In this paper, we propose a multi-relational g-mean decision tree algorithm to solve the imbalance problem in a multi-relational database. Our experiments show that the approach mines a multi-relational imbalanced database more accurately.
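
The abstract does not define the g-mean. As a minimal sketch, assuming the commonly used definition (the geometric mean of the true positive rate and the true negative rate), the following illustrates why plain accuracy can mislead on imbalanced data. It shows only the evaluation measure, not the authors' decision tree algorithm; the function names and the toy labels are hypothetical.

```python
import math

def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of true positive rate and true negative rate.

    Assumed definition: sqrt(TPR * TNR). `positive` marks the minority class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(tpr * tnr)

def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy data (hypothetical): 95 majority examples (0) and 5 minority examples (1).
y_true = [0] * 95 + [1] * 5

# Classifier A always predicts the majority class.
pred_a = [0] * 100

# Classifier B finds 4 of the 5 minority examples at the cost of 10 false alarms.
pred_b = [0] * 85 + [1] * 10 + [1] * 4 + [0] * 1

print("A: accuracy", accuracy(y_true, pred_a), "g-mean", g_mean(y_true, pred_a))
print("B: accuracy", accuracy(y_true, pred_b), "g-mean", g_mean(y_true, pred_b))
```

Classifier A scores higher on accuracy (0.95 against 0.89) yet has a g-mean of 0 because it never detects the minority class, whereas classifier B reaches a g-mean of roughly 0.85. A measure of this kind rewards performance on both classes, which is the motivation for using it on imbalanced data.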

Original language: English
Pages (from-to): 3021-3032
Number of pages: 12
Journal: Expert Systems with Applications
Volume: 34
Issue number: 4
DOI: 10.1016/j.eswa.2007.05.048
Publication status: Published - 2008 May 1

Fingerprint

Decision trees
Telephone
Telecommunication
Data mining
Experiments

All Science Journal Classification (ASJC) codes

  • Engineering (all)
  • Computer Science Applications
  • Artificial Intelligence

Cite this

Lee, Chien I. ; Tsai, Cheng Jung ; Wu, Tong Qin ; Yang, Wei Pang. / An approach to mining the multi-relational imbalanced database. In: Expert Systems with Applications. 2008 ; Vol. 34, No. 4. pp. 3021-3032.
@article{7283a60ca20848e78cde3aecdc022881,
title = "An approach to mining the multi-relational imbalanced database",
author = "Lee, {Chien I.} and Tsai, {Cheng Jung} and Wu, {Tong Qin} and Yang, {Wei Pang}",
year = "2008",
month = "5",
day = "1",
doi = "10.1016/j.eswa.2007.05.048",
language = "English",
volume = "34",
pages = "3021--3032",
journal = "Expert Systems with Applications",
issn = "0957-4174",
publisher = "Elsevier Limited",
number = "4",

}

TY - JOUR

T1 - An approach to mining the multi-relational imbalanced database

AU - Lee, Chien I.

AU - Tsai, Cheng Jung

AU - Wu, Tong Qin

AU - Yang, Wei Pang

PY - 2008/5/1

Y1 - 2008/5/1

UR - http://www.scopus.com/inward/record.url?scp=38649105114&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=38649105114&partnerID=8YFLogxK

U2 - 10.1016/j.eswa.2007.05.048

DO - 10.1016/j.eswa.2007.05.048

M3 - Article

AN - SCOPUS:38649105114

VL - 34

SP - 3021

EP - 3032

JO - Expert Systems with Applications

JF - Expert Systems with Applications

SN - 0957-4174

IS - 4

ER -