Detecting drifting concepts on the internet

Chien I. Lee, Cheng Jung Tsai, Chien Hui Hsieh

Research output: Contribution to journal › Article

2 Citations (Scopus)

Abstract

With the explosive growth of information sources available on the World Wide Web, it has become increasingly necessary to use automated tools to discover interesting and potentially useful patterns in data on the Internet. Since data on the Internet, such as communication packages, email, and e-commerce transactions, arrive consecutively, an efficient and accurate incremental learning approach is required. Moreover, since the labels of these data may change over time, the problem of concept drift must be considered when learning incrementally from data on the Internet. In this paper, we give a detailed discussion of the concept-drifting problem on the Internet. We also address a new problem called two-way drift. An approach that adapts to the occurrence of concept drift is then proposed as a solution for learning incrementally from data on the Internet. Our approach works as a preprocessor that detects the occurrence of concept drift and can be incorporated into any existing classification technique. Our approach can also reveal which attribute values cause concept drift and therefore enables systems or decision makers to make appropriate decisions in advance.
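As an illustration only, the idea of a preprocessor that flags which attribute values are associated with concept drift can be sketched as follows. This is not the authors' algorithm, which the abstract does not specify. It is a minimal sketch that assumes the data arrive in blocks of (attributes, label) records, and it compares the class-label distribution of each attribute value between an older and a newer block using total variation distance. The function names, the threshold, and the sample email data are all hypothetical.

```python
from collections import Counter, defaultdict


def label_distributions(block):
    """Map each (attribute, value) pair to a Counter of class labels."""
    counts = defaultdict(Counter)
    for attributes, label in block:
        for pair in attributes.items():
            counts[pair][label] += 1
    return counts


def total_variation(c1, c2):
    """Total variation distance between two label Counters (0.0 to 1.0)."""
    n1, n2 = sum(c1.values()), sum(c2.values())
    labels = set(c1) | set(c2)
    return 0.5 * sum(abs(c1[l] / n1 - c2[l] / n2) for l in labels)


def drifting_attribute_values(old_block, new_block, threshold=0.5):
    """Return attribute values whose label distribution shifted by at least
    `threshold` between the two blocks (hypothetical drift criterion)."""
    old = label_distributions(old_block)
    new = label_distributions(new_block)
    flagged = {}
    # Only compare attribute values that appear in both blocks.
    for pair in old.keys() & new.keys():
        distance = total_variation(old[pair], new[pair])
        if distance >= threshold:
            flagged[pair] = distance
    return flagged


# Hypothetical example: mail from "shop" was labeled ham, later spam.
old_block = [
    ({"sender": "shop"}, "ham"),
    ({"sender": "shop"}, "ham"),
    ({"sender": "friend"}, "ham"),
    ({"sender": "friend"}, "ham"),
]
new_block = [
    ({"sender": "shop"}, "spam"),
    ({"sender": "shop"}, "spam"),
    ({"sender": "friend"}, "ham"),
    ({"sender": "friend"}, "ham"),
]

flagged = drifting_attribute_values(old_block, new_block)
print(flagged)
```

In this sketch, a non-empty result could be used to decide whether a downstream classifier should be retrained before the next block is processed; how the paper itself makes that decision is not stated in the abstract.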

Original language: English
Pages (from-to): 229-236
Number of pages: 8
Journal: Journal of Internet Technology
Volume: 9
Issue number: 3
Publication status: Published - 2008 Jul 1

Fingerprint

Internet
Electronic mail
World Wide Web
Labels
Communication

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Networks and Communications

Cite this

Lee, Chien I. ; Tsai, Cheng Jung ; Hsieh, Chien Hui. / Detecting drifting concepts on the internet. In: Journal of Internet Technology. 2008 ; Vol. 9, No. 3. pp. 229-236.
@article{bf8cb31042b14d3c8d67b4df0429f647,
title = "Detecting drifting concepts on the internet",
abstract = "With the explosive growth of information sources available on the World Wide Web, it has become increasingly necessary to use automated tools to discover interesting and potentially useful patterns in data on the Internet. Since data on the Internet, such as communication packages, email, and e-commerce transactions, arrive consecutively, an efficient and accurate incremental learning approach is required. Moreover, since the labels of these data may change over time, the problem of concept drift must be considered when learning incrementally from data on the Internet. In this paper, we give a detailed discussion of the concept-drifting problem on the Internet. We also address a new problem called two-way drift. An approach that adapts to the occurrence of concept drift is then proposed as a solution for learning incrementally from data on the Internet. Our approach works as a preprocessor that detects the occurrence of concept drift and can be incorporated into any existing classification technique. Our approach can also reveal which attribute values cause concept drift and therefore enables systems or decision makers to make appropriate decisions in advance.",
author = "Lee, {Chien I.} and Tsai, {Cheng Jung} and Hsieh, {Chien Hui}",
year = "2008",
month = "7",
day = "1",
language = "English",
volume = "9",
pages = "229--236",
journal = "Journal of Internet Technology",
issn = "1607-9264",
publisher = "Taiwan Academic Network Management Committee",
number = "3",

}

Lee, CI, Tsai, CJ & Hsieh, CH 2008, 'Detecting drifting concepts on the internet', Journal of Internet Technology, vol. 9, no. 3, pp. 229-236.

TY - JOUR

T1 - Detecting drifting concepts on the internet

AU - Lee, Chien I.

AU - Tsai, Cheng Jung

AU - Hsieh, Chien Hui

PY - 2008/7/1

Y1 - 2008/7/1

N2 - With the explosive growth of information sources available on the World Wide Web, it has become increasingly necessary to use automated tools to discover interesting and potentially useful patterns in data on the Internet. Since data on the Internet, such as communication packages, email, and e-commerce transactions, arrive consecutively, an efficient and accurate incremental learning approach is required. Moreover, since the labels of these data may change over time, the problem of concept drift must be considered when learning incrementally from data on the Internet. In this paper, we give a detailed discussion of the concept-drifting problem on the Internet. We also address a new problem called two-way drift. An approach that adapts to the occurrence of concept drift is then proposed as a solution for learning incrementally from data on the Internet. Our approach works as a preprocessor that detects the occurrence of concept drift and can be incorporated into any existing classification technique. Our approach can also reveal which attribute values cause concept drift and therefore enables systems or decision makers to make appropriate decisions in advance.

UR - http://www.scopus.com/inward/record.url?scp=49449085666&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=49449085666&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:49449085666

VL - 9

SP - 229

EP - 236

JO - Journal of Internet Technology

JF - Journal of Internet Technology

SN - 1607-9264

IS - 3

ER -