High-dimensional data mining in finance by Robust semi-supervised Kernel classifiers on maximum covariance discriminant subspace

Research output: Contribution to journal (Article)

Abstract

Kernel machines (such as support vector machines) have demonstrated excellent performance in numerous areas of pattern recognition. However, traditional kernel machines do not make efficient use of both labeled training data and unlabeled testing data. Moreover, high-dimensional and nonlinearly distributed data generally degrade the performance of a classifier due to the curse of dimensionality, especially in financial distress prediction. To address these problems, this study proposes a novel hybrid classifier that constructs a robust semi-supervised support vector machine (SVM) on a kernel partial least squares discriminant space (KPLSDS). KPLSDS is created by an optimal projection of the original data space onto a low-dimensional subspace that has maximum covariance between inputs and outputs. Robust semi-supervised SVMs constructed on KPLSDS exploit the candidate low-density separators and, with the help of unlabeled data, simultaneously avoid selecting a poor separator. Compared with other dimensionality reduction methods and conventional classifiers, the hybrid classifier performs best.
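
The "maximum covariance" projection mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it shows only the first direction of ordinary linear partial least squares, a simplified stand-in for the kernel variant, and it omits the semi-supervised SVM stage entirely. The function names and the toy data are hypothetical.

```python
import math


def first_pls_direction(X, y):
    """Return the unit vector w that maximizes cov(Xw, y) after centering."""
    n = len(X)
    d = len(X[0])
    # Center each feature column and the label vector.
    col_means = [sum(row[j] for row in X) / n for j in range(d)]
    y_mean = sum(y) / n
    Xc = [[row[j] - col_means[j] for j in range(d)] for row in X]
    yc = [v - y_mean for v in y]
    # With ||w|| = 1, the covariance w^T Xc^T yc is largest when w points
    # along Xc^T yc, so we compute that vector and normalize it.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(d)]
    norm = math.sqrt(sum(v * v for v in w))
    if norm == 0.0:
        raise ValueError("features have zero covariance with the labels")
    return [v / norm for v in w]


def project(X, w):
    """Project each row of X onto direction w, giving a one-dimensional score."""
    return [sum(a * b for a, b in zip(row, w)) for row in X]


# Hypothetical toy data: feature 0 tracks the label, feature 1 is unrelated.
X = [[1.0, 5.0], [2.0, -5.0], [8.0, 5.0], [9.0, -5.0]]
y = [-1.0, -1.0, 1.0, 1.0]
w = first_pls_direction(X, y)
scores = project(X, w)
```

On this toy data the direction puts all of its weight on the feature that covaries with the labels, so the one-dimensional scores separate the two classes. A kernel version would replace the inner products with kernel evaluations, and a classifier would then be trained on the projected scores.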

Original language: English
Pages (from-to): 7473-7492
Number of pages: 20
Journal: Information (Japan)
Volume: 16
Issue number: 10
Publication status: Published - 2013 Jan 1

Fingerprint

Finance
Data mining
Classifiers
Separators
Support vector machines
Pattern recognition
Testing

All Science Journal Classification (ASJC) codes

  • Information Systems

Cite this

@article{e2b5d3b86ae24636be74a0d358816e12,
title = "High-dimensional data mining in finance by Robust semi-supervised Kernel classifiers on maximum covariance discriminant subspace",
abstract = "Kernel machines (such as support vector machines) have demonstrated excellent performance in numerous areas of pattern recognition. However, traditional kernel machines do not make efficient use of both labeled training data and unlabeled testing data. Moreover, high-dimensional and nonlinearly distributed data generally degrade the performance of a classifier due to the curse of dimensionality, especially in financial distress prediction. To address these problems, this study proposes a novel hybrid classifier that constructs a robust semi-supervised support vector machine (SVM) on a kernel partial least squares discriminant space (KPLSDS). KPLSDS is created by an optimal projection of the original data space onto a low-dimensional subspace that has maximum covariance between inputs and outputs. Robust semi-supervised SVMs constructed on KPLSDS exploit the candidate low-density separators and, with the help of unlabeled data, simultaneously avoid selecting a poor separator. Compared with other dimensionality reduction methods and conventional classifiers, the hybrid classifier performs best.",
author = "Huang, {Shian Chang} and Wu, {Tung Kuang}",
year = "2013",
month = "1",
day = "1",
language = "English",
volume = "16",
pages = "7473--7492",
journal = "Information",
issn = "1343-4500",
publisher = "International Information Institute",
number = "10",

}

TY - JOUR

T1 - High-dimensional data mining in finance by Robust semi-supervised Kernel classifiers on maximum covariance discriminant subspace

AU - Huang, Shian Chang

AU - Wu, Tung Kuang

PY - 2013/1/1

Y1 - 2013/1/1

AB - Kernel machines (such as support vector machines) have demonstrated excellent performance in numerous areas of pattern recognition. However, traditional kernel machines do not make efficient use of both labeled training data and unlabeled testing data. Moreover, high-dimensional and nonlinearly distributed data generally degrade the performance of a classifier due to the curse of dimensionality, especially in financial distress prediction. To address these problems, this study proposes a novel hybrid classifier that constructs a robust semi-supervised support vector machine (SVM) on a kernel partial least squares discriminant space (KPLSDS). KPLSDS is created by an optimal projection of the original data space onto a low-dimensional subspace that has maximum covariance between inputs and outputs. Robust semi-supervised SVMs constructed on KPLSDS exploit the candidate low-density separators and, with the help of unlabeled data, simultaneously avoid selecting a poor separator. Compared with other dimensionality reduction methods and conventional classifiers, the hybrid classifier performs best.

UR - http://www.scopus.com/inward/record.url?scp=84893859849&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84893859849&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:84893859849

VL - 16

SP - 7473

EP - 7492

JO - Information

JF - Information

SN - 1343-4500

IS - 10

ER -