Driving behaviors analysis based on feature selection and statistical approach

a preliminary study

Mu Song Chen, Chipan Hwang, Tze Yee Ho, Hsuan Fu Wang, Chih Min Shih, Hsing Yu Chen, Wen Kai Liu

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

With the spread of Internet of Vehicles (IoV) technology, big data has increasingly been promoted as a revolutionary development in a variety of applications. The data received from IoV is particularly valuable to those who analyze driver behavior. In fleet management, for instance, administrators want fine-grained information about fleet usage, which is influenced by the usage patterns of different drivers. In the vehicle insurance market, usage-based insurance or pay-as-you-drive schemes aim to adapt the premium to individual driver behavior, or even to provide value-added services to policyholders. These applications can be expected to improve individual driving styles and make them safer. Big data analysis is becoming indispensable for automatically discovering the knowledge contained in frequently occurring patterns and hidden rules, so it is essential to study how to use such large-scale data. For driving behavior analysis, this paper presents a preliminary study based on feature selection and a statistical approach. Feature selection is an important and frequently used technique in data preprocessing for big data mining. As a dimensionality reduction technique, it chooses a small subset of significant features from the original data by removing irrelevant or redundant ones. According to the selection process, vehicle speed is the most significant feature in the collected vehicular data. The statistical approach then calculates the skewness and dispersion of the speed distribution as statistical features for driving behavior analysis. Finally, the established classification rules not only provide data-driven services and big data analytics but also offer training samples for supervised machine learning algorithms. To validate the feasibility of the proposed method, more than 150 drivers and more than 200,000 trips are evaluated in the simulation. As expected, the experimental results match our observations well.
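
The statistical step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which dispersion measure or which thresholds the paper uses, so the coefficient of variation, the function names `speed_features` and `label_trip`, the threshold values, and the labels are all assumptions made for illustration only.

```python
import math
from typing import Dict, Sequence


def speed_features(speeds: Sequence[float]) -> Dict[str, float]:
    """Compute mean, dispersion, and skewness of a trip's speed samples."""
    n = len(speeds)
    if n < 2:
        raise ValueError("need at least two speed samples")
    mean = sum(speeds) / n
    # Population central moments of order 2 and 3.
    m2 = sum((s - mean) ** 2 for s in speeds) / n
    m3 = sum((s - mean) ** 3 for s in speeds) / n
    std = math.sqrt(m2)
    # Moment-based skewness m3 / m2^(3/2); defined as 0 for constant speed.
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    # Coefficient of variation as the dispersion measure (an assumption).
    cv = std / mean if mean > 0 else 0.0
    return {"mean": mean, "std": std, "cv": cv, "skewness": skewness}


def label_trip(features: Dict[str, float],
               skew_limit: float = -0.5,
               cv_limit: float = 0.5) -> str:
    """Apply hypothetical threshold rules; the limits are not from the paper."""
    if features["skewness"] < skew_limit:
        # Most samples sit at the high end, with a tail toward low speeds.
        return "skewed_toward_high_speed"
    if features["cv"] > cv_limit:
        return "highly_variable_speed"
    return "steady"


if __name__ == "__main__":
    trip = [100.0, 100.0, 100.0, 100.0, 10.0]  # made-up speeds in km/h
    feats = speed_features(trip)
    print(feats, label_trip(feats))
```

Labels produced by rules of this kind could then serve as training targets for a supervised classifier, which is the role the abstract assigns to the established classification rules.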

Original language: English
Pages (from-to): 2007-2026
Number of pages: 20
Journal: Journal of Supercomputing
Volume: 75
Issue number: 4
DOI: 10.1007/s11227-018-2618-9
Publication status: Published - 2019 Apr 1

Fingerprint

Feature Selection
Feature extraction
Insurance
Driver
Learning algorithms
Data mining
Data Preprocessing
Learning systems
Classification Rules
Dimensionality Reduction
Supervised Learning
Skewness
Big data
Data-driven
Data analysis
Machine Learning
Calculate
Subset

All Science Journal Classification (ASJC) codes

  • Software
  • Theoretical Computer Science
  • Information Systems
  • Hardware and Architecture

Cite this

Chen, Mu Song ; Hwang, Chipan ; Ho, Tze Yee ; Wang, Hsuan Fu ; Shih, Chih Min ; Chen, Hsing Yu ; Liu, Wen Kai. / Driving behaviors analysis based on feature selection and statistical approach : a preliminary study. In: Journal of Supercomputing. 2019 ; Vol. 75, No. 4. pp. 2007-2026.
@article{430871cb791645249168b5229b63cf1b,
title = "Driving behaviors analysis based on feature selection and statistical approach: a preliminary study",
author = "Chen, {Mu Song} and Chipan Hwang and Ho, {Tze Yee} and Wang, {Hsuan Fu} and Shih, {Chih Min} and Chen, {Hsing Yu} and Liu, {Wen Kai}",
year = "2019",
month = "4",
day = "1",
doi = "10.1007/s11227-018-2618-9",
language = "English",
volume = "75",
pages = "2007--2026",
journal = "Journal of Supercomputing",
issn = "0920-8542",
publisher = "Springer Netherlands",
number = "4",

}

TY - JOUR

T1 - Driving behaviors analysis based on feature selection and statistical approach

T2 - a preliminary study

AU - Chen, Mu Song

AU - Hwang, Chipan

AU - Ho, Tze Yee

AU - Wang, Hsuan Fu

AU - Shih, Chih Min

AU - Chen, Hsing Yu

AU - Liu, Wen Kai

PY - 2019/4/1

Y1 - 2019/4/1

UR - http://www.scopus.com/inward/record.url?scp=85053854959&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85053854959&partnerID=8YFLogxK

U2 - 10.1007/s11227-018-2618-9

DO - 10.1007/s11227-018-2618-9

M3 - Article

VL - 75

SP - 2007

EP - 2026

JO - Journal of Supercomputing

JF - Journal of Supercomputing

SN - 0920-8542

IS - 4

ER -