TY - JOUR
T1 - Driving behaviors analysis based on feature selection and statistical approach
T2 - a preliminary study
AU - Chen, Mu Song
AU - Hwang, Chi Pan
AU - Ho, Tze Yee
AU - Wang, Hsuan Fu
AU - Shih, Chih Min
AU - Chen, Hsing Yu
AU - Liu, Wen Kai
PY - 2019/4/1
Y1 - 2019/4/1
N2 - Due to the prevalence of Internet of Vehicles (IoV) technology, big data has increasingly been promoted as a revolutionary development in a variety of applications. The big data received from IoV is particularly valuable for those who analyze driver behavior. For instance, in the fleet management domain, fleet administrators are interested in fine-grained information about fleet usage, which is influenced by different driver usage patterns. In the vehicle insurance market, usage-based insurance or pay-as-you-drive schemes aim to adapt the insurance premium to individual driver behavior or even to provide various value-added services to policyholders. These applications can be expected to improve individual driving styles and make them safer. Big data analysis is becoming indispensable for automatically discovering the intelligence embedded in frequently occurring patterns and hidden rules, so it is essential to study how to utilize these large-scale data. For driving behavior analysis, this paper presents a preliminary study based on feature selection and a statistical approach. Feature selection is an important and frequently used technique in data preprocessing for big data mining. As a dimensionality reduction technique, it chooses a small subset of significant features from the original data by removing irrelevant or redundant features. According to the selection process, the most significant feature in the collected vehicular data is vehicle speed. The statistical approach then calculates the skewness and dispersion of the speed distribution as statistical features for driving behavior analysis. Finally, the established classification rules not only support data-driven services and big data analytics but also offer training data samples for supervised machine learning algorithms. To validate the feasibility of the proposed method, over 150 drivers and more than 200,000 trips are evaluated in the simulation. As expected, the experimental results match our observations well.
UR - http://www.scopus.com/inward/record.url?scp=85053854959&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85053854959&partnerID=8YFLogxK
U2 - 10.1007/s11227-018-2618-9
DO - 10.1007/s11227-018-2618-9
M3 - Article
AN - SCOPUS:85053854959
VL - 75
SP - 2007
EP - 2026
JO - Journal of Supercomputing
JF - Journal of Supercomputing
SN - 0920-8542
IS - 4
ER -