Publication: Application of machine learning technique to estimate modal choice for Kuantan city
Date
2023-02-01
Authors
Nur Fahriza Binti Mohd Ali
Abstract
The machine learning technique has been rapidly adopted due to its
effectiveness in modelling travel mode choice when compared to the conventional
technique of Discrete Choice Modelling by means of Logistic Regression. The general
aim of the work reported in this thesis is to identify a promising alternative approach
for modelling travel mode choice effectively for solving the current transport problems
in Kuantan. To achieve this aim, the research pursues three main objectives:
to identify the parameters contributing to users’ daily travel mode choice; to develop
machine learning models that can classify users’ daily travel mode choice; and to
analyse the implications of the machine learning technique for modelling future travel
mode choice. A Revealed/Stated Preference (RP/SP) survey was conducted
involving 386 respondents in Kuantan City, Pahang, Malaysia. Travel mode choice
was modelled for door-to-door journeys, which comprise walking distance from home to
the nearest stop (WD1), waiting time (WT), in-vehicle time (IVT), and walking
distance from the last stop to the destination (WD2). Feature importance analysis
by means of a feature selection technique indicated that WT, total Travel
Time (TT), Region, WD1, Ticket, IVT, Reason for the journey, WD2, Employment
Status, and Age of respondents are more significant than the remaining variables.
Several machine learning models were developed and compared to identify the
most effective model for travel mode choice: Neural Network (NN),
Random Forest (RF), Logistic Regression (LR), Naïve Bayes (NB), k-Nearest
Neighbour (kNN), Decision Tree (DT), and Support Vector Machine (SVM). Among
them, NN was the most effective in modelling travel mode choice,
with training and testing Classification Accuracy (CA) of about 0.727 and 0.721
respectively, followed by LR (training: 0.714, testing: 0.714), RF (training: 0.713,
testing: 0.685), NB (training: 0.697, testing: 0.675), DT (training: 0.661, testing:
0.660), kNN (training: 0.667, testing: 0.673), and SVM (training: 0.555, testing:
0.558). Additionally, using a data augmentation technique, prediction with the NN
model showed the strongest shift in users’ interest towards public
transport when the total travel time (TT) was improved by 30%: private
vehicle (N) users fell from an initial 1165 to 561, whilst public transport (P)
users rose from 791 to 1395. On average, users were inclined to switch
from private vehicles to public transport if the travel time offered matched
their needs or expectations, namely WD1 (4.49 minutes), WT (9.55 minutes), IVT
(24.89 minutes), WD2 (4.97 minutes), and TT (43.90 minutes), with a fare of RM 2.93 for
a journey of about 40 kilometres from Kuantan City. It can be concluded that the
machine learning technique is an effective approach for developing travel mode choice
models.
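
The figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the thesis: it assumes that TT is the sum of the four door-to-door components (WD1, WT, IVT, WD2) and that the before and after user counts describe the same augmented sample. All variable and function names are hypothetical.

```python
# Illustrative consistency check on figures quoted in the abstract.
# Assumptions (not stated in the thesis): TT = WD1 + WT + IVT + WD2,
# and the before/after counts refer to the same augmented sample.

# Average expected travel time components, in minutes.
components_min = {"WD1": 4.49, "WT": 9.55, "IVT": 24.89, "WD2": 4.97}
total_tt = round(sum(components_min.values()), 2)

# Predicted users per mode before and after a 30% improvement in TT.
# N = private vehicle, P = public transport.
before = {"N": 1165, "P": 791}
after = {"N": 561, "P": 1395}


def share(counts, mode):
    """Return the fraction of users choosing `mode`."""
    return counts[mode] / sum(counts.values())


# Users predicted to move from private vehicles to public transport.
switched = before["N"] - after["N"]
public_share_before = share(before, "P")
public_share_after = share(after, "P")

print(f"Total travel time: {total_tt:.2f} minutes")
print(f"Users switching from N to P: {switched}")
print(f"Public transport share: {public_share_before:.1%} -> {public_share_after:.1%}")
```

Under these assumptions the components add up to the reported TT of 43.90 minutes, and the 604 users leaving private vehicles match the 604 users gained by public transport, so the quoted figures are internally consistent.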