Analysis and Classification of Traffic Events on Social Media
Analysis and Classification of Traffic Events on Social Media by Hajra Ali (2016)
Traffic congestion is a significant social issue that wastes time and causes economic loss. Many studies note that transportation systems increasingly rely on data for efficient traffic management and control. The rapid emergence of social media provides a novel avenue for creating and disseminating information. In the past few years, the use of smartphone applications has increased enormously. These applications offer access to information and services and produce a significant amount of real-time information. Among social media platforms, Twitter is one of the most widely used micro-blogging and social networking services and has been extensively investigated for real-time event detection. Twitter users share information about many real-time phenomena by posting short text messages, some of which contain information about traffic events. In this study, we use Twitter as our main source of data on traffic-related information. Mining techniques can discover hidden and useful patterns in the traffic data available on Twitter, so exploring them as a way to provide traffic-related information is worthwhile. In machine learning, classification algorithms build models that assign data to classes based on features. In this thesis, we present a framework that uses a classification-based method to categorize traffic data into classes representing traffic events (i.e., traffic jam, accident, road blocked). Traffic tweets contain information not only about traffic jams or congestion but also about the minor and major events that cause traffic on the roads. It is therefore important to identify all of these traffic events. Such a system can help commuters and traffic authorities regulate traffic flow by taking immediate measures.
To address this classification problem, we use the Random Forest and Support Vector Machine classification algorithms. We also investigate and identify features that capture important characteristics of the data and use them to carry out our experiments. We further use combinations of features that can contribute to better accuracy. To evaluate the performance of the classifier models, we use three performance metrics: precision, recall, and accuracy. Experimental results indicate the effectiveness of our approach for automating traffic event classification.
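
As an illustration of the evaluation metrics named above, the following is a minimal sketch of how accuracy and per-class precision and recall can be computed for a multi-class labelling task. It is not the thesis's evaluation code: the function name `per_class_metrics`, the label strings, and the example predictions are all hypothetical, and the abstract does not state whether the metrics were reported per class or averaged.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return (accuracy, report) where report maps each label to its
    precision and recall. Hypothetical helper, for illustration only."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]  # times the label was predicted
        actual = tp[label] + fn[label]     # times the label truly occurred
        report[label] = {
            # Convention assumed here: a zero denominator yields 0.0.
            "precision": tp[label] / predicted if predicted else 0.0,
            "recall": tp[label] / actual if actual else 0.0,
        }

    accuracy = sum(tp.values()) / len(y_true) if y_true else 0.0
    return accuracy, report


# Made-up labels, not data from the thesis.
y_true = ["traffic jam", "accident", "road blocked",
          "accident", "traffic jam", "traffic jam"]
y_pred = ["traffic jam", "accident", "traffic jam",
          "traffic jam", "traffic jam", "accident"]

accuracy, report = per_class_metrics(y_true, y_pred)
print(f"accuracy: {accuracy:.2f}")
for label, scores in report.items():
    print(f"{label}: precision={scores['precision']:.2f}, "
          f"recall={scores['recall']:.2f}")
```

Reporting precision and recall per class alongside overall accuracy is useful when event classes are imbalanced, since a classifier can reach high accuracy while missing a rare class entirely.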