Using Twitter to Analyze Initiatives of a Startup Company

Using Twitter to Analyze Initiatives of a Startup Company by Bashayer Alotaibi (2020)

Public opinion matters to almost every organization, company, and government. These entities increasingly recognize the value of unstructured social network data, and its analysis has become a growing research area. Twitter, as a microblogging platform, is a significant and easily accessible source of public opinion. In the business domain, prior research on analyzing startup activities through Twitter is limited, especially for the Arabic language. In the field of Twitter analysis, there is also a lack of Twitter-based analytics frameworks that combine different analysis methods to make better use of a Twitter dataset. This thesis aims to fill these gaps by proposing a Twitter analytics-based framework called Startup Initiatives Response Analysis (SIRA), which assesses the performance of an initiative taken by a startup using text classification, sentiment analysis, and statistical analysis techniques. The proposed framework is validated empirically through a case study of an Arabic startup, Careem, and its initiative to empower women to work at Careem. The experiment used supervised machine learning to build the subject and sentiment classification models. The classification models were evaluated through a comparative analysis of a variety of machine learning classifiers and various levels of preprocessing, with the aim of improving the performance of Arabic text mining. The experiment yielded the following results: for both Arabic classification models, CNB achieved the highest F1 measure when text cleaning and normalization were applied as preprocessing techniques, while for both English classification models, NN achieved the highest F1 measure.
In addition, several statistical analyses were conducted on the classified dataset and presented (e.g., tweet reply frequency and the temporal distribution of tweets). The analysis of the experimental results confirms the effectiveness of the framework in delivering valuable insights into public responsiveness, based on a comprehensive qualitative and quantitative analysis of the Twitter dataset. The proposed framework (SIRA) is applicable as a Twitter-based analytics framework in any domain and for any purpose. Validating the framework in further experiments on different datasets is recommended.
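
As an illustration of the text cleaning and normalization preprocessing mentioned in the abstract, the following is a minimal Python sketch. The abstract does not list the thesis's actual rules, so every rule here (stripping URLs and mentions, removing diacritics and tatweel, unifying alef, ya, and ta marbuta forms) is an assumption based on common Arabic text mining practice, and the function names `clean_tweet` and `normalize_arabic` are hypothetical.

```python
import re

# Assumed rule set: the abstract does not specify the thesis's exact
# cleaning and normalization steps, so these are common conventions only.
URL_OR_MENTION = re.compile(r"https?://\S+|@\w+")
DIACRITICS = re.compile(r"[\u064B-\u0652]")           # tashkeel (short vowel marks)
TATWEEL = "\u0640"                                    # elongation character
ALEF_VARIANTS = re.compile(r"[\u0622\u0623\u0625]")   # alef with madda / hamza
NON_ARABIC = re.compile(r"[^\u0621-\u064A\s]")        # anything but Arabic letters


def clean_tweet(text: str) -> str:
    """Hypothetical cleaning step: drop URLs, mentions, marks, and symbols."""
    text = URL_OR_MENTION.sub(" ", text)
    # Remove diacritics and tatweel before the catch-all filter so that
    # words are not split where these characters occur.
    text = DIACRITICS.sub("", text)
    text = text.replace(TATWEEL, "")
    text = NON_ARABIC.sub(" ", text)
    return " ".join(text.split())  # collapse repeated whitespace


def normalize_arabic(text: str) -> str:
    """Hypothetical normalization step: unify common letter variants."""
    text = ALEF_VARIANTS.sub("\u0627", text)   # alef variants -> bare alef
    text = text.replace("\u0649", "\u064A")    # alef maqsura -> ya
    text = text.replace("\u0629", "\u0647")    # ta marbuta -> ha
    return text


if __name__ == "__main__":
    raw = "@user \u0623\u0647\u0644\u0627\u064B https://t.co/x"
    print(normalize_arabic(clean_tweet(raw)))
```

Keeping cleaning and normalization as separate functions makes it straightforward to compare preprocessing levels (none, cleaning only, cleaning plus normalization), in the spirit of the comparative evaluation the abstract describes.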

Paper published