Identifying Misinformation Spreaders on Twitter using Ego-Centric Network Embeddings

Published:

Identifying Misinformation Spreaders on Twitter using Ego-Centric Network Embeddings by Atta Ullah (2023)

Social media platforms such as Twitter remove distance barriers and play an important role in broadcasting information because of their ease of use, speed, and accessibility. As a result, a huge volume of information is generated every day. However, this openness also enables the spread of misinformation, or false information, which can lead to catastrophic events and spread uncertainty in society. Some organizations, such as Snopes and PolitiFact, check the authenticity of social media posts related to politicians and celebrities. But misinformation is not limited to politicians and celebrities, so an automatic approach is needed to detect it in time. To address this problem, researchers have used machine learning and deep learning models such as SVM, XGBoost, Random Forest, Decision Trees, and Recurrent Neural Networks, extracting stylometric features (such as sentence length, sentence segmentation, part-of-speech tagging, and tokenization) and linguistic features (such as sentiment features, word frequency, and bag-of-words). Most of these approaches use only the content of tweets to detect misinformation. Because false information is engineered to influence a wide range of users, it is difficult to detect purely from content. Similarly, a model fine-tuned on one dataset does not perform well on another because of domain differences; for example, a model fine-tuned on “COVID-19” datasets does not perform well on datasets of political statements. Therefore, additional information such as social context or propagation features is required to detect misinformation. In this thesis, we propose a model that distinguishes misinformation spreaders from regular Twitter users by utilizing the propagation features of tweets (i.e., how information flows through the network). The proposed model is based on a Graph Neural Network (GNN) and consists of two parts.
First, we generate an ego-centric graph of up to 3 hops, and then we apply a state-of-the-art GNN model to detect misinformation spreaders. Experimental results show that the deep learning classifier “Deep Graph Convolutional Neural Network” (DGCNN) performs best in terms of Matthews Correlation Coefficient (MCC). The DGCNN consists of three parts: first, Graph Convolutional Network layers learn an embedding for each user by aggregating neighborhood information; then a SortPooling layer sorts the vertex features; finally, a traditional convolutional layer and a dense layer learn graph-level embeddings. Experimental results show that propagation features are valuable for detecting misinformation. We compare our results with baseline models, and the proposed model outperforms them in terms of Accuracy, ROC-AUC, and MCC.
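
To make the first step concrete, the following is a minimal sketch of how an ego-centric graph of up to 3 hops could be extracted with a breadth-first search. It is not the thesis implementation. The function name `ego_graph`, the adjacency-dictionary representation, the directed edges, and the toy user names are all assumptions made for illustration; the abstract does not specify which Twitter relation (for example, follows or retweets) defines an edge.

```python
from collections import deque


def ego_graph(adjacency, ego, max_hops=3):
    """Collect the subgraph reachable from `ego` within `max_hops` hops.

    `adjacency` is a hypothetical dict mapping each user to the users it
    links to (directed edges). Returns a dict of hop distances and the set
    of edges whose endpoints both lie inside the ego network.
    """
    hops = {ego: 0}          # user -> distance from the ego user
    queue = deque([ego])
    while queue:
        user = queue.popleft()
        if hops[user] == max_hops:
            continue         # do not expand beyond the hop limit
        for neighbor in adjacency.get(user, ()):
            if neighbor not in hops:
                hops[neighbor] = hops[user] + 1
                queue.append(neighbor)
    # Keep only edges between users that made it into the ego network.
    edges = {
        (u, v)
        for u in hops
        for v in adjacency.get(u, ())
        if v in hops
    }
    return hops, edges


# Toy network (invented data): "e" is 4 hops from "ego", so it is excluded.
toy = {
    "ego": ["a", "b"],
    "a": ["c"],
    "b": ["c"],
    "c": ["d"],
    "d": ["e"],
    "e": ["ego"],
}

hops, edges = ego_graph(toy, "ego", max_hops=3)
print(sorted(hops.items(), key=lambda item: item[1]))
print(sorted(edges))
```

Each user's ego graph produced this way would then be passed to the graph classifier as one sample, labeled as either a misinformation spreader or a regular user.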