Targeted aspect-based sentiment analysis for Ugandan social media reviews
Abstract
Social media has emerged as a popular sounding board for customers to express their experiences with products or services. Keeping track of customers' emotions helps businesses better understand customer feedback and opinions. However, social media data is voluminous and arrives at high velocity, which makes manual summarisation impractical. Different levels of sentiment analysis have facilitated automatic summarisation of social media reviews.
However, these levels do not capture the sentiment over multiple aspects of multiple target entities. To address this limitation, we introduce targeted aspect-based sentiment analysis (TABSA).
TABSA extracts both the aspects and the target entities from review text, then resolves the sentiment towards the aspects.
In this research, we investigated whether Twitter and Facebook reviews contain enough information for TABSA. We also explored whether machine learning models can be trained to extract TABSA information from these reviews. To achieve the first objective, we extracted Ugandan telecom social media reviews from Twitter and Facebook for the period between February 2019 and September 2019.
We collected almost 22,000 reviews from Twitter and Facebook. The reviews were either in English or code-mixed English and Luganda. The reviews were preprocessed to remove spam and tweets or posts made by the telecoms themselves. The remaining reviews were human-annotated with the target telecom, the aspects and the sentiment towards the aspects of the target telecom.
After the annotation, a dataset called SentiTel was constructed with final Cohen's kappa inter-annotator agreement of 0.59, 0.73 and 0.60 respectively on the aspects. SentiTel contains 6,320 telecom-related reviews. The annotation of the SentiTel dataset demonstrated that Twitter and Facebook reviews are rich in information and can be used for TABSA.
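For readers unfamiliar with the agreement statistic, the following is a minimal sketch of how Cohen's kappa can be computed for two annotators. The label lists are hypothetical and are not drawn from SentiTel; the function name `cohen_kappa` is our own illustration, not the tooling used in the study.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of items on which both annotators agree.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement expected from each annotator's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    p_e = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical sentiment labels from two annotators on six reviews.
annotator_1 = ["negative", "negative", "positive", "none", "negative", "positive"]
annotator_2 = ["negative", "positive", "positive", "none", "negative", "negative"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

Kappa corrects raw agreement for the agreement expected by chance, so it is lower than the simple fraction of matching labels whenever the annotators share a skewed label distribution.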
To achieve the second objective, we trained several machine learning models, including random forest, logistic regression, attentive Bi-LSTM and BERT. The TABSA task is modelled as a sentence-pair classification task with the labels "positive", "negative" and "none".
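One way to picture the sentence-pair formulation is sketched below: each review is paired with an auxiliary sentence naming a target and an aspect, and the pair receives one of the three labels. The target names, aspect categories, auxiliary-sentence template and the helper `build_sentence_pairs` are all assumptions made for illustration; the abstract does not specify them.

```python
from itertools import product

TARGETS = ["TelecomA", "TelecomB"]                  # hypothetical target entities
ASPECTS = ["network", "data", "customer service"]   # hypothetical aspect categories


def build_sentence_pairs(review, annotations):
    """Expand one review into (review, auxiliary sentence, label) triples.

    `annotations` maps (target, aspect) to "positive" or "negative";
    every combination that was not annotated gets the label "none".
    """
    pairs = []
    for target, aspect in product(TARGETS, ASPECTS):
        auxiliary = f"{target} - {aspect}"  # assumed template for the second sentence
        label = annotations.get((target, aspect), "none")
        pairs.append((review, auxiliary, label))
    return pairs


review = "TelecomA data is too expensive but TelecomB network is stable"
annotations = {
    ("TelecomA", "data"): "negative",
    ("TelecomB", "network"): "positive",
}
pairs = build_sentence_pairs(review, annotations)
```

Under this sketch a single review yields one training example per target-aspect combination, so the "none" label covers combinations the review does not discuss.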
On the aspect category detection task, random forest, logistic regression and BERT obtained AUC scores of 0.883, 0.705 and 0.95 respectively. On sentiment classification, these models obtained AUC scores of 0.915, 0.895 and 0.965 respectively. In both tasks, the best results were obtained using the BERT model. These AUC scores indicate that a machine learning model can be trained to extract TABSA information from Twitter and Facebook reviews, thus achieving the second objective of this study. Analysis of the results indicated that model performance depends on several factors, including aspect lexicon variation, review length and review category.
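As a reminder of what an AUC score measures, the sketch below computes a binary AUC as the probability that a randomly chosen positive example is scored above a randomly chosen negative one. The labels and scores are hypothetical, and the abstract does not state how AUC was aggregated across classes, so this illustrates the metric only and not the study's evaluation procedure.

```python
def roc_auc(y_true, scores):
    """Binary AUC: chance that a random positive outranks a random negative."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical gold labels (1 = aspect present) and model scores.
y_true = [1, 0, 1, 0, 1]
scores = [0.9, 0.4, 0.35, 0.2, 0.8]
auc = roc_auc(y_true, scores)
```

A score of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which is the scale on which the reported values should be read.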