dc.contributor.author: Kabiito, David
dc.date.accessioned: 2021-03-11T08:38:45Z
dc.date.available: 2021-03-11T08:38:45Z
dc.date.issued: 2021
dc.identifier.citation: David, K. (2021). Targeted aspect-based sentiment analysis for Ugandan social media reviews. Master thesis. Makerere University. [en_US]
dc.identifier.uri: http://hdl.handle.net/10570/8131
dc.description: A research thesis submitted to the School of Computing and Informatics Technology for the study leading to requirements for the award of master of science in computer science of Makerere University [en_US]
dc.description.abstract: Social media has emerged as a popular sounding board for customers to express their experiences with products or services. Keeping track of customers' emotions helps businesses to better understand customer feedback and opinions. However, data from social media is voluminous and arrives at high velocity, making manual summarisation impractical. Different levels of sentiment analysis have facilitated automatic summarisation of social media reviews. However, they do not capture the sentiment over multiple aspects of multiple target entities. To remedy this challenge, we introduce targeted aspect-based sentiment analysis (TABSA). TABSA extracts both the aspects and the target entities from review text, then resolves the sentiment towards the aspects. In this research, we investigated whether Twitter and Facebook reviews contain enough information for TABSA. The study also explored whether machine learning models can be trained to extract TABSA information from these reviews. To achieve the first objective, we extracted Ugandan telecom social media reviews from Twitter and Facebook for the period between February 2019 and September 2019. We collected almost 22,000 reviews from Twitter and Facebook. The reviews were either in English or code-mixed English and Luganda. The reviews were preprocessed to remove spam reviews and tweets or posts made by the telecoms themselves. The remaining reviews were human-annotated with the target telecom, the aspects and the sentiment towards the aspects of the target telecom. After the annotation, a dataset called SentiTel was constructed with a final Cohen's kappa inter-annotator agreement of 0.59, 0.73 and 0.60 respectively on the aspects. SentiTel contains 6,320 telecom-related reviews. Through the annotation of the SentiTel dataset, it was demonstrated that Twitter and Facebook reviews are rich in information and can be used for TABSA.
To achieve the second objective, we trained several machine learning models, including random forest, logistic regression, attentive Bi-LSTM and BERT. The TABSA task is modeled as a sentence pair classification task with the labels "positive", "negative" and "none". On the aspect category detection task, the random forest, logistic regression and BERT models obtained AUC scores of 0.883, 0.705 and 0.95 respectively. On sentiment classification, the models obtained AUC scores of 0.915, 0.895 and 0.965 respectively. In both tasks, the best results were obtained using the BERT model. These AUC scores indicate that a machine learning model can be trained to extract TABSA information from Twitter and Facebook reviews, thus achieving the second objective of this study. The analysis of results indicated that the performance of the models depends on several factors, including aspect lexicon variation, review length and review category. [en_US]
dc.language.iso: en [en_US]
dc.publisher: Makerere University [en_US]
dc.subject: auxiliary sentences [en_US]
dc.subject: social media [en_US]
dc.subject: Social media [en_US]
dc.subject: Sentiment analysis [en_US]
dc.subject: Social media reviews [en_US]
dc.subject: Bi-LSTM [en_US]
dc.subject: BERT [en_US]
dc.title: Targeted aspect-based sentiment analysis for Ugandan social media reviews [en_US]
dc.type: Thesis [en_US]
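
Illustrative note: the abstract describes TABSA as a sentence pair classification task with the labels "positive", "negative" and "none", and the record lists "auxiliary sentences" as a subject. The following is a minimal sketch of how such pairs might be constructed, not the thesis's actual implementation. The target names, the aspect list, the auxiliary-sentence template and the example review are all hypothetical; the record does not give the real ones.

```python
from itertools import product

# Hypothetical inventories: the record does not list the real targets or aspects.
TARGETS = ["TelecomA", "TelecomB"]
ASPECTS = ["network", "data", "customer service"]


def build_sentence_pairs(review, annotations):
    """Expand one review into (review, auxiliary sentence, label) triples.

    `annotations` maps (target, aspect) to "positive" or "negative".
    Every (target, aspect) pair without an annotation gets the label "none".
    """
    pairs = []
    for target, aspect in product(TARGETS, ASPECTS):
        # Assumed template for the auxiliary sentence.
        auxiliary = f"{target} - {aspect}"
        label = annotations.get((target, aspect), "none")
        pairs.append((review, auxiliary, label))
    return pairs


# Made-up review, used only to show the shape of the output.
example = build_sentence_pairs(
    "TelecomA internet has been down all day",
    {("TelecomA", "network"): "negative"},
)
```

Under these assumptions, each review yields one pair per (target, aspect) combination, so a pair classifier such as BERT sees the review alongside each candidate auxiliary sentence and predicts one of the three labels.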

