Targeted aspect-based sentiment analysis for Ugandan social media reviews

Kabiito, David

dc.contributor.author	Kabiito, David
dc.date.accessioned	2021-03-11T08:38:45Z
dc.date.available	2021-03-11T08:38:45Z
dc.date.issued	2021
dc.identifier.citation	David, K. (2021). Targeted aspect-based sentiment analysis for Ugandan social media reviews. Master's thesis. Makerere University.	en_US
dc.identifier.uri	http://hdl.handle.net/10570/8131
dc.description	A research thesis submitted to the School of Computing and Informatics Technology as a requirement for the award of the degree of Master of Science in Computer Science of Makerere University	en_US
dc.description.abstract	Social media has emerged as a popular sounding board for customers to express their experiences with products or services. Keeping track of customers' emotions helps businesses to better understand customer feedback and opinions. However, social media data is voluminous and arrives at high velocity, making manual summarisation impractical. Different levels of sentiment analysis have facilitated automatic summarisation of social media reviews. However, they do not capture the sentiment over multiple aspects of multiple target entities. To remedy this challenge, we introduce targeted aspect-based sentiment analysis (TABSA). TABSA extracts both the aspects and the target entities from review text, then resolves the sentiment towards each aspect. In this research, we investigated whether Twitter and Facebook reviews contain enough information for TABSA. The study also explored whether machine learning models can be trained to extract TABSA information from these reviews. To achieve the first objective, we extracted Ugandan telecom social media reviews from Twitter and Facebook for the period between February 2019 and September 2019. We collected almost 22,000 reviews from the two platforms. The reviews were either in English or code-mixed English and Luganda. The reviews were preprocessed to remove spam reviews and tweets or posts made by the telecoms themselves. The remaining reviews were human-annotated with the target telecom, the aspects, and the sentiment towards the aspects of the target telecom. After the annotation, a dataset called SentiTel was constructed with a final Cohen's kappa inter-annotator agreement of 0.59, 0.73 and 0.60 respectively on the aspects. SentiTel contains 6,320 telecom-related reviews. Through the annotation of the SentiTel dataset, it was demonstrated that Twitter and Facebook reviews are rich in information and can be used for TABSA.
To achieve the second objective, we trained several machine learning models, including random forest, logistic regression, attentive Bi-LSTM and BERT. The TABSA task is modeled as a sentence pair classification task with the labels "positive", "negative" and "none". On the aspect category detection task, random forest, logistic regression and BERT obtained AUC scores of 0.883, 0.705 and 0.95 respectively. On sentiment classification, the same models obtained AUC scores of 0.915, 0.895 and 0.965 respectively. In both tasks, the best results were obtained using the BERT model. These AUC scores indicate that a machine learning model can be trained to extract TABSA information from Twitter and Facebook reviews, thus achieving the second objective of this study. The analysis of results indicated that model performance depends on several factors, including aspect lexicon variation, review length and review category.	en_US
dc.language.iso	en	en_US
dc.publisher	Makerere University	en_US
dc.subject	Auxiliary sentences	en_US
dc.subject	Social media	en_US
dc.subject	Sentiment analysis	en_US
dc.subject	Social media reviews	en_US
dc.subject	Bi-LSTM	en_US
dc.subject	BERT	en_US
dc.title	Targeted aspect-based sentiment analysis for Ugandan social media reviews	en_US
dc.type	Thesis	en_US
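
The abstract states that the TABSA task is modeled as sentence pair classification with the labels "positive", "negative" and "none", and the record lists "auxiliary sentences" as a subject. The following is a minimal sketch of how such sentence pairs might be constructed, not the thesis's actual implementation. The telecom names, the aspect list, the "target - aspect" auxiliary sentence format and the function name `build_sentence_pairs` are all assumptions made for illustration; none of them are taken from the thesis.

```python
from itertools import product

# Hypothetical inventories: the abstract does not list the actual
# target telecoms or aspect categories used in SentiTel.
TARGETS = ["TelecomA", "TelecomB"]
ASPECTS = ["data", "calls", "customer service"]

# Label set as stated in the abstract.
LABELS = ("positive", "negative", "none")


def build_sentence_pairs(review, annotations, targets=TARGETS, aspects=ASPECTS):
    """Pair a review with one auxiliary sentence per (target, aspect).

    `annotations` maps (target, aspect) to "positive" or "negative".
    Any combination that is not annotated receives the label "none".
    Returns a list of (review, auxiliary_sentence, label) triples.
    """
    pairs = []
    for target, aspect in product(targets, aspects):
        # Assumed auxiliary sentence format: "<target> - <aspect>".
        auxiliary = f"{target} - {aspect}"
        label = annotations.get((target, aspect), "none")
        if label not in LABELS:
            raise ValueError(f"unknown label: {label!r}")
        pairs.append((review, auxiliary, label))
    return pairs


review_text = "TelecomA data is fast but their customer service never picks up"
example = build_sentence_pairs(
    review_text,
    {
        ("TelecomA", "data"): "positive",
        ("TelecomA", "customer service"): "negative",
    },
)

for _, auxiliary, label in example:
    print(f"{auxiliary!r:35} -> {label}")
```

Each triple could then be passed to a sentence pair classifier (for example, a BERT-style model that takes the review and the auxiliary sentence as its two input segments). In this sketch, one review with two targets and three aspects expands into six classification instances, most of which carry the "none" label.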

Files in this item

Name: SentiTel_TABSA_Thesis_David ...
Size: 1.618 MB
Format: PDF
Description: Full Thesis

This item appears in the following Collection(s)

School of Computing and Informatics Technology (CIT) Collection