
    A classification algorithm for delinquent invoices in Uganda

    Master's dissertation (1.291Mb)
    Date
    2021-05
    Author
    Mutungi, Gilbert
    Abstract
    This study aimed to develop a classification algorithm for delinquent invoices in Uganda. Data on 2,028 invoices were extracted from Patasente, an e-procurement platform in Uganda. Gain ratios were used to measure each attribute's importance in determining the payment outcome of an invoice. C4.5 decision tree, random forest and logistic regression models were developed to classify the invoices into two categories: those paid on time and those paid late. All three models were tested using the 0.632 bootstrap method in order to compare their performance. Results showed that 34% of the invoices were paid late. Invoice base amount (gain ratio 0.021), proportion of previously delayed invoices (0.0166), customer location (0.01226) and product or service offered (0.01223) were the most important attributes in determining whether an invoice was paid on time. The random forest algorithm had the highest classification accuracy at 83.76%, while the C4.5 decision tree and logistic regression models had accuracy rates of 71.15% and 66.27% respectively. The Kappa statistics for the three models were 0.621, 0.336 and 0.085 respectively. The study concluded that the random forest classification algorithm (83.76%) provides higher accuracy than both the decision tree algorithm (71.15%) and logistic regression (66.27%) in classifying the payment outcome of an invoice. The study recommends using a larger dataset spanning more years, as this may increase accuracy rates. Furthermore, incorporating a cost matrix that penalises wrongly predicting late invoices as on time (false positives) may improve the model's relevance to businesses and is thus recommended.
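The abstract ranks attributes by gain ratio, the split criterion used by C4.5. As a rough illustration of how such a score is computed for a single categorical attribute, here is a minimal Python sketch. It is not the dissertation's implementation: the function names and the toy `location` and `outcome` lists are hypothetical, and numeric attributes such as invoice base amount would first need a threshold split, which is not shown.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(values, labels):
    """Gain ratio of a categorical attribute with respect to the class labels."""
    total = len(labels)
    # Group the class labels by attribute value.
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Information gain: class entropy minus the weighted entropy within groups.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information: entropy of the attribute's own value distribution.
    split_info = entropy(values)
    # An attribute with a single value carries no information.
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data, not taken from the Patasente dataset.
location = ["Kampala", "Kampala", "Gulu", "Gulu", "Mbarara", "Mbarara"]
outcome = ["on_time", "on_time", "late", "late", "on_time", "late"]
print(round(gain_ratio(location, outcome), 4))
```

Dividing the information gain by the split information penalises attributes that take many distinct values, which would otherwise score well simply by fragmenting the data.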
    URI
    http://hdl.handle.net/10570/10083
    Collections
    • School of Statistics and Planning (SSP) Collections

    DSpace 5.8 copyright © Makerere University 