Capstone Projects – Page 9 – National e-Science Postgraduate Teaching and Training Platform

Optimisation of hybrid neural network techniques used for stock market predictions

Written by nepttp, 18 May 2024

Researcher: Mohammad Rehman, University of the Witwatersrand, Johannesburg
Supervisor: Dr Wilbert Chagwiza, University of the Witwatersrand, Johannesburg

The combination of traditional technical and fundamental analysis techniques and machine learning techniques has become a common practice for informed stock price prediction.

The focus of this research was to stochastically optimise a long short-term memory (LSTM) prediction model using a genetic algorithm (GA) and to test its viability for predicting next-day stock prices.

The hybrid GA-optimised LSTM model achieved an RMSE of 247.30, an MAE of 190.22 and an MAPE of 1.52%, indicating that a viable prediction model was constructed.
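
For readers unfamiliar with the three error metrics quoted above, the following is a minimal sketch of how RMSE, MAE and MAPE are commonly computed. The function names and the sample prices are illustrative assumptions and are not taken from the study.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error: penalises large errors more heavily."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def mae(actual, predicted):
    """Mean absolute error, in the same units as the prices."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error; assumes no actual value is zero."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)

# Hypothetical closing prices and next-day predictions (not the study's data).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]

print(rmse(actual, predicted), mae(actual, predicted), mape(actual, predicted))
```

Because MAPE is scale-free, it is usually the easiest of the three to compare across stocks trading at different price levels.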

Rural Governance and Financial Inclusion: Does Rural Governance Matter in the Financial Inclusion in Developing Countries

Written by nepttp, 17 May 2024

Researcher: Takunda Pfigu, University of the Witwatersrand, Johannesburg
Supervisor: Dr Nyasha Mahonye, University of the Witwatersrand, Johannesburg

Financial inclusion is the process that ensures access to and usage of basic formal financial services for all. It is crucial for local economic development. Rural people, women and poor people are disproportionately unbanked.

Prediction of Lightning in South Africa using an LSTM Neural Network Model using Historical Lightning Data

Written by nepttp, 16 May 2024

Researcher: Yaseen Essa, University of the Witwatersrand, Johannesburg
Supervisors: Dr Ritesh Ajoodah and Dr Hugh Hunt, University of the Witwatersrand, Johannesburg

We evaluated the ability of a Long Short-Term Memory (LSTM) recurrent neural network model to predict short-term lightning flash densities within South Africa using historical lightning events. We predicted the lightning flash densities for one-hour periods for two areas within South Africa using data from the South African Lightning Detection Network. Models were trained using four years of data and predictions were made for every one-hour interval for one year. The models were tested repeatedly and cross-validated. We found a combined Mean Absolute Error of 2.87 lightning flashes per hour and a combined Mean Squared Error of 1209. The model predicted 3 in 10 lightning events. We believe that LSTM models are useful tools for managing lightning risk.

Electoral Accountability in South Africa

Written by nepttp, 15 May 2024

Researcher: Leslie Dwolatzky, University of the Witwatersrand, Johannesburg
Supervisor: Prof. Rod Alence, University of the Witwatersrand, Johannesburg

This research project investigates the relationship between the change in support for the ANC and the change in the provision of public services at the electoral ward level. The project replicates the study conducted by de Kadt and Lieberman (2017) and seeks to contribute further to their analysis. Contrary to expectations, there is a significant, negative relationship between service delivery and support for the ANC: the ANC is more likely to experience a decrease in support in electoral wards where it has done best at improving service provision.

Modelling Gold Production Using Sigmoid Models

Written by nepttp, 14 May 2024

Researcher: Ignitious Chauke, University of Venda
Supervisor: Dr Caston Sigauke, University of Venda

This study approximates monthly gold production from the Sibanye-Stillwater South Africa (SA) gold operations using five sigmoid models. The models studied included the Gompertz, Gaussian, Probit and Hill models, which were used to forecast future production. Although all five estimated models offered a good, realistic estimate, the Hill model was found to better approximate the observed gold output pattern in the Sibanye-Stillwater mines. The model was chosen based on its high explained variance (R²), its low error (RMSE) and its low information loss (AIC). The model indicated that gold production would be too low by 2035 if the current production trend in the Sibanye-Stillwater (South Africa operations) mines persists. The Hill model findings were also backed by an ARIMA(0,1,2)(1,0,1)[4] model, which showed that monthly gold production will continue to decrease until 2025.
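
As an illustration of the kind of sigmoid curves and model-selection criterion mentioned above, here is a minimal sketch of one common form of the Hill and Gompertz functions, together with the least-squares form of AIC. The parametrisations, parameter names and numbers are assumptions made for illustration; the abstract does not state which forms the study used.

```python
import math

def hill(t, a, k, n):
    """Hill curve: rises from 0 towards the plateau a; equals a/2 at t == k."""
    return a * t ** n / (k ** n + t ** n)

def gompertz(t, a, b, c):
    """Gompertz curve: asymmetric sigmoid approaching the plateau a."""
    return a * math.exp(-b * math.exp(-c * t))

def aic_least_squares(rss, n_obs, n_params):
    """AIC for a least-squares fit (up to an additive constant).

    Lower values indicate less information loss for a given data set.
    """
    return n_obs * math.log(rss / n_obs) + 2 * n_params

# Hypothetical parameters, not fitted to any real production data.
print(hill(10.0, a=100.0, k=10.0, n=3.0))      # half the plateau, since t == k
print(gompertz(0.0, a=100.0, b=1.0, c=0.5))    # a * exp(-b) at t == 0
print(aic_least_squares(rss=10.0, n_obs=10, n_params=2))
```

When comparing candidate curves fitted to the same series, the model with the lowest AIC is usually preferred, which matches the selection logic the abstract describes.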

Investigation of supervised learning methods to classify cell types from single-cell RNA sequencing data

Written by nepttp, 13 May 2024

Researcher: Warren Freeborough, University of the Witwatersrand, Johannesburg
Supervisors: Prof. Terence van Zyl, University of Johannesburg and Nikki Gentle, University of the Witwatersrand, Johannesburg

The study of living systems has prompted improvements in sequencing technology, which in turn has led to biological science entering the field of big data. Adequately studying single-cell RNA sequencing (scRNA) data requires the use of data science methods.

The aim of the study is to replicate the results produced by Grabski and Irizarry, using the same datasets, whilst exploring alternative supervised learning methods. In doing so, this study hopes either to provide support for the model's usage in scRNA classification or to identify promising alternatives to explore further.

Predicting Insurance Claims Fraud using Random Forest

Written by nepttp, 12 May 2024

Researcher: Khanya May, University of the Witwatersrand, Johannesburg
Supervisor: Dr Wilbert Chagwiza, University of the Witwatersrand, Johannesburg

Fraud presents a threat that has serious consequences in the insurance industry. In recent years the use of machine learning and analytical techniques for fraud detection has been the topic of several research projects. This research aims to explore the capabilities of random forest in fraud detection. The random forest model is applied to TSA insurance claims data in an effort to predict fraud. The model achieved an accuracy and F-measure of 69.72% and 75.04%, respectively. The performance of the model is better than that of an unskilled model, which would correctly predict 54.67% of cases. There is a 76.09% chance that the model will be able to distinguish between fraudulent and legitimate claims.

Credit Card Default Payment Prediction using Random Forest

Written by nepttp, 11 May 2024

Researcher: Thanganedzo Beverly Mashamba, University of Venda
Supervisor: Dr W Chagwiza, University of the Witwatersrand, Johannesburg

Lending money to customers through credit cards is a good investment if the customer has good credit standing [1, 2]. Credit card applicants are usually subjected to thorough background checks before their applications are approved or declined. By doing so, banks ensure that their credit cards are issued only to clients with a proven ability to repay the credit. To identify customers who are not creditworthy, banks must have models in place that reliably predict risky behaviour on the part of applicants. This study uses the random forest machine-learning algorithm to predict which customers should be given a credit card.

Customer churn prediction in telecoms industry using random forest

Written by nepttp, 10 May 2024

Researcher: Fortune Mhlanga, University of the Witwatersrand, Johannesburg
Supervisor: Dr Wilbert Chagwiza, University of the Witwatersrand, Johannesburg

Telecommunications companies around the world lose customers to their competitors daily. Customer churn leads to reduced revenues. Predictive models that identify customers who are likely to churn could therefore significantly increase the revenue of the industry.

This study seeks to develop a customer churn prediction model using the random forest algorithm. The performance of the model is measured using statistical measures such as accuracy, precision, recall, F-measure, Cohen's kappa, the Gini coefficient and the Matthews correlation coefficient. The study explores the scikit-learn default filter method, step forward feature selection, step backward feature selection and exhaustive feature selection to find features that can be used as inputs to the model. The study shows that the random forest algorithm is not the best at predicting customer churn for the telecommunications industry, since the best performing model has a recall of only 52.2%.
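
Most of the measures listed above can be derived from the four cells of a binary confusion matrix. The sketch below shows the standard formulas; the function name and the counts are hypothetical and are not the study's results. The Gini coefficient is omitted because it depends on ranked prediction scores rather than on the confusion matrix alone.

```python
import math

def churn_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # share of actual churners that were caught
    f_measure = 2 * precision * recall / (precision + recall)
    # Cohen's kappa: agreement beyond what chance alone would produce.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    kappa = (accuracy - expected) / (1 - expected)
    # Matthews correlation coefficient: balanced even for skewed classes.
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "kappa": kappa,
        "mcc": mcc,
    }

# Hypothetical counts: 60 churners (40 caught), 40 non-churners (30 correct).
metrics = churn_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Recall is the natural headline figure for churn, since a missed churner (a false negative) is a customer the operator never had the chance to retain.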

Event classification for gamma-hadron separation for H.E.S.S

Written by nepttp, 9 May 2024

Researcher: Wandile Lesejane, University of the Witwatersrand, Johannesburg
Supervisor: Prof Nukri Komin, University of the Witwatersrand, Johannesburg

H.E.S.S. (the High Energy Stereoscopic System) is one of the best imaging atmospheric Cherenkov telescopes (IACTs) and is crucial for studying cosmic particles, particularly gamma-induced particles. It is able to detect particles with energies ranging from tens of GeV to TeV. The challenge stems from the influx of hadronic air showers, which are more common and can obscure the detection of gamma particles. Deep neural networks were employed to discriminate gamma events from hadron events using data that was simulated using the KASKADE and SMASH software packages. The model had an accuracy of 97.42% and a loss of 7.21% at its best, and an accuracy of 53% at its poorest.
