Capstone Projects – Page 8 – National e-Science Postgraduate Teaching and Training Platform

Random Forest Model for Stock Prediction Based on Fundamental Analysis

Written by nepttp on 23 May 2024

Researcher: Rinae Tshivhidzo, University of the Witwatersrand, Johannesburg
Supervisor: Dr Wilbert Chagwiza, University of the Witwatersrand, Johannesburg

Stock market prediction has received sustained research attention, yet predicting stock trends remains an open problem. In recent years, stock market prediction has moved into the field of machine learning, where many studies have found machine learning techniques to be robust. This research aimed to train a random forest model that accurately predicts the stock market based on fundamental analysis. The model was trained on US stock market data. The results show that the random forest model predicted the stock market with an accuracy of 0.93.

Measuring the South African Financial Cycle using Wavelet Analysis

Written by nepttp on 22 May 2024

Researcher: Kabo Phage, University of the Witwatersrand, Johannesburg
Supervisor: Prof. Gregory Farrell, University of the Witwatersrand, Johannesburg

Financial cycles capture the evolution of risks to financial stability, and it follows that they are important for macroprudential policymakers. Robust measurement of these cycles can aid in formulating and implementing policy. This report adds new evidence to the scarce South African literature by measuring the financial cycle using Continuous Wavelet Transform techniques, which can decompose a time series into statistically significant frequency ranges used to identify cyclical behaviour. The results show that the South African financial cycle is well defined when identified by the co-movements of medium-term cycles in credit and house prices, whereas equity prices tend to be less informative. Furthermore, the financial cycle is longer in duration than the traditional business cycle, so policymakers should focus their monitoring on the medium term in order to identify the build-up of risk.

Medicare fraud detection using extreme gradient boosted machines (XGBoost)

Written by nepttp on 21 May 2024

Researcher: Sheena Phillip, University of the Witwatersrand, Johannesburg
Supervisor: Dr Wilbert Chagwiza, University of the Witwatersrand, Johannesburg

Fraud by health care providers is a growing concern worldwide, as billions of dollars are lost each year. Medicare publicly released health provider data in order to encourage the development of models to overcome fraud. The aim of this research is to train a model using the Medicare dataset to determine with what accuracy the model can predict fraud and to identify the top 5 features which contribute the most towards fraud detection. It also aims to investigate the impact that explicit features such as Provider, BeneID and ClaimID have on the accuracy of the model. Four datasets were combined into a single comprehensive dataset, which was subsequently used to train an XGBoost model. The model had an accuracy of 0.98 with a recall score of 0.97 and performed extremely well overall. The model trained on the dataset excluding explicit features produced an accuracy of 0.85 and a recall score of 0.71. By comparison, this model performed worse, with a drop in accuracy of 0.13. It is noted that regardless of which feature space was used, the top 5 features encompassed details about the doctor as well as the location of the hospital.
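To make the two reported metrics concrete, the following minimal sketch shows how accuracy and recall are computed from confusion-matrix counts. The counts are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def accuracy(tp, tn, fp, fn):
    """Share of all predictions (fraud and non-fraud) that were correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def recall(tp, fn):
    """Share of actual fraud cases that the model flagged as fraud."""
    return tp / (tp + fn)


# Hypothetical counts, not results from the study.
tp, tn, fp, fn = 70, 900, 10, 20

acc = accuracy(tp, tn, fp, fn)
rec = recall(tp, fn)
print(f"accuracy={acc:.2f} recall={rec:.2f}")
```

In a fraud setting, recall is often the more telling of the two, because a model can reach high accuracy on an imbalanced dataset while still missing many fraudulent providers.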

Forecasting Accuracy Comparison of Various Machine Learning and Statistical Models on Stock Market Price Movements

Written by nepttp on 20 May 2024

Researcher: Ruan Pretorius, University of the Witwatersrand, Johannesburg
Supervisors: Prof. Terence van Zyl, University of Johannesburg and Dr Farai Mlambo, University of the Witwatersrand, Johannesburg

Accurate financial time series forecasts can assist investors in gaining a competitive edge over other participants in capital markets. No empirical conclusion existed on which models were most accurate for forecasting stock market price movements over different forecast horizons. This study addressed limitations of previous studies by comparing the forecasting accuracy of 20 different models on 403 time series of stocks and indices. These included machine learning, statistical and benchmark models. The naïve benchmark model outperformed all other models in this study for nearly all accuracy metrics and forecast horizons tested.
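As an illustration of what a naïve benchmark typically looks like, the sketch below repeats the last observed value across the forecast horizon and scores it with mean absolute error. This assumes the common "last value" definition of the naïve forecast; the abstract does not state which variant the study used, and the prices are hypothetical.

```python
def naive_forecast(series, horizon):
    """Repeat the last observed value for every step of the horizon."""
    return [series[-1]] * horizon


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical price history and realised future values.
history = [100.0, 102.0, 101.0, 103.0]
future = [104.0, 102.0, 105.0]

forecast = naive_forecast(history, len(future))
mae = mean_absolute_error(future, forecast)
print(forecast, round(mae, 3))
```

A benchmark this simple is useful precisely because it costs nothing to run: any more elaborate model has to beat it to justify its complexity.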

The classification and clustering of bank telemarketing data using extreme gradient boost and k-prototype techniques

Written by nepttp on 19 May 2024

Researcher: Tselahale Serongwa, University of the Witwatersrand, Johannesburg
Supervisor: Dr Wilbert Chagwiza, University of the Witwatersrand, Johannesburg

The banking sector needs the ability to categorise the customer data it possesses to enable business intelligence analytics and improve its marketing strategies. Banks require simple automated models that yield interpretable results to comply with financial regulations. An XGBoost model was built to determine the minimum number of attributes with the greatest impact in determining the potential of the 45211 customers to subscribe to a term deposit. The Synthetic Minority Over-sampling Technique was used to balance the dataset, and eleven important attributes with 79% prediction power were identified from 39 attributes. The model had an f1-score and testing accuracy of 93%, whilst the model’s reliability was 86%. Four clusters were determined using the k-prototype clustering technique to group customers for tailored marketing strategies. It was determined that the bank had a better chance of getting business from the 6859 customers in the three most valuable clusters and should consider cheaper marketing options for the remainder of its customers.

Optimisation of hybrid neural network techniques used for stock market predictions

Written by nepttp on 18 May 2024

Researcher: Mohammad Rehman, University of the Witwatersrand, Johannesburg
Supervisor: Dr Wilbert Chagwiza, University of the Witwatersrand, Johannesburg

The combination of traditional technical and fundamental analysis techniques and machine learning techniques has become a common practice for informed stock price prediction.

The focus of this research was to stochastically optimise a long short-term memory (LSTM) prediction model using a genetic algorithm (GA) and test its viability for predicting next-day stock prices.

The hybrid GA-optimised LSTM model was able to achieve an RMSE of 247.30, an MAE of 190.22 and an MAPE of 1.52%, indicating that a viable prediction model was constructed.
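For readers unfamiliar with the three error measures, here is a minimal sketch of their standard definitions. The sample values are hypothetical and unrelated to the study's data.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: penalises large errors more heavily."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mae(actual, predicted):
    """Mean absolute error, in the same units as the data."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error; actual values must be non-zero."""
    return 100 * sum(
        abs((a - p) / a) for a, p in zip(actual, predicted)
    ) / len(actual)


# Hypothetical next-day prices and predictions.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]

print(rmse(actual, predicted), mae(actual, predicted), mape(actual, predicted))
```

RMSE and MAE are expressed in price units, so they depend on the price level of the stock, whereas MAPE is scale-free. That is why a large RMSE can coexist with a small MAPE.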

Rural Governance and Financial Inclusion: Does Rural Governance Matter in the Financial Inclusion in Developing Countries

Written by nepttp on 17 May 2024

Researcher: Takunda Pfigu, University of the Witwatersrand, Johannesburg
Supervisor: Dr Nyasha Mahonye, University of the Witwatersrand, Johannesburg

Financial inclusion is the process that ensures access to and usage of basic formal financial services for all. It is crucial for local economic development. Rural people, women and poor people are disproportionately unbanked.

Prediction of Lightning in South Africa using an LSTM Neural Network Model using Historical Lightning Data

Written by nepttp on 16 May 2024

Researcher: Yaseen Essa, University of the Witwatersrand, Johannesburg
Supervisors: Dr Ritesh Ajoodah and Dr Hugh Hunt, University of the Witwatersrand, Johannesburg

We evaluated the ability of a Long Short-Term Memory (LSTM) recurrent neural network model to predict short-term lightning flash densities within South Africa using historical lightning events. We predicted the lightning flash densities for one-hour periods for two areas within South Africa using data from the South African Lightning Detection Network. Models were trained using four years of data and predictions were made for every one-hour interval for one year. The models were tested repeatedly and cross-validated. We found a combined Mean Absolute Error of 2.87 lightning flashes per hour and a combined Mean Squared Error of 1209. The model predicted 3 in 10 lightning events. We believe that LSTM models are useful tools to manage lightning risk.

Electoral Accountability in South Africa

Written by nepttp on 15 May 2024

Researcher: Leslie Dwolatzky, University of the Witwatersrand, Johannesburg
Supervisor: Prof. Rod Alence, University of the Witwatersrand, Johannesburg

This research project investigates the relationship between the change in support for the ANC and the change in the provision of public services at the electoral ward level. The project replicates the study conducted by de Kadt and Lieberman (2017) and seeks to contribute further to their analysis. Contrary to expectations, there is a significant, negative relationship between service delivery and support for the ANC: the ANC is more likely to experience a decrease in support in electoral wards where it has done best at improving service provision.

Modelling Gold Production Using Sigmoid Models

Written by nepttp on 14 May 2024

Researcher: Ignitious Chauke, University of Venda
Supervisor: Dr Caston Sigauke, University of Venda

This study approximates monthly gold production from Sibanye-Stillwater South Africa (SA) gold operations based on five sigmoid models. The studied models included Gompertz, Gaussian, Probit and Hill, which were used to forecast future production. Although all five estimated models offered a good, realistic estimate, the Hill model was found to better approximate the observed gold output pattern in Sibanye-Stillwater mines. The model was chosen based on its high explained variance (R²) and its lowest error (RMSE) and information loss (AIC) values. The model indicated that the production of gold would be too low by 2035 if the current trend in gold production at the Sibanye-Stillwater (South Africa operations) mines continues. The Hill model findings were also backed by an ARIMA(0,1,2)(1,0,1)[4] model, which showed that monthly gold production will continue to decrease until 2025.
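To give a sense of the curve shapes involved, the sketch below evaluates two of the named sigmoid families, Gompertz and Hill, in common textbook parametrisations. The parameter names (a, b, c, k, n) and values are assumptions for illustration; the study's fitted forms and estimates are not given in the abstract and may differ.

```python
import math


def gompertz(t, a, b, c):
    """Gompertz curve: rises towards the asymptote a as t grows.

    b shifts the curve along the time axis and c sets the growth rate.
    """
    return a * math.exp(-b * math.exp(-c * t))


def hill(t, a, k, n):
    """Hill curve: reaches a / 2 at t = k; n controls the steepness."""
    return a * t**n / (k**n + t**n)


# Hypothetical parameters, chosen only to show the shapes.
for t in (1.0, 5.0, 20.0):
    print(t, round(gompertz(t, 10.0, 1.0, 1.0), 3), round(hill(t, 10.0, 5.0, 2.0), 3))
```

Both curves level off at the asymptote a, which is what makes sigmoid families a natural fit for quantities such as cumulative output from a finite resource.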
