Measuring the South African Financial Cycle using Wavelet Analysis

Researcher: Kabo Phage, University of the Witwatersrand, Johannesburg
Supervisor: Prof. Gregory Farrell, University of the Witwatersrand, Johannesburg

Financial cycles capture the evolution of risks to financial stability, and it follows that they are important for macroprudential policymakers. Robust measurement of these cycles can aid in formulating and implementing policy. This report adds new evidence to the scarce South African literature by measuring the financial cycle using Continuous Wavelet Transform techniques, which can decompose a time series into statistically significant frequency ranges used to identify cyclical behaviour. The results show that the South African financial cycle is well defined when identified by the co-movements of medium-term cycles in credit and house prices, whereas equity prices tend to be less informative. Furthermore, the financial cycle is longer in duration than the traditional business cycle, so policymakers should focus their monitoring on the medium term to identify the build-up of risk.



Medicare fraud detection using extreme gradient boosted machines (XGBoost)

Researcher: Sheena Phillip, University of the Witwatersrand, Johannesburg
Supervisor: Dr Wilbert Chagwiza, University of the Witwatersrand, Johannesburg

Fraud detection among health care providers is a growing concern worldwide, as billions of dollars are lost each year. Medicare publicly released health provider data to encourage the development of models to combat fraud. The aim of this research is to train a model on the Medicare dataset to determine how accurately it can predict fraud and to identify the top 5 features that contribute most towards fraud detection. It also investigates the impact that explicit features such as Provider, BeneID and ClaimID have on the accuracy of the model. Four datasets were combined into a single comprehensive dataset, which was subsequently used to train an XGBoost model. The model had an accuracy of 0.98 with a recall score of 0.97 and performed extremely well overall. The model trained on the dataset excluding explicit features produced an accuracy of 0.85 and a recall score of 0.71. By comparison, this model performed poorly, with a 13% drop in accuracy. Regardless of which feature space was used, the top 5 features encompassed details about the doctor as well as the location of the hospital.
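The two scores quoted in this abstract, accuracy and recall, can be illustrated with a minimal sketch. The function name and the labels below are hypothetical and made up for illustration; they are not the study's code or the Medicare data, and the sketch assumes binary labels where 1 marks a fraudulent provider.

```python
def accuracy_and_recall(y_true, y_pred):
    """Return (accuracy, recall) for binary labels where 1 marks fraud."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fraud caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-fraud cleared
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fraud missed
    accuracy = (tp + tn) / len(pairs)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, recall


# Made-up labels for illustration only (not the Medicare data).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
acc, rec = accuracy_and_recall(y_true, y_pred)
print(f"accuracy={acc:.2f} recall={rec:.2f}")
```

Recall is the share of actual fraud cases that the model flags, which is why a fall in recall from 0.97 to 0.71 matters more for fraud detection than the accuracy figures alone suggest.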



Forecasting Accuracy Comparison of Various Machine Learning and Statistical Models on Stock Market Price Movements

Researcher: Ruan Pretorius, University of the Witwatersrand, Johannesburg
Supervisors: Prof. Terence van Zyl, University of Johannesburg and Dr Farai Mlambo, University of the Witwatersrand, Johannesburg

Accurate financial time series forecasts can assist investors in gaining a competitive edge over other participants in capital markets. No empirical conclusion existed on which models were the most accurate for forecasting stock market price movements over different forecast horizons. This study addressed limitations of previous studies by comparing the forecasting accuracy of 20 different models on 403 time series of stocks and indices. These included machine learning, statistical and benchmark models. The naïve benchmark model outperformed all other models in this study for nearly all accuracy metrics and forecast horizons tested.
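The abstract does not define the naïve benchmark. Assuming it is the usual last-value (random-walk) forecast, it can be sketched as follows; the function name and the prices are hypothetical and purely illustrative.

```python
def naive_forecast(history, horizon):
    """Repeat the last observed value for every step of the horizon.

    Assumption: this is the conventional naive (random-walk) benchmark;
    the study's exact definition is not given in the abstract.
    """
    if not history:
        raise ValueError("history must contain at least one observation")
    return [history[-1]] * horizon


# Made-up prices for illustration only.
prices = [100.0, 102.0, 101.0, 105.0]
future = [106.0, 104.0, 107.0]

forecast = naive_forecast(prices, horizon=len(future))
abs_errors = [abs(a - f) for a, f in zip(future, forecast)]
print(forecast, abs_errors)
```

Because this benchmark needs no fitting, any model that fails to beat it adds complexity without adding accuracy.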



The classification and clustering of bank telemarketing data using extreme gradient boost and k-prototype techniques

Researcher: Tselahale Serongwa, University of the Witwatersrand, Johannesburg
Supervisor: Dr Wilbert Chagwiza, University of the Witwatersrand, Johannesburg

The banking sector needs the ability to categorise the customer data it possesses to enable business intelligence analytics and improve its marketing strategies. It requires simple automated models that yield interpretable results to comply with financial regulations. An XGBoost model is built to determine the minimum number of attributes with the greatest impact in determining the potential of the 45211 customers to subscribe to a term deposit. The Synthetic Minority Over-sampling Technique was used to balance the dataset, and eleven important attributes with 79% predictive power were identified from 39 attributes. The model had an F1-score and testing accuracy of 93%, whilst the model's reliability was 86%. Four clusters were determined using the k-prototype clustering technique to group customers for tailored marketing strategies. It was determined that the bank had a better chance of getting business from the 6859 customers in the three most valuable clusters and should consider cheaper marketing options for the remainder of its customers.



Optimisation of hybrid neural network techniques used for stock market predictions

Researcher: Mohammad Rehman, University of the Witwatersrand, Johannesburg
Supervisor: Dr Wilbert Chagwiza, University of the Witwatersrand, Johannesburg

The combination of traditional technical and fundamental analysis techniques and machine learning techniques has become a common practice for informed stock price prediction.

The focus of this research was to stochastically optimise a long short-term memory (LSTM) prediction model using a genetic algorithm (GA) and test its viability for predicting next-day stock prices.

The hybrid GA-optimised LSTM model was able to achieve an RMSE of 247.30, an MAE of 190.22 and a MAPE of 1.52%, indicating that a viable prediction model was constructed.
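For readers unfamiliar with the three error measures reported here, the following minimal sketch shows how RMSE, MAE and MAPE are conventionally computed. The function names and the numbers are hypothetical and illustrative only; they are not the study's code or data.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: penalises large misses more heavily."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mae(actual, predicted):
    """Mean absolute error, in the same units as the prices."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error; assumes no actual value is zero."""
    return 100.0 * sum(
        abs((a - p) / a) for a, p in zip(actual, predicted)
    ) / len(actual)


# Made-up next-day prices for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
scores = (rmse(actual, predicted), mae(actual, predicted), mape(actual, predicted))
print(scores)
```

RMSE and MAE are expressed in price units, so they depend on the price level of the stock, whereas MAPE is scale-free and easier to compare across instruments.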



Rural Governance and Financial Inclusion: Does Rural Governance Matter in the Financial Inclusion in Developing Countries

Researcher: Takunda Pfigu, University of the Witwatersrand, Johannesburg
Supervisor: Dr Nyasha Mahonye, University of the Witwatersrand, Johannesburg

Financial inclusion is the process that ensures access to and usage of basic formal financial services for all. It is crucial for local economic development. Rural people, women and poor people are disproportionately unbanked.



Prediction of Lightning in South Africa using an LSTM Neural Network Model using Historical Lightning Data

Researcher: Yaseen Essa, University of the Witwatersrand, Johannesburg
Supervisors: Dr Ritesh Ajoodah and Dr Hugh Hunt, University of the Witwatersrand, Johannesburg

We evaluated the ability of a Long Short-Term Memory recurrent neural network (LSTM) model to predict short-term lightning flash densities within South Africa using historical lightning events. We predicted the lightning flash densities for one-hour periods for two areas within South Africa using data from the South African Lightning Detection Network. Models were trained using four years of data and predictions were made for every one-hour interval for one year. The models were tested repeatedly and cross-validated. We found a combined Mean Absolute Error of 2.87 lightning flashes per hour and a combined Mean Squared Error of 1209. The model predicted 3 in 10 lightning events. We believe that LSTM models are useful tools to manage lightning risk.



Electoral Accountability in South Africa

Researcher: Leslie Dwolatzky, University of the Witwatersrand, Johannesburg
Supervisor: Prof. Rod Alence, University of the Witwatersrand, Johannesburg

This research project investigates the relationship between change in support for the ANC and the change in the provision of public services at the electoral ward level. The project replicates the study conducted by de Kadt and Lieberman (2017) and seeks to contribute further to their analysis. Contrary to expectations, there is a significant, negative relationship between service delivery and support for the ANC: the ANC is more likely to experience a decrease in support in electoral wards where it has done best at improving service provision.



Modelling Gold Production Using Sigmoid Models

Researcher: Ignitious Chauke, University of Venda
Supervisor: Dr Caston Sigauke, University of Venda

This study approximates monthly gold production from Sibanye-Stillwater's South Africa (SA) gold operations based on five sigmoid models. The studied models were Gompertz, Gaussian, Probit and the Hill, which were used to forecast future production. Although all five estimated models offered a good, realistic estimate, the Hill model was found to better approximate the observed gold output pattern in Sibanye-Stillwater mines. The model was chosen based on its high explained variance (R²) and its lowest error (RMSE) and information loss (AIC) values. The model indicated that the production of gold would be too low by 2035 if the current trend in gold production at the Sibanye-Stillwater (South Africa operation) mines persists. The Hill model findings were also backed by an ARIMA(0,1,2)(1,0,1)[4] model, which showed that monthly gold production will continue to decrease until 2025.



Investigation of supervised learning methods to classify cell types from single-cell RNA sequencing data

Researcher: Warren Freeborough, University of the Witwatersrand, Johannesburg
Supervisors: Prof. Terence van Zyl, University of Johannesburg and Nikki Gentle, University of the Witwatersrand, Johannesburg

The study of living systems has prompted improvements in sequencing technology, which in turn has led to biological science entering the field of big data. Adequately studying this single-cell RNA sequencing (scRNA) data requires the use of data science methods.

The aim of the study is to replicate the results produced by Grabski and Irizarry, using the same datasets, whilst exploring alternative supervised learning methods. In doing so, this study hopes to provide support for the models' usage in scRNA classification or to provide promising alternatives to explore further.

