Using text mining on reviews of accommodation establishments on TripAdvisor to draw insights on perceived South African weather

Researcher: Subhashini Pillay, University of the Witwatersrand, Johannesburg
Supervisor: Dr Jennifer Fitchett, University of the Witwatersrand, Johannesburg

Social media is becoming an important platform for deriving the opinions, views and perceptions of tourists (O’Connor, 2010);
TripAdvisor is the most popular reviewing platform for tourists (Litvin and Dowling, 2018); and
Given the sheer volume of data generated on TripAdvisor, text mining techniques may be the easiest way to determine tourists’ views and perceptions related to the weather (Berezina et al., 2016).
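
As a rough illustration of the kind of text mining described above, the sketch below counts weather-related words across a handful of reviews. The vocabulary `WEATHER_TERMS`, the function name `weather_mentions` and the sample reviews are all hypothetical and are not taken from the study; a real analysis would use a curated lexicon and actual TripAdvisor data.

```python
import re
from collections import Counter

# Hypothetical weather vocabulary; a real study would curate this list.
WEATHER_TERMS = {"rain", "rainy", "sunny", "hot", "cold",
                 "wind", "windy", "humid", "storm"}

def weather_mentions(reviews):
    """Count how often each weather term appears across the reviews."""
    counts = Counter()
    for review in reviews:
        # Lowercase the text and keep alphabetic tokens only.
        tokens = re.findall(r"[a-z]+", review.lower())
        counts.update(t for t in tokens if t in WEATHER_TERMS)
    return counts

# Invented sample reviews, for illustration only.
reviews = [
    "Lovely stay, but it was very windy and cold at night.",
    "Hot and sunny all week. The pool was great!",
    "Rain every day, yet the staff were friendly.",
]
counts = weather_mentions(reviews)
print(counts.most_common())
```

Simple term counts like these only show which weather conditions reviewers mention; judging whether the mentions are positive or negative would need a further step such as sentiment scoring.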


Survival Analysis: Modelling Time to Event Data with Time Varying Features using Survival Trees and Forests

Researcher: Andisani Nemavhola, University of Venda
Supervisor: Dr Justine Nasejje, University of the Witwatersrand, Johannesburg

This study presents and implements random survival forests (RSF) and two of its split rules. The two RSF split rules that this study focuses on are the $L_1$ and the logrank split rules.

The research questions for this study are the following:

  1. Can the proposed $L_1$ split rule for building RSF be used in the presence of non-proportional hazards
    or time varying covariates?
  2. Does the proposed $L_1$ split rule for RSF perform better than the logrank split rule?
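
To make the logrank split rule concrete, here is a minimal sketch of the standardized two-sample logrank statistic that a survival tree can use to score a candidate split: the larger the value, the better the split separates the two daughter nodes. The function name, the argument names and the toy data are assumptions for illustration and are not taken from the study's implementation.

```python
import math

def logrank_split_statistic(times, events, in_left):
    """Absolute standardized logrank statistic for a two-way split.

    times:   observed follow-up times
    events:  1 if the event occurred, 0 if the observation is censored
    in_left: True if the observation falls in the left daughter node
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    numerator = 0.0
    variance = 0.0
    for t in event_times:
        # Numbers at risk just before t, overall and in the left node.
        at_risk = sum(1 for u in times if u >= t)
        at_risk_left = sum(1 for u, l in zip(times, in_left) if u >= t and l)
        # Events at exactly t, overall and in the left node.
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        deaths_left = sum(1 for u, e, l in zip(times, events, in_left)
                          if u == t and e == 1 and l)
        # Observed minus expected events in the left node.
        numerator += deaths_left - at_risk_left * deaths / at_risk
        if at_risk > 1:
            frac = at_risk_left / at_risk
            variance += (frac * (1 - frac)
                         * (at_risk - deaths) / (at_risk - 1) * deaths)
    return abs(numerator) / math.sqrt(variance) if variance > 0 else 0.0

# Toy data (invented): eight subjects, two of them censored.
times = [1, 2, 3, 4, 5, 6, 7, 8]
events = [1, 1, 1, 0, 1, 1, 0, 1]
early_vs_late = [True, True, True, True, False, False, False, False]
alternating = [True, False, True, False, True, False, True, False]

good = logrank_split_statistic(times, events, early_vs_late)
poor = logrank_split_statistic(times, events, alternating)
print(good, poor)
```

In a tree-growing loop, this statistic would be evaluated for every candidate covariate and cut point, and the split with the largest value would be kept.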


Deep Learning & Its Application To Searches Beyond The Standard Model With The ATLAS Detector At The Large Hadron Collider

Researcher: Asiphe Mzaza, University of the Witwatersrand, Johannesburg
Supervisors: Prof. Bruce Mellado, University of the Witwatersrand, Johannesburg

Anomalies in the observations of Run 1 and Run 2 data from the ATLAS detector at the LHC suggest physics beyond the Standard Model. The Madala Hypothesis attempts to explain these anomalies with new hypothetical scalars H and S with $2m_h < m_H < 2m_t$ and $m_h < m_S < m_H$, where $m_h$ and $m_t$ are the masses of the Higgs boson and the top quark. These bosons are being searched for in leptonic decay channels, such as H→SS→4W [1], which have extremely low signal-to-background ratios. A discriminative machine learning algorithm is developed to distinguish between signal and background. Specifically, the viability of a deep neural network is assessed. A neural network model with useful diagnostic power is constructed using simulations of the di-lepton decay process.


Market Fraction Hypothesis: Application to the South African Financial Market

Researcher: Patrick Mthisi, University of the Witwatersrand, Johannesburg
Supervisor: Dr Yudhvir Seetharam, University of the Witwatersrand, Johannesburg

This project uses a disaggregate modelling approach to model the behaviour of key financial market players in order to assess the quality of the South African financial market. To achieve this, an Agent-based Model [1] is used as the primary tool. In addition, theoretical concepts that underlie the Market Fraction Hypothesis [2] are used, and the market fractions of the agents’ risk clusters are observed over a period of time to assess the quality of the financial market.


Understanding The Jet Activities In Association With The Dileptons Using Machine Learning Techniques With The Use Of The ATLAS Detector At LHC

Researcher: Mashaka Molepo, University of Venda
Supervisor: Professor Bruce Mellado, University of the Witwatersrand, Johannesburg

Neural networks have been used in Monte Carlo simulation studies for the detection of hadronically decaying W bosons, including strong jet quarks [1]. The need for data science algorithms in deep neural networks leads to a strong rejection of the fundamental one-jet correlation sub-structure [1]. Deep learning methods have been noted in the context of the increased use of deep neural networks to enhance context exclusion, including the use of implantation sub-sub-structures and mass-taggers [1]. The corresponding classifiers are often linearly associated with the parameters in their substructures.


Comparative Analysis of Logistic Regression over Homomorphically Encrypted Data and Decrypted Data

Researcher: Linhle Mbombo, University of Venda
Supervisors: Professor Augustine Munagi and Professor Turgay Çelik, University of the Witwatersrand, Johannesburg

Machine learning (ML) algorithms are improving automated tasks, using data to make predictions and solve clustering problems. The data used to fit these models often comes from institutions and organizations that hold sensitive information, such as personal, medical and financial data. The research seeks to bridge this security gap and design secure measures for preserving the privacy of the data used in the process. Homomorphic encryption allows computations to be performed on encrypted data. The research uses the Paillier algorithm as its encryption scheme.
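
The sketch below illustrates, under simplifying assumptions, why an additively homomorphic scheme such as Paillier suits the linear part of logistic regression: ciphertexts can be added together and multiplied by plaintext constants without decryption, so a weighted sum of encrypted features can be computed blindly. The tiny primes, the choice g = n + 1 and all function names are assumptions made for readability; this is a toy and not the study's implementation, and it is not secure.

```python
import math
import secrets

def keygen(p, q):
    """Build a Paillier key pair from two primes (toy sizes, NOT secure)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                 # common simplifying choice of generator
    mu = pow(lam, -1, n)      # modular inverse (Python 3.8+)
    return (n, g), (lam, mu)

def encrypt(pub, m):
    """Encrypt an integer 0 <= m < n with fresh randomness r."""
    n, g = pub
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1   # r in [1, n-1]
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n

def add_encrypted(pub, c1, c2):
    """Ciphertext of m1 + m2: multiply the two ciphertexts."""
    n, _ = pub
    return (c1 * c2) % (n * n)

def mul_plain(pub, c, k):
    """Ciphertext of k * m: raise the ciphertext to the plaintext k."""
    n, _ = pub
    return pow(c, k, n * n)

pub, priv = keygen(1009, 1013)

# Encrypted features, plaintext integer weights (values are invented).
enc_features = [encrypt(pub, v) for v in [3, 5]]
weights = [2, 4]

# Linear score w . x computed without ever decrypting the features.
enc_score = encrypt(pub, 0)
for c, w in zip(enc_features, weights):
    enc_score = add_encrypted(pub, enc_score, mul_plain(pub, c, w))

score = decrypt(pub, priv, enc_score)
print(score)  # 3*2 + 5*4 = 26
```

Paillier supports only addition and multiplication by plaintext constants, so the non-linear sigmoid in logistic regression is not available directly on ciphertexts; it would have to be approximated or applied after decryption.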


The development of Extreme Learning Machine with di-lepton data from the ATLAS detector at the LHC

Researcher: Makgoka Nkoana, University of the Witwatersrand, Johannesburg
Supervisor: Dr Farai Mlambo, University of the Witwatersrand, Johannesburg

Wavelet Neural Networks were inspired by neural networks as well as wavelet decomposition, and Zhang and Benveniste sought to combine the two concepts. This type of network is often used in the financial industry for making forecasts related to the stock market, and this investigation sought to compare the stock market forecasts made by several implementations of Wavelet Neural Networks with different activation functions. The main finding was that the Shannon Wavelet Neural Network and the Gaussian Neural Network performed best, followed by the Mexican Hat Neural Network.


Credit Card Fraud Detection using Machine Learning

Researcher: Seleme Shoky, University of Venda
Supervisor: Dr Farai Mlambo, University of the Witwatersrand, Johannesburg

Each year credit card fraud grows significantly with advances in technology, resulting in extreme losses to those affected. We build ML models to detect fraudulent activity in credit card transaction systems. The binary classifiers built are a Neural Network and a Random Forest. We used the Random Forest for variable importance. The aim is to develop an approach that detects fraud with a high recall score and a low number of false positives (using sampling techniques). We used ROC curves, confusion matrices and precision-recall statistics to measure the performance of the models.
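
As a small worked example of the evaluation measures mentioned above, the sketch below derives the confusion matrix counts, precision and recall from true and predicted labels, with 1 marking a fraudulent transaction. The labels are invented and the function names are illustrative assumptions, not the study's code.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn), treating label 1 (fraud) as positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def precision_recall(y_true, y_pred):
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    # Precision: share of flagged transactions that really are fraud.
    precision = tp / (tp + fp) if tp + fp else 0.0
    # Recall: share of actual fraud cases that were flagged.
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Invented labels: 3 fraud cases among 10 transactions.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
y_pred = [0, 0, 1, 0, 0, 0, 1, 1, 0, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision, recall = precision_recall(y_true, y_pred)
print(tp, fp, fn, tn, precision, recall)
```

Because fraud is rare, plain accuracy can look high even for a model that misses most fraud cases, which is why recall and the false positive count are the more informative targets here.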


Wind Speed Forecasting Using Long Short-Term Memory (LSTM) Networks

Researcher: Sindisiwe Zulu, University of the Witwatersrand, Johannesburg
Supervisor: Mr Rendani Mbuvha, University of the Witwatersrand, Johannesburg

Wind energy is seen as the next promising renewable energy source for future power generation. The stochastic behaviour of wind has driven the development of improved wind forecasting techniques, and new techniques are still required for wind speed forecasting. This research investigates Long Short-Term Memory (LSTM) networks for forecasting wind speed 1 hour to 3 hours ahead. A Multilayer Perceptron (MLP) model serves as a benchmark against which the LSTM model’s performance is tested. The results show that the LSTM model outperformed the MLP model; however, the differences in performance are not statistically significant. The best results were obtained for 1 hour ahead forecasting.
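
Before either an LSTM or an MLP can be trained for multi-step-ahead forecasting, the wind speed series has to be turned into input windows and targets. The sketch below shows one common way to do this, assuming hourly observations so that a horizon of 1 means one hour ahead; the function name `make_windows`, the lag count and the sample values are illustrative assumptions, not details from the study.

```python
def make_windows(series, n_lags, horizon):
    """Pair each window of n_lags past values with the value
    `horizon` steps after the end of the window."""
    X, y = [], []
    for i in range(len(series) - n_lags - horizon + 1):
        X.append(series[i:i + n_lags])
        y.append(series[i + n_lags + horizon - 1])
    return X, y

# Invented hourly wind speeds in m/s.
speeds = [5.0, 5.5, 6.1, 5.8, 6.4, 7.0, 6.7]

X1, y1 = make_windows(speeds, n_lags=3, horizon=1)  # 1 hour ahead
X3, y3 = make_windows(speeds, n_lags=3, horizon=3)  # 3 hours ahead
print(X1[0], y1[0])
print(X3[0], y3[0])
```

Longer horizons leave fewer training pairs and place the target further from the most recent observation.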


Contrasting trained Wavelet Neural Networks with an application to bankruptcy prediction in banks

Researcher: Mamphaga Ratshilengo, University of Venda
Supervisor: Dr Farai Mlambo, University of the Witwatersrand, Johannesburg

A wavelet is a wave-like oscillation whose amplitude begins at zero, rises, and declines back to zero [1]. A neural network (NN) is a series of algorithms that aims to recognise the relationships in a set of data through a process that mimics the workings of the human brain. NNs are applicable in various fields, including speech recognition, hand-written digit recognition and driving a car [2]. In wavelet neural networks (WNNs), wavelets and NNs are combined into a single model [1].
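
One way to picture this combination is a network whose hidden units ("wavelons") apply a shifted and scaled wavelet instead of a sigmoid. The sketch below uses the Mexican hat wavelet in its common unnormalized form and a scalar input; the function names, the single-input restriction and the parameter values are assumptions for illustration, not the architecture used in the study.

```python
import math

def mexican_hat(z):
    """Mexican hat wavelet (unnormalized): peaks at z = 0 and
    decays towards zero away from its centre."""
    return (1.0 - z * z) * math.exp(-0.5 * z * z)

def wnn_forward(x, wavelons, bias=0.0):
    """Output of a one-hidden-layer wavelet network for scalar input x.

    wavelons: list of (weight, translation, dilation) triples; each
    hidden unit evaluates the wavelet at (x - translation) / dilation.
    """
    return bias + sum(w * mexican_hat((x - t) / d) for w, t, d in wavelons)

# Two hypothetical wavelons centred at 0 and 2 with different widths.
wavelons = [(1.5, 0.0, 1.0), (-0.5, 2.0, 0.5)]
output = wnn_forward(0.3, wavelons, bias=0.1)
print(output)
```

During training, the weights, translations and dilations would all be adjusted, which is what distinguishes a WNN from a fixed wavelet decomposition followed by an ordinary network.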

