Currency Exchange Rate Forecasting: A Text-mining Approach
Researcher: Zanele Khumalo, University of the Witwatersrand, Johannesburg
Supervisor: Mr Rendani Mbuvha, University of the Witwatersrand, Johannesburg
The currency market deals with all aspects of buying, selling and hedging currencies. Financial markets are complex systems that are difficult to model because of structural instabilities and noise driven by factors such as economic conditions, investor behaviour, and politics. Investors must keep abreast of the latest economic news while also synthesising historical market performance in order to devise investment strategies that minimise risk and maximise profit. With the rapid digitisation of news articles, analysing such information manually is difficult because of the volume and variety of content, which makes automation necessary. Text mining approaches can help automate the extraction of useful information from large numbers of news articles and may improve market predictions when combined with historical market prices.
In this thesis, we investigate whether information derived from news articles adds statistically significant predictive power to exchange rate forecasting. To do so, we compare the performance of autoregressive integrated moving average (ARIMA), support vector regression (SVR) and long short-term memory (LSTM) models in forecasting the closing prices of the USD/ZAR currency pair. We then investigate the effect of Reddit news headlines and Reuters news article contents on the prediction of the USD/ZAR currency pair by using Latent Dirichlet Allocation (LDA) to extract topics from raw text documents and supplying these topics as additional input features to the LSTM and SVR models. The results show that the traditional ARIMA model outperforms the SVR and LSTM models in forecasting the USD/ZAR closing prices. However, the additional news features can yield statistically significant improvements in the performance of SVR models when forecasting the daily USD/ZAR closing prices, while only marginally improving LSTM models. The performance of the improved SVR and LSTM models shows no statistically significant difference from that of the ARIMA models.
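As a rough illustration of the feature construction described above, the sketch below fits a small LDA model to tokenised headlines and appends each day's topic proportions to lagged closing prices. It is a minimal sketch under stated assumptions, not the pipeline used in the thesis: the headlines and prices are made-up placeholders, the function names (`lda_gibbs`, `build_features`), the hyperparameters, and the lag length are hypothetical, and a collapsed Gibbs sampler is swapped in for whatever LDA inference the thesis used so that the example depends only on the Python standard library.

```python
import random

def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; returns per-document topic proportions."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    ndk = [[0] * n_topics for _ in docs]      # topic counts per document
    nkw = [[0] * V for _ in range(n_topics)]  # word counts per topic
    nk = [0] * n_topics                       # total words per topic
    z = []                                    # topic assignment of every token
    for d, doc in enumerate(docs):
        zd = []
        for w in doc:
            k = rng.randrange(n_topics)       # random initial assignment
            zd.append(k)
            ndk[d][k] += 1
            nkw[k][word_id[w]] += 1
            nk[k] += 1
        z.append(zd)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wi, k = word_id[w], z[d][i]
                # Remove the token, then resample its topic from the conditional.
                ndk[d][k] -= 1
                nkw[k][wi] -= 1
                nk[k] -= 1
                weights = [
                    (ndk[d][t] + alpha) * (nkw[t][wi] + beta) / (nk[t] + V * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                ndk[d][k] += 1
                nkw[k][wi] += 1
                nk[k] += 1
    # Smoothed topic proportions; each row sums to 1.
    return [
        [(ndk[d][t] + alpha) / (len(doc) + n_topics * alpha) for t in range(n_topics)]
        for d, doc in enumerate(docs)
    ]

def build_features(closes, theta, n_lags=2):
    """Row for day t: previous n_lags closes plus the topic mix of day t-1's news."""
    X, y = [], []
    for t in range(n_lags, len(closes)):
        X.append(closes[t - n_lags:t] + theta[t - 1])
        y.append(closes[t])
    return X, y

# Made-up placeholder data: one tokenised headline and one closing price per day.
docs = [
    "rand weakens as inflation rises".split(),
    "central bank holds interest rate".split(),
    "election uncertainty hits rand".split(),
    "inflation data lifts interest rate bets".split(),
    "election result calms markets".split(),
]
closes = [14.10, 14.25, 14.18, 14.40, 14.32]  # hypothetical USD/ZAR closes

theta = lda_gibbs(docs, n_topics=2)
X, y = build_features(closes, theta, n_lags=2)
```

Each row of `X` could then be passed to a regression model such as SVR, with `y` as the target. Using the previous day's topic mix, rather than the same day's, is one assumed way of keeping the inputs free of information that would not yet be available at forecast time.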