Machine learning and data science application for financial price prediction and portfolio optimization

Date
2024-08-23
Authors
Dip Das, Joy
Abstract
This thesis explores interconnected machine learning (ML) and data science (DS) methodologies for improving predictive accuracy in financial markets and building resilient portfolios. An extensive review of the ML/DS literature revealed that advances such as autoencoders (AE) and recurrent neural networks (RNN) are rarely applied in the finance industry. The novelty of this thesis lies in studying price prediction and portfolio optimization with RNN and AE algorithms. Unsupervised ML strategies were also studied to make portfolio optimization more robust. To this end, two encoder-decoder-based RNN architectures were proposed: the autoencoder-based gated recurrent unit (AE-GRU) and the autoencoder-based long short-term memory (AE-LSTM). Both showed enhanced predictive accuracy across diverse asset types and market conditions. Various DS concepts, such as data visualization, Bollinger bands, data-driven volatility estimates, and unsupervised ML, were integrated while implementing and experimenting with the new architectures for price prediction and portfolio optimization. The study also highlights the benefits of diversified portfolios by proposing a novel deep learning (DL)-based model for portfolio construction, especially when coupled with affinity propagation (AP) clustering and appropriate data-driven risk measures based on volatility estimates with sign correlation (VES) and volatility correlation (VEV). Traditional models optimize portfolio weights using objective functions, while recent innovations emphasize data-driven risk measures to obtain minimum-risk weights from random samples.
Despite challenges with short-term data featuring negative mean returns, the proposed ML-based diversification approaches (for both traditional and data-driven portfolio optimization) identified portfolios with positive returns by clustering assets with high mean returns. In summary, this thesis introduces novel hybridized ML/DL models for price prediction, along with approaches that enhance diversification within specific types of portfolio optimization through advanced clustering methods such as affinity propagation and DBSCAN. Diversified stock selection with clustering techniques significantly enhances profitability in data-driven portfolio optimization. These studies underscore the critical role of diversification in portfolio resilience, showing that optimal asset selection across different clusters is pivotal for portfolio performance under challenging market conditions.
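Of the data science concepts the abstract names, Bollinger bands have a standard textbook definition: a simple moving average plus and minus a multiple of the rolling standard deviation. The following is a minimal sketch of that general definition only, not the thesis's implementation. The function name `bollinger_bands`, the parameters `window` and `num_std`, the use of the population standard deviation, and the sample prices are all assumptions made for illustration.

```python
from statistics import fmean, pstdev


def bollinger_bands(prices, window=20, num_std=2.0):
    """Return one (lower, middle, upper) tuple per full rolling window.

    Hypothetical helper for illustration: the middle band is the simple
    moving average, and the outer bands sit num_std population standard
    deviations away from it.
    """
    if window < 1 or window > len(prices):
        raise ValueError("window must be between 1 and len(prices)")
    bands = []
    for end in range(window, len(prices) + 1):
        chunk = prices[end - window:end]   # most recent `window` prices
        middle = fmean(chunk)              # simple moving average
        spread = num_std * pstdev(chunk)   # band half-width
        bands.append((middle - spread, middle, middle + spread))
    return bands


# Illustrative prices only, not data from the thesis.
prices = [10.0, 11.0, 12.0, 11.0, 10.0, 9.0, 10.0]
bands = bollinger_bands(prices, window=3, num_std=2.0)
for lower, middle, upper in bands:
    print(f"{lower:.3f}  {middle:.3f}  {upper:.3f}")
```

A window of 20 periods and a multiplier of 2 are common defaults in practice; the short window here just keeps the example readable.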
Keywords
Machine Learning, Autoencoder, Long Short-Term Memory, Gated Recurrent Unit, Data Science, Data-driven Techniques, Bollinger Bands, Data Inference, Hybridization, Stocks, Stock Index, Cryptocurrency, Portfolio Construction, Portfolio Diversification, Portfolio Optimization