Deep Reinforcement Learning in Cryptocurrency Trading: A Profitable Approach


Xue Hao Tay https://orcid.org/0009-0009-8009-5786
Siew Mooi Lim https://orcid.org/0000-0001-7529-2704

Keywords

Investment, Inflation, Cryptocurrency, Time Series, Deep Reinforcement Learning

Abstract

This study proposes an automatic cryptocurrency trading system based on Deep Reinforcement Learning (DRL). Six popular cryptocurrencies were used: Bitcoin, Ethereum, BinanceCoin, DogeCoin, Cardano, and WAVES. Development began with three time-series models (a Temporal Convolutional Neural Network (TCNN), a Long Short-Term Memory network (LSTM), and a Gated Recurrent Unit network (GRU)) built to predict future prices. Cryptocurrency sentiment data was then collected through the Alternative.me API. Historical prices, predicted future prices, the cryptocurrency sentiment index, technical indicators, and trading account information were fed as input states to three DRL agents (Deep Q Network (DQN), Advantage Actor Critic (A2C), and Recurrent Proximal Policy Optimization (RPPO)), which were trained in a custom-developed trading environment. Each agent was given $1000 of initial capital for each of the six cryptocurrencies, traded using three possible actions (Buy, Sell, and Hold), and was back-tested on one year of unseen data. The DQN model achieved the highest overall return: $740 in total, an average return on investment (ROI) of 12.3% across all six cryptocurrencies, with an ROI of 63.98% on BinanceCoin. In contrast, A2C and RPPO both had negative ROI.
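The back-test accounting described in the abstract can be illustrated with a minimal sketch. It is not the authors' trading environment: the action encoding, the all-in position sizing, the absence of fees, and the names `TradingAccount`, `backtest_roi`, and `average_roi` are all assumptions made for illustration. It also assumes $1000 of capital per cryptocurrency, which is consistent with a $740 total return averaging 12.3% over six assets.

```python
from dataclasses import dataclass

# Hypothetical action encoding; the paper only names Buy, Sell and Hold.
BUY, SELL, HOLD = 0, 1, 2


@dataclass
class TradingAccount:
    """Toy account: all-in buys, all-out sells, no fees (assumptions)."""
    cash: float = 1000.0
    units: float = 0.0

    def step(self, action: int, price: float) -> None:
        if action == BUY and self.cash > 0:
            # Convert all cash into the asset at the current price.
            self.units += self.cash / price
            self.cash = 0.0
        elif action == SELL and self.units > 0:
            # Liquidate the whole position at the current price.
            self.cash += self.units * price
            self.units = 0.0
        # HOLD (or an infeasible Buy/Sell) leaves the account unchanged.

    def equity(self, price: float) -> float:
        """Cash plus the market value of held units."""
        return self.cash + self.units * price


def backtest_roi(prices, actions, initial_capital=1000.0):
    """Replay a fixed action sequence and return the ROI in percent."""
    account = TradingAccount(cash=initial_capital)
    for price, action in zip(prices, actions):
        account.step(action, price)
    final_equity = account.equity(prices[-1])
    return (final_equity - initial_capital) / initial_capital * 100.0


def average_roi(total_profit, capital_per_asset, n_assets):
    """Total profit as a percentage of the total capital deployed."""
    return total_profit / (capital_per_asset * n_assets) * 100.0


if __name__ == "__main__":
    # Made-up prices: buy at 100, sell at 120, then hold.
    print(backtest_roi([100.0, 120.0, 110.0], [BUY, SELL, HOLD]))
    # Figures from the abstract: $740 on 6 x $1000 of capital.
    print(round(average_roi(740.0, 1000.0, 6), 1))
```

In a DRL setting the fixed `actions` list would be replaced by the agent's policy output at each step, and the account fields would form part of the input state alongside prices, predictions, sentiment, and technical indicators.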


 



