Brockman, G., Cheung, V., Pettersson, L., Schneider, J., Schulman, J., Tang, J. & Zaremba, W. (2016). OpenAI Gym. arXiv:arXiv:1606.01540
Busoniu, L., de Bruin, T., Tolić, D., Kober, J. & Palunko, I. (2018). Reinforcement learning for control: Performance, stability, and deep approximators. Annual Reviews in Control.
Chen, L. & Gao, Q. (2019). Application of deep reinforcement learning on automated stock trading. 2019 IEEE 10th International Conference on Software Engineering and Service Science (ICSESS), 29-33.
Chong, T., Ng, W.-K. & Liew, V. (2014). Revisiting the performance of MACD and RSI oscillators. Journal of Risk and Financial Management, 1-12.
Craig A., E. & Parbery, S. A. (2005). Is smarter better? A comparison of adaptive, and simple moving average trading strategies. Research in International Business and Finance, 399-411.
Deng, Y., Bao, F., Kong, Y., Ren, Z. & Dai, Q. (2016). Deep direct reinforcement learning for financial signal representation and trading. IEEE Transactions on Neural Networks and Learning Systems, 1-12.
Fischer, T. G. (2018). Reinforcement learning in financial markets - a survey. FAU Discussion Papers in Economics.
Fujimoto, S., Hoof, H. & Meger, D. (2018). Addressing function approximation error in actor-critic methods. International conference on machine learning, 1587-1596.
Gurrib, I. (2018). Performance of the average directional index as a market timing tool for the most actively traded USD based currency pairs. Banks and Bank Systems, 58-70.
Haarnoja, T., Zhou, A., Abbeel, P. & Levine, S. (2018). Soft actor critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. International conference on machine learning, 1861-1870.
Heidari, M. & Amiri, H. (2022). Inspecting the Predictive Power of Artificial Intelligence Models in Predicting the Stock Price Trend in Tehran Stock Exchange. Financial Research Journal, 24(4), 602-623. (in Persian)
Hill, A., Raffin, A., Ernestus, M., Gleave, A., Kanervisto, A., Traore, R., … Wu, Y. (2018). Stable baselines. https://github.com/hill-a/stable-baselines.
Jeong, G. & Kim, H. (2019). Improving financial trading decisions using deep Q-learning: predicting the number of shares, action strategies, and transfer learning. Expert Systems with Applications, 117, 125- 138.
Jiang, Z. & Liang, J. (2017). Cryptocurrency portfolio management with deep reinforcement learning. In 2017 Intelligent systems conference (IntelliSys) (pp. 905-913). IEEE.
Konda, V. & Tsitsiklis, J. (2001). Actor-critic algorithms. Society for Industrial and Applied Mathematics. 12.
Kritzman, M. & Li, Y. (2010). Skulls, financial turbulence, and risk management. Financial Analysts Journal, 66(5), 30-41.
Lauguico, S., Concepcion II, R., Alejandrino, J., Macasaet, D., Tobias, R. R., Bandala, A. & Dadios, E. (2019). A fuzzy logic-based stock market trading algorithm using bollinger bands. International conference on humanoid, nanotechnology, information technology, communication and control, environment, and management (HNICEM), 1-6.
Li, J., Rao, R. & Shi, J. (2018). Learning to Trade with Deep Actor Critic Methods. 11th International Symposium on Computational Intelligence and Design, 66-71.
Maitah, M., Procházka, P., Čermák, M. & Šrédl, K. (2016). Comodity Channel index: evaluation of trading rule of agricultural Commodities. International Journal of Economics and Financial, 176-178.
Markowitz, H. (1952). Portfolio selection. Journal of Finance, 77-91.
Mohebi, S., Fadaeinejad, M. E., Osoolian, M. & Hamidizadeh, M. R. (2022). Feature Selection for the Prediction Model of the Tehran Stock Exchange Index by Dimensionality Reduction Techniques. Financial Research Journal, 24(4), 577-601. (in Persian)
Neuneier, R. (1996). Optimal asset allocation using adaptive dynamic programming. Conference on Neural Information Processing Systems.
Neuneier, R. (1997). Enhancing Q-learning for optimal asset allocation. Coference on Neural Information Processing Systems.
Nourahmadi, M. J. & Nourahmadi, M. (2023). Application of Kalman Filter to Estimate Dynamic Hedge Ratio in Pairs Trading Strategy: A Case Study of the Automobile Industry. Financial Research Journal, 25(1), 63-87. (in Persian)
Nourahmadi, M., Rahimi, A. & Sadeqi, H. (2024). Designing a Stock Recommender System Using the Collaborative Filtering Algorithm for the Tehran Stock Exchange. Financial Research Journal, 26(2), 302-330. (in Persian)
Pacheco Aznar, D. (2023). Portfolio Management: A Deep Distributional RL Approach. SSRN.
Schulman, J., Wolski, F., Dhariwal, P., Radford, A. & Klimov, O. (2017). Proximal policy optimization algorithms. arXiv:1707.06347.
Silver, D., Huang, A., Maddison, C. J., Guez, A., Sifre, L., Van Den Driessche, G., ... & Hassabis, D. (2016). Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587), 484-489.
Sutton, R. & Barto, A. (1998). Reinforcement learning: an introduction. IEEE Transactions on Neural Networks, 1054.
Sutton, R., Mcallester, D., Singh, S. & Mansour, Y. (2000). Policy gradient methods for reinforcement learning with function approximation. Conference on Neural Information Processing Systems (NeurIPS).
Yang, H., Liu, X.-Y., Zhong, S. & Walid, A. (2020). Deep reinforcement learning for automated stock trading: An ensemble strategy. In Proceedings of the first ACM international conference on AI in finance, 1-8.
Yu, K. (2023). Quantitative Trading of Stocks Based on TD3 Algorithm. Highlights in Science, Engineering and Technology, 224-231.
Zhang, Y. & Yang, X. (2017). Online portfolio selection strategy based on combining experts’ advice. Computational Economics, 50(1), 141-159.
Zhang, Z., Zohren, S. & Roberts, S. (2019). Deep reinforcement learning for trading. arXiv preprint arXiv:1911.10107.