Gonçalo Hora de Carvalho

Game-Solving DRL: An Introductory Literature Survey of the Last Decade & A Critical Methodological Review

Gonçalo Carvalho, Twan Vos

2024

Deep Reinforcement Learning (DRL) has demonstrated remarkable success in high-dimensional and stochastic environments, with applications ranging from board games to real-time multi-agent systems. However, the field continues to face fundamental methodological challenges, including inconsistencies in benchmarking, a lack of theoretical definitions, and difficulties in generalization. In this paper, we critically examine these limitations and explore whether DRL can be reframed as a general problem-solving methodology. Specifically, we hypothesize that any complex problem can, in principle, be formulated as a game, making it solvable via DRL under the right conditions. We review key advancements in DRL, including techniques such as value prediction functions, policy gradients, and model-based learning approaches like MuZero. Our analysis identifies persistent gaps in current methodologies, such as the reliance on handcrafted reward signals and computational inefficiencies. To address these concerns, we propose a recursive decomposition framework in which complex problems are broken down into sub-problems, each mapped onto a DRL-solvable structure. We introduce an extended Markov Decision Process (MDP) formulation incorporating intrinsic motivation and goal representations, allowing for reward-free optimization. Our results suggest that, if DRL can autonomously learn goal representations and intrinsic reward structures, it may serve as a generalizable tool for problem-solving beyond traditional reinforcement learning applications. However, open questions remain regarding computational feasibility, scalability, and the theoretical limits of game-based problem encoding. We conclude by discussing potential future directions, including the need for improved interpretability and a deeper mathematical foundation for DRL as a general AI paradigm.

Preprint
