Optimizing Decision Parameters of Humanoid Robots using Deep Reinforcement Learning

Keywords

Deep Reinforcement Learning
Behavior Switching
Humanoid Robots

Abstract

This work investigates the use of deep reinforcement learning to enable humanoid Nao robots in the RoboCup 3D Soccer Simulation to autonomously decide when to switch between complex behaviors. Two main experiments were conducted. In the first, an agent was trained to learn the optimal moment to transition from walking towards the ball to executing a kick. The robot was randomly initialized at varying distances and orientations relative to the ball and trained using Proximal Policy Optimization to maximize the accuracy of kicking the ball towards a target after approaching it. The resulting models achieved strong performance, on par with the handcrafted baseline in simulated matches. The second experiment extended this setup by allowing the agent to also determine a favorable pre-kick position around the ball before deciding to switch. Despite the richer decision space, the resulting models performed significantly worse than the baseline, indicating the increased difficulty of jointly learning spatial positioning and timing.

https://doi.org/10.60643/urai.v2025p29

This work is licensed under a Creative Commons Attribution 4.0 International License.

Copyright (c) 2025 Richard Pufe