(Played at the original speed)
Reinforcement learning controllers have made impressive progress in humanoid locomotion and light-load manipulation. However, achieving robust and precise motion under strong force interaction remains a significant challenge.
To address this limitation, this paper proposes HAFO, a dual-agent reinforcement learning control framework that jointly optimizes a robust locomotion policy and a precise upper-body manipulation policy through coupled training in environments with external force interaction. We explicitly model external pulling disturbances as a spring-damper system and achieve fine-grained force control by manipulating the virtual spring. During this process, the reinforcement learning policy spontaneously develops disturbance-rejection responses by exploiting environmental feedback. Moreover, HAFO employs an asymmetric actor-critic framework in which the critic network has access to privileged spring-damper forces and thereby guides the actor network toward a generalizable, robust policy for resisting external disturbances.
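As an illustration of the spring-damper disturbance model mentioned above, the following is a minimal sketch and not the authors' implementation. The class name `VirtualSpring`, the parameters `stiffness`, `damping`, and `rest_length`, the fixed anchor, and the rope-like "pull only" clamp are all assumptions introduced for this example.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class VirtualSpring:
    """Hypothetical spring-damper that pulls a body point toward a fixed anchor."""
    stiffness: float           # k in N/m (assumed value, not from the paper)
    damping: float             # c in N*s/m (assumed value, not from the paper)
    rest_length: float = 0.0   # unstretched length in m

    def force(self, anchor: Sequence[float], point: Sequence[float],
              velocity: Sequence[float]) -> List[float]:
        """Return the force applied to `point`, directed toward `anchor`."""
        delta = [a - p for a, p in zip(anchor, point)]
        dist = math.sqrt(sum(d * d for d in delta))
        if dist == 0.0:
            return [0.0] * len(delta)
        direction = [d / dist for d in delta]
        # With a fixed anchor, the spring lengthens when the point moves
        # away from it, so the extension rate is -(velocity . direction).
        extension_rate = -sum(v * u for v, u in zip(velocity, direction))
        tension = (self.stiffness * (dist - self.rest_length)
                   + self.damping * extension_rate)
        # Assumption: a rope can only pull, so negative tension is clamped.
        tension = max(tension, 0.0)
        return [tension * u for u in direction]


spring = VirtualSpring(stiffness=100.0, damping=10.0)
pull = spring.force(anchor=[1.0, 0.0, 0.0], point=[0.0, 0.0, 0.0],
                    velocity=[0.0, 0.0, 0.0])
```

Under this model, moving the anchor or changing `stiffness` changes the applied force, which is one plausible reading of "manipulating the virtual spring"; the paper's actual parameterization may differ.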
The experimental results demonstrate that HAFO achieves stable control of a humanoid robot under a variety of strong force interactions, showing remarkable performance in load tasks and keeping the robot stable under rope-tension disturbances. Project website: hafo-robot.github.io.
HAFO Overview. (a) Policy training. A dual-agent strategy with decoupled upper and lower bodies is adopted, where the lower-body policy takes root linear and angular velocities as command inputs, and the upper-body policy uses reference joint trajectories as command inputs. Meanwhile, various explicit dynamic perturbations are introduced at key locations to enhance the system's robustness and adaptability. (b) Strategy deployment. A humanoid robot control system based on teleoperation is developed, employing an efficient inverse kinematics algorithm to compute the robot's joint angles in real time with high precision, enabling efficient loco-manipulation tasks.
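The command interface described in panel (a) can be sketched roughly as follows. This is a hedged illustration only: the function names `dual_agent_step` and `critic_observation`, the shared proprioception vector, and the observation layout are assumptions made for this example and are not taken from the paper.

```python
from typing import Callable, List, Sequence

# A policy is modelled here as any callable mapping an observation to an action.
Policy = Callable[[Sequence[float]], List[float]]


def dual_agent_step(lower_policy: Policy, upper_policy: Policy,
                    proprio: Sequence[float],
                    root_vel_cmd: Sequence[float],
                    upper_joint_ref: Sequence[float]) -> List[float]:
    """Query both agents and concatenate their actions (hypothetical layout)."""
    # Lower body: commanded root linear and angular velocities.
    lower_obs = list(proprio) + list(root_vel_cmd)
    # Upper body: commanded reference joint trajectory sample.
    upper_obs = list(proprio) + list(upper_joint_ref)
    return lower_policy(lower_obs) + upper_policy(upper_obs)


def critic_observation(actor_obs: Sequence[float],
                       spring_force: Sequence[float]) -> List[float]:
    """Asymmetric setup: only the critic sees the privileged spring-damper force."""
    return list(actor_obs) + list(spring_force)
```

In this sketch the actor never receives the disturbance force, so the same actor input would remain available on hardware where that force is not measured.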