Combining Variational Autoencoder and Recurrent Neural Networks for Generic Human Motion Prediction

Authors

  • Jonas Hansert Karlsruhe University of Applied Sciences

DOI:

https://doi.org/10.60643/urai.v2023p11

Keywords:

Human Motion Prediction, Recurrent Neural Network, Variational Autoencoder, Machine Learning, Time Series Prediction, 3D Computer Vision

Abstract

Real-time motion prediction in a three-dimensional environment is required for many applications, from autonomous cars to human-robot collaboration to free-fall sorting machines. The most widely deployed sensors for capturing three-dimensional environments, such as time-of-flight cameras, lidar sensors, stereo cameras, and radar devices, deliver point clouds or other formats that can easily be converted to point clouds. The high dimensionality of point clouds, and even of voxel grids, is a major challenge for real-time motion prediction. Most approaches use a skeleton tracking algorithm for dimensionality reduction, which is itself very error-prone. We investigated an approach consisting of a combination of two separately trained neural networks: a variational autoencoder for dimensionality reduction, combined with a long short-term memory (LSTM) or gated recurrent unit (GRU) network for time series prediction in latent space. We were able to show that it is possible to make reliable motion predictions up to one second into the future, depending on the motion.
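The two-stage pipeline from the abstract can be sketched schematically: an autoencoder compresses each high-dimensional frame into a latent vector, a recurrent model rolls the latent state forward, and the decoder maps predicted latents back to frame space. The sketch below uses untrained linear placeholders in plain numpy instead of the trained VAE and LSTM/GRU; all names, dimensions, and dynamics are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

# Schematic of the two-stage pipeline:
# (1) an autoencoder compresses each high-dimensional frame (e.g. a
#     flattened voxel grid) into a low-dimensional latent vector, and
# (2) a recurrent predictor forecasts future latent vectors, which are
#     decoded back into full frames.
# Everything below is a placeholder stand-in, not the trained networks.

rng = np.random.default_rng(0)
FRAME_DIM = 4096   # e.g. a flattened 16x16x16 voxel grid (assumed)
LATENT_DIM = 32    # size of the learned latent space (assumed)

W_enc = rng.normal(scale=0.01, size=(LATENT_DIM, FRAME_DIM))
W_dec = rng.normal(scale=0.01, size=(FRAME_DIM, LATENT_DIM))
A = np.eye(LATENT_DIM)  # placeholder latent dynamics (identity = "frozen" motion)

def encode(frame):
    """Stand-in for the VAE encoder: frame -> latent vector."""
    return W_enc @ frame

def decode(z):
    """Stand-in for the VAE decoder: latent vector -> frame."""
    return W_dec @ z

def predict_latent(history, horizon):
    """Stand-in for the LSTM/GRU: roll the last latent state forward."""
    z = history[-1]
    out = []
    for _ in range(horizon):
        z = A @ z
        out.append(z)
    return out

# Run the pipeline on a short synthetic sequence of frames.
frames = [rng.normal(size=FRAME_DIM) for _ in range(10)]
latents = [encode(f) for f in frames]                 # dimensionality reduction
future_latents = predict_latent(latents, horizon=5)   # prediction in latent space
future_frames = [decode(z) for z in future_latents]   # back to frame space

print(len(future_frames), future_frames[0].shape)
```

The key design point the abstract highlights is that prediction happens entirely in the low-dimensional latent space, so the recurrent network never has to model the raw point-cloud or voxel dimensionality directly.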

Published

13.05.2025