Human Pose Analysis in Smooth and Continuous Manifold Space Leveraging Interpolated Embedding Variables
Author(s) - Jong-Hoon Kim, Chan-Su Lee
Publication year - 2025
Publication title - IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3609836
Subject(s) - aerospace; bioengineering; communication, networking and broadcast technologies; components, circuits, devices and systems; computing and processing; engineered materials, dielectrics and plasmas; engineering profession; fields, waves and electromagnetics; general topics for engineers; geoscience; nuclear engineering; photonics and electrooptics; power, energy and industry applications; robotics and control systems; signal processing and analysis; transportation
Skeleton data, defined by the hierarchical structure of joints and bones, effectively capture the spatio-temporal features of human motion. Most studies address the spatial and temporal dynamics of skeleton datasets jointly in order to solve a specific task, such as classification, generation, or inpainting. This paper instead introduces a task-agnostic approach to pose representation that is designed to serve a variety of downstream applications. Within the proposed latent space, identical poses from different time sequences are mapped to the same or highly similar points, irrespective of external factors such as temporal order or task-specific annotations. The latent space therefore remains generalizable and consistent, depending only on the geometric configuration of a single skeleton. To capture the characteristics of individual poses across sequential frames, a self-attention-based autoencoder is employed. This autoencoder constructs a low-dimensional embedding space that preserves the original skeleton characteristics. In addition, a reconstruction loss computed from interpolated embedding variables encourages the formation of a smooth and continuous manifold space. This smoothness enables the generation of realistic motion data by interpolating along the geometric path of the latent variables. The potential of the method as a task-agnostic approach was validated through inpainting tests, arbitrary motion style generation, and motion recognition tasks. The auxiliary loss function proved effective, achieving 9.5% higher accuracy in missing-frame estimation than a standard autoencoder in the position accuracy (L2P) evaluation. Additionally, in a classification task using the pose representation, sarcopenia diagnosis performance improved by 11% compared to methods based on spatiotemporal graph convolutional networks.
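The abstract does not give the form of the auxiliary loss, so the following is only a minimal sketch of one way a reconstruction loss on interpolated embedding variables could be set up. Everything in it is an assumption made for illustration: the linear `encode` and `decode` functions stand in for the self-attention-based autoencoder, the interpolation is linear, the targets are the true in-between frames, and the names `interpolation_loss`, `total_loss`, `gap`, and `weight` are hypothetical.

```python
import numpy as np


def encode(poses, w_enc):
    """Toy linear encoder (assumption; stands in for the self-attention encoder)."""
    return poses @ w_enc


def decode(latents, w_dec):
    """Toy linear decoder (assumption; stands in for the paper's decoder)."""
    return latents @ w_dec


def interpolation_loss(poses, w_enc, w_dec, gap=2):
    """MSE between poses decoded from interpolated latents and the true in-between frames.

    For each pair of frames (i, i + gap), the latents are linearly interpolated
    and decoded, then compared with the actual frames lying between them.
    """
    if gap < 2 or len(poses) <= gap:
        raise ValueError("need gap >= 2 and more than `gap` frames")
    z = encode(poses, w_enc)
    total, count = 0.0, 0
    for i in range(len(poses) - gap):
        for k in range(1, gap):
            t = k / gap
            z_mid = (1.0 - t) * z[i] + t * z[i + gap]  # interpolated embedding
            recon = decode(z_mid, w_dec)
            total += np.mean((recon - poses[i + k]) ** 2)
            count += 1
    return total / count


def total_loss(poses, w_enc, w_dec, weight=1.0, gap=2):
    """Per-frame reconstruction loss plus the weighted auxiliary interpolation loss."""
    recon = decode(encode(poses, w_enc), w_dec)
    rec_loss = np.mean((recon - poses) ** 2)
    return rec_loss + weight * interpolation_loss(poses, w_enc, w_dec, gap)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    poses = rng.normal(size=(8, 6))       # 8 frames, 6 pose coordinates (toy data)
    w_enc = rng.normal(size=(6, 3))       # 6-D pose -> 3-D latent
    w_dec = rng.normal(size=(3, 6))       # 3-D latent -> 6-D pose
    print(total_loss(poses, w_enc, w_dec))
```

Under these assumptions, the same interpolation step could also serve at inference time for missing-frame estimation: encode the frames on either side of a gap, interpolate between their latents, and decode the result.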