3D Human Motion Estimation via Motion Compression and Refinement

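This post introduces a two-stage, video-based method for 3D human motion estimation that builds on VIBE. It points out that the metrics used by earlier methods ignore temporal smoothness, and that the proposed method is both smoother and achieves a lower MPJPE. The two-stage design is motivated by the fact that a generalized model may fail to capture person-specific motions: a VAE learns the accuracy and smoothness of human motion, and a second refinement stage produces the final result.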


  1. 3D Human Motion Estimation via Motion Compression and Refinement[1]
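  • A two-stage, video-based method for 3D human motion estimation.
  • The paper builds on VIBE[2]. It points out that earlier methods are evaluated with MPJPE, which only measures spatial accuracy and ignores temporal smoothness, so visualizations of VIBE's output show visible jitter. The paper's comparison figure uses acceleration error to measure temporal smoothness: the proposed method is smoother, and its final MPJPE is also lower.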
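
To make the point about MPJPE concrete, here is a minimal NumPy sketch. The function name `mpjpe`, the `(T, J, 3)` array layout, and the toy data are assumptions for illustration, and the root alignment that evaluation code usually applies first is omitted. It shows two predictions with the same MPJPE, one constant and one jittering from frame to frame.

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean per-joint position error for arrays of shape (T, J, 3)."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    # Euclidean distance per joint and frame, averaged over joints and frames.
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

# Toy data (hypothetical): 4 frames, 1 joint, ground truth at the origin.
gt = np.zeros((4, 1, 3))

# Prediction A: constant 1-unit offset along x (temporally smooth).
pred_smooth = gt + np.array([1.0, 0.0, 0.0])

# Prediction B: offset flips sign every frame (visible jitter).
pred_jitter = gt.copy()
pred_jitter[:, 0, 0] = [1.0, -1.0, 1.0, -1.0]

print(mpjpe(pred_smooth, gt))  # 1.0
print(mpjpe(pred_jitter, gt))  # 1.0
```

Both predictions score identically because MPJPE treats every frame independently, which is why a separate smoothness metric is needed.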

2. Acceleration error: measures the smoothness of the 3D joints. The computation code comes from [3].
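
The following is a minimal NumPy sketch of the idea, not a copy of the code in [3]. The function names, the `(T, J, 3)` layout, and the toy data are assumptions. Acceleration is approximated with a second-order finite difference, so the result is in position units per frame squared unless it is rescaled by the frame rate.

```python
import numpy as np

def accel(joints):
    """Second finite difference x[t-1] - 2*x[t] + x[t+1]; shape (T-2, J, 3)."""
    joints = np.asarray(joints, dtype=float)
    return joints[:-2] - 2.0 * joints[1:-1] + joints[2:]

def accel_error(pred, gt):
    """Per-frame acceleration error, averaged over joints; shape (T-2,)."""
    diff = accel(pred) - accel(gt)
    return np.linalg.norm(diff, axis=-1).mean(axis=-1)

# Toy data (hypothetical): 4 frames, 1 joint, ground truth at the origin.
gt = np.zeros((4, 1, 3))
pred_smooth = gt + np.array([1.0, 0.0, 0.0])   # constant offset, no jitter
pred_jitter = gt.copy()
pred_jitter[:, 0, 0] = [1.0, -1.0, 1.0, -1.0]  # sign flips every frame

print(accel_error(pred_smooth, gt))  # [0. 0.]
print(accel_error(pred_jitter, gt))  # [4. 4.]
```

A constant offset has zero acceleration error, while the jittering prediction is penalized heavily, even though both have the same per-frame position error.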

3. story

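  • The paper argues that because people share the same body structure (in practice, everyone uses the SMPL body model), "it is possible to learn a generalized kinematic model that can be matched against the image to infer the general motion of a person. However, since generalized models of motion can also fail to model person-specific motions, it may also be necessary to 'add back in' or refine the general motion estimates using image evidence." This is why the method is designed with two stages rather than one. The first stage produces coarse kinematic sequences of a person in a video. The second stage is a residual structure: it concatenates the first-stage result with the original features and refines iteratively to obtain a fine-grained result that is both accurate and smooth. Accuracy is easy to understand; smoothness is discussed in more detail next.
  • Smoothness: the paper first reviews how earlier work addressed the smoothness problem, which the original post shows as an excerpt from the paper.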
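
The second-stage residual refinement described in the first bullet can be sketched as follows. This is only an illustration under stated assumptions: the function `refine`, the toy dimensions, and the fixed linear map `weights` are hypothetical stand-ins for the paper's learned regressor, chosen so that each iteration moves the estimate halfway toward the evidence carried by the features.

```python
import numpy as np

def refine(features, coarse, weights, num_iters=3):
    """Iteratively add a residual predicted from [features, current estimate]."""
    estimate = np.array(coarse, dtype=float)
    for _ in range(num_iters):
        # Concatenate per-frame features with the current estimate: (T, F + D).
        stacked = np.concatenate([features, estimate], axis=-1)
        # Residual update: the regressor predicts a correction, not the pose.
        estimate = estimate + stacked @ weights
    return estimate

T, D = 5, 3                                  # frames, toy pose dimension
rng = np.random.default_rng(0)
features = rng.normal(size=(T, D))           # stand-in for image features
coarse = np.zeros((T, D))                    # stand-in for stage-one output

# Hypothetical regressor: delta = 0.5 * features - 0.5 * estimate.
weights = np.vstack([0.5 * np.eye(D), -0.5 * np.eye(D)])

refined = refine(features, coarse, weights, num_iters=3)
# With this toy map the remaining error halves on every iteration.
print(np.abs(features - coarse).max(), np.abs(features - refined).max())
```

Predicting a residual rather than the full output lets the second stage keep the smooth first-stage estimate as its starting point and only correct what the image evidence contradicts.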

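The paper also points out that simply adding a smoothness prior to the loss function makes it hard to find a balance between accuracy and smoothness. Instead, it uses a Variational Autoencoder (VAE; readers unfamiliar with VAEs may want to review them first). The VAE is first trained on AMASS so that it learns the accuracy and smoothness of the human motion contained in that dataset; in other words, the trained VAE encodes smoothness information. However, AMASS covers only a limited set of motions, so unseen motions will occur, and the second refinement step is needed to obtain a final result that is both accurate and smooth.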
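
For readers who want a reminder of what a VAE optimizes, here is a minimal NumPy sketch of the generic objective: the reparameterization trick plus a reconstruction term and a KL term. It does not reproduce the paper's motion VAE; the function names, the squared-error reconstruction term, and the `beta` weight are assumptions for illustration.

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, I)), summed over latent dimensions."""
    return -0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var), axis=-1)

def vae_loss(x, x_recon, mu, log_var, beta=1.0):
    """Squared-error reconstruction plus beta-weighted KL, averaged over batch."""
    recon = np.sum((x - x_recon) ** 2, axis=-1)
    return float(np.mean(recon + beta * kl_to_standard_normal(mu, log_var)))

rng = np.random.default_rng(0)
mu = np.zeros((2, 4))        # toy batch of 2, latent size 4
log_var = np.zeros((2, 4))   # unit variance, so the KL term is zero
z = reparameterize(mu, log_var, rng)

x = np.ones((2, 6))          # toy "motion" vectors
x_recon = np.zeros((2, 6))   # a deliberately bad reconstruction
print(z.shape)                           # (2, 4)
print(vae_loss(x, x_recon, mu, log_var)) # 6.0
```

The KL term pulls the latent code toward a compact prior, which is one way to read the post's claim that a VAE trained on smooth motion capture data ends up encoding smoothness.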

References

  1. https://arxiv.org/abs/2008.03789
  2. https://arxiv.org/pdf/1912.05656.pdf
  3. https://github.com/akanazawa/human_dynamics/blob/master/src/evaluation/eval_util.py