
This document is provided with parsing and translation by Doc2X. For more translated papers, feel free to visit Doc2X.
Original article: https://www.nature.com/articles/s41467-024-53654-y
Project repository: https://github.com/jeffsonyu/ViTaM

Capturing forceful interaction with deformable objects using a deep learning-powered stretchable tactile array


Chunpeng Jiang${}^{1,7}$, Wenqiang Xu${}^{2,7}$, Yutong Li${}^{2}$, Zhenjun Yu${}^{2}$, Longchun Wang${}^{1}$, Xiaotong Hu${}^{1,3}$, Zhengyi Xie${}^{1,3}$, Qingkun Liu${}^{1}$, Bin Yang${}^{1}$, Xiaolin Wang${}^{1}$, Wenxin Du${}^{2}$, Tutian Tang${}^{2}$, Dongzhe Zheng${}^{2}$, Siqiong Yao${}^{4}$, Cewu Lu${}^{5,6}$ ✉ & Jingquan Liu${}^{1}$ ✉

Capturing forceful interaction with deformable objects during manipulation benefits applications like virtual reality, telemedicine, and robotics. Replicating full hand-object states with complete geometry is challenging because of the occluded object deformations. Here, we report a visual-tactile recording and tracking system for manipulation featuring a stretchable tactile glove with 1152 force-sensing channels and a visual-tactile joint learning framework to estimate dynamic hand-object states during manipulation. To overcome the strain interference caused by contact with deformable objects, an active suppression method based on symmetric response detection and adaptive calibration is proposed and achieves 97.6% accuracy in force measurement, contributing to an improvement of 45.3%. The learning framework processes the visual-tactile sequence and reconstructs hand-object states. We experiment on 24 objects from 6 categories including both deformable and rigid ones with an average reconstruction error of 1.8 cm for all sequences, demonstrating a universal ability to replicate human knowledge in manipulating objects with varying degrees of deformability.


Nature Communications | (2024) 15:9513

Human-machine interaction (HMI) systems serve as gateways to the metaverse, acting as bridges between the physical world and the digital realm. A natural user interface in HMI allows humans to perform natural and intuitive control${}^{1}$. Although non-forceful interfaces such as hand gestures (Fig. 1A(i)) can be tracked using technologies like inertial measurement units (IMU)${}^{2}$, electromyography (EMG) sensors${}^{3,4}$, strain sensors${}^{5,6}$, video recording${}^{7}$, and triboelectric sensors${}^{8}$, forceful interfaces such as interaction with objects, i.e., human manipulation, are less explored${}^{9,10}$. Capturing forceful human manipulation has extensive potential applications, such as virtual reality (VR)${}^{11,12}$, telemedicine${}^{13}$, and robotics${}^{14,15}$, and contributes to real-world understanding for large artificial intelligence (AI) models${}^{16}$. Replicating the hand-object interplay is the first step to applying human manipulation knowledge in these applications. However, the hand-object states captured in previous research were far from complete. They mainly explore tasks like semantic recognition and spatial localization to predict object category and position (Fig. 1A(ii))${}^{17-20}$