[AI] ICASSP 2024: FreeTalker


等风来不如迎风去



License: CC 4.0 BY-SA

Category: AI入门与实战 (AI Introduction and Practice). Tags: artificial intelligence, python, livekit, webrtc

Article link: https://blog.youkuaiyun.com/commshare/article/details/139329299


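FreeTalker is a novel framework and the first to use diffusion models to generate both spontaneous and non-spontaneous speaker motions, including co-speech gestures and on-stage movement. It relies on classifier-free guidance and the DoubleTake technique to produce natural transitions between motions and a high degree of controllability. The framework is trained on heterogeneous datasets, which improves the model's generalization and the diversity of the generated motion.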
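
The summary names classifier-free guidance as one source of controllability. As a minimal sketch of that general technique (not FreeTalker's actual implementation), the guided prediction at each denoising step is usually formed by extrapolating from the unconditional prediction toward the conditional one. The function name, the plain-list representation, and the example values below are assumptions made for illustration only.

```python
from typing import List


def classifier_free_guidance(
    pred_uncond: List[float],
    pred_cond: List[float],
    guidance_scale: float,
) -> List[float]:
    """Combine two model predictions with classifier-free guidance.

    pred_uncond: prediction made with the condition (speech/text) dropped.
    pred_cond:   prediction made with the condition present.
    guidance_scale: 0 gives the unconditional prediction, 1 gives the
        conditional one, and values above 1 push further toward the condition.
    """
    if len(pred_uncond) != len(pred_cond):
        raise ValueError("predictions must have the same length")
    # guided = uncond + scale * (cond - uncond), applied element-wise
    return [u + guidance_scale * (c - u) for u, c in zip(pred_uncond, pred_cond)]


# Hypothetical two-dimensional predictions, purely for illustration.
uncond = [0.0, 1.0]
cond = [1.0, 3.0]
guided = classifier_free_guidance(uncond, cond, guidance_scale=2.0)
```

A larger `guidance_scale` generally trades diversity for closer adherence to the conditioning signal, which is why it is commonly exposed as a user-facing control.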


Freetalker: Controllable Speech and Text-Driven Gesture Generation Based on Diffusion Models for Enhanced Speaker Naturalness

Current talking avatars mostly generate co-speech gestures based on audio and text of the utterance, without considering the non-speaking motion of the speaker. Furthermore, previous works on co-speech gesture generation have designed network structures based on individual gesture datasets, which results in limited data volume, compromised generalizability, and restricted speaker movements. To tackle these issues, we introduce FreeTalker, which, to the best of our knowledge, is the first framework for the generation
