
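Artificial Intelligence
Average article quality score: 91
小吴同学真棒
Updating at a relaxed pace~
I hope to improve little by little, grow stronger, make many excellent friends, and live a positive, upbeat life.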
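[Paper Quick Read + Locating the Core Code] (2024 ECCV) ParCo: Part-Coordinating Text-to-Motion Synthesis
This paper proposes the ParCo framework, which helps a motion generation model better understand and coordinate the motion of individual body parts. Original · 2025-04-27 21:24:57 · 960 views · 0 comments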
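Hi-TRS: Hierarchical Modeling and Hierarchical Self-Supervised Learning for Skeleton Video Sequences
Hi-TRS: hierarchical modeling and hierarchical self-supervised learning for skeleton video sequences. Original · 2023-08-17 14:06:38 · 789 views · 0 comments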
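A Brief Look at EMP-SSL + Code Walkthrough: A Minimalist Take on Self-Supervised Contrastive Learning
A minimalist take on self-supervised contrastive learning: an image is cropped into several patches, each patch is augmented and fed into the encoder separately to produce multiple embeddings, and their mean is used as the embedding of the image. The cosine distance between each patch embedding and the image embedding is then reduced, and Total Coding Rate (TCR) is used to prevent collapse (i.e., the encoder outputting the same embedding for every input). Original · 2023-08-14 20:37:38 · 813 views · 0 comments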
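The patch-averaging step described in this snippet can be illustrated with a minimal sketch. It is not the EMP-SSL code: the function names are hypothetical, embeddings are plain Python lists, and only the invariance term (agreement between each patch embedding and their mean) is shown. The TCR term is left out.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def invariance_loss(patch_embeddings):
    """Negative mean cosine similarity between each patch embedding
    and the image embedding (the mean of the patch embeddings).

    Hypothetical helper: lower values mean the patches agree more.
    """
    image_embedding = mean_vector(patch_embeddings)
    sims = [cosine_similarity(z, image_embedding) for z in patch_embeddings]
    return -sum(sims) / len(sims)


# Patches of one image whose embeddings roughly agree ...
agreeing = [[1.0, 0.0], [0.9, 0.1], [1.0, 0.1]]
# ... versus embeddings that point in unrelated directions.
scattered = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.5]]

print(invariance_loss(agreeing) < invariance_loss(scattered))
```

Minimizing this term alone would reward an encoder that maps every input to the same vector, which is why the snippet pairs it with TCR.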
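[Paper Reading Notes] (2022 ECCV) CMD: Self-supervised 3D Action Representation Learning with Cross-modal Mutual Di
The authors propose Cross-modal Mutual Distillation (CMD), a self-supervised learning framework. Knowledge is distilled between modalities in both directions (bidirectional knowledge distillation), and the knowledge being distilled is the distribution of similarities between a sample and other samples (the neighboring similarity distribution). During distillation, the teacher and student models are given different parameters in order to stabilize the process while ensuring that high-confidence information is transferred. Original · 2023-04-04 14:18:49 · 1356 views · 1 comment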
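A rough sketch of distilling a neighboring similarity distribution in both directions follows. It is an illustration under assumptions, not the paper's code: the function names are hypothetical, the "different parameters" for teacher and student are assumed here to be softmax temperatures, and the neighbor banks are small hand-written lists.

```python
import math


def softmax(scores, temperature):
    """Numerically stable softmax of scores divided by a temperature."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def similarity_distribution(query, bank, temperature):
    """Distribution of dot-product similarities between a query and a bank of neighbors."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in bank]
    return softmax(scores, temperature)


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def mutual_distillation_loss(query_a, bank_a, query_b, bank_b,
                             t_teacher=0.05, t_student=0.1):
    """Bidirectional distillation: each modality teaches the other.

    Assumption: the teacher uses a smaller temperature, so its
    distribution is sharper than the student's.
    """
    a_teaches_b = kl_divergence(
        similarity_distribution(query_a, bank_a, t_teacher),
        similarity_distribution(query_b, bank_b, t_student))
    b_teaches_a = kl_divergence(
        similarity_distribution(query_b, bank_b, t_teacher),
        similarity_distribution(query_a, bank_a, t_student))
    return a_teaches_b + b_teaches_a


# Toy embeddings of the same sample in two modalities (made-up values).
bank_a = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
bank_b = [[0.8, 0.6], [0.0, 1.0], [1.0, 0.0]]
loss = mutual_distillation_loss([1.0, 0.0], bank_a, [0.6, 0.8], bank_b)
print(loss)
```

The loss is a sum of two KL terms, so it is non-negative and reaches zero only when both modalities rank their neighbors with identical distributions.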
InstDisc Code Walkthrough
This post walks through the core part of the code: computing the loss and updating the memory bank. Original · 2022-11-06 12:35:29 · 1774 views · 1 comment
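For readers who want a feel for the two pieces named here, the sketch below shows one way a loss over a memory bank and a bank update can look. It is a simplified illustration, not the InstDisc implementation: the function names, the temperature, and the momentum value are assumptions, and it uses a full softmax over the bank rather than any sampled approximation.

```python
import math


def l2_normalize(vector):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def instance_loss(feature, index, memory_bank, temperature=0.07):
    """Cross-entropy of classifying `feature` as instance `index`,
    using the stored bank entries as class prototypes."""
    logits = [sum(f * m for f, m in zip(feature, entry)) / temperature
              for entry in memory_bank]
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
    return log_sum_exp - logits[index]


def update_memory_bank(memory_bank, index, feature, momentum=0.5):
    """Blend the stored entry with the new feature, then re-normalize (in place)."""
    mixed = [momentum * old + (1.0 - momentum) * new
             for old, new in zip(memory_bank[index], feature)]
    memory_bank[index] = l2_normalize(mixed)


# Toy bank with two instances and a fresh feature for instance 0.
bank = [[1.0, 0.0], [0.0, 1.0]]
feature = l2_normalize([0.8, 0.6])

loss_correct = instance_loss(feature, 0, bank)
loss_wrong = instance_loss(feature, 1, bank)
update_memory_bank(bank, 0, feature)
print(loss_correct < loss_wrong, bank[0])
```

Re-normalizing after the blend keeps every bank entry on the unit sphere, so dot products stay comparable across instances.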
[Paper Reading Notes] (2021 CVPR) 3D Human Action Representation Learning via Cross-View Consistency Pursuit
We propose CrosSCLR, a cross-view contrastive learning framework for skeleton-based action representation. First, we develop Contrastive Learning for Skeleton-based action Representation (SkeletonCLR) to learn the single-view representations of skeleton dat... Original · 2022-09-05 20:03:43 · 737 views · 1 comment
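[Paper Reading Notes] (2022 ECCV) Contrastive Positive Mining for Unsupervised 3D Action Representation Learning
This paper performs a self-supervised contrastive learning task on skeleton sequences: a skeleton sequence is augmented twice, the two augmented samples are fed into two branches, and each branch yields the feature of its augmented sample. Rather than directly pulling the two augmented features together, the paper pulls together the distributions of similarity between each augmented sample and the N samples in the queue. Besides treating the two augmented samples as each other's positives, in the second training stage the model uses a Positive Mining strategy to treat some samples in the queue as positives for both augmented samples, giving positive-enhanced contrastive learning. Original · 2022-08-21 16:06:34 · 1091 views · 0 comments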
[Video Study Notes] (霹雳吧啦Wz) MobileNet Series
Study notes on 霹雳吧啦Wz's MobileNet video series, organized for easy review. Original · 2022-07-27 15:55:41 · 3740 views · 2 comments
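[Paper Reading Notes] (2022 CVPR) Self-Supervised Material and Texture Representation Learning for Remote Sensing T
This paper performs a self-supervised contrastive learning task on remote sensing images. A modified encoder (see Sec 1) extracts features from the images, and the relationship between those features and the class centers yields a soft representation for each sample (see Sec 2). A contrastive loss then pulls the soft representations of positive pairs together and pushes those of negative pairs apart (see Sec 3)... Original · 2022-07-20 20:35:31 · 1502 views · 1 comment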
[Paper Reading Notes] (2021 CVPR) Re-labeling ImageNet: from Single to Multi-Labels, from Global to Localized Label
Re-labeling ImageNet: from Single to Multi-Labels, from Global to Localized Labels (2021 CVPR). Sangdoo Yun, Seong Joon Oh, Byeongho Heo, Dongyoon Han, Junsuk Choe, Sanghyuk Chun. Notes. Contributions: In this paper, we propose a re-labeling strategy, R... Original · 2022-05-23 16:44:12 · 399 views · 0 comments
[Paper Reading Notes] (2019 ICCV) SlowFast Networks for Video Recognition
Paper: SlowFast Networks for Video Recognition (https://arxiv.org/pdf/1812.03982.pdf). Authors: Christoph Feichtenhofer, Haoqi Fan, Jitendra Malik, Kaiming He (Facebook AI Research (FAIR)). Preface: Detailed explanations of this paper are already available online, so I won't duplicate that work. When citing other people's explanations... Original · 2022-03-27 15:29:24 · 599 views · 0 comments
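[Paper Reading Notes] (2s-AGCN) Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognit
Preface: A long time ago I published a code walkthrough for this paper: "2s-AGCN Code Walkthrough" (小吴同学真棒's blog, CSDN). I was surprised to find that quite a few people have read and bookmarked it, so while revisiting the paper today I am adding notes on its method. Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition (2019 CVPR). Lei Shi, Yifan Zhang, Jian Cheng, H... Original · 2022-03-22 23:33:24 · 2830 views · 0 comments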
[Paper Reading Notes] (2019 IROS) Action Recognition Based on 3D Skeleton and RGB Frame Fusion
Action Recognition Based on 3D Skeleton and RGB Frame Fusion (2019 IROS). Guiyu Liu, Jiuchao Qian, Fei Wen, Xiaoguang Zhu, Rendong Ying, Peilin Liu. Notes. Contributions: First, based on the skeleton information, we propose a preprocess strategy and d... Original · 2022-03-15 16:46:22 · 1453 views · 0 comments
[Dataset Notes] PKUMMD: A Multi-Modal Human Action Detection Dataset
PKUMMD. Notes. Dataset URL: PKU-MMD. Dataset Details: This is a large-scale multi-modality action detection dataset. It contains 1076 long video sequences, each of which lasts about 3∼4 minutes (recording ratio set to 30 FPS) and contains approximately... Original · 2022-03-11 13:26:53 · 4239 views · 0 comments
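[Paper Reading Notes + Code Walkthrough] (2018 AAAI) ST-GCN
Preface: ST-GCN is the pioneering work in skeleton-based action recognition. It comes from MMLab, so it is bound to be excellent! A pioneering paper like this inevitably carries a lot of theory and math, and since I (a newbie) majored in neither mathematics nor computer science or software as an undergraduate, reading it for the first time was painful... So this post, which should have been written long ago, kept getting put off... until my 2s-AGCN walkthrough had been out for ages and the ST-GCN one still had not appeared... "2s-AGCN Code Walkthrough" (小吴同学真棒's blog, CSDN). Paper: https... Original · 2022-03-02 17:08:35 · 8222 views · 5 comments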
[Paper Reading Notes] (2015 CVPR) Hierarchical recurrent neural network for skeleton based action recognition
Representation Learning of Temporal Dynamics for Skeleton-Based Action Recognition (2015 CVPR). Authors. Notes. Contributions: We propose an end-to-end hierarchical RNN for skeleton based action recognition. Instead of taking the whole skeleton as the in... Original · 2022-02-22 16:28:55 · 2082 views · 1 comment
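All About Data Parallel (Gradient Computation, Synchronized BN, ...)
0. Preface: This post is a collection of learning links. Plenty of reference material already exists online, so I won't repeat it. From the links I found, I picked the parts I consider clear and easy to follow, excerpted them, and added the notes I took while studying. Sources are credited; thanks to all the bloggers! 1. How Data Parallel works & gradient computation: "PyTorch multi-GPU DataParallel and gradient accumulation for unbalanced and insufficient GPU memory" (gaoyelu's blog, CSDN). 2. Data Parallel does not yet have official PyTorch synchronization, but DD... Original · 2022-02-20 18:38:58 · 2270 views · 0 comments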
Paper Reading Notes: (2018 ACCV) Cross Pixel Optical-Flow Similarity for Self-Supervised Learning
Cross Pixel Optical-Flow Similarity for Self-Supervised Learning (2018 ACCV: Asian Conference on Computer Vision). Aravindh Mahendran, James Thewlis, Andrea Vedaldi. Notes. Contributions: The authors propose a new self-supervised algorithm by using the... Original · 2022-02-04 20:58:08 · 2042 views · 0 comments
[Reading the Paper Through the Code] ViT (2021 ICLR) An image is worth 16x16 words: Transformers for image recognition at scale
Paper: An image is worth 16x16 words: Transformers for image recognition at scale. Github code (PyTorch Implementation): https://github.com/lucidrains/vit-pytorch Contents: Model Overview; Github Code Usage; Procedure 1 & 2: split an image into fixed-size pa... Original · 2021-12-23 12:37:08 · 1646 views · 1 comment
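ViT begins by cutting the input image into a grid of fixed-size patches and flattening each one into a vector. The sketch below shows only that step, on a single-channel image stored as nested lists. The function name is hypothetical and this is not the vit-pytorch API. A real implementation would operate on batched tensors with a channel dimension.

```python
def split_into_patches(image, patch_size):
    """Split a 2D image (list of rows) into flattened, non-overlapping patches.

    Patches are returned in row-major order over the patch grid, and the
    pixels inside each patch are flattened in row-major order as well.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be divisible by patch_size")

    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = [image[row][col]
                     for row in range(top, top + patch_size)
                     for col in range(left, left + patch_size)]
            patches.append(patch)
    return patches


# A 4x4 toy image whose pixel value encodes its position.
image = [[row * 4 + col for col in range(4)] for row in range(4)]
patches = split_into_patches(image, 2)
print(len(patches), patches[0])
```

Each flattened patch plays the role of one "word" in the paper's title. In the model, the patches would next be linearly projected to the embedding dimension.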
[Paper Reading Notes] (2021 ICCV) Video Pose Distillation for Few-Shot, Fine-Grained Sports Action Recognition
Video Pose Distillation for Few-Shot, Fine-Grained Sports Action Recognition (2021 ICCV). James Hong, Matthew Fisher, Michaël Gharbi, Kayvon Fatahalian. Notes. Preface (my own summary): Earlier AR (Action Recognition) work takes one of two approaches: (1) end-to-end: ordinary AR, which takes RGB frames as input and outputs the action's... Original · 2021-12-21 13:02:32 · 579 views · 0 comments
[Paper Reading Notes] (2015 ICML) Unsupervised Learning of Video Representations using LSTMs
Unsupervised Learning of Video Representations using LSTMs (2015 ICML). Nitish Srivastava, Elman Mansimov, Ruslan Salakhutdinov. Notes. Contributions: Our model uses an encoder LSTM to map an input sequence into a fixed length representation. This repre... Original · 2021-11-23 17:22:55 · 750 views · 0 comments
[Paper Reading Notes] (2018 ECCV) Look, Listen and Learn
Look, Listen and Learn (2018 ECCV). Relja Arandjelović, Andrew Zisserman. Notes. Contributions: We introduce a novel Audio-Visual Correspondence (AVC) learning task that is used to train the two (visual and audio) networks from scratch. The AVC t... Original · 2021-11-18 11:02:07 · 2151 views · 2 comments
[Paper Reading Notes] (2017 CVPR) See, Hear, and Read: Deep Aligned Representations
See, Hear, and Read: Deep Aligned Representations (2017 CVPR). Yusuf Aytar, Carl Vondrick, Antonio Torralba. Notes. Contributions: In this paper, we learn rich deep representations that are aligned across the three major natural modalities: vision, soun... Original · 2021-11-18 10:50:44 · 3139 views · 0 comments
[Paper Reading] Revisiting self-supervised visual representation learning
0. Preface: Unlike other papers that design a novel SSL pretext task, this paper mainly runs experiments to investigate how the network architecture affects the quality of the representation learned from an SSL pretext task. 1. Conclusions: Architecture choices which negligibly affect performance in the fully labeled setting, may significantly affect performance in... Original · 2021-10-03 13:57:37 · 598 views · 0 comments
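Style Transfer: Summary of a First Study
0. Preface: I recently watched Andrew Ng's lecture videos on style transfer and found them very inspiring, so I decided to write a summary. 1. Main idea. Goal: transfer the style of a content image so that it matches the style of another image (the style image). (Figure from the paper: A Neural Algorithm of Artistic Style.) Method: generate the final image by constraining a Content Loss and a Style Loss. 1.0 activation (representation), kernel (filter), cha... Original · 2021-09-25 15:45:05 · 5104 views · 1 comment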
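A small sketch may help make the two losses concrete. It assumes the common formulation in which the content loss compares activations directly and the style loss compares Gram matrices of channel activations. The function names, the plain averaging used for normalization, and the `alpha`/`beta` weights are illustrative assumptions rather than the exact constants from the course or the paper.

```python
def gram_matrix(features):
    """Gram matrix of a feature map given as a list of channels,
    each channel being a flat list of activations over spatial positions."""
    return [[sum(a * b for a, b in zip(ch_i, ch_j)) for ch_j in features]
            for ch_i in features]


def content_loss(generated, content):
    """Mean squared difference between activations at the same positions."""
    diffs = [(g - c) ** 2
             for g_ch, c_ch in zip(generated, content)
             for g, c in zip(g_ch, c_ch)]
    return sum(diffs) / len(diffs)


def style_loss(generated, style):
    """Mean squared difference between the two Gram matrices."""
    gram_g, gram_s = gram_matrix(generated), gram_matrix(style)
    diffs = [(a - b) ** 2
             for row_g, row_s in zip(gram_g, gram_s)
             for a, b in zip(row_g, row_s)]
    return sum(diffs) / len(diffs)


def total_loss(generated, content, style, alpha=1.0, beta=1.0):
    """Weighted sum that the generated image is optimized to minimize."""
    return alpha * content_loss(generated, content) + beta * style_loss(generated, style)


# Two channels over three spatial positions; `shifted` holds the same
# activations moved to different positions.
features = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
shifted = [[3.0, 1.0, 2.0], [6.0, 4.0, 5.0]]

print(content_loss(shifted, features), style_loss(shifted, features))
```

Moving activations to different positions changes the content loss but leaves the Gram matrix untouched, which is why the style term captures which channels co-activate rather than where they fire.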
[Paper Reading] (XDC) Self-Supervised Learning by Cross-Modal Audio-Video Clustering
Self-Supervised Learning by Cross-Modal Audio-Video Clustering (2020 NeurIPS). Humam Alwassel, Dhruv Mahajan, Bruno Korbar, Lorenzo Torresani, Bernard Ghanem, Du Tran. Notes. Contributions: propose Cross-Modal Deep Clustering (XDC), a novel self-supervi... Original · 2021-09-22 15:19:07 · 939 views · 0 comments
[Paper Reading]: NTU RGB+D 120: A Large-Scale Benchmark for 3D Human Activity Understanding
NTU RGB+D 120: A Large-Scale Benchmark for 3D Human Activity Understanding (2019 TPAMI). Jun Liu, Amir Shahroudy, Mauricio Perez, Gang Wang, Ling-Yu Duan, and Alex C. Kot. Note. Paper link: https://arxiv.org/pdf/1905.04757.pdf Github: https://github.com/shahroud... Original · 2021-08-25 13:29:43 · 1254 views · 0 comments
[Paper Reading]: NTU RGB+D: A Large Scale Dataset for 3D Human Activity Analysis
NTU RGB+D: A Large Scale Dataset for 3D Human Activity Analysis (2016 CVPR). Amir Shahroudy, Jun Liu, Tian-Tsong Ng, Gang Wang. Notes. Contribution: 1. introduce a large-scale dataset for RGB+D human action recognition; 2. propose a new recurrent neural n... Original · 2021-08-25 12:57:17 · 1519 views · 0 comments
Paper Reading: (2020 AAAI) Video Cloze Procedure for Self-Supervised Spatio-Temporal Learning
Video Cloze Procedure for Self-Supervised Spatio-Temporal Learning (2020 AAAI). Dezhao Luo, Chang Liu, Yu Zhou, Dongbao Yang, Can Ma, Qixiang Ye, Weiping Wang. Notes. Paper: Video Cloze Procedure for Self-Supervised Spatio-Temporal Learning. Code: https://git... Original · 2021-07-19 12:58:16 · 574 views · 0 comments
Paper Reading: Skeleton-Based Action Recognition with Directed Graph Neural Networks
Skeleton-Based Action Recognition with Directed Graph Neural Networks (2019 CVPR). Lei Shi, Yifan Zhang, Jian Cheng, Hanqing Lu. Notes. Contributions: (1) To the best of our knowledge, this is the first work to represent the skeleton data as a direc... Original · 2021-06-08 16:29:08 · 646 views · 0 comments
Paper Reading: An attention enhanced graph convolutional lstm network for skeleton-based action recognition
An attention enhanced graph convolutional lstm network for skeleton-based action recognition (2019 CVPR). Chenyang Si, Wentao Chen, Wei Wang, Liang Wang, Tieniu Tan. Notes. Contributions: The proposed AGC-LSTM is able to effectively capture discriminat... Original · 2021-06-08 15:27:39 · 494 views · 0 comments
Paper Reading: Actional-Structural Graph Convolutional Networks for Skeleton-based Action Recognition
Actional-Structural Graph Convolutional Networks for Skeleton-based Action Recognition (2019 CVPR). Maosen Li, Siheng Chen, Xu Chen, Ya Zhang, Yanfeng Wang, and Qi Tian. Notes. Contributions: we propose the A-link inference module (AIM) to infer action... Original · 2021-06-08 15:07:15 · 675 views · 0 comments
Paper Reading: A new representation of skeleton sequences for 3d action recognition
A new representation of skeleton sequences for 3d action recognition (2017 CVPR). Qiuhong Ke, Mohammed Bennamoun, Senjian An, Ferdous Sohel, Farid Boussaid. Notes. Contributions: We propose to transform each skeleton sequence to a new representation, i... Original · 2021-06-08 14:49:50 · 357 views · 0 comments
Paper Reading: (DCGAN) Unsupervised representation learning with deep
Unsupervised representation learning with deep convolutional generative adversarial networks (2016 ICLR). Alec Radford, Luke Metz, Soumith Chintala. Notes. Contributions: We propose and evaluate a set of constraints on the architectural topology of Con... Original · 2021-05-30 15:41:42 · 551 views · 0 comments
Paper Reading: Colorful Image Colorization
Colorful Image Colorization (2016 ECCV). Richard Zhang, Phillip Isola, Alexei A. Efros. Notes. Contributions: Given a grayscale photograph as input, this paper attacks the problem of automatic image colorization. First, we embrace the underlying uncert... Original · 2021-05-29 20:18:04 · 883 views · 0 comments
Paper Reading: Self-supervised video representation learning with space-time cubic puzzles
Paper: Self-supervised video representation learning with space-time cubic puzzles (2019 AAAI). Authors: Dahun Kim, Donghyeon Cho, In So Kweon. Download: https://ojs.aaai.org/index.php/AAAI/article/view/4873 Contributions: In this paper, we introduce a new self... Original · 2021-05-27 13:51:14 · 388 views · 0 comments
Paper Reading: Self-supervised spatio-temporal representation learning for videos by predicting motion and app
Contents: Contributions; Method (1. Partitioning patterns, 2. Motion Statistics, 3. Appearance Statistics); Results. Paper: Self-supervised spatio-temporal representation learning for videos by predicting motion and appearance statistics (2019 CVPR). Authors: Jiangli... Original · 2021-05-27 13:47:23 · 653 views · 0 comments
Paper Reading: Generating Videos with Scene Dynamics
Contents: Contributions; Method (1. Video Generator Network, 2. Video Discriminator Network); Results (1. Quantitative Results on Video Generator, 2. Video Representation Learning (Video Discriminator)). Paper: Generating Videos with Scene Dynamics (2016 NIPS). Authors: C... Original · 2021-05-27 13:41:00 · 547 views · 0 comments
Paper Reading: Self-Supervised Video Representation Learning With Odd-One-Out Networks
Contents: Contributions; Method (1. Model, 2. Three sampling strategies, 3. Video frame encoding); Results; More Reference to Follow. Paper: Self-Supervised Video Representation Learning With Odd-One-Out Networks (2017 CVPR). Authors: Basura Fernando, Hakan Bilen, ... Original · 2021-05-27 13:31:35 · 915 views · 0 comments
Paper Reading: Cross and Learn: Cross-Modal Self-supervision
Paper: Cross and Learn: Cross-Modal Self-supervision. Authors: Nawid Sayed, Biagio Brattoli, and Björn Ommer. Download: https://link.springer.com/chapter/10.1007/978-3-030-12939-2_17 Contributions: In this paper, we use cross-modal information as an alternati... Original · 2021-05-27 11:34:30 · 363 views · 0 comments