
Transformer backbone
Average article quality score: 92
Cherry_qy
[Transformer] LightViT: Towards Light-weight Convolution-free Vision Transformers
A light-weight, convolution-free vision Transformer. Original post · 2022-08-25 15:17:29 · 1350 views · 0 comments
[Transformer] Next-ViT: Next Generation Vision Transformer
A vision Transformer designed for industrial deployment scenarios. Original post · 2022-08-08 15:01:32 · 1138 views · 0 comments
[Transformer] EdgeViTs: Competing Light-weight CNNs on Mobile Devices with Vision Transformers
A ViT designed for edge devices. Original post · 2022-08-05 18:16:24 · 1076 views · 0 comments
[Transformer] AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition
AdaptFormer. Original post · 2022-06-24 11:59:59 · 1012 views · 0 comments
[Transformer] LITv2 (Fast Vision Transformers with HiLo Attention)
LITv2. Original post · 2022-06-23 11:11:42 · 1598 views · 0 comments
[Transformer] Inception Transformer
iFormer: flexibly grafts Inception-style convolution and max pooling into the Transformer. Original post · 2022-06-23 10:47:28 · 1411 views · 1 comment
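
To make the teaser above concrete: the Inception-style idea is to split channels between convolution/max-pooling paths (local, high-frequency detail) and a self-attention path (global context). Below is a minimal PyTorch sketch under my own assumptions; the module name, split ratio, and exact layer choices are illustrative and not iFormer's precise design.

```python
import torch
import torch.nn as nn

class InceptionStyleMixer(nn.Module):
    """Illustrative channel-split mixer: conv / max-pool branches capture
    local high-frequency detail, self-attention captures global context."""
    def __init__(self, dim, heads=4, hf_ratio=0.5):
        super().__init__()
        self.hf_dim = int(dim * hf_ratio)          # channels for the conv/pool branches
        self.lf_dim = dim - self.hf_dim            # channels for the attention branch
        self.half = self.hf_dim // 2
        self.conv_branch = nn.Conv2d(self.half, self.half, 3, padding=1, groups=self.half)
        self.pool_branch = nn.Sequential(
            nn.MaxPool2d(3, stride=1, padding=1),
            nn.Conv2d(self.hf_dim - self.half, self.hf_dim - self.half, 1),
        )
        self.attn = nn.MultiheadAttention(self.lf_dim, heads, batch_first=True)

    def forward(self, x):                          # x: (B, C, H, W)
        B, C, H, W = x.shape
        hf, lf = torch.split(x, [self.hf_dim, self.lf_dim], dim=1)
        h1, h2 = torch.split(hf, [self.half, self.hf_dim - self.half], dim=1)
        h1 = self.conv_branch(h1)                  # depthwise conv branch
        h2 = self.pool_branch(h2)                  # max-pooling branch
        tokens = lf.flatten(2).transpose(1, 2)     # (B, H*W, lf_dim)
        lf_out, _ = self.attn(tokens, tokens, tokens)
        lf_out = lf_out.transpose(1, 2).reshape(B, self.lf_dim, H, W)
        return torch.cat([h1, h2, lf_out], dim=1)  # recombine all branches

x = torch.randn(1, 64, 14, 14)
print(InceptionStyleMixer(64)(x).shape)            # torch.Size([1, 64, 14, 14])
```
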
[Transformer] TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation
TopFormer: builds real-time segmentation and detection models for Arm devices, outperforming MobileNet. CVPR 2022. Paper: http://arxiv.org/pdf/2204.05525 · Code: https://github.com/hustvl/TopFormer. To adapt ViT to various dense prediction tasks, recent Vi… Original post · 2022-04-25 23:15:03 · 1083 views · 0 comments
[Transformer] CMT: Convolutional Neural Networks Meet Vision Transformers
Paper: https://arxiv.org/abs/2107.06263 · Code (unofficial implementation): GitHub - FlyEgle/CMT-pytorch; the official code has not been released. Reproduction write-up: 浅谈Transformer+CNN混合架构:CMT以及从0-1复现. A CNN + Transformer hybrid model: the CNN models local features while the Tran… Original post · 2022-03-11 13:28:13 · 5295 views · 0 comments
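
The pattern the teaser describes, convolution for local feature modeling followed by self-attention for global modeling, can be sketched roughly as below. This is a generic hybrid block under my own assumptions (depthwise convolution, pre-norm attention, residual connections), not CMT's exact architecture.

```python
import torch
import torch.nn as nn

class LocalThenGlobalBlock(nn.Module):
    """Generic hybrid block: a depthwise conv models local structure, then
    multi-head self-attention models global dependencies over the tokens."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.local = nn.Conv2d(dim, dim, 3, padding=1, groups=dim)  # depthwise conv
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):                          # x: (B, C, H, W)
        B, C, H, W = x.shape
        x = x + self.local(x)                      # local feature modeling (CNN part)
        tokens = x.flatten(2).transpose(1, 2)      # (B, H*W, C)
        n = self.norm(tokens)
        attn_out, _ = self.attn(n, n, n)
        tokens = tokens + attn_out                 # global modeling (Transformer part)
        return tokens.transpose(1, 2).reshape(B, C, H, W)

print(LocalThenGlobalBlock(48)(torch.randn(2, 48, 16, 16)).shape)   # (2, 48, 16, 16)
```
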
[Transformer] PVT series: PVT & CPVT & Twins
PVT: 《Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions》. Paper: https://arxiv.org/abs/2102.12122 · Code: https://github.com/whai362/PVT. Brings the pyramid structure into the Transformer by simply stacking multiple independent Transformer encoders; in each stage a Patch Embedding progressively… Original post · 2022-03-04 12:59:07 · 1947 views · 0 comments
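
A rough sketch of the pyramid idea: each stage begins with a strided patch embedding that shrinks the token grid and widens the channels, then runs a stack of Transformer encoder layers. The widths, strides, and depths below are illustrative, and the sketch uses vanilla attention rather than PVT's spatial-reduction attention.

```python
import torch
import torch.nn as nn

def pvt_like_stage(in_dim, out_dim, depth, patch_stride):
    """One pyramid stage (illustrative): a strided patch embedding that shrinks
    the spatial grid, followed by a stack of Transformer encoder layers."""
    embed = nn.Conv2d(in_dim, out_dim, kernel_size=patch_stride, stride=patch_stride)
    encoder = nn.TransformerEncoder(
        nn.TransformerEncoderLayer(d_model=out_dim, nhead=4, batch_first=True),
        num_layers=depth,
    )
    return embed, encoder

x = torch.randn(1, 3, 224, 224)                    # input image
dims, strides, depths = [3, 64, 128, 256, 512], [4, 2, 2, 2], [2, 2, 2, 2]
for i in range(4):                                 # four stages -> a feature pyramid
    embed, encoder = pvt_like_stage(dims[i], dims[i + 1], depths[i], strides[i])
    x = embed(x)                                   # patch embedding shrinks H and W
    B, C, H, W = x.shape
    tokens = encoder(x.flatten(2).transpose(1, 2)) # independent encoder per stage
    x = tokens.transpose(1, 2).reshape(B, C, H, W)
    print(f"stage {i + 1}: {tuple(x.shape)}")      # progressively smaller, wider maps
```
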
[Transformer] TransVOD: End-to-End Video Object Detection with Spatial-Temporal Transformers
January 2022. Paper: https://arxiv.org/abs/2201.05047v3 · Code: https://github.com/SJTU-LuHe/TransVOD. Covers DETR 《End-to-End Object Detection with Transformers》, Deformable DETR 《Deformable Transformers for End-to-End Object Detection》, and TransVOD 《End-to-End Video Object Detect… Original post · 2022-02-28 14:57:53 · 4965 views · 0 comments
[Transformer] Deformable DETR: Deformable Transformers for End-to-End Object Detection
October 2020, from SenseTime. Object detection: adds deformable attention and a multi-scale feature fusion strategy to DETR. Paper: https://arxiv.org/abs/2010.04159 · Code: https://github.com/fundamentalvision/Deformable-DETR. DETR avoids hand-designing many detection components, but the limitations of the Transformer attention module when processing image feature maps lead to slow convergence and limited feature spatial resolution; to solve these problems the paper proposes D… Original post · 2022-02-28 11:36:43 · 2480 views · 0 comments
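
The core mechanism, letting each query attend to a small set of sampled points instead of every pixel of the feature map, can be sketched as follows. This is a single-scale, single-head simplification under my own assumptions; the actual Deformable DETR module samples across multiple feature levels and attention heads.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DeformableSamplingAttention(nn.Module):
    """Single-scale, single-head sketch of deformable attention: each query
    attends to a few sampled points instead of the whole feature map."""
    def __init__(self, dim, num_points=4):
        super().__init__()
        self.num_points = num_points
        self.offsets = nn.Linear(dim, num_points * 2)    # (dx, dy) per sampling point
        self.weights = nn.Linear(dim, num_points)        # attention weight per point
        self.value_proj = nn.Conv2d(dim, dim, 1)
        self.out_proj = nn.Linear(dim, dim)

    def forward(self, queries, ref_points, feat):
        # queries: (B, Q, C); ref_points: (B, Q, 2) in [-1, 1]; feat: (B, C, H, W)
        B, Q, C = queries.shape
        value = self.value_proj(feat)
        offsets = self.offsets(queries).view(B, Q, self.num_points, 2).tanh() * 0.1
        weights = self.weights(queries).softmax(dim=-1)            # (B, Q, K)
        grid = (ref_points.unsqueeze(2) + offsets).clamp(-1, 1)    # (B, Q, K, 2)
        sampled = F.grid_sample(value, grid, align_corners=False)  # (B, C, Q, K)
        out = (sampled * weights.unsqueeze(1)).sum(dim=-1)         # weighted sum -> (B, C, Q)
        return self.out_proj(out.transpose(1, 2))                  # (B, Q, C)

attn = DeformableSamplingAttention(dim=32)
q = torch.randn(2, 10, 32)                        # 10 object queries
ref = torch.rand(2, 10, 2) * 2 - 1                # reference points in [-1, 1]
print(attn(q, ref, torch.randn(2, 32, 20, 20)).shape)   # torch.Size([2, 10, 32])
```
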
[Transformer] DETR: End-to-End Object Detection with Transformers
ECCV 2020, Facebook AI. The first network to perform object detection with a Transformer. Paper: https://arxiv.org/abs/2005.12872 · Code: https://github.com/facebookresearch/detr · Implementation reference: 【CODE】Facebook 最新DETR(基于Transformer)目标检测算法实战_哔哩哔哩_bilibili. By combining a common CNN backbone with a Transformer architecture, DETR not only achieves parallel prediction but also removes the need for… Original post · 2022-02-25 15:01:02 · 1060 views · 0 comments
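
One reason DETR can drop hand-designed components such as NMS is its set-prediction formulation: predictions are matched one-to-one to ground-truth boxes with the Hungarian algorithm. A toy sketch of that matching step is below; the cost terms and weights are simplified (the real matcher also includes a generalized IoU cost).

```python
import torch
from scipy.optimize import linear_sum_assignment

def match_predictions(pred_logits, pred_boxes, gt_labels, gt_boxes, l1_weight=5.0):
    """Toy bipartite matching in the spirit of DETR's set prediction: build a
    cost matrix (classification + L1 box cost) and solve it with the Hungarian
    algorithm so that each ground-truth object gets exactly one query."""
    prob = pred_logits.softmax(-1)                      # (num_queries, num_classes)
    cls_cost = -prob[:, gt_labels]                      # (num_queries, num_gt)
    box_cost = torch.cdist(pred_boxes, gt_boxes, p=1)   # pairwise L1 box distance
    cost = cls_cost + l1_weight * box_cost
    query_idx, gt_idx = linear_sum_assignment(cost.detach().numpy())
    return list(zip(query_idx.tolist(), gt_idx.tolist()))

# 5 predicted queries vs. 2 ground-truth objects (boxes as cxcywh in [0, 1])
pred_logits = torch.randn(5, 4)                         # 4 classes
pred_boxes = torch.rand(5, 4)
gt_labels = torch.tensor([1, 3])
gt_boxes = torch.rand(2, 4)
print(match_predictions(pred_logits, pred_boxes, gt_labels, gt_boxes))
```
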
[Transformer] DeiT: Training data-efficient image transformers & distillation through attention
Paper: https://arxiv.org/pdf/2012.12877.pdf · Code: GitHub - facebookresearch/deit (official DeiT repository). December 2020. The existing Transformer-based classification model ViT must be pre-trained on massive data (JFT-300M, 300 million images) and then fine-tuned on ImageN… Original post · 2022-01-18 20:20:56 · 997 views · 0 comments
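
The "distillation through attention" in the title refers to an extra distillation token trained against a CNN teacher. A minimal sketch of the hard-label variant of that objective, as I understand it, is below; the function name is mine and the soft (KL-based) variant is omitted.

```python
import torch
import torch.nn.functional as F

def hard_distillation_loss(cls_logits, dist_logits, teacher_logits, labels):
    """Sketch of DeiT-style hard-label distillation: the class token is
    supervised by the ground truth, while a separate distillation token is
    supervised by the hard predictions of a CNN teacher."""
    teacher_labels = teacher_logits.argmax(dim=-1)            # hard teacher decision
    loss_cls = F.cross_entropy(cls_logits, labels)            # class-token branch
    loss_dist = F.cross_entropy(dist_logits, teacher_labels)  # distillation-token branch
    return 0.5 * (loss_cls + loss_dist)

B, num_classes = 8, 10
loss = hard_distillation_loss(
    cls_logits=torch.randn(B, num_classes),       # from the student's class token
    dist_logits=torch.randn(B, num_classes),      # from the student's distillation token
    teacher_logits=torch.randn(B, num_classes),   # from a pre-trained CNN teacher
    labels=torch.randint(0, num_classes, (B,)),
)
print(loss)
```
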
[Transformer] Conformer: Local Features Coupling Global Representations for Visual Recognition
Paper: https://arxiv.org/abs/2105.03889 · Code: https://github.com/pengzhiliang/Conformer. ICCV 2021. Conformer consists of a stem module, dual branches, FCUs, and two classifiers. Stem module: a 7×7 conv with stride 2 followed by a 3×3 max pooling with stride 2. Dual branches: a CNN branch (ResNet, extracting local fea… Original post · 2022-01-18 20:07:42 · 1671 views · 0 comments
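
The stem module spelled out above is straightforward to write down. A minimal sketch follows; the 64-channel width and the BatchNorm/ReLU choices are assumptions on my part.

```python
import torch
import torch.nn as nn

# The stem described above: a 7x7 stride-2 convolution followed by a 3x3
# stride-2 max pooling, shared by the CNN branch and the Transformer branch.
# (The 64-channel width and BatchNorm/ReLU are assumptions in this sketch.)
stem = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False),
    nn.BatchNorm2d(64),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
)

x = torch.randn(1, 3, 224, 224)
print(stem(x).shape)   # torch.Size([1, 64, 56, 56]), fed to both branches
```
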
[Transformer] Swin Transformer V2: Scaling Up Capacity and Resolution
Building on Swin Transformer, the paper proposes the following improvements: 1) a post-normalization technique and scaled cosine attention to improve the stability of large vision models; 2) a log-spaced continuous position bias to transfer models pre-trained at low resolution to high-resolution inputs; 3) implementation details that greatly reduce GPU memory usage, making it feasible to train large vision models. Original post · 2022-01-18 16:07:02 · 2766 views · 0 comments
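
A minimal sketch of improvement 1), scaled cosine attention: the similarity is the cosine between query and key, divided by a learnable temperature and shifted by a relative position bias, instead of the usual scaled dot product. The shapes, temperature value, and zero bias below are illustrative; in Swin V2 the bias comes from a small MLP over log-spaced relative coordinates (improvement 2).

```python
import torch
import torch.nn.functional as F

def scaled_cosine_attention(q, k, v, tau, rel_pos_bias):
    """Sketch of scaled cosine attention: similarity is the cosine between
    query and key, divided by a learnable temperature tau and shifted by a
    relative position bias, then softmax-weighted over the values."""
    q = F.normalize(q, dim=-1)                     # unit-norm queries
    k = F.normalize(k, dim=-1)                     # unit-norm keys
    sim = q @ k.transpose(-2, -1) / tau.clamp(min=0.01) + rel_pos_bias
    return sim.softmax(dim=-1) @ v

heads, n, d = 4, 49, 32                            # e.g. one 7x7 window per head
q, k, v = (torch.randn(heads, n, d) for _ in range(3))
tau = torch.full((heads, 1, 1), 0.1)               # learnable per head in the real model
bias = torch.zeros(heads, n, n)                    # from a small MLP in Swin V2
print(scaled_cosine_attention(q, k, v, tau, bias).shape)   # torch.Size([4, 49, 32])
```
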
[Transformer] PyramidTNT
TNT: Transformer in Transformer. Paper: https://arxiv.org/pdf/2103.00112.pdf · Code: https://github.com/huawei-noah/noah-research/tree/master/TNT. PyramidTNT: Improved Transformer-in-Transformer Baselines with Pyramid Architecture. Paper: https://arxiv.org/abs/22… Original post · 2022-01-18 15:21:15 · 587 views · 0 comments
[Transformer] MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer
From Apple. Paper: https://arxiv.org/abs/2110.02178 · Code: GitHub - apple/ml-cvnets (CVNets: A library for training computer vision networks). Conventional CNNs are easy to optimize and can be composed into different networks for specific tasks, whereas ViTs require large-scale data and are harder to optimize, with heavy learning and compute costs, because they lack the inductive biases inherent in images. Combining the strengths of CNNs and ViTs, for mobile vision tasks it builds… Original post · 2022-01-17 14:15:35 · 1089 views · 1 comment
[Transformer] DAT: Vision Transformer with Deformable Attention
Paper: https://arxiv.org/abs/2201.00520 · Code: https://github.com/LeapLabTHU/DAT. January 2022. Compared with CNN models, Transformer-based models have larger receptive fields and are good at modeling long-range dependencies, achieving excellent performance given large amounts of training data and model parameters, but with higher computational cost, slower convergence, and a greater risk of overfitting. To reduce the computational complexity, Swin Transformer adopts window-based local attention that restricts attention to local windows… Original post · 2022-01-14 14:44:07 · 3451 views · 2 comments
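
For reference, the window-based local attention mentioned above (Swin's strategy, which DAT then relaxes with learned, data-dependent sampling offsets) can be sketched as follows. The window size and dimensions are illustrative, and Swin's shifted-window trick and DAT's offset network are both omitted.

```python
import torch
import torch.nn as nn

def window_local_attention(x, window_size, attn):
    """Sketch of window-based local attention: partition the feature map into
    non-overlapping windows and run self-attention only inside each window,
    so the cost grows with the window size rather than the full image."""
    B, C, H, W = x.shape
    ws = window_size
    # (B, C, H, W) -> (B * num_windows, ws * ws, C)
    windows = (x.reshape(B, C, H // ws, ws, W // ws, ws)
                 .permute(0, 2, 4, 3, 5, 1)
                 .reshape(-1, ws * ws, C))
    out, _ = attn(windows, windows, windows)        # attention within each window only
    # fold the windows back into a (B, C, H, W) feature map
    out = (out.reshape(B, H // ws, W // ws, ws, ws, C)
              .permute(0, 5, 1, 3, 2, 4)
              .reshape(B, C, H, W))
    return out

attn = nn.MultiheadAttention(embed_dim=32, num_heads=4, batch_first=True)
x = torch.randn(2, 32, 28, 28)
print(window_local_attention(x, window_size=7, attn=attn).shape)   # (2, 32, 28, 28)
```
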