Original [SAM] A Comprehensive Survey on Segment Anything Model for Vision and Beyond
This is the first survey of SAM, covering its development history, progress, and applications across different tasks and data types. It first introduces terminology and background knowledge, then discusses SAM's strengths and limitations in applications such as image processing, and closes with an outlook on SAM's future. GitHub link: https://github.com/liliu-avril/Awesome-Segment-Anything. Foundation models are built on large models and self-supervised learning, so they can transfer to different domains and downstream tasks; LLM examples include the GPT series, BERT, and T5.
2023-07-13 22:53:35
2068
Original [TinyML] EfficientFormer: Vision Transformers at MobileNet Speed
Vision Transformers have developed rapidly in computer vision and achieved impressive results. However, the attention mechanism and the huge parameter count make ViT-based models often several times slower than lightweight convolutional models, which makes real-time ViT deployment challenging, especially on resource-constrained mobile hardware. Recent work has tried to reduce ViT's computational complexity via NAS or hybrid designs, but inference speed is still unsatisfactory. This raises an important question: can a Transformer really match MobileNet's speed while keeping high performance?...
2022-07-16 11:07:32
2335
Original [TinyML] APQ: Joint Search for Network Architecture, Pruning and Quantization Policy
APQ: joint search for network architecture, pruning, and quantization policy, from CVPR 2020, MIT HAN Lab. APQ is a new approach to efficient deep learning model deployment. Unlike prior methods that optimize the network architecture, pruning policy, and quantization policy separately, APQ optimizes them jointly. To cope with the enormous search space, it designs a quantization-aware accuracy predictor to guide an evolutionary search toward the best-fit architecture. Because directly training this predictor would require collecting large amounts of quantized data, which is very time-consuming, the paper proposes predictor transfer to obtain the quantization-aware predictor. The process starts by generating a dataset of (architecture, accuracy)
2022-07-14 14:57:36
1963
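The predictor-guided evolutionary loop the summary describes — rank candidates with a cheap predictor instead of training each one, then evolve the best — can be sketched roughly as follows. `predicted_accuracy` is a made-up scoring function standing in for APQ's trained quantization-aware predictor, and the (widths, bit-widths) encodings are invented for illustration:

```python
import random

# Hypothetical stand-in for APQ's quantization-aware accuracy predictor:
# scores an (architecture, quantization policy) encoding without training it.
# A real predictor would be a small network trained on (arch, accuracy) pairs.
def predicted_accuracy(encoding):
    widths, bits = encoding
    # toy scoring: reward wider layers, mildly reward higher bit-widths
    return sum(widths) * 0.01 + sum(bits) * 0.005

def mutate(encoding):
    widths, bits = encoding
    widths = [max(8, w + random.choice((-8, 0, 8))) for w in widths]
    bits = [random.choice((4, 6, 8)) for _ in bits]
    return (widths, bits)

def evolutionary_search(population, generations=20, keep=4):
    """Predictor-guided evolution: rank by predicted accuracy, keep the
    top candidates, and refill the population with their mutations."""
    for _ in range(generations):
        population.sort(key=predicted_accuracy, reverse=True)
        parents = population[:keep]
        population = parents + [mutate(random.choice(parents))
                                for _ in range(len(population) - keep)]
    return max(population, key=predicted_accuracy)
```

Because the top candidates survive every generation, the best predicted score is non-decreasing over the search.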
Original [TinyML] NetAug: Network Augmentation for Tiny Deep Learning
NetAug: network augmentation for tiny deep learning, from ICLR 2022, MIT HAN Lab. This paper proposes network augmentation (NetAug), a new training method for improving the performance of tiny networks. Existing regularization methods (data augmentation, dropout, etc.) have succeeded on large neural networks, typically adding noise to overcome overfitting, but they turn out to degrade tiny networks. The paper argues that training tiny networks differs from training large ones: rather than augmenting the data, we should augment the network, because tiny networks have limited capacity and tend to underfit rather than overfit. NetAug augments the network (a reverse form of dropout) rather than inserting noise into the dataset or network
2022-07-13 11:18:33
1364
1
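The augment-the-network idea above can be sketched on a one-weight linear model: the tiny model is the shared weight of a wider "augmented" model, and the augmented model's gradient gives the shared weight extra supervision. All shapes, scale factors, and data here are invented for illustration; this is not the paper's implementation:

```python
import random

# Tiny model: y = a*x. Augmented model adds a second feature: y = a*x + b*x2.
# Each step updates the shared weight `a` with the tiny gradient plus a
# scaled augmented gradient; only `a` is kept at deployment time.
random.seed(0)
xs = [random.gauss(0, 1) for _ in range(64)]
x2s = [random.gauss(0, 1) for _ in range(64)]
ys = [3.0 * x + 0.1 * random.gauss(0, 1) for x in xs]

a, b = 0.0, 0.0
lr, aug_scale = 0.05, 0.5        # invented hyperparameters

def tiny_loss(a_val):
    return sum((a_val * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

loss0 = tiny_loss(a)
for _ in range(100):
    g_tiny = sum(2 * (a * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    g_aug_a = sum(2 * (a * x + b * x2 - y) * x
                  for x, x2, y in zip(xs, x2s, ys)) / len(xs)
    g_aug_b = sum(2 * (a * x + b * x2 - y) * x2
                  for x, x2, y in zip(xs, x2s, ys)) / len(xs)
    a -= lr * (g_tiny + aug_scale * g_aug_a)   # base + scaled aug gradient
    b -= lr * aug_scale * g_aug_b              # augmented-only weight
loss1 = tiny_loss(a)
```

The contrast with dropout is the point: dropout trains random sub-networks of a fixed model, while NetAug trains the fixed tiny model as a sub-network of random larger ones.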
Original [NAS] MCUNet: Tiny Deep Learning on IoT Devices
Deep learning on tiny MCU-based IoT devices is appealing but challenging, because MCU memory is 2-3 orders of magnitude smaller than a phone's. The proposed MCUNet framework combines TinyNAS, an efficient neural architecture search, with TinyEngine, a lightweight inference engine, enabling ImageNet-scale inference on MCUs. TinyNAS uses a two-stage NAS: it first optimizes the search space to satisfy the resource constraints, then designs the network architecture within the optimized space, which meets constraints on device, latency, power, and memory at low search cost. TinyEngine is a memory-efficient inference engine
2022-06-04 16:00:06
3729
1
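The two-stage search described above can be sketched as follows. The memory proxy, SRAM budget, candidate spaces, and scoring rules are all invented stand-ins for TinyNAS's actual FLOPs/memory analysis, not MCUNet's real implementation:

```python
import random

# Stage 1 picks the search-space config (width multiplier, resolution)
# whose random samples best fill the memory budget; Stage 2 then searches
# for the best architecture inside the chosen space.
PEAK_SRAM = 320 * 1024            # hypothetical MCU budget, bytes (int8)

def peak_activation(width_mult, resolution, expand):
    # crude proxy for the largest layer's activation memory
    return int(resolution * resolution * 16 * width_mult * expand)

def stage1_pick_space(spaces, n=200):
    """Score each space by the mean memory usage of its feasible random
    samples, i.e. prefer the space that best uses the budget."""
    def fill_score(space):
        w, r = space
        usage = [peak_activation(w, r, random.choice((3, 4, 6)))
                 for _ in range(n)]
        return sum(u for u in usage if u <= PEAK_SRAM) / n
    return max(spaces, key=fill_score)

def stage2_search(space):
    """Search within the chosen space, keeping only architectures that
    respect the memory constraint; model size is a toy accuracy proxy."""
    w, r = space
    feasible = [(w, r, e) for e in (3, 4, 6)
                if peak_activation(w, r, e) <= PEAK_SRAM]
    return max(feasible, key=lambda arch: arch[0] * arch[1] * arch[2],
               default=None)
```

The split matters because a bad search space caps every architecture inside it: pruning the space against the budget first means stage 2 never wastes trials on configurations that cannot fit.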
Original [NAS] Once-For-All: Train One Network and Specialize it for Efficient Deployment
OFA: Train One Network and Specialize it for Efficient Deployment. Abstract · Section I Introduction · Section II Related Work · Section III Method · Part 1 Problem formalization · Part 2 Architecture Space · Part 3 Training the Once-For-All Network · Part 4 Specialized model de
2022-05-29 11:53:44
942
Original [NAS] ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware
ProxylessNAS: searching CNN architectures with NAS directly on the target task and hardware. Abstract · Section I Introduction · Section II Related Work · Section III Method · Part 1 Construction of over-parameterized network · Part 2 Learning Binarized Path · Part 3 Handling Non-differentiable hardware metrics · Section IV Experimen
2022-05-27 19:58:28
803
Original [Transformer] TNASP: A Transformer-based NAS Predictor with a Self-evolution Framework
TNASP: a Transformer-based NAS predictor with a self-evolution framework. Abstract · Section I Introduction · Section II Related Work · Training-based network performance predictors · Training-free network performance predictors · Section III Methods · Part 1 Training-based network performance predictors · Part
2022-05-02 16:44:35
1192
2
Original [Quantization] F8Net: Fixed-Point 8-Bit Only Multiplication for Network Quantization
F8Net: a fixed-point 8-bit network quantization scheme. Abstract · Section I Introduction · Section II Related Work · Section III Analysis of Fixed-Point Representation · Part 1 Advantages of Fixed-Point Arithmetic · Part 2 Statistical Analysis for Fixed-Point Format · Part 3 Choosing optimal fixed-point format · Sect
2022-04-29 16:23:01
964
2
Original [MLP] UNeXt: MLP-based Rapid Medical Image Segmentation Network
UNeXt: an MLP-based rapid medical image segmentation network, from JHU. Abstract · Section I Introduction · Section II UNeXt · Section III Experiments and Results · Section IV Discussion · Section V Conclusion · Abstract: UNet and its Transformer variant TransUNet dominate medical image segmentation tasks, but their large parameter counts, heavy computation, and slow processing make them unsuitable for rapid image segmentation, so this paper proposes a model based on convolutions and ML
2022-04-10 20:10:02
6704
Original [Transformer] DN-DETR: Accelerate DETR Training by Introducing Query DeNoising
DN-DETR: introducing query denoising to accelerate DETR training. Abstract · Section I Introduction · Section II Related Work · Section III Why denoising accelerates DETR training? · Section IV DN-DETR · Part 1 Overview · Part 2 Intro to DAB-DETR · Part 3 Denoising · Part 4 Attention Mask · Part 5 Label Embedding · Secti
2022-04-02 15:05:11
5709
Original [Transformer] HyperTransformer: A Textural and Spectral Feature Fusion Transformer for Pansharpening
HyperTransformer: A Textural and Spectral Feature Fusion Transformer for Pansharpening. Section I Introduction · Section II Related Work · Classical approaches · ConvNet based approaches · Section III Method · Part 1 Feature Extractor for PAN and LR-HSI · Part 2 HE Texture T
2022-03-31 16:54:26
5045
Original [Transformer] Mobile-Former: Bridging MobileNet and Transformer
Mobile-Former: bridging MobileNet and Transformer, from CVPR 2022, Microsoft & USTC. Abstract · Section I Introduction · Section II Related Work · Light-weight convolutional neural networks · Combining CNNs with ViTs · Section III Our Method: Mobile-Former · Part 1 Overview · Abstract: The proposed Mobile-F
2022-03-16 09:43:47
4109
Original [Transformer] Transformers in Time Series: A Survey
Transformers for time series. Abstract · Section I Introduction · Section II Preliminaries of the Transformer · Part 1 Vanilla Transformer · Section III Transformer Models for Time Series · Section IV Network Modification for Time Series · Part 1 Positional Encoding · Part 2 Attention Module · Part 3 Architec
2022-03-10 17:27:25
3274
Original [Transformer] A Multi-Branch Hybrid Transformer Network for Corneal Endothelial Cell Segmentation
A multi-branch hybrid Transformer network for corneal endothelial cell segmentation. Abstract · Section I Introduction · Section II Method · Part 1 Residual Transformer Block · Part 2 Body, Edge, Final Branches · Part 3 Loss Function · Section III Experiments · Part 1 Datasets · Part 2 Comparison with SOTA methods · Part 3 Ablation Stud
2022-03-03 16:16:19
1685
1
Original [Transformer] Lesion-Aware Transformers for Diabetic Retinopathy Grading
Lesion-Aware Transformers for Diabetic Retinopathy Grading. Abstract · Section I Introduction · Section II Related Work · Section III Lesion-Aware Transformer Network · Part 1 Overview · Part 2 Pixel Relation based Encoder · Part 3 Lesion Filter based Decoder · Part 4 DR Grading
2022-03-02 15:24:03
3663
1
Original [NAS] NSE: Evolving Search Space for Neural Architecture Search
NSE: neural architecture search with an evolving search space. Abstract · Section I Introduction · Section II Related Work · Part 1 NAS algorithm design · Part 2 Search Space Design · Section III Method · Problem Formulation · Part 1 Overview · Part 2 Supernet Training · Part 3 Search Space Evolution · Section IV Experimental Results · Part 1 Se
2022-02-28 16:38:23
1197
Original [Transformer] TransClaw U-Net: Claw UNet with Transformers for Medical Image Segmentation
TransClaw UNet: Claw UNet combined with Transformers for medical image segmentation. Abstract · Section I Introduction · Section II Related Works · Section III Method · Part 1 Overall Structure · Part 2 Transformer in TransClaw UNet · Section IV Experiments · Comparison Results · Ablation Experiments · Section V Conclusi
2022-02-21 10:08:58
1263
Original [Transformer] Evo-ViT: Slow-Fast Token Evolution for Dynamic Vision Transformer
Evo-ViT: self-motivated slow-fast token evolution for dynamically accelerating Vision Transformers. Abstract · Section I Introduction · Section II Related Work · Section III Preliminaries · Section IV Methodology · Part 1 Structure preserving token selection · Part 2 Slow-fast updating · Part 3 Training Strategies · Section V Experim
2022-01-13 15:27:56
1268
Original [Transformer] Efficient Training of Visual Transformers with Small Datasets
Efficiently training Visual Transformers with small datasets. Abstract · Section I Introduction · Section II Related Work · Section III Preliminaries · Section IV Dense relative localization task · Section V Experiments · Part 1 Ablation Study · Part 2 Training from scratch · Part 3 Fine-tuning · Section VI Conclusi
2022-01-11 14:46:42
2997
Original [Transformer] U2Former: A Nested U-shaped Transformer for Image Restoration
U2Former: a nested U-shaped Transformer for image restoration. Abstract · Section I Introduction · Section II Related Work · Section III Method · Section IV Experiments · Part 1 Ablation Study · Part 2 Experiment on Image Deraining · Part 3 Experiments on Image Dehazing · Section V Conclusion · Paper · Abstract: Although
2022-01-09 14:59:47
3710
1
Original [Transformer] On the Relationship between Self-Attention and Convolutional Layers
The relationship between convolution and self-attention: MHSA can express any convolution operation. Abstract · Section I Introduction · Section II Background of Attention Mechanisms for Vision · Part 1 Multi-Head Self-Attention Layer · Part 2 Attention for Images · Part 3 Positional Encoding for Images · Section III Self-Attention as a Convoluti
2022-01-06 12:01:10
2462
Original [Transformer] Vision Transformer for Small-Size Datasets
ViT for small-size datasets. Abstract · Section I Introduction · Section II Related Work · Section III Proposed Method · Part 1 Preliminary · Part 2 SPT · Part 3 LSA · Section IV Experiment · Part 1 ImageNet Classification · Part 2 Ablation Study · Section V Conclusion · Supplementary · Abstract: ViT is widely used for image classification and other vis
2022-01-05 17:21:18
4693
Original [NAS] MixPath: A Unified Approach for One-shot Neural Architecture Search
MixPath: a NAS method that searches multiple paths in one shot. Abstract · Section I Introduction · Section II Related Work · Section III MixPath · Part 1 Motivation · Part 2 Regularization Statistics with Shadow Batch Normalization · Part 3 Neural Architecture Search with MixPath Supernet · Section IV Experime
2022-01-02 20:08:01
1189
Original [Transformer] CvT: Introducing Convolutions to Vision Transformers
CvT: introducing convolutions into Transformers. Abstract · Section I Introduction · Section II Related Work · Section III Convolutional vision Transformer · Part 1 Convolutional Token Embedding · Part 2 Convolutional Projection for Attention · Part 3 Efficiency Considerations · Part 4 Methodological Discuss
2021-12-29 10:59:35
4278
2
Original [Transformer] BoTNet: Bottleneck Transformers for Visual Recognition
BoTNet: Bottleneck Transformers for Visual Recognition. Abstract · Section I · Section II Related Work · Section III Method · Section IV Experiments · Part 1 Instance Segmentation · Part 2 Relative Position Encoding · Part 3 BoTNet scales well with larger images · Part 4 Image Clas
2021-12-23 11:26:55
2787
Original [NAS+Transformer] GLiT: Neural Architecture Search for Global and Local Image Transformer
GLiT: NAS over local and global Transformers. Abstract · Section I Introduction · Section II Related work · Section III Method · Part 1 Global-Local block · Part 2 Search Space of the global-local block · Part 3 Hierarchical Neural Architecture Search · Section IV Experiments · Part 1 Results on Im
2021-12-21 15:26:00
3369
Original [Transformer] SPViT: Pruning Self-attentions into Convolutional Layers in Single Path
SPViT: pruning the MSA in Transformers into convolutions. Abstract · Section I Introduction · Section II Related Work · Section III Method · Part 1 Weight-sharing between MSA and convolutional operations · Part 2 Single Path vision transformer pruning · Part 1 Main results · Part 2 Observations on searched
2021-12-19 11:43:27
1111
Original [Transformer] MT-UNet: Mixed Transformer UNet for Medical Image Segmentation
MT-UNet: a mixed Transformer model for medical image segmentation. Abstract · Section I Introduction · Section II Methods · Part 1 Overall Structure Design · Part 2 Mixed Transformer Module · Part 3 Local-Global Gaussian-Weighted Self-Attention · Part 4 External Attention · Section III Experiments · Part 1 Ablation St
2021-12-17 11:32:51
3689
Original [Transformer] MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer
MobileViT: a light-weight, general-purpose vision Transformer for mobile devices. Abstract · Section I Introduction · Section II Related Work · Section III MobileViT: A light-weight Transformer · Part 1 MobileViT · Part 2 Multi-scale Sampler for Training Efficiency · Section IV Experiment Results · Part 1 Image Classific
2021-12-15 20:28:57
4094
Original [Transformer] On the Integration of Self-Attention and Convolution
On the Integration of Self-Attention and Convolution: fusing convolution and self-attention. Abstract · Section I Introduction · Section II Related Work · Part 1 Self-Attention only · Part 2 Attention enhanced Convolution · Part 3 Convolution enhanced Attention · Section III Revisiting Convolution and Sel
2021-12-14 20:18:41
4552
Original [Transformer] MViTv1: Multiscale Vision Transformers
MViT: Multiscale Vision Transformers. Abstract · Section I Introduction · Section II Related Work · Section III Multiscale Vision Transformer · Part 1 Multi Head Pooling Attention · Part 2 Multi-scale Transformer Networks · Part 3 Network instantiation details · Section IV Experiments: Video Recogniti
2021-12-13 15:20:29
4244
Original [Transformer] CoAtNet: Marrying Convolution and Attention for All Data Sizes
CoAtNet: convolution plus attention for classification at any data size. Abstract · Section I Introduction · Section II Model · Part 1 Merging Convolution and Self-Attention · Part 2 Vertical Layout Design · Section III Related Work · Section IV Experiments · Part 1 Main Results · Part 2 Ablation Studies · Section V Conclusion · fr
2021-12-10 17:19:36
2514
Original [Transformer] MViTv2: Improved Multiscale Vision Transformers for Classification and Detection
MViTv2: improved multiscale Transformers for classification and detection. Abstract · Section I Introduction · Section II Related Work · Section III Revisiting Multiscale Vision Transformer · Section IV Improved MViT · Part 1 Improved Pooling Attention · Part 2 MViT for Object Detection · Part 3 MViT for Video Recognitio
2021-12-09 11:16:51
4958
Original [Transformer] AutoFormerV2: Searching the Search Space of Vision Transformer
AutoFormerV2: searching the search space of Vision Transformer. Abstract · Section I Introduction · Section II Approach · Part 1 Problem Formulation · Part 2 Basic Search Space · Part 3 Searching the Search Space · Part 4 Searching in the Searched Space · Section III Analysis and Discussion · Section IV E
2021-12-06 15:11:54
2329
Original [Transformer] Is it Time to Replace CNNs with Transformers for Medical Images?
Can Transformers replace CNNs for medical images yet? Abstract · Section II Related Work · Section III Methods · Section IV Experiments · Are randomly initialized transformers useful? · Does pretraining transformers on ImageNet work in the medical domain? · Do transformers benefit from self-supervised i
2021-12-03 11:51:05
862
Original [Transformer] CAT: Cross Attention in Vision Transformer
CAT: cross attention for Vision Transformers. Abstract · Section I Introduction · Section II Related Work · Section III Method · Part 1 Overall architecture · Part 2 Inner-Patch Self-Attention Block · Part 3 Cross-Patch Self-Attention · Part 4 Cross Attention based Transformer · Section IV Experi
2021-12-01 12:00:39
5185
1
Original [Transformer] CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows
CSWin Transformer: a general vision Transformer backbone built on cross-shaped windows. Abstract · Section I Introduction · Section II Related Work · Section III Method · Part 1 Overall Architecture · Part 2 Cross-Shaped Window Self-Attention · Section IV Experiments · Part 1 ImageNet-1K Classification · Part 2 COCO Detect
2021-11-30 14:54:36
1494
Original [Transformer] Patches Are All You Need?
Patches Are All You Need?, an ICLR 2022 under-review paper (code available). Abstract · Section I Introduction · Section II A simple Model: ConvMixer · Section III Experiments · Section IV Related Work · Section V Conclusion · Abstract: Although CNNs have been the dominant architecture for vision tasks for many years, recent experiments show that Transformer-based models, especially
2021-11-29 15:12:16
2865
Original [Transformer] Eformer: Edge Enhancement based Transformer for Medical Image Denoising
Eformer: an edge-enhancement-based Transformer for medical image denoising. Abstract · Section I Introduction · Section II Related Work · Section III Our Approach · Part 1 Sobel-Feldman Operator · Part 2 Transformer based Encoder-Decoder · Part 3 Upsampling and downsampling · Part 4 Residual learning · Part 5 Optimization · Part 6 Overall Network Architec
2021-11-28 14:46:30
2674
第三章+傅里叶变换.pptx
2020-04-28