[Deep Learning] A Collection of Open-Source Machine Vision Code

1. Feature Extraction:

 

2. Image Segmentation:

  • Normalized Cut [1] [Matlab code]
  • Greg Mori’s Superpixel code [2] [Matlab code]
  • Efficient Graph-based Image Segmentation [3] [C++ code] [Matlab wrapper]
  • Mean-Shift Image Segmentation [4] [EDISON C++ code] [Matlab wrapper]
  • OWT-UCM Hierarchical Segmentation [5] [Resources]
  • TurboPixels [6] [Matlab code 32bit] [Matlab code 64bit] [Updated code]
  • Quick-Shift [7] [VLFeat]
  • SLIC Superpixels [8] [Project]
  • Segmentation by Minimum Code Length [9] [Project]
  • Biased Normalized Cut [10] [Project]
  • Segmentation Tree [11-12] [Project]
  • Entropy Rate Superpixel Segmentation [13] [Code]
  • Fast Approximate Energy Minimization via Graph Cuts[Paper][Code]
  • Efficient Planar Graph Cuts with Applications in Computer Vision[Paper][Code]
  • Isoperimetric Graph Partitioning for Image Segmentation[Paper][Code]
  • Random Walks for Image Segmentation[Paper][Code]
  • Blossom V: A new implementation of a minimum cost perfect matching algorithm[Code]
  • An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Computer Vision[Paper][Code]
  • Geodesic Star Convexity for Interactive Image Segmentation[Project]
  • Contour Detection and Image Segmentation Resources[Project][Code]
  • Biased Normalized Cuts[Project]
  • Max-flow/min-cut[Project]
  • Chan-Vese Segmentation using Level Set[Project]
  • A Toolbox of Level Set Methods[Project]
  • Re-initialization Free Level Set Evolution via Reaction Diffusion[Project]
  • Improved C-V active contour model[Paper][Code]
  • A Variational Multiphase Level Set Approach to Simultaneous Segmentation and Bias Correction[Paper][Code]
  • Level Set Method Research by Chunming Li[Project]
  • ClassCut for Unsupervised Class Segmentation[code]
  • SEEDS: Superpixels Extracted via Energy-Driven Sampling [Project][other]

 

3. Object Detection:

  • A simple object detector with boosting [Project]
  • INRIA Object Detection and Localization Toolkit [1] [Project]
  • Discriminatively Trained Deformable Part Models [2] [Project]
  • Cascade Object Detection with Deformable Part Models [3] [Project]
  • Poselet [4] [Project]
  • Implicit Shape Model [5] [Project]
  • Viola and Jones’s Face Detection [6] [Project]
  • Bayesian Modelling of Dynamic Scenes for Object Detection[Paper][Code]
  • Hand detection using multiple proposals[Project]
  • Color Constancy, Intrinsic Images, and Shape Estimation[Paper][Code]
  • Discriminatively trained deformable part models[Project]
  • Gradient Response Maps for Real-Time Detection of Texture-Less Objects: LineMOD [Project]
  • Image Processing On Line[Project]
  • Robust Optical Flow Estimation[Project]
  • Where's Waldo: Matching People in Images of Crowds[Project]
  • Scalable Multi-class Object Detection[Project]
  • Class-Specific Hough Forests for Object Detection[Project]
  • Deformed Lattice Detection In Real-World Images[Project]

 

4. Saliency Detection:

  • Itti, Koch, and Niebur’s saliency detection [1] [Matlab code]
  • Frequency-tuned salient region detection [2] [Project]
  • Saliency detection using maximum symmetric surround [3] [Project]
  • Attention via Information Maximization [4] [Matlab code]
  • Context-aware saliency detection [5] [Matlab code]
  • Graph-based visual saliency [6] [Matlab code]
  • Saliency detection: A spectral residual approach. [7] [Matlab code]
  • Segmenting salient objects from images and videos. [8] [Matlab code]
  • Saliency Using Natural statistics. [9] [Matlab code]
  • Discriminant Saliency for Visual Recognition from Cluttered Scenes. [10] [Code]
  • Learning to Predict Where Humans Look [11] [Project]
  • Global Contrast based Salient Region Detection [12] [Project]
  • Bayesian Saliency via Low and Mid Level Cues[Project]
  • Top-Down Visual Saliency via Joint CRF and Dictionary Learning[Paper][Code]
  • Saliency Detection: A Spectral Residual Approach[Code]

 

5. Image Classification and Clustering

  • Pyramid Match [1] [Project]
  • Spatial Pyramid Matching [2] [Code]
  • Locality-constrained Linear Coding [3] [Project] [Matlab code]
  • Sparse Coding [4] [Project] [Matlab code]
  • Texture Classification [5] [Project]
  • Multiple Kernels for Image Classification [6] [Project]
  • Feature Combination [7] [Project]
  • SuperParsing [Code]
  • Large Scale Correlation Clustering Optimization[Matlab code]
  • Detecting and Sketching the Common[Project]
  • Self-Tuning Spectral Clustering[Project][Code]
  • User Assisted Separation of Reflections from a Single Image Using a Sparsity Prior[Paper][Code]
  • Filters for Texture Classification[Project]
  • Multiple Kernel Learning for Image Classification[Project]
  • SLIC Superpixels[Project]

 

6. Image Matting

  • A Closed Form Solution to Natural Image Matting [Code]
  • Spectral Matting [Project]
  • Learning-based Matting [Code]

 

7. Object Tracking:

  • A Forest of Sensors - Tracking Adaptive Background Mixture Models [Project]
  • Object Tracking via Partial Least Squares Analysis[Paper][Code]
  • Robust Object Tracking with Online Multiple Instance Learning[Paper][Code]
  • Online Visual Tracking with Histograms and Articulating Blocks[Project]
  • Incremental Learning for Robust Visual Tracking[Project]
  • Real-time Compressive Tracking[Project]
  • Robust Object Tracking via Sparsity-based Collaborative Model[Project]
  • Visual Tracking via Adaptive Structural Local Sparse Appearance Model[Project]
  • Online Discriminative Object Tracking with Local Sparse Representation[Paper][Code]
  • Superpixel Tracking[Project]
  • Learning Hierarchical Image Representation with Sparsity, Saliency and Locality[Paper][Code]
  • Online Multiple Support Instance Tracking [Paper][Code]
  • Visual Tracking with Online Multiple Instance Learning[Project]
  • Object detection and recognition[Project]
  • Compressive Sensing Resources[Project]
  • Robust Real-Time Visual Tracking using Pixel-Wise Posteriors[Project]
  • Tracking-Learning-Detection[Project][OpenTLD/C++ Code]
  • HandVu: vision-based hand gesture interface[Project]
  • Learning Probabilistic Non-Linear Latent Variable Models for Tracking Complex Activities[Project]

 

8. Kinect:

 

9. 3D-Related:

  • 3D Reconstruction of a Moving Object[Paper] [Code]
  • Shape From Shading Using Linear Approximation[Code]
  • Combining Shape from Shading and Stereo Depth Maps[Project][Code]
  • Shape from Shading: A Survey[Paper][Code]
  • A Spatio-Temporal Descriptor based on 3D Gradients (HOG3D)[Project][Code]
  • Multi-camera Scene Reconstruction via Graph Cuts[Paper][Code]
  • A Fast Marching Formulation of Perspective Shape from Shading under Frontal Illumination[Paper][Code]
  • Reconstruction:3D Shape, Illumination, Shading, Reflectance, Texture[Project]
  • Monocular Tracking of 3D Human Motion with a Coordinated Mixture of Factor Analyzers[Code]
  • Learning 3-D Scene Structure from a Single Still Image[Project]

 

10. Machine Learning Algorithms:

  • Matlab class for computing Approximate Nearest Neighbor (ANN) [Matlab class providing interface to ANN library]
  • Random Sampling[code]
  • Probabilistic Latent Semantic Analysis (pLSA)[Code]
  • FASTANN and FASTCLUSTER for approximate k-means (AKM)[Project]
  • Fast Intersection / Additive Kernel SVMs[Project]
  • SVM[Code]
  • Ensemble learning[Project]
  • Deep Learning[Net]
  • Deep Learning Methods for Vision[Project]
  • Neural Network for Recognition of Handwritten Digits[Project]
  • Training a deep autoencoder or a classifier on MNIST digits[Project]
  • THE MNIST DATABASE of handwritten digits[Project]
  • Ersatz:deep neural networks in the cloud[Project]
  • Deep Learning [Project]
  • sparseLM : Sparse Levenberg-Marquardt nonlinear least squares in C/C++[Project]
  • Weka 3: Data Mining Software in Java[Project]
  • Invited talk "A Tutorial on Deep Learning" by Dr. Kai Yu (余凯)[Video]
  • CNN - Convolutional neural network class[Matlab Tool]
  • Yann LeCun's Publications[Website]
  • LeNet-5, convolutional neural networks[Project]
  • Home page of deep learning pioneer Geoffrey E. Hinton[Website]
  • Multiple Instance Logistic Discriminant-based Metric Learning (MildML) and Logistic Discriminant-based Metric Learning (LDML)[Code]
  • Sparse coding simulation software[Project]
  • Visual Recognition and Machine Learning Summer School[Software]

 

11. Object and Action Recognition:

  • Action Recognition by Dense Trajectories[Project][Code]
  • Action Recognition Using a Distributed Representation of Pose and Appearance[Project]
  • Recognition Using Regions[Paper][Code]
  • 2D Articulated Human Pose Estimation[Project]
  • Fast Human Pose Estimation Using Appearance and Motion via Multi-Dimensional Boosting Regression[Paper][Code]
  • Estimating Human Pose from Occluded Images[Paper][Code]
  • Quasi-dense wide baseline matching[Project]
  • ChaLearn Gesture Challenge: Principal motion: PCA-based reconstruction of motion histograms[Project]
  • Real Time Head Pose Estimation with Random Regression Forests[Project]
  • 2D Action Recognition Serves 3D Human Pose Estimation[Project]
  • A Hough Transform-Based Voting Framework for Action Recognition[Project]
  • Motion Interchange Patterns for Action Recognition in Unconstrained Videos[Project]
  • 2D articulated human pose estimation software[Project]
  • Learning and detecting shape models [code]
  • Progressive Search Space Reduction for Human Pose Estimation[Project]
  • Learning Non-Rigid 3D Shape from 2D Motion[Project]

 

12. Image Processing:

  • Distance Transforms of Sampled Functions[Project]
  • The Computer Vision Homepage[Project]
  • Efficient appearance distances between windows[code]
  • Image Exploration algorithm[code]
  • Motion Magnification [Project]
  • Bilateral Filtering for Gray and Color Images [Project]
  • A Fast Approximation of the Bilateral Filter using a Signal Processing Approach [Project]

 

13. Useful Tools:

  • EGT: a Toolbox for Multiple View Geometry and Visual Servoing[Project] [Code]
  • A development kit of Matlab MEX functions for the OpenCV library[Project]
  • Fast Artificial Neural Network Library[Project]

 

14. Hand and Fingertip Detection and Recognition:

  • finger-detection-and-gesture-recognition [Code]
  • Hand and Finger Detection using JavaCV[Project]
  • Hand and fingers detection[Code]

 

15. Scene Interpretation:

  • Nonparametric Scene Parsing via Label Transfer [Project]

 

16. Optical Flow:

  • High accuracy optical flow using a theory for warping [Project]
  • Dense Trajectories Video Description [Project]
  • SIFT Flow: Dense Correspondence across Scenes and its Applications[Project]
  • KLT: An Implementation of the Kanade-Lucas-Tomasi Feature Tracker [Project]
  • Tracking Cars Using Optical Flow[Project]
  • Secrets of optical flow estimation and their principles[Project]
  • Implementation of the Black and Anandan dense optical flow method[Project]
  • Optical Flow Computation[Project]
  • Beyond Pixels: Exploring New Representations and Applications for Motion Analysis[Project]
  • A Database and Evaluation Methodology for Optical Flow[Project]
  • Optical flow related resources[Project]
  • Robust Optical Flow Estimation [Project]
  • optical flow[Project]

 

17. Image Retrieval

  • Semi-Supervised Distance Metric Learning for Collaborative Image Retrieval [Paper][code]

 

18. Markov Random Fields:

  • Markov Random Fields for Super-Resolution [Project]
  • A Comparative Study of Energy Minimization Methods for Markov Random Fields with Smoothness-Based Priors [Project]

 

19. Motion Detection:

  • Moving Object Extraction, Using Models or Analysis of Regions [Project]
  • Background Subtraction: Experiments and Improvements for ViBe [Project]
  • A Self-Organizing Approach to Background Subtraction for Visual Surveillance Applications [Project]
  • changedetection.net: A new change detection benchmark dataset[Project]
  • ViBe - a powerful technique for background detection and subtraction in video sequences[Project]
  • Background Subtraction Program[Project]
  • Motion Detection Algorithms[Project]
  • Stuttgart Artificial Background Subtraction Dataset[Project]
  • Object Detection, Motion Estimation, and Tracking[Project]

 

Feature Detection and Description

General Libraries: 

  • VLFeat – Implementation of various feature descriptors (including SIFT, HOG, and LBP) and covariant feature detectors (including DoG, Hessian, Harris Laplace, Hessian Laplace, Multiscale Hessian, Multiscale Harris). Easy-to-use Matlab interface. See Modern features: Software – Slides providing a demonstration of VLFeat and also links to other software. Check also VLFeat hands-on session training
  • OpenCV – Various implementations of modern feature detectors and descriptors (SIFT, SURF, FAST, BRIEF, ORB, FREAK, etc.)

 

Fast Keypoint Detectors for Real-time Applications: 

  • FAST – High-speed corner detector implementation for a wide variety of platforms
  • AGAST – Even faster than the FAST corner detector (ECCV 2010). A multi-scale version of this method is used for the BRISK descriptor.

 

Binary Descriptors for Real-Time Applications: 

  • BRIEF – C++ code for a fast and accurate interest point descriptor (not invariant to rotations and scale) (ECCV 2010)
  • ORB – OpenCV implementation of the Oriented FAST and Rotated BRIEF (ORB) descriptor (invariant to rotations, but not scale)
  • BRISK – Efficient Binary descriptor invariant to rotations and scale. It includes a Matlab mex interface. (ICCV 2011)
  • FREAK – Faster than BRISK (invariant to rotations and scale) (CVPR 2012)
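
Binary descriptors like the ones above are usually compared with the Hamming distance (the number of differing bits), which is a large part of why they suit real-time use. Below is a minimal plain-Python sketch of brute-force matching. It assumes each descriptor is stored as a `bytes` object of equal length; the names `hamming` and `match` and the `max_distance` threshold are illustrative and do not come from any of the listed libraries.

```python
def hamming(a: bytes, b: bytes) -> int:
    """Count the differing bits between two equal-length binary descriptors."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def match(queries, train, max_distance=64):
    """Brute-force nearest-neighbour matching under the Hamming distance.

    Returns a list of (query_index, train_index, distance) for every query
    whose closest train descriptor is within max_distance bits.
    """
    train = list(train)
    if not train:
        return []
    matches = []
    for qi, q in enumerate(queries):
        # Find the train descriptor with the fewest differing bits.
        best_ti, best_d = min(
            ((ti, hamming(q, t)) for ti, t in enumerate(train)),
            key=lambda pair: pair[1],
        )
        if best_d <= max_distance:
            matches.append((qi, best_ti, best_d))
    return matches
```

In practice a library matcher would replace the inner loop; the sketch only shows what is being computed.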

 

SIFT and SURF Implementations: 

 

Other Local Feature Detectors and Descriptors: 

  • VGG Affine Covariant features – Oxford code for various affine covariant feature detectors and descriptors.
  • LIOP descriptor – Source code for the Local Intensity order Pattern (LIOP) descriptor (ICCV 2011).
  • Local Symmetry Features – Source code for matching of local symmetry features under large variations in lighting, age, and rendering style (CVPR 2012).

 

Global Image Descriptors: 

  • GIST – Matlab code for the GIST descriptor
  • CENTRIST – Global visual descriptor for scene categorization and object detection (PAMI 2011)

 

Feature Coding and Pooling 

  • VGG Feature Encoding Toolkit – Source code for various state-of-the-art feature encoding methods – including Standard hard encoding, Kernel codebook encoding, Locality-constrained linear encoding, and Fisher kernel encoding.
  • Spatial Pyramid Matching – Source code for feature pooling based on spatial pyramid matching (widely used for image classification)
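
The pooling step behind spatial pyramid matching can be sketched as follows: count visual words in progressively finer grids over the image and concatenate the per-cell histograms. This is a simplified plain-Python illustration, not the listed source code. It assumes each local feature has already been quantized to a visual-word index and carries a normalized position in [0, 1), and it omits the per-level weights used in the full method; the function name is hypothetical.

```python
def spatial_pyramid_histogram(features, vocab_size, levels=2):
    """Concatenate visual-word histograms over a spatial pyramid.

    features:   iterable of (x, y, word) with x, y in [0, 1) and
                0 <= word < vocab_size (assumed input format).
    levels:     highest pyramid level; level l uses a 2**l x 2**l grid.
    """
    features = list(features)
    pyramid = []
    for level in range(levels + 1):
        cells = 2 ** level
        hist = [0] * (cells * cells * vocab_size)
        for x, y, word in features:
            # Locate the grid cell, clamping positions that sit on the border.
            cx = min(int(x * cells), cells - 1)
            cy = min(int(y * cells), cells - 1)
            hist[(cy * cells + cx) * vocab_size + word] += 1
        pyramid.extend(hist)
    return pyramid
```

With a vocabulary of size V and levels 0..L, the result has V * (1 + 4 + ... + 4**L) entries, which is the vector typically fed to a classifier.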

 

Convolutional Nets and Deep Learning 

  • EBLearn – C++ Library for Energy-Based Learning. It includes several demos and step-by-step instructions to train classifiers based on convolutional neural networks.
  • Torch7 – Provides a matlab-like environment for state-of-the-art machine learning algorithms, including a fast implementation of convolutional neural networks.
  • Deep Learning - Various links for deep learning software.

 

Part-Based Models 

 

Attributes and Semantic Features 

 

Large-Scale Learning 

  • Additive Kernels – Source code for fast additive kernel SVM classifiers (PAMI 2013).
  • LIBLINEAR – Library for large-scale linear SVM classification.
  • VLFeat – Implementation for Pegasos SVM and Homogeneous Kernel map.
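
An additive kernel is one that decomposes into a sum of per-dimension terms. As a rough illustration, here are two widely used additive kernels for non-negative histogram features, written in plain Python. The function names are illustrative and are not the API of the packages listed above.

```python
def intersection_kernel(x, y):
    """Histogram intersection kernel: sum of element-wise minima."""
    return sum(min(a, b) for a, b in zip(x, y))


def chi2_kernel(x, y):
    """Additive chi-squared kernel: sum of 2*a*b / (a + b) per dimension.

    Dimensions where both entries are zero contribute nothing.
    """
    return sum(2.0 * a * b / (a + b) for a, b in zip(x, y) if a + b > 0)
```

For L1-normalized histograms both kernels return 1 when the two inputs are identical and 0 when they do not overlap at all.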

 

Fast Indexing and Image Retrieval 

  • FLANN – Library for performing fast approximate nearest neighbor.
  • Kernelized LSH – Source code for Kernelized Locality-Sensitive Hashing (ICCV 2009).
  • ITQ Binary codes – Code for generation of small binary codes using Iterative Quantization and other baselines such as Locality-Sensitive-Hashing (CVPR 2011).
  • INRIA Image Retrieval – Efficient code for state-of-the-art large-scale image retrieval (CVPR 2011).
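
The hashing entries above share one idea: map each feature vector to a short binary code so that similar vectors tend to get similar codes. The sketch below shows only the simplest baseline, random-hyperplane locality-sensitive hashing, in plain Python. It is not Kernelized LSH or Iterative Quantization, and the function names and seed handling are assumptions made for illustration.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplane normals with Gaussian entries."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def lsh_code(vector, hyperplanes):
    """Pack one bit per hyperplane: 1 if the vector lies on its positive side."""
    code = 0
    for plane in hyperplanes:
        dot = sum(v * w for v, w in zip(vector, plane))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code
```

Two codes can then be compared by counting the set bits of their XOR, for example `bin(a ^ b).count("1")`; vectors separated by a small angle are expected to differ in few bits.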

 

Object Detection 

 

3D Recognition 

 

Action Recognition 


 

Datasets

 

Attributes 

  • Animals with Attributes – 30,475 images of 50 animals classes with 6 pre-extracted feature representations for each image.
  • aYahoo and aPascal – Attribute annotations for images collected from Yahoo and Pascal VOC 2008.
  • FaceTracer – 15,000 faces annotated with 10 attributes and fiducial points.
  • PubFig – 58,797 face images of 200 people with 73 attribute classifier outputs.
  • LFW – 13,233 face images of 5,749 people with 73 attribute classifier outputs.
  • Human Attributes – 8,000 people with annotated attributes. Check also this link for another dataset of human attributes.
  • SUN Attribute Database – Large-scale scene attribute database with a taxonomy of 102 attributes.
  • ImageNet Attributes – Variety of attribute labels for the ImageNet dataset.
  • Relative attributes – Data for OSR and a subset of PubFig datasets. Check also this link for the WhittleSearch data.
  • Attribute Discovery Dataset – Images of shopping categories associated with textual descriptions.

 

Fine-grained Visual Categorization 

 

Face Detection 

  • FDDB – UMass face detection dataset and benchmark (5,000+ faces)
  • CMU/MIT – Classical face detection dataset.

 

Face Recognition 

  • Face Recognition Homepage – Large collection of face recognition datasets.
  • LFW – UMass unconstrained face recognition dataset (13,000+ face images).
  • NIST Face Homepage – includes face recognition grand challenge (FRGC), vendor tests (FRVT) and others.
  • CMU Multi-PIE – contains more than 750,000 images of 337 people, with 15 different views and 19 lighting conditions.
  • FERET – Classical face recognition dataset.
  • Deng Cai’s face dataset in Matlab Format – Easy to use if you want to play with simple face datasets including Yale, ORL, PIE, and Extended Yale B.
  • SCFace – Low-resolution face dataset captured from surveillance cameras.

 

Handwritten Digits 

  • MNIST – large dataset containing a training set of 60,000 examples, and a test set of 10,000 examples.

 

Pedestrian Detection

 

Generic Object Recognition 

  • ImageNet – Currently the largest visual recognition dataset in terms of number of categories and images.
  • Tiny Images – 80 million 32x32 low resolution images.
  • Pascal VOC – One of the most influential visual recognition datasets.
  • Caltech 101 / Caltech 256 – Popular image datasets containing 101 and 256 object categories, respectively.
  • MIT LabelMe – Online annotation tool for building computer vision databases.

 

Scene Recognition

 

Feature Detection and Description 

 

Action Recognition

 

RGBD Recognition 
