[Deep Learning Paper Notes] Convolutional Neural Networks and Their Applications

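This article discusses the challenges of computer vision, in particular the difficulty of object recognition, and introduces convolutional neural networks (CNNs) as the current state-of-the-art solution. It outlines the progress of CNNs on tasks such as image classification, object localization, detection, and segmentation, and lists related research from several top academic conferences and journals.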


In artificial intelligence there is a well-known observation called Moravec's paradox [1]: "High-level reasoning requires very little computation, but low-level sensorimotor skills require enormous computational resources." It is comparatively easy to make computers exhibit adult-level performance on intelligence tests or at playing checkers, but difficult or impossible to give them the skills of a one-year-old when it comes to perception and mobility.


Computer vision is one such low-level sensorimotor skill. Recognizing an object is trivial for humans, but it is quite hard for computers because of the semantic gap: a computer sees an image only as a 3D array (height, width, color channels) of integers from 0 to 255. It is hard to write an explicit algorithm that lets a computer identify an object from such an array of numbers. Therefore, inspired by the human learning process, we instead provide the computer with many examples of each class and let it learn from the data. This is called the data-driven approach.
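
To make the two ideas above concrete, here is a minimal sketch in plain Python. It is not a CNN: as a deliberately simple stand-in for "learning from examples" it uses a 1-nearest-neighbor rule, and the tiny 2 x 2 images, the helper names (`solid_image`, `flatten`, `l1_distance`, `predict`) and the labels are all hypothetical, chosen only for illustration.

```python
# A tiny "image": height x width x 3 color channels, each value an integer in 0..255.
# Real images have the same structure, just with many more pixels.

def solid_image(r, g, b, height=2, width=2):
    """Build a height x width x 3 nested list filled with one RGB color."""
    return [[[r, g, b] for _ in range(width)] for _ in range(height)]

def flatten(image):
    """Unroll the 3D array into one flat list of integers."""
    return [value for row in image for pixel in row for value in pixel]

def l1_distance(image_a, image_b):
    """Sum of absolute per-value differences between two same-sized images."""
    return sum(abs(a - b) for a, b in zip(flatten(image_a), flatten(image_b)))

def predict(train_set, query):
    """Return the label of the training image closest to the query (1-nearest-neighbor)."""
    _, label = min(train_set, key=lambda pair: l1_distance(pair[0], query))
    return label

# Hypothetical labelled examples: the "data" in the data-driven approach.
train_set = [
    (solid_image(250, 10, 10), "red"),
    (solid_image(10, 10, 250), "blue"),
]

query = solid_image(200, 40, 30)   # a reddish image that is not in the training set
print(flatten(query)[:6])          # what the computer "sees": just integers
print(predict(train_set, query))   # label copied from the closest training example
```

No rule describing "red" is written anywhere in the code; the label comes entirely from the labelled examples. A CNN follows the same data-driven principle but learns far richer features than raw pixel distances.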


The convolutional neural network (CNN) is the state-of-the-art approach to object recognition, and it has greatly advanced performance on many computer vision tasks. To gain a deep understanding of CNNs and to find ideas for cutting-edge research, I think the most fundamental and effective way is to read recent CNN publications from top-tier vision conferences and journals. I therefore decided to write a note recording the basic ideas of those publications and my understanding of them. At present, this note covers around 60 papers from ICCV, ECCV, CVPR, NIPS, ICML, ICLR, and other venues. The content covers basic topics in computer vision, including image classification, object localization, object detection, object segmentation, image and language, video classification, and GANs.


I would like to acknowledge the following sources for providing excellent materials on CNNs and deep learning.

• Andrew Ng et al. “UFLDL: Deep Learning Tutorial.” Stanford.
• Fei-Fei Li, Andrej Karpathy, and Justin Johnson. “cs231n: Convolutional Neural Networks for Visual Recognition.” Stanford.
• Andrea Vedaldi, Andrew Zisserman. “VGG Convolutional Neural Networks Practical.” Oxford Visual Geometry Group.
• Ian Goodfellow, Aaron Courville, and Yoshua Bengio. “Deep Learning.” Book in preparation for MIT Press. 2015.
• Jianxin Wu. “Introduction to Convolutional Neural Networks.” Nanjing University.


This note is still being updated. If you have any questions or suggestions, please feel free to contact me by email.


The PDF file can be downloaded here.


1 https://en.wikipedia.org/wiki/Moravec’s_paradox.
