Eight High-Quality Datasets for Machine Learning

Latest recommended article published 2025-07-30 01:00:00

haoji007

License: CC 4.0 BY-SA

Original link: https://www.cs.toronto.edu/~kriz/cifar.html

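This article lists eight high-quality machine learning datasets, including CIFAR-10, ImageNet, and COCO, together with ten groups of high-quality TensorFlow resources, such as tutorials, models/projects, and related libraries, to help learners understand and apply machine learning and deep learning.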

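Eight high-quality machine learning datasets:

1. CIFAR-10 & CIFAR-100

CIFAR-10 contains 60,000 32x32 color images in 10 classes: 50,000 training images and 10,000 test images.

(Classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck)

(Authors: Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton)

(Data formats: Python version, Matlab version, binary version <for C programs>)

CIFAR-100 is similar to CIFAR-10 but has 100 classes of 600 images each, 500 for training and 100 for testing. The 100 classes are grouped into 20 superclasses, so each image carries a "fine" label (its class) and a "coarse" label (its superclass).

URL: https://www.cs.toronto.edu/~kriz/cifar.html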
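
To make the Python version of the data more concrete, here is a minimal sketch of reading one batch. It relies on the layout described on the CIFAR page: each batch file is a pickled dictionary whose `data` entry holds one 3072-value row per image (1024 red values, then 1024 green, then 1024 blue, each plane in row-major order) and whose `labels` entry holds the class indices. The helper names `load_batch` and `row_to_pixels` are made up for this illustration, and the real files contain NumPy arrays, so NumPy has to be installed to unpickle them.

```python
import pickle

IMAGE_SIDE = 32                    # CIFAR images are 32x32
PLANE = IMAGE_SIDE * IMAGE_SIDE    # 1024 values per color channel


def load_batch(path):
    """Unpickle one batch file of the Python version into a dict with bytes keys."""
    with open(path, "rb") as f:
        return pickle.load(f, encoding="bytes")


def row_to_pixels(row):
    """Turn one flat 3072-value row into a 32x32 grid of (r, g, b) tuples.

    The row stores the red plane first, then green, then blue,
    each in row-major order.
    """
    if len(row) != 3 * PLANE:
        raise ValueError("expected 3072 values, got %d" % len(row))
    red, green, blue = row[:PLANE], row[PLANE:2 * PLANE], row[2 * PLANE:]
    return [
        [(red[y * IMAGE_SIDE + x], green[y * IMAGE_SIDE + x], blue[y * IMAGE_SIDE + x])
         for x in range(IMAGE_SIDE)]
        for y in range(IMAGE_SIDE)
    ]
```

Assuming the archive was extracted to `cifar-10-batches-py`, `batch = load_batch("cifar-10-batches-py/data_batch_1")` followed by `row_to_pixels(batch[b"data"][0])` would give the first training image, with its class index in `batch[b"labels"][0]`.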

2. Image classification results and the corresponding papers

Image classification results and the corresponding papers, covering the datasets MNIST, CIFAR-10, CIFAR-100, STL-10, SVHN, and ILSVRC2012 task 1

URL: http://rodrigob.github.io/are_we_there_yet/build/classification_datasets_results.html

ILSVRC: ImageNet Large Scale Visual Recognition Challenge

URL: ImageNet Large Scale Visual Recognition Competition 2014 (ILSVRC2014)

3. ImageNet

Key ImageNet statistics:

1) Total number of non-empty synsets: 21841
2) Total number of images: 14,197,122
3) Number of images with bounding box annotations: 1,034,908
4) Number of synsets with SIFT features: 1000
5) Number of images with SIFT features: 1.2 million

URL: ImageNet

4. COCO

COCO (Common Objects in Context) is a new image recognition, segmentation, and captioning dataset with the following features:

1) Object segmentation
2) Recognition in Context
3) Multiple objects per image
4) More than 300,000 images
5) More than 2 Million instances
6) 80 object categories
7) 5 captions per image
8) Keypoints on 100,000 people

The COCO 2016 Detection Challenge (2016.6.1-2016.9.9) and the COCO 2016 Keypoint Challenge (2016.6.1-2016.9.9) were launched by Microsoft in conjunction with ECCV 2016 (ECCV: European Conference on Computer Vision).

URL: Common Objects in Context
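
As a rough illustration of "multiple objects per image", the sketch below groups object annotations by image. It assumes the layout of COCO instance annotation files (top-level `images`, `annotations`, and `categories` lists, where each annotation carries an `image_id` and a `category_id`). The `sample` dictionary and the helper name `objects_per_image` are hypothetical and do not come from the real dataset.

```python
from collections import defaultdict


def objects_per_image(coco):
    """Map each image file name to the list of category names annotated in it."""
    names = {c["id"]: c["name"] for c in coco["categories"]}
    files = {img["id"]: img["file_name"] for img in coco["images"]}
    grouped = defaultdict(list)
    for ann in coco["annotations"]:
        grouped[files[ann["image_id"]]].append(names[ann["category_id"]])
    return dict(grouped)


# Hypothetical miniature annotation dict, not real COCO data.
sample = {
    "images": [{"id": 1, "file_name": "a.jpg"}, {"id": 2, "file_name": "b.jpg"}],
    "categories": [{"id": 1, "name": "person"}, {"id": 18, "name": "dog"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1, "bbox": [0, 0, 50, 80]},
        {"id": 11, "image_id": 1, "category_id": 18, "bbox": [60, 40, 30, 20]},
        {"id": 12, "image_id": 2, "category_id": 1, "bbox": [5, 5, 40, 90]},
    ],
}

print(objects_per_image(sample))
```

With a real annotation file, the same function could be applied to the dictionary returned by `json.load`.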

5. 3D data

1) RGB-D People Dataset

Dr. Luciano Spinello

2) NYU Hand Pose Dataset code

http://cims.nyu.edu/~tompson/NYU_Hand_Pose_Dataset.htm

3) Human3.6M (3D Human Pose Dataset)

- "Iterated Second-Order Label Sensitive Pooling for 3D Human Pose Estimation"

http://vision.imar.ro/human3.6m/description.php

6. Face datasets

1) LFW (Labeled Faces in the Wild)

http://vis-www.cs.umass.edu/lfw/index.html

7. Stereo Datasets

2) Middlebury Stereo Datasets

http://vision.middlebury.edu/stereo/data/

3) KITTI Vision Benchmark Suite

The KITTI Vision Benchmark Suite

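8. Princeton University AI self-driving car project

1) Deep Drive

DeepDrive

2) Source Code and Data

DeepDriving

This part is reposted from Cnblogs (博客园).

Original link: Machine Learning Datasets (机器学习数据集(Dataset)) - wl_v_2016 - Cnblogs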

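Ten groups of high-quality TensorFlow resources:

What is TensorFlow?

TensorFlow is an open-source software library for numerical computation using data flow graphs. In practice, it is a widely used way to build deep learning models.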
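
To illustrate what "numerical computation using data flow graphs" means, here is a toy sketch in plain Python. It is not TensorFlow's API: the `Node` class and the `constant`, `add`, `mul`, and `evaluate` helpers are hypothetical names invented for this example. The point it tries to show is that the graph is built first, as nodes connected by the values flowing between them, and only computed when it is evaluated.

```python
class Node:
    """One operation in the graph; `inputs` are the upstream nodes feeding it."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op
        self.inputs = tuple(inputs)
        self.value = value  # only used by "const" nodes


def constant(value):
    return Node("const", value=value)


def add(a, b):
    return Node("add", (a, b))


def mul(a, b):
    return Node("mul", (a, b))


def evaluate(node, cache=None):
    """Run the graph: compute each upstream node once, then this node."""
    if cache is None:
        cache = {}
    if id(node) in cache:
        return cache[id(node)]
    args = [evaluate(n, cache) for n in node.inputs]
    if node.op == "const":
        result = node.value
    elif node.op == "add":
        result = args[0] + args[1]
    elif node.op == "mul":
        result = args[0] * args[1]
    else:
        raise ValueError("unknown op: %s" % node.op)
    cache[id(node)] = result
    return result


# Building the graph performs no arithmetic; evaluate() does.
x = constant(3.0)
y = constant(4.0)
z = add(mul(x, x), y)   # describes z = x*x + y
print(evaluate(z))
```

Separating the description of a computation from its execution is what lets a real framework analyze, differentiate, and distribute the graph before running it.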

This article compiles a list of excellent TensorFlow practices, libraries, and projects.

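1. Tutorials

TensorFlow Tutorial 1 - from the basics to slightly more interesting applications of TensorFlow
TensorFlow Tutorial 2 - an introduction to deep learning based on Google's TensorFlow framework; these tutorials are direct ports of Newmu's Theano tutorials
TensorFlow Examples - TensorFlow tutorials and code examples for beginners
Sungjoon's TensorFlow-101 - TensorFlow tutorials written in Python with Jupyter Notebook
Terry Um's TensorFlow Exercises - re-creates the code from other TensorFlow examples
Installing TensorFlow on Raspberry Pi 3 - getting TensorFlow to compile and run properly on the Raspberry Pi
Classification on time series - recurrent neural network classification in TensorFlow, using an LSTM on cellphone sensor data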

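2. Models/Projects

Show, Attend and Tell - an attention-based image caption generator (the attention mechanism is one of the hot topics in deep learning research: it attends to different parts of the input one at a time and produces a sequence of interpretations)
Neural Style - an implementation of Neural Style (an algorithm that has a machine redraw an image in the painting style of an existing artwork)
Pretty Tensor - Pretty Tensor provides a high-level builder API
Neural Style - an implementation of Neural Style
TensorFlow White Paper Notes - annotated notes and summaries of the TensorFlow white paper, along with SVG figures and links to documentation
NeuralArt - an implementation of A Neural Algorithm of Artistic Style
Deep reinforcement learning of Pong with TensorFlow and PyGame
Generative Handwriting Demo using TensorFlow - an attempt to implement the random handwriting generation portion of Alex Graves' paper
Neural Turing Machine in TensorFlow - a TensorFlow implementation of the Neural Turing Machine
GoogleNet Convolutional Neural Network Groups Movie Scenes By Setting - search, filter, and describe videos based on the objects, places, and other things that appear in them
Neural machine translation between the writings of Shakespeare and modern English using TensorFlow - monolingual translation from modern English to Shakespeare and vice versa
Chatbot - an implementation of "A Neural Conversational Model"
Colornet - Neural Network to colorize grayscale images - colorizes grayscale images with a neural network
Neural Caption Generator with Attention - a TensorFlow implementation of image understanding
Weakly_detector - a TensorFlow implementation of "Learning Deep Features for Discriminative Localization"
Dynamic Capacity Networks - an implementation of "Dynamic Capacity Networks"
HMM in TensorFlow - implementations of the Viterbi and forward/backward algorithms for HMMs
DeepOSM - trains TensorFlow neural networks with OpenStreetMap features and satellite imagery
DQN-tensorflow - a TensorFlow implementation of DeepMind's "Human-Level Control through Deep Reinforcement Learning", using OpenAI Gym
Highway Network - a TensorFlow implementation of "Training Very Deep Networks"
Sentence Classification with CNN - a TensorFlow implementation of "Convolutional Neural Networks for Sentence Classification"
End-To-End Memory Networks - an implementation of End-To-End Memory Networks
Character-Aware Neural Language Models - a TensorFlow implementation of Character-Aware Neural Language Models
YOLO TensorFlow ++ - a TensorFlow implementation of "YOLO: Real-Time Object Detection", with training support and the ability to run in real time on mobile devices
Wavenet - a TensorFlow implementation of the WaveNet generative neural network architecture for audio generation
Mnemonic Descent Method - Mnemonic Descent Method: a recurrent process applied to end-to-end alignment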
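
The attention mechanism mentioned for Show, Attend and Tell at the top of this list can be sketched in a few lines of plain Python. This is a generic soft-attention illustration under stated assumptions, not the code of any project listed here: the dot-product scoring is a simplification chosen for the sketch, and the names `softmax`, `attend`, and `regions` are hypothetical. Each input part receives a weight, and the output is the weighted average of the parts, so the parts most relevant to the query dominate.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, features):
    """Soft attention: weight each feature vector by its dot-product score with the query."""
    scores = [sum(q * f for q, f in zip(query, vec)) for vec in features]
    weights = softmax(scores)
    dim = len(features[0])
    context = [sum(w * vec[i] for w, vec in zip(weights, features)) for i in range(dim)]
    return weights, context


# Three hypothetical 2-d feature vectors, for example three image regions.
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attend([2.0, 0.0], regions)
print(weights, context)
```

With the query `[2.0, 0.0]`, the first and third regions get equal, larger weights than the second, because only they have a non-zero first component.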

3. Powered by TensorFlow

YOLO TensorFlow - an implementation of "YOLO: Real-Time Object Detection"
Magenta - music and art generation with machine intelligence (research project)

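4. Libraries related to TensorFlow

Scikit Flow (TF Learn) - a simplified interface for deep/machine learning (now part of TensorFlow)
tensorflow.rb - a native TensorFlow interface for Ruby, built with SWIG
tflearn - a deep learning library featuring a higher-level API
TensorFlow-Slim - a lightweight library for defining, training, and evaluating models in TensorFlow
TensorFrames - TensorFlow bindings for Apache Spark; a TensorFlow wrapper for DataFrames on Apache Spark
caffe-tensorflow - converts Caffe models to TensorFlow format
keras - a minimal, modular deep learning library for TensorFlow and Theano
SyntaxNet: Neural Models of Syntax - a TensorFlow implementation of the models described in "Globally Normalized Transition-Based Neural Networks"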

5. Videos

TensorFlow Guide 1 - a guide to installing and using TensorFlow, part 1
TensorFlow Guide 2 - a guide to installing and using TensorFlow, part 2
TensorFlow Basic Usage - a guide to basic usage
TensorFlow Deep MNIST for Experts - a deeper look at MNIST
TensorFlow Udacity Deep Learning - basic steps to install TensorFlow for free on the Cloud 9 online service with 1 GB of data
Why Google wants everyone to have access to TensorFlow
2016/1/19 TensorFlow Silicon Valley meetup
2016/1/21 TensorFlow Silicon Valley meetup
Stanford CS224d Lecture 7 - Introduction to TensorFlow, 19th Apr 2016 - CS224d, Deep Learning for Natural Language Processing
Diving into Machine Learning through TensorFlow - getting into machine learning through TensorFlow, PyCon 2016
Large Scale Deep Learning with TensorFlow - Jeff Dean's keynote at Spark Summit 2016
Tensorflow and deep learning - without a PhD - TensorFlow and deep learning (by Martin Görner)

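6. Papers/Literature

TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems - describes the TensorFlow interface and an implementation of that interface built at Google
Comparative Study of Deep Learning Software Frameworks - the study covers several types of deep learning architectures and evaluates the performance of the compared frameworks on a single machine in both (multi-threaded) CPU and GPU (Nvidia Titan X) settings
Distributed TensorFlow with MPI - extends the recently proposed Google TensorFlow to run on large-scale clusters using the Message Passing Interface (MPI)
Globally Normalized Transition-Based Neural Networks - describes the model behind SyntaxNet
TensorFlow: A system for large-scale machine learning - contrasts the TensorFlow dataflow model with existing systems and demonstrates compelling performance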

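7. Official announcements

TensorFlow: smarter machine learning, for everyone - introduces TensorFlow
Announcing SyntaxNet: The World's Most Accurate Parser Goes Open Source - the release announcement for SyntaxNet, "an open-source neural network framework implemented in TensorFlow that provides a foundation for natural language understanding systems"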

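8. Blog posts

Why TensorFlow will change the game for AI
TensorFlow for Poets - a walkthrough of a TensorFlow implementation
Introduction to Scikit Flow, a simplified interface to TensorFlow - explains the key features
Building Machine Learning Estimator in TensorFlow - understanding the internals of TensorFlow's learn estimators
TensorFlow - not just for deep learning
The indico machine learning team's take on TensorFlow
The Good, Bad, & Ugly of TensorFlow - a survey of six months of rapid evolution
Fizz Buzz in TensorFlow - a joke by Joel Grus
A practical guide to RNNs in TensorFlow and their undocumented features - a step-by-step guide with complete code examples on GitHub
Using TensorBoard to visualize image classification retraining in TensorFlow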

9. Community

The TensorFlow tag on Stack Overflow
@TensorFlo on Twitter
The TensorFlow subreddit
Mailing list

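10. Books

First Contact with TensorFlow - by Jordi Torres, professor at UPC Barcelona Tech and research manager and senior advisor at the Barcelona Supercomputing Center
Deep Learning with Python - develop deep learning models on Theano and TensorFlow using Keras (by Jason Brownlee)
TensorFlow for Machine Intelligence - a complete guide to TensorFlow, from the basics of graph computing to deep learning models and their use in production (Bleeding Edge Press)
Getting Started with TensorFlow - get up and running with Google's latest numerical computing library and dive deeper into your data (by Giancarlo Zaccone)
Hands-On Machine Learning with Scikit-Learn and TensorFlow - covers ML fundamentals, training and deploying deep networks across multiple servers and GPUs with TensorFlow, the latest CNN, RNN, and autoencoder architectures, and reinforcement learning (Deep Q)
Building Machine Learning Projects with TensorFlow - covers a variety of TensorFlow projects that show what TensorFlow can do in different scenarios. It also covers training models, machine learning, deep learning, and working with various neural networks. Each project is an engaging and insightful exercise that teaches you how to use TensorFlow and shows you how to explore layers of data by working with tensors.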