Datasets Related to Humans

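This article surveys several datasets for human pose estimation and human parsing, including the MPII Human Pose Dataset and the Leeds Sports Pose Dataset. These datasets contain large numbers of images annotated with human keypoints and are suited to training and evaluating human pose estimation algorithms in computer vision.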


  • MPII Human Pose Dataset

    • Human Pose Estimation
    • 25K images containing over 40K people with annotated body joints
    • Covers 410 human activities; each image is provided with an activity label
    • Images extracted from YouTube videos
    • The test set has richer annotations, including body part occlusions and 3D torso and head orientations
  • Frames Labeled In Cinema (FLIC)

    • Human Pose Estimation
    • 20928 examples
  • Leeds Sports Pose Dataset

    • Human Pose Estimation
    • 2000 pose annotated images
  • COCO Dataset

    • Object segmentation
    • Recognition in Context
    • Multiple objects per image
    • More than 300,000 images
    • More than 2 million instances
    • 80 object categories
    • 5 captions per image
    • Keypoints on 100,000 people
  • Look into Person - LIP Dataset

    • Human Pose Estimation, Human Parsing
    • 30,462 training images, 10,000 validation images, and 10,000 test images
  • Pascal VOC data sets

    • Classification/Detection
    • Segmentation
    • Action Classification
    • Person Layout/Person Parsing
  • Large-scale Fashion (DeepFashion) Database

    • Human Clothes Database
    • Category and Attribute Prediction Benchmark
    • In-shop Clothes Retrieval Benchmark
    • Consumer-to-shop Clothes Retrieval Benchmark
    • Fashion Landmark Detection Benchmark
  • Clothes Segmentation

    • Clothing semantic segmentation
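
Several of the datasets above ship body-joint annotations. As a rough illustration of what such an annotation looks like in practice, the sketch below parses a COCO-style keypoint list, which stores each person's joints as a flat sequence of (x, y, v) triples where v is 0 (not labeled), 1 (labeled but not visible), or 2 (labeled and visible). The names `Keypoint`, `parse_keypoints`, and `count_labeled` are hypothetical helpers introduced here, and the sample values are made up rather than taken from any dataset.

```python
from typing import List, NamedTuple, Sequence


class Keypoint(NamedTuple):
    """One body joint in COCO-style form (hypothetical helper type)."""
    x: float
    y: float
    visibility: int  # 0 = not labeled, 1 = labeled but not visible, 2 = labeled and visible


def parse_keypoints(flat: Sequence[float]) -> List[Keypoint]:
    """Group a flat [x1, y1, v1, x2, y2, v2, ...] list into Keypoint triples."""
    if len(flat) % 3 != 0:
        raise ValueError("keypoint list length must be a multiple of 3")
    return [
        Keypoint(float(flat[i]), float(flat[i + 1]), int(flat[i + 2]))
        for i in range(0, len(flat), 3)
    ]


def count_labeled(keypoints: Sequence[Keypoint]) -> int:
    """Count joints that carry a label, whether visible or occluded."""
    return sum(1 for kp in keypoints if kp.visibility > 0)


# Made-up sample with three joints: visible, unlabeled, occluded.
example = [120.0, 85.0, 2, 0.0, 0.0, 0, 131.5, 90.0, 1]
keypoints = parse_keypoints(example)
print(len(keypoints), "joints,", count_labeled(keypoints), "labeled")
```

Other datasets in this list use their own annotation layouts, so a loader like this would need adapting per dataset.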

Recommended:
CVonline: Image Databases
