
Open-Source Data
Sharing data to advance the development of artificial intelligence.
希尔贝壳AISHELL
Guided by open data and technological innovation, working to democratize artificial intelligence
AISHELL-4 Multi-Channel Mandarin Meeting Speech Database
AISHELL-4 is a sizable real-recorded Mandarin speech dataset collected with an 8-channel circular microphone array for speech processing in conference scenarios. The dataset consists of 211 recorded meeting sessions, each containing 4 to 8 speakers, with a t…
Original · 2022-03-10 10:51:39 · 6266 views · 0 comments
AISHELL-DMASH Mandarin Microphone-Array Speech Database for Smart Home Scenarios
The AISHELL-DMASH dataset was recorded in real smart home scenarios in two different rooms. The dataset contains 30,000 hours of speech data. The recording devices include one close-talking microphone and seven groups of devices at seven different positions o…
Original · 2022-03-09 18:45:03 · 340 views · 0 comments
AISHELL-WakeUp-1 Chinese and English Wake-Word Speech Database
This paper presents a far-field text-dependent speaker verification database named HI-MIA. We aim to meet the data requirement for far-field, microphone-array-based speaker verification, since most of the publicly available databases are single-channel close…
Original · 2022-03-09 18:03:41 · 2814 views · 0 comments
AISHELL-3 High-Fidelity Mandarin Speech Database
In this paper, we present AISHELL-3, a large-scale and high-fidelity multi-speaker Mandarin speech corpus which could be used to train multi-speaker Text-to-Speech (TTS) systems. The corpus contains roughly 85 hours of emotion-neutral recordings spoken by…
Original · 2022-03-09 17:49:52 · 1286 views · 0 comments
AISHELL-2 Mandarin Speech Database
AISHELL-1 is by far the largest open-source speech corpus available for Mandarin speech recognition research. It was released with a baseline system containing solid training and testing pipelines for Mandarin ASR. In AISHELL-2, 1000 hours of cle…
Original · 2022-03-09 16:41:03 · 7241 views · 0 comments
AISHELL-ASR0009-OS1 Open-Source Mandarin Speech Database
An open-source Mandarin speech corpus called AISHELL-1 is released. It is by far the largest corpus suitable for conducting speech recognition research and building speech recognition systems for Mandarin. The recording procedure, including au…
Original · 2022-03-09 17:28:17 · 820 views · 0 comments