While running an open-source project, I got stuck downloading a dataset from Hugging Face. Using the dataset jingyaogong/minimind_dataset (https://hf-mirror.com/datasets/jingyaogong/minimind_dataset) as an example, I have summarized the following three ways to download it.
Method 1: Download with git clone
git clone https://hf-mirror.com/datasets/jingyaogong/minimind_dataset
With this method, the data still has to be extracted after the download finishes.
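One thing to watch for with git clone: Hugging Face dataset repositories usually keep their large files in Git LFS, so a clone made without git-lfs installed may contain only small pointer files. The sketch below is a hedged convenience wrapper, not part of the original steps. The function names hf_dataset_url and clone_hf_dataset are made up for illustration, and falling back to https://huggingface.co when HF_ENDPOINT is unset is an assumption.

```shell
# Hypothetical helper: build the clone URL for a dataset repo.
# Falls back to the official endpoint when HF_ENDPOINT is unset (assumption).
hf_dataset_url() {
  endpoint="${HF_ENDPOINT:-https://huggingface.co}"
  # Strip one trailing slash so the URL never contains "//datasets".
  printf '%s/datasets/%s\n' "${endpoint%/}" "$1"
}

# Hypothetical helper: clone a dataset repo, warning first if git-lfs is missing.
clone_hf_dataset() {
  if ! git lfs version >/dev/null 2>&1; then
    echo "warning: git-lfs not found; large files may arrive as pointer stubs" >&2
  fi
  git clone "$(hf_dataset_url "$1")"
}
```

After running export HF_ENDPOINT=https://hf-mirror.com, calling clone_hf_dataset jingyaogong/minimind_dataset performs the same clone as the command above, with the LFS check in front.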
Method 2: Download with huggingface-cli
1. Install huggingface-cli:
pip install huggingface_hub
2. Set the environment variable (only if you need to use the mirror site):
export HF_ENDPOINT=https://hf-mirror.com
3. Download the dataset (the repo id is passed as a positional argument):
huggingface-cli download \
--repo-type dataset \
jingyaogong/minimind_dataset \
--local-dir /path/to/download
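The three steps of Method 2 can be folded into one small shell function. This is only a sketch under stated assumptions: the name download_hf_dataset and the DRY_RUN switch are invented for illustration, the mirror is used as the default endpoint only when HF_ENDPOINT is not already set, and the repo id is passed positionally, which is how I understand huggingface-cli download to accept it.

```shell
# Hypothetical wrapper around the download step above.
# $1 = dataset repo id, $2 = target directory.
# Set DRY_RUN=1 to print the command instead of running it.
download_hf_dataset() {
  repo_id="$1"
  local_dir="$2"
  # Use the mirror unless the caller already chose an endpoint (assumption).
  export HF_ENDPOINT="${HF_ENDPOINT:-https://hf-mirror.com}"
  # Reuse the positional parameters to hold the full command line.
  set -- huggingface-cli download --repo-type dataset "$repo_id" --local-dir "$local_dir"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "HF_ENDPOINT=$HF_ENDPOINT $*"
  else
    "$@"
  fi
}
```

Running it once with DRY_RUN=1 makes it easy to confirm the endpoint and target directory before starting a large download. For example, download_hf_dataset jingyaogong/minimind_dataset /path/to/download would then only print the command it is about to run.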