Notes on setting up Qwen 0.5B on Ubuntu 20.04

Environment overview

Ubuntu 20.04,
NVIDIA-SMI 545.29.06,
CUDA 11.4,
Python 3.10,
PyTorch 1.11.0

Getting started

Python environment setup

Create and activate a virtual environment

conda create --name qwen python=3.10
conda activate qwen

Install modelscope and transformers first

pip install modelscope
pip install transformers

Install PyTorch

conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=11.3 -c pytorch
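
Before moving on, it can help to confirm that the packages installed above are visible from the active environment. The following is a minimal sketch using only the standard library plus an optional torch import; the helper names installed_version and environment_report are hypothetical, and it assumes the conda PyTorch package registers itself under the distribution name torch.

```python
import platform
from importlib import metadata


def installed_version(package):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_report(packages=("modelscope", "transformers", "torch")):
    """Collect the Python version and the versions of the packages installed above."""
    report = {"python": platform.python_version()}
    for name in packages:
        report[name] = installed_version(name)
    try:
        import torch  # only importable if the PyTorch install step succeeded
        report["cuda_available"] = torch.cuda.is_available()
    except ImportError:
        report["cuda_available"] = None
    return report


for key, value in environment_report().items():
    print(f"{key}: {value}")
```

A value of None means the package (or torch itself) was not found in the environment that ran the script, which usually points to the wrong environment being active.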

Download the model

Create a Python file

gedit download.py

Paste the following into it

from modelscope.hub.file_download import model_file_download

# Download a single GGUF file from ModelScope into cache_dir, then print where it was saved.
model_dir = model_file_download(model_id='qwen/Qwen1.5-0.5B-Chat-GGUF',file_path='qwen1_5-0_5b-chat-q5_k_m.gguf',revision='master',cache_dir='path/to/local/dir')
print(model_dir)

Run the Python file to start the download

python download.py
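
An interrupted download can leave a truncated or non-GGUF file behind. GGUF files begin with the four bytes GGUF followed by a little-endian 32-bit version number, so a quick header check is possible with the standard library alone. This is only a sanity-check sketch; the function name read_gguf_version is hypothetical, and it does not validate the rest of the file.

```python
import struct
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # first four bytes of a GGUF file


def read_gguf_version(path):
    """Return the GGUF format version, or raise ValueError if the header is wrong."""
    with Path(path).open("rb") as f:
        header = f.read(8)
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        raise ValueError(f"{path} does not look like a GGUF file")
    # The magic is followed by a little-endian uint32 version number.
    (version,) = struct.unpack("<I", header[4:8])
    return version


# Example call (the path is the placeholder used elsewhere in this post):
# read_gguf_version("/path/to/local/dir/qwen/Qwen1.5-0.5B-Chat-GGUF/qwen1_5-0_5b-chat-q5_k_m.gguf")
```

If the call raises ValueError, re-running the download is the simplest fix.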

Download llama.cpp

Clone the llama.cpp project with git

git clone https://github.com/ggerganov/llama.cpp

Once the clone finishes, enter the llama.cpp directory and build the project

cd llama.cpp
make -j
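
The run commands in this post expect a main executable in the llama.cpp directory once make finishes. A small sketch for confirming that the build produced it follows. The function name find_llama_binary is hypothetical, and every candidate other than main (llama-cli and the build/bin locations) is an assumption about newer checkouts and CMake builds, not something tested here.

```python
from pathlib import Path

# Candidate locations relative to the llama.cpp checkout. "main" is what the
# make build in this post produces; the remaining entries are assumptions
# about newer checkouts and CMake builds.
CANDIDATES = ("main", "llama-cli", "build/bin/main", "build/bin/llama-cli")


def find_llama_binary(repo_dir):
    """Return the first existing candidate binary under repo_dir, or None."""
    root = Path(repo_dir)
    for relative in CANDIDATES:
        candidate = root / relative
        if candidate.is_file():
            return candidate
    return None


# Example call, run from the directory that contains the clone:
# print(find_llama_binary("llama.cpp"))
```

A result of None suggests the build failed or placed the binary somewhere else, so it is worth scrolling back through the make output.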

Download and run the model

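Download the model from the ModelScope community and run it:
https://www.modelscope.cn/models/qwen/Qwen1.5-0.5B-Chat-GGUF/files
I downloaded qwen1_5-0_5b-chat-q5_k_m.gguf.
Run it in a terminal, replacing the model path with your own. The -cml flag in the official command is not listed in the help output and kept the command from running, so I removed it.
Official command: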

./main -m /path/to/local/dir/qwen/Qwen1.5-0.5B-Chat-GGUF/qwen1_5-0_5b-chat-q5_k_m.gguf -n 512 --color -i -cml -f prompts/chat-with-qwen.txt

What I ran:

./main -m /path/to/local/dir/qwen/Qwen1.5-0.5B-Chat-GGUF/qwen1_5-0_5b-chat-q5_k_m.gguf -n 512 --color -i -f prompts/chat-with-qwen.txt
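
To launch the same command from a script, the argument list can be assembled in Python so that -cml is easy to switch on or off depending on the build. This is a sketch under assumptions: build_main_command and its parameter names are hypothetical, and the model path is the placeholder used above.

```python
import shlex


def build_main_command(model_path, prompt_file="prompts/chat-with-qwen.txt",
                       n_predict=512, binary="./main", use_cml=False):
    """Build the argument list for llama.cpp's main, mirroring the command above."""
    cmd = [binary, "-m", str(model_path), "-n", str(n_predict), "--color", "-i"]
    if use_cml:
        # Only for builds whose --help output still lists -cml.
        cmd.append("-cml")
    cmd += ["-f", prompt_file]
    return cmd


cmd = build_main_command(
    "/path/to/local/dir/qwen/Qwen1.5-0.5B-Chat-GGUF/qwen1_5-0_5b-chat-q5_k_m.gguf"
)
print(shlex.join(cmd))  # shows the command as it would be typed in a shell
```

The resulting list can be passed to subprocess.run(cmd, cwd="llama.cpp"), where cwd is assumed to be the clone directory, since both ./main and the prompt file path are relative to it.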

Help output:

usage: ./main [options]

general:

  -h,    --help, --usage          print usage and exit
         --version                show version and build info
  -v,    --verbose                print verbose information
         --verbosity N            set specific verbosity level (default: 0)
         --verbose-prompt         print a verbose prompt before generation (default: false)
         --no-display-prompt      don't print prompt at generation (default: false)
  -co,   --color                  colorise output to distinguish prompt and user input from generations (default: false)
  -s,    --seed SEED              RNG seed (default: -1, use random seed for < 0)
  -t,    --threads N              number of threads to use during generation (default: 8)
  -tb,   --threads-batch N        number of threads to use during batch and prompt processing (default: same as --threads)
  -td,   --threads-draft N        number of threads to use during generation (default: same as --threads)
  -tbd,  --threads-batch-draft N  number of threads to use during batch and prompt processing (default: same as --threads-draft)
         --draft N                number of tokens to draft for speculative decoding (default: 5)
  -ps,   --p-split N              speculative decoding split probability (default: 0.1)
  -lcs,  --lookup-cache-static FNAME
                                  path to static lookup cache to use for lookup decoding (not updated by generation)
  -lcd,  --lookup-cache-dynamic FNAME
                                  path to dynamic lookup cache to use for lookup decoding (updated by generation)
  -c,    --ctx-size N             size of the prompt context (default: 0, 0 = loaded from model)
  -n,    --predict N              number of tokens to predict (default: -1, -1 = infinity, -2 = until context filled)
  -b,    --batch-size N           logical maximum batch size (default: 2048)
  -ub,   --ubatch-size N          physical maximum batch size (default: 512)
         --keep N                 number of tokens to keep from the initial prompt (default: 0, -1 = all)
         --chunks N               max number of chunks to process (default: -1, -1 = all)
  -fa,   --flash-attn             enable Flash Attention (default: disabled)
  -p,    --prompt PROMPT          prompt to start generation with (default: '')
  -f,    --file FNAME             a file containing the prompt (default: none)
         --in-file FNAME          an input file (repeat to specify multiple files)
  -bf,   --binary-file FNAME      binary file containing the prompt (default: none)
  -e,    --escape                 process escapes sequences 
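
Checking by eye whether a flag such as -cml appears in this output is error-prone, because short flags are prefixes of longer ones (-c, -co, --color). The sketch below matches a flag only as a whole token. The function name help_lists_flag is hypothetical, and the sample text is three lines copied from the help output above.

```python
import re


def help_lists_flag(help_text, flag):
    """Return True if `flag` appears as a whole option name in --help output."""
    # Reject matches that are part of a longer option, e.g. "-c" inside "-co"
    # or inside "--color".
    pattern = r"(?<![\w-])" + re.escape(flag) + r"(?![\w-])"
    return re.search(pattern, help_text) is not None


sample = """\
  -co,   --color                  colorise output to distinguish prompt and user input from generations (default: false)
  -c,    --ctx-size N             size of the prompt context (default: 0, 0 = loaded from model)
  -f,    --file FNAME             a file containing the prompt (default: none)
"""

for flag in ("--color", "-f", "-cml"):
    print(flag, help_lists_flag(sample, flag))
```

For a real check, the help text could be captured with subprocess.run(["./main", "--help"], capture_output=True, text=True); whether the text arrives on stdout or stderr is an assumption to verify, so it is safest to search both.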