Approach 1 ==> Verified
- Verified: https://zhuanlan.zhihu.com/p/1902008703462406116 (succeeded on Ubuntu 22.04)
- Unverified: https://blog.youkuaiyun.com/liutian_ye/article/details/147069770
- Test platform: AutoDL, Ubuntu 22.04
- Prepare the environment
# Check the OS version ("22.04.1 LTS") and the NVIDIA driver (570.124.04)
>> cat /etc/os-release
# CUDA 12.8
nvcc --version   # 12.8
g++ --version    # guide says 13.3; I used the system default, 11.3.0
gcc --version    # guide says 13.3; I used the system default, 11.3.0; "conda list | grep gcc" shows libgcc-ng=11.2.0
cmake --version  # guide says 3.28; I used the system default, 4.0.2; "conda list | grep cmake" shows 4.0.2
ninja --version  # guide says 1.11; installed separately: sudo apt-get install -y ninja-build; "conda list | grep ninja" shows 1.11.1.4
- Installation steps
# Step 1: run the CUDA installer with the "nvidia-fs" component unchecked
>> ./cuda_12.8.0_570.86.10_linux.run --override
# Step 2
>> ./Miniconda3-py311_25.3.1-1-Linux-x86_64.sh -b -p ~/miniconda3 # specify the conda install directory
>> vim ~/.bashrc # append: export PATH=/root/autodl-tmp/miniconda3/bin:$PATH
>> source ~/.bashrc
# Step 3
>> conda create --name vllm python=3.11 -y
>> conda activate vllm
>> pip install torch==2.7.0 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
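# (My addition, not in the original guide) Quick sanity check that the cu128 wheel actually sees the GPU before building vllm:
>> python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"
# Expect something like: 2.7.0+cu128 12.8 True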
# Step 4
>> git clone https://github.com/vllm-project/vllm.git # the vllm repo must not share a directory with your own launch scripts, or imports break (see "installed successfully but hit errors" below)
>> cd vllm
>> git fetch --all
>> git checkout v0.8.5
# Step 5
>> python use_existing_torch.py # reuse the already-installed torch instead of reinstalling it
>> pip install -r requirements/build.txt
>> pip install -r requirements/common.txt
>> pip install --upgrade pip setuptools setuptools-scm # old setuptools versions can break editable-mode installs
# MAX_JOBS defaults to 128; the build can fail if your machine has fewer CPUs
# After a successful install, do NOT delete the vllm repo, or you get "ModuleNotFoundError: No module named 'vllm'"
>> MAX_JOBS=8 SETUPTOOLS_SCM_PRETEND_VERSION=0.8.5 pip install -e . --no-build-isolation -v # "-v" for more logs, "-vv" (two v's) for even more
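# (My addition) Verify the editable install from a directory OUTSIDE the repo, to avoid the shadowing issue described below:
>> cd /tmp && python -c "import vllm; print(vllm.__version__, vllm.__file__)"
# Expect 0.8.5... and a path inside the cloned repo (that is how editable installs work)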
# Step 6: upgrade NCCL to fix multi-GPU parallelism failures
# Reference (the full install also works if you follow that link's steps): https://blog.youkuaiyun.com/qq_63455100/article/details/148613949?sharetype=blogdetail&sharerId=148613949&sharerefer=PC&sharesource=qq_63455100&spm=1011.2480.3001.8118
>> pip install -U nvidia-nccl-cu12 # upgrade NCCL after building vllm; no rebuild is needed afterwards
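# (My addition) One way to confirm the upgrade took effect is torch's runtime NCCL probe; the exact tuple depends on the installed nvidia-nccl-cu12:
>> python -c "import torch; print(torch.cuda.nccl.version())"
# e.g. (2, 26, 2) or newer after the upgrade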
# Step 7: steps 1-5 installed fine, but then I hit "ImportError: cannot import name 'PoolingParams' from 'vllm' (unknown location)". So I installed lsmod and dkms, then repeated Step 5.
- Installed successfully but hit errors: the install should already have been complete after Step 5 above; I probably missed the hint below, which led me to go on to Step 6 as well
/root/autodl-tmp/miniconda3/envs/vllm/lib/python3.11/site-packages/setuptools/command/editable_wheel.py:351: InformationOnly: Editable installation.
!!
********************************************************************************
Please be careful with folders in your working directory with the same
name as your package as they may take precedence during imports.
********************************************************************************
!!
with strategy, WheelFile(wheel_path, "w") as wheel_obj:
Building editable for vllm (pyproject.toml) ... done
Created wheel for vllm: filename=vllm-0.8.5+cu128-0.editable-cp311-cp311-linux_x86_64.whl size=12865 sha256=2028bb7922040ce6d9b4028d586236391302223b9fb27cd701326b646d775ec2
Stored in directory: /tmp/pip-ephem-wheel-cache-bp4c02jv/wheels/55/19/4d/0ba1030ee37cc8a147ca29552d435863f639435c056b9aa440
Successfully built vllm
Installing collected packages: vllm
changing mode of /root/autodl-tmp/miniconda3/envs/vllm/bin/vllm to 755
Successfully installed vllm-0.8.5+cu128
- Explanation:
  - Meaning: this is setuptools' standard reminder for editable-mode (development-mode) installs. An editable install imports the package directly from the source tree without reinstalling, but a folder in your working directory with the same name as the package can take precedence on import (e.g. when both the project root and the package are named vllm).
  - Suggested handling: if you just use vllm normally (rather than developing its source), the hint can be ignored; if you hit import problems later, check your project layout.
  - Example: after "git clone vllm" the repo sits at "/root/vllm", and my qwen3 launch script was also under "/root", so running it raised "ImportError: cannot import name 'PoolingParams' from 'vllm' (unknown location)". Fix: put the launch script in a different directory from the vllm repo (a minimal repro follows).
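- Minimal repro of the shadowing (my sketch; "/root/vllm" is the cloned repo from the example above, "/root/scripts" is a hypothetical script directory):
>> cd /root && python -c "import vllm; print(vllm.__file__)"  # ./vllm (the repo root) shadows the installed package: __file__ is None, hence the "unknown location" ImportError
>> mkdir -p /root/scripts && cd /root/scripts && python -c "import vllm; print(vllm.__file__)"  # no ./vllm folder here, so the real package resolves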
- Pitfall log
  - Log hint 1:
/root/autodl-tmp/vllm/csrc/moe/marlin_moe_wna16/kernel.h(42): warning #20281-D: in whole program compilation mode ("-rdc=false"), a __global__ function template instantiation or specialization ("marlin_moe_wna16::Marlin< ::__nv_bfloat16, (long)1125899906843648l, (int)128, (int)4, (int)8, (int)4, (bool)0, (int)4, (bool)0, (bool)1, (int)8, (bool)0> ") will be required to have a definition in the current translation unit, when "-static-global-template-stub" will be set to "true" by default in the future. To resolve this issue, either use "-rdc=true", or explicitly set "-static-global-template-stub=false" (but see nvcc documentation about downsides of turning it off)
  - Handling: ignore it. Running export CUDAFLAGS="-rdc=true -lcudadevrt" and then pip install instead fails with:
Check for working CUDA compiler: /usr/local/cuda/bin/nvcc - broken
CMake Error at /root/autodl-tmp/miniconda3/envs/vllm/lib/python3.11/site-packages/cmake/data/share/cmake-4.0/Modules/CMakeTestCUDACompiler.cmake:59 (message):
The CUDA compiler
"/usr/local/cuda/bin/nvcc"
is not able to compile a simple test program.
It fails with the following output:
Change Dir: '/tmp/tmp2v82_f1s.build-temp/CMakeFiles/CMakeScratch/TryCompile-rE8LAN'
Run Build Command(s): /root/autodl-tmp/miniconda3/envs/vllm/bin/ninja -v cmTC_835b4
[1/2] /usr/local/cuda/bin/nvcc -forward-unknown-to-host-compiler -rdc=true -lcudadevrt "--generate-code=arch=compute_52,code=[compute_52,sm_52]" -MD -MT CMakeFiles/cmTC_835b4.dir/main.cu.o -MF CMakeFiles/cmTC_835b4.dir/main.cu.o.d -x cu -c /tmp/tmp2v82_f1s.build-temp/CMakeFiles/CMakeScratch/TryCompile-rE8LAN/main.cu -o CMakeFiles/cmTC_835b4.dir/main.cu.o
nvcc warning : Support for offline compilation for architectures prior to '<compute/sm/lto>_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[2/2] : && /usr/bin/g++ CMakeFiles/cmTC_835b4.dir/main.cu.o -o cmTC_835b4 -lcudadevrt -lcudart_static -lrt -lpthread -ldl -L"/usr/local/cuda/targets/x86_64-linux/lib/stubs" -L"/usr/local/cuda/targets/x86_64-linux/lib" && :
FAILED: cmTC_835b4
: && /usr/bin/g++ CMakeFiles/cmTC_835b4.dir/main.cu.o -o cmTC_835b4 -lcudadevrt -lcudart_static -lrt -lpthread -ldl -L"/usr/local/cuda/targets/x86_64-linux/lib/stubs" -L"/usr/local/cuda/targets/x86_64-linux/lib" && :
/usr/bin/ld: CMakeFiles/cmTC_835b4.dir/main.cu.o: in function `__sti____cudaRegisterAll()':
tmpxft_00007a42_00000000-6_main.cudafe1.cpp:(.text+0x127): undefined reference to `__cudaRegisterLinkedBinary_e73a7380_7_main_cu_main'
collect2: error: ld returned 1 exit status
ninja: build stopped: subcommand failed.
  - Pitfall 2: because of the "ImportError: cannot import name 'PoolingParams' from 'vllm' (unknown location)" message, I also ran the commands below, but the environment already had all of these packages, so they were useless here
apt-get update && apt-get install -y build-essential python3-dev
# install the Python dependencies
pip install setuptools wheel
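# (My addition) Confirm those packages really are already present before re-running the commands above:
dpkg -s build-essential python3-dev | grep -E '^(Package|Status)'
pip show setuptools wheel | grep -E '^(Name|Version)'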
- My conda environment
# vllm.yml below was exported with "conda env export -n vllm > vllm.yml"; recreate the env with "conda env create -f vllm.yml"; "conda pack -n vllm -o env.tar.gz" (vllm = env name) packs the whole environment (a full round-trip sketch follows the yml)
name: vllm
channels:
- defaults
dependencies:
- _libgcc_mutex=0.1=main
- _openmp_mutex=5.1=1_gnu
- bzip2=1.0.8=h5eee18b_6
- ca-certificates=2025.2.25=h06a4308_0
- ld_impl_linux-64=2.40=h12ee557_0
- libffi=3.4.4=h6a678d5_1
- libgcc-ng=11.2.0=h1234567_1
- libgomp=11.2.0=h1234567_1
- libstdcxx-ng=11.2.0=h1234567_1
- libuuid=1.41.5=h5eee18b_0
- ncurses=6.4=h6a678d5_0
- openssl=3.0.16=h5eee18b_0
- python=3.11.11=he870216_0
- readline=8.2=h5eee18b_0
- sqlite=3.45.3=h5eee18b_0
- tk=8.6.14=h39e8969_0
- tzdata=2025b=h04d1e81_0
- wheel=0.45.1=py311h06a4308_0
- xz=5.6.4=h5eee18b_1
- zlib=1.2.13=h5eee18b_1
- pip:
- aiohappyeyeballs==2.6.1
- aiohttp==3.11.18
- aiosignal==1.3.2
- airportsdata==20250224
- annotated-types==0.7.0
- anyio==4.9.0
- astor==0.8.1
- attrs==25.3.0
- blake3==1.0.4
- cachetools==5.5.2
- certifi==2025.4.26
- charset-normalizer==3.4.2
- click==8.2.0
- cloudpickle==3.1.1
- cmake==4.0.2
- compressed-tensors==0.9.3
- cupy-cuda12x==13.4.1
- deprecated==1.2.18
- depyf==0.18.0
- dill==0.4.0
- diskcache==5.6.3
- distro==1.9.0
- dnspython==2.7.0
- einops==0.8.1
- email-validator==2.2.0
- fastapi==0.115.12
- fastapi-cli==0.0.7
- fastrlock==0.8.3
- filelock==3.18.0
- frozenlist==1.6.0
- fsspec==2024.6.1
- gguf==0.16.3
- googleapis-common-protos==1.70.0
- grpcio==1.72.0
- h11==0.16.0
- hf-xet==1.1.1
- httpcore==1.0.9
- httptools==0.6.4
- httpx==0.28.1
- huggingface-hub==0.31.2
- idna==3.10
- importlib-metadata==8.0.0
- interegular==0.3.3
- jinja2==3.1.6
- jiter==0.9.0
- jsonschema==4.23.0
- jsonschema-specifications==2025.4.1
- lark==1.2.2
- llguidance==0.7.19
- llvmlite==0.44.0
- lm-format-enforcer==0.10.11
- markdown-it-py==3.0.0
- markupsafe==2.1.5
- mdurl==0.1.2
- mistral-common==1.5.4
- mpmath==1.3.0
- msgpack==1.1.0
- msgspec==0.19.0
- multidict==6.4.3
- nest-asyncio==1.6.0
- networkx==3.3
- ninja==1.11.1.4
- numba==0.61.2
- numpy==2.1.2
- nvidia-cublas-cu12==12.8.3.14
- nvidia-cuda-cupti-cu12==12.8.57
- nvidia-cuda-nvrtc-cu12==12.8.61
- nvidia-cuda-runtime-cu12==12.8.57
- nvidia-cudnn-cu12==9.7.1.26
- nvidia-cufft-cu12==11.3.3.41
- nvidia-cufile-cu12==1.13.0.11
- nvidia-curand-cu12==10.3.9.55
- nvidia-cusolver-cu12==11.7.2.55
- nvidia-cusparse-cu12==12.5.7.53
- nvidia-cusparselt-cu12==0.6.3
- nvidia-nccl-cu12==2.26.2
- nvidia-nvjitlink-cu12==12.8.61
- nvidia-nvtx-cu12==12.8.55
- openai==1.78.1
- opencv-python-headless==4.11.0.86
- opentelemetry-api==1.26.0
- opentelemetry-exporter-otlp==1.26.0
- opentelemetry-exporter-otlp-proto-common==1.26.0
- opentelemetry-exporter-otlp-proto-grpc==1.26.0
- opentelemetry-exporter-otlp-proto-http==1.26.0
- opentelemetry-proto==1.26.0
- opentelemetry-sdk==1.26.0
- opentelemetry-semantic-conventions==0.47b0
- opentelemetry-semantic-conventions-ai==0.4.8
- outlines==0.1.11
- outlines-core==0.1.26
- packaging==25.0
- partial-json-parser==0.2.1.1.post5
- pillow==11.0.0
- pip==25.1.1
- prometheus-client==0.21.1
- prometheus-fastapi-instrumentator==7.1.0
- propcache==0.3.1
- protobuf==4.25.7
- psutil==7.0.0
- py-cpuinfo==9.0.0
- pycountry==24.6.1
- pydantic==2.11.4
- pydantic-core==2.33.2
- pygments==2.19.1
- python-dotenv==1.1.0
- python-json-logger==3.3.0
- python-multipart==0.0.20
- pyyaml==6.0.2
- pyzmq==26.4.0
- ray==2.46.0
- referencing==0.36.2
- regex==2024.11.6
- requests==2.32.3
- rich==14.0.0
- rich-toolkit==0.14.6
- rpds-py==0.24.0
- safetensors==0.5.3
- scipy==1.15.3
- sentencepiece==0.2.0
- setuptools==80.4.0
- setuptools-scm==8.3.1
- shellingham==1.5.4
- sniffio==1.3.1
- starlette==0.46.2
- sympy==1.13.3
- tiktoken==0.9.0
- tokenizers==0.21.1
- torch==2.7.0+cu128
- torchaudio==2.7.0+cu128
- torchvision==0.22.0+cu128
- tqdm==4.67.1
- transformers==4.51.3
- triton==3.3.0
- typer==0.15.3
- typing-extensions==4.12.2
- typing-inspection==0.4.0
- urllib3==2.4.0
- uvicorn==0.34.2
- uvloop==0.21.0
- vllm==0.8.5+cu128
- watchfiles==1.0.5
- websockets==15.0.1
- wrapt==1.17.2
- xgrammar==0.1.18
- yarl==1.20.0
- zipp==3.21.0
prefix: /root/autodl-tmp/miniconda3/envs/vllm
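- Round-trip sketch for the yml above (my addition; assumes conda-pack is available, target paths are examples):
conda env export -n vllm > vllm.yml                                  # export the env to the yml above
conda env create -f vllm.yml                                         # recreate it elsewhere (needs network access)
pip install conda-pack                                               # or: conda install -c conda-forge conda-pack
conda pack -n vllm -o env.tar.gz                                     # offline alternative: pack the whole env
mkdir -p /root/envs/vllm && tar -xzf env.tar.gz -C /root/envs/vllm   # unpack on the target machine
source /root/envs/vllm/bin/activate && conda-unpack                  # rewrite hard-coded prefixes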
- Installing specific versions of gcc, cmake, etc.
- Install GCC/G++ 13.3
# Install the software-properties-common package
sudo apt update
sudo apt install software-properties-common
# Add the PPA repository
sudo add-apt-repository ppa:ubuntu-toolchain-r/test
sudo apt update
# Install GCC 13
sudo apt install gcc-13 g++-13
# Set it as the default version
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-13 130 --slave /usr/bin/g++ g++ /usr/bin/g++-13 --slave /usr/bin/gcov gcov /usr/bin/gcov-13
# Verify the versions
gcc --version
g++ --version
- Install CMake 3.28
# Install dependencies
sudo apt-get install -y wget libssl-dev
# Download and build CMake
wget https://github.com/Kitware/CMake/releases/download/v3.28.0/cmake-3.28.0.tar.gz
tar -zxvf cmake-3.28.0.tar.gz
cd cmake-3.28.0
./bootstrap --prefix=/usr/local
make -j$(nproc)
sudo make install
# Verify the version
cmake --version
- Install Ninja 1.11
# Install dependencies
sudo apt-get install -y wget
# Download and build Ninja
wget https://github.com/ninja-build/ninja/archive/v1.11.1.tar.gz
tar -zxvf v1.11.1.tar.gz
cd ninja-1.11.1
./configure.py --bootstrap
# Install into the system path
sudo cp ninja /usr/local/bin/
sudo chmod +x /usr/local/bin/ninja
# Verify the version
ninja --version
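- Alternative (my suggestion; consistent with the pip list above, which shows cmake==4.0.2 and ninja==1.11.1.4): both tools ship as PyPI wheels, so inside the conda env a plain pip install avoids building from source
pip install "cmake==3.28.*" ninja==1.11.1.4
cmake --version && ninja --version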
- Other tips
# Uninstall every pip package
>> pip freeze > req.txt
>> pip uninstall -r req.txt -y
>> sudo apt-get install -y build-essential gcc-13 g++-13 cmake ninja-build # note: apt package names carry no minor version; for exact versions use the PPA/source builds above
# Use a pip mirror for a single install
>> pip install -i https://mirrors.huaweicloud.com/repository/pypi/simple/ <package>
>> pip install -i https://pypi.tuna.tsinghua.edu.cn/simple <package>
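# (My addition) To make a mirror permanent instead of passing -i every time, pip supports:
>> pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
>> pip config list # verify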
# Git mirrors inside China
https://gitee.com/
# Miniconda downloads
https://mirrors.tuna.tsinghua.edu.cn/anaconda/miniconda/?C=M&O=D