Problem description
Environment:
linux
transformers 4.39.0
tokenizers 0.15.2
torch 2.1.2+cu121
flash-attn 2.3.3
When running xverse/XVERSE-13B-256K with vllm, the model is loaded with the code below:
import torch
from transformers import AutoModelForSequenceClassification

qwen_model = AutoModelForSequenceClassification.from_pretrained(
    args.pre_train,                           # local checkpoint path from argparse
    trust_remote_code=True,
    attn_implementation="flash_attention_2",
    torch_dtype=torch.bfloat16,
    device_map="auto",                        # or "balanced_low_0"
    num_labels=5,
)
The following error is raised:
Traceback (most recent call last):
File "/usr/local/app/.local/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1364, in _get_mo
