1 Problem description
Deploying Tongyi Qianwen (Qwen): https://blog.youkuaiyun.com/lsb2002/article/details/135084490
After the deployment is complete, run the web demo to test it: python web_demo.py

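After a question is entered and submitted, no answer appears on the page, yet the log shows that the answer was returned, as shown below: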
(qwen) [root@localhost Qwen]# python web_demo.py
The model is automatically converting to fp16 for faster inference. If you want to disable the automatic precision, please manually add bf16/fp16/fp32=True to "AutoModelForCausalLM.from_pretrained".
Try importing flash-attention for faster inference...
Warning: import flash_attn rotary fail, please install FlashAttention rotary to get higher efficiency https://github.com/Dao-AILab/flash-attention/tree/main/csrc/rotary
Warning: import flash_attn rms_norm fail, please install FlashAttention layer_norm to get higher efficiency https://github.com/Dao-AILab/flash-attention/tree/main/csrc/layer_norm
Warning: import flash_attn fail, please install FlashAttention to get higher efficiency https://github.com/Dao-AILab/flash-attention
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:05<00:00, 1.44it/s]
Running on local URL: http://192.168.1.150:8000
To create a public link, set `share=True` in `launch()`.
User: 肿瘤患者居家注意什么
History: []
Qwen-Chat: 1. 避免接触烟雾和有害气体;<br>2. 保持室内空气新鲜,避免尘埃、花粉等刺激性物质;<br>3. 定期复查和评估病情,及时调整治疗方案。
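Because the answer reaches the log but not the page, it may help to record the versions of the UI-related packages in the environment alongside the log above. The following is a minimal sketch, not part of the original deployment steps: the package names `gradio` and `mdtex2html` are assumptions about what web_demo.py depends on, and the helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Assumed package names; adjust to whatever web_demo.py actually imports.
    for name in ("gradio", "mdtex2html"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
```

The printed versions can then be compared with whatever the Qwen repository's requirements files specify, if it ships any for the web demo.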
2 Problem analysis
A review of the source code reveals no obvious problem:
chatbot = gr.Chatbot(label='Qwen-Chat', elem_classes="control-height")
query = gr.Textbox(lines=2, label='Input')
task_history = gr.State([])

with gr.Row():
    empty_btn = gr.Button("🧹 Clear History (清除历史)")
    submit_btn = gr.Button("🚀 Submit (发送)")
    regen_btn = gr.Button("🤔️ Regenerate (重试)")

submit_btn.click(predict, [query, chatbot, task_history], [chatbot], show_progress=True)
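To make the wiring above easier to reason about, here is a minimal sketch of a callback with the same three inputs and the same single output. It is not the repository's `predict`: `fake_chat_stream` is a hypothetical stand-in for the model's streaming call, its answer text is invented, and the print statements merely mirror the `User:`, `History:` and `Qwen-Chat:` lines seen in the log. The point it illustrates is that the log lines and the page update travel by separate paths: `print` writes to the server console, while the page only changes if the yielded chatbot value is rendered by the front end.

```python
from typing import Iterator, List, Tuple

ChatPairs = List[Tuple[str, str]]  # (user message, bot message) pairs


def fake_chat_stream(query: str) -> Iterator[str]:
    """Hypothetical stand-in for a streaming model call: yields a growing answer."""
    answer = ""
    for piece in ["Avoid smoke; ", "keep the air fresh; ", "get regular check-ups."]:
        answer += piece
        yield answer


def predict(query: str, chatbot: ChatPairs, task_history: ChatPairs) -> Iterator[ChatPairs]:
    print(f"User: {query}")                    # server log only
    chatbot.append((query, ""))
    full_response = ""
    for response in fake_chat_stream(query):
        chatbot[-1] = (query, response)
        yield chatbot                          # the value a UI output would receive
        full_response = response
    print(f"History: {task_history}")          # server log only
    task_history.append((query, full_response))
    print(f"Qwen-Chat: {full_response}")       # server log only


if __name__ == "__main__":
    chat_state: ChatPairs = []
    history: ChatPairs = []
    for state in predict("What should I watch for at home?", chat_state, history):
        pass  # a UI would re-render the chat on every yielded state
    print(state[-1])
```

Under these assumptions, a log that prints the full answer shows only that the callback ran to completion; it says nothing about whether the yielded values were displayed.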




