Step 1: Clone the repository

git clone https://gitcode.com/tencent_hunyuan/Hunyuan-4B-Instruct-FP8

[Free download] Hunyuan-4B-Instruct-FP8 — a member of Tencent's open-source Hunyuan family of efficient large language models, optimized for multi-scenario deployment. It supports FP8 quantization and a 256K ultra-long context, offers a hybrid reasoning mode and strong agent capabilities, and performs well in mathematics, coding, and science. Its lightweight design serves both edge devices and high-concurrency production environments. Project page: https://ai.gitcode.com/tencent_hunyuan/Hunyuan-4B-Instruct-FP8

Step 2: Install dependencies

pip install "transformers>=4.56.0" torch accelerate
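After installing, you can sanity-check that the installed transformers meets the 4.56.0 floor before loading the model. This is a minimal standard-library sketch; the helper name `meets_floor` is our own, not an official API:

```python
# Check the installed transformers version against the 4.56.0 floor.
# Uses only the standard library; meets_floor is an illustrative helper.
from importlib.metadata import version, PackageNotFoundError

def meets_floor(installed: str, floor: str) -> bool:
    """Numerically compare dotted version strings, ignoring non-numeric parts."""
    parse = lambda s: tuple(int(p) for p in s.split(".") if p.isdigit())
    return parse(installed) >= parse(floor)

try:
    print("transformers OK:", meets_floor(version("transformers"), "4.56.0"))
except PackageNotFoundError:
    print("transformers is not installed; run the pip command above first")
```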

Step 3: Basic inference example

from transformers import AutoModelForCausalLM, AutoTokenizer
import re

model_name_or_path = "tencent/Hunyuan-4B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
model = AutoModelForCausalLM.from_pretrained(model_name_or_path, device_map="auto")

Fast-thinking mode example

messages = [{"role": "user", "content": "/no_think Please explain what artificial intelligence is"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=2048)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Slow-thinking mode example

messages = [{"role": "user", "content": "/think Solve the equation: 3x + 7 = 22"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=2048)
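The only difference between the two modes is the `/think` or `/no_think` prefix on the user message. A tiny helper (our own naming, not part of the Hunyuan API) can build the message list for either mode:

```python
# Build the chat messages with the mode-switch prefix used by Hunyuan.
# make_messages is an illustrative helper, not an official API.
def make_messages(prompt: str, slow_think: bool = False):
    prefix = "/think " if slow_think else "/no_think "
    return [{"role": "user", "content": prefix + prompt}]

print(make_messages("Solve the equation: 3x + 7 = 22", slow_think=True))
```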

Extracting the thinking process and the answer

output_text = tokenizer.decode(outputs[0])
think_pattern = r'<think>(.*?)</think>'
answer_pattern = r'<answer>(.*?)</answer>'
think_matches = re.findall(think_pattern, output_text, re.DOTALL)
answer_matches = re.findall(answer_pattern, output_text, re.DOTALL)
think_content = think_matches[0].strip() if think_matches else ""
answer_content = answer_matches[0].strip() if answer_matches else output_text.strip()

print(f"Thinking process: {think_content}\n\nAnswer: {answer_content}")
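The extraction step can be exercised offline on a mock string, assuming the model wraps its reasoning in `<think>…</think>` and the final reply in `<answer>…</answer>` tags; `split_output` and the mock text below are our own illustrations, not real model output:

```python
# Offline sketch of the tag-splitting step on a mock model output.
# split_output is an illustrative helper; tag names are assumptions
# about the model's output format.
import re

def split_output(text: str):
    think = re.findall(r"<think>(.*?)</think>", text, re.DOTALL)
    answer = re.findall(r"<answer>(.*?)</answer>", text, re.DOTALL)
    return (
        think[0].strip() if think else "",
        answer[0].strip() if answer else text.strip(),
    )

mock = "<think>3x = 22 - 7 = 15, so x = 5</think><answer>x = 5</answer>"
print(split_output(mock))  # ('3x = 22 - 7 = 15, so x = 5', 'x = 5')
```

Guarding the indexing this way avoids an `IndexError` when the model skips the thinking block in fast mode.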


Creation statement: parts of this article were generated with AI assistance (AIGC) and are for reference only.
