LLM Basics (Part 1)

Latest recommended article published on 2025-01-10 09:00:00

rebegin_2023

Category: Research. Tags: python, language model, open source, gpt

Article link: https://blog.youkuaiyun.com/weixin_56742836/article/details/143714367

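1. What is an LLM

LLM stands for Large Language Model.

Its counterpart is, naturally, the SLM, or Small Language Model.

But everything is relative: how big does a model have to be to count as a "large model"? There is no precise definition. That said, models like GPT certainly qualify.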
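One rough way to get a feel for "large" is to estimate how much memory the weights alone would take. The sketch below is only back-of-the-envelope arithmetic (parameter count times bytes per parameter); the two model sizes are hypothetical values picked for illustration, and real memory use is higher once activations and other overhead are counted.

```python
# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str = "fp16") -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Hypothetical model sizes, chosen only for illustration.
for name, params in [("140M-parameter model", 140e6), ("7B-parameter model", 7e9)]:
    print(f"{name}: ~{weight_memory_gb(params):.2f} GB in fp16")
```

By this estimate, a model with a few hundred million parameters fits comfortably on a laptop, while a multi-billion-parameter model already needs a sizeable GPU just to hold its weights.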

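2. Why use an LLM

LLMs keep getting more capable. As a computer science student, I find that GPT really can help with a lot of coding problems, such as code generation, code comment generation, and explaining code errors.

That said, LLMs are not a cure-all. They struggle with uncommon problems, because there is little training data covering them.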
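To make the three coding tasks above a bit more concrete, here is a minimal sketch of how each one could be turned into a chat-style prompt (the same message format used by the API example in section 4). The template wording, the `TASK_TEMPLATES` dict, and the `build_messages` helper are all made up for illustration, not part of any library.

```python
# Hypothetical prompt templates for the three tasks mentioned above.
TASK_TEMPLATES = {
    "generate": "Write Python code that does the following:\n{body}",
    "comment": "Add explanatory comments to this code:\n{body}",
    "explain_error": "Explain this error message and suggest a fix:\n{body}",
}


def build_messages(task: str, body: str) -> list[dict]:
    """Return a chat-style message list for one of the tasks."""
    if task not in TASK_TEMPLATES:
        raise ValueError(f"unknown task: {task}")
    return [
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": TASK_TEMPLATES[task].format(body=body)},
    ]


messages = build_messages("explain_error", "NameError: name 'x' is not defined")
print(messages[1]["content"])
```

The resulting list can be passed as the `messages` argument of a chat completion call.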

3. Common LLMs today

LLMs can be divided into open-source and closed-source ones.

Open-source LLMs include Llama, Code Llama, Qwen, BART, and others.

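Closed-source LLMs, which you can only reach through an API, include GPT and deepseek-chat, among others. (deepseek-chat is the name of DeepSeek's hosted API model; DeepSeek has also released open weights for some of its models.)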

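4. How to use an LLM (Python)

Open-source models can be deployed locally (which really just means downloading the weight files from Hugging Face to your machine), and then you can fine-tune them and so on. Some models require you to request access before you can download them (Llama 3.2, for example); there are plenty of tutorials online for this.

Once the weights are downloaded, here is a simple example of using BART (model_path is the path you downloaded the weights to):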
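For the download step, one option is the `snapshot_download` function from the `huggingface_hub` package. The sketch below is only one way to do it: the `models/` folder layout, the `local_dir_for` and `download_weights` helpers, and the `facebook/bart-base` repo id are my own choices for illustration. For gated models you would also pass an access token.

```python
from pathlib import Path


def local_dir_for(repo_id: str, root: str = "models") -> Path:
    """Map a Hub repo id such as 'facebook/bart-base' to a local folder."""
    return Path(root) / repo_id.replace("/", "--")


def download_weights(repo_id: str, root: str = "models", token=None) -> Path:
    """Download every file of a Hub repo into a local folder and return it."""
    # Imported lazily so local_dir_for works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    target = local_dir_for(repo_id, root)
    snapshot_download(repo_id=repo_id, local_dir=str(target), token=token)
    return target


# Only computes the target path; no network access happens here.
print(local_dir_for("facebook/bart-base"))
```

Calling `download_weights("facebook/bart-base")` would then fetch the files, and the returned folder can be used as the model_path for `from_pretrained`.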

from transformers import BartForConditionalGeneration, BartTokenizer

model_path = "xxx"  # path to the downloaded BART weights

# Load the BART model and tokenizer
model = BartForConditionalGeneration.from_pretrained(model_path)
tokenizer = BartTokenizer.from_pretrained(model_path)



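# Input text with a prompt prefix
# (note: BART is not instruction-tuned, so the prefix is treated as ordinary input text)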
text = """
    Summarize the following description or code:

def add_numbers(a, b):
    return a + b

"""

# Encode the text
inputs = tokenizer([text], max_length=1024, return_tensors="pt", truncation=True)

# Generate the summary; pass the attention mask along with the input ids
summary_ids = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    num_beams=4,
    max_length=1024,
    early_stopping=True,
)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)

print(len(summary))  # length of the summary in characters
print("Summary:", summary)

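Closed-source models have to be called through the official API. Here is a GPT example (you need to apply for an api_key; base_url is the API address of the site where you obtained the api_key):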

from openai import OpenAI

# Set the API key and base URL


client = OpenAI(api_key="xxx", base_url="xxx")



response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    stream=False,
    messages=[
        {"role": "system", "content": "You are an assistant specialized in writing."},
        {"role": "user", "content": "Write a paper to demonstrate the software security problem."}
    ]
)

# Print only the text of the reply
print(response.choices[0].message.content)
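If you call the API in more than one place, it can help to wrap the request in a small function that returns just the reply text, which lives in the `content` field of the message. The `ask` helper below is a sketch of my own, not part of the openai package, and the fake client is a stand-in that only mimics the response shape so the example runs without an API key.

```python
from types import SimpleNamespace


def ask(client, prompt: str, model: str = "gpt-3.5-turbo") -> str:
    """Send one user prompt and return only the reply text."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


# A stand-in client (hypothetical, for offline testing) that mimics the
# response shape: response.choices[0].message.content
def _fake_create(model, messages):
    reply = SimpleNamespace(content=f"echo: {messages[-1]['content']}")
    return SimpleNamespace(choices=[SimpleNamespace(message=reply)])


fake_client = SimpleNamespace(
    chat=SimpleNamespace(completions=SimpleNamespace(create=_fake_create))
)

print(ask(fake_client, "hello"))
```

With a real `OpenAI(api_key=..., base_url=...)` client, the same `ask(client, "...")` call should work unchanged.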