These are my study notes on the transformers library, written after going through its original documentation.
This first part covers how to load a local model, use it, modify it, and save it.
Later I plan to add how to train on a custom dataset and how to fine-tune a model; with that, the library should be pretty well covered.
# Notes on loading a local model
* 1. When loading a pretrained model with the transformers library, 99% of the time is spent downloading the model.
To avoid this, I fetched the models directly from the Tsinghua University mirror ("https://mirrors.tuna.tsinghua.edu.cn/hugging-face-models/") and put them under my local directory "H:\\code\\Model\\" (change this path as needed); a sketch of the download step follows the renaming script below.
* 2. The downloaded files are usually named "model name-" + "config.json", e.g. bert-base-cased-finetuned-mrpc-config.json. When transformers loads a local model, however, it expects the files in the model directory to be named plainly: config.json, vocab.txt, pytorch_model.bin, tf_model.h5, tokenizer.json, and so on. So before loading, the model-name prefix must be stripped from each file name, otherwise loading fails.
The renaming script I wrote for this is below:
```python
# coding=utf-8
import os

# Directory holding the downloaded model files (the directory to walk)
rootdir = r"H:\code\Model\bert-large-uncased-whole-word-masking-finetuned-squad"

# os.walk returns, per directory: 1. the parent path, 2. subdirectory names, 3. file names
for parent, dirnames, filenames in os.walk(rootdir):
    for filename in filenames:
        print(filename)
        # Strip the model-name prefix, e.g.
        # "bert-large-uncased-whole-word-masking-finetuned-squad-config.json" -> "config.json"
        newName = filename.replace('bert-large-uncased-whole-word-masking-finetuned-squad-', '')
        os.rename(os.path.join(parent, filename), os.path.join(parent, newName))  # rename in place
```
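As promised above, the download step itself can also be scripted. This is only a sketch under the assumptions already stated: it presumes the mirror serves files flat under names like `<model-name>-<file>` (the naming described in point 2), and the exact file list varies per model. Saving each file under its plain name makes the renaming step above unnecessary.

```python
import os
import requests

MIRROR = "https://mirrors.tuna.tsinghua.edu.cn/hugging-face-models/"
model_name = "bert-base-cased-finetuned-mrpc"
files = ["config.json", "vocab.txt", "pytorch_model.bin"]  # adjust per model

target_dir = os.path.join(r"H:\code\Model", model_name)
os.makedirs(target_dir, exist_ok=True)

for fname in files:
    # Assumed flat naming on the mirror: "<model-name>-<file>"
    url = MIRROR + model_name + "-" + fname
    resp = requests.get(url, stream=True)
    resp.raise_for_status()
    # Save under the plain file name so no renaming is needed afterwards
    with open(os.path.join(target_dir, fname), "wb") as f:
        for chunk in resp.iter_content(chunk_size=1 << 20):
            f.write(chunk)
```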
After the files are renamed (or saved under plain names directly), the model can be loaded with the transformers library.
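A quick sanity check that the directory is now loadable, as a minimal sketch: `AutoConfig` reads `config.json` and `AutoTokenizer` reads the vocabulary/tokenizer files.

```python
from transformers import AutoConfig, AutoTokenizer

model_path = r"H:\code\Model\bert-base-cased-finetuned-mrpc"
config = AutoConfig.from_pretrained(model_path)        # reads config.json
tokenizer = AutoTokenizer.from_pretrained(model_path)  # reads vocab.txt / tokenizer files
print(config.model_type)  # e.g. "bert"
```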
# Using the models
## Sequence classification (sentiment analysis as the example)
### 1. Using a pipeline
model_path="H:\\code\\Model\\bert-base-cased-finetuned-mrpc\\"
from transformers import pipeline
#使用当前模型+使用Tensorflow框架,默认应该是使用PYTORCH框架
nlp = pipeline("sentiment-analysis",model=model_path, tokenizer=model_path, framework="tf")
result = nlp("I hate you")[0]
print(f"label: {result['label']}, with score: {round(result['score'], 4)}")
result = nlp("I love you")[0]
print(f"label: {result['label']}, with score: {round(result['score'], 4)}")
### 2. Using the model directly
model_path="H:\\code\\Model\\bert-base-cased-finetuned-mrpc\\"
#pytorch框架
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSequenceClassification.from_pretrained(model_path)
classes = ["not paraphrase", "is paraphrase"]
sequence_0 = "The company HuggingFace is based in New York City"
sequence_1 = "Apples are especially bad for your health"
sequence_2 = "HuggingFace's headquarters are situated in Manhattan"
paraphrase = tokenizer(sequence_0, sequence_2, return_tensors="pt")
not_paraphrase = tokenizer(sequence_0, sequence_1, return_tensors="pt")
paraphrase_classification_logits = model(**paraphrase).logits
not_paraphrase_classification_logits = model(**not_paraphrase).logits
paraphrase_results = torch.softmax(paraphrase_classification_logits, dim=1).tolist()[0]
not_paraphrase_results = torch.softmax(not_paraphrase_classification_logits, dim=1).tolist()[0]
# Should be paraphrase
for i in range(len(classes)):
print(f"{classes[i]}: {int(round(paraphrase_results[i] * 100))}%")
# Should not be paraphrase
for i in range(len(classes)):
print(f"{classes[i]}: {int(round(not_paraphrase_results[i] * 100))}%")
TensorFlow version:

```python
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification
import tensorflow as tf

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = TFAutoModelForSequenceClassification.from_pretrained(model_path)

classes = ["not paraphrase", "is paraphrase"]
sequence_0 = "The company HuggingFace is based in New York City"
sequence_1 = "Apples are especially bad for your health"
sequence_2 = "HuggingFace's headquarters are situated in Manhattan"

paraphrase = tokenizer(sequence_0, sequence_2, return_tensors="tf")
not_paraphrase = tokenizer(sequence_0, sequence_1, return_tensors="tf")

paraphrase_classification_logits = model(paraphrase)[0]
not_paraphrase_classification_logits = model(not_paraphrase)[0]

paraphrase_results = tf.nn.softmax(paraphrase_classification_logits, axis=1).numpy()[0]
not_paraphrase_results = tf.nn.softmax(not_paraphrase_classification_logits, axis=1).numpy()[0]

# Should be paraphrase
for i in range(len(classes)):
    print(f"{classes[i]}: {int(round(paraphrase_results[i] * 100))}%")

# Should not be paraphrase
for i in range(len(classes)):
    print(f"{classes[i]}: {int(round(not_paraphrase_results[i] * 100))}%")
```
## Extractive question answering
### 1. Using a pipeline
model_path="H:\\code\\Model\\bert-large-uncased-whole-word-masking-finetuned-squad\\"
from transformers import pipeline
nlp = pipeline("question-answering",model=model_path, tokenizer=model_path)
context = r"""
Extractive Question Answering is the task of extracting an answer from a text given a question. An example of a
question
