Large Model Notes 1: Longformer Environment Setup

Paper:

https://arxiv.org/abs/2004.05150

Contents

Library installation

LongformerForQuestionAnswering


Library installation

First, make sure Git is set up on your machine.

Git environment setup:

https://blog.youkuaiyun.com/Andone_hsx/article/details/87937329

3.1 Find the bin directory under the Git install path, e.g. D:\Program Files\Git\bin

        Also find the git-core directory under the Git install path, e.g. D:\Program Files\Git\libexec\git-core

        Note: "D:\Program Files\Git\" is the install path on that machine; it may differ on yours, so substitute your own install path for "D:\Program Files\Git\".

        3.2 Right-click "Computer" -> "Properties" -> "Advanced system settings" -> "Environment Variables" -> under "System variables" find "Path" -> select "Path" and click "Edit" -> append the bin and git-core paths found in 3.1 -> save and exit.

        Note: entries in "Path" must be separated by ASCII semicolons (";").

D:\Program Files\Git\mingw64\bin

D:\Program Files\Git\mingw64\libexec\git-core

Set up the environment (answer y when conda asks to proceed):

conda create --name longformer python=3.7

conda activate longformer

conda install cudatoolkit=10.0

pip install git+https://github.com/allenai/longformer.git

The pip install reported an error:

ERROR: Could not find a version that satisfies the requirement pandas>=0.20.3 (from test-tube) (from versions: none)

ERROR: No matching distribution found for pandas>=0.20.3

No module named 'pandas'

pip could not install pandas, so it was installed from Anaconda Navigator instead.

After switching to the Tsinghua mirror, the installation seemed able to continue. Reference:

https://www.cnblogs.com/raiuny/p/15950043.html

conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/pytorch/

conda config --set show_channel_urls yes

conda config --set auto_activate_base false

pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple

The pip install from GitHub failed several times with error code 128, probably just flaky network access; after rerunning it a few times, the installation succeeded.

A successful installation ends with a "successful" message.
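To double-check the install before running the tests, a quick import in the new environment is enough. The import paths below follow the allenai/longformer README, so treat them as assumptions if the repo layout has changed:

# Sanity check inside the longformer conda environment. Import paths follow
# the allenai/longformer README; adjust them if the package layout changed.
import torch
from longformer.longformer import Longformer, LongformerConfig
from longformer.sliding_chunks import pad_to_window_size

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())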

Next, run test_readme.py in the tests folder; note that this requires downloading longformer-base-4096.tar.gz.

The extracted files could not be read from either the project's /tmp folder or tests/tmp, so I changed self.model_dir to an absolute path and commented out the download-and-extract code, after which the test ran:
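The edit looks roughly like this (a sketch of the change described above rather than the file's literal contents; the absolute path is only an example and should point at wherever the archive was extracted):

# Inside tests/test_readme.py (paraphrased), in the test setup:
self.model_dir = r"D:\models\longformer-base-4096"  # example absolute path to the extracted model
# The lines that download longformer-base-4096.tar.gz and unpack it are
# commented out, since the model files are already in place at model_dir.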

LongformerForQuestionAnswering

1) The Longformer model used by default in test_readme outputs embeddings; it has no LM head to map the embeddings to token ids or logits, so it cannot produce text. To use Longformer for an end task, you have to write and train that mapping yourself (see the sketch after the example code below).

2) The documentation lists other Longformer models, most of which are classification models. Among them, LongformerForQuestionAnswering fits extractive summarization.

3) While coding, you can follow the documentation examples on Hugging Face to call the other Longformer variants from the transformers library:

from transformers import AutoTokenizer, LongformerForQuestionAnswering
import torch

tokenizer = AutoTokenizer.from_pretrained("allenai/longformer-large-4096-finetuned-triviaqa")
model = LongformerForQuestionAnswering.from_pretrained("allenai/longformer-large-4096-finetuned-triviaqa")

question, text = "Who was Jim Henson?", "Jim Henson was a nice puppet"
encoding = tokenizer(question, text, return_tensors="pt")
input_ids = encoding["input_ids"]

# default is local attention everywhere
# the forward method will automatically set global attention on question tokens
attention_mask = encoding["attention_mask"]

outputs = model(input_ids, attention_mask=attention_mask)
start_logits = outputs.start_logits
end_logits = outputs.end_logits
all_tokens = tokenizer.convert_ids_to_tokens(input_ids[0].tolist())

answer_tokens = all_tokens[torch.argmax(start_logits) : torch.argmax(end_logits) + 1]
answer = tokenizer.decode(
    tokenizer.convert_tokens_to_ids(answer_tokens)
)  # remove space prepending space token
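For point 1 above, here is a minimal sketch of what such a mapping could look like, built on transformers' LongformerModel rather than the repo's own class. The wrapper name LongformerWithLMHead is my own, and the linear head is randomly initialized, so it has to be trained before its predictions mean anything:

import torch
import torch.nn as nn
from transformers import AutoTokenizer, LongformerModel

class LongformerWithLMHead(nn.Module):  # hypothetical wrapper, not part of the library
    def __init__(self, model_name="allenai/longformer-base-4096"):
        super().__init__()
        self.encoder = LongformerModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        vocab = self.encoder.config.vocab_size
        self.lm_head = nn.Linear(hidden, vocab)  # maps embeddings to vocabulary logits

    def forward(self, input_ids, attention_mask=None):
        hidden_states = self.encoder(input_ids, attention_mask=attention_mask)[0]  # (batch, seq, hidden)
        return self.lm_head(hidden_states)  # (batch, seq, vocab) logits

tokenizer = AutoTokenizer.from_pretrained("allenai/longformer-base-4096")
model = LongformerWithLMHead()
enc = tokenizer("Longformer handles long documents.", return_tensors="pt")
logits = model(enc["input_ids"], attention_mask=enc["attention_mask"])
predicted_ids = logits.argmax(dim=-1)  # only meaningful once the head has been trained

Training the head (for example with a cross-entropy loss against target token ids) is what turns the raw embeddings into usable text output.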

Loading a different QA model that does not match the class (longformer_base_4096_QA_SQUAD) produces a warning, because the unused classifier.* weights belong to a classification-style head rather than the span-prediction head that LongformerForQuestionAnswering expects:

Some weights of the model checkpoint at tmp/longformer_base_4096_QA_SQUAD were not used when initializing LongformerForQuestionAnswering: ['classifier.dense.weight', 'classifier.dense.bias', 'classifier.out_proj.weight', 'classifier.out_proj.bias']

Loading longformer-large-4096-finetuned-triviaqa exactly as in the example code then raised an error:

start_logits = outputs.start_logits

AttributeError: 'tuple' object has no attribute 'start_logits'

This error means the return value is a plain tuple rather than an output object, so check whether it is a tuple and unpack it manually:

if isinstance(outputs, tuple):
    # assumes the tuple layout (loss, start_logits, end_logits, hidden_states, attentions);
    # adjust the unpacking if your transformers version returns fewer elements
    loss, start_logits, end_logits, hidden_states, attentions = outputs
else:
    start_logits = outputs.start_logits
    end_logits = outputs.end_logits

There are two places in the library's LongformerForQuestionAnswering code where it can return:

1.

output = (start_logits, end_logits) + outputs[2:]  # tuple path when return_dict=False; loss is prepended only if labels were passed

2.

return QuestionAnsweringModelOutput(  # output class name varies with the transformers version
    loss=total_loss,
    start_logits=start_logits,
    end_logits=end_logits,
    hidden_states=outputs.hidden_states,
    attentions=outputs.attentions,
)
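An alternative to unpacking the tuple by hand is to ask for the structured output directly. In the transformers versions that support it, forward() accepts a return_dict flag (whether your installed version does is worth verifying):

# Request the dataclass-style output instead of a tuple, so the attribute
# access from the example code works as written.
outputs = model(input_ids, attention_mask=attention_mask, return_dict=True)
start_logits = outputs.start_logits
end_logits = outputs.end_logits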
