A major pitfall: pytorch-crf training that does not converge

Latest recommended article published on 2025-12-10 18:07:12

License: CC 4.0 BY-SA

Tags:

#pytorch #artificial-intelligence #python #deep-learning

pytorch-crf documentation:

https://pytorch-crf.readthedocs.io/en/stable/

Git repository:

https://github.com/kmkurn/pytorch-crf

Environment:

OS: Ubuntu 24.04.3 LTS
Python 3.12.11
NVIDIA-SMI 575.64.05
CUDA Version: 12.9
torch     2.8.0+cu129
pytorch-crf  0.7.2

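For entity extraction I use BERT+CRF, with the CRF layer implemented by pytorch-crf.

The symptom:

The loss is all over the place: nan, inf, and so on. Only occasionally does a run converge.

The cause: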

# This code is taken from the pytorch-crf source
# URL: https://github.com/kmkurn/pytorch-crf/blob/master/torchcrf/__init__.py
# Lines 38 to 58
    def __init__(self, num_tags: int, batch_first: bool = False) -> None:
        if num_tags <= 0:
            raise ValueError(f'invalid number of tags: {num_tags}')
        super().__init__()
        self.num_tags = num_tags
        self.batch_first = batch_first
        self.start_transitions = nn.Parameter(torch.empty(num_tags))
        self.end_transitions = nn.Parameter(torch.empty(num_tags))
        self.transitions = nn.Parameter(torch.empty(num_tags, num_tags))

        self.reset_parameters()

    def reset_parameters(self) -> None:
        """Initialize the transition parameters.

        The parameters will be initialized randomly from a uniform distribution
        between -0.1 and 0.1.
        """
        nn.init.uniform_(self.start_transitions, -0.1, 0.1)
        nn.init.uniform_(self.end_transitions, -0.1, 0.1)
        nn.init.uniform_(self.transitions, -0.1, 0.1)

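At initialization, the constructor calls reset_parameters to constrain the range of the parameter values.

In practice, however, the call runs but does not take effect. The parameters are allocated with torch.empty, so if the initialization does not stick they presumably keep whatever values were in that uninitialized memory, which would explain the nan and inf losses.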
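
One way to confirm this diagnosis is to inspect the CRF parameters right after loading the model. The following is a minimal sketch, not part of pytorch-crf: TinyCRF is a hypothetical stand-in that only mirrors the three parameters from the source quoted above, and crf_init_looks_sane is a helper name invented here. The skipped initialization is simulated by overwriting one parameter with nan.

```python
import torch
from torch import nn


class TinyCRF(nn.Module):
    """Hypothetical stand-in mirroring the three parameters quoted above."""

    def __init__(self, num_tags: int) -> None:
        super().__init__()
        self.start_transitions = nn.Parameter(torch.empty(num_tags))
        self.end_transitions = nn.Parameter(torch.empty(num_tags))
        self.transitions = nn.Parameter(torch.empty(num_tags, num_tags))
        self.reset_parameters()

    def reset_parameters(self) -> None:
        nn.init.uniform_(self.start_transitions, -0.1, 0.1)
        nn.init.uniform_(self.end_transitions, -0.1, 0.1)
        nn.init.uniform_(self.transitions, -0.1, 0.1)


def crf_init_looks_sane(crf: nn.Module, bound: float = 0.1) -> bool:
    """Return True if every parameter is finite and within [-bound, bound]."""
    for param in crf.parameters():
        data = param.detach()
        if not torch.isfinite(data).all():
            return False
        # Small tolerance because -0.1 is not exactly representable in float32.
        if data.abs().max().item() > bound + 1e-6:
            return False
    return True


crf = TinyCRF(num_tags=5)
fresh_ok = crf_init_looks_sane(crf)

# Simulate an initialization that did not take effect (assumption for the demo).
with torch.no_grad():
    crf.transitions.fill_(float("nan"))
broken_ok = crf_init_looks_sane(crf)

# Calling reset_parameters again restores the intended range.
crf.reset_parameters()
repaired_ok = crf_init_looks_sane(crf)
```

With a real model, the same helper could be called on model.crf immediately after loading, assuming the CRF layer is exposed under that attribute as in the snippet from this post.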

The fix:

model = BertCrfForNer.from_pretrained(pretrain_path, config=config)
# Re-run the CRF initialization so the transitions are drawn from uniform(-0.1, 0.1)
model.crf.reset_parameters()
model.to(device)

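That is, before moving the model to the GPU, call reset_parameters once more to constrain the range of the parameter values.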
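
The effect of this fix can be reproduced without BERT or pytorch-crf. The sketch below is illustrative only: TinyCRF and TinyTagger are hypothetical stand-ins (TinyTagger plays the role of BertCrfForNer), and the bad state after loading is simulated by filling the CRF parameters with a large value. It follows the same order as the fix: reset the CRF parameters, then move the model to the device.

```python
import torch
from torch import nn


class TinyCRF(nn.Module):
    """Hypothetical stand-in with the same parameters as the quoted source."""

    def __init__(self, num_tags: int) -> None:
        super().__init__()
        self.start_transitions = nn.Parameter(torch.empty(num_tags))
        self.end_transitions = nn.Parameter(torch.empty(num_tags))
        self.transitions = nn.Parameter(torch.empty(num_tags, num_tags))
        self.reset_parameters()

    def reset_parameters(self) -> None:
        nn.init.uniform_(self.start_transitions, -0.1, 0.1)
        nn.init.uniform_(self.end_transitions, -0.1, 0.1)
        nn.init.uniform_(self.transitions, -0.1, 0.1)


class TinyTagger(nn.Module):
    """Hypothetical stand-in for BertCrfForNer: a linear layer plus a CRF."""

    def __init__(self, hidden_size: int, num_tags: int) -> None:
        super().__init__()
        self.classifier = nn.Linear(hidden_size, num_tags)
        self.crf = TinyCRF(num_tags)


model = TinyTagger(hidden_size=8, num_tags=5)

# Simulate a load where the CRF initialization did not take effect.
with torch.no_grad():
    for param in model.crf.parameters():
        param.fill_(1e30)
max_before = max(p.abs().max().item() for p in model.crf.parameters())

# Same order as the fix: reset the CRF parameters, then move to the device.
model.crf.reset_parameters()
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
max_after = max(p.abs().max().item() for p in model.crf.parameters())
```

After the reset, max_after should be no larger than 0.1, whereas max_before reflects the simulated garbage values.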
