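Problem:
The SBERT model API I wrote earlier was deployed to production, and it recently started raising RuntimeError: Already borrowed. I am recording the issue here.
Symptom:
The traceback is as follows: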
File "/home/XXXX/XXX/src/sentence_proc.py", line 77, in compute_sentence_vectors
sentences = self.tokenizer(sentences, padding='max_length', truncation=True, max_length=self.max_seq_length,return_tensors="tf")
File "/home/XXX/.local/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 2249, in __call__
return self.batch_encode_plus(
File "/home/XXXX/.local/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 2434, in batch_encode_plus
return self._batch_encode_plus(
File "/home/XXXX/.local/lib/python3.8/site-packages/transformers/tokenization_utils_fast.py", line 370, in _batch_encode_plus
self.set_truncation_and_padding(
File "/home/XXXX/.local/lib/python3.8/site-packages/transformers/tokenization_utils

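This post records a RuntimeError: Already borrowed error that appeared after an SBERT model API was deployed. The error occurs when the same tokenizer is used from multiple threads. The solutions considered include converting the data to TFRecord and reading it through a file stream; the approach finally adopted was to split the batched data into smaller chunks.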
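As a rough illustration of the batch-splitting idea, here is a minimal sketch and not the code from the original service. The names `encode_in_chunks`, `tokenize_fn`, `chunk_size`, and `_tokenizer_lock` are hypothetical. The lock is an addition that this post does not describe: it rests on the assumption that serializing calls to the one shared tokenizer keeps two threads from using it at the same time.

```python
import threading
from typing import Callable, List, Sequence

# Hypothetical module-level lock (an assumption, not part of the original post):
# it serializes access to the single tokenizer shared by all request threads.
_tokenizer_lock = threading.Lock()


def encode_in_chunks(
    tokenize_fn: Callable[[List[str]], object],
    sentences: Sequence[str],
    chunk_size: int = 32,
) -> list:
    """Tokenize `sentences` in slices of at most `chunk_size` items.

    `tokenize_fn` is any callable that takes a list of strings; the results
    for each slice are returned in order as a list.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")

    outputs = []
    for start in range(0, len(sentences), chunk_size):
        chunk = list(sentences[start:start + chunk_size])
        # Hold the lock only for one slice, so other threads wait briefly
        # instead of entering the tokenizer concurrently.
        with _tokenizer_lock:
            outputs.append(tokenize_fn(chunk))
    return outputs
```

In the setting of the traceback above, `tokenize_fn` could be a small wrapper around the existing `self.tokenizer(...)` call with the same padding and truncation arguments, and the per-chunk results would then be fed to the model one chunk at a time. The chunk size of 32 is an arbitrary placeholder.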





