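This note addresses the missing cl100k_base encoding problem, where tiktoken cannot download the encoding file and fails with an error like the following: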
Exception type: <class 'requests.exceptions.ConnectionError'>
Exception value: HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): Max retries exceeded with url: /encodings/cl100k_base.tiktoken (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7fdf29eef1f0>: Failed to resolve 'openaipublic.blob.core.windows.net' ([Errno -3] Temporary failure in name resolution)"))
Steps:
1. Download cl100k_base.tiktoken on a machine that has network access:
https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken
2. Upload cl100k_base.tiktoken to the target directory, e.g. /workspace/models,
and rename the file to 9b5ad71b2ce5302211f9c61530b329a4922fc6a4
3. Verify:
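import os
# File name from step 2 (the hash only, with no extra leading character)
cache_key = "9b5ad71b2ce5302211f9c61530b329a4922fc6a4"
tiktoken_cache_dir = "/workspace/models"
# Point tiktoken at the local cache directory before loading the encoding
os.environ["TIKTOKEN_CACHE_DIR"] = tiktoken_cache_dir
# Validate that the renamed file is where tiktoken will look for it
assert os.path.exists(os.path.join(tiktoken_cache_dir, cache_key))
import tiktoken
encoding = tiktoken.get_encoding("cl100k_base")
# Prints a list of token ids if the encoding loaded from the local cache
print(encoding.encode("Hello, world"))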
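
The file name used in step 2 is not arbitrary. As far as I can tell, tiktoken derives the cache file name from the SHA-1 hex digest of the download URL. The sketch below, under that assumption, computes the name and copies a manually downloaded file into the cache directory. The helper names cache_key_for and install_into_cache are hypothetical and not part of tiktoken.

```python
import hashlib
import os
import shutil

# URL from step 1; assumed to be the string tiktoken hashes for the cache name.
CL100K_URL = "https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken"


def cache_key_for(url: str) -> str:
    """Return the SHA-1 hex digest of the URL (hypothetical helper)."""
    return hashlib.sha1(url.encode()).hexdigest()


def install_into_cache(downloaded_file: str, cache_dir: str, url: str = CL100K_URL) -> str:
    """Copy a manually downloaded file into cache_dir under its cache-key name."""
    os.makedirs(cache_dir, exist_ok=True)
    target = os.path.join(cache_dir, cache_key_for(url))
    shutil.copyfile(downloaded_file, target)
    return target
```

For example, install_into_cache("cl100k_base.tiktoken", "/workspace/models") would replace the manual rename in step 2. The value of cache_key_for(CL100K_URL) should match the file name given in step 2. If it does not, the installed tiktoken version may name its cache files differently.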