D:\dify-1.8.0\dify\pdf_blueprint_drawing\venv\Scripts\python.exe "D:/Program Files/JetBrains/PyCharm 2025.2.1.1/plugins/python-ce/helpers/pydev/pydevd.py" --multiprocess --qt-support=auto --client 127.0.0.1 --port 55730 --file D:\dify-1.8.0\dify\pdf_blueprint_drawing\tools\pdf_blueprint_drawing.py
Connected to: <socket.socket fd=704, family=2, type=1, proto=0, laddr=('127.0.0.1', 55731), raddr=('127.0.0.1', 55730)>.
Connected to pydev debugger (build 252.25557.178)
2025-09-06 13:52:33,991 - PDFExtractor - INFO - Initializing OCR engine...
2025-09-06 13:52:33,992 - PDFExtractor - ERROR - OCR initialization failed: Unknown argument: use_gpu, retrying initialization without arguments
INFO: Could not find files for the given pattern(s).
D:\dify-1.8.0\dify\pdf_blueprint_drawing\venv\Lib\site-packages\paddle\utils\cpp_extension\extension_utils.py:717: UserWarning: No ccache found. Please be aware that recompiling all source files may be required. You can download and install ccache from: https://github.com/ccache/ccache/blob/master/doc/INSTALL.md
  warnings.warn(warning_message)
Creating model: ('PP-LCNet_x1_0_doc_ori', None)
Model files already exist. Using cached files. To redownload, please delete the directory manually: `C:\Users\zy532\.paddlex\official_models\PP-LCNet_x1_0_doc_ori`.
Creating model: ('UVDoc', None)
Model files already exist. Using cached files. To redownload, please delete the directory manually: `C:\Users\zy532\.paddlex\official_models\UVDoc`.
Creating model: ('PP-LCNet_x1_0_textline_ori', None)
Model files already exist. Using cached files. To redownload, please delete the directory manually: `C:\Users\zy532\.paddlex\official_models\PP-LCNet_x1_0_textline_ori`.
Creating model: ('PP-OCRv5_server_det', None)
Using official model (PP-OCRv5_server_det), the model files will be automatically downloaded and saved in `C:\Users\zy532\.paddlex\official_models\PP-OCRv5_server_det`.
Processing 5 items: 0%| | 0.00/5.00 [00:00<?, ?it/s]
Downloading [config.json]: 0%| | 0.00/2.80k [00:00<?, ?B/s]
Downloading [README.md]: 0%| | 0.00/15.5k [00:00<?, ?B/s]
Downloading [config.json]: 100%|██████████| 2.80k/2.80k [00:07<00:00, 387B/s]
Processing 5 items: 20%|██ | 1.00/5.00 [00:09<00:36, 9.13s/it]
Downloading [inference.pdiparams]: 0%| | 0.00/83.9M [00:00<?, ?B/s]
Downloading [inference.json]: 0%| | 0.00/393k [00:00<?, ?B/s]
Downloading [inference.pdiparams]: 1%| | 1.00M/83.9M [00:07<09:50, 147kB/s]
Downloading [inference.pdiparams]: 14%|█▍ | 12.0M/83.9M [00:07<00:31, 2.41MB/s]
Downloading [inference.pdiparams]: 27%|██▋ | 23.0M/83.9M [00:07<00:11, 5.48MB/s]
Downloading [inference.pdiparams]: 41%|████ | 34.0M/83.9M [00:07<00:05, 9.59MB/s]
Downloading [inference.pdiparams]: 52%|█████▏ | 44.0M/83.9M [00:07<00:02, 14.3MB/s]
Downloading [inference.pdiparams]: 63%|██████▎ | 53.0M/83.9M [00:07<00:01, 19.1MB/s]
Downloading [inference.pdiparams]: 74%|███████▍ | 62.0M/83.9M [00:07<00:00, 23.0MB/s]
Downloading [inference.pdiparams]: 82%|████████▏ | 69.0M/83.9M [00:08<00:00, 25.8MB/s]
Downloading [inference.pdiparams]: 89%|████████▉ | 75.0M/83.9M [00:08<00:00, 28.4MB/s]
Downloading [inference.pdiparams]: 100%|██████████| 83.9M/83.9M [00:08<00:00, 10.3MB/s]
Processing 5 items: 40%|████ | 2.00/5.00 [00:19<00:30, 10.1s/it]
Downloading [inference.yml]: 0%| | 0.00/903 [00:00<?, ?B/s]
Downloading [inference.json]: 100%|██████████| 393k/393k [00:09<00:00, 42.0kB/s]
Processing 5 items: 60%|██████ | 3.00/5.00 [00:23<00:14, 7.10s/it]
Downloading [inference.yml]: 100%|██████████| 903/903 [00:10<00:00, 82.2B/s]
Processing 5 items: 80%|████████ | 4.00/5.00 [00:32<00:07, 7.85s/it]
Downloading [README.md]: 100%|██████████| 15.5k/15.5k [00:57<00:00, 278B/s]
Processing 5 items: 100%|██████████| 5.00/5.00 [01:02<00:00, 12.5s/it]
Creating model: ('PP-OCRv5_server_rec', None)
Using official model (PP-OCRv5_server_rec), the model files will be automatically downloaded and saved in `C:\Users\zy532\.paddlex\official_models\PP-OCRv5_server_rec`.
Processing 5 items: 0%| | 0.00/5.00 [00:00<?, ?it/s]
Downloading [README.md]: 0%| | 0.00/15.5k [00:00<?, ?B/s]
Downloading [inference.json]: 0%| | 0.00/318k [00:00<?, ?B/s]
Downloading [README.md]: 100%|██████████| 15.5k/15.5k [00:07<00:00, 2.25kB/s]
Processing 5 items: 20%|██ | 1.00/5.00 [00:11<00:46, 11.6s/it]
Processing 5 items: 40%|████ | 2.00/5.00 [00:36<00:58, 19.7s/it]
Downloading [inference.pdiparams]: 1%| | 1.00M/80.5M [00:06<09:03, 153kB/s]
Downloading [inference.pdiparams]: 5%|▍ | 4.00M/80.5M [00:06<01:41, 793kB/s]
Downloading [inference.pdiparams]: 10%|▉ | 8.00M/80.5M [00:07<00:39, 1.95MB/s]
Downloading [inference.pdiparams]: 24%|██▎ | 19.0M/80.5M [00:07<00:10, 6.32MB/s]
Downloading [inference.pdiparams]: 37%|███▋ | 30.0M/80.5M [00:07<00:04, 12.0MB/s]
Downloading [inference.pdiparams]: 51%|█████ | 41.0M/80.5M [00:07<00:02, 19.2MB/s]
Downloading [inference.pdiparams]: 65%|██████▍ | 52.0M/80.5M [00:07<00:01, 27.8MB/s]
Downloading [inference.pdiparams]: 77%|███████▋ | 62.0M/80.5M [00:07<00:00, 31.9MB/s]
Downloading [inference.pdiparams]: 87%|████████▋ | 70.0M/80.5M [00:07<00:00, 33.8MB/s]
Downloading [inference.pdiparams]: 100%|██████████| 80.5M/80.5M [00:08<00:00, 10.3MB/s]
Downloading [inference.yml]: 0%| | 0.00/145k [00:00<?, ?B/s]
Downloading [inference.yml]: 100%|██████████| 145k/145k [00:13<00:00, 11.0kB/s]
Downloading [inference.json]: 0%| | 0.00/318k [01:15<?, ?B/s]
Downloading [inference.json]: 0%| | 0.00/318k [00:00<?, ?B/s]
Downloading [inference.json]: 100%|██████████| 318k/318k [00:18<00:00, 17.7kB/s]
Processing 5 items: 40%|████ | 2.00/5.00 [01:45<02:37, 52.6s/it]
Encounter exception when download model from aistudio:
HTTPSConnectionPool(host='git.aistudio.baidu.com', port=443): Read timed out..
PaddleX would try to download from other model sources.
Using official model (PP-OCRv5_server_rec), the model files will be automatically downloaded and saved in `C:\Users\zy532\.paddlex\official_models\PP-OCRv5_server_rec`.
Downloading Model from https://www.modelscope.cn to directory: C:\Users\zy532\AppData\Local\Temp\tmpubnr8mih\temp_dir
2025-09-06 13:57:02,048 - modelscope - INFO - Got 5 files, start to download ...
Processing 5 items: 0%| | 0.00/5.00 [00:00<?, ?it/s]
Downloading [config.json]: 0%| | 0.00/344k [00:00<?, ?B/s]
Downloading [inference.pdiparams]: 0%| | 0.00/80.5M [00:00<?, ?B/s]
Downloading [inference.json]: 0%| | 0.00/318k [00:00<?, ?B/s]
Downloading [inference.yml]: 0%| | 0.00/145k [00:00<?, ?B/s]
Downloading [README.md]: 0%| | 0.00/15.5k [00:00<?, ?B/s]
Downloading [README.md]: 100%|██████████| 15.5k/15.5k [00:01<00:00, 8.72kB/s]
Processing 5 items: 20%|██ | 1.00/5.00 [00:01<00:07, 1.83s/it]
Downloading [inference.json]: 100%|██████████| 318k/318k [00:01<00:00, 176kB/s]
Downloading [inference.yml]: 100%|██████████| 145k/145k [00:01<00:00, 78.7kB/s]
Downloading [inference.pdiparams]: 6%|▌ | 5.00M/80.5M [00:01<00:22, 3.48MB/s]
Downloading [inference.pdiparams]: 12%|█▏ | 10.0M/80.5M [00:02<00:09, 7.80MB/s]
Downloading [config.json]: 100%|██████████| 344k/344k [00:02<00:00, 166kB/s]
Processing 5 items: 80%|████████ | 4.00/5.00 [00:02<00:00, 2.34it/s]
Downloading [inference.pdiparams]: 16%|█▌ | 13.0M/80.5M [00:02<00:06, 10.4MB/s]
Downloading [inference.pdiparams]: 21%|██ | 17.0M/80.5M [00:02<00:04, 14.4MB/s]
Downloading [inference.pdiparams]: 26%|██▌ | 21.0M/80.5M [00:02<00:03, 17.8MB/s]
Downloading [inference.pdiparams]: 30%|██▉ | 24.0M/80.5M [00:02<00:03, 18.3MB/s]
Downloading [inference.pdiparams]: 34%|███▎ | 27.0M/80.5M [00:02<00:02, 19.0MB/s]
Downloading [inference.pdiparams]: 37%|███▋ | 30.0M/80.5M [00:02<00:02, 20.3MB/s]
Downloading [inference.pdiparams]: 41%|████ | 33.0M/80.5M [00:03<00:02, 21.2MB/s]
Downloading [inference.pdiparams]: 45%|████▍ | 36.0M/80.5M [00:03<00:02, 21.5MB/s]
Downloading [inference.pdiparams]: 50%|████▉ | 40.0M/80.5M [00:03<00:01, 24.7MB/s]
Downloading [inference.pdiparams]: 55%|█████▍ | 44.0M/80.5M [00:03<00:01, 27.7MB/s]
Downloading [inference.pdiparams]: 60%|█████▉ | 48.0M/80.5M [00:03<00:01, 28.0MB/s]
Downloading [inference.pdiparams]: 66%|██████▌ | 53.0M/80.5M [00:03<00:00, 32.4MB/s]
Downloading [inference.pdiparams]: 71%|███████ | 57.0M/80.5M [00:03<00:00, 30.5MB/s]
Downloading [inference.pdiparams]: 76%|███████▌ | 61.0M/80.5M [00:03<00:00, 29.8MB/s]
Downloading [inference.pdiparams]: 80%|███████▉ | 64.0M/80.5M [00:04<00:00, 27.5MB/s]
Downloading [inference.pdiparams]: 84%|████████▍ | 68.0M/80.5M [00:04<00:00, 29.8MB/s]
Downloading [inference.pdiparams]: 89%|████████▉ | 72.0M/80.5M [00:04<00:00, 30.8MB/s]
Downloading [inference.pdiparams]: 100%|██████████| 80.5M/80.5M [00:04<00:00, 18.5MB/s]
Processing 5 items: 100%|██████████| 5.00/5.00 [00:04<00:00, 1.09it/s]
2025-09-06 13:57:06,618 - modelscope - INFO - Download model 'PaddlePaddle/PP-OCRv5_server_rec' successfully.
2025-09-06 13:57:10,366 - PDFExtractor - INFO - OCR initialized successfully without arguments
2025-09-06 13:57:10,366 - PDFExtractor - INFO - Using resolved path: D:/dify-1.8.0/dify/pdf_blueprint_drawing/tools/building_drawing.pdf
2025-09-06 13:57:10,366 - PDFExtractor - INFO - Starting to process PDF file: D:/dify-1.8.0/dify/pdf_blueprint_drawing/tools/building_drawing.pdf
2025-09-06 13:57:10,468 - PDFExtractor - INFO - Page 0 has insufficient text-layer content, enabling OCR
D:\dify-1.8.0\dify\pdf_blueprint_drawing\tools\pdf_blueprint_drawing.py:127: DeprecationWarning: Please use `predict` instead.
  ocr_result = self.ocr.ocr(img_array)
Process finished with exit code -1073741819 (0xC0000005)