1. Install the base environment
1.1. Drivers and NVIDIA software
The operating system is Rocky Linux 9.0. The driver and CUDA are installed from cuda-repo-rhel9-12-6-local-12.6.0_560.28.03-1.x86_64.rpm. TensorRT 10.5.0.18 is installed from nv-tensorrt-local-repo-rhel9-10.5.0-cuda-12.6-1.0-1.x86_64.rpm. TensorRT 10.5.0 supports cuDNN 8.9.7, which is installed from cudnn-local-repo-rhel9-8.9.7.29-1.0-1.x86_64.rpm.
sudo rpm -i cuda-repo-rhel9-12-6-local-12.6.0_560.28.03-1.x86_64.rpm
sudo dnf clean all
sudo dnf -y install cuda-toolkit-12-6
Install the NVIDIA kernel module (pick one flavor below), make sure /usr/local/cuda/bin is on your PATH, then run nvcc -V to check the CUDA toolkit version.
legacy kernel module flavor:
dnf module install nvidia-driver:latest-dkms
open kernel module flavor:
dnf module install nvidia-driver:open-dkms
To switch between kernel module flavors:
dnf module switch-to nvidia-driver:latest-dkms --allowerasing
dnf module switch-to nvidia-driver:open-dkms --allowerasing
To check the driver, just run nvidia-smi in a terminal; the first line shows the driver version and the highest CUDA version that driver supports.
Install cuDNN:
sudo rpm -i cudnn-local-repo-rhel9-8.9.7.29-1.0-1.x86_64.rpm
sudo dnf clean all
sudo rpm --import /etc/pki/rpm-gpg/RPM-GPG-KEY-*
sudo dnf install libcudnn8
sudo dnf install libcudnn8-devel
dnf list installed | grep cudnn
ls /usr/local/cuda/lib64/libcudnn*
ls /usr/local/cuda/include/cudnn.h
Check that cudnn.h exists in the CUDA include directory, or run ldconfig to confirm the libraries are picked up.
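As an extra cross-check, the snippet below reads the installed cuDNN version straight from the version header (a minimal sketch; the header paths are assumptions based on the checks above, and cuDNN 8.x keeps the version macros in cudnn_version.h):

import re
from pathlib import Path

# Candidate locations for the cuDNN version header (adjust to your install layout)
candidates = ["/usr/local/cuda/include/cudnn_version.h", "/usr/include/cudnn_version.h"]
for name in candidates:
    header = Path(name)
    if header.exists():
        text = header.read_text()
        parts = [re.search(rf"#define CUDNN_{key}\s+(\d+)", text).group(1)
                 for key in ("MAJOR", "MINOR", "PATCHLEVEL")]
        print(name, "->", ".".join(parts))   # expect 8.9.7 for this setup
        break
else:
    print("cudnn_version.h not found in the candidate paths")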
Install TensorRT:
sudo dnf install nv-tensorrt-local-repo-rhel9-10.5.0-cuda-12.6-1.0-1.x86_64.rpm
sudo cp /var/nv-tensorrt-local-repo-rhel9-10.5.0-cuda-12.6-1.0-1/*.pub /etc/pki/rpm-gpg/
sudo rpm --import /etc/pki/rpm-gpg/RPM-GPG-KEY-*
sudo dnf install tensorrt
python3 -c "import tensorrt; print(f'TensorRT version: {tensorrt.__version__}')"
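If the import succeeds, a slightly fuller sanity check is to create a logger and a builder, which also exercises the native nvinfer libraries (a minimal sketch using the standard TensorRT Python API):

import tensorrt as trt

print("TensorRT:", trt.__version__)
logger = trt.Logger(trt.Logger.WARNING)   # minimum severity to report
builder = trt.Builder(logger)             # fails here if libnvinfer / CUDA are not usable
network = builder.create_network()        # empty network, just to prove the API works
print("Builder and empty network created OK")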
Install the pip module
#python -m ensurepip
Create and activate a Python virtual environment; all of the following installation steps are performed inside it.
python -m venv paddleocr
source paddleocr/bin/activate
# Deactivate the virtual environment
deactivate
1.2. Environment checks
# GPU build: requires NVIDIA driver version >= 550.54.14 (Linux) or >= 550.54.14 (Windows)
python -m pip install paddlepaddle-gpu==3.2.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/
python -m pip install paddlepaddle-gpu==3.2.2 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/
1. Confirm that the Python version meets the requirement, i.e. one of 3.9/3.10/3.11/3.12/3.13:
python3 --version
2. Confirm that the pip version meets the requirement: pip 20.2.2 or newer:
python3 -m pip --version
3. Confirm that Python and pip are 64-bit and that the processor architecture is x86_64 (also called x64, Intel 64, or AMD64) or aarch64 (only CPU wheel packages are provided for aarch64).
For x86_64, the first line of the output below should be "64bit" and the second line "x86_64", "x64", or "AMD64";
for aarch64, the first line should be "64bit" and the second line "arm64".
python3 -c "import platform;print(platform.architecture()[0]);print(platform.machine())"
4. The default packages require MKL support; Intel CPUs all support MKL.
If your machine has an NVIDIA® GPU and you want the GPU build of PaddlePaddle, make sure the following requirements are met:
A GPU with compute capability above 6.0
Refer to the official NVIDIA documentation for the installation and configuration of CUDA, cuDNN, and TensorRT.
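The driver and compute-capability requirements above can also be checked from Python by shelling out to nvidia-smi (a minimal sketch; the compute_cap query field assumes a reasonably recent driver):

import subprocess

# Query GPU name, compute capability and driver version; one CSV line per GPU
query = "--query-gpu=name,compute_cap,driver_version"
out = subprocess.run(["nvidia-smi", query, "--format=csv,noheader"],
                     capture_output=True, text=True, check=True)
print(out.stdout.strip())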
[paddle@rocky9 ~]$ which python3
/usr/bin/python3
[paddle@rocky9 ~]$ python3 --version
Python 3.9.10
[paddle@rocky9 ~]$ python -m pip --version
pip 21.2.3 from /usr/local/lib/python3.9/site-packages/pip (python 3.9)
[paddle@rocky9 ~]$ python3 -c "import platform;print(platform.architecture()[0]);print(platform.machine())"
64bit
x86_64
[paddle@rocky9 ~]$ python -m venv paddleocr
[paddle@rocky9 ~]$ source paddleocr/bin/activate
(paddleocr) [paddle@rocky9 ~]$ ll -lha
total 12K
drwx------ 3 paddle paddle 79 Nov 22 22:15 .
drwxr-xr-x. 5 root root 41 Nov 22 22:11 ..
-rw-r--r-- 1 paddle paddle 18 May 16 2022 .bash_logout
-rw-r--r-- 1 paddle paddle 141 May 16 2022 .bash_profile
-rw-r--r-- 1 paddle paddle 492 May 16 2022 .bashrc
drwxr-xr-x 5 paddle paddle 74 Nov 22 22:15 paddleocr
(paddleocr) [paddle@rocky9 ~]$
(paddleocr) [paddle@rocky9 ~]$ du paddleocr/ -sh
14M paddleocr/
(paddleocr) [paddle@rocky9 ~]$
2. Offline installation of PaddlePaddle and PaddleOCR
2.1. Download the GPU build of paddlepaddle (CUDA 12.6)
pip download paddlepaddle-gpu==3.2.1 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/ -d ./offline/pp3.2.1-gpu
The wheels are downloaded into the offline/pp3.2.1-gpu directory.
2.2. Download the CPU build of paddlepaddle
pip download paddlepaddle==3.2.1 -i https://www.paddlepaddle.org.cn/packages/stable/cpu/ -d ./offline/pp3.2.1-cpu
2.3. Download paddleocr
python -m pip download "paddleocr[all]" -d ./paddleocr
2.4. Download paddlex
python -m pip download "paddlex[all]==3.3.6" -d ./paddlex
python -m pip download "paddlex[ocr]==3.3.6" -d ./paddlex
3. Offline installation
python -m pip install --no-index --find-links=/path/to/offline_packages <package-name>
python -m pip install --no-index --find-links=./ paddlepaddle
python -m pip install --no-index --find-links=./ paddleocr paddlex
python -m pip list | grep paddle
Download the dependencies needed by the PaddleX serving plugin into ./paddlex_serving for offline installation:
python -m pip download aiohappyeyeballs==2.6.1 -d ./paddlex_serving
python -m pip download aiohttp==3.13.2 -d ./paddlex_serving
python -m pip download aiosignal==1.4.0 -d ./paddlex_serving
python -m pip download annotated-doc==0.0.4 -d ./paddlex_serving
python -m pip download async-timeout==5.0.1 -d ./paddlex_serving
python -m pip download attrs==25.4.0 -d ./paddlex_serving
python -m pip download fastapi==0.121.3 -d ./paddlex_serving
python -m pip download filetype==1.2.0 -d ./paddlex_serving
python -m pip download frozenlist==1.8.0 -d ./paddlex_serving
python -m pip download multidict==6.7.0 -d ./paddlex_serving
python -m pip download propcache==0.4.1 -d ./paddlex_serving
python -m pip download starlette==0.49.3 -d ./paddlex_serving
python -m pip download uvicorn==0.38.0 -d ./paddlex_serving
python -m pip download yarl==1.22.0 -d ./paddlex_serving
Install all downloaded .whl files in one batch:
for file in *.whl; do python -m pip install --no-index --find-links=./ "$file"; done
Clean the pip cache:
$ pip cache dir
/home/xxx/.cache/pip
$ du /home/xxx/.cache/pip -sh
476M /home/xxx/.cache/pip
$ pip cache purge
Files removed: 823 (496.9 MB)
$ du /home/xxx/.cache/pip -sh
4.0K /home/xxx/.cache/pip
4. Online installation
CPU build of PaddlePaddle:
python3 -m pip install paddlepaddle==3.2.1 -i https://www.paddlepaddle.org.cn/packages/stable/cpu/
GPU build of PaddlePaddle:
CUDA 11.8 (install TensorRT 8.5.3.1 yourself if you need TensorRT):
python3 -m pip install paddlepaddle-gpu==3.2.1 -i https://www.paddlepaddle.org.cn/packages/stable/cu118/
CUDA 12.6 (install TensorRT 10.5.0.18 yourself if you need TensorRT):
python3 -m pip install paddlepaddle-gpu==3.2.1 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/
CUDA 12.9 (install TensorRT 10.5.0.18 yourself if you need TensorRT):
python3 -m pip install paddlepaddle-gpu==3.2.1 -i https://www.paddlepaddle.org.cn/packages/stable/cu129/
CUDA 13.0 (install TensorRT 10.5.0.18 yourself if you need TensorRT):
python3 -m pip install paddlepaddle-gpu==3.2.1 -i https://www.paddlepaddle.org.cn/packages/stable/cu130/
Notes:
PaddlePaddle provides packages for all mainstream Python versions. If your environment has several Python interpreters, run pip with the one you intend to use; for example, for Python 3.10 the command is python3.10 -m pip install paddlepaddle.
The commands above install the AVX + MKL packages by default.
To check whether your machine supports AVX, run the command below; if the output contains avx, the machine supports it. PaddlePaddle no longer ships packages for CPUs without AVX.
cat /proc/cpuinfo | grep -i avx
4.1. Install paddlepaddle
python -m pip install paddlepaddle-gpu==3.2.0
After installation, start the Python interpreter with python3, run import paddle, then paddle.utils.run_check().
If it prints PaddlePaddle is installed successfully!, the installation succeeded.
$ python
Python 3.9.10 (main, Feb 9 2022, 00:00:00)
[GCC 11.2.1 20220127 (Red Hat 11.2.1-9)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import paddle
>>> paddle.utils.run_check()
Running verify PaddlePaddle program ...
.local/lib/python3.9/site-packages/paddle/pir/math_op_patch.py:219: UserWarning: Value do not have 'place' interface for pir graph mode, try not to use it. None will be returned.
warnings.warn(
I1104 22:36:19.934154 554814 pir_interpreter.cc:1524] New Executor is Running ...
I1104 22:36:19.934784 554814 pir_interpreter.cc:1547] pir interpreter is running by multi-thread mode ...
PaddlePaddle works well on 1 CPU.
PaddlePaddle is installed successfully! Let's start deep learning with PaddlePaddle now.
>>>
import paddle
print(paddle.__version__)
print(paddle.device.get_device())        # should report a GPU device, e.g. gpu:0
print(paddle.is_compiled_with_cuda())    # should return True
To uninstall PaddlePaddle:
CPU build: python3 -m pip uninstall paddlepaddle
GPU build: python3 -m pip uninstall paddlepaddle-gpu
4.2. Install paddleocr:
# If you only want to use the basic text recognition feature (returns text position coordinates and content), including the PP-OCR series
python -m pip install paddleocr
# If you want to use all features such as document parsing, document understanding, document translation,
# key information extraction, etc.
python -m pip install "paddleocr[all]"
python -m pip install "paddleocr[doc-parser]"
Check the installation result:
pip list | grep paddle
paddleocr 3.3.1
paddlepaddle 3.2.0
paddlex 3.3.6
Test with the demo images:
paddleocr doc_img_orientation_classification -i https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/img_rot180_demo.jpg
paddleocr text_image_unwarping -i https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/doc_test.jpg
paddleocr textline_orientation_classification -i https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/textline_rot180_demo.jpg
paddleocr text_detection -i https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_001.png
paddleocr text_recognition -i https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_rec_001.png
paddleocr ocr -i https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_002.png
paddleocr pp_structurev3 -i https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/pp_structure_v3_demo.png
# The PP-OCRv5 models are used by default
paddleocr ocr -i https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_002.png \
--use_doc_orientation_classify False \
--use_doc_unwarping False \
--use_textline_orientation False \
--save_path ./output \
--device gpu:0
# Use --ocr_version to select another PP-OCR version
paddleocr ocr -i ./general_ocr_002.png --ocr_version PP-OCRv4
# Use --text_detection_model_dir to point to a local model directory
paddleocr ocr -i ./general_ocr_002.png --text_detection_model_dir your_det_model_path
# PP-OCRv5_server_det is the default text detection model; if you fine-tuned a different model, change the model name with --text_detection_model_name
paddleocr ocr -i ./general_ocr_002.png --text_detection_model_name PP-OCRv5_mobile_det --text_detection_model_dir your_v5_mobile_det_model_path
paddleocr pp_structurev3 -i https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/pp_structure_v3_demo.png
# Use --use_doc_orientation_classify to toggle the document orientation classification model
paddleocr pp_structurev3 -i ./pp_structure_v3_demo.png --use_doc_orientation_classify True
# Use --use_doc_unwarping to toggle the document image unwarping module
paddleocr pp_structurev3 -i ./pp_structure_v3_demo.png --use_doc_unwarping True
# Use --use_textline_orientation to toggle the text line orientation classification model
paddleocr pp_structurev3 -i ./pp_structure_v3_demo.png --use_textline_orientation False
# Use --device to run inference on the GPU
paddleocr pp_structurev3 -i ./pp_structure_v3_demo.png --device gpu
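The same pipelines can also be called from Python instead of the CLI. Below is a minimal sketch of the OCR (PP-OCRv5) pipeline via the PaddleOCR 3.x Python API; the parameter and method names follow the official quick start and may need adjusting for your version:

from paddleocr import PaddleOCR

# Build the OCR pipeline; the three switches mirror the CLI flags used above
ocr = PaddleOCR(
    use_doc_orientation_classify=False,
    use_doc_unwarping=False,
    use_textline_orientation=False,
)

result = ocr.predict("./general_ocr_002.png")
for res in result:
    res.print()                   # print recognized text and boxes
    res.save_to_img("./output")   # save the visualization
    res.save_to_json("./output")  # save the structured result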
4.3. Install the PaddleX serving deployment plugin
Run the following command to install the serving plugin through the PaddleX CLI:
paddlex --install serving
# Start the OCR (PP-OCRv5) service
paddlex --serve --pipeline OCR --port 8080
# Start the PP-StructureV3 document parsing service
paddlex --serve --pipeline PP-StructureV3 --port 8081
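Before running the client programs in section 5, a quick way to confirm a service is up is to check that its port accepts connections (a minimal sketch; host and ports follow the commands above):

import socket

for port in (8080, 8081):
    try:
        with socket.create_connection(("localhost", port), timeout=3):
            print(f"port {port}: reachable")
    except OSError as exc:
        print(f"port {port}: not reachable ({exc})")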
At startup you may hit the error: AttributeError: 'TextDetPredictor' object has no attribute '_pp_option'.
The cause was a missing inference.json file in the downloaded model directory .paddlex/official_models/PP-OCRv5_server_det. Following the hint printed at runtime, deleting the model folder /root/.paddlex/official_models/PP-OCRv5_server_det and rerunning the program to trigger a fresh model download made the error go away. See the corresponding CSDN blog post: PaddleOCR 报错 AttributeError: 'TextDetPredictor' object has no attribute '_pp_option'.
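A small helper to spot this situation before starting the service: check whether inference.json is present in the cached model directory (a sketch; the paths assume the default ~/.paddlex cache used above):

from pathlib import Path

model_dir = Path.home() / ".paddlex" / "official_models" / "PP-OCRv5_server_det"
marker = model_dir / "inference.json"   # the file reported missing in the error above
if model_dir.exists() and not marker.exists():
    print(f"{marker} is missing; delete {model_dir} and rerun to re-download the model")
elif model_dir.exists():
    print(f"{model_dir} looks OK")
else:
    print(f"{model_dir} not downloaded yet")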
yum install ccache
5. Test programs
5.1. Python version:
import base64
import requests

API_URL = "http://localhost:8080/ocr"
file_path = "./general_ocr_002.png"

# Read the image and base64-encode it for the JSON payload
with open(file_path, "rb") as file:
    file_bytes = file.read()
file_data = base64.b64encode(file_bytes).decode("ascii")
payload = {"file": file_data, "fileType": 1}  # fileType 1 = image

response = requests.post(API_URL, json=payload)
print("\r\nresponse :")
print(response)
assert response.status_code == 200
result = response.json()["result"]

print("\r\nresponse result:")
for item in result:          # top-level keys of the result object
    print(item)

print("\r\nresult ocrResults:")
for i, res in enumerate(result["ocrResults"]):
    for item in res:         # keys of each OCR result entry
        print(item)

print("\r\nresult dataInfo:")
for i, res in enumerate(result["dataInfo"]):
    for item in res:
        print(item)
5.2. C++ version (this and the following examples call a formula-recognition pipeline service; adjust the URL and port if you are testing the OCR service started above):
#include <iostream>
#include <fstream>
#include <vector>
#include <string>
#include "cpp-httplib/httplib.h" // https://github.com/Huiyicc/cpp-httplib
#include "nlohmann/json.hpp" // https://github.com/nlohmann/json
#include "base64.hpp" // https://github.com/tobiaslocker/base64
int main() {
httplib::Client client("localhost", 8080);
const std::string filePath = "./demo.jpg";
std::ifstream file(filePath, std::ios::binary | std::ios::ate);
if (!file) {
std::cerr << "Error opening file: " << filePath << std::endl;
return 1;
}
std::streamsize size = file.tellg();
file.seekg(0, std::ios::beg);
std::vector<char> buffer(size);
if (!file.read(buffer.data(), size)) {
std::cerr << "Error reading file." << std::endl;
return 1;
}
std::string bufferStr(buffer.data(), static_cast<std::size_t>(size));
std::string encodedFile = base64::to_base64(bufferStr);
nlohmann::json jsonObj;
jsonObj["file"] = encodedFile;
jsonObj["fileType"] = 1;
auto response = client.Post("/formula-recognition", jsonObj.dump(), "application/json");
if (response && response->status == 200) {
nlohmann::json jsonResponse = nlohmann::json::parse(response->body);
auto result = jsonResponse["result"];
if (!result.is_object() || !result["formulaRecResults"].is_array()) {
std::cerr << "Unexpected response format." << std::endl;
return 1;
}
for (size_t i = 0; i < result["formulaRecResults"].size(); ++i) {
auto res = result["formulaRecResults"][i];
if (res.contains("prunedResult")) {
std::cout << "Recognized formula: " << res["prunedResult"].dump() << std::endl;
}
if (res.contains("outputImages") && res["outputImages"].is_object()) {
for (auto& [imgName, imgData] : res["outputImages"].items()) {
std::string outputPath = imgName + "_" + std::to_string(i) + ".jpg";
std::string decodedImage = base64::from_base64(imgData.get<std::string>());
std::ofstream outFile(outputPath, std::ios::binary);
if (outFile.is_open()) {
outFile.write(decodedImage.c_str(), decodedImage.size());
outFile.close();
std::cout << "Saved image: " << outputPath << std::endl;
} else {
std::cerr << "Failed to write image: " << outputPath << std::endl;
}
}
}
}
} else {
std::cerr << "Request failed." << std::endl;
if (response) {
std::cerr << "HTTP status: " << response->status << std::endl;
std::cerr << "Response body: " << response->body << std::endl;
}
return 1;
}
return 0;
}
5.3. Java version:
import okhttp3.*;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.node.ObjectNode;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Base64;
public class Main {
public static void main(String[] args) throws IOException {
String API_URL = "http://localhost:8080/formula-recognition";
String imagePath = "./demo.jpg";
File file = new File(imagePath);
byte[] fileContent = java.nio.file.Files.readAllBytes(file.toPath());
String base64Image = Base64.getEncoder().encodeToString(fileContent);
ObjectMapper objectMapper = new ObjectMapper();
ObjectNode payload = objectMapper.createObjectNode();
payload.put("file", base64Image);
payload.put("fileType", 1);
OkHttpClient client = new OkHttpClient();
MediaType JSON = MediaType.get("application/json; charset=utf-8");
RequestBody body = RequestBody.create(JSON, payload.toString());
Request request = new Request.Builder()
.url(API_URL)
.post(body)
.build();
try (Response response = client.newCall(request).execute()) {
if (response.isSuccessful()) {
String responseBody = response.body().string();
JsonNode root = objectMapper.readTree(responseBody);
JsonNode result = root.get("result");
JsonNode formulaRecResults = result.get("formulaRecResults");
for (int i = 0; i < formulaRecResults.size(); i++) {
JsonNode item = formulaRecResults.get(i);
int finalI = i;
JsonNode prunedResult = item.get("prunedResult");
System.out.println("Pruned Result [" + i + "]: " + prunedResult.toString());
JsonNode outputImages = item.get("outputImages");
if (outputImages != null && outputImages.isObject()) {
outputImages.fieldNames().forEachRemaining(imgName -> {
try {
String imgBase64 = outputImages.get(imgName).asText();
byte[] imgBytes = Base64.getDecoder().decode(imgBase64);
String imgPath = imgName + "_" + finalI + ".jpg";
try (FileOutputStream fos = new FileOutputStream(imgPath)) {
fos.write(imgBytes);
System.out.println("Saved image: " + imgPath);
}
} catch (IOException e) {
System.err.println("Failed to save image: " + e.getMessage());
}
});
}
}
} else {
System.err.println("Request failed with HTTP code: " + response.code());
}
}
}
}
5.4. C# version:
using System;
using System.IO;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using Newtonsoft.Json.Linq;
class Program
{
static readonly string API_URL = "http://localhost:8080/formula-recognition";
static readonly string inputFilePath = "./demo.jpg";
//static readonly string API_URL = "http://192.168.31.41:8080/ocr";
//static readonly string inputFilePath = "./general_ocr_002.png";
static async Task Main(string[] args)
{
var httpClient = new HttpClient();
byte[] fileBytes = File.ReadAllBytes(inputFilePath);
string fileData = Convert.ToBase64String(fileBytes);
var payload = new JObject
{
{ "file", fileData },
{ "fileType", 1 }
};
var content = new StringContent(payload.ToString(), Encoding.UTF8, "application/json");
HttpResponseMessage response = await httpClient.PostAsync(API_URL, content);
response.EnsureSuccessStatusCode();
string responseBody = await response.Content.ReadAsStringAsync();
JObject jsonResponse = JObject.Parse(responseBody);
JArray formulaRecResults = (JArray)jsonResponse["result"]["formulaRecResults"];
for (int i = 0; i < formulaRecResults.Count; i++)
{
var res = formulaRecResults[i];
Console.WriteLine($"[{i}] prunedResult:\n{res["prunedResult"]}");
JObject outputImages = res["outputImages"] as JObject;
if (outputImages != null)
{
foreach (var img in outputImages)
{
string imgName = img.Key;
string base64Img = img.Value?.ToString();
if (!string.IsNullOrEmpty(base64Img))
{
string imgPath = $"{imgName}_{i}.jpg";
byte[] imageBytes = Convert.FromBase64String(base64Img);
File.WriteAllBytes(imgPath, imageBytes);
Console.WriteLine($"Output image saved at {imgPath}");
}
}
}
}
}
}