inner class & inner static class diff

This article discusses the key differences between inner classes and static nested classes in Java, including their association with the outer class, their access to outer-class members, and whether they can contain static members.


In Java, inner classes and static nested classes (often called "inner static classes") are both nested classes, but they differ in a few key ways.

  1. Inner Class:

    • An inner class is a non-static class defined within another class.
    • It has access to the instance variables and methods of the outer class, including private members.
    • An instance of an inner class cannot exist without an enclosing instance of the outer class (see the instantiation sketch after the example below).
    • Because every inner-class instance is tied to an instance of the outer class, it can read and write that outer instance's members directly.
    • Before Java 16, an inner class could not declare static members other than compile-time constant fields; Java 16 and later versions lift this restriction.
    • Example:
      
      
      public class Outer {
          private int outerField;
          
          public class Inner {
              public void doSomething() {
                  outerField = 10; // Accessing outer class member
              }
          }
      }
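
    • Instantiation sketch (reusing the Outer/Inner names from the example above): an inner class is created through an existing instance of the outer class, using the outer.new Inner() syntax.
      
      Outer outer = new Outer();
      Outer.Inner inner = outer.new Inner(); // requires the enclosing Outer instance
      inner.doSomething();                   // writes outer.outerField through the enclosing instance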
      

  2. Inner Static Class (Static Nested Class):

    • A static nested class is a class declared with the static modifier inside another class.
    • It cannot access the instance variables and methods of the outer class directly, but it can use the outer class's static members (including private static ones).
    • An instance of a static nested class can exist without an instance of the outer class (see the instantiation sketch after the example below).
    • Static nested classes are associated with the class itself, not with instances of the outer class.
    • Static nested classes can contain static members.
    • Example:
      public class Outer {
          private static int outerStaticField;
          
          public static class StaticInner {
              public void doSomething() {
                  outerStaticField = 10; // Accessing outer class static member
              }
          }
      }
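
    • Instantiation sketch (reusing the Outer/StaticInner names from the example above): no enclosing instance is needed; the outer class acts only as a namespace.
      
      Outer.StaticInner nested = new Outer.StaticInner(); // created without any Outer instance
      nested.doSomething();                               // writes the static field Outer.outerStaticField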
      

In summary, the main differences between inner classes and static nested classes in Java come down to their association with the outer class, their access to outer-class members, and whether they may declare static members.
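
To make the static-member rule concrete, the sketch below marks which declarations compile. Note that the inner-class case depends on the Java version (the non-constant static field is rejected before Java 16), so treat this as an illustrative sketch rather than a universal rule.

      public class Outer {
          public class Inner {
              static final int LIMIT = 10;  // compile-time constant: allowed in any Java version
              // static int counter = 0;    // rejected before Java 16, accepted by Java 16 and later
          }
          
          public static class StaticInner {
              static int counter = 0;       // always allowed in a static nested class
          }
      }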
