面向教室场景的视觉透视畸变校正与位置感知多方法研究_视图感知位置编码(透视畸变)-优快云博客

摘要
在基于单目视觉的教室智能分析系统中，透视投影固有的“近大远小”效应导致学生图像中的表观尺寸（像素距离）与其实际物理位置严重不符，破坏了基于像素的空间度量一致性。为从根本上解决此问题，本文系统性地研究并提出了四种核心的技术路径：基于平面假设的逆透视变换法、基于多视角几何的立体视觉法、基于深度传感器的深度信息映射法，以及基于神经网络的端到端学习法。论文详细阐述了每种方法的技术原理、实施步骤、适用条件及优缺点，并通过对比分析，为不同成本、精度和实施难度的应用场景提供了全面的解决方案选型依据。

关键词：透视畸变；位置感知；逆透视变换；立体视觉；深度感知；神经网络；多方法比较

1. 引言

智能教育场景下，利用计算机视觉技术对学生位置、行为及互动进行无感化分析已成为研究热点。然而，部署于教室前部或顶部的单目摄像头所捕获的图像存在严重的透视畸变：一个前排学生可能在图像中占据数百像素，而后排学生仅数十像素。这种尺度与位置的失真使得基于像素的直接度量（如距离计算、区域判断）失去物理意义，严重影响了考勤、注意力监测、课堂参与度分析等应用的准确性与公平性。

为解决此核心瓶颈，单一技术方案往往存在局限性。本文旨在打破单一方法论的约束，构建一个多层次、多模态的技术体系，从不同维度攻克透视畸变问题，为构建高鲁棒性的教室视觉感知系统提供坚实的技术支撑。

2. 技术方案一：基于平面假设的逆透视变换法

此方法是解决地面目标位置感知最经典、应用最广泛的方案。

核心原理：假设所有目标（学生）均位于一个已知的平面上（如教室地面），通过计算图像平面与该物理平面之间的单应性矩阵，将原始图像重投影为一个无透视效果的顶视图。在此顶视图中，像素距离与真实物理距离成固定比例。
实施步骤：
1. 相机标定：使用张正友标定法，获取相机内参矩阵和畸变系数，校正镜头畸变。
2. 定义地面坐标系：在教室地面选择至少4个已知物理尺寸的控制点（如地砖角点）。
3. 计算单应性矩阵：建立控制点的图像像素坐标与其地面物理坐标的对应关系，求解单应性矩阵 HH。
4. 生成顶视图：对每一帧校正后的图像，利用 H−1H−1 进行逆透视变换，生成标准化的顶视图。
5. 目标定位：在顶视图上进行目标检测，目标的像素坐标可直接通过预设比例尺转换为物理坐标。
优点：原理清晰、计算效率高、结果精确（在平面假设成立时）。
缺点：强烈依赖于“地面平坦”的假设，无法处理高度变化（如学生举手、起立）或非地面目标。

3. 技术方案二：基于多视角几何的立体视觉法

该方法通过模拟人眼双目视觉，直接恢复场景的三维信息。

核心原理：使用两个（或多个）相机从不同视角对同一场景进行拍摄。通过找到同一目标点在两个图像中的对应点（立体匹配），利用三角测量原理计算该点的三维坐标。
实施步骤：
1. 系统标定：分别对每个相机进行内参标定，再进行立体标定，获取两个相机之间的相对位置和姿态（旋转矩阵 RR 和平移向量 TT）。
2. 立体校正：将两个相机的图像平面重投影到共面且行对准的平面上，使得匹配点只需在同一水平线上搜索，极大简化计算。
3. 立体匹配：在校正后的图像对中，为左图的每个像素点在右图中寻找对应点，计算视差。
4. 深度计算：根据公式 Z=f∗B/dZ=f∗B/d（其中 ff 为焦距，BB 为基线距离，dd 为视差）计算每个点的深度值 ZZ，进而得到完整的三维坐标 (X,Y,Z)(X,Y,Z)。
优点：能够直接获取场景的深度信息，不受平面假设限制，可处理任意位置的目标。
缺点：系统部署复杂、成本高；立体匹配算法计算量大，且在高纹理重复、遮挡区域易匹配失败。

4. 技术方案三：基于深度传感器的深度信息映射法

利用主动感知设备，绕过复杂的视觉算法，直接获取深度图。

核心原理：采用如英特尔RealSense、微软Kinect或苹果LiDAR等深度摄像头。这些设备通过结构光、飞行时间法或主动立体视觉等技术，主动发射并接收光信号，直接为每一个像素输出其与相机的距离（深度值）。
实施步骤：
1. 数据采集：深度传感器同步输出彩色图像和对应的深度图。
2. 坐标对齐与配准：将彩色图像与深度图进行像素级对齐。
3. 坐标转换：利用相机的内参，将每个像素的 (u,v,d)(u,v,d) 坐标（dd 为深度值）通过以下公式转换为三维物理坐标 (X,Y,Z)(X,Y,Z)：
  X=(u−cx)∗d/fxX=(u−cx)∗d/fx
  Y=(v−cy)∗d/fyY=(v−cy)∗d/fy
  Z=dZ=d
4. 物理空间分析：在得到的三维点云或坐标数据上进行目标检测和空间关系分析。
优点：深度信息直接、准确、稳定，不受纹理光照影响，实时性强。
缺点：设备成本高于普通摄像头；有效测量距离和范围有限；可能受强环境光干扰。

5. 技术方案四：基于神经网络的端到端学习法

一种数据驱动的颠覆性思路，让网络直接从数据中学习畸变校正或坐标映射。

核心原理：构建一个深度神经网络，其输入是原始透视图像，输出可以是：
- 校正后的图像：网络学习一个类似于逆透视变换的映射，生成标准顶视图。
- 直接的物理坐标：网络在目标检测的同时，直接回归每个目标在地面坐标系下的 (X,Y)(X,Y) 坐标。
- 深度图：网络根据单张RGB图像，估计每个像素的深度值。
实施步骤：
1. 数据准备：收集大量教室场景图像，并制作真值。真值可以是：
  - 对应顶视图的图像。
  - 图像中每个学生边界框及其对应的地面物理坐标。
  - 通过深度传感器获取的深度图。
2. 模型训练：选择适当的网络架构（如CNN、Transformer），以原始图像为输入，以对应的真值为监督信号，进行端到端训练。
3. 推理部署：将训练好的模型部署到应用系统中，对新的图像进行前向传播，得到校正后的结果或直接的目标物理坐标。
优点：潜力巨大，无需复杂的相机标定和手工建模，具备强大的场景自适应能力。
缺点：依赖大量、高质量、已标注的训练数据；模型为“黑箱”，可解释性差；泛化能力可能受训练集分布影响。

6. 多方法对比分析与选型建议

表：四种技术方案综合对比

方法	核心原理	精度	成本	复杂度	适用场景	主要局限
逆透视变换	平面投影几何	高（平面内）	低	中	学生座位定位、地面区域入侵检测	依赖平面假设，无法处理高度变化
立体视觉	多视角三角测量	高	中-高	高	高精度3D定位、姿态估计、任意目标测量	部署复杂，匹配计算量大，有遮挡问题
深度传感器	主动光学测距	高	中-高	低	实时3D重建、机器人导航、对精度和实时性要求高的应用	成本较高，测量范围和环境受限
端到端学习	数据驱动映射	中-高（依赖数据）	低（部署后）	高（训练阶段）	大规模部署、场景自适应要求高、可作为其他方法的补充	依赖大量标注数据，模型泛化能力存疑

选型建议：

追求性价比与地面定位：首选逆透视变换法，其在满足地面目标定位需求时，是成本最低、效果最稳定的方案。
需要完整三维信息：若需分析学生举手、起立等三维姿态，且有预算，可选择立体视觉法或深度传感器法。其中，深度传感器更稳定、易用。
面向未来与大规模应用：对于有海量数据积累的大型科技公司或研究机构，可积极探索端到端学习法，将其作为提升系统自适应能力和性能上限的长期技术方向。

7. 结论与展望

透视畸变是教室视觉分析中一个基础且关键的问题。本文系统性地提出了逆透视变换、立体视觉、深度传感和端到端学习四种技术路径，它们从不同维度提供了解决方案，各有千秋，互为补充。

未来，融合多种方法的混合模型将成为趋势。例如，利用逆透视变换进行快速初定位，辅以轻量级深度学习模型进行高度估计或异常校正；或利用生成式AI合成大量训练数据，以解决端到端学习方法的数据瓶颈。通过多技术路径的协同与融合，有望在复杂的真实教室环境中，构建起更加鲁棒、精准和公平的智能视觉感知系统。

方法一：逆透视变换法的完整代码实现

python

复制

下载

import cv2
import numpy as np
import matplotlib.pyplot as plt

class PerspectiveCorrector:
    def __init__(self, camera_matrix=None, dist_coeffs=None):
        """
        初始化透视校正器
        Args:
            camera_matrix: 相机内参矩阵
            dist_coeffs: 畸变系数
        """
        self.camera_matrix = camera_matrix
        self.dist_coeffs = dist_coeffs
        self.homography_matrix = None
        self.inv_homography_matrix = None
        
    def calibrate_camera(self, calibration_images, pattern_size=(9, 6), square_size=0.025):
        """
        相机标定 - 张正友标定法
        Args:
            calibration_images: 标定板图像列表
            pattern_size: 棋盘格内角点数量
            square_size: 棋盘格方格大小(米)
        """
        # 准备物体点: (0,0,0), (1,0,0), (2,0,0) ....,(8,5,0)
        objp = np.zeros((pattern_size[0] * pattern_size[1], 3), np.float32)
        objp[:, :2] = np.mgrid[0:pattern_size[0], 0:pattern_size[1]].T.reshape(-1, 2)
        objp *= square_size

        # 存储物体点和图像点
        objpoints = []  # 真实世界3D点
        imgpoints = []  # 图像2D点

        for img in calibration_images:
            if isinstance(img, str):
                gray = cv2.imread(img, cv2.IMREAD_GRAYSCALE)
            else:
                gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

            # 查找棋盘格角点
            ret, corners = cv2.findChessboardCorners(gray, pattern_size, None)

            if ret:
                objpoints.append(objp)
                
                # 亚像素精确化
                criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)
                corners_refined = cv2.cornerSubPix(gray, corners, (11, 11), (-1, -1), criteria)
                imgpoints.append(corners_refined)

        # 相机标定
        ret, self.camera_matrix, self.dist_coeffs, rvecs, tvecs = cv2.calibrateCamera(
            objpoints, imgpoints, gray.shape[::-1], None, None)
        
        print(f"相机标定完成，重投影误差: {ret}")
        return ret

    def compute_homography(self, image_points, world_points):
        """
        计算单应性矩阵
        Args:
            image_points: 图像中的4个点坐标 [[x1,y1], [x2,y2], ...]
            world_points: 真实世界中的4个点坐标 [[X1,Y1], [X2,Y2], ...]
        """
        # 确保输入格式正确
        image_points = np.array(image_points, dtype=np.float32)
        world_points = np.array(world_points, dtype=np.float32)
        
        # 计算单应性矩阵
        self.homography_matrix, status = cv2.findHomography(image_points, world_points)
        self.inv_homography_matrix = np.linalg.inv(self.homography_matrix)
        
        return self.homography_matrix

    def undistort_image(self, image):
        """校正图像畸变"""
        if self.camera_matrix is not None and self.dist_coeffs is not None:
            return cv2.undistort(image, self.camera_matrix, self.dist_coeffs)
        return image

    def create_birdseye_view(self, image, output_size=(800, 600)):
        """生成鸟瞰图"""
        if self.homography_matrix is None:
            raise ValueError("请先计算单应性矩阵")
            
        # 校正畸变
        undistorted = self.undistort_image(image)
        
        # 透视变换
        birdseye = cv2.warpPerspective(undistorted, self.homography_matrix, output_size)
        return birdseye

    def pixel_to_world(self, pixel_points):
        """将像素坐标转换为世界坐标"""
        if self.homography_matrix is None:
            raise ValueError("请先计算单应性矩阵")
            
        pixel_points = np.array(pixel_points, dtype=np.float32)
        if len(pixel_points.shape) == 1:
            pixel_points = pixel_points.reshape(1, -1)
            
        # 添加齐次坐标
        pixel_homo = np.column_stack([pixel_points, np.ones(len(pixel_points))])
        
        # 变换到世界坐标
        world_homo = np.dot(self.homography_matrix, pixel_homo.T).T
        world_coords = world_homo[:, :2] / world_homo[:, 2:3]
        
        return world_coords

    def world_to_pixel(self, world_points):
        """将世界坐标转换为像素坐标"""
        if self.inv_homography_matrix is None:
            raise ValueError("请先计算单应性矩阵")
            
        world_points = np.array(world_points, dtype=np.float32)
        if len(world_points.shape) == 1:
            world_points = world_points.reshape(1, -1)
            
        # 添加齐次坐标
        world_homo = np.column_stack([world_points, np.ones(len(world_points))])
        
        # 变换到像素坐标
        pixel_homo = np.dot(self.inv_homography_matrix, world_homo.T).T
        pixel_coords = pixel_homo[:, :2] / pixel_homo[:, 2:3]
        
        return pixel_coords

# 使用示例
def demo_perspective_correction():
    # 初始化校正器
    corrector = PerspectiveCorrector()
    
    # 假设我们已经有了标定好的相机参数（实际应用中需要通过标定获得）
    # 这里使用模拟参数
    corrector.camera_matrix = np.array([
        [800, 0, 320],
        [0, 800, 240],
        [0, 0, 1]
    ], dtype=np.float32)
    corrector.dist_coeffs = np.zeros((4, 1))
    
    # 定义地面控制点（图像坐标和世界坐标）
    # 图像坐标 (从原始图像中选择4个点)
    image_points = [
        [100, 100],   # 左上
        [540, 100],   # 右上
        [540, 380],   # 右下
        [100, 380]    # 左下
    ]
    
    # 对应的世界坐标 (米)
    world_points = [
        [0, 0],       # 左上 (0,0)
        [5, 0],       # 右上 (5,0)  
        [5, 4],       # 右下 (5,4)
        [0, 4]        # 左下 (0,4)
    ]
    
    # 计算单应性矩阵
    H = corrector.compute_homography(image_points, world_points)
    print("单应性矩阵:\n", H)
    
    # 加载测试图像
    # image = cv2.imread("classroom.jpg")
    # 这里用模拟图像代替
    image = np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8)
    
    # 生成鸟瞰图
    birdseye = corrector.create_birdseye_view(image)
    
    # 测试坐标转换
    test_pixel = [320, 240]  # 图像中心点
    world_coord = corrector.pixel_to_world(test_pixel)
    print(f"像素坐标 {test_pixel} -> 世界坐标 {world_coord[0]}")
    
    # 测试反向转换
    test_world = [2.5, 2.0]  # 世界坐标中心
    pixel_coord = corrector.world_to_pixel(test_world)
    print(f"世界坐标 {test_world} -> 像素坐标 {pixel_coord[0]}")
    
    return corrector, birdseye

# 运行示例
if __name__ == "__main__":
    corrector, result = demo_perspective_correction()

方法二：深度传感器法的代码实现

python

复制

下载

import cv2
import numpy as np
import pyrealsense2 as rs  # Intel RealSense SDK

class DepthSensorProcessor:
    def __init__(self):
        """初始化深度传感器处理器"""
        self.pipeline = None
        self.config = None
        self.align = None
        self.depth_scale = None
        self.intrinsics = None
        
    def initialize_realsense(self):
        """初始化RealSense相机"""
        try:
            # 创建管道和配置
            self.pipeline = rs.pipeline()
            self.config = rs.config()
            
            # 配置流
            self.config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
            self.config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
            
            # 启动管道
            profile = self.pipeline.start(self.config)
            
            # 获取深度缩放比例
            depth_sensor = profile.get_device().first_depth_sensor()
            self.depth_scale = depth_sensor.get_depth_scale()
            
            # 创建对齐对象（将深度图与彩色图对齐）
            align_to = rs.stream.color
            self.align = rs.align(align_to)
            
            # 获取相机内参
            color_profile = profile.get_stream(rs.stream.color)
            self.intrinsics = color_profile.as_video_stream_profile().get_intrinsics()
            
            print("RealSense相机初始化成功")
            return True
            
        except Exception as e:
            print(f"RealSense初始化失败: {e}")
            return False
    
    def get_aligned_frames(self):
        """获取对齐的深度图和彩色图"""
        if self.pipeline is None:
            raise ValueError("请先初始化RealSense相机")
            
        # 等待一组连贯的帧
        frames = self.pipeline.wait_for_frames()
        
        # 对齐深度帧到彩色帧
        aligned_frames = self.align.process(frames)
        
        # 获取对齐后的深度帧和彩色帧
        depth_frame = aligned_frames.get_depth_frame()
        color_frame = aligned_frames.get_color_frame()
        
        if not depth_frame or not color_frame:
            return None, None
            
        # 转换为numpy数组
        depth_image = np.asanyarray(depth_frame.get_data())
        color_image = np.asanyarray(color_frame.get_data())
        
        return depth_image, color_image
    
    def depth_to_3d_coordinates(self, depth_image, pixel_points):
        """
        将深度图中的像素坐标转换为3D世界坐标
        Args:
            depth_image: 深度图
            pixel_points: 像素坐标列表 [[x1,y1], [x2,y2], ...]
        Returns:
            3D坐标列表 [[X1,Y1,Z1], [X2,Y2,Z2], ...] (单位: 米)
        """
        if self.intrinsics is None:
            raise ValueError("相机内参未初始化")
            
        coordinates_3d = []
        
        for pixel in pixel_points:
            x, y = int(pixel[0]), int(pixel[1])
            
            # 确保坐标在图像范围内
            if (0 <= x < depth_image.shape[1] and 0 <= y < depth_image.shape[0]):
                # 获取深度值（单位：米）
                depth = depth_image[y, x] * self.depth_scale
                
                if depth > 0:  # 有效的深度值
                    # 使用相机内参将像素坐标和深度转换为3D坐标
                    point_3d = rs.rs2_deproject_pixel_to_point(
                        self.intrinsics, [x, y], depth
                    )
                    coordinates_3d.append(point_3d)
                else:
                    coordinates_3d.append([0, 0, 0])  # 无效深度
            else:
                coordinates_3d.append([0, 0, 0])  # 无效坐标
        
        return np.array(coordinates_3d)
    
    def create_depth_colormap(self, depth_image, clip_max=5000):
        """
        将深度图转换为彩色图用于可视化
        Args:
            depth_image: 原始深度图
            clip_max: 最大深度值限制（毫米）
        """
        # 应用颜色映射
        depth_colormap = cv2.applyColorMap(
            cv2.convertScaleAbs(depth_image, alpha=0.03), 
            cv2.COLORMAP_JET
        )
        return depth_colormap
    
    def calculate_distance_between_points(self, point1_3d, point2_3d):
        """计算两个3D点之间的欧氏距离"""
        return np.linalg.norm(np.array(point1_3d) - np.array(point2_3d))
    
    def detect_students_and_calculate_distances(self, color_image, depth_image, detection_model=None):
        """
        检测学生并计算真实世界距离
        Args:
            color_image: 彩色图像
            depth_image: 深度图像
            detection_model: 目标检测模型（如YOLO）
        Returns:
            学生位置和距离信息
        """
        if detection_model is None:
            # 使用简单的背景减除作为示例
            return self._simple_student_detection(color_image, depth_image)
        else:
            # 使用深度学习模型进行检测
            return self._deep_learning_detection(color_image, depth_image, detection_model)
    
    def _simple_student_detection(self, color_image, depth_image):
        """简单的学生检测方法（示例）"""
        # 转换为灰度图
        gray = cv2.cvtColor(color_image, cv2.COLOR_BGR2GRAY)
        
        # 应用背景减除（这里使用简单阈值，实际应用应该用更复杂的方法）
        _, thresh = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
        
        # 查找轮廓
        contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        
        student_positions = []
        
        for contour in contours:
            if cv2.contourArea(contour) > 1000:  # 过滤小轮廓
                # 获取边界框
                x, y, w, h = cv2.boundingRect(contour)
                
                # 计算底部中心点（假设这是学生的脚部位置）
                foot_point = [x + w//2, y + h]
                
                # 转换为3D坐标
                point_3d = self.depth_to_3d_coordinates(depth_image, [foot_point])[0]
                
                if point_3d[2] > 0:  # 有效深度
                    student_positions.append({
                        'bbox': (x, y, w, h),
                        'pixel_position': foot_point,
                        'world_position': point_3d,
                        'depth': point_3d[2]
                    })
        
        # 计算学生之间的距离
        distances = []
        for i in range(len(student_positions)):
            for j in range(i + 1, len(student_positions)):
                dist = self.calculate_distance_between_points(
                    student_positions[i]['world_position'],
                    student_positions[j]['world_position']
                )
                distances.append({
                    'student1': i,
                    'student2': j,
                    'distance': dist
                })
        
        return student_positions, distances
    
    def visualize_results(self, color_image, depth_image, student_positions, distances):
        """可视化结果"""
        # 创建深度彩色图
        depth_colormap = self.create_depth_colormap(depth_image)
        
        # 在彩色图上绘制检测结果
        result_image = color_image.copy()
        
        for i, student in enumerate(student_positions):
            x, y, w, h = student['bbox']
            
            # 绘制边界框
            cv2.rectangle(result_image, (x, y), (x + w, y + h), (0, 255, 0), 2)
            
            # 绘制位置点
            px, py = student['pixel_position']
            cv2.circle(result_image, (px, py), 5, (255, 0, 0), -1)
            
            # 显示深度信息
            depth_text = f"D: {student['depth']:.2f}m"
            cv2.putText(result_image, depth_text, (x, y - 10), 
                       cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 255, 255), 1)
        
        # 显示距离信息
        for dist_info in distances[:3]:  # 只显示前3个距离
            s1 = student_positions[dist_info['student1']]
            s2 = student_positions[dist_info['student2']]
            
            # 绘制连线
            cv2.line(result_image, 
                    tuple(s1['pixel_position']), 
                    tuple(s2['pixel_position']), 
                    (0, 0, 255), 2)
            
            # 显示距离
            mid_point = (
                (s1['pixel_position'][0] + s2['pixel_position'][0]) // 2,
                (s1['pixel_position'][1] + s2['pixel_position'][1]) // 2
            )
            dist_text = f"{dist_info['distance']:.2f}m"
            cv2.putText(result_image, dist_text, mid_point, 
                       cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 1)
        
        return result_image, depth_colormap

# 使用示例
def demo_depth_sensing():
    """深度传感器方法演示"""
    processor = DepthSensorProcessor()
    
    # 初始化RealSense
    if processor.initialize_realsense():
        try:
            # 获取帧
            depth_image, color_image = processor.get_aligned_frames()
            
            if depth_image is not None and color_image is not None:
                # 检测学生并计算距离
                student_positions, distances = processor.detect_students_and_calculate_distances(
                    color_image, depth_image
                )
                
                # 可视化结果
                result_img, depth_colormap = processor.visualize_results(
                    color_image, depth_image, student_positions, distances
                )
                
                # 显示结果
                cv2.imshow('Color Image', color_image)
                cv2.imshow('Depth Map', depth_colormap)
                cv2.imshow('Detection Results', result_img)
                
                print(f"检测到 {len(student_positions)} 个学生")
                for i, dist in enumerate(distances):
                    print(f"学生 {dist['student1']} 和 学生 {dist['student2']} 的距离: {dist['distance']:.2f} 米")
                
                cv2.waitKey(0)
                cv2.destroyAllWindows()
                
        finally:
            # 停止管道
            if processor.pipeline:
                processor.pipeline.stop()
    
    return processor

# 如果没有RealSense相机，使用模拟数据测试
def demo_depth_sensing_simulation():
    """深度传感器模拟演示"""
    processor = DepthSensorProcessor()
    
    # 模拟相机内参
    processor.intrinsics = type('obj', (object,), {
        'fx': 600, 'fy': 600, 'ppx': 320, 'ppy': 240,
        'width': 640, 'height': 480, 'model': 0, 'coeffs': [0,0,0,0,0]
    })()
    processor.depth_scale = 0.001  # 1mm
    
    # 创建模拟深度图（简单的梯度）
    depth_image = np.zeros((480, 640), dtype=np.uint16)
    for i in range(480):
        depth_image[i, :] = 1000 + i * 2  # 深度从1m到1.96m
    
    # 创建模拟彩色图
    color_image = np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8)
    
    # 模拟学生位置（像素坐标）
    test_points = [[100, 200], [300, 250], [500, 300]]
    
    # 转换为3D坐标
    coordinates_3d = processor.depth_to_3d_coordinates(depth_image, test_points)
    
    print("深度传感器模拟测试:")
    for i, (pixel, world) in enumerate(zip(test_points, coordinates_3d)):
        print(f"点{i}: 像素{pixel} -> 3D坐标({world[0]:.2f}, {world[1]:.2f}, {world[2]:.2f})m")
    
    # 计算距离
    dist = processor.calculate_distance_between_points(coordinates_3d[0], coordinates_3d[1])
    print(f"点0和点1之间的距离: {dist:.2f}米")
    
    return processor

if __name__ == "__main__":
    # 如果有RealSense相机，运行真实演示
    # demo_depth_sensing()
    
    # 否则运行模拟演示
    demo_depth_sensing_simulation()

两种方法的集成使用示例

python

复制

下载

class MultiMethodDistanceCalculator:
    """多方法距离计算器"""
    
    def __init__(self):
        self.perspective_corrector = PerspectiveCorrector()
        self.depth_processor = DepthSensorProcessor()
        self.active_method = "perspective"  # 默认使用透视变换法
        
    def set_active_method(self, method):
        """设置激活的方法"""
        valid_methods = ["perspective", "depth"]
        if method in valid_methods:
            self.active_method = method
        else:
            raise ValueError(f"方法必须是: {valid_methods}")
    
    def calculate_distances(self, image, points):
        """
        根据激活的方法计算距离
        Args:
            image: 输入图像
            points: 点坐标列表
        Returns:
            距离矩阵
        """
        if self.active_method == "perspective":
            return self._calculate_with_perspective(image, points)
        else:
            return self._calculate_with_depth(image, points)
    
    def _calculate_with_perspective(self, image, points):
        """使用透视变换法计算距离"""
        # 转换到世界坐标
        world_coords = self.perspective_corrector.pixel_to_world(points)
        
        # 计算所有点对之间的欧氏距离
        n = len(world_coords)
        distance_matrix = np.zeros((n, n))
        
        for i in range(n):
            for j in range(n):
                if i != j:
                    dist = np.linalg.norm(world_coords[i] - world_coords[j])
                    distance_matrix[i][j] = dist
        
        return distance_matrix
    
    def _calculate_with_depth(self, image, points):
        """使用深度传感器法计算距离"""
        # 获取深度图（这里需要实际的深度传感器）
        depth_image, color_image = self.depth_processor.get_aligned_frames()
        
        # 转换为3D坐标
        coordinates_3d = self.depth_processor.depth_to_3d_coordinates(depth_image, points)
        
        # 计算距离矩阵
        n = len(coordinates_3d)
        distance_matrix = np.zeros((n, n))
        
        for i in range(n):
            for j in range(n):
                if i != j:
                    dist = self.depth_processor.calculate_distance_between_points(
                        coordinates_3d[i], coordinates_3d[j]
                    )
                    distance_matrix[i][j] = dist
        
        return distance_matrix

# 集成演示
def demo_integrated_system():
    """集成系统演示"""
    calculator = MultiMethodDistanceCalculator()
    
    # 测试点
    test_points = [[100, 100], [200, 150], [300, 200], [400, 250]]
    
    # 使用透视变换法
    calculator.set_active_method("perspective")
    distances_perspective = calculator.calculate_distances(None, test_points)
    print("透视变换法距离矩阵:")
    print(distances_perspective)
    
    # 使用深度传感器法（模拟）
    calculator.set_active_method("depth")
    # 这里需要实际的深度传感器数据
    
    return calculator

if __name__ == "__main__":
    demo_integrated_system()

安装依赖

要运行这些代码，需要安装以下依赖：

bash

复制

下载

pip install opencv-python
pip install numpy
pip install matplotlib
# 对于RealSense深度传感器
pip install pyrealsense2