Camera orientation

This article describes how to use the setPreviewSize method to set the dimensions of the camera preview, and stresses that the preview must be stopped before its size is changed. It also explains how to choose an appropriate preview size for different display orientations.
public void setPreviewSize (int width, int height)
Added in API level 1

Sets the dimensions for preview pictures. If the preview has already started, applications should stop the preview first before changing preview size. The sides of width and height are based on camera orientation. That is, the preview size is the size before it is rotated by display orientation. So applications need to consider the display orientation while setting preview size. For example, suppose the camera supports both 480x320 and 320x480 preview sizes. The application wants a 3:2 preview ratio. If the display orientation is set to 0 or 180, preview size should be set to 480x320. If the display orientation is set to 90 or 270, preview size should be set to 320x480. The display orientation should also be considered while setting picture size and thumbnail size.

```python
import airsim
import numpy as np
import open3d as o3d
import networkx as nx
import matplotlib.pyplot as plt
from scipy.spatial.transform import Rotation as R


def find_closest_node(graph, point, thresh=5, return_dist=False):
    min_distance = float('inf')
    closest_node = None
    point = np.array(point)
    for node, data in graph.nodes(data=True):
        n_pos = np.array(data['pos'])
        distance = np.linalg.norm(point - n_pos)
        if distance < min_distance:
            min_distance = distance
            closest_node = node
    # print(f"min_distance: {min_distance}")
    if min_distance < thresh:
        if return_dist:
            return closest_node, min_distance
        else:
            return closest_node
    else:
        if return_dist:
            return None, -1
        else:
            return None


def compute_shortest_path(G, A, B, weight='None'):
    node_A, _ = find_closest_node(G, A, return_dist=True)
    node_B, _ = find_closest_node(G, B, return_dist=True)
    if weight != 'None':
        node_path = nx.shortest_path(G, source=node_A, target=node_B, weight=weight)
    else:
        node_path = nx.shortest_path(G, source=node_A, target=node_B)
    path = [G.nodes[node].get('pos', None) + G.nodes[node].get('ori', None) for node in node_path]
    return path, node_path


def visualize_point_cloud(point_cloud):
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(point_cloud)
    o3d.visualization.draw_geometries([pcd])


def statistical_filter(point_cloud, k=30, std_dev_multiplier=1.0):
    from sklearn.neighbors import NearestNeighbors
    nbrs = NearestNeighbors(n_neighbors=k).fit(point_cloud)
    distances, _ = nbrs.kneighbors(point_cloud)
    mean_distances = np.mean(distances, axis=1)
    global_mean = np.mean(mean_distances)
    global_std = np.std(mean_distances)
    filter_idx = np.where(mean_distances < global_mean + std_dev_multiplier * global_std)[0]
    filtered_points = point_cloud[filter_idx]
    return filtered_points, filter_idx


def get_IntrinsicMatrix(fov, width, height):
    intrinsic_mat = np.zeros([3, 3])
    intrinsic_mat[0, 0] = width / (2 * np.tan(np.deg2rad(fov / 2)))
    intrinsic_mat[1, 1] = height / (2 * np.tan(np.deg2rad(fov / 2)))
    intrinsic_mat[0, 2] = width / 2
    intrinsic_mat[1, 2] = height / 2
    intrinsic_mat[2, 2] = 1
    return intrinsic_mat


def get_ExtrinsicMatric(camera_pose):
    '''
    :param camera_pose: [x, y, z, rx, ry, rz, rw]
    :return:
    '''
    pos = camera_pose[:3]
    rot = camera_pose[3:]
    r1 = R.from_quat(rot).as_matrix()
    r2 = R.from_euler('x', 180, degrees=True).as_matrix()
    rotation_matrix = r1.dot(r2)
    # Create translation vector
    translation_vector = pos
    # Create extrinsic matrix
    extrinsic_matrix = np.eye(4)
    extrinsic_matrix[:3, :3] = rotation_matrix
    extrinsic_matrix[:3, 3] = translation_vector
    return extrinsic_matrix


def update_camera_pose(cur_pose, delta_yaw):
    '''
    :param cur_pose: [x, y, z, rx, ry, rz, rw]
    :param delta_yaw: yaw in radians
    :return: [x, y, z, rx, ry, rz, rw]
    '''
    cur_pos = cur_pose[:3]
    cur_rot = cur_pose[3:]
    cur_rot_euler = R.from_quat(cur_rot).as_euler('xyz')
    cur_rot_euler[2] += delta_yaw
    new_rot = np.array(list(airsim.to_quaternion(*cur_rot_euler)))
    # print(R.from_quat(new_rot).as_euler('xyz'))
    new_pose = np.concatenate([cur_pos, new_rot], axis=0)
    return new_pose


def build_semantic_point_cloud(depth_img, intrinsic_mat, bboxes, labels, BEV=False):
    height, width = depth_img.shape[:2]
    label_map = np.zeros_like(depth_img, dtype=np.int_)
    depth_map = np.zeros_like(depth_img, dtype=np.float16)
    depth_map.fill(np.inf)
    avg_depth_per_bbox = []

    def _pix_in_bbox(pix, bbox):
        px, py = pix
        x_min, y_min, x_max, y_max = bbox
        if x_min < px < x_max and y_min < py < y_max:
            return True
        else:
            return False

    # get average depth for each region
    for bbox, label in zip(bboxes, labels):
        x_min, y_min, x_max, y_max = list(map(int, bbox))
        depth_region = depth_img[x_min:x_max + 1, y_min:y_max + 1]
        avg_depth = depth_region.mean()
        avg_depth_per_bbox.append(avg_depth)

    # obtain label map for each pixel
    for x in range(height):
        for y in range(width):
            for i, (bbox, label) in enumerate(zip(bboxes, labels)):
                if _pix_in_bbox((x, y), bbox):
                    if avg_depth_per_bbox[i] < depth_map[x, y]:
                        depth_map[x, y] = avg_depth_per_bbox[i]
                        label_map[x, y] = label

    # create point cloud
    u, v = np.meshgrid(np.arange(width), np.arange(height))
    # Normalize the pixel coordinates
    normalized_x = (u - intrinsic_mat[0, 2]) / intrinsic_mat[0, 0]
    normalized_y = (v - intrinsic_mat[1, 2]) / intrinsic_mat[1, 1]
    # Reproject to 3D space by multiplying by the depth value
    x = normalized_x * depth_img
    y = normalized_y * depth_img
    z = depth_img
    # Stack the coordinates to get the point cloud
    point_cloud = np.stack((z, -x, -y), axis=-1)
    # filter out-range points
    point_cloud[point_cloud[:, :, 2] >= 50] = 0
    point_cloud[point_cloud[:, :, 2] < 0] = 0
    # point_cloud[point_cloud[:, :, 0] > 0] = 0
    # point_cloud[point_cloud[:, :, 1] > 0] = 0
    filtered_point_cloud = point_cloud[point_cloud.any(axis=-1)]
    filtered_label_map = label_map[point_cloud.any(axis=-1)]
    return filtered_point_cloud, filtered_label_map


def merge_point_cloud(
        base_cloud: np.ndarray,
        base_label_map: np.ndarray,
        aux_cloud: np.ndarray,
        aux_label_map: np.ndarray,
        relative_pose: np.ndarray
) -> (np.ndarray, np.ndarray):
    ext_mat = get_ExtrinsicMatric(relative_pose)
    converted_aux_cloud = aux_cloud.dot(ext_mat)
    merged_point_cloud = np.concatenate((base_cloud, converted_aux_cloud), axis=0)
    merged_label_map = np.concatenate((base_label_map, aux_label_map), axis=0)
    return merged_point_cloud, merged_label_map


def visualize_semantic_point_cloud(point_cloud_flat, label_map_flat):
    labels = np.unique(label_map_flat)
    label_num = len(labels)
    color_map = {}
    for i in range(len(labels)):
        color_map[labels[i]] = i
    # visualize point cloud
    colors = plt.cm.get_cmap('hsv', label_num)
    class_colors = [colors(i) for i in range(label_num)]
    class_colors_rgb = [(color[0], color[1], color[2]) for color in class_colors]
    # class_colors_rgb = np.array([[0, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0]])
    class_colors = np.random.rand(len(labels), 3)
    label_colors = np.array([class_colors[color_map[label_map_flat[i]]] for i in range(len(label_map_flat))])
    # print(label_colors)
    point_cloud = o3d.geometry.PointCloud()
    point_cloud.points = o3d.utility.Vector3dVector(point_cloud_flat)
    # point_cloud = point_cloud.voxel_down_sample(0.4)
    point_cloud.colors = o3d.utility.Vector3dVector(label_colors)  # set point cloud colors
    # point_cloud.paint_uniform_color([0, 0, 0.8])
    o3d.visualization.draw_geometries([point_cloud])


def build_semantic_map(depth_img, fov, camera_pose, boxes, phrases, visualize=False):
    height, width = depth_img.shape[:2]
    depth_img_unorm = depth_img * 100
    extrinsic_mat = get_ExtrinsicMatric(camera_pose)
    # print(extrinsic_mat)
    intrinsic_mat = get_IntrinsicMatrix(fov, width, height)
    assert len(boxes) == len(phrases)
    boxes[:, [0, 1, 2, 3]] = boxes[:, [1, 0, 3, 2]]
    class_dict = {"None": 0}
    labels = []
    for i in range(len(phrases)):
        if phrases[i] not in class_dict:
            label_idx = len(class_dict)
            class_dict[phrases[i]] = label_idx
            labels.append(label_idx)
        else:
            label_idx = class_dict[phrases[i]]
            labels.append(label_idx)
    label_num = len(class_dict)
    point_cloud, label_map = build_semantic_point_cloud(depth_img_unorm, intrinsic_mat, boxes, labels)
    point_cloud_flat = point_cloud.reshape(-1, 3)
    point_cloud_flat_homo = np.hstack((point_cloud_flat,
                                       np.ones((len(point_cloud_flat), 1))))
    point_cloud_flat_homo = point_cloud_flat_homo.dot(extrinsic_mat.T)
    point_cloud_flat = point_cloud_flat_homo[:, :3]
    label_map_flat = label_map.reshape(-1)
    if visualize:
        # visualize_semantic_point_cloud(point_cloud_flat, label_map_flat)
        visualize_point_cloud(point_cloud_flat)
    return point_cloud_flat, label_map_flat, class_dict


def build_local_point_cloud(depth_img, intrinsic_mat):
    '''
    convert depth image to point cloud in ego-centric airsim coordinate system
    Args:
        depth_image: in [height, width, 1] format
    Returns:
        point_clouds in [height, width, 3] format
    '''
    px_height, px_width = depth_img.shape[:2]
    k = intrinsic_mat
    fx = k[0, 0]
    fy = k[1, 1]
    cx = k[0, 2]
    cy = k[1, 2]
    x = np.linspace(0, px_width - 1, px_width)
    y = np.linspace(0, px_height - 1, px_height)
    xv, yv = np.meshgrid(x, y)
    Z = depth_img
    X = (xv - cx) * Z / fx
    Y = (yv - cy) * Z / fy
    # convert to ego-centric airsim coordinate system
    # point_clouds = np.stack([X, Y, Z], axis=-1)
    point_clouds = np.stack((Z, -X, -Y), axis=-1)
    return point_clouds


def build_global_point_cloud(local_pc, camera_pose):
    '''
    Args:
        camera_pose: [x, y, z, rx, ry, rz, rw]
        robot_pos: camera position in world coordinate system formatted as [X, Y, Z]
        robot_ori: camera orientation in world coordinate system formatted as [x, y, z, w]
        pc: point cloud array in camera coordinate system formatted as [height, width, 3]
    Returns:
        pc2w: point cloud array in world coordinate system formatted as [height, width, 3]
    '''
    h, w = local_pc.shape[:2]
    robot_pos = camera_pose[:3]
    robot_ori = camera_pose[3:]
    pc = local_pc.reshape(-1, 3)
    pc_homo = np.hstack((pc, np.ones((h * w, 1))))
    r1 = R.from_quat(robot_ori).as_matrix()  # extrinsic rotation in world frame coordinate system
    r2 = R.from_euler('x', 180, degrees=True).as_matrix()  # align body frame with world frame coordinate system
    # r2 = np.eye(3)
    robot_rot = r1.dot(r2)
    trans_mat = np.eye(4)
    trans_mat[:3, :3] = robot_rot
    trans_mat[:3, 3] = robot_pos
    pc2w = pc_homo.dot(trans_mat.T)  # [N, 4]
    pc2w = pc2w.reshape(h, w, 4)  # [H, W, 4]
    return pc2w[:, :, :3]  # [H, W, 3]


def convert_global_pc(depth, fov, camera_pose, mask=None):
    depth = depth.squeeze()
    depth *= 100
    height, width = depth.shape[:2]
    K = get_IntrinsicMatrix(fov=fov, height=height, width=width)
    local_pc = build_local_point_cloud(depth, K)
    global_pc = build_global_point_cloud(local_pc, camera_pose)
    if mask is not None:
        filter_idx = np.where(
            (depth > 0) & (depth < 50) & mask
        )
    else:
        filter_idx = np.where(
            (depth > 0) & (depth < 50)
        )
    return global_pc, filter_idx


def build_global_map(depth, fov, camera_pose):
    '''
    :param depth: a list of depth image
    :param fov:
    :param camera_pose: a list of camera pose corresponding to depth image
    :param mask: a list of mask
    :param phrases: a list of phrases corresponding to mask
    :return:
    '''
    all_pc = []
    all_filter_idx = []
    height, width = depth[0].shape[:2]
    K = get_IntrinsicMatrix(fov=fov, height=height, width=width)
    for i in range(len(depth)):
        d = depth[i]
        d *= 100
        d = d.squeeze()
        pose = camera_pose[i]
        filter_idx = np.where(
            (d > 0) & (d < 50)
        )
        local_pc = build_local_point_cloud(d, K)
        global_pc = build_global_point_cloud(local_pc, pose)
        all_pc.append(global_pc)
        all_filter_idx.append(filter_idx)


def visualize_nx_graph(nx_graph, node_color='black', show_label=False):
    positions_3d = nx.get_node_attributes(nx_graph, 'pos')
    positions_2d = {node: (x, y) for node, (x, y, z) in positions_3d.items()}
    plt.figure(figsize=(12, 9))
    nx.draw(nx_graph, positions_2d, with_labels=show_label, node_size=30, node_color=node_color,
            font_size=10)
    plt.show()
```

Please explain this code in detail.
The code above implements a preliminary framework for semantic map construction and graph-based path planning on top of the AirSim simulator. It relies on several Python libraries:

- **AirSim**: the drone/vehicle simulation environment, providing sensor data such as depth images and camera poses.
- **NumPy**: numerical computation.
- **Open3D**: point cloud visualization.
- **NetworkX**: graph construction and path planning.
- **Matplotlib**: graph visualization.
- **SciPy**: rotation handling (e.g. conversion between quaternions and Euler angles).

### Main functions

#### 1. **find_closest_node**
```python
def find_closest_node(graph, point, thresh=5, return_dist=False):
```
Finds the graph node closest to a given point. If the closest node lies within the threshold `thresh`, it is returned; otherwise `None` is returned.

#### 2. **compute_shortest_path**
```python
def compute_shortest_path(G, A, B, weight='None'):
```
Uses NetworkX's `shortest_path` to compute the shortest path in graph `G` from point `A` to point `B`. The path can optionally be computed with an edge weight (e.g. distance).

#### 3. **visualize_point_cloud**
```python
def visualize_point_cloud(point_cloud):
```
Visualizes a point cloud with Open3D.

#### 4. **statistical_filter**
```python
def statistical_filter(point_cloud, k=30, std_dev_multiplier=1.0):
```
Removes outliers from the point cloud statistically: the mean k-nearest-neighbor distance is computed for each point, and points whose mean distance is too far from the global mean (relative to the standard deviation) are discarded.

#### 5. **get_IntrinsicMatrix**
```python
def get_IntrinsicMatrix(fov, width, height):
```
Computes the camera **intrinsic matrix** from the field of view (FOV) and the image width and height.

#### 6. **get_ExtrinsicMatric**
```python
def get_ExtrinsicMatric(camera_pose):
```
Computes the **extrinsic matrix** from the camera pose (position plus quaternion rotation), used to transform point clouds from the camera frame to the world frame.

#### 7. **update_camera_pose**
```python
def update_camera_pose(cur_pose, delta_yaw):
```
Updates the camera pose given the current pose and a yaw increment (`delta_yaw`).

#### 8. **build_semantic_point_cloud**
```python
def build_semantic_point_cloud(depth_img, intrinsic_mat, bboxes, labels, BEV=False):
```
Builds a **semantic point cloud** from a depth image, the camera intrinsic matrix, bounding boxes, and labels. Every point carries a semantic label (e.g. "car", "person").

#### 9. **merge_point_cloud**
```python
def merge_point_cloud(base_cloud, base_label_map, aux_cloud, aux_label_map, relative_pose):
```
**Merges** two labeled point clouds according to their relative pose, used to build a global map.

#### 10. **visualize_semantic_point_cloud**
```python
def visualize_semantic_point_cloud(point_cloud_flat, label_map_flat):
```
Visualizes a semantically labeled point cloud, giving each label a different color.

#### 11. **build_semantic_map**
```python
def build_semantic_map(depth_img, fov, camera_pose, boxes, phrases, visualize=False):
```
Builds a global map with semantic information. Inputs are the depth image, FOV, camera pose, bounding boxes, and semantic phrases.

#### 12. **build_local_point_cloud**
```python
def build_local_point_cloud(depth_img, intrinsic_mat):
```
Converts a depth image into a **local point cloud**, i.e. a point cloud in the camera coordinate frame.

#### 13. **build_global_point_cloud**
```python
def build_global_point_cloud(local_pc, camera_pose):
```
Transforms a local point cloud into a **global point cloud**, i.e. a point cloud in the world coordinate frame.

#### 14. **convert_global_pc**
```python
def convert_global_pc(depth, fov, camera_pose, mask=None):
```
Converts a depth image and camera pose into a global point cloud and filters it by depth value.

#### 15. **build_global_map**
```python
def build_global_map(depth, fov, camera_pose):
```
Builds a global point cloud map from multiple depth images and camera poses.

#### 16. **visualize_nx_graph**
```python
def visualize_nx_graph(nx_graph, node_color='black', show_label=False):
```
Visualizes the **graph structure** (nodes and edges) with Matplotlib.

---

### Overall pipeline

1. **Acquire depth images and camera poses** (from AirSim).
2. **Compute the camera intrinsic and extrinsic matrices.**
3. **Convert each depth image into a local point cloud.**
4. **Transform the local point cloud into a global point cloud.**
5. **Build a semantic point cloud** in which every point carries a label.
6. **Merge multiple point clouds** into a global map.
7. **Visualize the point clouds and the graph structure.**
8. **Run graph algorithms** (e.g. shortest path) for path planning.

---

### Application scenarios

- **Autonomous driving**: building semantically annotated environment maps for path planning.
- **Drone navigation**: semantic perception and obstacle avoidance in complex environments.
- **Robot path planning**: efficient planning by combining semantic maps with graph search.
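As a concrete illustration of the mapping half of this pipeline, below is a minimal usage sketch that drives `build_semantic_map` from the code above with synthetic inputs. The depth image, FOV, camera pose, bounding boxes, and phrases are all placeholder values invented for this example, and the functions are assumed to be in scope (pasted or imported from the snippet in the question); this is not necessarily how the original author runs the code.

```python
# Minimal usage sketch; all inputs are synthetic placeholders.
import numpy as np

# Fake normalized depth image (H x W); build_semantic_map scales it by 100 internally.
depth_img = np.random.uniform(0.05, 0.4, size=(120, 160)).astype(np.float32)

# Hypothetical camera: 90-degree FOV, pose in the code's [x, y, z, rx, ry, rz, rw] layout.
fov = 90
camera_pose = np.array([0.0, 0.0, -5.0, 0.0, 0.0, 0.0, 1.0])

# Two made-up detections as [x_min, y_min, x_max, y_max] pixel boxes with text phrases.
boxes = np.array([[40.0, 30.0, 80.0, 70.0],
                  [10.0, 90.0, 60.0, 115.0]])
phrases = ["car", "tree"]

# Back-project the depth image into a labeled, world-frame point cloud.
points, labels, class_dict = build_semantic_map(depth_img, fov, camera_pose,
                                                boxes, phrases, visualize=False)
print("points:", points.shape, "labels:", labels.shape, "classes:", class_dict)

# Optionally render the labeled cloud (opens an Open3D window).
# visualize_semantic_point_cloud(points, labels)
```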
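For the path-planning half, the sketch below hand-builds a tiny NetworkX graph whose nodes carry the `pos` and `ori` attributes that `find_closest_node`, `compute_shortest_path`, and `visualize_nx_graph` expect. The node positions, orientations, edge list, and query points are invented purely for illustration.

```python
# Toy waypoint graph for path planning (layout and weights are invented).
import networkx as nx
import numpy as np

G = nx.Graph()
# Each node stores a 3D position and a quaternion orientation, matching the
# attribute names ('pos', 'ori') used by the functions above.
waypoints = {
    0: [0.0, 0.0, -2.0],
    1: [5.0, 0.0, -2.0],
    2: [5.0, 5.0, -2.0],
    3: [0.0, 5.0, -2.0],
}
for node_id, pos in waypoints.items():
    G.add_node(node_id, pos=pos, ori=[0.0, 0.0, 0.0, 1.0])

# Connect neighboring waypoints and use Euclidean distance as the edge weight.
for a, b in [(0, 1), (1, 2), (2, 3), (0, 3)]:
    dist = float(np.linalg.norm(np.array(waypoints[a]) - np.array(waypoints[b])))
    G.add_edge(a, b, weight=dist)

# Query a path between two arbitrary world points; find_closest_node snaps each
# query point to the nearest graph node within its distance threshold.
start = [0.5, 0.2, -2.0]
goal = [4.8, 5.1, -2.0]
path, node_path = compute_shortest_path(G, start, goal, weight='weight')
print("node path:", node_path)
print("waypoint poses:", path)

# 2D visualization of the graph (drops the z coordinate, opens a Matplotlib window).
# visualize_nx_graph(G, show_label=True)
```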