Python Implementation: Compressing GeoJSON Polygon Data with the Douglas-Peucker Algorithm

不似少年游'

Published 2024-11-15 15:09:51

Article link: https://blog.youkuaiyun.com/C747217368/article/details/143797078

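1. Introduction

In geographic information system (GIS) and computer vision work, polygons and paths are often described by large point sets. Too many points slow down computation and inflate file size, so the point set is usually simplified to cut redundancy. The **Douglas-Peucker algorithm** is a classic algorithm for this purpose.

This article walks through a Python script that uses the Douglas-Peucker algorithm to compress the shape data stored in JSON files and saves the simplified result.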

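2. Overview of the Code

The code does three things:
1. Compresses a polygon's point set with the Douglas-Peucker algorithm.
2. Walks through every `.json` file in a given directory and compresses the shapes that carry a specified label.
3. Writes the compressed data back, overwriting the original file.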

3. Code Walkthrough

3.1 Implementing the Douglas-Peucker Algorithm


def douglas_peucker(points, epsilon):
    """
    Compress a polygon's point set with the Douglas-Peucker algorithm.
    :param points: input point list [(x1, y1), (x2, y2), ...]
    :param epsilon: distance threshold
    :return: the compressed point list
    """
    if len(points) < 3:
        return points

    start = points[0]
    end = points[-1]
    max_dist = 0
    index = 0

    # Find the point farthest from the line through the start and end points
    for i in range(1, len(points) - 1):
        dist = perpendicular_distance(points[i], start, end)
        if dist > max_dist:
            index = i
            max_dist = dist

    # If the largest distance exceeds the threshold, split there and recurse
    if max_dist > epsilon:
        left_result = douglas_peucker(points[:index + 1], epsilon)
        right_result = douglas_peucker(points[index:], epsilon)
        # The split point ends left_result and starts right_result; keep it once
        return left_result[:-1] + right_result
    else:
        return [start, end]

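How the algorithm works

Input: the polygon's point set points and the distance threshold epsilon.
Process:
- Take the first and last points of points as the endpoints of a baseline.
- Compute the perpendicular distance from every intermediate point to that baseline and record the position index of the farthest point.
- If the largest distance max_dist is greater than epsilon, split the point set at index into two parts and process each part recursively.
- If max_dist does not exceed epsilon, discard all intermediate points and keep only the start and end points.
Output: the compressed point set.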
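
The recursion above goes one level deeper for every split, so a very long or very jagged point list could in principle approach Python's recursion limit. As a hedged alternative, the sketch below expresses the same steps with an explicit stack. The names `point_line_distance` and `douglas_peucker_iterative` and the sample points are illustrative assumptions, not part of the original script.

```python
import math


def point_line_distance(p, a, b):
    """Distance from p to the infinite line through a and b."""
    (x0, y0), (x1, y1), (x2, y2) = p, a, b
    if (x1, y1) == (x2, y2):
        # Degenerate baseline: fall back to the point-to-point distance
        return math.hypot(x0 - x1, y0 - y1)
    num = abs((y2 - y1) * x0 - (x2 - x1) * y0 + x2 * y1 - y2 * x1)
    return num / math.hypot(y2 - y1, x2 - x1)


def douglas_peucker_iterative(points, epsilon):
    """Stack-based Douglas-Peucker; returns the kept points in input order."""
    if len(points) < 3:
        return list(points)
    keep = [False] * len(points)
    keep[0] = keep[-1] = True          # endpoints are always kept
    stack = [(0, len(points) - 1)]     # index ranges still to examine
    while stack:
        first, last = stack.pop()
        max_dist, index = 0.0, first
        for i in range(first + 1, last):
            d = point_line_distance(points[i], points[first], points[last])
            if d > max_dist:
                max_dist, index = d, i
        if max_dist > epsilon:
            # Keep the farthest point and examine both halves later
            keep[index] = True
            stack.append((first, index))
            stack.append((index, last))
    return [p for p, k in zip(points, keep) if k]


# Hypothetical sample polyline with one sharp spike at (3, 4)
line = [(0, 0), (1, 0.1), (2, 0), (3, 4), (4, 0), (5, 0.1), (6, 0)]
print(douglas_peucker_iterative(line, 1.0))
# prints [(0, 0), (2, 0), (3, 4), (4, 0), (6, 0)]
```

With epsilon = 1.0 the small 0.1 bumps are dropped, while the spike and the two points that anchor it survive.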

3.2 Computing the Perpendicular Distance from a Point to the Baseline

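import math


def perpendicular_distance(point, line_start, line_end):
    """Perpendicular distance from point to the line through line_start and line_end."""
    x0, y0 = point
    x1, y1 = line_start
    x2, y2 = line_end

    # Both endpoints coincide: fall back to the plain point-to-point distance
    if (x1 == x2) and (y1 == y2):
        return math.hypot(x0 - x1, y0 - y1)

    # |cross product| of the baseline vector and the point offset, over the baseline length
    num = abs((y2 - y1) * x0 - (x2 - x1) * y0 + x2 * y1 - y2 * x1)
    den = math.hypot(y2 - y1, x2 - x1)
    return num / den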

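About this function

It applies the standard point-to-line distance formula, so the result is the distance to the infinite line through the two endpoints rather than to the bounded segment.
If the segment degenerates to a single point, it returns the Euclidean distance to that point instead.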
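
The line distance and the true segment distance agree whenever the foot of the perpendicular falls between the two endpoints. For comparison only, here is a minimal sketch of the clamped, point-to-segment variant. `point_segment_distance` is a hypothetical helper that the original script does not use.

```python
import math


def point_segment_distance(point, seg_start, seg_end):
    """Distance from point to the bounded segment seg_start-seg_end."""
    x0, y0 = point
    x1, y1 = seg_start
    x2, y2 = seg_end
    dx, dy = x2 - x1, y2 - y1
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        # Degenerate segment: plain point-to-point distance
        return math.hypot(x0 - x1, y0 - y1)
    # Position of the projection along the segment, clamped to [0, 1]
    t = ((x0 - x1) * dx + (y0 - y1) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(x0 - (x1 + t * dx), y0 - (y1 + t * dy))


# The projection of (5, 4) falls beyond the end (2, 0), so the nearest
# point on the segment is that endpoint.
print(point_segment_distance((5, 4), (0, 0), (2, 0)))  # prints 5.0
```

For the same input the line-based formula gives 4.0, because it measures to the extension of the segment along the x-axis.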

3.3 Compressing the Shapes in the JSON Data

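def compress_shapes(data, label="zhuangzi", target_points=6):
    """
    Compress the shapes with the given label in the JSON data.
    """
    # The two endpoints are always kept, so fewer than 2 points is unreachable
    if target_points < 2:
        raise ValueError("target_points must be at least 2")

    for shape in data['shapes']:
        if shape['label'] == label:
            points = shape['points']
            # Start every shape from the same threshold so that one shape's
            # adjustment does not carry over to the next
            epsilon = 5.0
            compressed_points = douglas_peucker(points, epsilon)

            # Raise epsilon until the point count meets the target
            while len(compressed_points) > target_points:
                epsilon += 1.0
                compressed_points = douglas_peucker(points, epsilon)

            shape['points'] = compressed_points
    return data

What it does

The function iterates over the shapes in the JSON data and compresses those whose label matches.
Each shape starts with epsilon = 5.0. If the compressed result still has more than target_points points, epsilon is raised in steps of 1.0 and the shape is compressed again. A large enough epsilon leaves only the two endpoints, so the loop always terminates when target_points is at least 2; smaller values are rejected up front.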
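
To make the expected input concrete, the sketch below builds a small dictionary with only the keys the function reads (`shapes`, `label`, `points`) and counts the points per label. The sample values and the helper `count_points_by_label` are illustrative assumptions; real files may carry additional keys.

```python
# Hypothetical sample; only the keys read by compress_shapes are shown
sample = {
    "shapes": [
        {"label": "zhuangzi", "points": [[0, 0], [1, 0.1], [2, 0], [3, 4]]},
        {"label": "other", "points": [[0, 0], [5, 5]]},
    ]
}


def count_points_by_label(data):
    """Total number of points per label across all shapes."""
    counts = {}
    for shape in data["shapes"]:
        label = shape["label"]
        counts[label] = counts.get(label, 0) + len(shape["points"])
    return counts


print(count_points_by_label(sample))  # prints {'zhuangzi': 4, 'other': 2}
```

Running such a count before and after compression is one simple way to check how many points were removed for the target label.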

3.4 Processing the JSON Files in a Directory

import json
import os


def process_json_files_in_directory(directory_path, label="zhuangzi", target_points=6):
    """
    Walk through all JSON files in the directory and compress the shapes with the given label.
    """
    for filename in os.listdir(directory_path):
        if filename.endswith('.json'):
            file_path = os.path.join(directory_path, filename)
            print(f"Processing file: {file_path}")

            try:
                with open(file_path, 'r', encoding='utf-8') as f:
                    data = json.load(f)

                # Compress the shape data
                compressed_data = compress_shapes(data, label=label, target_points=target_points)

                # Overwrite the original file with the compressed result
                with open(file_path, 'w', encoding='utf-8') as f:
                    json.dump(compressed_data, f, ensure_ascii=False, indent=4)

                print(f"Compression finished: {file_path}")
            except Exception as e:
                print(f"Error while processing file: {file_path} - error: {e}")

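What it does

The function walks through every .json file directly inside the given directory (subdirectories are not visited).
It compresses the data in each file and writes the result back to the same file, so the original points are overwritten.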
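
Because the originals are overwritten, it can be useful to keep them intact while experimenting with thresholds. The following is a minimal sketch, not the author's method, that writes results to a separate directory. `process_to_output_dir` and its `transform` parameter are hypothetical names; `transform` stands for any function that takes the loaded data and returns the data to save.

```python
import json
from pathlib import Path


def process_to_output_dir(src_dir, dst_dir, transform):
    """Apply transform to every .json file in src_dir and write results to dst_dir."""
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    written = []
    for path in sorted(src.glob("*.json")):
        with path.open("r", encoding="utf-8") as f:
            data = json.load(f)
        out_path = dst / path.name
        with out_path.open("w", encoding="utf-8") as f:
            json.dump(transform(data), f, ensure_ascii=False, indent=4)
        written.append(out_path)
    # Paths of the files that were written, in name order
    return written
```

Passing the compression function as `transform` would leave the source directory unchanged and place the compressed copies under `dst_dir`.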

3.5 Main Entry Point

if __name__ == "__main__":
    directory_path = input("Enter the path of the directory to process: ")

    if os.path.isdir(directory_path):
        process_json_files_in_directory(directory_path)
    else:
        print("The directory does not exist. Please check the path.")
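
The entry point above always uses the default label and target point count. If those need to change between runs, one option is to read them from the command line. The sketch below uses the standard `argparse` module; the flag names `--label` and `--target-points` and the sample path are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Command-line options mirroring the function's keyword arguments."""
    parser = argparse.ArgumentParser(description="Compress polygon points in JSON files.")
    parser.add_argument("directory", help="directory containing .json files")
    parser.add_argument("--label", default="zhuangzi", help="shape label to compress")
    parser.add_argument("--target-points", type=int, default=6,
                        help="maximum number of points to keep per shape")
    return parser


# Parse a sample argument list instead of sys.argv to show the result
args = build_parser().parse_args(["./annotations", "--target-points", "8"])
print(args.directory, args.label, args.target_points)
# prints ./annotations zhuangzi 8
```

The parsed values could then be passed on as `label=args.label` and `target_points=args.target_points`.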