Choosing a Faster JSON Library for Python

Article link: https://blog.youkuaiyun.com/liluo0815481/article/details/146230054

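Security/crash resistance: Log messages can contain data from untrusted sources. If the JSON encoder crashes on bad data, that is bad for both reliability and security.
Custom encoding: Eliot supports customizing JSON encoding, so you can serialize additional kinds of Python objects. Some JSON libraries support this, others do not.
Cross-platform: It has to run on Linux, macOS, and Windows.
Maintained: I don't want to rely on a library that isn't being actively supported.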
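
To make the first two criteria concrete, here is a minimal sketch using only the standard library's json module and its `default` hook. The names `to_jsonable` and `safe_dumps` are hypothetical helpers invented for illustration, not part of Eliot or of any library discussed here. The candidate libraries that support customization expose a similar hook, but the details differ, so check each library's documentation before relying on it.

```python
import json
from datetime import datetime, timezone
from decimal import Decimal


def to_jsonable(obj):
    """Hypothetical fallback hook: convert types the encoder does not know."""
    if isinstance(obj, datetime):
        return obj.isoformat()
    if isinstance(obj, Decimal):
        return str(obj)
    if isinstance(obj, (set, frozenset)):
        return sorted(obj, key=repr)
    # Last resort: an odd value in a log message should not raise.
    return repr(obj)


def safe_dumps(message):
    """Encode a log message, degrading to repr() rather than raising."""
    try:
        return json.dumps(message, default=to_jsonable)
    except (TypeError, ValueError):
        # For example circular references, or dict keys the encoder rejects.
        return json.dumps({"unserializable": repr(message)})


if __name__ == "__main__":
    print(safe_dumps({"when": datetime.now(timezone.utc), "price": Decimal("1.5")}))
```

The design choice is that a logging path should lose detail rather than lose the message: anything the encoder cannot handle ends up as a string instead of an exception.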

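The libraries I considered were orjson, rapidjson, ujson, and hyperjson.

Based on the criteria above, I filtered out some of them:

When I first wrote this article, ujson had a number of bug reports about crashes and had not had a release since 2016. It appears to be maintained again now, but I have not re-evaluated it.
hyperjson only had packages for macOS and in general seemed quite immature. Its maintainers now simply recommend using orjson instead.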

Step 4: Benchmarking

The two final contenders were rapidjson and orjson. I ran the following benchmark:

import time
import json
import orjson
import rapidjson

m = {
    "timestamp": 1556283673.1523004,
    "task_uuid": "0ed1a1c3-050c-4fb9-9426-a7e72d0acfc7",
    "task_level": [1, 2, 1],
    "action_status": "started",
    "action_type": "main",
    "key": "value",
    "another_key": 123,
    "and_another": ["a", "b"],
}

def benchmark(name, dumps):
    # Time one million calls to dumps() on the sample message m.
    start = time.time()
    for i in range(1000000):
        dumps(m)
    print(name, time.time() - start)

benchmark("Python", json.dumps)
# orjson only outputs bytes, but we often need Unicode:
benchmark("orjson", lambda s: str(orjson.dumps(s), "utf-8"))
benchmark("rapidjson", rapidjson.dumps)

The results:

$ python jsonperf.py 
Python 4.829106330871582
orjson 1.0466396808624268
rapidjson 2.1441543102264404

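Even with the extra Unicode decoding, orjson is the fastest (for this particular benchmark!): roughly 4.6 times faster than the standard library and about twice as fast as rapidjson.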
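
Since the result holds only for this particular benchmark, it is worth re-running the comparison on your own messages. The following is a sketch of one way to do that with the standard library's timeit module; it is not the script used for the numbers above. The `compare` function and `MESSAGE` constant are hypothetical names. The sketch assumes orjson and rapidjson may or may not be installed, and skips whichever is missing.

```python
import json
import timeit

# Sample message; replace it with data that resembles your real workload.
MESSAGE = {
    "timestamp": 1556283673.1523004,
    "task_uuid": "0ed1a1c3-050c-4fb9-9426-a7e72d0acfc7",
    "task_level": [1, 2, 1],
    "action_status": "started",
    "key": "value",
}


def compare(candidates, message=MESSAGE, number=20_000, repeat=5):
    """Return {name: best seconds per call} for each dumps-like callable."""
    results = {}
    for name, dumps in candidates.items():
        # Sanity check: each candidate must round-trip to the same data.
        assert json.loads(dumps(message)) == message, name
        timings = timeit.repeat(lambda: dumps(message), number=number, repeat=repeat)
        # The minimum is the run least disturbed by other system activity.
        results[name] = min(timings) / number
    return results


candidates = {"json": json.dumps}
try:
    import orjson
    candidates["orjson"] = lambda m: orjson.dumps(m).decode("utf-8")
except ImportError:
    pass
try:
    import rapidjson
    candidates["rapidjson"] = rapidjson.dumps
except ImportError:
    pass

if __name__ == "__main__":
    for name, seconds in sorted(compare(candidates).items(), key=lambda kv: kv[1]):
        print(f"{name}: {seconds * 1e6:.2f} microseconds per call")
```

Taking the minimum over several repeats, rather than timing a single loop, reduces noise from whatever else the machine is doing at the time.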

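Of course, there are always tradeoffs. orjson has fewer users than rapidjson (compare the PyPI statistics for orjson with those for rapidjson), and there were no Conda packages, so I had to package it for Conda-forge myself. But it is definitely much faster.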
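
One way to soften the packaging tradeoff is to treat the faster library as optional. The sketch below assumes you are willing to fall back to the standard library when orjson is not installed; `dumps_text` is a hypothetical helper name, not something the original benchmark used.

```python
import json

try:
    import orjson  # optional third-party dependency
except ImportError:
    orjson = None


def dumps_text(obj):
    """Serialize obj to a str, using orjson when it is installed."""
    if orjson is not None:
        # orjson.dumps returns bytes, so decode to get text.
        return orjson.dumps(obj).decode("utf-8")
    return json.dumps(obj)


if __name__ == "__main__":
    print(dumps_text({"key": "value", "and_another": ["a", "b"]}))
```

The two paths do not produce byte-identical text: orjson emits compact output, while json.dumps inserts spaces after separators by default. Compare parsed values rather than raw strings if you test this.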
