Python_Json

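This article introduces two commonly used Python modules for working with JSON data: json and ijson. The json module suits small data sets and converts between JSON and Python objects with loads() and dumps(); the ijson module parses iteratively, which suits large JSON files. The article also gives example code that parses a JSON file with the json module, together with its output.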

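Source: https://www.sharetechnote.com/html/Python_Json.html

JSON, short for JavaScript Object Notation, is something like a universal data language for the web. It is widely used because it is easy for both humans and machines to read, write, and understand.

Two modules (packages) are commonly used for JSON parsing: json and ijson. The json module ships with Python's standard library; ijson is a third-party package that has to be installed separately.

Python's json module is the go-to tool for handling JSON data. If your JSON is a string, json.loads() converts it into a Python object you can work with, such as a dictionary or a list. To convert a Python object back into a JSON string, use json.dumps().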
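
As a minimal sketch of the round trip just described (the sample string and variable names are made up for illustration and are not part of the original article), json.loads() turns JSON text into Python objects and json.dumps() turns them back:

```python
import json

# Hypothetical sample data, not taken from the article's test file.
json_text = '{"name": "John", "skills": ["Programming", "Design"], "active": true, "manager": null}'

# JSON text -> Python object. A JSON object becomes a dict, an array becomes a list,
# true/false become True/False, and null becomes None.
data = json.loads(json_text)
print(type(data).__name__)
print(data["skills"][0])
print(data["active"], data["manager"])

# Python object -> JSON text.
data["skills"].append("Testing")
round_trip = json.dumps(data)
print(round_trip)
```

Parsing round_trip again with json.loads() gives back an object equal to data, which is a quick way to check that nothing was lost along the way.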

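But what happens when your JSON data is bigger than the Titanic? That is where ijson comes in.

ijson is the marathon runner of JSON modules. Instead of trying to process the whole document in one go, it works through it step by step. This approach is called iterative parsing, and it works well for large JSON files because you never need to load all the data into memory at once.

The ijson.items() function is the workhorse. Give it a file-like object and a prefix, and it returns an iterator that yields, one at a time, the JSON objects matching that prefix. The prefix is a string of dot-separated property names that gives the path to the data you want; elements of an array are addressed with the literal name item, as in employees.item.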
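
A minimal sketch of ijson.items(), assuming the third-party ijson package is installed (pip install ijson). The byte string, the in-memory stream, and the helper name iter_first_names are illustrative stand-ins that do not come from the original article; with a real file you would pass a file opened in binary mode instead.

```python
import io

try:
    import ijson  # third-party; assumed installed with: pip install ijson
except ImportError:
    ijson = None

# Hypothetical in-memory document standing in for a large file on disk.
raw = b'{"employees": [{"id": "001", "firstName": "John"}, {"id": "002", "firstName": "Jane"}]}'


def iter_first_names(stream):
    """Yield the firstName of each employee, one employee at a time."""
    # 'employees.item' selects every element of the top-level "employees" array.
    for employee in ijson.items(stream, "employees.item"):
        yield employee["firstName"]


if ijson is not None:
    first_names = list(iter_first_names(io.BytesIO(raw)))
    print(first_names)
else:
    first_names = None
    print("ijson is not installed; skipping the demo")
```

Because iter_first_names() is a generator, only one employee object is held in memory at a time; list() is used here only to make the small demo easy to print.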

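If you need more control, ijson.parse() returns an iterator over the parsing events of the JSON document. Each event is a (prefix, event, value) tuple; you will see events such as 'start_map', 'end_map', 'start_array', and 'end_array', along with others, such as 'map_key', 'string', and 'number', that carry the associated data.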
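
The event stream can be sketched as follows, again assuming ijson is installed. The sample bytes and the helper name collect_ids are hypothetical and only serve to show how the prefix and event name identify where a value sits in the document.

```python
import io

try:
    import ijson  # third-party; assumed installed with: pip install ijson
except ImportError:
    ijson = None

# Hypothetical small document; a real use case would pass an open binary file instead.
raw = b'{"employees": [{"id": "001"}, {"id": "002"}]}'


def collect_ids(stream):
    """Collect every employees[*].id value by watching the event stream."""
    ids = []
    for prefix, event, value in ijson.parse(stream):
        # Scalar values arrive as events such as 'string' or 'number',
        # tagged with the dotted prefix of their position in the document.
        if prefix == "employees.item.id" and event == "string":
            ids.append(value)
    return ids


if ijson is not None:
    events = [event for _prefix, event, _value in ijson.parse(io.BytesIO(raw))]
    print(events[0], events[-1])
    ids = collect_ids(io.BytesIO(raw))
    print(ids)
else:
    events, ids = None, None
    print("ijson is not installed; skipping the demo")
```

Working at the event level is more verbose than ijson.items(), but it lets you react to individual keys and values without building any intermediate objects.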

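So, json or ijson? It depends on the size of your data. If you have a small JSON data set that will not strain your memory, json is the clean and simple choice. But if your data is more of a behemoth, or it arrives as a stream and you cannot wait for all of it, ijson is the better fit.

JsonTest_01.txt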

{
  "employees": [
    {
      "id": "001",
      "firstName": "John",
      "lastName": "Doe",
      "email": "john.doe@example.com",
      "phoneNumbers": [
        {
          "type": "home",
          "number": "123-456-7890"
        },
        {
          "type": "work",
          "number": "555-555-1212"
        }
      ],
      "address": {
        "street": "123 Main Street",
        "city": "Anytown",
        "state": "CA",
        "zip": "12345",
        "country": "USA"
      },
      "skills": [
        {
          "name": "Programming",
          "level": "Expert"
        },
        {
          "name": "Project Management",
          "level": "Intermediate"
        }
      ]
    },
    {
      "id": "002",
      "firstName": "Jane",
      "lastName": "Smith",
      "email": "jane.smith@example.com",
      "phoneNumbers": [
        {
          "type": "home",
          "number": "987-654-3210"
        },
        {
          "type": "work",
          "number": "555-555-1212"
        }
      ],
      "address": {
        "street": "456 High Street",
        "city": "Anytown",
        "state": "CA",
        "zip": "12345",
        "country": "USA"
      },
      "skills": [
        {
          "name": "Design",
          "level": "Expert"
        },
        {
          "name": "Copywriting",
          "level": "Intermediate"
        }
      ]
    }
  ]
}

JSON Parsing with json module

This is example Python code that parses a JSON file, prints it, and extracts specific items from the resulting object. This example uses only the json module.

Json_Test_01.py

import json


# Read a JSON file and return its whole contents as a Python object.
#
# The with statement handles opening and closing the file. This is good practice because it makes sure the file is
# closed properly even if something goes wrong while reading it.
#
# json.load() is then called with the file object as its argument. It reads the whole JSON file and turns it into
# a Python object:
#   - a JSON object ({...}) becomes a dictionary
#   - a JSON array ([...]) becomes a list
#   - a single JSON value (like "hello" or 42) becomes a string, an integer, or whatever the matching Python
#     type is
#
# For the test file shown in the previous section, the outermost element is a JSON object, so json.load() returns
# a dictionary. Breaking it down further:
#   - The dictionary has a single key, "employees". Its value is a JSON array, which becomes a Python list.
#   - Each element of that list is a JSON object describing one employee, which becomes a dictionary.
#   - Each of these dictionaries has several keys ("id", "firstName", "lastName", and so on). Their values are
#     strings, JSON arrays ("phoneNumbers" and "skills"), or JSON objects ("address").
#   - The inner arrays and objects likewise become Python lists and dictionaries.
def read_json_file(file_name):
    # JSON text is normally UTF-8, so the encoding is stated explicitly rather than left to the platform default.
    with open(file_name, 'r', encoding='utf-8') as file:
        return json.load(file)
 

# Print the contents of a JSON structure in a readable, nicely formatted way.
#
# json_content is the Python object to print. It must be something json can serialize: a dictionary, list, string,
# number, True/False, None, or a nesting of these.
#
# json.dumps() ("dump string") converts a Python object into a JSON string. Its first argument is the object to
# convert, here the json_content passed to json_print(). The optional indent=4 argument makes dumps()
# pretty-print: every object member and array element goes on its own line, indented by 4 spaces per nesting
# level, which makes nested structures much easier to read. The resulting string is then printed to the console.
def json_print(json_content):
    print(json.dumps(json_content, indent=4))
 

# Look up a specific item in a JSON structure and pretty-print it.
#
# It takes two arguments:
#   - json_content is the Python object representing the JSON structure to work with (typically the dictionary or
#     list returned by json.load()).
#   - json_item_path is a string giving the path to the item of interest. It uses dot notation for the hierarchy
#     of keys; for a structure describing a person, "address.city" is the city part of the address.
#
# How it works:
#   1. json_item_path.split('.') splits the path into its parts, e.g. "address.city" becomes ['address', 'city'].
#   2. item starts out pointing at the whole JSON content.
#   3. For each part of the path:
#        - if the part is all digits (part.isdigit()), it is taken to be a list index, converted with int(part),
#          and used to index into item;
#        - otherwise it is taken to be a dictionary key and used directly.
#      Either way, item moves one level deeper. Note that a dictionary key made up only of digits would be
#      misread as a list index.
#   4. After the last part, item is the requested value, and it is printed together with its path.
#
# A path that does not match the structure typically raises KeyError, IndexError, or TypeError.
def json_print_item(json_content, json_item_path):
    item_path_parts = json_item_path.split('.')
    item = json_content
    for part in item_path_parts:
        if part.isdigit():
            item = item[int(part)]
        else:
            item = item[part]
    # An f-string gives exactly one space on each side of '=', matching the output shown under Result.
    print(f"{json_item_path} = {json.dumps(item, indent=4)}")
 

# This does exactly the same thing as json_print_item(), but recursively instead of with a for loop. It is not
# explained in detail here; skip it unless you are really interested. full_path accumulates the path parts
# already consumed so that the complete path can be printed at the end.
def json_print_item_recursive(json_content, json_item_path, full_path=''):
    item_path_parts = json_item_path.split('.')
    part = item_path_parts[0]

    new_full_path = full_path + ('.' + part if full_path else part)

    if len(item_path_parts) == 1:
        # Last part of the path: look up the value and print it.
        if part.isdigit():
            item = json_content[int(part)]
        else:
            item = json_content[part]
        print(f"{new_full_path} = {json.dumps(item, indent=4)}")
    else:
        # More parts remain: descend one level and recurse on the rest of the path.
        new_path = '.'.join(item_path_parts[1:])
        if part.isdigit():
            json_print_item_recursive(json_content[int(part)], new_path, new_full_path)
        else:
            json_print_item_recursive(json_content[part], new_path, new_full_path)
 

# This is the test code for the functions defined above

json_content = read_json_file('JsonTest_01.txt')

json_print(json_content)
json_print_item(json_content, 'employees.1.firstName')
json_print_item(json_content, 'employees.0.phoneNumbers.0.type')
json_print_item(json_content, 'employees.0.phoneNumbers.0.number')
json_print_item_recursive(json_content, 'employees.1.firstName')
json_print_item_recursive(json_content, 'employees.0.phoneNumbers.0.type')
json_print_item_recursive(json_content, 'employees.0.phoneNumbers.0.number')
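
The path walk in json_print_item() can also be written to return the value instead of printing it, which makes it easier to reuse and to test. The following is only a sketch under assumptions: get_json_item and its default parameter are hypothetical names that are not part of the original script, the inline data is a cut-down copy of the test file, and, unlike the original, it chooses between index and key by looking at the container type rather than at the path part.

```python
import json

# Cut-down copy of JsonTest_01.txt so the sketch runs without the file.
json_content = json.loads("""
{
  "employees": [
    {"id": "001", "firstName": "John",
     "phoneNumbers": [{"type": "home", "number": "123-456-7890"}]},
    {"id": "002", "firstName": "Jane",
     "phoneNumbers": [{"type": "home", "number": "987-654-3210"}]}
  ]
}
""")


def get_json_item(content, item_path, default=None):
    """Return the value at a dotted path, or default if the path is missing."""
    item = content
    for part in item_path.split('.'):
        try:
            # Index lists by integer position and dictionaries by key.
            item = item[int(part)] if isinstance(item, list) else item[part]
        except (KeyError, IndexError, ValueError, TypeError):
            return default
    return item


print(get_json_item(json_content, 'employees.1.firstName'))
print(get_json_item(json_content, 'employees.0.phoneNumbers.0.number'))
print(get_json_item(json_content, 'employees.5.firstName', default='n/a'))
```

Returning a default for a missing path is a design choice for this sketch; the original functions let the exception propagate, which can be preferable when a missing item indicates a real problem in the data.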

Result

{
    "employees": [
        {
            "id": "001",
            "firstName": "John",
            "lastName": "Doe",
            "email": "john.doe@example.com",
            "phoneNumbers": [
                {
                    "type": "home",
                    "number": "123-456-7890"
                },
                {
                    "type": "work",
                    "number": "555-555-1212"
                }
            ],
            "address": {
                "street": "123 Main Street",
                "city": "Anytown",
                "state": "CA",
                "zip": "12345",
                "country": "USA"
            },
            "skills": [
                {
                    "name": "Programming",
                    "level": "Expert"
                },
                {
                    "name": "Project Management",
                    "level": "Intermediate"
                }
            ]
        },
        {
            "id": "002",
            "firstName": "Jane",
            "lastName": "Smith",
            "email": "jane.smith@example.com",
            "phoneNumbers": [
                {
                    "type": "home",
                    "number": "987-654-3210"
                },
                {
                    "type": "work",
                    "number": "555-555-1212"
                }
            ],
            "address": {
                "street": "456 High Street",
                "city": "Anytown",
                "state": "CA",
                "zip": "12345",
                "country": "USA"
            },
            "skills": [
                {
                    "name": "Design",
                    "level": "Expert"
                },
                {
                    "name": "Copywriting",
                    "level": "Intermediate"
                }
            ]
        }
    ]
}
employees.1.firstName = "Jane"
employees.0.phoneNumbers.0.type = "home"
employees.0.phoneNumbers.0.number = "123-456-7890"
employees.1.firstName = "Jane"
employees.0.phoneNumbers.0.type = "home"
employees.0.phoneNumbers.0.number = "123-456-7890"
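
The nested layout in the result comes from indent=4, and the quotes around "Jane" appear because json.dumps() serializes a Python string as a JSON string. As a small sketch (the sample dictionary is made up for illustration), the same object can be serialized compactly or pretty-printed, and both forms parse back to the same data:

```python
import json

sample = {"type": "home", "number": "123-456-7890"}

compact = json.dumps(sample)           # single line
pretty = json.dumps(sample, indent=4)  # one member per line, 4-space indent
scalar = json.dumps("Jane", indent=4)  # a bare string is simply quoted

print(compact)
print(pretty)
print(scalar)
```

The indented form is meant for people to read; the compact form is the usual choice when the string is stored or sent over a network.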
