Python Programming Tutorial, Part 2: Intermediate

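8. Modules and Packages

Modules and packages are the main way to organize Python code; they make code easier to manage and reuse.

8.1 Modules

A module is a file containing Python definitions and statements. The module name is the file name without the .py suffix. A module can define functions, classes, and variables, and can also contain executable code.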

# Create a file named my_module.py
# my_module.py

def greeting(name):
    return f"Hello, {name}!"

PI = 3.14159

class Calculator:
    def add(self, a, b):
        return a + b

To use this module, import it with the import statement:

# Using the module
import my_module

print(my_module.greeting("Alice"))  # Output: Hello, Alice!
print(my_module.PI)  # Output: 3.14159

calc = my_module.Calculator()
print(calc.add(5, 3))  # Output: 8

The from ... import statement imports specific names from a module:

# Import specific functions or variables
from my_module import greeting, PI

print(greeting("Bob"))  # Output: Hello, Bob!
print(PI)  # Output: 3.14159

The as keyword gives a module or function an alias:

# Using aliases
import my_module as mm
from my_module import greeting as greet

print(mm.greeting("Charlie"))  # Output: Hello, Charlie!
print(greet("David"))  # Output: Hello, David!

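Note: importing a module executes all of its top-level code. For that reason, code that should run only when the file is executed directly is usually placed inside an if __name__ == "__main__": block, which is skipped when the module is imported.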
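As a minimal sketch of that guard, the module could be laid out as follows. The main() helper is a hypothetical addition for illustration and is not part of the my_module.py shown earlier.

```python
# my_module.py with a guarded entry point (main() is a hypothetical helper)

def greeting(name):
    return f"Hello, {name}!"


def main():
    # Demo code that should run only when the file is executed directly
    print(greeting("Alice"))


if __name__ == "__main__":
    # True when run as `python my_module.py`; when the file is imported,
    # __name__ is the module name ("my_module") and main() is not called
    main()
```

Running the file directly prints the greeting, while import my_module only defines greeting and main.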

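8.2 Packages

A package is a way to organize modules: a directory that contains multiple modules. A regular package contains a file named __init__.py, which can be empty or can hold the package's initialization code. (Since Python 3.3, a directory without __init__.py can still be imported as a namespace package, but including the file remains the usual choice.)

Below is an example package layout: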

my_package/
    __init__.py
    module1.py
    module2.py
    subpackage/
        __init__.py
        module3.py

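Modules inside the package can be used like this:

# Import modules from the package
import my_package.module1
from my_package import module2
from my_package.subpackage import module3

# Call functions defined in those modules
# (some_function, another_function and third_function are placeholder names)
my_package.module1.some_function()
module2.another_function()
module3.third_function()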

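Tip: __init__.py can import modules or names from inside the package, so that users can import them directly from the package without spelling out the submodule path.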
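The following sketch illustrates that re-export pattern. To stay self-contained it builds a throwaway package in a temporary directory; the names demo_pkg, module1 and some_function are assumptions made up for the example.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk (all names here are hypothetical)
root = Path(tempfile.mkdtemp())
pkg = root / "demo_pkg"
pkg.mkdir()
(pkg / "module1.py").write_text(
    "def some_function():\n    return 'called some_function'\n",
    encoding="utf-8",
)
# __init__.py re-exports some_function from the submodule
(pkg / "__init__.py").write_text(
    "from .module1 import some_function\n",
    encoding="utf-8",
)

sys.path.insert(0, str(root))   # make the temporary directory importable
importlib.invalidate_caches()
demo_pkg = importlib.import_module("demo_pkg")

# The function is now reachable from the package itself
from demo_pkg import some_function

print(some_function())
print(demo_pkg.module1.some_function())
```

Without the line in __init__.py, callers would have to write from demo_pkg.module1 import some_function instead.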

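8.3 Common Standard Library Modules

Python ships with a large standard library that covers many everyday tasks. Some commonly used modules:

Module      Description
os          Interacting with the operating system
sys         Interacting with the Python interpreter
math        Mathematical functions
random      Random number generation
datetime    Dates and times
json        Reading and writing JSON data
re          Regular expression operations

Some examples of using standard modules: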

# Using the os module
import os
print(os.getcwd())  # current working directory
print(os.listdir())  # files and directories in the current directory

# Using the math module
import math
print(math.sqrt(16))  # Output: 4.0
print(math.pi)  # Output: 3.141592653589793

# Using the random module
import random
print(random.randint(1, 10))  # random integer from 1 to 10, both ends included
print(random.choice(['apple', 'banana', 'cherry']))  # pick one element at random

# Using the datetime module
from datetime import datetime, timedelta
now = datetime.now()
print(now)  # current date and time
print(now + timedelta(days=7))  # date and time one week from now
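The table also lists sys, json and re, which the examples above do not cover. A brief sketch of each follows; the sample strings are made up for illustration.

```python
import json
import re
import sys

# sys: information about the running interpreter
print(sys.version_info.major)

# json: convert between Python objects and JSON text
text = json.dumps({"name": "Alice", "age": 25})
print(text)
print(json.loads(text)["age"])

# re: find every run of digits in a string
numbers = re.findall(r"\d+", "Room 12, floor 3")
print(numbers)
```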

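9. Files and Data Processing

Working with files and data is a core programming skill, and Python provides a rich set of tools and libraries for handling many kinds of data.

9.1 File Operations

The built-in open() function opens a file and returns a file object, which is then used to read from or write to the file.

9.1.1 Opening and Closing Files

# Open the file
file = open('example.txt', 'r')  # 'r' means read mode

# Use the file
content = file.read()
print(content)

# Close the file
file.close()

The with statement is preferred, because it closes the file automatically, even if an exception is raised inside the block:

# Using the with statement
with open('example.txt', 'r') as file:
    content = file.read()
    print(content)
# The file is closed automatically here

9.1.2 File Open Modes

The second argument to open() is the mode. Common modes:

Mode    Description
'r'     Read (default)
'w'     Write (truncates an existing file)
'a'     Append (writes are added at the end of the file)
'b'     Binary (combined with another mode, e.g. 'rb', 'wb')
'+'     Read and write (combined with another mode, e.g. 'r+', 'w+')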
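The difference between 'w' and 'a' is easy to get wrong, so here is a small sketch of it. It works in a temporary directory, and the file name modes_demo.txt is an arbitrary choice for the example.

```python
import os
import tempfile

# Work in a temporary directory so no real files are touched
path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

with open(path, "w", encoding="utf-8") as f:   # 'w' creates the file
    f.write("first\n")

with open(path, "w", encoding="utf-8") as f:   # 'w' again: the old content is discarded
    f.write("second\n")

with open(path, "a", encoding="utf-8") as f:   # 'a' keeps the content and adds to the end
    f.write("third\n")

with open(path, "r", encoding="utf-8") as f:
    content = f.read()

print(content)
```

After these steps the file holds only the lines "second" and "third", because the second 'w' open truncated the file before writing.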

9.1.3 Reading Files

There are several ways to read a file's contents:

# Read the whole file
with open('example.txt', 'r') as file:
    content = file.read()
    print(content)

# Read line by line
with open('example.txt', 'r') as file:
    for line in file:
        print(line.strip())  # strip() removes surrounding whitespace, including the trailing newline

# Read all lines into a list
with open('example.txt', 'r') as file:
    lines = file.readlines()
    for line in lines:
        print(line.strip())

9.1.4 Writing Files

# Write strings
with open('output.txt', 'w') as file:
    file.write('Hello, World!\n')
    file.write('This is a test.\n')

# Write a list of strings (writelines() does not add newlines itself)
lines = ['First line\n', 'Second line\n', 'Third line\n']
with open('output.txt', 'w') as file:
    file.writelines(lines)

9.1.5 Binary Files

Non-text files (images, audio, and so on) must be opened in binary mode:

# Copy an image file
with open('image.jpg', 'rb') as source_file:
    with open('image_copy.jpg', 'wb') as target_file:
        target_file.write(source_file.read())
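The copy above reads the whole source file into memory at once, which is fine for small files. For large files, one possible alternative is the standard library's shutil.copyfileobj, which copies in chunks. The sketch below uses generated test data and hypothetical file names in a temporary directory.

```python
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()
source_path = os.path.join(workdir, "source.bin")   # hypothetical file names
target_path = os.path.join(workdir, "copy.bin")

# Create some binary test data (bytes 0-255 repeated)
with open(source_path, "wb") as f:
    f.write(bytes(range(256)) * 1000)

# copyfileobj reads and writes one buffer at a time instead of the whole file
with open(source_path, "rb") as source_file, open(target_path, "wb") as target_file:
    shutil.copyfileobj(source_file, target_file)

# Verify that the copy is byte-for-byte identical
with open(source_path, "rb") as a, open(target_path, "rb") as b:
    same = a.read() == b.read()
print(same)
```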

9.2 CSV Files

CSV (Comma-Separated Values) is a common data interchange format. Python's csv module reads and writes CSV files.

import csv

# Read a CSV file
with open('data.csv', 'r', newline='') as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)

# Write a CSV file
data = [
    ['Name', 'Age', 'City'],
    ['Alice', 25, 'New York'],
    ['Bob', 30, 'Los Angeles'],
    ['Charlie', 35, 'Chicago']
]

with open('output.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerows(data)

DictReader and DictWriter treat each row as a dictionary, which makes it easier to access values by column name:

import csv

# Read a CSV file with DictReader (the first row supplies the column names)
with open('data.csv', 'r', newline='') as file:
    reader = csv.DictReader(file)
    for row in reader:
        print(f"Name: {row['Name']}, Age: {row['Age']}")

# Write a CSV file with DictWriter
data = [
    {'Name': 'Alice', 'Age': 25, 'City': 'New York'},
    {'Name': 'Bob', 'Age': 30, 'City': 'Los Angeles'},
    {'Name': 'Charlie', 'Age': 35, 'City': 'Chicago'}
]

with open('output.csv', 'w', newline='') as file:
    fieldnames = ['Name', 'Age', 'City']
    writer = csv.DictWriter(file, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(data)

9.3 JSON Data

JSON (JavaScript Object Notation) is a lightweight data interchange format. Python's json module converts between JSON text and Python objects.

import json

# Convert a Python object into a JSON string
data = {
    'name': 'Alice',
    'age': 25,
    'courses': ['Math', 'Physics', 'Chemistry'],
    'is_student': True
}

json_str = json.dumps(data, indent=4)  # indent pretty-prints the output
print(json_str)

# Convert a JSON string into a Python object
# (JSON booleans are lowercase: true/false, not Python's True/False)
json_str = '{"name": "Bob", "age": 30, "courses": ["History", "Art"], "is_student": false}'
data = json.loads(json_str)
print(data)
print(data['name'])  # Output: Bob

# Write a Python object to a JSON file
with open('data.json', 'w') as file:
    json.dump(data, file, indent=4)

# Read data back from a JSON file
with open('data.json', 'r') as file:
    data = json.load(file)
    print(data)
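JSON and Python spell some values differently, which matters when writing JSON text by hand. A short sketch of the mapping, using a made-up dictionary:

```python
import json

python_value = {"passed": True, "failed": False, "note": None, "scores": (90, 85)}

# Encoding: True -> true, False -> false, None -> null, tuple -> array
encoded = json.dumps(python_value)
print(encoded)

# Decoding: the array comes back as a list, not a tuple
decoded = json.loads(encoded)
print(decoded)
```

A round trip therefore does not always return an identical object: tuples become lists, and all dictionary keys become strings.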

9.4 XML Data

XML (eXtensible Markup Language) is a markup language commonly used to store and transmit data. Python's xml.etree.ElementTree module parses and builds XML.

import xml.etree.ElementTree as ET

# Parse an XML string
xml_str = '''
<students>
    <student id="1">
        <name>Alice</name>
        <age>25</age>
        <courses>
            <course>Math</course>
            <course>Physics</course>
        </courses>
    </student>
    <student id="2">
        <name>Bob</name>
        <age>30</age>
        <courses>
            <course>History</course>
            <course>Art</course>
        </courses>
    </student>
</students>
'''

root = ET.fromstring(xml_str)

# Walk the XML tree
for student in root.findall('student'):
    student_id = student.get('id')
    name = student.find('name').text
    age = student.find('age').text
    print(f"ID: {student_id}, Name: {name}, Age: {age}")
    
    for course in student.find('courses').findall('course'):
        print(f"  Course: {course.text}")

# Create a new XML element
new_student = ET.Element('student')
new_student.set('id', '3')

name = ET.SubElement(new_student, 'name')
name.text = 'Charlie'

age = ET.SubElement(new_student, 'age')
age.text = '35'

courses = ET.SubElement(new_student, 'courses')
course1 = ET.SubElement(courses, 'course')
course1.text = 'Chemistry'
course2 = ET.SubElement(courses, 'course')
course2.text = 'Biology'

root.append(new_student)

# Convert the XML tree to a string
xml_str = ET.tostring(root, encoding='unicode')
print(xml_str)

# Write the XML tree to a file
tree = ET.ElementTree(root)
tree.write('students.xml')

9.5 Excel Files

Excel is a widely used spreadsheet application. The third-party openpyxl and pandas libraries can read and write Excel files.

9.5.1 Using openpyxl

# Install openpyxl first: pip install openpyxl
import openpyxl

# Create a new workbook
workbook = openpyxl.Workbook()

# Get the active worksheet
sheet = workbook.active
sheet.title = "Students"

# Write the data
headers = ['Name', 'Age', 'Grade']
sheet.append(headers)

data = [
    ['Alice', 25, 'A'],
    ['Bob', 30, 'B'],
    ['Charlie', 35, 'C']
]

for row in data:
    sheet.append(row)

# Save the workbook
workbook.save('students.xlsx')

# Read the Excel file
workbook = openpyxl.load_workbook('students.xlsx')
sheet = workbook.active

# Read all rows
for row in sheet.iter_rows(values_only=True):
    print(row)

# Read the value of specific cells
print(f"A1: {sheet['A1'].value}")
print(f"B2: {sheet.cell(row=2, column=2).value}")

9.5.2 Using pandas

# Install pandas and openpyxl first: pip install pandas openpyxl
import pandas as pd

# Create a DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Grade': ['A', 'B', 'C']
}
df = pd.DataFrame(data)

# Write the DataFrame to an Excel file
df.to_excel('students_pandas.xlsx', index=False)

# Read data from the Excel file
df = pd.read_excel('students_pandas.xlsx')
print(df)

# Select a specific column
print(df['Name'])

# Select specific rows
print(df.loc[0])  # first row
print(df.loc[0:1])  # first and second rows (loc slices include both endpoints)

# Filter by condition
print(df[df['Age'] > 25])

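10. Database Operations

A database is a system for storing and managing data. Python can talk to databases through both the standard library and third-party packages.

10.1 SQLite

SQLite is a lightweight embedded database; the sqlite3 module in the standard library provides support for it.

import sqlite3

# Connect to the database (the file is created if it does not exist)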
conn = sqlite3.connect('example.db')

# Create a cursor object
cursor = conn.cursor()

# Create the table
cursor.execute('''
CREATE TABLE IF NOT EXISTS students (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    age INTEGER,
    grade TEXT
)
''')

# Insert rows
cursor.execute("INSERT INTO students (name, age, grade) VALUES (?, ?, ?)", ('Alice', 25, 'A'))
cursor.execute("INSERT INTO students (name, age, grade) VALUES (?, ?, ?)", ('Bob', 30, 'B'))
cursor.execute("INSERT INTO students (name, age, grade) VALUES (?, ?, ?)", ('Charlie', 35, 'C'))

# Commit the transaction
conn.commit()

# Query the data
cursor.execute("SELECT * FROM students")
rows = cursor.fetchall()

for row in rows:
    print(row)

# Use a parameterized query
cursor.execute("SELECT * FROM students WHERE age > ?", (25,))
rows = cursor.fetchall()

for row in rows:
    print(row)

# Update a row
cursor.execute("UPDATE students SET grade = ? WHERE name = ?", ('A+', 'Alice'))
conn.commit()

# Delete a row
cursor.execute("DELETE FROM students WHERE name = ?", ('Charlie',))
conn.commit()

# Close the connection
conn.close()

Tip: parameterized queries (such as the ? placeholders above) prevent SQL injection, a common security vulnerability.
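To make the risk concrete, the sketch below compares a query built with string formatting against a parameterized one. It uses an in-memory database, and the malicious input is a made-up example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, nothing is written to disk
conn.execute("CREATE TABLE students (name TEXT, grade TEXT)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?)",
    [("Alice", "A"), ("Bob", "B")],
)

# Hypothetical malicious input
user_input = "nobody' OR '1'='1"

# Unsafe: the input becomes part of the SQL text and changes the WHERE clause
unsafe_sql = f"SELECT name FROM students WHERE name = '{user_input}'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the input is passed as a value and is compared literally
safe_rows = conn.execute(
    "SELECT name FROM students WHERE name = ?", (user_input,)
).fetchall()

print(len(unsafe_rows))  # every row matches
print(len(safe_rows))    # no student has that literal name
conn.close()
```

The formatted query returns both students even though nobody is named "nobody", while the parameterized query returns nothing.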

10.2 MySQL

MySQL is a popular relational database management system. Connecting to it from Python requires a driver; this example uses the mysql-connector-python package.

# Install the driver first: pip install mysql-connector-python
import mysql.connector
from mysql.connector import Error

# Initialize both names so the finally block is safe even if connect() fails
conn = None
cursor = None

try:
    # Connect to the MySQL database
    conn = mysql.connector.connect(
        host='localhost',
        user='your_username',
        password='your_password',
        database='your_database'
    )

    if conn.is_connected():
        print("Connected to MySQL")

        # Create a cursor object
        cursor = conn.cursor()

        # Create the table
        cursor.execute('''
        CREATE TABLE IF NOT EXISTS students (
            id INT AUTO_INCREMENT PRIMARY KEY,
            name VARCHAR(255) NOT NULL,
            age INT,
            grade VARCHAR(10)
        )
        ''')

        # Insert rows
        sql = "INSERT INTO students (name, age, grade) VALUES (%s, %s, %s)"
        rows_to_insert = [
            ('Alice', 25, 'A'),
            ('Bob', 30, 'B'),
            ('Charlie', 35, 'C'),
        ]
        for val in rows_to_insert:
            cursor.execute(sql, val)

        # Commit the transaction
        # (cursor.rowcount only reflects the last statement, so count the rows directly)
        conn.commit()
        print(f"{len(rows_to_insert)} row(s) inserted")

        # Query the data
        cursor.execute("SELECT * FROM students")
        rows = cursor.fetchall()

        for row in rows:
            print(row)

        # Use a parameterized query
        cursor.execute("SELECT * FROM students WHERE age > %s", (25,))
        rows = cursor.fetchall()

        for row in rows:
            print(row)

        # Update a row
        sql = "UPDATE students SET grade = %s WHERE name = %s"
        val = ('A+', 'Alice')
        cursor.execute(sql, val)
        conn.commit()
        print(f"{cursor.rowcount} row(s) updated")

        # Delete a row
        sql = "DELETE FROM students WHERE name = %s"
        val = ('Charlie',)
        cursor.execute(sql, val)
        conn.commit()
        print(f"{cursor.rowcount} row(s) deleted")

except Error as e:
    print("MySQL error:", e)
finally:
    # Close the cursor and connection if they were opened
    if cursor is not None:
        cursor.close()
    if conn is not None and conn.is_connected():
        conn.close()
        print("MySQL connection closed")

10.3 PostgreSQL

PostgreSQL is a powerful open-source relational database system. Connecting to it from Python requires a driver; this example uses psycopg2.

# Install psycopg2 first: pip install psycopg2
# (psycopg2 builds from source; the psycopg2-binary package is a prebuilt alternative)
import psycopg2

# Initialize both names so the finally block is safe even if connect() fails
conn = None
cursor = None

try:
    # Connect to the PostgreSQL database
    conn = psycopg2.connect(
        host="localhost",
        database="your_database",
        user="your_username",
        password="your_password"
    )

    print("Connected to PostgreSQL")

    # Create a cursor object
    cursor = conn.cursor()

    # Create the table
    cursor.execute('''
    CREATE TABLE IF NOT EXISTS students (
        id SERIAL PRIMARY KEY,
        name VARCHAR(255) NOT NULL,
        age INTEGER,
        grade VARCHAR(10)
    )
    ''')

    # Insert rows; values are passed separately through %s placeholders
    insert_query = "INSERT INTO students (name, age, grade) VALUES (%s, %s, %s)"
    rows_to_insert = [
        ('Alice', 25, 'A'),
        ('Bob', 30, 'B'),
        ('Charlie', 35, 'C'),
    ]
    for values in rows_to_insert:
        cursor.execute(insert_query, values)

    # Commit the transaction
    # (cursor.rowcount only reflects the last statement, so count the rows directly)
    conn.commit()
    print(f"{len(rows_to_insert)} row(s) inserted")

    # Query the data
    cursor.execute("SELECT * FROM students")
    rows = cursor.fetchall()

    for row in rows:
        print(row)

    # Use a parameterized query
    cursor.execute("SELECT * FROM students WHERE age > %s", (25,))
    rows = cursor.fetchall()

    for row in rows:
        print(row)

    # Update a row
    update_query = "UPDATE students SET grade = %s WHERE name = %s"
    cursor.execute(update_query, ('A+', 'Alice'))
    conn.commit()
    print(f"{cursor.rowcount} row(s) updated")

    # Delete a row
    delete_query = "DELETE FROM students WHERE name = %s"
    cursor.execute(delete_query, ('Charlie',))
    conn.commit()
    print(f"{cursor.rowcount} row(s) deleted")

except psycopg2.Error as error:
    print("PostgreSQL error:", error)
finally:
    # Close the cursor and connection if they were opened
    if cursor is not None:
        cursor.close()
    if conn is not None:
        conn.close()
        print("PostgreSQL connection closed")

10.4 MongoDB

MongoDB is a popular NoSQL database that stores documents rather than tables. Connecting to it from Python requires the pymongo library.

# Install pymongo first: pip install pymongo
from pymongo import MongoClient
from pymongo.errors import ConnectionFailure

try:
    # 连接到MongoDB
    client = MongoClient('mongodb://localhost:27017/')
    
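    # Test the connection (MongoClient connects lazily, so issue a cheap command;
    # 'ping' replaces the deprecated 'ismaster' command)
    client.admin.command('ping')
    print("Connected to MongoDB")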
    
    # 创建或获取数据库
    db = client['university']
    
    # 创建或获取集合
    students_collection = db['students']
    
    # 插入文档
    student1 = {
        "name": "Alice",
        "age": 25,
        "grade": "A",
        "courses": ["Math", "Physics", "Chemistry"]
    }
    
    student2 = {
        "name": "Bob",
        "age": 30,
        "grade": "B",
        "courses": ["History", "Art", "Literature"]
    }
    
    student3 = {
        "name": "Charlie",
        "age": 35,
        "grade": "C",
        "courses": ["Biology", "Geology", "Astronomy"]
    }
    
    # 插入单个文档
    result = students_collection.insert_one(student1)
    print(f"插入的文档ID: {result.inserted_id}")
    
    # 插入多个文档
    result = students_collection.insert_many([student2, student3])
    print(f"插入的文档ID: {result.inserted_ids}")
    
    # 查询文档
    print("所有学生:")
    for student in students_collection.find():
        print(student)
    
    # 条件查询
    print("年龄大于25的学生:")
    for student in students_collection.find({"age": {"$gt": 25}}):
        print(student)
    
    # 更新文档
    result = students_collection.update_one(
        {"name": "Alice"},
        {"$set": {"grade": "A+"}}
    )
    print(f"更新的文档数量: {result.modified_count}")
    
    # 删除文档
    result = students_collection.delete_one({"name": "Charlie"})
    print(f"删除的文档数量: {result.deleted_count}")

except ConnectionFailure:
    print("无法连接到MongoDB")
finally:
    # 关闭连接
    if 'client' in locals():
        client.close()
        print("MongoDB连接已关闭")

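10.5 ORM Frameworks

An ORM (Object-Relational Mapping) framework maps database tables to Python classes, so the database can be used in an object-oriented way. SQLAlchemy is one of the most popular ORM frameworks for Python.

# Install SQLAlchemy first: pip install sqlalchemy
from sqlalchemy import create_engine, Column, Integer, String, Float
# declarative_base lives in sqlalchemy.orm since SQLAlchemy 1.4;
# the old sqlalchemy.ext.declarative location is deprecated
from sqlalchemy.orm import declarative_base, sessionmaker

# Create the base class
Base = declarative_base()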

# 定义学生类
class Student(Base):
    __tablename__ = 'students'
    
    id = Column(Integer, primary_key=True)
    name = Column(String(255), nullable=False)
    age = Column(Integer)
    grade = Column(String(10))
    score = Column(Float)
    
    def __repr__(self):
        return f"<Student(name='{self.name}', age={self.age}, grade='{self.grade}', score={self.score})>"

# 创建数据库引擎
engine = create_engine('sqlite:///university.db')

# 创建表
Base.metadata.create_all(engine)

# 创建会话
Session = sessionmaker(bind=engine)
session = Session()

# 添加学生
student1 = Student(name='Alice', age=25, grade='A', score=95.5)
student2 = Student(name='Bob', age=30, grade='B', score=85.0)
student3 = Student(name='Charlie', age=35, grade='C', score=75.5)

session.add(student1)
session.add(student2)
session.add(student3)

# 提交会话
session.commit()

# 查询所有学生
students = session.query(Student).all()
for student in students:
    print(student)

# 条件查询
students = session.query(Student).filter(Student.age > 25).all()
for student in students:
    print(student)

# 更新学生
student = session.query(Student).filter_by(name='Alice').first()
student.grade = 'A+'
student.score = 98.0
session.commit()

# 删除学生
student = session.query(Student).filter_by(name='Charlie').first()
session.delete(student)
session.commit()

# 关闭会话
session.close()

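11. Network Programming Basics

Network programming means writing programs that communicate with other computers over a network. Python provides a rich set of libraries for it.

11.1 Socket Programming

Sockets are the foundation of network programming: they provide a channel for communication between processes on different machines. Python's socket module exposes the socket API.

11.1.1 TCP Socket

TCP (Transmission Control Protocol) is a connection-oriented, reliable, byte-stream transport-layer protocol.

# TCP server
import socket

# Create the socket object
server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Bind the address and port
server_address = ('localhost', 8888)
server_socket.bind(server_address)

# Start listening; 5 is the backlog of pending connections, not a cap on total connections
server_socket.listen(5)
print(f"Server started, listening on {server_address}")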

while True:
    # 等待客户端连接
    print("等待客户端连接...")
    client_socket, client_address = server_socket.accept()
    print(f"客户端 {client_address} 已连接")
    
    try:
        # 接收数据
        data = client_socket.recv(1024)
        print(f"接收到数据: {data.decode()}")
        
        # 发送数据
        message = "Hello, Client!"
        client_socket.sendall(message.encode())
        print("已发送数据")
        
    finally:
        # 关闭连接
        client_socket.close()
        print(f"与客户端 {client_address} 的连接已关闭")

# TCP client (a separate program; run it while the server is running)
import socket

# Create the socket object
client_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# 连接服务器
server_address = ('localhost', 8888)
client_socket.connect(server_address)
print(f"已连接到服务器 {server_address}")

try:
    # 发送数据
    message = "Hello, Server!"
    client_socket.sendall(message.encode())
    print("已发送数据")
    
    # 接收数据
    data = client_socket.recv(1024)
    print(f"接收到数据: {data.decode()}")
    
finally:
    # 关闭连接
    client_socket.close()
    print("连接已关闭")

11.1.2 UDP Socket

UDP (User Datagram Protocol) is a connectionless transport-layer protocol. It does not guarantee delivery or ordering of datagrams, but it has lower latency.

# UDP server
import socket

# 创建socket对象
server_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

# 绑定地址和端口
server_address = ('localhost', 9999)
server_socket.bind(server_address)
print(f"服务器启动,监听 {server_address}")

while True:
    # 接收数据和客户端地址
    data, client_address = server_socket.recvfrom(1024)
    print(f"从 {client_address} 接收到数据: {data.decode()}")
    
    # 发送数据
    message = "Hello, Client!"
    server_socket.sendto(message.encode(), client_address)
    print(f"已向 {client_address} 发送数据")

# UDP client (a separate program; run it while the server is running)
import socket

# 创建socket对象
client_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

# 服务器地址
server_address = ('localhost', 9999)

try:
    # 发送数据
    message = "Hello, Server!"
    client_socket.sendto(message.encode(), server_address)
    print(f"已向 {server_address} 发送数据")
    
    # 接收数据
    data, server_address = client_socket.recvfrom(1024)
    print(f"从 {server_address} 接收到数据: {data.decode()}")
    
finally:
    # Close the socket (UDP has no connection to tear down)
    client_socket.close()
    print("连接已关闭")

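11.2 HTTP Programming

HTTP (HyperText Transfer Protocol) is one of the most widely used protocols on the internet. The third-party requests library provides a concise HTTP client.

# Install requests first: pip install requests
import requests
import json

# Send a GET request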
response = requests.get('https://api.github.com')
print(f"状态码: {response.status_code}")
print(f"响应头: {response.headers}")
print(f"响应内容: {response.text}")

# 发送带参数的GET请求
params = {'q': 'python', 'sort': 'stars'}
response = requests.get('https://api.github.com/search/repositories', params=params)
print(f"URL: {response.url}")
data = response.json()
print(f"找到的仓库数量: {data['total_count']}")

# 发送POST请求
url = 'https://httpbin.org/post'
data = {'name': 'Alice', 'age': 25}
response = requests.post(url, data=data)
print(f"状态码: {response.status_code}")
print(f"响应内容: {response.json()}")

# Send JSON data
url = 'https://httpbin.org/post'
data = {'name': 'Bob', 'age': 30}
headers = {'Content-Type': 'application/json'}
# Equivalent shorthand: requests.post(url, json=data)
response = requests.post(url, data=json.dumps(data), headers=headers)
print(f"Status code: {response.status_code}")
print(f"Response body: {response.json()}")

# Upload a file
url = 'https://httpbin.org/post'
with open('example.txt', 'rb') as f:  # the with block closes the file after the upload
    response = requests.post(url, files={'file': f})
print(f"Status code: {response.status_code}")
print(f"Response body: {response.json()}")

# Handle errors
try:
    response = requests.get('https://api.github.com/invalid-url')
    response.raise_for_status()  # raises HTTPError for 4xx and 5xx status codes
except requests.exceptions.RequestException as e:
    print(f"Request failed: {e}")

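11.3 Web Servers

Python's http.server module provides a simple HTTP server implementation that is enough for a basic web server. It is intended for development and testing, not production use.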

import http.server
import socketserver

# 设置端口
PORT = 8000

# 创建请求处理程序
class MyRequestHandler(http.server.SimpleHTTPRequestHandler):
    def do_GET(self):
        if self.path == '/':
            self.path = '/index.html'
        return http.server.SimpleHTTPRequestHandler.do_GET(self)

# 创建服务器
with socketserver.TCPServer(("", PORT), MyRequestHandler) as httpd:
    print(f"服务器启动,监听端口 {PORT}")
    # 启动服务器
    httpd.serve_forever()

For more complex web applications, a web framework such as Flask or Django is a better fit.

11.3.1 The Flask Framework

# Install Flask first: pip install flask
from flask import Flask, render_template, request, jsonify

# 创建Flask应用
app = Flask(__name__)

# 路由和视图函数
@app.route('/')
def index():
    return "Hello, World!"

@app.route('/hello/<name>')
def hello(name):
    return f"Hello, {name}!"

@app.route('/user/<int:user_id>')
def user_profile(user_id):
    return f"User ID: {user_id}"

# 处理GET和POST请求
@app.route('/login', methods=['GET', 'POST'])
def login():
    if request.method == 'POST':
        username = request.form['username']
        password = request.form['password']
        # 这里可以添加验证逻辑
        return f"Welcome, {username}!"
    else:
        return '''
            <form method="post">
                <p>Username: <input type="text" name="username"></p>
                <p>Password: <input type="password" name="password"></p>
                <p><input type="submit" value="Login"></p>
            </form>
        '''

# 返回JSON数据
@app.route('/api/data')
def get_data():
    data = {
        'name': 'Alice',
        'age': 25,
        'courses': ['Math', 'Physics', 'Chemistry']
    }
    return jsonify(data)

# 返回HTML模板
@app.route('/template')
def template():
    # 假设有一个名为template.html的模板文件
    return render_template('template.html', name='Alice')

# 错误处理
@app.errorhandler(404)
def page_not_found(e):
    return "Page not found", 404

# 启动应用
if __name__ == '__main__':
    app.run(debug=True)

11.3.2 The Django Framework

# Install Django first: pip install django
# Create a Django project: django-admin startproject myproject
# Create a Django app: python manage.py startapp myapp

# myapp/views.py
from django.http import HttpResponse, JsonResponse
from django.shortcuts import render

def index(request):
    return HttpResponse("Hello, World!")

def hello(request, name):
    return HttpResponse(f"Hello, {name}!")

def user_profile(request, user_id):
    return HttpResponse(f"User ID: {user_id}")

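def login(request):
    if request.method == 'POST':
        username = request.POST.get('username')
        password = request.POST.get('password')
        # Add validation logic here
        return HttpResponse(f"Welcome, {username}!")
    else:
        # A view must return an HttpResponse, not a bare string.
        # Note: with Django's default CsrfViewMiddleware enabled, the POST is
        # rejected unless the form carries a CSRF token; rendering the form from
        # a template that contains {% csrf_token %} is the usual approach.
        return HttpResponse('''
            <form method="post">
                <p>Username: <input type="text" name="username"></p>
                <p>Password: <input type="password" name="password"></p>
                <p><input type="submit" value="Login"></p>
            </form>
        ''')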

def get_data(request):
    data = {
        'name': 'Alice',
        'age': 25,
        'courses': ['Math', 'Physics', 'Chemistry']
    }
    return JsonResponse(data)

def template(request):
    # 假设有一个名为template.html的模板文件
    return render(request, 'template.html', {'name': 'Alice'})

# myproject/urls.py
from django.contrib import admin
from django.urls import path
from myapp import views

urlpatterns = [
    path('admin/', admin.site.urls),
    path('', views.index, name='index'),
    path('hello/<str:name>/', views.hello, name='hello'),
    path('user/<int:user_id>/', views.user_profile, name='user_profile'),
    path('login/', views.login, name='login'),
    path('api/data/', views.get_data, name='get_data'),
    path('template/', views.template, name='template'),
]

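11.4 Web Scraping

A web crawler is a program that browses the web automatically and collects information. The third-party requests and BeautifulSoup libraries cover fetching and parsing pages.

# Install requests and beautifulsoup4 first: pip install requests beautifulsoup4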
import requests
from bs4 import BeautifulSoup
import csv
import json

# 发送HTTP请求
url = 'https://example.com'
response = requests.get(url)

# 检查请求是否成功
if response.status_code == 200:
    # 解析HTML内容
    soup = BeautifulSoup(response.text, 'html.parser')
    
    # 提取标题
    title = soup.title.text
    print(f"标题: {title}")
    
    # 提取所有链接
    links = soup.find_all('a')
    print("所有链接:")
    for link in links:
        print(link.get('href'))
    
    # 提取特定元素
    # 假设我们要提取所有class为'item'的div元素
    items = soup.find_all('div', class_='item')
    print("所有项目:")
    for item in items:
        print(item.text.strip())
    
    # 提取表格数据
    table = soup.find('table')
    if table:
        rows = table.find_all('tr')
        data = []
        for row in rows:
            cells = row.find_all(['td', 'th'])
            row_data = [cell.text.strip() for cell in cells]
            data.append(row_data)
        
        # 将数据保存为CSV文件
        with open('table_data.csv', 'w', newline='', encoding='utf-8') as file:
            writer = csv.writer(file)
            writer.writerows(data)
        
        # 将数据保存为JSON文件
        with open('table_data.json', 'w', encoding='utf-8') as file:
            json.dump(data, file, ensure_ascii=False, indent=4)
    
    # 处理分页
    # 假设我们要爬取多个页面的数据
    all_data = []
    for page in range(1, 6):  # 爬取前5页
        page_url = f'{url}?page={page}'
        page_response = requests.get(page_url)
        
        if page_response.status_code == 200:
            page_soup = BeautifulSoup(page_response.text, 'html.parser')
            page_items = page_soup.find_all('div', class_='item')
            
            for item in page_items:
                item_data = {
                    'title': item.find('h2').text.strip(),
                    'description': item.find('p').text.strip(),
                    'link': item.find('a').get('href')
                }
                all_data.append(item_data)
    
    # 将所有数据保存为JSON文件
    with open('all_data.json', 'w', encoding='utf-8') as file:
        json.dump(all_data, file, ensure_ascii=False, indent=4)
    
    print(f"共爬取了 {len(all_data)} 条数据")

else:
    print(f"请求失败,状态码: {response.status_code}")

Warning: when scraping, follow the site's robots.txt rules and respect its terms of use. Sending requests too frequently can overload the server and may even be illegal.
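One way to check robots.txt rules is the standard library's urllib.robotparser. The sketch below parses a made-up robots.txt from a string so that it runs offline. The crawler name and the rules are assumptions for illustration; a real crawler would load the site's file with set_url() and read().

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would instead call
# parser.set_url("https://example.com/robots.txt") followed by parser.read()
robots_txt = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

user_agent = "my-tutorial-bot"  # hypothetical crawler name
allowed = parser.can_fetch(user_agent, "https://example.com/index.html")
blocked = parser.can_fetch(user_agent, "https://example.com/private/data.html")
delay = parser.crawl_delay(user_agent)  # None when no Crawl-delay is given

print(allowed, blocked, delay)
```

A crawler could skip any URL for which can_fetch() returns False and call time.sleep(delay) between requests when a delay is specified.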

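12. Introduction to GUI Programming

A GUI (Graphical User Interface) lets users interact with a program graphically. Python has several GUI libraries, including Tkinter, PyQt, and wxPython.

12.1 Tkinter

Tkinter is Python's standard GUI library. It is simple to use and well suited to beginners.

12.1.1 A Basic Window

import tkinter as tk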

# 创建主窗口
root = tk.Tk()
root.title("我的第一个GUI程序")
root.geometry("300x200")  # 设置窗口大小

# 添加标签
label = tk.Label(root, text="Hello, World!")
label.pack(pady=10)  # pady参数用于设置垂直间距

# 添加按钮
def button_click():
    print("按钮被点击了!")

button = tk.Button(root, text="点击我", command=button_click)
button.pack(pady=10)

# 添加文本框
entry = tk.Entry(root)
entry.pack(pady=10)

# 添加文本区域
text = tk.Text(root, height=5, width=30)
text.pack(pady=10)

# 运行主循环
root.mainloop()

12.1.2 Layout Management

Tkinter provides three layout managers: pack, grid, and place.

import tkinter as tk

root = tk.Tk()
root.title("布局管理")
root.geometry("400x300")

# Pack布局
frame1 = tk.Frame(root, bg='red', height=100)
frame1.pack(fill=tk.X, padx=5, pady=5)

label1 = tk.Label(frame1, text="Pack布局", bg='red')
label1.pack()

# Grid布局
frame2 = tk.Frame(root, bg='green', height=100)
frame2.pack(fill=tk.BOTH, expand=True, padx=5, pady=5)

for i in range(3):
    for j in range(3):
        label = tk.Label(frame2, text=f"Grid {i},{j}", bg='green')
        label.grid(row=i, column=j, padx=5, pady=5)

# Place布局
frame3 = tk.Frame(root, bg='blue', height=100)
frame3.pack(fill=tk.X, padx=5, pady=5)

label3 = tk.Label(frame3, text="Place布局", bg='blue')
label3.place(x=150, y=40)

root.mainloop()

12.1.3 Event Handling

import tkinter as tk
from tkinter import messagebox

def button_click():
    messagebox.showinfo("消息", "按钮被点击了!")

def key_press(event):
    print(f"按下了键: {event.char}")

def mouse_click(event):
    print(f"鼠标点击位置: ({event.x}, {event.y})")

root = tk.Tk()
root.title("事件处理")
root.geometry("300x200")

# 绑定键盘事件
root.bind("<Key>", key_press)

# 绑定鼠标事件
root.bind("<Button-1>", mouse_click)

# 添加按钮
button = tk.Button(root, text="点击我", command=button_click)
button.pack(pady=20)

root.mainloop()

12.1.4 Common Widgets

import tkinter as tk
from tkinter import ttk, messagebox, filedialog

def show_selection():
    selected = listbox.curselection()
    if selected:
        messagebox.showinfo("选择", f"你选择了: {listbox.get(selected[0])}")

def open_file():
    file_path = filedialog.askopenfilename()
    if file_path:
        with open(file_path, 'r') as file:
            text.delete(1.0, tk.END)
            text.insert(tk.END, file.read())

def save_file():
    file_path = filedialog.asksaveasfilename()
    if file_path:
        with open(file_path, 'w') as file:
            file.write(text.get(1.0, tk.END))

root = tk.Tk()
root.title("常用组件")
root.geometry("500x400")

# 标签
label = tk.Label(root, text="标签")
label.pack(pady=5)

# 按钮
button_frame = tk.Frame(root)
button_frame.pack(pady=5)

button1 = tk.Button(button_frame, text="打开文件", command=open_file)
button1.pack(side=tk.LEFT, padx=5)

button2 = tk.Button(button_frame, text="保存文件", command=save_file)
button2.pack(side=tk.LEFT, padx=5)

button3 = tk.Button(button_frame, text="显示选择", command=show_selection)
button3.pack(side=tk.LEFT, padx=5)

# 输入框
entry = tk.Entry(root, width=40)
entry.pack(pady=5)

# 列表框
listbox_frame = tk.Frame(root)
listbox_frame.pack(pady=5, fill=tk.X)

listbox = tk.Listbox(listbox_frame)
listbox.pack(side=tk.LEFT, fill=tk.BOTH, expand=True)

scrollbar = tk.Scrollbar(listbox_frame)
scrollbar.pack(side=tk.RIGHT, fill=tk.Y)

listbox.config(yscrollcommand=scrollbar.set)
scrollbar.config(command=listbox.yview)

for i in range(10):
    listbox.insert(tk.END, f"项目 {i+1}")

# 单选按钮
radio_frame = tk.Frame(root)
radio_frame.pack(pady=5)

var = tk.StringVar()
var.set("选项1")

radio1 = tk.Radiobutton(radio_frame, text="选项1", variable=var, value="选项1")
radio1.pack(side=tk.LEFT)

radio2 = tk.Radiobutton(radio_frame, text="选项2", variable=var, value="选项2")
radio2.pack(side=tk.LEFT)

radio3 = tk.Radiobutton(radio_frame, text="选项3", variable=var, value="选项3")
radio3.pack(side=tk.LEFT)

# 复选框
check_frame = tk.Frame(root)
check_frame.pack(pady=5)

check1_var = tk.BooleanVar()
check2_var = tk.BooleanVar()
check3_var = tk.BooleanVar()

check1 = tk.Checkbutton(check_frame, text="选项A", variable=check1_var)
check1.pack(side=tk.LEFT)

check2 = tk.Checkbutton(check_frame, text="选项B", variable=check2_var)
check2.pack(side=tk.LEFT)

check3 = tk.Checkbutton(check_frame, text="选项C", variable=check3_var)
check3.pack(side=tk.LEFT)

# 下拉框
combo_frame = tk.Frame(root)
combo_frame.pack(pady=5)

combo_label = tk.Label(combo_frame, text="选择:")
combo_label.pack(side=tk.LEFT)

combo = ttk.Combobox(combo_frame, values=["选项1", "选项2", "选项3"])
combo.pack(side=tk.LEFT)
combo.current(0)

# 滑块
scale = tk.Scale(root, from_=0, to=100, orient=tk.HORIZONTAL)
scale.pack(pady=5, fill=tk.X)

# 文本区域
text = tk.Text(root, height=10, width=50)
text.pack(pady=5, fill=tk.BOTH, expand=True)

root.mainloop()

12.2 PyQt

PyQt is a powerful GUI library with a rich set of widgets and features. It is available as PyQt5 and PyQt6; the examples here use PyQt5.

12.2.1 A Basic Window

# Install PyQt5 first: pip install pyqt5
import sys
from PyQt5.QtWidgets import QApplication, QWidget, QLabel, QPushButton, QVBoxLayout, QLineEdit, QTextEdit

class MyWindow(QWidget):
    def __init__(self):
        super().__init__()
        self.initUI()
    
    def initUI(self):
        # 设置窗口标题和大小
        self.setWindowTitle('我的第一个PyQt程序')
        self.setGeometry(300, 300, 300, 200)
        
        # 创建布局
        layout = QVBoxLayout()
        
        # 添加标签
        label = QLabel('Hello, World!')
        layout.addWidget(label)
        
        # 添加按钮
        button = QPushButton('点击我')
        button.clicked.connect(self.button_click)
        layout.addWidget(button)
        
        # 添加文本框
        self.entry = QLineEdit()
        layout.addWidget(self.entry)
        
        # 添加文本区域
        self.text = QTextEdit()
        layout.addWidget(self.text)
        
        # 设置布局
        self.setLayout(layout)
    
    def button_click(self):
        text = self.entry.text()
        self.text.append(f"按钮被点击了! 输入的内容是: {text}")

if __name__ == '__main__':
    app = QApplication(sys.argv)
    window = MyWindow()
    window.show()
    sys.exit(app.exec_())

12.2.2 Signals and Slots

import sys
from PyQt5.QtWidgets import QApplication, QWidget, QLabel, QPushButton, QVBoxLayout, QSlider, QSpinBox, QComboBox
from PyQt5.QtCore import Qt

class MyWindow(QWidget):
    def __init__(self):
        super().__init__()
        self.initUI()
    
    def initUI(self):
        self.setWindowTitle('信号和槽')
        self.setGeometry(300, 300, 300, 200)
        
        layout = QVBoxLayout()
        
        # 标签
        self.label = QLabel('0')
        self.label.setAlignment(Qt.AlignCenter)
        layout.addWidget(self.label)
        
        # 滑块
        slider = QSlider(Qt.Horizontal)
        slider.setRange(0, 100)
        slider.valueChanged.connect(self.update_label)
        layout.addWidget(slider)
        
        # 微调框
        spin_box = QSpinBox()
        spin_box.setRange(0, 100)
        spin_box.valueChanged.connect(self.update_label)
        layout.addWidget(spin_box)
        
        # 下拉框
        combo_box = QComboBox()
        combo_box.addItems(['选项1', '选项2', '选项3'])
        combo_box.currentTextChanged.connect(self.update_label_text)
        layout.addWidget(combo_box)
        
        # 按钮
        button = QPushButton('重置')
        button.clicked.connect(self.reset)
        layout.addWidget(button)
        
        self.setLayout(layout)
    
    def update_label(self, value):
        self.label.setText(str(value))
    
    def update_label_text(self, text):
        self.label.setText(text)
    
    def reset(self):
        self.label.setText('0')

if __name__ == '__main__':
    app = QApplication(sys.argv)
    window = MyWindow()
    window.show()
    sys.exit(app.exec_())
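
The signal and slot mechanism used in this example can be thought of as an observer pattern: a signal keeps a list of callables and calls each of them when it is emitted. The following is a minimal pure-Python sketch of that idea, not Qt's actual implementation; the `Signal` class and the `Label` stand-in are hypothetical names introduced only for illustration.

```python
class Signal:
    """A toy signal: stores connected slots and calls them on emit()."""

    def __init__(self):
        self._slots = []

    def connect(self, slot):
        # Register a callable to be invoked whenever the signal is emitted.
        self._slots.append(slot)

    def emit(self, *args):
        # Call every connected slot with the emitted arguments, in connection order.
        for slot in self._slots:
            slot(*args)


class Label:
    """Stand-in for QLabel that only remembers its text."""

    def __init__(self, text=""):
        self.text = text

    def set_text(self, text):
        self.text = text


label = Label("0")
value_changed = Signal()

# Mirrors slider.valueChanged.connect(self.update_label) in the PyQt example.
value_changed.connect(lambda value: label.set_text(str(value)))

value_changed.emit(42)
print(label.text)
```

Because several slots can be connected to one signal, and one slot to several signals, the slider and the spin box in the PyQt example can both drive the same label without knowing about each other.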

12.2.3 Common Widgets

import sys
from PyQt5.QtWidgets import (QApplication, QWidget, QLabel, QPushButton, QVBoxLayout, QHBoxLayout, 
                            QLineEdit, QTextEdit, QSlider, QSpinBox, QComboBox, QCheckBox, 
                            QRadioButton, QGroupBox, QListWidget, QTabWidget, QTableWidget, 
                            QTableWidgetItem, QFileDialog, QMessageBox, QProgressBar)
from PyQt5.QtCore import Qt

class MyWindow(QWidget):
    def __init__(self):
        super().__init__()
        self.initUI()
    
    def initUI(self):
        self.setWindowTitle('常用组件')
        self.setGeometry(300, 300, 800, 600)
        
        main_layout = QVBoxLayout()
        
        # 标签页
        tab_widget = QTabWidget()
        main_layout.addWidget(tab_widget)
        
        # 第一个标签页:基本组件
        basic_tab = QWidget()
        tab_widget.addTab(basic_tab, "基本组件")
        
        basic_layout = QVBoxLayout()
        basic_tab.setLayout(basic_layout)
        
        # 输入框
        input_layout = QHBoxLayout()
        input_label = QLabel("输入:")
        self.input_field = QLineEdit()
        input_layout.addWidget(input_label)
        input_layout.addWidget(self.input_field)
        basic_layout.addLayout(input_layout)
        
        # 文本区域
        self.text_area = QTextEdit()
        basic_layout.addWidget(self.text_area)
        
        # 按钮组
        button_layout = QHBoxLayout()
        
        open_button = QPushButton("打开文件")
        open_button.clicked.connect(self.open_file)
        button_layout.addWidget(open_button)
        
        save_button = QPushButton("保存文件")
        save_button.clicked.connect(self.save_file)
        button_layout.addWidget(save_button)
        
        clear_button = QPushButton("清空")
        clear_button.clicked.connect(self.clear_text)
        button_layout.addWidget(clear_button)
        
        basic_layout.addLayout(button_layout)
        
        # 进度条
        self.progress_bar = QProgressBar()
        self.progress_bar.setValue(50)
        basic_layout.addWidget(self.progress_bar)
        
        # 第二个标签页:选择组件
        selection_tab = QWidget()
        tab_widget.addTab(selection_tab, "选择组件")
        
        selection_layout = QVBoxLayout()
        selection_tab.setLayout(selection_layout)
        
        # 单选按钮组
        radio_group = QGroupBox("单选按钮")
        radio_layout = QVBoxLayout()
        
        self.radio1 = QRadioButton("选项1")
        self.radio2 = QRadioButton("选项2")
        self.radio3 = QRadioButton("选项3")
        
        radio_layout.addWidget(self.radio1)
        radio_layout.addWidget(self.radio2)
        radio_layout.addWidget(self.radio3)
        
        radio_group.setLayout(radio_layout)
        selection_layout.addWidget(radio_group)
        
        # 复选框组
        check_group = QGroupBox("复选框")
        check_layout = QVBoxLayout()
        
        self.check1 = QCheckBox("选项A")
        self.check2 = QCheckBox("选项B")
        self.check3 = QCheckBox("选项C")
        
        check_layout.addWidget(self.check1)
        check_layout.addWidget(self.check2)
        check_layout.addWidget(self.check3)
        
        check_group.setLayout(check_layout)
        selection_layout.addWidget(check_group)
        
        # 下拉框
        combo_layout = QHBoxLayout()
        combo_label = QLabel("下拉框:")
        self.combo_box = QComboBox()
        self.combo_box.addItems(["选项1", "选项2", "选项3"])
        combo_layout.addWidget(combo_label)
        combo_layout.addWidget(self.combo_box)
        selection_layout.addLayout(combo_layout)
        
        # 列表框
        list_label = QLabel("列表框:")
        selection_layout.addWidget(list_label)
        
        self.list_widget = QListWidget()
        for i in range(10):
            self.list_widget.addItem(f"项目 {i+1}")
        selection_layout.addWidget(self.list_widget)
        
        # 第三个标签页:表格
        table_tab = QWidget()
        tab_widget.addTab(table_tab, "表格")
        
        table_layout = QVBoxLayout()
        table_tab.setLayout(table_layout)
        
        # 表格
        self.table_widget = QTableWidget(5, 3)
        self.table_widget.setHorizontalHeaderLabels(["姓名", "年龄", "成绩"])
        
        # 填充表格数据
        data = [
            ["Alice", "25", "A"],
            ["Bob", "30", "B"],
            ["Charlie", "35", "C"],
            ["David", "28", "A"],
            ["Eve", "32", "B"]
        ]
        
        for i in range(5):
            for j in range(3):
                item = QTableWidgetItem(data[i][j])
                self.table_widget.setItem(i, j, item)
        
        table_layout.addWidget(self.table_widget)
        
        # 表格按钮
        table_button_layout = QHBoxLayout()
        
        add_row_button = QPushButton("添加行")
        add_row_button.clicked.connect(self.add_row)
        table_button_layout.addWidget(add_row_button)
        
        remove_row_button = QPushButton("删除行")
        remove_row_button.clicked.connect(self.remove_row)
        table_button_layout.addWidget(remove_row_button)
        
        table_layout.addLayout(table_button_layout)
        
        self.setLayout(main_layout)
    
    def open_file(self):
        file_path, _ = QFileDialog.getOpenFileName(self, "打开文件", "", "文本文件 (*.txt)")
        if file_path:
            with open(file_path, 'r', encoding='utf-8') as file:
                self.text_area.setText(file.read())
    
    def save_file(self):
        file_path, _ = QFileDialog.getSaveFileName(self, "保存文件", "", "文本文件 (*.txt)")
        if file_path:
            with open(file_path, 'w', encoding='utf-8') as file:
                file.write(self.text_area.toPlainText())
    
    def clear_text(self):
        self.text_area.clear()
    
    def add_row(self):
        row_count = self.table_widget.rowCount()
        self.table_widget.insertRow(row_count)
    
    def remove_row(self):
        current_row = self.table_widget.currentRow()
        if current_row >= 0:
            self.table_widget.removeRow(current_row)

if __name__ == '__main__':
    app = QApplication(sys.argv)
    window = MyWindow()
    window.show()
    sys.exit(app.exec_())

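13. Common Python Libraries

Python has a large ecosystem of third-party libraries that make it quick to build many kinds of functionality. This chapter introduces some of the most widely used ones.

13.1 Data Science Libraries

13.1.1 NumPy

NumPy is the foundational library for scientific computing in Python. It provides a high-performance multidimensional array object and tools for working with those arrays.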

# 首先需要安装NumPy: pip install numpy
import numpy as np

# 创建数组
a = np.array([1, 2, 3, 4, 5])
print(f"一维数组: {a}")

b = np.array([[1, 2, 3], [4, 5, 6]])
print(f"二维数组:\n{b}")

# 创建特殊数组
zeros = np.zeros((3, 4))  # 全零数组
print(f"全零数组:\n{zeros}")

ones = np.ones((2, 3))  # 全一数组
print(f"全一数组:\n{ones}")

random_arr = np.random.random((2, 3))  # random array; named random_arr to avoid shadowing the standard random module
print(f"Random array:\n{random_arr}")

# 数组属性
print(f"数组形状: {b.shape}")
print(f"数组维度: {b.ndim}")
print(f"数组元素类型: {b.dtype}")
print(f"数组元素总数: {b.size}")

# 数组操作
c = np.array([[1, 2], [3, 4]])
d = np.array([[5, 6], [7, 8]])

# 数组加法
print(f"数组加法:\n{c + d}")

# 数组乘法(元素级)
print(f"数组乘法(元素级):\n{c * d}")

# 矩阵乘法
print(f"矩阵乘法:\n{np.dot(c, d)}")

# 数组索引和切片
e = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(f"第一行: {e[0]}")
print(f"第一列: {e[:, 0]}")
print(f"子数组:\n{e[0:2, 1:3]}")

# 数组形状操作
f = np.array([[1, 2, 3], [4, 5, 6]])
print(f"原始数组形状: {f.shape}")

g = f.reshape(3, 2)
print(f"重塑后的数组:\n{g}")

h = f.flatten()  # 将数组展平为一维
print(f"展平后的数组: {h}")

# 数组计算
i = np.array([1, 2, 3, 4, 5])
print(f"数组求和: {np.sum(i)}")
print(f"数组平均值: {np.mean(i)}")
print(f"数组最大值: {np.max(i)}")
print(f"数组最小值: {np.min(i)}")
print(f"数组标准差: {np.std(i)}")

# 数组排序
j = np.array([3, 1, 5, 2, 4])
print(f"排序后的数组: {np.sort(j)}")

# 数组条件操作
k = np.array([1, 2, 3, 4, 5])
print(f"大于2的元素: {k[k > 2]}")
print(f"偶数元素: {k[k % 2 == 0]}")
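
The NumPy listing distinguishes element-wise multiplication (`c * d`) from matrix multiplication (`np.dot(c, d)`). As a rough illustration of what each one computes, here is a pure-Python sketch on the same two matrices; the helper names `elementwise_product` and `matrix_product` are made up for this example and are not NumPy functions.

```python
c = [[1, 2], [3, 4]]
d = [[5, 6], [7, 8]]


def elementwise_product(a, b):
    """Multiply matching entries, like `a * b` on two same-shaped NumPy arrays."""
    return [[x * y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def matrix_product(a, b):
    """Row-by-column sums of products, like `np.dot(a, b)` on two 2-D arrays."""
    cols_b = list(zip(*b))  # transpose b so each column can be iterated as a tuple
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


print(elementwise_product(c, d))  # [[5, 12], [21, 32]]
print(matrix_product(c, d))       # [[19, 22], [43, 50]]
```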
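
13.1.2 Pandas

Pandas is a library for data analysis and manipulation. It provides data structures such as DataFrame and Series, along with a wide range of tools for operating on data.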

# 首先需要安装Pandas: pip install pandas
import pandas as pd
import numpy as np

# 创建Series
s = pd.Series([1, 3, 5, np.nan, 6, 8])
print(f"Series:\n{s}")

# 创建DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [25, 30, 35, 28, 32],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix'],
    'Score': [85, 90, 75, 80, 95]
}
df = pd.DataFrame(data)
print(f"DataFrame:\n{df}")

# 查看数据
print(f"前几行数据:\n{df.head()}")
print(f"后几行数据:\n{df.tail()}")
print(f"数据信息:")
df.info()
print(f"数据描述:\n{df.describe()}")

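# Selecting data
print(f"Select a column:\n{df['Name']}")
print(f"Select several columns:\n{df[['Name', 'Age']]}")
print(f"Select a row:\n{df.loc[0]}")
print(f"Select several rows:\n{df.loc[0:2]}")  # .loc slices by label and includes the end label, so this returns rows 0, 1 and 2
print(f"Select by condition:\n{df[df['Age'] > 30]}")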

# 数据操作
df['Grade'] = ['A', 'A', 'C', 'B', 'A']  # 添加列
print(f"添加列后的DataFrame:\n{df}")

df = df.drop('Grade', axis=1)  # 删除列
print(f"删除列后的DataFrame:\n{df}")

df.loc[0, 'Age'] = 26  # 修改值
print(f"修改值后的DataFrame:\n{df}")

# 数据排序
print(f"按年龄排序:\n{df.sort_values('Age')}")
print(f"按分数降序排序:\n{df.sort_values('Score', ascending=False)}")

# 数据分组
print(f"按城市分组并计算平均年龄:\n{df.groupby('City')['Age'].mean()}")

# 数据聚合
print(f"按城市分组并计算多个统计量:\n{df.groupby('City').agg({'Age': 'mean', 'Score': 'max'})}")

# 数据合并
df1 = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})
df2 = pd.DataFrame({'Name': ['Charlie', 'David'], 'Age': [35, 28]})

df_concat = pd.concat([df1, df2])
print(f"合并后的DataFrame:\n{df_concat}")

# 数据连接
df3 = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'], 'Score': [85, 90, 75]})
df4 = pd.DataFrame({'Name': ['Alice', 'Bob', 'David'], 'City': ['New York', 'Los Angeles', 'Houston']})

df_merge = pd.merge(df3, df4, on='Name', how='outer')
print(f"连接后的DataFrame:\n{df_merge}")

# 处理缺失值
df_with_na = pd.DataFrame({
    'A': [1, 2, np.nan],
    'B': [5, np.nan, np.nan],
    'C': [1, 2, 3]
})
print(f"包含缺失值的DataFrame:\n{df_with_na}")

print(f"删除包含缺失值的行:\n{df_with_na.dropna()}")
print(f"填充缺失值:\n{df_with_na.fillna(0)}")

# 时间序列
dates = pd.date_range('20230101', periods=6)
df_time = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))
print(f"时间序列DataFrame:\n{df_time}")

# 数据读取和写入
# df.to_csv('data.csv', index=False)  # 写入CSV文件
# df_csv = pd.read_csv('data.csv')  # 读取CSV文件

# df.to_excel('data.xlsx', index=False)  # 写入Excel文件
# df_excel = pd.read_excel('data.xlsx')  # 读取Excel文件
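
The `pd.merge(df3, df4, on='Name', how='outer')` call in the listing keeps every name that appears in either table and leaves missing values where one side has no match. A minimal pure-Python sketch of that outer-join idea, using plain dictionaries in place of DataFrames (the `outer_join` helper is hypothetical, and `None` stands in for NaN):

```python
scores = {"Alice": 85, "Bob": 90, "Charlie": 75}
cities = {"Alice": "New York", "Bob": "Los Angeles", "David": "Houston"}


def outer_join(left, right):
    """Combine two name-keyed dicts, keeping every name found in either one."""
    names = sorted(set(left) | set(right))
    # dict.get returns None for a missing key, playing the role of NaN here.
    return [(name, left.get(name), right.get(name)) for name in names]


rows = outer_join(scores, cities)
for row in rows:
    print(row)
```

Charlie has a score but no city, and David has a city but no score, so both rows survive with one empty field each. An inner join would keep only Alice and Bob.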

13.1.3 Matplotlib

Matplotlib is a plotting library for Python. It supports many chart types, including line plots, scatter plots, and bar charts.

# 首先需要安装Matplotlib: pip install matplotlib
import matplotlib.pyplot as plt
import numpy as np

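# Use a font with CJK glyphs so that Chinese labels render correctly
# (assumes WenQuanYi Zen Hei is installed; substitute a CJK font available on your system, e.g. SimHei)
plt.rcParams['font.sans-serif'] = ['WenQuanYi Zen Hei']
plt.rcParams['axes.unicode_minus'] = False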

# 线图
x = np.linspace(0, 10, 100)
y = np.sin(x)

plt.figure(figsize=(10, 6))
plt.plot(x, y, label='sin(x)')
plt.plot(x, np.cos(x), label='cos(x)')
plt.xlabel('x')
plt.ylabel('y')
plt.title('三角函数')
plt.legend()
plt.grid(True)
plt.savefig('line_plot.png')
plt.close()

# 散点图
np.random.seed(42)
N = 50
x = np.random.rand(N)
y = np.random.rand(N)
colors = np.random.rand(N)
area = (30 * np.random.rand(N))**2

plt.figure(figsize=(10, 6))
plt.scatter(x, y, s=area, c=colors, alpha=0.5)
plt.xlabel('x')
plt.ylabel('y')
plt.title('散点图')
plt.colorbar()
plt.savefig('scatter_plot.png')
plt.close()

# 柱状图
labels = ['A', 'B', 'C', 'D', 'E']
values = [12, 35, 30, 35, 27]

plt.figure(figsize=(10, 6))
plt.bar(labels, values)
plt.xlabel('类别')
plt.ylabel('值')
plt.title('柱状图')
plt.savefig('bar_plot.png')
plt.close()

# 直方图
data = np.random.randn(1000)

plt.figure(figsize=(10, 6))
plt.hist(data, bins=30, alpha=0.5)
plt.xlabel('值')
plt.ylabel('频数')
plt.title('直方图')
plt.savefig('histogram.png')
plt.close()

# 饼图
labels = ['A', 'B', 'C', 'D']
sizes = [15, 30, 45, 10]
explode = (0, 0.1, 0, 0)  # 突出显示第二块

plt.figure(figsize=(10, 6))
plt.pie(sizes, explode=explode, labels=labels, autopct='%1.1f%%', shadow=True, startangle=90)
plt.axis('equal')  # 使饼图呈圆形
plt.title('饼图')
plt.savefig('pie_chart.png')
plt.close()

# 箱线图
data = [np.random.normal(0, std, 100) for std in range(1, 4)]

plt.figure(figsize=(10, 6))
plt.boxplot(data, patch_artist=True)  # vertical boxes are the default
plt.xlabel('分布')
plt.ylabel('值')
plt.title('箱线图')
plt.savefig('box_plot.png')
plt.close()

# 子图
fig, axs = plt.subplots(2, 2, figsize=(12, 10))

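# Line plot (x was overwritten with random values by the scatter example, so build a fresh evenly spaced grid)
x_line = np.linspace(0, 10, 100)
axs[0, 0].plot(x_line, np.sin(x_line))
axs[0, 0].set_title('sin(x)')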

# 散点图
axs[0, 1].scatter(np.random.rand(50), np.random.rand(50))
axs[0, 1].set_title('散点图')

# 柱状图
axs[1, 0].bar(['A', 'B', 'C'], [10, 20, 15])
axs[1, 0].set_title('柱状图')

# 直方图
axs[1, 1].hist(np.random.randn(1000), bins=30)
axs[1, 1].set_title('直方图')

plt.tight_layout()
plt.savefig('subplots.png')
plt.close()
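
To make the `bins=30` argument of `plt.hist` more concrete, the sketch below counts values into equal-width bins in pure Python. It is a simplified illustration that assumes the data are not all identical, not Matplotlib's own code, and `histogram_counts` is a hypothetical helper name.

```python
def histogram_counts(data, bins):
    """Count values in `bins` equal-width intervals spanning min(data)..max(data)."""
    low, high = min(data), max(data)
    width = (high - low) / bins
    counts = [0] * bins
    for value in data:
        # Every bin is half-open [left, right) except the last one, which also
        # includes its right edge so that max(data) is counted.
        index = min(int((value - low) / width), bins - 1)
        counts[index] += 1
    return counts


print(histogram_counts([0, 1, 2, 3, 4], bins=2))  # [2, 3]
```

With two bins over the range 0 to 4, the values 0 and 1 fall in the first bin and 2, 3 and 4 in the second. More bins show finer detail, at the cost of noisier counts.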

13.1.4 Seaborn

Seaborn is a data visualization library built on Matplotlib. It offers a higher-level interface and more attractive default styles.

# 首先需要安装Seaborn: pip install seaborn
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# 设置样式
sns.set(style='whitegrid', font='WenQuanYi Zen Hei', rc={'axes.unicode_minus': False})

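# Load an example dataset (load_dataset fetches the file over the network on first use, so it needs an internet connection)
tips = sns.load_dataset('tips')
print(f"tips dataset:\n{tips.head()}")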

# 散点图
plt.figure(figsize=(10, 6))
sns.scatterplot(x='total_bill', y='tip', data=tips, hue='time', size='size')
plt.title('总账单与小费的关系')
plt.savefig('seaborn_scatter.png')
plt.close()

# 线图
plt.figure(figsize=(10, 6))
sns.lineplot(x='total_bill', y='tip', data=tips, hue='day')
plt.title('总账单与小费的关系')
plt.savefig('seaborn_line.png')
plt.close()

# 柱状图
plt.figure(figsize=(10, 6))
sns.barplot(x='day', y='total_bill', data=tips)
plt.title('不同日期的总账单')
plt.savefig('seaborn_bar.png')
plt.close()

# 箱线图
plt.figure(figsize=(10, 6))
sns.boxplot(x='day', y='total_bill', data=tips)
plt.title('不同日期的总账单分布')
plt.savefig('seaborn_box.png')
plt.close()

# 小提琴图
plt.figure(figsize=(10, 6))
sns.violinplot(x='day', y='total_bill', data=tips)
plt.title('不同日期的总账单分布')
plt.savefig('seaborn_violin.png')
plt.close()

# Heatmap
plt.figure(figsize=(10, 6))
# tips has categorical columns (sex, smoker, day, time); keep only the numeric ones before computing correlations
corr = tips.select_dtypes(include='number').corr()
sns.heatmap(corr, annot=True, cmap='coolwarm')
plt.title('Correlation heatmap')
plt.savefig('seaborn_heatmap.png')
plt.close()

# Distribution plot (distplot is deprecated in recent seaborn releases; histplot with kde=True is the replacement)
plt.figure(figsize=(10, 6))
sns.histplot(tips['total_bill'], kde=True)
plt.title('Distribution of total bill')
plt.savefig('seaborn_dist.png')
plt.close()

# Joint plot (jointplot is figure-level: it creates its own figure, so no plt.figure() call is needed)
g = sns.jointplot(x='total_bill', y='tip', data=tips, kind='scatter')
g.savefig('seaborn_joint.png')
plt.close()

# Pairwise relationship plot (also figure-level)
g = sns.pairplot(tips)
g.savefig('seaborn_pair.png')
plt.close()

# Categorical plot (also figure-level)
g = sns.catplot(x='day', y='total_bill', data=tips, kind='swarm')
g.savefig('seaborn_cat.png')
plt.close()
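
The heatmap in this section visualizes pairwise correlation coefficients, which `DataFrame.corr()` computes with the Pearson method by default. As a small sketch of what a single cell of that matrix means, the Pearson coefficient can be written out in pure Python; the `pearson` helper and the numbers are made up for illustration and are not taken from the tips dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up bills and tips for illustration.
bills = [10.0, 20.0, 30.0, 40.0]
tips_paid = [1.5, 3.0, 4.5, 6.0]  # exactly 15% of each bill
print(round(pearson(bills, tips_paid), 6))  # 1.0
```

A value near 1 means the two columns rise together, a value near -1 means one falls as the other rises, and a value near 0 means there is little linear relationship.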

13.2 Machine Learning Libraries

13.2.1 Scikit-learn

Scikit-learn is a machine learning library for Python that provides a wide range of algorithms and supporting tools.

# 首先需要安装Scikit-learn: pip install scikit-learn
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.cluster import KMeans
from sklearn.metrics import accuracy_score, mean_squared_error, confusion_matrix, classification_report
import seaborn as sns

# 设置中文字体
plt.rcParams['font.sans-serif'] = ['WenQuanYi Zen Hei']
plt.rcParams['axes.unicode_minus'] = False
sns.set(style='whitegrid', font='WenQuanYi Zen Hei', rc={'axes.unicode_minus': False})

# Load example datasets
# Diabetes dataset (regression); load_boston was removed in scikit-learn 1.2, so a bundled alternative is used here
diabetes = datasets.load_diabetes()
X_reg = diabetes.data
y_reg = diabetes.target

# Iris dataset (classification)
iris = datasets.load_iris()
X_iris = iris.data
y_iris = iris.target

# Split into training and test sets
X_reg_train, X_reg_test, y_reg_train, y_reg_test = train_test_split(
    X_reg, y_reg, test_size=0.3, random_state=42)

X_iris_train, X_iris_test, y_iris_train, y_iris_test = train_test_split(
    X_iris, y_iris, test_size=0.3, random_state=42)

# Standardize features: fit each scaler on the training split only, then apply it to the test split
reg_scaler = StandardScaler()
X_reg_train_scaled = reg_scaler.fit_transform(X_reg_train)
X_reg_test_scaled = reg_scaler.transform(X_reg_test)

iris_scaler = StandardScaler()
X_iris_train_scaled = iris_scaler.fit_transform(X_iris_train)
X_iris_test_scaled = iris_scaler.transform(X_iris_test)

# Linear regression
lin_reg = LinearRegression()
lin_reg.fit(X_reg_train_scaled, y_reg_train)
y_reg_pred = lin_reg.predict(X_reg_test_scaled)
mse = mean_squared_error(y_reg_test, y_reg_pred)
print(f"Linear regression mean squared error: {mse:.2f}")

# 逻辑回归
log_reg = LogisticRegression(max_iter=1000)
log_reg.fit(X_iris_train_scaled, y_iris_train)
y_iris_pred = log_reg.predict(X_iris_test_scaled)
accuracy = accuracy_score(y_iris_test, y_iris_pred)
print(f"逻辑回归准确率: {accuracy:.2f}")

# K近邻
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_iris_train_scaled, y_iris_train)
y_iris_pred = knn.predict(X_iris_test_scaled)
accuracy = accuracy_score(y_iris_test, y_iris_pred)
print(f"K近邻准确率: {accuracy:.2f}")

# 支持向量机
svm = SVC(kernel='linear')
svm.fit(X_iris_train_scaled, y_iris_train)
y_iris_pred = svm.predict(X_iris_test_scaled)
accuracy = accuracy_score(y_iris_test, y_iris_pred)
print(f"支持向量机准确率: {accuracy:.2f}")

# 决策树
tree = DecisionTreeClassifier(random_state=42)
tree.fit(X_iris_train, y_iris_train)  # 决策树不需要标准化
y_iris_pred = tree.predict(X_iris_test)
accuracy = accuracy_score(y_iris_test, y_iris_pred)
print(f"决策树准确率: {accuracy:.2f}")

# 随机森林
forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(X_iris_train, y_iris_train)  # 随机森林不需要标准化
y_iris_pred = forest.predict(X_iris_test)
accuracy = accuracy_score(y_iris_test, y_iris_pred)
print(f"随机森林准确率: {accuracy:.2f}")

# K均值聚类
kmeans = KMeans(n_clusters=3, random_state=42)
kmeans.fit(X_iris)
y_iris_clusters = kmeans.labels_

# 可视化聚类结果
plt.figure(figsize=(10, 6))
plt.scatter(X_iris[:, 0], X_iris[:, 1], c=y_iris_clusters, cmap='viridis')
plt.xlabel('花萼长度')
plt.ylabel('花萼宽度')
plt.title('K均值聚类结果')
plt.savefig('kmeans_clustering.png')
plt.close()

# Confusion matrix (y_iris_pred holds the random forest predictions from the previous step)
cm = confusion_matrix(y_iris_test, y_iris_pred)
plt.figure(figsize=(10, 6))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues')
plt.xlabel('Predicted label')
plt.ylabel('True label')
plt.title('Confusion matrix')
plt.savefig('confusion_matrix.png')
plt.close()

# 分类报告
print("分类报告:")
print(classification_report(y_iris_test, y_iris_pred, target_names=iris.target_names))
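
The listing calls `fit_transform` on the training split and only `transform` on the test split. The sketch below shows the same idea in pure Python for a single feature: the mean and standard deviation are learned from the training values alone and then reused for the test values. The helper names and the numbers are made up for illustration; the population standard deviation is used because that is what `StandardScaler` computes.

```python
import statistics


def fit_scaler(train_column):
    """Learn the mean and population standard deviation from training data only."""
    return statistics.fmean(train_column), statistics.pstdev(train_column)


def transform(column, mean, std):
    """Apply z = (x - mean) / std using previously learned statistics."""
    return [(x - mean) / std for x in column]


train = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up feature values
test = [3.0, 6.0]

mean, std = fit_scaler(train)
train_scaled = transform(train, mean, std)
test_scaled = transform(test, mean, std)  # reuses the training mean and std

print(mean, round(std, 4))                 # 3.0 1.4142
print([round(z, 4) for z in test_scaled])  # [0.0, 2.1213]
```

Fitting the statistics on the test data as well would leak information about the test set into preprocessing, which tends to make evaluation scores look better than they should.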

13.2.2 TensorFlow

TensorFlow is an open-source machine learning framework developed by Google. It is widely used for deep learning and neural networks.

# 首先需要安装TensorFlow: pip install tensorflow
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

# 设置中文字体
plt.rcParams['font.sans-serif'] = ['WenQuanYi Zen Hei']
plt.rcParams['axes.unicode_minus'] = False

# 检查TensorFlow版本
print(f"TensorFlow版本: {tf.__version__}")

# 创建张量
scalar = tf.constant(7)
vector = tf.constant([1, 2, 3])
matrix = tf.constant([[1, 2], [3, 4]])

print(f"标量: {scalar}")
print(f"向量: {vector}")
print(f"矩阵: {matrix}")

# 张量操作
a = tf.constant([[1, 2], [3, 4]])
b = tf.constant([[5, 6], [7, 8]])

print(f"张量加法: {tf.add(a, b)}")
print(f"张量乘法: {tf.matmul(a, b)}")

# 变量
v = tf.Variable([[1.0, 2.0], [3.0, 4.0]])
print(f"变量: {v}")

# 自动微分
x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = x**2
dy_dx = tape.gradient(y, x)
print(f"y = x^2 在 x=3 处的导数: {dy_dx}")

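# Linear regression example
# Generate data as columns of shape (5, 1), matching the model's single input feature and single output
X = np.array([[1], [2], [3], [4], [5]], dtype=np.float32)
y = np.array([[2], [4], [6], [8], [10]], dtype=np.float32)

# Build the model
model = tf.keras.Sequential([
    tf.keras.layers.Dense(units=1, input_shape=[1])
])

# Compile the model
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01), loss='mean_squared_error')

# Train the model
history = model.fit(X, y, epochs=100, verbose=0)

# Predict (pass a NumPy array of shape (1, 1) rather than a plain Python list)
print(f"Prediction for x=6: {model.predict(np.array([[6.0]], dtype=np.float32), verbose=0)}")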

# 绘制训练过程
plt.figure(figsize=(10, 6))
plt.plot(history.history['loss'])
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('训练过程中的损失')
plt.savefig('tf_linear_regression_loss.png')
plt.close()

# 神经网络分类示例
# 加载MNIST数据集
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# 数据预处理
x_train, x_test = x_train / 255.0, x_test / 255.0

# 创建模型
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])

# 编译模型
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# 训练模型
history = model.fit(x_train, y_train, epochs=5, validation_data=(x_test, y_test))

# 评估模型
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=2)
print(f"测试准确率: {test_acc:.4f}")

# 绘制训练过程
plt.figure(figsize=(12, 5))

plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'], label='训练准确率')
plt.plot(history.history['val_accuracy'], label='验证准确率')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.title('训练过程中的准确率')
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(history.history['loss'], label='训练损失')
plt.plot(history.history['val_loss'], label='验证损失')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('训练过程中的损失')
plt.legend()

plt.tight_layout()
plt.savefig('tf_mnist_training.png')
plt.close()

# 卷积神经网络示例
# 创建CNN模型
cnn_model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])

# 编译模型
cnn_model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])

# 数据预处理(添加通道维度)
x_train_cnn = x_train.reshape((-1, 28, 28, 1))
x_test_cnn = x_test.reshape((-1, 28, 28, 1))

# 训练模型
cnn_history = cnn_model.fit(x_train_cnn, y_train, epochs=5, validation_data=(x_test_cnn, y_test))

# 评估模型
cnn_test_loss, cnn_test_acc = cnn_model.evaluate(x_test_cnn, y_test, verbose=2)
print(f"CNN测试准确率: {cnn_test_acc:.4f}")

# 绘制训练过程
plt.figure(figsize=(12, 5))

plt.subplot(1, 2, 1)
plt.plot(cnn_history.history['accuracy'], label='训练准确率')
plt.plot(cnn_history.history['val_accuracy'], label='验证准确率')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.title('CNN训练过程中的准确率')
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(cnn_history.history['loss'], label='训练损失')
plt.plot(cnn_history.history['val_loss'], label='验证损失')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('CNN训练过程中的损失')
plt.legend()

plt.tight_layout()
plt.savefig('tf_cnn_mnist_training.png')
plt.close()
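
The linear regression example earlier in this section trains a single `Dense` unit with SGD and a mean squared error loss. With only five samples, each epoch amounts to one gradient step over the whole dataset. The pure-Python sketch below spells out that update rule under simplifying assumptions: the weight starts at zero (Keras initializes it randomly), and `train_linear` is a hypothetical helper, not a TensorFlow API.

```python
def train_linear(xs, ys, learning_rate=0.01, steps=100):
    """Fit y ≈ w * x + b by full-batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0  # assumption: start from zero (Keras draws the weight randomly)
    n = len(xs)
    history = []
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        history.append(sum(e * e for e in errors) / n)
        # Gradients of mean((w*x + b - y)^2) with respect to w and b.
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, history


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]

w, b, history = train_linear(xs, ys)
print(f"w={w:.3f}, b={b:.3f}, prediction for x=6: {w * 6 + b:.3f}")
```

After 100 steps the parameters are close to, but not exactly, w = 2 and b = 0, which is why the Keras model's prediction for x = 6 is typically near 12 rather than exactly 12. More steps bring the fit closer.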

13.2.3 PyTorch

PyTorch is another popular deep learning framework. Originally developed by Facebook (now Meta), it is known for its flexibility and ease of use.

# 首先需要安装PyTorch: pip install torch torchvision
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
import numpy as np
import matplotlib.pyplot as plt

# 设置中文字体
plt.rcParams['font.sans-serif'] = ['WenQuanYi Zen Hei']
plt.rcParams['axes.unicode_minus'] = False

# 检查PyTorch版本
print(f"PyTorch版本: {torch.__version__}")

# 创建张量
scalar = torch.tensor(7)
vector = torch.tensor([1, 2, 3])
matrix = torch.tensor([[1, 2], [3, 4]])

print(f"标量: {scalar}")
print(f"向量: {vector}")
print(f"矩阵: {matrix}")

# 张量操作
a = torch.tensor([[1, 2], [3, 4]], dtype=torch.float32)
b = torch.tensor([[5, 6], [7, 8]], dtype=torch.float32)

print(f"张量加法: {torch.add(a, b)}")
print(f"张量乘法: {torch.matmul(a, b)}")

# 自动微分
x = torch.tensor(3.0, requires_grad=True)
y = x**2
y.backward()
print(f"y = x^2 在 x=3 处的导数: {x.grad}")

# 线性回归示例
# 生成数据
X = torch.tensor([[1.0], [2.0], [3.0], [4.0], [5.0]])
y = torch.tensor([[2.0], [4.0], [6.0], [8.0], [10.0]])

# 创建模型
model = nn.Linear(1, 1)

# 定义损失函数和优化器
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

# 训练模型
num_epochs = 100
losses = []
for epoch in range(num_epochs):
    # 前向传播
    outputs = model(X)
    loss = criterion(outputs, y)
    
    # 反向传播和优化
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    
    losses.append(loss.item())
    
    if (epoch+1) % 10 == 0:
        print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')

# 预测
predicted = model(torch.tensor([[6.0]]))
print(f"预测 x=6 的结果: {predicted.item()}")

# 绘制训练过程
plt.figure(figsize=(10, 6))
plt.plot(losses)
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('训练过程中的损失')
plt.savefig('pytorch_linear_regression_loss.png')
plt.close()

# 神经网络分类示例
# 定义超参数
input_size = 784  # 28x28
hidden_size = 128
num_classes = 10
num_epochs = 5
batch_size = 100
learning_rate = 0.001

# 加载MNIST数据集
train_dataset = torchvision.datasets.MNIST(root='./data', 
                                           train=True,
                                           transform=transforms.ToTensor(),
                                           download=True)

test_dataset = torchvision.datasets.MNIST(root='./data', 
                                          train=False,
                                          transform=transforms.ToTensor())

# 数据加载器
train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                           batch_size=batch_size,
                                           shuffle=True)

test_loader = torch.utils.data.DataLoader(dataset=test_dataset,
                                          batch_size=batch_size,
                                          shuffle=False)

# 定义神经网络模型
class NeuralNet(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes):
        super(NeuralNet, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(hidden_size, num_classes)
    
    def forward(self, x):
        out = self.fc1(x)
        out = self.relu(out)
        out = self.fc2(out)
        return out

model = NeuralNet(input_size, hidden_size, num_classes)

# 定义损失函数和优化器
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=learning_rate)

# 训练模型
total_step = len(train_loader)
losses = []
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        # 将张量展平
        images = images.reshape(-1, 28*28)
        
        # 前向传播
        outputs = model(images)
        loss = criterion(outputs, labels)
        
        # 反向传播和优化
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        
        if (i+1) % 100 == 0:
            print(f'Epoch [{epoch+1}/{num_epochs}], Step [{i+1}/{total_step}], Loss: {loss.item():.4f}')
    
    # Record the last batch's loss as a rough per-epoch value (noisier than an average over the epoch)
    losses.append(loss.item())

# 测试模型
model.eval()  # 设置模型为评估模式
with torch.no_grad():
    correct = 0
    total = 0
    for images, labels in test_loader:
        images = images.reshape(-1, 28*28)
        outputs = model(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
    
    print(f'测试准确率: {100 * correct / total:.2f}%')

# 绘制训练过程
plt.figure(figsize=(10, 6))
plt.plot(losses)
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('训练过程中的损失')
plt.savefig('pytorch_mnist_loss.png')
plt.close()

# 卷积神经网络示例
# 定义CNN模型
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3, stride=1, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)
        self.fc1 = nn.Linear(64 * 7 * 7, 128)
        self.fc2 = nn.Linear(128, num_classes)
        self.relu = nn.ReLU()
    
    def forward(self, x):
        out = self.relu(self.conv1(x))
        out = self.pool(out)
        out = self.relu(self.conv2(out))
        out = self.pool(out)
        out = out.view(-1, 64 * 7 * 7)
        out = self.relu(self.fc1(out))
        out = self.fc2(out)
        return out

cnn_model = CNN()

# Define the loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(cnn_model.parameters(), lr=learning_rate)

# Train the model
total_step = len(train_loader)
cnn_losses = []
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        # Forward pass
        outputs = cnn_model(images)
        loss = criterion(outputs, labels)

        # Backward pass and optimization step
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if (i+1) % 100 == 0:
            print(f'Epoch [{epoch+1}/{num_epochs}], Step [{i+1}/{total_step}], Loss: {loss.item():.4f}')

    # Record the loss of the last batch in this epoch
    cnn_losses.append(loss.item())

# Evaluate the model
cnn_model.eval()  # switch the model to evaluation mode
with torch.no_grad():
    correct = 0
    total = 0
    for images, labels in test_loader:
        outputs = cnn_model(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
    
    print(f'CNN测试准确率: {100 * correct / total:.2f}%')

# Plot the training curve (one loss value per epoch)
plt.figure(figsize=(10, 6))
plt.plot(cnn_losses)
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('CNN training loss')
plt.savefig('pytorch_cnn_mnist_loss.png')  # relative path: written to the current working directory
plt.close()
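
The CNN above hard-codes 64 * 7 * 7 as the input size of fc1. The short sketch below shows where that number comes from, using the standard output-size formula for convolution and pooling layers without dilation. The helper name conv_output_size is made up for this illustration and is not part of PyTorch.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (no dilation)."""
    return (size + 2 * padding - kernel_size) // stride + 1


size = 28  # MNIST images are 28x28
size = conv_output_size(size, kernel_size=3, stride=1, padding=1)  # conv1 keeps 28
size = conv_output_size(size, kernel_size=2, stride=2)             # pool halves to 14
size = conv_output_size(size, kernel_size=3, stride=1, padding=1)  # conv2 keeps 14
size = conv_output_size(size, kernel_size=2, stride=2)             # pool halves to 7

flattened = 64 * size * size  # conv2 produces 64 channels
print(size, flattened)  # 7 3136
```

If the kernel size, padding, or input resolution changes, the same calculation gives the new value to pass to nn.Linear.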

13.3 Web Development Libraries

13.3.1 Flask

Flask is a lightweight web framework. It is easy to pick up and well suited to small projects and rapid prototyping.

# Install Flask first: pip install flask
from flask import Flask, render_template, request, redirect, url_for, flash, session, jsonify
import os
import sqlite3

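# Create the Flask application
app = Flask(__name__)
app.secret_key = 'your_secret_key'  # placeholder; signs the session cookie (the cookie is signed, not encrypted)

# Database configuration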
DATABASE = 'app.db'

def get_db_connection():
    conn = sqlite3.connect(DATABASE)
    conn.row_factory = sqlite3.Row
    return conn

def init_db():
    conn = get_db_connection()
    
    # Create the users table
    conn.execute('''
    CREATE TABLE IF NOT EXISTS users (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        username TEXT UNIQUE NOT NULL,
        password TEXT NOT NULL,
        email TEXT UNIQUE NOT NULL
    )
    ''')
    
    # Create the posts table
    conn.execute('''
    CREATE TABLE IF NOT EXISTS posts (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        title TEXT NOT NULL,
        content TEXT NOT NULL,
        author_id INTEGER NOT NULL,
        created TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
        FOREIGN KEY (author_id) REFERENCES users (id)
    )
    ''')
    
    conn.commit()
    conn.close()

# Initialize the database
init_db()

# Routes and view functions
@app.route('/')
def index():
    conn = get_db_connection()
    posts = conn.execute('SELECT p.*, u.username FROM posts p JOIN users u ON p.author_id = u.id ORDER BY p.created DESC').fetchall()
    conn.close()
    return render_template('index.html', posts=posts)

@app.route('/post/<int:post_id>')
def post(post_id):
    conn = get_db_connection()
    post = conn.execute('SELECT p.*, u.username FROM posts p JOIN users u ON p.author_id = u.id WHERE p.id = ?', (post_id,)).fetchone()
    conn.close()
    
    if post is None:
        flash('文章不存在', 'error')
        return redirect(url_for('index'))
    
    return render_template('post.html', post=post)

@app.route('/register', methods=['GET', 'POST'])
def register():
    if request.method == 'POST':
        username = request.form['username']
        password = request.form['password']
        email = request.form['email']
        
        if not username or not password or not email:
            flash('请填写所有字段', 'error')
            return render_template('register.html')
        
        conn = get_db_connection()
        try:
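            # NOTE: the password is stored in plain text here for brevity; hash it before storing in a real application
            conn.execute('INSERT INTO users (username, password, email) VALUES (?, ?, ?)',
                        (username, password, email))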
            conn.commit()
            flash('注册成功,请登录', 'success')
            return redirect(url_for('login'))
        except sqlite3.IntegrityError:
            flash('用户名或邮箱已存在', 'error')
        finally:
            conn.close()
    
    return render_template('register.html')

@app.route('/login', methods=['GET', 'POST'])
def login():
    if request.method == 'POST':
        username = request.form['username']
        password = request.form['password']
        
        if not username or not password:
            flash('请填写用户名和密码', 'error')
            return render_template('login.html')
        
        conn = get_db_connection()
        user = conn.execute('SELECT * FROM users WHERE username = ? AND password = ?', 
                          (username, password)).fetchone()
        conn.close()
        
        if user:
            session['user_id'] = user['id']
            session['username'] = user['username']
            flash('登录成功', 'success')
            return redirect(url_for('index'))
        else:
            flash('用户名或密码错误', 'error')
    
    return render_template('login.html')

@app.route('/logout')
def logout():
    session.clear()
    flash('已退出登录', 'info')
    return redirect(url_for('index'))

@app.route('/create', methods=['GET', 'POST'])
def create():
    if 'user_id' not in session:
        flash('请先登录', 'error')
        return redirect(url_for('login'))
    
    if request.method == 'POST':
        title = request.form['title']
        content = request.form['content']
        
        if not title or not content:
            flash('请填写标题和内容', 'error')
            return render_template('create.html')
        
        conn = get_db_connection()
        conn.execute('INSERT INTO posts (title, content, author_id) VALUES (?, ?, ?)', 
                   (title, content, session['user_id']))
        conn.commit()
        conn.close()
        
        flash('文章创建成功', 'success')
        return redirect(url_for('index'))
    
    return render_template('create.html')

@app.route('/edit/<int:post_id>', methods=['GET', 'POST'])
def edit(post_id):
    if 'user_id' not in session:
        flash('请先登录', 'error')
        return redirect(url_for('login'))
    
    conn = get_db_connection()
    post = conn.execute('SELECT * FROM posts WHERE id = ?', (post_id,)).fetchone()
    
    if post is None:
        flash('文章不存在', 'error')
        conn.close()
        return redirect(url_for('index'))
    
    if post['author_id'] != session['user_id']:
        flash('你没有权限编辑这篇文章', 'error')
        conn.close()
        return redirect(url_for('post', post_id=post_id))
    
    if request.method == 'POST':
        title = request.form['title']
        content = request.form['content']
        
        if not title or not content:
            flash('请填写标题和内容', 'error')
            conn.close()
            return render_template('edit.html', post=post)
        
        conn.execute('UPDATE posts SET title = ?, content = ? WHERE id = ?', 
                   (title, content, post_id))
        conn.commit()
        conn.close()
        
        flash('文章更新成功', 'success')
        return redirect(url_for('post', post_id=post_id))
    
    conn.close()
    return render_template('edit.html', post=post)

@app.route('/delete/<int:post_id>', methods=['POST'])
def delete(post_id):
    if 'user_id' not in session:
        flash('请先登录', 'error')
        return redirect(url_for('login'))
    
    conn = get_db_connection()
    post = conn.execute('SELECT * FROM posts WHERE id = ?', (post_id,)).fetchone()
    
    if post is None:
        flash('文章不存在', 'error')
        conn.close()
        return redirect(url_for('index'))
    
    if post['author_id'] != session['user_id']:
        flash('你没有权限删除这篇文章', 'error')
        conn.close()
        return redirect(url_for('post', post_id=post_id))
    
    conn.execute('DELETE FROM posts WHERE id = ?', (post_id,))
    conn.commit()
    conn.close()
    
    flash('文章删除成功', 'success')
    return redirect(url_for('index'))

# API routes
@app.route('/api/posts')
def api_posts():
    conn = get_db_connection()
    posts = conn.execute('SELECT p.*, u.username FROM posts p JOIN users u ON p.author_id = u.id ORDER BY p.created DESC').fetchall()
    conn.close()
    
    # Convert the rows into a list of dictionaries
    posts_list = []
    for post in posts:
        posts_list.append({
            'id': post['id'],
            'title': post['title'],
            'content': post['content'],
            'author': post['username'],
            'created': post['created']
        })
    
    return jsonify(posts_list)

@app.route('/api/post/<int:post_id>')
def api_post(post_id):
    conn = get_db_connection()
    post = conn.execute('SELECT p.*, u.username FROM posts p JOIN users u ON p.author_id = u.id WHERE p.id = ?', (post_id,)).fetchone()
    conn.close()
    
    if post is None:
        return jsonify({'error': 'Post not found'}), 404
    
    return jsonify({
        'id': post['id'],
        'title': post['title'],
        'content': post['content'],
        'author': post['username'],
        'created': post['created']
    })

# Error handlers
@app.errorhandler(404)
def page_not_found(e):
    return render_template('404.html'), 404

@app.errorhandler(500)
def internal_server_error(e):
    return render_template('500.html'), 500

# Start the application (debug mode is for local development only)
if __name__ == '__main__':
    app.run(debug=True)
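
The register and login views above store and compare the password as plain text, which keeps the example short but should not be copied into a real application. The sketch below shows one possible way to hash and verify passwords using only the standard library. The function names, the storage format, and the iteration count are assumptions made for this illustration. Werkzeug, which is installed together with Flask, offers generate_password_hash and check_password_hash in werkzeug.security for the same purpose.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # hypothetical work factor chosen for this sketch


def hash_password(password):
    """Return 'salt_hex$digest_hex', suitable for the users.password column."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac('sha256', password.encode('utf-8'), salt, ITERATIONS)
    return f'{salt.hex()}${digest.hex()}'


def verify_password(password, stored):
    """Recompute the digest with the stored salt and compare in constant time."""
    salt_hex, digest_hex = stored.split('$')
    digest = hashlib.pbkdf2_hmac(
        'sha256', password.encode('utf-8'), bytes.fromhex(salt_hex), ITERATIONS
    )
    return hmac.compare_digest(digest.hex(), digest_hex)


stored = hash_password('secret')
print(verify_password('secret', stored))  # True
print(verify_password('wrong', stored))   # False
```

With this approach the login view would select the user by username only and then call verify_password on the stored value, because a salted hash cannot be matched inside the SQL WHERE clause.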

13.3.2 Django

Django is a full-featured web framework that provides a complete web development stack, including an ORM, form handling, and an authentication system.

# Install Django first: pip install django
# Create a Django project: django-admin startproject myproject
# Create a Django app: python manage.py startapp myapp

# myapp/models.py
from django.db import models
from django.contrib.auth.models import User

class Post(models.Model):
    title = models.CharField(max_length=200)
    content = models.TextField()
    author = models.ForeignKey(User, on_delete=models.CASCADE)
    created_at = models.DateTimeField(auto_now_add=True)
    updated_at = models.DateTimeField(auto_now=True)
    
    def __str__(self):
        return self.title

class Comment(models.Model):
    post = models.ForeignKey(Post, on_delete=models.CASCADE, related_name='comments')
    author = models.ForeignKey(User, on_delete=models.CASCADE)
    content = models.TextField()
    created_at = models.DateTimeField(auto_now_add=True)
    
    def __str__(self):
        return f'Comment by {self.author.username} on {self.post.title}'

# myapp/views.py
from django.shortcuts import render, get_object_or_404, redirect
from django.contrib.auth import login, authenticate, logout
from django.contrib.auth.decorators import login_required
from django.contrib import messages
from django.core.paginator import Paginator
from django.db.models import Q
from .models import Post, Comment
from .forms import PostForm, CommentForm, UserRegistrationForm

def index(request):
    # Search: match the query against the title or the content
    query = request.GET.get('q')
    if query:
        posts = Post.objects.filter(
            Q(title__icontains=query) | Q(content__icontains=query)
        ).order_by('-created_at')
    else:
        posts = Post.objects.all().order_by('-created_at')
    
    # Pagination
    paginator = Paginator(posts, 5)  # show 5 posts per page
    page_number = request.GET.get('page')
    page_obj = paginator.get_page(page_number)
    
    return render(request, 'myapp/index.html', {'page_obj': page_obj, 'query': query})

def post_detail(request, post_id):
    post = get_object_or_404(Post, id=post_id)
    comments = post.comments.all().order_by('-created_at')
    
    if request.method == 'POST':
        form = CommentForm(request.POST)
        if form.is_valid() and request.user.is_authenticated:
            comment = form.save(commit=False)
            comment.post = post
            comment.author = request.user
            comment.save()
            messages.success(request, '评论已添加')
            return redirect('post_detail', post_id=post.id)
    else:
        form = CommentForm()
    
    return render(request, 'myapp/post_detail.html', {
        'post': post,
        'comments': comments,
        'form': form
    })

@login_required
def create_post(request):
    if request.method == 'POST':
        form = PostForm(request.POST)
        if form.is_valid():
            post = form.save(commit=False)
            post.author = request.user
            post.save()
            messages.success(request, '文章已创建')
            return redirect('post_detail', post_id=post.id)
    else:
        form = PostForm()
    
    return render(request, 'myapp/create_post.html', {'form': form})

@login_required
def edit_post(request, post_id):
    post = get_object_or_404(Post, id=post_id, author=request.user)
    
    if request.method == 'POST':
        form = PostForm(request.POST, instance=post)
        if form.is_valid():
            form.save()
            messages.success(request, '文章已更新')
            return redirect('post_detail', post_id=post.id)
    else:
        form = PostForm(instance=post)
    
    return render(request, 'myapp/edit_post.html', {'form': form, 'post': post})

@login_required
def delete_post(request, post_id):
    post = get_object_or_404(Post, id=post_id, author=request.user)
    
    if request.method == 'POST':
        post.delete()
        messages.success(request, '文章已删除')
        return redirect('index')
    
    return render(request, 'myapp/delete_post.html', {'post': post})

def register(request):
    if request.method == 'POST':
        form = UserRegistrationForm(request.POST)
        if form.is_valid():
            user = form.save()
            username = form.cleaned_data.get('username')
            messages.success(request, f'账户 {username} 已创建,现在可以登录了')
            return redirect('login')
    else:
        form = UserRegistrationForm()
    
    return render(request, 'myapp/register.html', {'form': form})

def user_login(request):
    if request.method == 'POST':
        username = request.POST['username']
        password = request.POST['password']
        user = authenticate(request, username=username, password=password)
        
        if user is not None:
            login(request, user)
            messages.success(request, '登录成功')
            return redirect('index')
        else:
            messages.error(request, '用户名或密码错误')
    
    return render(request, 'myapp/login.html')

@login_required
def user_logout(request):
    logout(request)
    messages.success(request, '已退出登录')
    return redirect('index')

@login_required
def profile(request):
    user_posts = Post.objects.filter(author=request.user).order_by('-created_at')
    return render(request, 'myapp/profile.html', {'user_posts': user_posts})

# myapp/forms.py
from django import forms
from django.contrib.auth.forms import UserCreationForm
from django.contrib.auth.models import User
from .models import Post, Comment

class PostForm(forms.ModelForm):
    class Meta:
        model = Post
        fields = ['title', 'content']
        widgets = {
            'title': forms.TextInput(attrs={'class': 'form-control'}),
            'content': forms.Textarea(attrs={'class': 'form-control', 'rows': 10}),
        }

class CommentForm(forms.ModelForm):
    class Meta:
        model = Comment
        fields = ['content']
        widgets = {
            'content': forms.Textarea(attrs={'class': 'form-control', 'rows': 3}),
        }

class UserRegistrationForm(UserCreationForm):
    email = forms.EmailField(required=True)
    
    class Meta:
        model = User
        fields = ['username', 'email', 'password1', 'password2']
    
    def __init__(self, *args, **kwargs):
        super(UserRegistrationForm, self).__init__(*args, **kwargs)
        self.fields['username'].widget.attrs.update({'class': 'form-control'})
        self.fields['email'].widget.attrs.update({'class': 'form-control'})
        self.fields['password1'].widget.attrs.update({'class': 'form-control'})
        self.fields['password2'].widget.attrs.update({'class': 'form-control'})

# myproject/urls.py
from django.contrib import admin
from django.urls import path, include
from django.conf import settings
from django.conf.urls.static import static

urlpatterns = [
    path('admin/', admin.site.urls),
    path('', include('myapp.urls')),
]

if settings.DEBUG:
    urlpatterns += static(settings.MEDIA_URL, document_root=settings.MEDIA_ROOT)

# myapp/urls.py
from django.urls import path
from . import views

urlpatterns = [
    path('', views.index, name='index'),
    path('post/<int:post_id>/', views.post_detail, name='post_detail'),
    path('create/', views.create_post, name='create_post'),
    path('edit/<int:post_id>/', views.edit_post, name='edit_post'),
    path('delete/<int:post_id>/', views.delete_post, name='delete_post'),
    path('register/', views.register, name='register'),
    path('login/', views.user_login, name='login'),
    path('logout/', views.user_logout, name='logout'),
    path('profile/', views.profile, name='profile'),
]
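
The index view above passes the raw page query parameter straight to paginator.get_page, which tolerates bad input: a missing or non-numeric page falls back to the first page, and an out-of-range page falls back to the last one. The following pure-Python sketch imitates that behavior on a plain list so the logic can be tried without a Django project. The function get_page_items is a simplified stand-in written for this illustration, not Django code.

```python
import math


def get_page_items(items, page_number, per_page=5):
    """Simplified stand-in for Paginator(items, per_page).get_page(page_number)."""
    num_pages = max(1, math.ceil(len(items) / per_page))
    try:
        number = int(page_number)
    except (TypeError, ValueError):
        number = 1          # missing or non-numeric page: first page
    if number < 1 or number > num_pages:
        number = num_pages  # out-of-range page: last page
    start = (number - 1) * per_page
    return number, items[start:start + per_page]


posts = list(range(1, 13))  # 12 fake post ids, so 3 pages of up to 5
print(get_page_items(posts, '2'))    # (2, [6, 7, 8, 9, 10])
print(get_page_items(posts, None))   # (1, [1, 2, 3, 4, 5])
print(get_page_items(posts, 'abc'))  # (1, [1, 2, 3, 4, 5])
print(get_page_items(posts, '99'))   # (3, [11, 12])
```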

13.4 Other Commonly Used Libraries

13.4.1 Requests

Requests is a concise, elegant HTTP library that makes sending HTTP requests very simple.

# Install Requests first: pip install requests
import requests
import json
from requests.auth import HTTPBasicAuth
from requests.exceptions import RequestException

# Send a GET request
response = requests.get('https://api.github.com')
print(f"状态码: {response.status_code}")
print(f"响应头: {response.headers}")
print(f"响应内容: {response.text}")

# Send a GET request with query parameters
params = {'q': 'python', 'sort': 'stars'}
response = requests.get('https://api.github.com/search/repositories', params=params)
print(f"URL: {response.url}")
data = response.json()
print(f"找到的仓库数量: {data['total_count']}")

# Send a POST request
url = 'https://httpbin.org/post'
data = {'name': 'Alice', 'age': 25}
response = requests.post(url, data=data)
print(f"状态码: {response.status_code}")
print(f"响应内容: {response.json()}")

# Send JSON data; the json= argument serializes the dict and sets the Content-Type header
url = 'https://httpbin.org/post'
data = {'name': 'Bob', 'age': 30}
response = requests.post(url, json=data)
print(f"状态码: {response.status_code}")
print(f"响应内容: {response.json()}")

# Send form data (a dict passed as data= is form-encoded)
url = 'https://httpbin.org/post'
data = {'username': 'alice', 'password': 'secret'}
response = requests.post(url, data=data)
print(f"状态码: {response.status_code}")
print(f"响应内容: {response.json()}")

# Upload a file; the with block closes the file handle after the request
url = 'https://httpbin.org/post'
with open('example.txt', 'rb') as f:
    response = requests.post(url, files={'file': f})
print(f"状态码: {response.status_code}")
print(f"响应内容: {response.json()}")

# Set request headers
url = 'https://api.github.com'
headers = {'User-Agent': 'My User Agent 1.0'}
response = requests.get(url, headers=headers)
print(f"状态码: {response.status_code}")

# Set a timeout
try:
    response = requests.get('https://api.github.com', timeout=5)  # 5-second timeout
    print(f"状态码: {response.status_code}")
except RequestException as e:
    print(f"请求出错: {e}")

# Use a Session
session = requests.Session()
session.headers.update({'User-Agent': 'My User Agent 1.0'})

# First request
response = session.get('https://api.github.com')
print(f"第一次请求状态码: {response.status_code}")

# Second request (reuses the headers set on the session above)
response = session.get('https://api.github.com/users/python')
print(f"第二次请求状态码: {response.status_code}")

# HTTP basic authentication
url = 'https://httpbin.org/basic-auth/user/passwd'
response = requests.get(url, auth=HTTPBasicAuth('user', 'passwd'))
print(f"基本认证状态码: {response.status_code}")

# Handle errors
try:
    response = requests.get('https://api.github.com/invalid-url')
    response.raise_for_status()  # raises HTTPError for 4xx and 5xx responses
except RequestException as e:
    print(f"请求出错: {e}")

# Use proxies (the addresses below are placeholders; substitute a reachable proxy)
proxies = {
    'http': 'http://10.10.1.10:3128',
    'https': 'http://10.10.1.10:1080',
}
response = requests.get('https://api.github.com', proxies=proxies)
print(f"使用代理的状态码: {response.status_code}")

# Download a file in chunks
url = 'https://example.com/image.jpg'
response = requests.get(url, stream=True)
if response.status_code == 200:
    with open('image.jpg', 'wb') as f:
        for chunk in response.iter_content(1024):
            f.write(chunk)
    print("文件下载完成")
else:
    print(f"下载失败,状态码: {response.status_code}")
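
The parameterized GET request above prints response.url to show the final address. As a rough illustration of what the params= argument does, the sketch below builds the same query string by hand with the standard library, so it runs without network access. It is only an approximation of the encoding step, not how Requests is implemented internally.

```python
from urllib.parse import urlencode

base_url = 'https://api.github.com/search/repositories'
params = {'q': 'python', 'sort': 'stars'}

# Encode the dict as a query string and append it to the base URL
full_url = f'{base_url}?{urlencode(params)}'
print(full_url)
# https://api.github.com/search/repositories?q=python&sort=stars
```

Passing params= is usually preferable to formatting the URL manually, because special characters in the values are escaped automatically.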

13.4.2 BeautifulSoup

BeautifulSoup is a library for parsing HTML and XML documents. It offers a simple, flexible way to extract data from web pages.

# Install BeautifulSoup first: pip install beautifulsoup4
import requests
from bs4 import BeautifulSoup
import csv
import json

# Send the HTTP request
url = 'https://example.com'
response = requests.get(url)

# Check whether the request succeeded
if response.status_code == 200:
    # Parse the HTML content
    soup = BeautifulSoup(response.text, 'html.parser')
    
    # Extract the page title (soup.title is None when the page has no <title> tag)
    title = soup.title.text if soup.title else ''
    print(f"标题: {title}")

    # Extract all links
    links = soup.find_all('a')
    print("所有链接:")
    for link in links:
        print(link.get('href'))

    # Extract specific elements
    # Suppose we want every div element with class 'item'
    items = soup.find_all('div', class_='item')
    print("所有项目:")
    for item in items:
        print(item.text.strip())

    # Use a CSS selector
    # Select the h2 headings inside every div with class 'item'
    headings = soup.select('div.item h2')
    print("所有标题:")
    for heading in headings:
        print(heading.text.strip())

    # Extract table data
    table = soup.find('table')
    if table:
        rows = table.find_all('tr')
        data = []
        for row in rows:
            cells = row.find_all(['td', 'th'])
            row_data = [cell.text.strip() for cell in cells]
            data.append(row_data)
        
        # Save the data as a CSV file
        with open('table_data.csv', 'w', newline='', encoding='utf-8') as file:
            writer = csv.writer(file)
            writer.writerows(data)
        
        # Save the data as a JSON file
        with open('table_data.json', 'w', encoding='utf-8') as file:
            json.dump(data, file, ensure_ascii=False, indent=4)
    
    # Handle pagination
    # Suppose the data is spread over several pages
    all_data = []
    for page in range(1, 6):  # scrape the first 5 pages
        page_url = f'{url}?page={page}'
        page_response = requests.get(page_url)
        
        if page_response.status_code == 200:
            page_soup = BeautifulSoup(page_response.text, 'html.parser')
            page_items = page_soup.find_all('div', class_='item')
            
            for item in page_items:
                item_data = {
                    'title': item.find('h2').text.strip(),
                    'description': item.find('p').text.strip(),
                    'link': item.find('a').get('href')
                }
                all_data.append(item_data)
    
    # Save all collected data as a JSON file
    with open('all_data.json', 'w', encoding='utf-8') as file:
        json.dump(all_data, file, ensure_ascii=False, indent=4)
    
    print(f"共爬取了 {len(all_data)} 条数据")

else:
    print(f"请求失败,状态码: {response.status_code}")

13.4.3 Pillow

Pillow is Python's image processing library and provides a wide range of image processing functions.

# Install Pillow first: pip install pillow
from PIL import Image, ImageDraw, ImageFont, ImageFilter, ImageEnhance
import os

# Open an image
image = Image.open('example.jpg')

# Print basic image information
print(f"图像格式: {image.format}")
print(f"图像大小: {image.size}")
print(f"图像模式: {image.mode}")

# Display the image
# image.show()

# Resize the image (the aspect ratio is not preserved)
resized_image = image.resize((800, 600))
resized_image.save('resized_image.jpg')

# Rotate the image
rotated_image = image.rotate(45)  # rotate 45 degrees counterclockwise
rotated_image.save('rotated_image.jpg')

# Flip the image
flipped_image = image.transpose(Image.FLIP_LEFT_RIGHT)  # horizontal flip
flipped_image.save('flipped_image.jpg')

# Crop the image
box = (100, 100, 400, 400)  # left, upper, right, lower
cropped_image = image.crop(box)
cropped_image.save('cropped_image.jpg')

# Apply a filter
blurred_image = image.filter(ImageFilter.BLUR)
blurred_image.save('blurred_image.jpg')

# Adjust brightness
enhancer = ImageEnhance.Brightness(image)
brightened_image = enhancer.enhance(1.5)  # factor 1.5: 50% brighter
brightened_image.save('brightened_image.jpg')

# Adjust contrast
enhancer = ImageEnhance.Contrast(image)
contrasted_image = enhancer.enhance(1.5)  # factor 1.5: 50% more contrast
contrasted_image.save('contrasted_image.jpg')

# Adjust color
enhancer = ImageEnhance.Color(image)
colored_image = enhancer.enhance(1.5)  # factor 1.5: 50% more color saturation
colored_image.save('colored_image.jpg')

# Convert to grayscale
gray_image = image.convert('L')
gray_image.save('gray_image.jpg')

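# Draw text on a copy so the original image stays unmodified
text_image = image.copy()
draw = ImageDraw.Draw(text_image)
try:
    # Try a system font first
    font = ImageFont.truetype("arial.ttf", 40)
except OSError:
    # Fall back to the built-in default font if the font file cannot be loaded
    font = ImageFont.load_default()

draw.text((10, 10), "Hello, World!", fill="white", font=font)
text_image.save('text_image.jpg')

# Create a thumbnail; thumbnail() resizes in place and keeps the aspect ratio, so work on a copy
thumb = image.copy()
thumb.thumbnail((200, 200))
thumb.save('thumbnail.jpg')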

# Create a new image
new_image = Image.new('RGB', (500, 300), color='blue')
new_image.save('new_image.jpg')

# Draw shapes on the new image
draw = ImageDraw.Draw(new_image)
draw.rectangle([(50, 50), (150, 150)], outline='white', width=3)
draw.ellipse([(200, 50), (350, 200)], outline='red', width=3)
draw.polygon([(400, 50), (450, 150), (350, 150)], outline='green', width=3)
new_image.save('shapes_image.jpg')

# Image compositing
background = Image.new('RGB', (800, 600), color='white')
foreground = Image.open('example.jpg')
foreground = foreground.resize((400, 300))

# Paste the foreground image onto the center of the background
position = ((background.width - foreground.width) // 2,
            (background.height - foreground.height) // 2)
background.paste(foreground, position)
background.save('composite_image.jpg')

# Batch-process images
def process_images(input_dir, output_dir, size=(800, 600)):
    # Create the output directory if it does not exist yet
    os.makedirs(output_dir, exist_ok=True)

    for filename in os.listdir(input_dir):
        # Match the extension case-insensitively so that e.g. .JPG is included
        if filename.lower().endswith(('.jpg', '.jpeg', '.png')):
            input_path = os.path.join(input_dir, filename)
            output_path = os.path.join(output_dir, filename)

            try:
                with Image.open(input_path) as img:
                    # Resize to the target size
                    img = img.resize(size)

                    # Convert RGBA to RGB (JPEG cannot store an alpha channel)
                    if img.mode == 'RGBA':
                        img = img.convert('RGB')

                    # Save the image
                    img.save(output_path, quality=85)
                    print(f"已处理: {filename}")
            except Exception as e:
                print(f"处理 {filename} 时出错: {e}")

# Call the batch-processing function
# process_images('input_images', 'output_images')
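
Note that process_images resizes every image to a fixed 800x600, which stretches any image whose aspect ratio differs, whereas thumbnail keeps the proportions. The sketch below computes a proportional target size that could be passed to resize instead. The helper fit_within is hypothetical, and its rounding may differ slightly from what Pillow does internally.

```python
def fit_within(width, height, max_width, max_height):
    """Largest size with the original aspect ratio that fits inside the box."""
    scale = min(max_width / width, max_height / height, 1.0)  # never upscale
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(1600, 900, 800, 600))  # (800, 450)
print(fit_within(600, 1200, 800, 600))  # (300, 600)
print(fit_within(400, 300, 800, 600))   # (400, 300)
```

Inside process_images this could be used as img.resize(fit_within(*img.size, *size)).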

<!-- templates/register.html -->
{% extends "base.html" %}

{% block title %}注册 - 个人博客{% endblock %}

{% block content %}
<div class="row justify-content-center">
    <div class="col-md-6">
        <div class="card">
            <div class="card-header">
                <h2>注册</h2>
            </div>
            <div class="card-body">
                <form method="POST">
                    <div class="mb-3">
                        <label for="username" class="form-label">用户名</label>
                        <input type="text" class="form-control" id="username" name="username" required>
                        <div class="form-text">用户名必须是唯一的,且不能包含特殊字符。</div>
                    </div>
                    
                    <div class="mb-3">
                        <label for="email" class="form-label">邮箱</label>
                        <input type="email" class="form-control" id="email" name="email" required>
                        <div class="form-text">我们将使用此邮箱与你联系。</div>
                    </div>
                    
                    <div class="mb-3">
                        <label for="password" class="form-label">密码</label>
                        <input type="password" class="form-control" id="password" name="password" required>
                        <div class="form-text">密码长度至少为6个字符。</div>
                    </div>
                    
                    <div class="mb-3">
                        <label for="confirm_password" class="form-label">确认密码</label>
                        <input type="password" class="form-control" id="confirm_password" name="confirm_password" required>
                    </div>
                    
                    <button type="submit" class="btn btn-primary">注册</button>
                    <a href="{{ url_for('login') }}" class="btn btn-link">已有账号?登录</a>
                </form>
            </div>
        </div>
    </div>
</div>
{% endblock %}