How to Use the re Module in Python

aixingjieluo

Published 2025-04-10 20:06:06


Article link: https://blog.youkuaiyun.com/aixingjieluo/article/details/147126519

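Below are the most commonly used regular-expression functions in Python's re module, each with an example. The code comments explain the logic and show the expected output.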

1. `re.match()`

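Matches the pattern only at the beginning of the string. Returns a Match object on success and None otherwise.

```python
import re

text = "Python is fun"
pattern = r"Python"  # Matches "Python" at the start of the string

result = re.match(pattern, text)
if result:
    print("Match found:", result.group())  # Output: Match found: Python
else:
    print("No match")

# Failing case: the string does not start with "fun"
result = re.match(r"fun", text)  # Returns None
```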
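Because `re.match()` anchors only at the start, it also accepts strings that merely begin with the pattern. When the whole string has to conform, `re.fullmatch()` is the usual alternative. A minimal sketch, assuming a hypothetical order-ID format of two capital letters, a hyphen, and four digits (the format and the `is_order_id` helper are illustrative and not part of the original article):

```python
import re

ORDER_ID = r"[A-Z]{2}-\d{4}"  # hypothetical format, e.g. "AB-1234"

def is_order_id(value: str) -> bool:
    """Return True only if the entire string matches the pattern."""
    return re.fullmatch(ORDER_ID, value) is not None

# re.match() is satisfied by a matching prefix
prefix_match = re.match(ORDER_ID, "AB-1234-extra")
print(prefix_match.group())          # AB-1234

# re.fullmatch() rejects the trailing text
print(is_order_id("AB-1234-extra"))  # False
print(is_order_id("AB-1234"))        # True
```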

2. `re.search()`

Scans the string for the first match. Unlike `re.match()`, the match does not have to start at the beginning of the string.

```python
import re

text = "Learn Python programming"
pattern = r"Python"

result = re.search(pattern, text)
if result:
    print("Found:", result.group())  # Output: Found: Python
    print("Position:", result.span())  # Output: Position: (6, 12)
```
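`re.search()` returns None when nothing matches, so calling `.group()` on the result without a check raises `AttributeError`. The sketch below wraps that check in a small helper; the name `first_number` and the sample strings are assumptions made for illustration:

```python
import re

def first_number(text: str):
    """Return (value, start_index) of the first run of digits, or None."""
    m = re.search(r"\d+", text)
    if m is None:
        return None
    return int(m.group()), m.start()

print(first_number("order 42 shipped"))  # (42, 6)
print(first_number("no digits here"))    # None
```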

3. `re.findall()`

Returns a list of all non-overlapping matches.

```python
import re

text = "a:1, b:2, c:3"
pattern = r"\d"  # Matches every single digit

result = re.findall(pattern, text)
print(result)  # Output: ['1', '2', '3']
```
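One detail worth knowing: when the pattern contains capturing groups, `re.findall()` returns the groups rather than the full matches, as tuples if there is more than one group. A short sketch reusing the same sample string (the variable names are illustrative):

```python
import re

text = "a:1, b:2, c:3"

# Two capturing groups: each list item is a (key, value) tuple
pairs = re.findall(r"(\w+):(\d)", text)
print(pairs)  # [('a', '1'), ('b', '2'), ('c', '3')]

# The tuples can be fed straight into dict()
mapping = dict(pairs)
print(mapping)  # {'a': '1', 'b': '2', 'c': '3'}
```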

4. `re.finditer()`

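Returns an iterator over all matches as Match objects. Because matches are produced one at a time, it suits large inputs better than building a full list.

```python
import re

text = "a:1, b:2, c:3"
pattern = r"(\w+):(\d)"  # Groups capture the letter and the digit

for match in re.finditer(pattern, text):
    print(f"Key: {match.group(1)}, Value: {match.group(2)}")
# Output:
# Key: a, Value: 1
# Key: b, Value: 2
# Key: c, Value: 3
```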
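Since `re.finditer()` yields Match objects, each result also carries its position, which `re.findall()` does not expose. A minimal sketch of a generator built on top of it; `number_spans` is a hypothetical helper name:

```python
import re

def number_spans(text: str):
    """Lazily yield (matched_text, (start, end)) for each run of digits."""
    for m in re.finditer(r"\d+", text):
        yield m.group(), m.span()

for value, span in number_spans("x=10, y=200"):
    print(value, span)
# 10 (2, 4)
# 200 (8, 11)
```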

5. `re.sub()`

Replaces every match in the string.

```python
import re

text = "2023-01-01"
pattern = r"-"
replacement = "/"

new_text = re.sub(pattern, replacement, text)
print(new_text)  # Output: 2023/01/01
```
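The replacement does not have to be a fixed string. It can refer back to captured groups with `\1`, `\2`, and so on, or it can be a function that receives each Match object. A brief sketch of both forms, using made-up sample data:

```python
import re

# Backreferences: reorder year-month-day into day/month/year
reordered = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", "2023-01-05")
print(reordered)  # 05/01/2023

# Callable replacement: double every number found in the text
def double(match: re.Match) -> str:
    return str(int(match.group()) * 2)

doubled = re.sub(r"\d+", double, "3 apples, 10 pears")
print(doubled)  # 6 apples, 20 pears
```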

6. `re.subn()`

Works like `sub()`, but returns a tuple of the new string and the number of replacements made.

```python
import re

text = "error: 404, error: 500"
pattern = r"error"
replacement = "status"

new_text, count = re.subn(pattern, replacement, text)
print(new_text)  # Output: status: 404, status: 500
print("Replacements:", count)  # Output: Replacements: 2
```
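Both `re.sub()` and `re.subn()` accept a `count` argument that caps the number of replacements, and the returned count makes it easy to tell whether anything changed. A small sketch using the same sample string (variable names are illustrative):

```python
import re

text = "error: 404, error: 500"

# Replace only the first occurrence
limited, n_limited = re.subn(r"error", "status", text, count=1)
print(limited, n_limited)  # status: 404, error: 500 1

# No match: the string comes back unchanged with a count of 0
unchanged, n_none = re.subn(r"warning", "status", text)
print(n_none == 0 and unchanged == text)  # True
```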

7. `re.split()`

Splits a string wherever the regular expression matches.

```python
import re

text = "Apple,Banana;Cherry"
pattern = r"[,;]"  # Split on a comma or a semicolon

result = re.split(pattern, text)
print(result)  # Output: ['Apple', 'Banana', 'Cherry']
```
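A few variations on splitting tend to come up in practice: limiting the number of splits with `maxsplit`, keeping the delimiters by wrapping the pattern in a capturing group, and absorbing surrounding whitespace. The sketch below shows each on sample strings chosen for illustration:

```python
import re

text = "Apple,Banana;Cherry"

# Stop after the first split
first_only = re.split(r"[,;]", text, maxsplit=1)
print(first_only)  # ['Apple', 'Banana;Cherry']

# A capturing group keeps the delimiters in the result
with_delims = re.split(r"([,;])", text)
print(with_delims)  # ['Apple', ',', 'Banana', ';', 'Cherry']

# Swallow optional whitespace around each delimiter
trimmed = re.split(r"\s*[,;]\s*", "Apple , Banana;  Cherry")
print(trimmed)  # ['Apple', 'Banana', 'Cherry']
```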

8. `re.compile()`

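Precompiles a regular expression into a reusable Pattern object, which avoids repeated compilation overhead when the same pattern is used many times.

```python
import re

pattern = re.compile(r"\b[A-Z]+\b")  # Matches words written entirely in capitals
text = "HELLO world PYTHON"

matches = pattern.findall(text)
print(matches)  # Output: ['HELLO', 'PYTHON']
```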
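A compiled Pattern object exposes the same operations as the module-level functions (`search`, `findall`, `sub`, and so on), so a common arrangement is to compile once as a module-level constant and reuse it in a loop. A minimal sketch; the constant name `ALL_CAPS` and the sample lines are assumptions:

```python
import re

# Compiled once, reused for every line
ALL_CAPS = re.compile(r"\b[A-Z]+\b")

lines = ["HELLO world", "no caps here", "PYTHON and RE"]

counts = [len(ALL_CAPS.findall(line)) for line in lines]
print(counts)  # [1, 0, 2]

# Pattern objects also offer sub(); here each all-caps word is lowercased
lowered = ALL_CAPS.sub(lambda m: m.group().lower(), "HELLO world PYTHON")
print(lowered)  # hello world python
```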

9. Groups and named groups

Groups extract specific parts of a match.

```python
import re

text = "Date: 2023-10-05"
pattern = r"Date: (\d{4})-(\d{2})-(\d{2})"

match = re.search(pattern, text)
if match:
    print("Full match:", match.group(0))  # Output: Full match: Date: 2023-10-05
    print("Year:", match.group(1))        # Output: Year: 2023
    print("Month:", match.group(2))       # Output: Month: 10
    print("Groups:", match.groups())      # Output: Groups: ('2023', '10', '05')

# Named groups (easier to read)
pattern = r"Date: (?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})"
match = re.search(pattern, text)
if match:
    print("Year:", match.groupdict()['year'])  # Output: Year: 2023
```
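Named groups pair naturally with `groupdict()` for turning a match into structured data, and they can be referenced in a replacement string as `\g<name>`. The sketch below assumes a hypothetical helper, `parse_date`, that converts the captured strings to integers:

```python
import re

DATE_RE = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")

def parse_date(text: str):
    """Return {'year': ..., 'month': ..., 'day': ...} as ints, or None."""
    m = DATE_RE.search(text)
    if m is None:
        return None
    return {name: int(value) for name, value in m.groupdict().items()}

print(parse_date("Date: 2023-10-05"))  # {'year': 2023, 'month': 10, 'day': 5}

# Named backreferences in a replacement string
swapped = DATE_RE.sub(r"\g<day>/\g<month>/\g<year>", "Date: 2023-10-05")
print(swapped)  # Date: 05/10/2023
```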

10. Flags

Flags control how matching behaves, for example ignoring case or treating the input as multiple lines.

```python
import re

# re.IGNORECASE: case-insensitive matching
text = "Hello World"
result = re.search(r"world", text, re.IGNORECASE)
print(result.group())  # Output: World

# re.MULTILINE: ^ and $ match at the start and end of every line
text = "Line1\nLine2\nLine3"
matches = re.findall(r"^Line\d", text, re.MULTILINE)
print(matches)  # Output: ['Line1', 'Line2', 'Line3']

# re.DOTALL: . matches any character, including a newline
text = "Start\nEnd"
result = re.search(r"Start.*End", text, re.DOTALL)
print(result.group())  # Output: "Start" and "End" on two separate lines
```
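Flags can be combined with the `|` operator, and `re.VERBOSE` allows whitespace and comments inside the pattern itself, which helps with longer expressions. A sketch under the assumption of a simple, made-up `key = value` text format:

```python
import re

KEY_VALUE = re.compile(
    r"""
    ^key\d      # the word "key" plus one digit at the start of a line
    \s*=\s*     # an equals sign with optional surrounding spaces
    (\w+)       # the captured value
    """,
    re.IGNORECASE | re.MULTILINE | re.VERBOSE,
)

text = "Key1 = alpha\nKEY2=beta\nother = gamma"
values = KEY_VALUE.findall(text)
print(values)  # ['alpha', 'beta']
```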

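11. Error handling

An invalid regular expression raises `re.error`, which can be caught.

```python
import re

try:
    re.compile(r"*invalid")  # Syntax error: nothing precedes the quantifier
except re.error as e:
    print("Regex error:", e)  # Output: Regex error: nothing to repeat at position 0
```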
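Catching `re.error` matters most when the pattern comes from user input. Two related techniques are sketched here: a hypothetical `safe_compile` helper that returns None instead of raising, and `re.escape()`, which turns arbitrary text into a pattern that matches it literally. The helper name and sample strings are assumptions for illustration:

```python
import re

def safe_compile(pattern: str):
    """Return a compiled pattern, or None if the pattern is invalid."""
    try:
        return re.compile(pattern)
    except re.error:
        return None

print(safe_compile(r"*invalid") is None)  # True
print(safe_compile(r"\d+") is not None)   # True

# re.escape() neutralises metacharacters such as "+" in literal text
literal = re.escape("1+1=2")
found = re.search(literal, "so 1+1=2 holds")
print(found.group())  # 1+1=2
```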

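Summary table

| Function / feature | Purpose |
| --- | --- |
| `re.match()` | Match the pattern at the start of the string |
| `re.search()` | Find the first match anywhere in the string |
| `re.findall()` | Return a list of all matches |
| `re.finditer()` | Return an iterator over the matches |
| `re.sub()` | Replace matches |
| `re.subn()` | Replace matches and return the replacement count |
| `re.split()` | Split a string by a regular expression |
| `re.compile()` | Precompile a regular expression |
| Groups / named groups | Extract sub-matches |
| Flags | Control case sensitivity, multiline behavior, and similar matching options |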