Python Regular Expressions: Hands-On Exercises Explained

Project: py_regular_expressions (Learn Python Regular Expressions step by step from beginner to advanced levels). Project address: https://gitcode.com/gh_mirrors/py/py_regular_expressions

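Regular expressions are a powerful text-processing tool, and fluency with them makes searching, filtering, and rewriting text much faster. This article works through a set of exercises based on the py_regular_expressions practice project, moving from basic searches to anchors, alternation, escaping, quantifiers, and match objects.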

Basic Exercises

1. Detecting a Hexadecimal Value

Check whether a string contains the hexadecimal value 0xB0:

import re
line1 = 'start address: 0xA0, func1 address: 0xC0'
line2 = 'end address: 0xFF, func2 address: 0xB0'

print(bool(re.search(r'0xB0', line1)))  # False
print(bool(re.search(r'0xB0', line2)))  # True

2. Replacing a Digit

Replace every occurrence of the digit 5 in the string with five:

ip = 'They ate 5 apples and 5 oranges'
print(re.sub(r'5', 'five', ip))
# Output: 'They ate five apples and five oranges'

3. Replacing Only the First Match

Replace only the first occurrence of the digit 5, using the count argument of re.sub:

ip = 'They ate 5 apples and 5 oranges'
print(re.sub(r'5', 'five', ip, count=1))
# Output: 'They ate five apples and 5 oranges'
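When limiting replacements with count, it can help to confirm how many substitutions actually happened. The sketch below uses re.subn, which returns the new string together with the number of replacements made. The variable names are illustrative only.

```python
import re

ip = 'They ate 5 apples and 5 oranges'

# re.subn returns a tuple: (new_string, number_of_substitutions_made)
all_replaced, n_all = re.subn(r'5', 'five', ip)
first_only, n_first = re.subn(r'5', 'five', ip, count=1)

print(all_replaced, n_all)
print(first_only, n_first)
```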

4. Filtering Out Elements That Contain a Character

Keep only the list elements that do not contain the letter e:

items = ['goal', 'new', 'user', 'sit', 'eat', 'dinner']
print([w for w in items if not re.search(r'e', w)])
# Output: ['goal', 'sit']

Anchor Exercises

1. Checking the Start of a String

Check whether a string starts with be:

line1 = 'be nice'
line2 = '"best!"'
line3 = 'better?'
line4 = 'oh no\nbear spotted'

pat = re.compile(r'^be')

print(bool(pat.search(line1)))  # True
print(bool(pat.search(line2)))  # False
print(bool(pat.search(line3)))  # True
print(bool(pat.search(line4)))  # False
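The last result is False because, by default, ^ matches only at the start of the whole string. As a small sketch of how the anchors differ: with the re.MULTILINE flag, ^ also matches right after each newline, while \A always means the start of the string regardless of flags.

```python
import re

line4 = 'oh no\nbear spotted'

start_of_string = re.compile(r'^be')
start_of_line = re.compile(r'^be', flags=re.MULTILINE)
strict_start = re.compile(r'\Abe', flags=re.MULTILINE)

print(bool(start_of_string.search(line4)))  # False: 'be' is not at the string start
print(bool(start_of_line.search(line4)))    # True: 'bear' starts the second line
print(bool(strict_start.search(line4)))     # False: \A ignores re.MULTILINE
```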

2. Whole-Word Replacement

Replace red with brown only where red appears as a whole word:

words = 'bred red spread credible red.'
print(re.sub(r'\bred\b', 'brown', words))
# Output: 'bred brown spread credible brown.'

3. Checking the Characters Around a Number

Keep only the list elements in which 42 has a word character on both sides:

words = ['hi42bye', 'nice1423', 'bad42', 'cool_42a', '42fake', '_42_']
print([w for w in words if re.search(r'\B42\B', w)])
# Output: ['hi42bye', 'nice1423', 'cool_42a', '_42_']
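To see why bad42 and 42fake are rejected, it can help to make the boundary positions visible. The sketch below, using a made-up sample string, marks every \b position with a colon and then repeats the \B42\B test word by word.

```python
import re

sample = 'hi42bye bad42 _42_'

# Insert ':' at every word boundary to make the \b positions visible.
marked = re.sub(r'\b', ':', sample)
print(marked)  # :hi42bye: :bad42: :_42_:

# \B42\B needs a word character (letter, digit, or underscore) on both sides of 42.
checks = [bool(re.search(r'\B42\B', word)) for word in sample.split()]
print(checks)  # [True, False, True]
```

In bad42 the digits sit right before a word boundary at the end of the string, so the trailing \B cannot match there.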

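Alternation and Grouping Exercises

1. Filtering on Multiple Conditions

Keep the elements that start with den or end with ly:

items = ['lovely', '1\ndentist', '2 lonely', 'eden', 'fly\n', 'dent']
# \A and \Z anchor to the very start and end of the string. With ^ and $,
# 'fly\n' would also pass, because $ matches just before a trailing newline.
print([e for e in items if re.search(r'\Aden|ly\Z', e)])
# Output: ['lovely', '2 lonely', 'dent']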
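One subtlety matters when the input may end with a newline, as 'fly\n' does in the list above. In Python, $ matches at the end of the string and also just before a newline that ends the string, whereas \Z matches only at the very end. A minimal check:

```python
import re

with_newline = 'fly\n'

dollar_hit = bool(re.search(r'ly$', with_newline))
z_hit = bool(re.search(r'ly\Z', with_newline))
z_hit_plain = bool(re.search(r'ly\Z', 'fly'))

print(dollar_hit)   # True: $ tolerates one trailing newline
print(z_hit)        # False: \Z requires the true end of the string
print(z_hit_plain)  # True
```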

2. Replacing Several Patterns at Once

Replace every occurrence of removed, reed, received, or refused with X:

s1 = 'creed refuse removed read'
s2 = 'refused reed redo received'

pat = re.compile(r're(mov|ceiv|fus|)ed|reed')

print(pat.sub('X', s1))  # 'cX refuse X read'
print(pat.sub('X', s2))  # 'X X redo X'
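The group (mov|ceiv|fus|) ends with an empty alternative, so re(mov|ceiv|fus|)ed already matches reed by itself, and a separate |reed branch is redundant. As a sketch, the factored pattern can be compared against a version that spells out every word. The names factored and spelled_out are illustrative.

```python
import re

s1 = 'creed refuse removed read'
s2 = 'refused reed redo received'

# Common prefix 're' and suffix 'ed' factored out; the empty alternative covers 'reed'.
factored = re.compile(r're(mov|ceiv|fus|)ed')
# The same four words written out in full.
spelled_out = re.compile(r'removed|received|refused|reed')

results = [(factored.sub('X', s), spelled_out.sub('X', s)) for s in (s1, s2)]
for a, b in results:
    print(a, '|', b)
```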

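Escaping Metacharacters Exercises

1. Handling Special Characters

Replace the literal text (9-2)*5 with 35 without touching anything else. The parentheses and the asterisk are metacharacters, so they must be escaped with a backslash: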

str1 = '(9-2)*5+qty/3-(9-2)*7'
str2 = '(qty+4)/2-(9-2)*5+pq/4'

pat = re.compile(r'\(9-2\)\*5')

print(pat.sub('35', str1))  # '35+qty/3-(9-2)*7'
print(pat.sub('35', str2))  # '(qty+4)/2-35+pq/4'
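Escaping by hand is easy to get wrong for longer literals. A possible alternative, sketched below, is to let re.escape build the pattern from the literal text. For a fixed literal like this one, str.replace gives the same result without any regular expression.

```python
import re

str1 = '(9-2)*5+qty/3-(9-2)*7'
literal = '(9-2)*5'

# re.escape backslash-escapes the characters that are special in a pattern.
pat = re.compile(re.escape(literal))
via_regex = pat.sub('35', str1)
via_str = str1.replace(literal, '35')

print(via_regex)
print(via_str)
```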

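2. Replacing Only at the Boundaries

Replace (4)\| with 2 only when it appears at the very start or the very end of the string:

s1 = r'2.3/(4)\|6 foo 5.3-(4)\|'
s2 = r'(4)\|42 - (4)\|3'
s3 = 'two - (4)\\|\n'

# Each alternative carries its own anchor: \A for the start, \Z for the end.
# Wrapping both alternatives in ^( ... ) would restrict every match to the start.
pat = re.compile(r'\A\(4\)\\\||\(4\)\\\|\Z')

# Expected results are shown in repr form, so \\ stands for one backslash.
print(pat.sub('2', s1))  # '2.3/(4)\\|6 foo 5.3-2'
print(pat.sub('2', s2))  # '242 - (4)\\|3'
print(pat.sub('2', s3))  # 'two - (4)\\|\n'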

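Quantifiers and Greedy Matching

1. Greedy vs. Non-Greedy Matching

The goal is to delete every <...> group that has content while leaving the empty <> pairs alone. Switching to a non-greedy quantifier is not enough:

ip = 'a<apple> 1<> b<bye> 2<> c<cat>'
# Incorrect attempt: .+? needs at least one character, so after the '<' of '<>'
# it consumes the '>' and keeps going until the next '>'
print(re.sub(r'<.+?>', '', ip))  # Output: 'a 1 2'

# Correct approach: a negated character class can never run past a '>'
print(re.sub(r'<[^>]+>', '', ip))  # Output: 'a 1<> b 2<> c'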
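Listing the matches with re.findall makes the difference between the quantifiers easier to see. The sketch below compares a greedy pattern, the non-greedy pattern, and the negated character class on the same input.

```python
import re

ip = 'a<apple> 1<> b<bye> 2<> c<cat>'

greedy = re.findall(r'<.+>', ip)       # runs from the first '<' to the last '>'
lazy = re.findall(r'<.+?>', ip)        # stops at the first '>' after at least one character
negated = re.findall(r'<[^>]+>', ip)   # cannot cross a '>' at all

print(greedy)   # ['<apple> 1<> b<bye> 2<> c<cat>']
print(lazy)     # ['<apple>', '<> b<bye>', '<> c<cat>']
print(negated)  # ['<apple>', '<bye>', '<cat>']
```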

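2. Equivalent Quantifier Forms

The basic quantifiers are shorthand for the {m,n} form:

  • ? is equivalent to {0,1}
  • * is equivalent to {0,}
  • + is equivalent to {1,}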
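These equivalences can be checked directly. The sketch below runs each shorthand and its brace form over a made-up sample string and collects the matches. The two lists are expected to be identical for every pair.

```python
import re

sample = 'ac abc abbc abbbc'

pairs = [
    (r'ab?c', r'ab{0,1}c'),
    (r'ab*c', r'ab{0,}c'),
    (r'ab+c', r'ab{1,}c'),
]

# For each pair, store the matches of the shorthand and of the brace form.
found = {short: (re.findall(short, sample), re.findall(braces, sample))
         for short, braces in pairs}

for short, (a, b) in found.items():
    print(short, a, a == b)
```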

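Working with Matched Portions

1. Extracting a Range

Extract the text from the first is to the last t. Because .* is greedy, it extends the match as far as possible: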

str1 = 'This the biggest fruit you have seen?'
str2 = 'Your mission is to read and practice consistently'

pat = re.compile(r'is.*t')

print(pat.search(str1).group())  # 'is the biggest fruit'
print(pat.search(str2).group())  # 'ission is to read and practice consistent'
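For contrast, here is the same search with the non-greedy .*?, which stops at the first t after is instead of the last one. This is only a comparison sketch, not part of the exercise.

```python
import re

str1 = 'This the biggest fruit you have seen?'
str2 = 'Your mission is to read and practice consistently'

lazy = re.compile(r'is.*?t')

first = lazy.search(str1).group()
second = lazy.search(str2).group()

print(first)   # 'is t'
print(second)  # 'ission is t'
```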

2. Position of the First Match Among Several Patterns

Find the starting index of the first occurrence of is, the, was, or to:

s1 = 'match after the last newline character'
s2 = 'and then you want to test'
s3 = 'this is good bye then'
s4 = 'who was there to see?'

pat = re.compile(r'is|the|was|to')

print(pat.search(s1).start())  # 12
print(pat.search(s2).start())  # 4
print(pat.search(s3).start())  # 2
print(pat.search(s4).start())  # 4
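Calling .start() directly on the result of search raises AttributeError when nothing matches, because search returns None in that case. The sketch below wraps the lookup in a hypothetical helper, first_hit, that returns the matched text together with its start and end positions, or None when there is no match.

```python
import re

pat = re.compile(r'is|the|was|to')

def first_hit(text):
    """Return (matched_text, start, end) for the first match, or None."""
    m = pat.search(text)
    if m is None:
        return None
    return m.group(), m.start(), m.end()

print(first_hit('match after the last newline character'))  # ('the', 12, 15)
print(first_hit('who was there to see?'))                   # ('was', 4, 7)
print(first_hit('no luck'))                                 # None
```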

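Together, these exercises cover the core features of Python regular expressions: searching, substitution, anchors, alternation, escaping, quantifiers, and match objects. Run each example yourself, then change the patterns and inputs and observe how the results differ. That is the quickest way to build an accurate picture of how the regex engine works.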

Disclosure: parts of this article were generated with AI assistance (AIGC) and are provided for reference only.
