A Brief Look at Using Regular Expressions in Python

To learn Python, see this site:

https://pythonprogramming.net/


It covers many topics, each with a video, and I think the explanations are good.



In Python, regular expressions are provided by the re module.

Common regular expression rules:


Identifiers:

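  • \d = any digit
  • \D = anything but a digit
  • \s = any whitespace character (space, tab, newline, ...)
  • \S = anything but whitespace
  • \w = any word character (letter, digit, or underscore)
  • \W = anything but a word character
  • . = any character, except for a new line
  • \b = word boundary (the zero-width position at the edge of a whole word)
  • \. = a literal period. The backslash is required, because . normally means any character.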
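The identifiers above can be tried out with a short sketch like the following. The sample sentence and the variable names are made up for illustration; only re.findall comes from the re module.

```python
import re

# Hypothetical sample text, chosen to contain digits, an underscore and periods.
sample = "Order 66 shipped to Mr. Fox on day_3."

digits = re.findall(r"\d", sample)          # every single digit
words = re.findall(r"\w+", sample)          # runs of letters, digits, underscores
whole_fox = re.findall(r"\bFox\b", sample)  # "Fox" only as a whole word
periods = re.findall(r"\.", sample)         # literal periods

print(digits)     # ['6', '6', '3']
print(words)      # ['Order', '66', 'shipped', 'to', 'Mr', 'Fox', 'on', 'day_3']
print(whole_fox)  # ['Fox']
print(periods)    # ['.', '.']
```

Note that \w+ keeps day_3 together, since the underscore and the digit both count as word characters.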

Modifiers:

  • {1,3} = between 1 and 3 repetitions of the preceding item; e.g. \d{1,3} matches 1 to 3 digits
  • + = match 1 or more repetitions
  • ? = match 0 or 1 repetitions
  • * = match 0 or more repetitions
  • $ = matches at the end of the string
  • ^ = matches at the start of the string
  • | = matches either/or. Example: x|y will match either x or y
  • [] = character set: matches any one of the characters listed inside
  • {x} = exactly x repetitions of the preceding item
  • {x,y} = between x and y repetitions of the preceding item
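A small sketch of how these modifiers behave in practice. The test string and variable names are assumptions picked for illustration.

```python
import re

# Hypothetical test string with spelling variants and numbers of different lengths.
text = "color colour colouur 7 42 1234"

zero_or_one = re.findall(r"colou?r", text)     # u appears 0 or 1 times
zero_or_more = re.findall(r"colou*r", text)    # u appears 0 or more times
one_or_more = re.findall(r"colou+r", text)     # u appears 1 or more times
exactly_two = re.findall(r"\b\d{2}\b", text)   # whole numbers with exactly 2 digits
one_to_three = re.findall(r"\b\d{1,3}\b", text)  # whole numbers with 1 to 3 digits
anchored = re.findall(r"^color|1234$", text)   # "color" at the start OR "1234" at the end

print(zero_or_one)   # ['color', 'colour']
print(zero_or_more)  # ['color', 'colour', 'colouur']
print(one_or_more)   # ['colour', 'colouur']
print(exactly_two)   # ['42']
print(one_to_three)  # ['7', '42']
print(anchored)      # ['color', '1234']
```

The \b word boundaries keep \d{1,3} from matching part of 1234; without them the pattern would return '123' and '4' as well.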

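Whitespace Characters:

  • \n = new line
  • \s = any whitespace character
  • \t = tab
  • \e = escape (listed in many regex references, but not supported by Python's re module; use \x1b instead)
  • \f = form feed
  • \r = carriage return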
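As a rough illustration of the whitespace escapes, the sketch below splits a string on mixed whitespace and matches individual characters. The input strings are invented for the example, and \x1b is used for the escape character since that spelling is known to work in Python's re module.

```python
import re

# Hypothetical input mixing tabs, a space, a carriage return and newlines.
raw = "name\tage\r\nAnn\t 30\n"

fields = re.split(r"\s+", raw.strip())   # \s+ treats any run of whitespace as one separator
tabs = re.findall(r"\t", raw)            # only the tab characters
escapes = re.findall(r"\x1b", "a\x1bb")  # the escape character, written as \x1b

print(fields)        # ['name', 'age', 'Ann', '30']
print(len(tabs))     # 2
print(len(escapes))  # 1
```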

Characters to REMEMBER TO ESCAPE if you want to match them literally!

  • . + * ? [ ] $ ^ ( ) { } | \
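Escaping can be done by hand with backslashes, or with re.escape when the literal text comes from somewhere else. The sketch below shows both; the price string and variable names are assumptions for illustration.

```python
import re

# Hypothetical text containing several special characters: $ . ( )
price_text = "Total: $4.50 (was $5.00)"

# Escape $ and . by hand so they match literally.
prices = re.findall(r"\$\d+\.\d{2}", price_text)

# Let re.escape add the backslashes for an arbitrary literal string.
needle = "(was $5.00)"
found = re.search(re.escape(needle), price_text)

print(prices)         # ['$4.50', '$5.00']
print(found.group())  # (was $5.00)
```

Without the escaping, the parentheses would be read as a group and $ as an end-of-string anchor, so the literal text would not be found.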

Brackets:

  • [] = character set. quant[ia]tative will match either quantitative or quantatative.
  • [a-z] = matches any one lowercase letter a-z
  • [1-5a-qA-Z] = matches any one character that is a digit 1-5, a lowercase letter a-q, or an uppercase letter A-Z
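The bracket rules above can be checked with a quick sketch. The test strings are made up for the purpose.

```python
import re

# Third spelling is deliberately wrong so it is NOT matched.
text = "quantitative quantatative quantetative"
spellings = re.findall(r"quant[ia]tative", text)

lowercase_runs = re.findall(r"[a-z]+", "abc DEF ghi 123")  # runs of lowercase letters only
mixed = re.findall(r"[1-5a-qA-Z]", "7 r 3 b Z")            # one character from the combined ranges

print(spellings)       # ['quantitative', 'quantatative']
print(lowercase_runs)  # ['abc', 'ghi']
print(mixed)           # ['3', 'b', 'Z']
```

Here 7 falls outside 1-5 and r falls outside a-q, so neither is returned.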

Example:

import re

exampleString = '''
Jessica is 15 years old, and Daniel is 27 years old.
Edward is 97 years old, and his grandfather, Oscar, is 102. 
'''

ages = re.findall(r'\d{1,3}', exampleString)
names = re.findall(r'[A-Z][a-z]*', exampleString)

print(ages)
# prints: ['15', '27', '97', '102']
print(names)
# prints: ['Jessica', 'Daniel', 'Edward', 'Oscar']

# Pair each name with the age found at the same position.
ageDict = {}
for eachName, eachAge in zip(names, ages):
    ageDict[eachName] = eachAge

print(ageDict)
# prints: {'Jessica': '15', 'Daniel': '27', 'Edward': '97', 'Oscar': '102'}


The example above uses only one function, re.findall(); the re module provides many others.

re.findall() returns a list.
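To make the return types concrete, here is a minimal sketch. It assumes a slightly different pattern from the example above (one with two capture groups, which is my own variation), in which case re.findall returns a list of tuples instead of a list of strings. re.search is shown only for contrast, as one of the other functions in the module.

```python
import re

example = (
    "Jessica is 15 years old, and Daniel is 27 years old.\n"
    "Edward is 97 years old, and his grandfather, Oscar, is 102."
)

# No groups: a list of matched strings.
ages = re.findall(r"\d{1,3}", example)

# Two groups: a list of (name, age) tuples. The optional comma handles "Oscar, is 102".
pairs = re.findall(r"([A-Z][a-z]+),? is (\d{1,3})", example)

# re.search returns a single match object (or None), not a list.
first = re.search(r"\d{1,3}", example)

print(ages)           # ['15', '27', '97', '102']
print(pairs)          # [('Jessica', '15'), ('Daniel', '27'), ('Edward', '97'), ('Oscar', '102')]
print(first.group())  # 15
```

Capturing name and age in one pattern avoids relying on the two separate lists lining up by position.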


Example 2:

This one uses re.sub().

re.sub() performs substitution based on a regular expression, which is more powerful than the plain string replace() method.


Suppose the input string is:



inputStr = "hello 123 world 456"


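and you want to replace both 123 and 456 with 222.

Because the two numbers differ, a single str.replace() call cannot do this; re.sub with a regular expression can: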



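# Raw string, so that \d reaches the regex engine unchanged.
replacedStr = re.sub(r"\d+", "222", inputStr)
# replacedStr is now "hello 222 world 222"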


Real-world cases are often more complicated than this example, and those are the situations where re.sub becomes necessary.

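So, in short, re.sub does the following:

it takes an input string, uses a regular expression to perform a (relatively complex) replacement on it, and returns the resulting string.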
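As a sketch of the more involved replacements re.sub can handle, the example below reuses the input string from above and shows a fixed replacement, a backreference, and a replacement function. The variable names and the doubling rule are illustrative assumptions.

```python
import re

input_str = "hello 123 world 456"

# Fixed replacement text for every match.
replaced = re.sub(r"\d+", "222", input_str)

# \1 in the replacement refers to the text captured by the first group.
bracketed = re.sub(r"(\d+)", r"[\1]", input_str)

# A function receives each match object and returns the replacement string.
doubled = re.sub(r"\d+", lambda m: str(int(m.group()) * 2), input_str)

print(replaced)   # hello 222 world 222
print(bracketed)  # hello [123] world [456]
print(doubled)    # hello 246 world 912
```

The last two forms are what plain str.replace() cannot express, since the replacement depends on what was matched.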

re.sub also accepts additional parameters, such as count, which limits the number of replacements.
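A minimal sketch of the count parameter, using the same input string. re.subn is included as a related function that also reports how many replacements were made; the variable names are assumptions.

```python
import re

input_str = "hello 123 world 456"

# count=1 replaces only the first match; the default (0) replaces all of them.
first_only = re.sub(r"\d+", "222", input_str, count=1)

# re.subn returns a (new_string, number_of_replacements) tuple.
result, replacements = re.subn(r"\d+", "222", input_str)

print(first_only)    # hello 222 world 456
print(result)        # hello 222 world 222
print(replacements)  # 2
```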


