from urllib.request import urlopen
from bs4 import BeautifulSoup
import re
html = urlopen("http://www.pythonscraping.com/pages/page3.html")
bsObj = BeautifulSoup(html, "html.parser")
# Match src values of the form ../img/gifts/img<any characters>.jpg
images = bsObj.find_all("img", {"src": re.compile(r"\.\./img/gifts/img.*\.jpg")})
for image in images:
    print(image["src"])
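The pattern can also be checked offline, without fetching the page. The sketch below applies an expression equivalent to the one in the scraper to a few src values. The sample paths are made up for illustration and are not taken from the page.

```python
import re

# Equivalent to the expression passed to the scraper, compiled once.
pattern = re.compile(r"\.\./img/gifts/img.*\.jpg")

# Hypothetical src values, invented for this check.
samples = [
    "../img/gifts/img1.jpg",
    "../img/gifts/img10.jpg",
    "../img/gifts/logo.jpg",
    "../img/gifts/img1.png",
]

# Keep only the values in which the pattern is found.
matching = [src for src in samples if pattern.search(src)]
print(matching)
```

Only the first two samples should pass: the third lacks the img prefix in the file name and the fourth does not end in .jpg.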
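The function re.compile converts a regular expression (written as a string) into a pattern object, which makes matching more efficient when the same expression is used repeatedly.
Example: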
import re
text = "JGood is a handsome boy, he is cool, clever, and so on..."
print(re.findall(r'\w*oo\w*', text))  # find all words containing 'oo' -> ['JGood', 'cool']
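The example above calls re.findall directly, so here is a minimal sketch of the same search done through a compiled pattern object and reused on a second string. The names word_with_oo and other_text, and the second sentence, are illustrative and not from the original.

```python
import re

text = "JGood is a handsome boy, he is cool, clever, and so on..."
other_text = "A good book is cool too."  # made-up second input

# Compile once, then reuse the pattern object for several searches.
word_with_oo = re.compile(r"\w*oo\w*")

first = word_with_oo.findall(text)
second = word_with_oo.findall(other_text)
print(first)
print(second)
```

The pattern object exposes the same findall method as the module-level function, so the expression is written only once however many strings are searched.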