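Playwright is a modern end-to-end testing and browser automation library developed by Microsoft. It drives Chromium, WebKit, and Firefox through a single API on Windows, macOS, and Linux, and its built-in auto-waiting makes scripts less prone to timing-related failures.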
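I. Core Features
- Multi-browser support:
  - Chromium (Chrome, Edge)
  - WebKit (the engine behind Safari)
  - Firefox
- Cross-platform support:
  - Windows
  - macOS
  - Linux
- Multi-language support:
  - JavaScript/TypeScript (native)
  - Python
  - Java
  - .NET
- Auto-waiting:
  - Actions wait for the target element to be attached, visible, stable, and enabled before they run
  - Navigation calls such as page.goto() wait for the page's load event by default
- Network interception:
  - Network requests can be mocked, modified, or blocked
- Device emulation:
  - Mobile devices can be emulated (viewport, user agent, touch input)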
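To make the multi-browser point concrete, here is a minimal sketch that opens one URL in each engine and collects the page titles. The helper name `titles_across_browsers` and the `main` wrapper are illustrative, not part of Playwright; the sketch assumes all three browsers have been installed with `playwright install`.

```python
def titles_across_browsers(playwright, url, names=("chromium", "firefox", "webkit")):
    """Open `url` in each named browser engine and return {engine: page title}."""
    titles = {}
    for name in names:
        # The Playwright object exposes each engine as an attribute of that name
        browser_type = getattr(playwright, name)
        browser = browser_type.launch()
        try:
            page = browser.new_page()
            page.goto(url)
            titles[name] = page.title()
        finally:
            browser.close()  # release the engine even if navigation fails
    return titles


def main():
    # Imported here so the helper above can be exercised without a browser
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        results = titles_across_browsers(p, "https://example.com")
        for engine, title in results.items():
            print(f"{engine}: {title}")
```

Calling `main()` runs the same steps against all three engines; passing a shorter `names` tuple limits the run to the engines you have installed.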
II. Installation and Setup
Install the Playwright Python package:
    pip install playwright
Install the browser binaries:
    playwright install
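III. Core API Structure
1. Main modules
- playwright.sync_api: the synchronous API
- playwright.async_api: the asynchronous API
- playwright.chromium, playwright.firefox, playwright.webkit: the browser-type launchers. These are attributes of the Playwright object returned by sync_playwright() or async_playwright(), not importable modules.
2. Core classes
- Browser: a browser instance
- BrowserContext: an isolated browser session (similar to an incognito profile)
- Page: a single tab
- Frame: a frame within a page
- ElementHandle: a handle to a DOM element
- Locator: an element locator that is re-resolved each time it is used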
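The examples in this article use the synchronous API. As a rough sketch of the asynchronous counterpart, the code below opens each URL in its own BrowserContext and drives them concurrently with `asyncio.gather`. The function names `title_in_new_context` and `fetch_titles` are hypothetical helpers introduced for illustration.

```python
import asyncio


async def title_in_new_context(browser, url):
    """Open `url` in its own BrowserContext and return the page title."""
    context = await browser.new_context()
    try:
        page = await context.new_page()
        await page.goto(url)
        return await page.title()
    finally:
        await context.close()


async def fetch_titles(urls):
    # Imported here so the helper above can be exercised without a browser
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch()
        try:
            # One isolated context per URL, all driven concurrently
            return await asyncio.gather(
                *(title_in_new_context(browser, url) for url in urls)
            )
        finally:
            await browser.close()
```

A typical call would be `asyncio.run(fetch_titles(["https://example.com"]))`. Every Playwright call is awaited, which is the main visible difference from the synchronous API.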
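IV. Common Methods and Properties
Browser class
    # `playwright` is the object yielded by sync_playwright()
    browser = playwright.chromium.launch(headless=False)  # launch a browser
    browser.close()                   # close the browser
    browser.version                   # browser version string
Page class
    page.goto(url)                    # navigate to a URL
    page.title()                      # get the page title
    page.url                          # current URL
    page.content()                    # get the page's HTML content
    page.screenshot()                 # take a screenshot
    page.pdf()                        # generate a PDF (Chromium only)
    page.click(selector)              # click an element
    page.fill(selector, text)         # fill in text
    page.wait_for_selector(selector)  # wait for an element to appear
    page.evaluate(script)             # run JavaScript in the page
Locator class
    locator = page.locator(selector)  # create a locator
    locator.click()                   # click the matching element
    locator.fill(text)                # fill in text
    locator.text_content()            # textContent of the matching element
    locator.count()                   # number of matching elements
    locator.all_inner_texts()         # innerText of every match, as a list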
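As a small illustration of how the Locator methods combine, the sketch below gathers the text of every element matching a selector. `collect_texts` is a hypothetical helper, and the selector passed in is whatever suits your page.

```python
def collect_texts(page, selector):
    """Return the stripped, non-empty innerText of every element matching `selector`."""
    locator = page.locator(selector)
    # count() resolves the locator immediately and does not wait for elements
    if locator.count() == 0:
        return []
    return [text.strip() for text in locator.all_inner_texts() if text.strip()]
```

For example, `collect_texts(page, "li")` would return the visible text of each list item on the current page.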
V. Practical Examples
Example 1: Basic page operations
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=False)  # show the browser window
        page = browser.new_page()
        page.goto("https://example.com")
        print(page.title())
        page.screenshot(path="example.png")
        browser.close()
Example 2: Filling in and submitting a form
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto("https://demoqa.com/automation-practice-form")
        # Fill in the form
        page.fill("#firstName", "John")
        page.fill("#lastName", "Doe")
        page.fill("#userEmail", "john.doe@example.com")
        # force=True skips the actionability checks for this radio input
        page.click("#gender-radio-1", force=True)  # select "Male"
        page.fill("#userNumber", "1234567890")
        # Submit the form
        page.click("#submit")
        # Verify the result by waiting for the confirmation dialog title
        page.wait_for_selector("#example-modal-sizes-title-lg")
        browser.close()
Example 3: Dynamic content and waiting
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        # Print the page's console messages
        page.on("console", lambda msg: print(f"Console: {msg.text}"))
        page.goto("https://dynamic-load.vercel.app/")
        # Wait for the load button, then click it
        # (click() also auto-waits; the explicit wait is shown for illustration)
        page.wait_for_selector("button:has-text('Load')")
        page.click("button:has-text('Load')")
        # Wait for the dynamic content to load
        page.wait_for_selector(".dynamic-content")
        print("Dynamic content loaded:", page.text_content(".dynamic-content"))
        browser.close()
Example 4: Intercepting network requests
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()

        # Intercept requests: mock api.example.com, let everything else through
        def handle_request(route, request):
            if "api.example.com" in request.url:
                route.fulfill(
                    status=200,
                    content_type="application/json",
                    body='{"mock": "data"}'
                )
            else:
                route.continue_()

        # Register the handler before navigating so no request is missed
        page.route("**/*", handle_request)
        page.goto("https://example.com")
        browser.close()
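Besides mocking responses, a route handler can also drop requests. The sketch below aborts requests for heavy static assets based on `request.resource_type`. The set of blocked types and the names `block_heavy_resources` and `open_without_heavy_assets` are illustrative choices, not Playwright defaults.

```python
BLOCKED_RESOURCE_TYPES = {"image", "media", "font"}  # illustrative choice


def block_heavy_resources(route):
    """Route handler: abort requests for heavy static assets, pass the rest through."""
    if route.request.resource_type in BLOCKED_RESOURCE_TYPES:
        route.abort()
    else:
        route.continue_()


def open_without_heavy_assets(url):
    # Imported here so the handler above can be exercised without a browser
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.route("**/*", block_heavy_resources)
        page.goto(url)
        html = page.content()
        browser.close()
        return html
```

Blocking assets this way can shorten page loads when only the HTML matters, though pages that depend on those assets for layout may render differently.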
Example 5: Multiple pages and contexts
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        # Create two independent browser contexts
        browser = p.chromium.launch()
        context1 = browser.new_context()
        context2 = browser.new_context()
        # Create a page in each context
        page1 = context1.new_page()
        page2 = context2.new_page()
        # Operate on each page independently (cookies and storage are not shared)
        page1.goto("https://example.com")
        page2.goto("https://google.com")
        print("Page1 title:", page1.title())
        print("Page2 title:", page2.title())
        context1.close()
        context2.close()
        browser.close()
VI. Advanced Features
1. Device emulation
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        # Emulate an iPhone 11 using the built-in device descriptor
        iphone_11 = p.devices["iPhone 11"]
        browser = p.chromium.launch()
        context = browser.new_context(**iphone_11)
        page = context.new_page()
        page.goto("https://example.com")
        page.screenshot(path="mobile.png")
        browser.close()
2. Recording tests
Use the Playwright CLI to record interactions and generate test code:
    playwright codegen https://example.com
3. Video recording
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        context = browser.new_context(record_video_dir="videos/")
        page = context.new_page()
        page.goto("https://example.com")
        # ...perform actions...
        # The video is written to videos/ when the context closes
        context.close()
        browser.close()
4. Tracing for performance analysis
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        # Start tracing
        page.context.tracing.start(screenshots=True, snapshots=True, sources=True)
        page.goto("https://example.com")
        # ...perform actions...
        # Stop tracing and save the result
        page.context.tracing.stop(path="trace.zip")
        browser.close()
        # Inspect the result with: playwright show-trace trace.zip
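VII. Notes
- Element locating strategy:
  - Prefer page.locator() over page.query_selector() and page.query_selector_all() (the Python counterparts of page.$() and page.$$())
  - Use semantic, user-facing locators such as page.get_by_role("button", name="Submit")
  - Avoid XPath unless necessary
- Waiting:
  - Playwright waits automatically before acting, so manual waits are usually unnecessary
  - When an explicit wait is needed, use the page.wait_for_* methods
- Parallel execution:
  - Keep each Playwright instance independent
  - Use a separate BrowserContext per task for isolation; with the async API, several contexts can be driven concurrently
- Resource management:
  - Always close browsers and pages
  - Use a with statement so resources are released automatically
- Debugging tips:
  - Pass headless=False to watch the browser as it runs
  - Add the slow_mo parameter (in milliseconds) to slow each operation down so it is easier to follow:
        browser = p.chromium.launch(headless=False, slow_mo=100)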
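The advice on semantic locators can be sketched as follows. The label texts "Username" and "Password" and the button name "Submit" are assumptions about a hypothetical login page, and `submit_login` is an illustrative helper.

```python
def submit_login(page, username, password):
    """Fill in and submit a login form using label- and role-based locators."""
    # The label texts and button name below are assumptions about the page
    page.get_by_label("Username").fill(username)
    page.get_by_label("Password").fill(password)
    page.get_by_role("button", name="Submit").click()
```

Locators like these follow what a user sees, so they tend to survive markup changes that would break a CSS or XPath selector tied to the page structure.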
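VIII. Common Problems and Solutions
Q1: A click has no effect
- Cause: the element may be covered by another element or not visible
- Solution:
    page.click(selector, force=True)  # force the click, skipping actionability checks
    # or
    page.locator(selector).click(timeout=60000)  # allow more time than the 30 s default
Q2: Page load times out
- Cause: a slow network or a page with many resources
- Solution:
    page.goto(url, timeout=60000)  # raise the timeout (milliseconds)
    # or
    page.goto(url, wait_until="domcontentloaded")  # wait only for the DOM to load
Q3: Working with iframes
- Solution:
    button = page.frame_locator("iframe").locator("button")
    button.click()
Q4: File downloads
- Solution:
    with page.expect_download() as download_info:
        page.click("a#download-link")
    download = download_info.value
    path = download.path()             # temporary path of the downloaded file
    download.save_as("/path/to/save")  # copy it to a permanent location
Q5: Handling dialogs
- Solution:
    # Register the handler before the action that triggers the dialog
    page.on("dialog", lambda dialog: dialog.accept())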
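When accepting every dialog is too blunt, the handler can inspect the dialog first. The sketch below records each message, accepts confirm dialogs, and dismisses everything else. That policy and the name `make_dialog_handler` are illustrative assumptions.

```python
def make_dialog_handler(seen_messages):
    """Build a handler that records messages, accepts confirms, and dismisses the rest."""
    def handle(dialog):
        seen_messages.append(dialog.message)
        if dialog.type == "confirm":
            dialog.accept()
        else:
            dialog.dismiss()
    return handle
```

It would be attached with `messages = []` followed by `page.on("dialog", make_dialog_handler(messages))`, after which `messages` holds the text of every dialog the page opened.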
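IX. Going Further
1. Integrating with pytest
    # conftest.py
    # Note: the pytest-playwright plugin, if installed, already provides
    # `browser` and `page` fixtures.
    import pytest
    from playwright.sync_api import sync_playwright

    @pytest.fixture(scope="module")
    def browser():
        # One browser per test module
        with sync_playwright() as p:
            browser = p.chromium.launch()
            yield browser
            browser.close()

    @pytest.fixture
    def page(browser):
        # A fresh page for every test
        page = browser.new_page()
        yield page
        page.close()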
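2. Custom selector engines
    from playwright.sync_api import sync_playwright

    # A selector engine is a JavaScript object with query and queryAll,
    # passed to register() as a script string (not a Python function).
    EXACT_TEXT_ENGINE = """
    {
      // First element under body whose text content equals the selector
      query(root, selector) {
        return Array.from(root.querySelectorAll('body *'))
          .find(el => el.textContent === selector);
      },
      // All such elements
      queryAll(root, selector) {
        return Array.from(root.querySelectorAll('body *'))
          .filter(el => el.textContent === selector);
      }
    }"""

    with sync_playwright() as p:
        # "text" is a built-in engine, so register under another name,
        # and do so before creating the page
        p.selectors.register("exact", EXACT_TEXT_ENGINE)
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto("https://example.com")
        element = page.locator("exact=Example Domain")
        print(element.text_content())
        browser.close()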
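3. Performance monitoring
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        # Playwright has no "metrics" page event; in Chromium, runtime
        # metrics can be read through a CDP session instead
        client = page.context.new_cdp_session(page)
        client.send("Performance.enable")
        page.goto("https://example.com")
        print(client.send("Performance.getMetrics"))
        # Read selected values from the page's Performance API
        metrics = page.evaluate("""() => {
            const nav = performance.getEntriesByType('navigation')[0];
            return {
                // performance.memory is non-standard and Chromium-only
                usedJSHeapSize: performance.memory ? performance.memory.usedJSHeapSize : null,
                timing: nav ? nav.toJSON() : null
            };
        }""")
        print(metrics)
        browser.close()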
X. Best Practices
- Use the Page Object pattern:
    class LoginPage:
        def __init__(self, page):
            self.page = page
            self.username = page.locator("#username")
            self.password = page.locator("#password")
            self.submit = page.locator("#submit")

        def login(self, username, password):
            self.username.fill(username)
            self.password.fill(password)
            self.submit.click()
- Reuse configuration:
    # Build browser contexts from one reusable set of defaults
    def create_context(browser, **kwargs):
        options = {
            "viewport": {"width": 1920, "height": 1080},
            "locale": "en-US",
            "timezone_id": "America/New_York",
        }
        options.update(kwargs)  # caller-supplied values override the defaults
        return browser.new_context(**options)
- Error handling:
    # Playwright raises its own TimeoutError, not Python's built-in one
    from playwright.sync_api import TimeoutError as PlaywrightTimeoutError

    try:
        page.click(selector, timeout=5000)
    except PlaywrightTimeoutError:
        print(f"Element {selector} could not be clicked within 5 s")
        page.screenshot(path="error.png")
        raise
- Logging:
    import logging
    from playwright.sync_api import sync_playwright

    logging.basicConfig(level=logging.INFO)
    logger = logging.getLogger(__name__)

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()

        def log_request(request):
            logger.info(f"Request: {request.method} {request.url}")

        page.on("request", log_request)
        page.goto("https://example.com")
        browser.close()
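The error-handling advice can be packaged for reuse. The sketch below is one possible approach: a context manager that saves a screenshot whenever the wrapped block raises. `screenshot_on_failure` and the default path `error.png` are illustrative choices.

```python
from contextlib import contextmanager


@contextmanager
def screenshot_on_failure(page, path="error.png"):
    """Save a screenshot of `page` if the wrapped block raises, then re-raise."""
    try:
        yield page
    except Exception:
        page.screenshot(path=path)
        raise
```

Usage would look like `with screenshot_on_failure(page): page.click("#submit")`. The exception still propagates, so test runners report the failure as usual.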
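Playwright is a powerful and flexible browser automation tool suited to testing, web scraping, and many other automation scenarios. Used well, its API makes it possible to build automation that is both efficient and reliable.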