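1. Common prompts
I want to learn the most rigorous possible description of how the Windows system works.
2. Reading the documentation
2.1 When the Microsoft documentation says "Win32 API", it actually means the "Windows API (32/64)", that is, the API for both 32-bit and 64-bit Windows.
3. Common terminology
3.1 Common abbreviations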
WM: Window Message
IME: Input Method Editor [doc]
Gemini2.0-Flash-Experimental:
For example, Microsoft Pinyin, which ships with Windows, and Sogou Input Method are both IMEs.
IMM: Input Method Manager [doc]
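3.2 Message: "a bit like a person's five senses"
3.3 Hook Chain
3.3.1 user32.CallNextHookEx: the linker
Nanxi:
"If you picture the existing hook chain as a linked list, the most recent SetWindowsHookExW call puts the current hook at the head of the list. If that hook then never calls CallNextHookEx, it is as if the earlier list had been cut off; CallNextHookEx is what joins the original list back on."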
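The linked-list picture above can be sketched as a toy model in pure Python. It does not call any Windows API: HookChain, install, dispatch, and the three hook functions are hypothetical names made up for illustration, with install standing in for SetWindowsHookExW and call_next standing in for CallNextHookEx.

```python
class HookChain:
    """Toy model of a hook chain: the newest hook sits at the head."""

    def __init__(self):
        self.hooks = []  # index 0 is the head of the chain

    def install(self, hook):
        # Models SetWindowsHookExW: the new hook goes to the front.
        self.hooks.insert(0, hook)

    def dispatch(self, event):
        """Deliver an event to the head and return the names of hooks that ran."""
        called = []
        self._call(0, event, called)
        return called

    def _call(self, index, event, called):
        if index >= len(self.hooks):
            return  # end of the chain

        def call_next():
            # Models CallNextHookEx: hand the event to the next hook.
            self._call(index + 1, event, called)

        hook = self.hooks[index]
        called.append(hook.__name__)
        hook(event, call_next)


def old_hook(event, call_next):
    call_next()


def polite_hook(event, call_next):
    call_next()  # keeps the rest of the chain attached


def greedy_hook(event, call_next):
    pass  # never calls call_next, so older hooks are cut off


chain = HookChain()
chain.install(old_hook)
chain.install(polite_hook)
print(chain.dispatch("key"))  # ['polite_hook', 'old_hook']

chain.install(greedy_hook)
print(chain.dispatch("key"))  # ['greedy_hook']
```

Once greedy_hook is installed, the two older hooks stop seeing events, which is the "cut off" case from the quote.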
4. Programming philosophy
4.1 Prefer the Unicode versions of functions, which carry a "W" suffix
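One reason for this preference can be shown without calling Windows at all. The "W" functions take UTF-16 text, while the "A" functions take bytes in the active ANSI code page. In the sketch below, cp1252 is only an assumed example of a Western-European code page; the real one depends on the system locale.

```python
title = "微软拼音"

# "W" functions take UTF-16 strings, so any Unicode text can be passed.
wide = title.encode("utf-16-le")
print(len(wide))  # 8: four characters, two bytes each

# "A" functions take bytes in the active ANSI code page.
# cp1252 is an assumed example; it has no Chinese characters.
try:
    title.encode("cp1252")
    ansi_ok = True
except UnicodeEncodeError:
    ansi_ok = False
print(ansi_ok)  # False
```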
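4.2 Use the Windows constants in win32con instead of hand-defining numeric constants
import win32con
print(hex(win32con.WM_IME_CONTROL))
Note
win32con is short for "Windows 32 Constants". It is a module in the pywin32 package that defines the various constants used by the Windows API.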
4.3 Call user32.dll through a separate instance
import ctypes
user32 = ctypes.WinDLL("user32")  # a separate instance, not the shared ctypes.windll.user32
5. Commonly used packages
pywinauto: automates operations on Windows windows
autoit: GUI automation
pyautogui: GUI automation
pyOpenRPA: a toolkit for testing RPA operations
6. Running PowerShell commands: subprocess
import subprocess

def run_powershell_command(command):
    # Build the PowerShell invocation
    ps_command = ["powershell", "-Command", command]
    try:
        result = subprocess.run(ps_command, check=True, text=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        return result.stdout
    except subprocess.CalledProcessError as e:
        print("Command failed:", e.stderr)
        return None

# Example: run a simple PowerShell command
output = run_powershell_command("Get-Date")
print(output)
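7. Launching applications: subprocess.Popen
7.1 Launching an application in the background
Note:
Opening an application with subprocess.run() blocks the current Python program until the launched program exits. If you do not want the Python program to block, use subprocess.Popen() instead.
import os
import subprocess

print(os.path.dirname(__file__))
command = "app_path"  # placeholder: path to the application's executable
subprocess.Popen([command])
print("Opened app")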
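7.2 Checking whether the application started successfully
7.2.1 Approaches suggested by LLMs that do not work
process.poll() cannot confirm that the application started successfully
process = subprocess.Popen([command])
while process.poll() is None:
    print("The application is running...")
process.poll() only tells you whether the process is still running; it cannot confirm that the application's UI has appeared.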
psutil cannot confirm that the application started successfully either
import psutil

def is_process_running(process_name):
    '''Check if there is any running process that contains the given name process_name.'''
    for proc in psutil.process_iter(['name']):
        try:
            # The name can be None when it is not accessible
            if process_name.lower() in (proc.info['name'] or '').lower():
                return True
        except (psutil.NoSuchProcess, psutil.AccessDenied, psutil.ZombieProcess):
            pass
    return False
psutil decides by iterating over process names, so it still cannot tell whether the application's UI has appeared.
7.2.2 Listing all top-level windows
import win32gui

def enum_windows():  # enum windows: iterate over all top-level windows
    def callback(hwnd, extra):
        class_name = win32gui.GetClassName(hwnd)
        window_text = win32gui.GetWindowText(hwnd)
        print(f"Window HWND: {hwnd}, Class Name: {class_name}, Text: {window_text}")
    win32gui.EnumWindows(callback, None)

enum_windows()
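Since neither process.poll() nor psutil can see the UI, one possible way to check startup is to poll the top-level window list until a matching visible window shows up. The sketch below is only an illustration under assumptions: wait_until and find_visible_window are hypothetical helper names, matching by a title substring is an assumed criterion, and find_visible_window needs pywin32 on Windows.

```python
import time


def wait_until(predicate, timeout=10.0, interval=0.2):
    """Call predicate until it returns a truthy value or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        result = predicate()
        if result:
            return result
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)


def find_visible_window(title_part):
    """Windows-only: return the first visible top-level window whose title contains title_part."""
    import win32gui  # imported here so the rest of the sketch loads without pywin32

    matches = []

    def callback(hwnd, _):
        if win32gui.IsWindowVisible(hwnd) and title_part in win32gui.GetWindowText(hwnd):
            matches.append(hwnd)
        return True  # keep enumerating

    win32gui.EnumWindows(callback, None)
    return matches[0] if matches else None


# Usage on Windows, after subprocess.Popen([command]); "Notepad" is a made-up title:
# hwnd = wait_until(lambda: find_visible_window("Notepad"), timeout=15)
# print("UI is up" if hwnd else "No window appeared in time")
```

A visible top-level window is still only a proxy for "the application is ready"; an application that shows a splash screen first would need a more specific match.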
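8. Background operation: Win32 API
Note
The operations that PyAutoGUI simulates require the program under test to be running in the foreground, because PyAutoGUI sends input to the system rather than to a specific window.
9. Window screenshots
9.1 Whether Microsoft UI Automation is needed to take window screenshots
Since we do not have much Windows programming experience either, we need to consult Professor GPT4 on this point: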
Nanxi: Can the Microsoft UI Automation framework be used to take screenshots of a specific application window?
GPT4-32k:
No, the Microsoft UI Automation framework is not designed to capture screenshots of application windows. Its primary purpose is to provide programmatic access to most UI elements on the desktop for accessibility, automation of UI testing, and for assistive technology products.
If you need to take screenshots of a specific application window, you would typically use other APIs provided by the Windows platform, such as the Graphics Device Interface (GDI), GDI+, or the DirectX APIs. The PrintWindow function in the Windows API is one example of a function that can be used to capture the contents of a window and copy it into a bitmap.
For .NET applications, classes such as System.Drawing.Graphics from the System.Drawing namespace can be used to capture screenshots. In more advanced scenarios, where you might need to capture screenshots with DirectX content, you might use the BitBlt function or DirectX APIs directly.
UI Automation could be used to identify the window handle (HWND) of the application window you are interested in, and then you could use the window handle with the appropriate graphics API to capture the screenshot.
9.2 Whether pywinauto is needed to take window screenshots
It is not needed, because under the hood pywinauto also uses the pywin32 API ("bmp.GetBitmapBits(True)") to take screenshots.
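As a rough illustration of capturing a window with pywin32 directly, the sketch below combines PrintWindow (mentioned in 9.1) with bmp.GetBitmapBits(True). It is not taken from pywinauto's source: capture_window is a hypothetical name, the function only runs on Windows with pywin32 installed, and the BGRX byte order is what 32-bit displays typically produce rather than a guarantee.

```python
import ctypes


def capture_window(hwnd):
    """Windows-only sketch: return (width, height, raw pixel bytes or None)."""
    from ctypes import wintypes
    import win32gui
    import win32ui

    user32 = ctypes.WinDLL("user32")
    user32.PrintWindow.argtypes = [wintypes.HWND, wintypes.HDC, wintypes.UINT]
    user32.PrintWindow.restype = wintypes.BOOL

    left, top, right, bottom = win32gui.GetWindowRect(hwnd)
    width, height = right - left, bottom - top

    # Build a memory DC with a bitmap the size of the window.
    window_dc = win32gui.GetWindowDC(hwnd)
    source_dc = win32ui.CreateDCFromHandle(window_dc)
    memory_dc = source_dc.CreateCompatibleDC()
    bmp = win32ui.CreateBitmap()
    bmp.CreateCompatibleBitmap(source_dc, width, height)
    memory_dc.SelectObject(bmp)
    try:
        # Ask the window to draw itself into the memory DC.
        ok = user32.PrintWindow(hwnd, memory_dc.GetSafeHdc(), 0)
        pixels = bmp.GetBitmapBits(True) if ok else None  # typically BGRX bytes
    finally:
        # Release the GDI objects whether or not the capture worked.
        win32gui.DeleteObject(bmp.GetHandle())
        memory_dc.DeleteDC()
        source_dc.DeleteDC()
        win32gui.ReleaseDC(hwnd, window_dc)
    return width, height, pixels
```

The window handle can come from the EnumWindows listing in section 7; turning the raw bytes into an image file is left to an imaging library.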
10. Input method settings: based on imm32
10.1 Common messages
WM_IME_CONTROL [doc]
IME_CMODE_ALPHANUMERIC: English (alphanumeric) mode [doc]
According to the official documentation, "IME_CMODE_ALPHANUMERIC" should stand for IME Conversion Mode Alphanumeric.
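10.2 Common message settings
10.2.1 Switching a given window's input method to English
(1) Set the conversion mode to English [recommended: more readable]
user32.SendMessageW(ime_hwnd, win32con.WM_IME_CONTROL, win32con.IMN_SETCONVERSIONMODE, IME_CMODE_ALPHANUMERIC)
(2) Open the IME status window and set it to English
user32.SendMessageW(ime_hwnd, win32con.WM_IME_CONTROL, win32con.IMN_OPENSTATUSWINDOW, IME_CMODE_ALPHANUMERIC)
Note: In testing, both calls switch the input to English equally well. Strictly speaking, the IMN_* names are WM_IME_NOTIFY notification codes, whereas the wParam of WM_IME_CONTROL is an IMC_* command. The calls work because the numeric values coincide: IMN_SETCONVERSIONMODE (0x0006) equals IMC_SETOPENSTATUS, so call (1) actually closes the IME, and IMN_OPENSTATUSWINDOW (0x0002) equals IMC_SETCONVERSIONMODE, so call (2) is the one that sets the conversion mode.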
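For reference, here is a minimal ctypes sketch of the same switch, written with the IMC_* command values that WM_IME_CONTROL takes in wParam. Several things are assumptions rather than part of the notes above: set_ime_english is a hypothetical name, ime_hwnd is obtained through imm32's ImmGetDefaultIMEWnd, and the two IMC_* values are defined locally because win32con may not export them. It only runs on Windows.

```python
import ctypes

WM_IME_CONTROL = 0x0283
# IMC_* command values, defined locally (assumption: win32con may not export these).
IMC_SETCONVERSIONMODE = 0x0002
IMC_SETOPENSTATUS = 0x0006
IME_CMODE_ALPHANUMERIC = 0x0000


def set_ime_english(hwnd):
    """Windows-only: ask the default IME window of hwnd to use alphanumeric mode."""
    from ctypes import wintypes

    user32 = ctypes.WinDLL("user32")
    imm32 = ctypes.WinDLL("imm32")
    imm32.ImmGetDefaultIMEWnd.argtypes = [wintypes.HWND]
    imm32.ImmGetDefaultIMEWnd.restype = wintypes.HWND
    user32.SendMessageW.argtypes = [
        wintypes.HWND, wintypes.UINT, wintypes.WPARAM, wintypes.LPARAM,
    ]
    user32.SendMessageW.restype = ctypes.c_ssize_t  # LRESULT

    ime_hwnd = imm32.ImmGetDefaultIMEWnd(hwnd)
    if not ime_hwnd:
        return False  # no IME window is associated with this window
    user32.SendMessageW(
        ime_hwnd, WM_IME_CONTROL, IMC_SETCONVERSIONMODE, IME_CMODE_ALPHANUMERIC
    )
    return True


# Usage on Windows, with hwnd being the handle of the target window:
# set_ime_english(hwnd)
```

Sending IMC_SETOPENSTATUS with a value of 0 instead would close the IME rather than change its conversion mode; which of the two behaves better may depend on the input method in use.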