Controlling the Mouse with Hand Gestures Using MediaPipe
MediaPipe is a powerful cross-platform framework for building real-time multimedia processing pipelines. Combined with Python and OpenCV, it can recognize hand gestures and use them to control the mouse. The implementation steps are as follows:
Install the Required Libraries
Make sure the following Python libraries are installed:
pip install mediapipe opencv-python pyautogui
- mediapipe: hand landmark detection.
- opencv-python: image processing and camera capture.
- pyautogui: mouse control.
Initialize MediaPipe Hand Detection
Load MediaPipe's hand-detection module and initialize the camera:
import cv2
import mediapipe as mp
import pyautogui
mp_hands = mp.solutions.hands
hands = mp_hands.Hands(max_num_hands=1, min_detection_confidence=0.7)
mp_draw = mp.solutions.drawing_utils
cap = cv2.VideoCapture(0)
Detect the Hand and Extract Landmarks
Capture live frames from the camera and detect hand landmarks:
while cap.isOpened():
    success, image = cap.read()
    if not success:
        continue
    image = cv2.cvtColor(cv2.flip(image, 1), cv2.COLOR_BGR2RGB)
    results = hands.process(image)
    image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
    if results.multi_hand_landmarks:
        for hand_landmarks in results.multi_hand_landmarks:
            mp_draw.draw_landmarks(image, hand_landmarks, mp_hands.HAND_CONNECTIONS)
            # Index fingertip coordinates (landmark 8)
            index_finger = hand_landmarks.landmark[8]
            height, width, _ = image.shape
            x, y = int(index_finger.x * width), int(index_finger.y * height)
Map the Gesture to Mouse Movement
Map the index fingertip's coordinates to the screen resolution and move the mouse accordingly:
# Get the screen size
screen_width, screen_height = pyautogui.size()
# Map the camera-frame coordinates to screen coordinates
mouse_x = int(index_finger.x * screen_width)
mouse_y = int(index_finger.y * screen_height)
# Move the mouse
pyautogui.moveTo(mouse_x, mouse_y)
Implement Clicking
Trigger a mouse click with a gesture, such as touching the thumb and index fingertips together:
# Thumb tip coordinates (landmark 4)
thumb_tip = hand_landmarks.landmark[4]
thumb_x, thumb_y = int(thumb_tip.x * width), int(thumb_tip.y * height)
# Pixel distance between the thumb and index fingertips
distance = ((x - thumb_x) ** 2 + (y - thumb_y) ** 2) ** 0.5
# Trigger a click when the fingertips are close enough
if distance < 30:
    pyautogui.click()
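As written, the distance check fires `pyautogui.click()` on every frame for as long as the fingers stay pinched, which registers as a burst of rapid clicks. A minimal sketch of a click cooldown (the `should_click` helper and the 0.5-second interval are assumptions, not part of the original script) avoids this:

```python
import time

last_click = 0.0
CLICK_COOLDOWN = 0.5  # minimum seconds between accepted clicks (assumed value)

def should_click(distance, threshold=30):
    """Return True at most once per cooldown period while the fingers are pinched."""
    global last_click
    now = time.time()
    if distance < threshold and now - last_click > CLICK_COOLDOWN:
        last_click = now
        return True
    return False
```

In the main loop, `if should_click(distance): pyautogui.click()` would then replace the bare `if distance < 30:` test.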
Optimization and Debugging
- Tune sensitivity: adjust the coordinate-mapping ratio or the click distance threshold to suit your setup.
- Visual feedback: draw the landmarks and connections on the camera frame to make debugging easier.
- Error handling: add exception handling so the program does not crash unexpectedly.
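The raw fingertip position jitters from frame to frame, which makes the cursor shaky. One common way to tune sensitivity is exponential smoothing; here is a minimal sketch (the `smooth` helper and the `alpha=0.3` default are illustrative assumptions):

```python
def smooth(prev, current, alpha=0.3):
    """Exponentially smooth an (x, y) position.

    alpha near 1 -> responsive but jittery; alpha near 0 -> smooth but laggy.
    """
    if prev is None:
        return current
    return (prev[0] + alpha * (current[0] - prev[0]),
            prev[1] + alpha * (current[1] - prev[1]))
```

Keep the previous smoothed position across loop iterations and pass each new `(mouse_x, mouse_y)` through `smooth` before calling `pyautogui.moveTo`.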
Complete Code Example
Putting the steps above together into a single script:
import cv2
import mediapipe as mp
import pyautogui

mp_hands = mp.solutions.hands
hands = mp_hands.Hands(max_num_hands=1, min_detection_confidence=0.7)
mp_draw = mp.solutions.drawing_utils
cap = cv2.VideoCapture(0)

while cap.isOpened():
    success, image = cap.read()
    if not success:
        continue
    image = cv2.cvtColor(cv2.flip(image, 1), cv2.COLOR_BGR2RGB)
    results = hands.process(image)
    image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
    if results.multi_hand_landmarks:
        for hand_landmarks in results.multi_hand_landmarks:
            mp_draw.draw_landmarks(image, hand_landmarks, mp_hands.HAND_CONNECTIONS)
            index_finger = hand_landmarks.landmark[8]
            thumb_tip = hand_landmarks.landmark[4]
            height, width, _ = image.shape
            x, y = int(index_finger.x * width), int(index_finger.y * height)
            screen_width, screen_height = pyautogui.size()
            mouse_x = int(index_finger.x * screen_width)
            mouse_y = int(index_finger.y * screen_height)
            pyautogui.moveTo(mouse_x, mouse_y)
            thumb_x, thumb_y = int(thumb_tip.x * width), int(thumb_tip.y * height)
            distance = ((x - thumb_x) ** 2 + (y - thumb_y) ** 2) ** 0.5
            if distance < 30:
                pyautogui.click()
    cv2.imshow('Gesture Mouse Control', image)
    if cv2.waitKey(5) & 0xFF == 27:  # press ESC to exit
        break

cap.release()
cv2.destroyAllWindows()
Notes
- Latency: gesture recognition and mouse movement may lag; reducing the frame resolution or optimizing the code can improve performance.
- More gestures: the code can be extended to support additional gestures (e.g. dragging, right-clicking).
- Lighting: make sure the camera image is clear, and avoid cluttered backgrounds that interfere with hand detection.
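The same pinch-distance idea extends to additional gestures: for example, a thumb-to-middle-fingertip pinch (landmark 12 in MediaPipe's hand model) could trigger a right click. A hedged sketch of such a classifier (`classify_gesture`, its threshold, and the gesture names are illustrative assumptions):

```python
def pinch_distance(p1, p2, width, height):
    """Pixel distance between two normalized (x, y) landmark positions."""
    dx = (p1[0] - p2[0]) * width
    dy = (p1[1] - p2[1]) * height
    return (dx * dx + dy * dy) ** 0.5

def classify_gesture(thumb, index, middle, width, height, threshold=30):
    """Map fingertip positions to a gesture name."""
    if pinch_distance(thumb, index, width, height) < threshold:
        return "left_click"   # thumb touches index fingertip
    if pinch_distance(thumb, middle, width, height) < threshold:
        return "right_click"  # thumb touches middle fingertip
    return "move"
```

In the main loop, the thumb tip, the index fingertip, and `hand_landmarks.landmark[12]` (the middle fingertip) would feed the classifier, which then dispatches to `pyautogui.click()` or `pyautogui.rightClick()`.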
