ffmpeg-python in Practice: From Video Processing to Streaming Applications

洪赫逊

Published 2025-06-04 09:02:03

License: CC 4.0 BY-SA

Article link: https://blog.youkuaiyun.com/gitblog_00613/article/details/148415668

ffmpeg-python: Python bindings for FFmpeg, with complex filtering support. Project URL: https://gitcode.com/gh_mirrors/ff/ffmpeg-python

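ffmpeg-python is a Python wrapper around FFmpeg that offers a concise API for processing audio and video files. It builds and runs ffmpeg command lines, so the FFmpeg executables must be installed and available on the PATH. This article walks through several typical use cases to help developers pick up the core techniques of audio and video processing quickly.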

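Getting basic video information

ffmpeg.probe runs ffprobe on a file and returns its metadata. This is usually the first step in any video-processing task: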

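import ffmpeg

probe = ffmpeg.probe('input.mp4')
# Pick the first video stream; None if the file has no video
video_stream = next((stream for stream in probe['streams']
                     if stream['codec_type'] == 'video'), None)
if video_stream is None:
    raise ValueError('No video stream found in input.mp4')
width = int(video_stream['width'])
height = int(video_stream['height'])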

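This code extracts the width and height of the video from the probe result. Later steps, such as reshaping raw frames, depend on these values.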
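The probe result carries more than the frame size. As a minimal sketch, assuming the stream dictionary contains ffprobe's `r_frame_rate` field (a fraction string such as `30000/1001`) and, optionally, `duration`, the helper below turns them into numbers. The function name `stream_summary` and the sample dictionary are illustrative and not part of ffmpeg-python.

```python
from fractions import Fraction


def stream_summary(video_stream):
    """Summarize a video stream dict shaped like one entry of probe['streams']."""
    # ffprobe reports frame rates as fraction strings, e.g. '30000/1001'.
    fps = Fraction(video_stream['r_frame_rate'])
    summary = {
        'width': int(video_stream['width']),
        'height': int(video_stream['height']),
        'fps': float(fps),
    }
    # 'duration' is not reported for every stream, so treat it as optional.
    if 'duration' in video_stream:
        summary['duration'] = float(video_stream['duration'])
    return summary


# Hand-written sample standing in for a real probe result.
sample = {'codec_type': 'video', 'width': 1920, 'height': 1080,
          'r_frame_rate': '30000/1001', 'duration': '12.5'}
print(stream_summary(sample))
```

In real use, the `video_stream` found in the probe step would be passed in place of `sample`.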

Generating video thumbnails

Generating a thumbnail for a video is a common requirement. ffmpeg-python can grab the frame at a specified timestamp:

import ffmpeg

(
    ffmpeg
    .input('video.mp4', ss='00:00:10')  # seek to the 10-second mark
    .filter('scale', width=320, height=-1)  # scale to 320 px wide; height follows the aspect ratio
    .output('thumbnail.jpg', vframes=1)  # write a single frame
    .run()
)
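When several thumbnails are needed, one option is to spread the seek positions evenly over the video. The sketch below is plain arithmetic and assumes the duration in seconds is already known (for example from a probe); `thumbnail_timestamps` is a hypothetical helper name.

```python
def thumbnail_timestamps(duration, count):
    """Return `count` timestamps (in seconds) at the midpoints of equal segments."""
    if duration <= 0 or count <= 0:
        raise ValueError('duration and count must be positive')
    segment = duration / count
    # The midpoint of each segment avoids the very first and very last frame.
    return [segment * (i + 0.5) for i in range(count)]


print(thumbnail_timestamps(100.0, 4))
```

Each value can be passed as `ss`, since FFmpeg accepts a plain number of seconds as well as the `HH:MM:SS` form.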

Converting video frames to a NumPy array

Converting video frames to a NumPy array makes them easy to use in computer-vision code:

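import ffmpeg
import numpy as np

# width and height are the values read from ffmpeg.probe earlier
out, _ = (
    ffmpeg
    .input('input.mp4')
    .output('pipe:', format='rawvideo', pix_fmt='rgb24')
    .run(capture_stdout=True)
)
# rgb24 uses 3 bytes per pixel, so each frame is height * width * 3 bytes
video_frames = np.frombuffer(out, np.uint8).reshape([-1, height, width, 3])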

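The resulting video_frames is a four-dimensional array of shape (frames, height, width, 3) that holds the RGB data of every frame. Because the whole video is decoded into memory at once, this approach suits short clips.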
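Raw rgb24 frames are large, so it is worth estimating the buffer size before decoding a whole file this way. The following is a back-of-the-envelope sketch; the helper name `rgb24_buffer_bytes` and the example resolution and frame rate are assumptions chosen for illustration.

```python
def rgb24_buffer_bytes(width, height, num_frames):
    """Bytes needed to hold num_frames raw rgb24 frames (3 bytes per pixel)."""
    return width * height * 3 * num_frames


# One minute of 1920x1080 video at 30 frames per second.
total = rgb24_buffer_bytes(1920, 1080, 30 * 60)
print(f'{total} bytes, about {total / 2**30:.1f} GiB')
```

For this example the estimate is about 10.4 GiB, which is why longer videos are usually read frame by frame through a pipe.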

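Audio processing: converting to PCM

Audio processing is just as simple. The following code converts an audio file to 16 kHz mono PCM with signed 16-bit little-endian samples: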

import ffmpeg

out, _ = (
    ffmpeg
    .input('audio.wav')
    # raw signed 16-bit little-endian PCM, 1 channel, 16 kHz, written to stdout
    .output('-', format='s16le', acodec='pcm_s16le', ac=1, ar='16k')
    .run(capture_stdout=True)
)
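The captured bytes are headerless samples, so they still have to be decoded before analysis. Below is a minimal standard-library sketch, assuming mono s16le data at 16 kHz as produced above; `decode_s16le` is a hypothetical helper and `fake_out` is synthetic data standing in for the real `out` bytes.

```python
import struct


def decode_s16le(pcm_bytes, sample_rate=16000):
    """Decode mono signed 16-bit little-endian PCM into floats in [-1.0, 1.0)."""
    count = len(pcm_bytes) // 2  # two bytes per sample
    samples = struct.unpack(f'<{count}h', pcm_bytes[:count * 2])
    floats = [s / 32768.0 for s in samples]
    # Return the normalized samples and the clip duration in seconds.
    return floats, count / sample_rate


# Synthetic input standing in for the bytes captured from ffmpeg.
fake_out = struct.pack('<4h', 0, 16384, -32768, 32767)
values, seconds = decode_s16le(fake_out)
print(values, seconds)
```

With NumPy available, `np.frombuffer(out, np.int16)` is a common alternative on little-endian machines.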

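Advanced usage: video composition and effects

Combining multiple videos

The following example applies effects to the video and audio streams of two files, then concatenates them: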

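import ffmpeg

in1 = ffmpeg.input('video1.mp4')
in2 = ffmpeg.input('video2.mp4')

# Flip the first video horizontally
v1 = in1.video.hflip()
a1 = in1.audio

# Reverse the second video and desaturate it (hue with s=0 removes color)
v2 = in2.video.filter('reverse').filter('hue', s=0)
# Reverse its audio to match, then apply a phaser effect
a2 = in2.audio.filter('areverse').filter('aphaser')

# Concatenate the two segments into one video stream and one audio stream
joined = ffmpeg.concat(v1, a1, v2, a2, v=1, a=1).node
output = ffmpeg.output(joined[0], joined[1], 'output.mp4')
output.run()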
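The concat call takes its streams segment by segment: with v=1 and a=1, the order is the first segment's video and audio, then the second segment's, and so on. As a small sketch of that ordering, the hypothetical helper `concat_order` below flattens a list of (video, audio) pairs; strings stand in for stream objects so the snippet runs without FFmpeg.

```python
def concat_order(segments):
    """Flatten [(video, audio), ...] into segment-by-segment order."""
    flat = []
    for index, segment in enumerate(segments):
        if len(segment) != 2:
            raise ValueError(f'segment {index} must be a (video, audio) pair')
        flat.extend(segment)
    return flat


# Strings stand in for ffmpeg-python stream objects.
print(concat_order([('v1', 'a1'), ('v2', 'a2')]))
```

With real streams, the result could be unpacked into the call, for example `ffmpeg.concat(*concat_order(pairs), v=1, a=1)`.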

Mono to stereo

Merge two mono audio files into a single stereo file:

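import ffmpeg

# Trim each input, then reset timestamps so both channels start at 0
left = (ffmpeg.input('left.wav')
        .filter('atrim', start=5)
        .filter('asetpts', 'PTS-STARTPTS'))
right = (ffmpeg.input('right.wav')
         .filter('atrim', start=10)
         .filter('asetpts', 'PTS-STARTPTS'))

(
    ffmpeg
    .filter([left, right], 'join', inputs=2, channel_layout='stereo')
    .output('output.mp3')
    .run()
)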

Streaming

Serving a local video as an HTTP live stream

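import ffmpeg

(
    ffmpeg
    # -re is an input option: read at the native frame rate to simulate a live source
    .input("input.mp4", re=None)
    .output("http://localhost:8080",
            codec="copy",  # copy streams without re-encoding
            listen=1,      # make ffmpeg act as the HTTP server
            f="flv")
    .run()
)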

Forwarding an RTSP stream to a TCP socket

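import socket

import ffmpeg

# Placeholder receiver address; replace with the real host and port
tcp_socket = socket.create_connection(('127.0.0.1', 9000))

process = (
    ffmpeg
    .input('rtsp://example.com:8554/stream')
    .output('-', format='h264')  # raw H.264 elementary stream on stdout
    .run_async(pipe_stdout=True)
)

while True:
    packet = process.stdout.read(4096)
    if not packet:  # ffmpeg closed stdout
        break
    tcp_socket.sendall(packet)  # sendall retries until the whole packet is sent

process.wait()
tcp_socket.close()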
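The read-and-forward loop can be exercised without FFmpeg or a network. In the sketch below, `io.BytesIO` stands in for the process's stdout and `socket.socketpair()` for the TCP connection; both substitutions, and the helper name `forward_stream`, are assumptions made for the demo. It uses `sendall`, because a plain `send` may transmit only part of a chunk.

```python
import io
import socket


def forward_stream(stream, sock, chunk_size=4096):
    """Copy everything from a binary stream to a connected socket."""
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty read means end of stream
            break
        sock.sendall(chunk)  # keeps sending until the whole chunk is written
        total += len(chunk)
    return total


# Loopback demo with a small payload that fits in the socket buffer.
sender, receiver = socket.socketpair()
payload = bytes(range(256)) * 8  # 2048 bytes of sample data
sent = forward_stream(io.BytesIO(payload), sender, chunk_size=512)
sender.close()

received = b''
while True:
    data = receiver.recv(1024)
    if not data:
        break
    received += data
receiver.close()
print(sent, received == payload)
```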

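Integrating with deep learning frameworks

Because ffmpeg-python can read and write raw frames through pipes, it pairs well with frameworks such as TensorFlow for frame-by-frame processing of a video stream: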

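import ffmpeg
import numpy as np

# width/height come from ffmpeg.probe; model is a placeholder object whose
# process() maps a (height, width, 3) uint8 array to an array of the same shape

# Decode: input file -> raw rgb24 frames on stdout
decoder = (
    ffmpeg.input('input.mp4')
    .output('pipe:', format='rawvideo', pix_fmt='rgb24')
    .run_async(pipe_stdout=True)
)

# Encode: raw rgb24 frames on stdin -> output file
encoder = (
    ffmpeg.input('pipe:', format='rawvideo', pix_fmt='rgb24', s=f'{width}x{height}')
    .output('output.mp4', pix_fmt='yuv420p')
    .run_async(pipe_stdin=True)
)

while True:
    frame = decoder.stdout.read(width * height * 3)  # one rgb24 frame
    if not frame:
        break
    # Run the model (e.g. TensorFlow) on the frame
    in_frame = np.frombuffer(frame, np.uint8).reshape([height, width, 3])
    processed_frame = model.process(in_frame)
    encoder.stdin.write(processed_frame.astype(np.uint8).tobytes())

# Close stdin so the encoder can finalize the file, then wait for both processes
encoder.stdin.close()
decoder.wait()
encoder.wait()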
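The frame-reading half of such a loop can be factored into a generator and tested on its own. The sketch below assumes rgb24 data (3 bytes per pixel) and uses `io.BytesIO` in place of the decoder's stdout; `iter_frames` is a hypothetical helper, not an ffmpeg-python function. It stops when a read comes back short, so a truncated final frame is never passed to the model.

```python
import io


def iter_frames(stream, width, height, channels=3):
    """Yield complete raw frames from a binary stream, dropping a trailing partial frame."""
    frame_size = width * height * channels
    while True:
        data = stream.read(frame_size)
        if len(data) < frame_size:  # end of stream, or a truncated final frame
            return
        yield data


# BytesIO stands in for the decoder pipe: two full 2x2 frames plus 5 stray bytes.
fake_pipe = io.BytesIO(bytes(12) + bytes([255]) * 12 + b'\x00' * 5)
frames = list(iter_frames(fake_pipe, width=2, height=2))
print(len(frames))
```

Each yielded frame can then be reshaped to (height, width, 3) before it is handed to the model.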