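Building a 360° Surround View on AGX Xavier [Part 4: Main Pipeline and Principles, Hardware Decoding]

C++ code for Jetson AGX Xavier built on GStreamer + CUDA + OpenCV: four RTSP streams → hardware decoding → CUDA remap → stitching → RTSP push through GStreamer.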

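📌 1. Approach

1️⃣ Pull the RTSP streams with a GStreamer pipeline
2️⃣ Decode H.264/H.265 in hardware with nvv4l2decoder
3️⃣ Hand the decoded frames to an appsink, where OpenCV (or CUDA) picks them up
4️⃣ Correct lens distortion with cv::cuda::remap
5️⃣ Stitch with OpenCV CUDA
6️⃣ Feed the stitched frame back through appsrc and encode it with nvv4l2h264enc
7️⃣ Push to an external RTSP server (e.g. mediamtx) with rtspclientsink or udpsink


Throughout the flow, appsink/appsrc connect GStreamer to OpenCV/CUDA.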
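
Steps 1 to 3 can be sketched as a launch-string builder for one camera. This is a minimal sketch, not the article's final code: the helper name buildDecodePipeline, the latency value and the appsink settings (drop, max-buffers, sync) are assumptions chosen for illustration. The resulting string is what would be passed to gst_parse_launch.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: builds the receive-side launch string for one camera.
// rtspsrc -> RTP depayloader -> parser -> nvv4l2decoder (NVMM memory)
// -> nvvidconv (BGRx in system memory) -> videoconvert (BGR) -> appsink.
std::string buildDecodePipeline(const std::string& rtspUrl,
                                bool h265,
                                const std::string& sinkName) {
    assert(!rtspUrl.empty() && !sinkName.empty());
    const std::string codec = h265 ? "h265" : "h264";

    std::string s;
    s += "rtspsrc location=" + rtspUrl + " latency=100 ! ";  // latency in ms, assumed value
    s += "rtp" + codec + "depay ! " + codec + "parse ! ";
    s += "nvv4l2decoder ! ";
    s += "nvvidconv ! video/x-raw,format=BGRx ! ";
    s += "videoconvert ! video/x-raw,format=BGR ! ";
    // Keep only the newest frame so a slow consumer does not build up latency (assumed policy).
    s += "appsink name=" + sinkName + " drop=true max-buffers=1 sync=false";
    return s;
}
```

With four cameras, the helper would be called once per URL with a distinct sinkName, giving four independent pipelines.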

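Core ideas

  • Each RTSP stream gets its own GStreamer pipeline + appsink for pulling frames.

  • OpenCV CUDA (cv::cuda) performs the remap.

  • After stitching, the result is copied from the GPU back to the CPU (GpuMat.download) and pushed to appsrc.

  • appsrc feeds the CPU-memory frames back into GStreamer, which encodes them and pushes them out over RTSP.

  • nvv4l2decoder → the Jetson hardware H.264 decode unit

  • cv::cuda::remap → distortion correction on the GPU

  • appsrc / appsink → the interface between GStreamer and OpenCV

  • The whole flow is GPU-accelerated; the CPU only uploads frames for stitching and pushes the result.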
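
The output side (appsrc → hardware encoder → RTSP server) can be sketched the same way. This is an illustrative sketch under assumptions: the helper name buildRtspPushPipeline, the I420 input format and the bitrate parameter are not from the article. Note that nvv4l2h264enc takes its bitrate in bits per second, and that rtspclientsink is usually shipped in a separate RTSP plugin package that may need to be installed.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: builds the send-side launch string.
// appsrc (I420, system memory) -> nvvidconv (copy into NVMM as NV12)
// -> nvv4l2h264enc (hardware encoder) -> h264parse -> rtspclientsink.
std::string buildRtspPushPipeline(const std::string& serverUrl,
                                  int width, int height, int fps,
                                  int bitrateBps) {
    assert(!serverUrl.empty() && width > 0 && height > 0 && fps > 0 && bitrateBps > 0);

    std::string s;
    s += "appsrc name=mysrc is-live=true block=true format=time ! ";
    s += "video/x-raw,format=I420,width=" + std::to_string(width);
    s += ",height=" + std::to_string(height);
    s += ",framerate=" + std::to_string(fps) + "/1 ! ";
    s += "nvvidconv ! video/x-raw(memory:NVMM),format=NV12 ! ";
    s += "nvv4l2h264enc bitrate=" + std::to_string(bitrateBps) + " insert-sps-pps=true ! ";
    s += "h264parse ! ";
    s += "rtspclientsink location=" + serverUrl;
    return s;
}
```

For example, buildRtspPushPipeline("rtsp://192.168.10.20:8554/live", 1280, 720, 30, 4000000) targets the mediamtx address used later in this article.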

✅ 2. Environment check

1. Confirm that nvv4l2decoder and nvvidconv are available

Run in a terminal:

gst-inspect-1.0 nvv4l2decoder
gst-inspect-1.0 nvvidconv

If the plugin description is printed, for example:

gst-inspect-1.0 nvv4l2decoder
Factory Details:
  Rank                     primary + 11 (267)
  Long-name                NVIDIA v4l2 video decoder
  Klass                    Codec/Decoder/Video
  Description              Decode video streams via V4L2 API
  Author                   Nicolas Dufresne <nicolas.dufresne@collabora.com>, Viranjan Pagar <vpagar@nvidia.com>

Plugin Details:
  Name                     nvvideo4linux2
  Description              Nvidia elements for Video 4 Linux
  Filename                 /usr/lib/aarch64-linux-gnu/gstreamer-1.0/libgstnvvideo4linux2.so
  Version                  1.14.0
  License                  LGPL
  Source module            nvvideo4linux2
  Binary package           nvvideo4linux2
  Origin URL               http://nvidia.com/

GObject
 +----GInitiallyUnowned
       +----GstObject
             +----GstElement
                   +----GstVideoDecoder
                         +----GstNvV4l2VideoDec
                               +----nvv4l2decoder

Pad Templates:
  SRC template: 'src'
    Availability: Always
    Capabilities:
      video/x-raw(memory:NVMM)
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
              framerate: [ 0/1, 2147483647/1 ]
  
  SINK template: 'sink'
    Availability: Always
    Capabilities:
      image/jpeg
      video/x-h264
          stream-format: { (string)byte-stream }
              alignment: { (string)au }
      video/x-h265
          stream-format: { (string)byte-stream }
              alignment: { (string)au }
      video/mpeg
            mpegversion: 4
           systemstream: false
                 parsed: true
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
      video/mpeg
            mpegversion: [ 1, 2 ]
           systemstream: false
                 parsed: true
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
      video/x-divx
            divxversion: [ 4, 5 ]
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
      video/x-vp8
      video/x-vp9
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]

Element has no clocking capabilities.
Element has no URI handling capabilities.

Pads:
  SINK: 'sink'
    Pad Template: 'sink'
  SRC: 'src'
    Pad Template: 'src'

Element Properties:
  name                : The name of the object
                        flags: readable, writable
                        String. Default: "nvv4l2decoder0"
  parent              : The parent of the object
                        flags: readable, writable
                        Object of type "GstObject"
  device              : Device location
                        flags: readable
                        String. Default: "/dev/nvhost-nvdec"
  device-name         : Name of the device
                        flags: readable
                        String. Default: ""
  device-fd           : File descriptor of the device
                        flags: readable
                        Integer. Range: -1 - 2147483647 Default: -1 
  output-io-mode      : Output side I/O mode (matches sink pad)
                        flags: readable, writable
                        Enum "GstNvV4l2DecOutputIOMode" Default: 0, "auto"
                           (0): auto             - GST_V4L2_IO_AUTO
                           (2): mmap             - GST_V4L2_IO_MMAP
                           (3): userptr          - GST_V4L2_IO_USERPTR
  capture-io-mode     : Capture I/O mode (matches src pad)
                        flags: readable, writable
                        Enum "GstNvV4l2DecCaptureIOMode" Default: 0, "auto"
                           (0): auto             - GST_V4L2_IO_AUTO
                           (2): mmap             - GST_V4L2_IO_MMAP
  extra-controls      : Extra v4l2 controls (CIDs) for the device
                        flags: readable, writable
                        Boxed pointer of type "GstStructure"
  skip-frames         : Type of frames to skip during decoding
                        flags: readable, writable, changeable in NULL, READY, PAUSED or PLAYING state
                        Enum "SkipFrame" Default: 0, "decode_all"
                           (0): decode_all       - Decode all frames
                           (1): decode_non_ref   - Decode non-ref frames
                           (2): decode_key       - decode key frames
  drop-frame-interval : Interval to drop the frames,ex: value of 5 means every 5th frame will be given by decoder, rest all dropped
                        flags: readable, writable, changeable only in NULL or READY state
                        Unsigned Integer. Range: 0 - 30 Default: 0 
  num-extra-surfaces  : Additional number of surfaces in addition to min decode surfaces given by the v4l2 driver
                        flags: readable, writable, changeable only in NULL or READY state
                        Unsigned Integer. Range: 0 - 24 Default: 1 
  disable-dpb         : Set to disable DPB buffer for low latency
                        flags: readable, writable
                        Boolean. Default: false
  enable-full-frame   : Whether or not the data is full framed
                        flags: readable, writable
                        Boolean. Default: false
  enable-frame-type-reporting: Set to enable frame type reporting
                        flags: readable, writable
                        Boolean. Default: false
  enable-error-check  : Set to enable error check
                        flags: readable, writable
                        Boolean. Default: false
  enable-max-performance: Set to enable max performance
                        flags: readable, writable
                        Boolean. Default: false
  mjpeg               : Set to open MJPEG block
                        flags: readable, writable
                        Boolean. Default: false
gst-inspect-1.0 nvvidconv
Factory Details:
  Rank                     primary (256)
  Long-name                NvVidConv Plugin
  Klass                    Filter/Converter/Video/Scaler
  Description              Converts video from one colorspace to another & Resizes
  Author                   amit pandya <apandya@nvidia.com>

Plugin Details:
  Name                     nvvidconv
  Description              video Colorspace conversion & scaler
  Filename                 /usr/lib/aarch64-linux-gnu/gstreamer-1.0/libgstnvvidconv.so
  Version                  1.2.3
  License                  Proprietary
  Source module            gstreamer-nvvconv-plugin
  Binary package           GStreamer nvvconv Plugin
  Origin URL               http://nvidia.com/

GObject
 +----GInitiallyUnowned
       +----GstObject
             +----GstElement
                   +----GstBaseTransform
                         +----Gstnvvconv

Pad Templates:
  SINK template: 'sink'
    Availability: Always
    Capabilities:
      video/x-raw(memory:NVMM)
                 format: { (string)I420, (string)I420_10LE, (string)P010_10LE, (string)I420_12LE, (string)UYVY, (string)YUY2, (string)YVYU, (string)NV12, (string)NV16, (string)NV24, (string)GRAY8, (string)BGRx, (string)RGBA, (string)Y42B }
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
              framerate: [ 0/1, 2147483647/1 ]
      video/x-raw
                 format: { (string)I420, (string)UYVY, (string)YUY2, (string)YVYU, (string)NV12, (string)NV16, (string)NV24, (string)P010_10LE, (string)GRAY8, (string)BGRx, (string)RGBA, (string)Y42B }
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
              framerate: [ 0/1, 2147483647/1 ]
  
  SRC template: 'src'
    Availability: Always
    Capabilities:
      video/x-raw(memory:NVMM)
                 format: { (string)I420, (string)I420_10LE, (string)P010_10LE, (string)UYVY, (string)YUY2, (string)YVYU, (string)NV12, (string)NV16, (string)NV24, (string)GRAY8, (string)BGRx, (string)RGBA, (string)Y42B }
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
              framerate: [ 0/1, 2147483647/1 ]
      video/x-raw
                 format: { (string)I420, (string)UYVY, (string)YUY2, (string)YVYU, (string)NV12, (string)NV16, (string)NV24, (string)GRAY8, (string)BGRx, (string)RGBA, (string)Y42B }
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
              framerate: [ 0/1, 2147483647/1 ]

Element has no clocking capabilities.
Element has no URI handling capabilities.

Pads:
  SINK: 'sink'
    Pad Template: 'sink'
  SRC: 'src'
    Pad Template: 'src'

Element Properties:
  name                : The name of the object
                        flags: readable, writable
                        String. Default: "nvvconv0"
  parent              : The parent of the object
                        flags: readable, writable
                        Object of type "GstObject"
  qos                 : Handle Quality-of-Service events
                        flags: readable, writable
                        Boolean. Default: false
  silent              : Produce verbose output ?
                        flags: readable, writable
                        Boolean. Default: false
  flip-method         : video flip methods
                        flags: readable, writable, controllable
                        Enum "GstNvVideoFlipMethod" Default: 0, "none"
                           (0): none             - Identity (no rotation)
                           (1): counterclockwise - Rotate counter-clockwise 90 degrees
                           (2): rotate-180       - Rotate 180 degrees
                           (3): clockwise        - Rotate clockwise 90 degrees
                           (4): horizontal-flip  - Flip horizontally
                           (5): upper-right-diagonal - Flip across upper right/lower left diagonal
                           (6): vertical-flip    - Flip vertically
                           (7): upper-left-diagonal - Flip across upper left/lower right diagonal
  output-buffers      : number of output buffers
                        flags: readable, writable, changeable in NULL, READY, PAUSED or PLAYING state
                        Unsigned Integer. Range: 1 - 4294967295 Default: 4 
  interpolation-method: Set interpolation methods
                        flags: readable, writable, controllable
                        Enum "GstInterpolationMethod" Default: 0, "Nearest"
                           (0): Nearest          - Nearest
                           (1): Bilinear         - Bilinear
                           (2): 5-Tap            - 5-Tap
                           (3): 10-Tap           - 10-Tap
                           (4): Smart            - Smart
                           (5): Nicest           - Nicest
  left                : Pixels to crop at left
                        flags: readable, writable
                        Integer. Range: 0 - 2147483647 Default: 0 
  right               : Pixels to crop at right
                        flags: readable, writable
                        Integer. Range: 0 - 2147483647 Default: 0 
  top                 : Pixels to crop at top
                        flags: readable, writable
                        Integer. Range: 0 - 2147483647 Default: 0 
  bottom              : Pixels to crop at bottom
                        flags: readable, writable
                        Integer. Range: 0 - 2147483647 Default: 0 
  bl-output           : Blocklinear output, applicable only for memory:NVMM NV12 format output buffer
                        flags: readable, writable
                        Boolean. Default: true

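If the plugin descriptions are printed as above, the elements are available.
If you see this instead:

No such element or plugin 'nvv4l2decoder'

then JetPack or GStreamer is not set up correctly, and you need to reinstall GStreamer and the NVIDIA plugins.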
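
The same decision can be made in code from captured gst-inspect-1.0 output. The sketch below is only an illustration: the helper name elementLooksAvailable is hypothetical, and it relies on nothing more than the two markers visible in the output above ("Factory Details:" on success, "No such element or plugin" on failure).

```cpp
#include <string>

// Hypothetical helper: interprets the text printed by `gst-inspect-1.0 <element>`.
// Returns true only if the output contains a factory description and no
// "No such element or plugin" message.
bool elementLooksAvailable(const std::string& inspectOutput) {
    if (inspectOutput.find("No such element or plugin") != std::string::npos) {
        return false;
    }
    return inspectOutput.find("Factory Details:") != std::string::npos;
}
```

In a program that already links against GStreamer, calling gst_element_factory_find with the element name and checking for a non-null result is the more direct check.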


2. Check the CUDA version

You can verify it from Python:

python3 -c "import cv2; print(cv2.getBuildInformation())" | grep CUDA

You should see:

  NVIDIA CUDA:                   YES (ver 10.2, CUFFT CUBLAS FAST_MATH)
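
Since the rest of this article is C++, the same check can be done on the build-information text there (cv::getBuildInformation() returns the same string). The parser below is a sketch under assumptions: the names CudaBuildInfo and parseCudaBuildLine are hypothetical, and the line format is taken from the sample output above.

```cpp
#include <cstddef>
#include <string>

struct CudaBuildInfo {
    bool enabled = false;   // true if the "NVIDIA CUDA:" line says YES
    std::string version;    // e.g. "10.2"; empty if not reported
};

// Hypothetical helper: finds the "NVIDIA CUDA:" line in OpenCV's build
// information and extracts the YES/NO flag and the "ver X.Y" value.
CudaBuildInfo parseCudaBuildLine(const std::string& buildInfo) {
    CudaBuildInfo info;
    const std::string key = "NVIDIA CUDA:";
    const std::size_t pos = buildInfo.find(key);
    if (pos == std::string::npos) return info;

    const std::size_t start = pos + key.size();
    const std::size_t eol = buildInfo.find('\n', start);
    const std::string line = buildInfo.substr(
        start, eol == std::string::npos ? std::string::npos : eol - start);

    info.enabled = line.find("YES") != std::string::npos;
    const std::string verKey = "ver ";
    const std::size_t v = line.find(verKey);
    if (info.enabled && v != std::string::npos) {
        const std::size_t vs = v + verKey.size();
        const std::size_t ve = line.find_first_of(",)", vs);
        info.version = line.substr(
            vs, ve == std::string::npos ? std::string::npos : ve - vs);
    }
    return info;
}
```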

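3. Streaming service

1️⃣ First, the server
An RTSP server must be running first (mediamtx is the most common choice).
It listens on rtsp://0.0.0.0:8554 and waits for clients to push streams to it.

2️⃣ Then, the pusher
The Jetson uses appsrc + x264enc + rtspclientsink to push to rtsp://192.168.10.20:8554/live.

3️⃣ Finally, the player
Elsewhere, pull rtsp://192.168.10.20:8554/live with gst-launch-1.0 rtspsrc or VLC.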

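# Download the binary (release file names may include a version; check the releases page if this URL fails)
wget https://github.com/bluenviron/mediamtx/releases/latest/download/mediamtx_linux_arm64.tar.gz
tar -xzvf mediamtx_linux_arm64.tar.gz
# Start
./mediamtx

By default it listens on rtsp://0.0.0.0:8554

Run the ./demo executable built from the code in this article (main.cpp in section 3), then check the port status.

sudo lsof -i :8554

Output: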

COMMAND    PID   USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
mediamtx 25235 nvidia    7u  IPv6 525958      0t0  TCP *:8554 (LISTEN)

Here I use the RTMP service. Remember to change the IP to your actual IP.

Check that the streaming service works by verifying the RTMP server:

gst-launch-1.0 videotestsrc ! videoconvert ! video/x-raw,format=I420 ! x264enc tune=zerolatency ! flvmux streamable=true ! rtmpsink location="rtmp://192.168.10.236:1935/live/mystream"

Then, in another terminal:

sudo apt install ffmpeg
ffplay rtmp://192.168.10.236:1935/live/mystream

Or open the link in VLC:

rtmp://192.168.10.236:1935/live/mystream

The player should show the videotestsrc test pattern.

  • If this test plays, then mediamtx, the network, and ffplay are all working.

✅ 3. Minimal experiment

Start with a minimal, runnable, playable pure GStreamer + OpenCV push example that verifies the whole appsrc → x264enc → flvmux → rtmpsink chain.

main.cpp:

#include <gst/gst.h>
#include <gst/app/gstappsrc.h>
#include <opencv2/opencv.hpp>
#include <chrono>
#include <cstring>
#include <iostream>
#include <string>
#include <thread>

using namespace cv;

int main() {
    gst_init(nullptr, nullptr);

    int width = 1280;
    int height = 720;
    int fps = 30;

    // ---- Build the pipeline ----
    std::string launch =
        "appsrc name=mysrc is-live=true block=true format=time ! "
        "videoconvert ! video/x-raw,format=I420 ! "
        "x264enc speed-preset=ultrafast tune=zerolatency ! "
        "flvmux streamable=true name=mux ! "
        "rtmpsink location=rtmp://192.168.10.236:1935/live/mystream";

    GError* err = nullptr;
    GstElement* pipeline = gst_parse_launch(launch.c_str(), &err);
    if (!pipeline) {
        std::cerr << "Failed to create pipeline: "
                  << (err ? err->message : "unknown error") << std::endl;
        g_clear_error(&err);
        return -1;
    }
    g_clear_error(&err);  // may be set for recoverable parse problems too

    GstElement* appsrc = gst_bin_get_by_name(GST_BIN(pipeline), "mysrc");

    // Set the caps of the frames pushed into appsrc
    GstCaps* caps = gst_caps_new_simple("video/x-raw",
                                        "format", G_TYPE_STRING, "I420",
                                        "width", G_TYPE_INT, width,
                                        "height", G_TYPE_INT, height,
                                        "framerate", GST_TYPE_FRACTION, fps, 1,
                                        nullptr);
    gst_app_src_set_caps(GST_APP_SRC(appsrc), caps);
    gst_caps_unref(caps);

    gst_element_set_state(pipeline, GST_STATE_PLAYING);

    guint64 timestamp = 0;

    // A black image stands in for real video
    Mat blackBGR(height, width, CV_8UC3, Scalar(0, 0, 0));

    for (int i = 0; i < fps * 10; ++i) {  // push for 10 seconds only
        // BGR -> I420 (a single-channel Mat of (height * 3 / 2) rows)
        Mat frameI420;
        cvtColor(blackBGR, frameI420, COLOR_BGR2YUV_I420);
        const size_t size = frameI420.total() * frameI420.elemSize();

        // Allocate a buffer and copy the frame into it
        GstBuffer* buffer = gst_buffer_new_allocate(nullptr, size, nullptr);
        GstMapInfo map;
        gst_buffer_map(buffer, &map, GST_MAP_WRITE);
        std::memcpy(map.data, frameI420.data, size);
        gst_buffer_unmap(buffer, &map);

        GST_BUFFER_PTS(buffer) = timestamp;
        GST_BUFFER_DURATION(buffer) = gst_util_uint64_scale_int(1, GST_SECOND, fps);
        timestamp += GST_BUFFER_DURATION(buffer);

        GstFlowReturn ret;
        g_signal_emit_by_name(appsrc, "push-buffer", buffer, &ret);
        gst_buffer_unref(buffer);  // the push-buffer signal does not take ownership

        if (ret != GST_FLOW_OK) {
            std::cerr << "Push buffer failed!" << std::endl;
            break;
        }

        std::this_thread::sleep_for(std::chrono::milliseconds(1000 / fps));
    }

    // Signal end-of-stream and give the pipeline time to drain before shutdown
    gst_app_src_end_of_stream(GST_APP_SRC(appsrc));
    GstBus* bus = gst_element_get_bus(pipeline);
    GstMessage* msg = gst_bus_timed_pop_filtered(
        bus, 5 * GST_SECOND,
        static_cast<GstMessageType>(GST_MESSAGE_EOS | GST_MESSAGE_ERROR));
    if (msg) gst_message_unref(msg);
    gst_object_unref(bus);

    gst_element_set_state(pipeline, GST_STATE_NULL);
    gst_object_unref(appsrc);
    gst_object_unref(pipeline);

    std::cout << "Done." << std::endl;
    return 0;
}

CMakeLists.txt

cmake_minimum_required(VERSION 3.10)
project(demo)

find_package(OpenCV REQUIRED)
find_package(PkgConfig REQUIRED)
pkg_check_modules(GST REQUIRED gstreamer-1.0>=1.14 gstreamer-app-1.0)

find_package(Threads REQUIRED)

include_directories(
    ${OpenCV_INCLUDE_DIRS}
    ${GST_INCLUDE_DIRS}
)

add_executable(demo main.cpp)

target_link_libraries(demo
    ${OpenCV_LIBS}
    ${GST_LIBRARIES}
    ${CMAKE_THREAD_LIBS_INIT}
)

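In another terminal:

ffplay rtmp://192.168.10.236:1935/live/mystream

Or enter in VLC: rtmp://192.168.10.236:1935/live/mystream

If this works, it shows that:

  1. the appsrc → x264enc → flvmux → rtmpsink chain has no structural bug;

  2. the mediamtx service, the network, and the push URL are fine;

  3. all that remains is to replace the black frame with the stitched remap result.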



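✅ 4. Main pipeline principles

The core technical thread of "convert to I420 → appsrc → x264enc → flvmux → rtmpsink":

🎯 1️⃣ Why convert to I420?

  • OpenCV works with BGR by default, while cameras and video files often deliver BGR or YUV.

  • Video encoders (H.264) usually expect YUV input, most typically I420 (i.e. YUV420P).

  • I420 layout: the Y (luma) plane comes first, followed by the U and V chroma planes; U and V each have 1/4 of the Y resolution (half the width, half the height).

  • Converting to I420 saves bandwidth (12 bits per pixel instead of 24 for BGR) and gives the H.264 encoder the input it expects, avoiding wrong colors.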
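
The plane layout described above can be worked out numerically. The sketch below assumes a tightly packed I420 buffer with even width and height, which is what cv::cvtColor produces; the names I420Layout and computeI420Layout are illustrative only.

```cpp
#include <cassert>
#include <cstddef>

// Byte sizes and offsets of the three planes in a tightly packed I420 frame.
struct I420Layout {
    std::size_t ySize;      // width * height
    std::size_t uSize;      // (width / 2) * (height / 2)
    std::size_t vSize;      // same as uSize
    std::size_t uOffset;    // U plane starts right after Y
    std::size_t vOffset;    // V plane starts right after U
    std::size_t totalSize;  // width * height * 3 / 2
};

// Hypothetical helper: computes the layout for even frame dimensions.
I420Layout computeI420Layout(int width, int height) {
    assert(width > 0 && height > 0 && width % 2 == 0 && height % 2 == 0);
    const std::size_t w = static_cast<std::size_t>(width);
    const std::size_t h = static_cast<std::size_t>(height);

    I420Layout l{};
    l.ySize = w * h;
    l.uSize = (w / 2) * (h / 2);
    l.vSize = l.uSize;
    l.uOffset = l.ySize;
    l.vOffset = l.uOffset + l.uSize;
    l.totalSize = l.vOffset + l.vSize;
    return l;
}
```

For the 1280x720 frames used in main.cpp this gives 1,382,400 bytes per frame, half of the 2,764,800 bytes of the BGR image. As far as I know, GStreamer's default I420 strides are rounded up to multiples of 4, so a packed buffer copied with a single memcpy matches only when the width is a multiple of 8; 1280 satisfies this.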


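🎯 2️⃣ The role of appsrc:

  • appsrc is GStreamer's "user-space data source".

  • That is, GStreamer does not capture from a camera or read a file itself; your C++ code feeds it frames in a loop.

  • It wraps your in-memory frame in a GstBuffer and inserts it into the downstream pipeline.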
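
Every buffer pushed through appsrc needs a timestamp. main.cpp accumulates a truncated per-frame duration (33,333,333 ns at 30 fps), which falls behind by 10 ns per second. That is negligible for a demo, but deriving the timestamp from the frame index avoids the drift altogether. A minimal sketch, with hypothetical helper names:

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kNsPerSecond = 1000000000ULL;

// Presentation timestamp of frame `frameIndex` in nanoseconds.
// Computed from the index, so rounding errors do not accumulate.
// (The multiplication would overflow only after about 1.8e10 frames.)
std::uint64_t framePtsNs(std::uint64_t frameIndex, std::uint32_t fps) {
    assert(fps > 0);
    return frameIndex * kNsPerSecond / fps;
}

// Duration of frame `frameIndex`: the gap to the next frame's timestamp.
// Durations differ by at most 1 ns so that they always sum to the exact PTS.
std::uint64_t frameDurationNs(std::uint64_t frameIndex, std::uint32_t fps) {
    return framePtsNs(frameIndex + 1, fps) - framePtsNs(frameIndex, fps);
}
```

In the push loop these values would be assigned to GST_BUFFER_PTS and GST_BUFFER_DURATION in place of the running timestamp variable.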


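🎯 3️⃣ What x264enc does:

  • x264enc is the GStreamer element that wraps the x264 library.

  • It compresses raw I420 frames into H.264 encoded frames.

  • speed-preset=ultrafast and tune=zerolatency are a typical low-latency live-streaming configuration:

    • ultrafast trades compression ratio for speed;

    • zerolatency disables B-frames and similar buffering so that output is produced in real time.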


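🎯 4️⃣ flvmux:

  • flvmux wraps the compressed video (H.264) in the FLV container format.

  • The RTMP push protocol requires FLV packaging before sending.

  • Without the mux there is only a raw H.264 stream, which does not meet the RTMP requirement.


🎯 5️⃣ rtmpsink:

  • rtmpsink pushes the packaged FLV stream over RTMP to a streaming server (e.g. Nginx-RTMP, MediaMTX, SRS).

  • This step turns your self-produced frames into an RTMP live stream that viewers and players can pull.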

🎯 6️⃣ Data path:

[OpenCV Mat (BGR)] 
   → cv::cvtColor → I420
      → GstBuffer
         → appsrc 
            → x264enc 
               → flvmux 
                  → rtmpsink → RTMP push to the streaming server
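
The GStreamer part of this data path can be written as a parameterized launch string instead of the hard-coded one in main.cpp. This is a sketch under assumptions: the helper name buildRtmpPushPipeline, the bitrate parameter and the one-keyframe-per-second choice are illustrative. Note that x264enc takes its bitrate in kbit/s.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: appsrc -> videoconvert (I420) -> x264enc -> flvmux -> rtmpsink.
std::string buildRtmpPushPipeline(const std::string& rtmpUrl,
                                  int fps,
                                  int bitrateKbps) {
    assert(!rtmpUrl.empty() && fps > 0 && bitrateKbps > 0);

    std::string s;
    s += "appsrc name=mysrc is-live=true block=true format=time ! ";
    s += "videoconvert ! video/x-raw,format=I420 ! ";
    s += "x264enc speed-preset=ultrafast tune=zerolatency";
    s += " bitrate=" + std::to_string(bitrateKbps);       // kbit/s
    s += " key-int-max=" + std::to_string(fps) + " ! ";   // one keyframe per second (assumed)
    s += "flvmux streamable=true name=mux ! ";
    s += "rtmpsink location=" + rtmpUrl;
    return s;
}
```

A shorter keyframe interval lets players that join mid-stream start decoding sooner, at the cost of a higher bitrate for the same quality.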

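Summary

The whole process is:

1️⃣ Capture/process frames (stitching, undistortion)
2️⃣ Convert the format (BGR → I420)
3️⃣ Wrap the frame in a GstBuffer
4️⃣ Inject it into GStreamer through appsrc
5️⃣ Compress to H.264 with x264enc
6️⃣ Package as FLV with flvmux
7️⃣ Send to the server over RTMP with rtmpsink

What the chain gives you:

  • You control how frames are generated and what they contain (custom graphics, stitching results, and watermarks are all possible)

  • GStreamer provides stable, low-latency, cross-platform encoding

  • RTMP is compatible with the vast majority of live players, web pages, and streaming platforms

  • As long as bandwidth is sufficient, Jetson's hardware codecs + CUDA can absorb most of the compute load

I420 → appsrc → x264enc → flvmux → rtmpsink
= a simple, controllable, low-latency pattern for in-house surround-view stitching + industrial vision streaming.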
