By comparing VINS-Mono with its RGBD version, this post analyzes the reasoning behind the modifications.
1. feature_tracker
- feature_tracker_node.cpp
The headers add ROS's multi-sensor time-synchronization facilities:
#include <message_filters/time_synchronizer.h>
#include <message_filters/synchronizer.h>
#include <message_filters/sync_policies/approximate_time.h>
In the main program,
ros::Subscriber sub_img = n.subscribe(IMAGE_TOPIC, 100, img_callback);
is replaced with
message_filters::Subscriber<sensor_msgs::Image> sub_image(n, IMAGE_TOPIC, 1);
message_filters::Subscriber<sensor_msgs::Image> sub_depth(n, DEPTH_TOPIC, 1);
typedef message_filters::sync_policies::ApproximateTime<sensor_msgs::Image,sensor_msgs::Image> syncPolicy;
message_filters::Synchronizer<syncPolicy> sync(syncPolicy(10), sub_image, sub_depth);
sync.registerCallback(boost::bind(&img_callback, _1, _2));
Reference: kint_zhao's CSDN blog post on ROS message time synchronization and callbacks.
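For reference, a minimal self-contained node wiring this up could look like the following (topic names here are placeholders; in VINS they come from IMAGE_TOPIC / DEPTH_TOPIC read out of the config):

#include <ros/ros.h>
#include <sensor_msgs/Image.h>
#include <message_filters/subscriber.h>
#include <message_filters/synchronizer.h>
#include <message_filters/sync_policies/approximate_time.h>
#include <boost/bind.hpp>

void img_callback(const sensor_msgs::ImageConstPtr &color_msg,
                  const sensor_msgs::ImageConstPtr &depth_msg)
{
    // color_msg and depth_msg arrive as a time-aligned pair
}

int main(int argc, char **argv)
{
    ros::init(argc, argv, "sync_demo");
    ros::NodeHandle n;
    // Placeholder topics; VINS reads IMAGE_TOPIC / DEPTH_TOPIC from parameters.
    message_filters::Subscriber<sensor_msgs::Image> sub_image(n, "/camera/color/image_raw", 1);
    message_filters::Subscriber<sensor_msgs::Image> sub_depth(n, "/camera/aligned_depth_to_color/image_raw", 1);
    typedef message_filters::sync_policies::ApproximateTime<sensor_msgs::Image, sensor_msgs::Image> syncPolicy;
    // The synchronizer must stay alive while spinning, hence it lives in main().
    message_filters::Synchronizer<syncPolicy> sync(syncPolicy(10), sub_image, sub_depth);
    sync.registerCallback(boost::bind(&img_callback, _1, _2));
    ros::spin();
    return 0;
}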
The callback also needs to change to receive the depth, and img_msg is renamed color_msg throughout:
void img_callback(const sensor_msgs::ImageConstPtr &img_msg)
void img_callback(const sensor_msgs::ImageConstPtr &color_msg, const sensor_msgs::ImageConstPtr &depth_msg)
The following is added to convert the depth ROS message into a cv::Mat as well; the message is re-wrapped into a fresh sensor_msgs::Image whose encoding is set to MONO16 (depth drivers often publish 16UC1, and relabeling lets cv_bridge return a CV_16UC1 cv::Mat), and depth_ptr is created to point at the result:
cv_bridge::CvImageConstPtr depth_ptr;
{
sensor_msgs::Image img;
img.header = depth_msg->header;
img.height = depth_msg->height;
img.width = depth_msg->width;
img.is_bigendian = depth_msg->is_bigendian;
img.step = depth_msg->step;
img.data = depth_msg->data;
img.encoding = sensor_msgs::image_encodings::MONO16;
depth_ptr = cv_bridge::toCvCopy(img, sensor_msgs::image_encodings::MONO16);
}
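With depth_ptr holding a CV_16UC1 image, a raw depth value can then be read per pixel like this (a sketch; a value of 0 typically marks pixels with no valid depth):

unsigned short d = depth_ptr->image.at<unsigned short>(v, u);  // row v, column u, in millimeters
if (d == 0)
{
    // no valid depth measurement at this pixel
}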
In the readImage() call below, depth_ptr is added:
trackerData[i].readImage(ptr->image.rowRange(ROW * i, ROW * (i + 1)), img_msg->header.stamp.toSec());
trackerData[i].readImage(ptr->image.rowRange(ROW * i, ROW * (i + 1)), depth_ptr->image.rowRange(ROW * i, ROW * (i + 1)), color_msg->header.stamp.toSec());
Only used for stereo, so this change is optional (original first, then the modified version that also slices the depth image):
trackerData[i].cur_img = ptr->image.rowRange(ROW * i, ROW * (i + 1));
trackerData[i].cur_img = ptr->image.rowRange(ROW * i, ROW * (i + 1));
trackerData[i].cur_depth = depth_ptr->image.rowRange(ROW * i, ROW * (i + 1));
After readImage(), the data for the prev, cur and forw frames have basically all been obtained and matched up. They then need to be packed into sensor_msgs and published, to be received by vins_estimator_node and rviz for back-end processing and visualization respectively.
During publishing, in the code that builds feature_points to feed the back end, depth is additionally packed into the sensor_msgs channels:
if (PUB_THIS_FRAME)
{
cv::Mat show_depth = depth_ptr->image;
...
sensor_msgs::ChannelFloat32 velocity_x_of_point;
sensor_msgs::ChannelFloat32 velocity_y_of_point;
sensor_msgs::ChannelFloat32 depth_of_point;
This ROS message format is used to store the information, with depth added: the depth of each 2D pixel (u, v) is pushed into depth_of_point.values:
for (int i = 0; i < NUM_OF_CAM; i++)
...
velocity_x_of_point.values.push_back(pts_velocity[j].x);
velocity_y_of_point.values.push_back(pts_velocity[j].y);
depth_of_point.values.push_back((int)show_depth.at<unsigned short>(round(cur_pts[j].y), round(cur_pts[j].x)));
The feature id, the undistorted 3D point on the normalized plane (x, y, z=1), the 2D pixel (u, v), the pixel velocity (vx, vy), and now the depth are packed into the sensor_msgs::PointCloudPtr instance feature_points, which is published on pub_img:
feature_points->channels.push_back(velocity_x_of_point);
feature_points->channels.push_back(velocity_y_of_point);
feature_points->channels.push_back(depth_of_point);
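The estimator indexes these channels by position; the resulting layout (the first three channels already exist in stock VINS-Mono):

// channels[0]: id_of_point (feature id, also encodes the camera id)
// channels[1]: u_of_point (pixel u)
// channels[2]: v_of_point (pixel v)
// channels[3]: velocity_x_of_point
// channels[4]: velocity_y_of_point
// channels[5]: depth_of_point (raw depth in millimeters)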
- feature_tracker.h
The header shows that readImage() was changed and per-frame depth images were added for the features:
void readImage(const cv::Mat &_img, const cv::Mat &_depth, double _cur_time);
...
cv::Mat prev_depth, cur_depth, forw_depth;
Next, let's look at what exactly readImage() adds.
- feature_tracker.cpp
If this is the first input image, prev_img is unused, and so is prev_depth:
if (forw_img.empty())
{
prev_img = cur_img = forw_img = img;
prev_depth = cur_depth = forw_depth = _depth;
}
else
{
forw_img = img;
forw_depth = _depth;
}
At the end of the function, after all feature points have been undistorted and their velocities computed, the buffers step forward to receive the new frame, and depth moves along with them:
prev_img = cur_img;
prev_depth = cur_depth;
prev_pts = cur_pts;
prev_un_pts = cur_un_pts;
cur_img = forw_img;
cur_depth = forw_depth;
cur_pts = forw_pts;
undistortedPoints();
prev_time = cur_time;
- parameters.h
A depth topic is added, used by feature_tracker_node.cpp:
extern std::string DEPTH_TOPIC;
- parameters.cpp
fsSettings["depth_topic"] >> DEPTH_TOPIC;
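In the config YAML this corresponds to an entry such as the following (the topic name is just an example for a RealSense aligned-depth stream):

depth_topic: "/camera/aligned_depth_to_color/image_raw"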
2. vins-estimator
- estimator_node.cpp
void process() starts a VIO thread (the loop-closure part is not covered here).
All feature-point information of each image is extracted into the map container image, which is then passed to estimator.processImage(image, img_msg->header); the depth is now stored as well, adding one dimension:
map<int, vector<pair<int, Eigen::Matrix<double, 7, 1>>>> image;
map<int, vector<pair<int, Eigen::Matrix<double, 8, 1>>>> image;
double velocity_x = img_msg->channels[3].values[i];
double velocity_y = img_msg->channels[4].values[i];
double depth = img_msg->channels[5].values[i] / 1000.0;
The emplace_back into the image container also changes (note the division by 1000.0 above: the channel carries the raw unsigned-short depth in millimeters, converted here to meters):
Eigen::Matrix<double, 8, 1> xyz_uv_velocity_depth;
xyz_uv_velocity_depth << x, y, z, p_u, p_v, velocity_x, velocity_y, depth;
image[feature_id].emplace_back(camera_id, xyz_uv_velocity_depth);
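For context, these new lines sit inside VINS-Mono's existing per-point loop in process(), which first decodes the feature and camera ids from channels[0]:

for (unsigned int i = 0; i < img_msg->points.size(); i++)
{
    int v = img_msg->channels[0].values[i] + 0.5;
    int feature_id = v / NUM_OF_CAM;   // channels[0] packs feature id and camera id
    int camera_id = v % NUM_OF_CAM;
    double x = img_msg->points[i].x;   // normalized-plane coordinates, z = 1
    double y = img_msg->points[i].y;
    double z = img_msg->points[i].z;
    double p_u = img_msg->channels[1].values[i];
    double p_v = img_msg->channels[2].values[i];
    // ... velocity_x, velocity_y, depth and the emplace_back shown above
}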
All of this information was produced in feature_tracker_node.cpp.
- estimator.h
The Sophus library for Lie algebra is added:
#include <sophus/se3.h>
#include <sophus/so3.h>
using Sophus::SE3;
using Sophus::SO3;
processImage(), which decides whether a frame is a keyframe, gains a depth threshold check; its signature now takes the 8x1 per-feature vector:
void processImage(const map<int, vector<pair<int, Eigen::Matrix<double, 8, 1>>>> &image, const std_msgs::Header &header);
Modeled on visualInitialAlign(), a visualInitialAlignWithDepth() is added:
bool visualInitialAlignWithDepth();
- estimator.cpp
initialStructure()
In the SFMFeature tmp_feature used for the subsequent SfM, the SFMFeature struct (in initial_sfm.h) gains an observation_depth member to store depth, pairing the current feature's depth with the current frame index imu_j:
tmp_feature.observation.push_back(make_pair(imu_j, Eigen::Vector2d{pts_j.x(), pts_j.y()}));
tmp_feature.observation.push_back(make_pair(imu_j, Eigen::Vector2d{pts_j.x(), pts_j.y()}));
tmp_feature.observation_depth.push_back(make_pair(imu_j, it_per_frame.depth));
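For reference, the struct in initial_sfm.h then looks roughly like this (members other than observation_depth are from stock VINS-Mono; a minimal sketch):

struct SFMFeature
{
    bool state;                                   // set once the point is triangulated
    int id;
    vector<pair<int, Vector2d>> observation;      // frame index -> observation on the normalized plane
    vector<pair<int, double>> observation_depth;  // added: frame index -> measured depth
    double position[3];
    double depth;
};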
Initialization now calls the modified visualInitialAlignWithDepth():
if (visualInitialAlign())
if (visualInitialAlignWithDepth())
bool Estimator::visualInitialAlign() was changed as well.
??? Why? Hasn't initialization already switched to visualInitialAlignWithDepth()?
f_manager.triangulate(Ps, &(TIC_TMP[0]), &(RIC[0]));
f_manager.triangulateWithDepth(Ps, &(TIC_TMP[0]), &(RIC[0]));
Changes of visualInitialAlignWithDepth() relative to visualInitialAlign():
It switches to the new triangulateWithDepth() to triangulate the feature points and obtain depth:
f_manager.triangulate(Ps, &(TIC_TMP[0]), &(RIC[0]));
f_manager.triangulateWithDepth(Ps, &(TIC_TMP[0]), &(RIC[0]));
The assumed scale factor s is removed, since the depth is now directly measured:
double s = (x.tail<1>())(0);
Correspondingly, when aligning all translations to frame 0 of the sliding window, s is dropped.
??? Why align at all? Presumably so every window state is expressed relative to frame 0, with the Rs[i] * TIC[0] term shifting the position from the camera center to the IMU center.
Ps[i] = s * Ps[i] - Rs[i] * TIC[0] - (s * Ps[0] - Rs[0] * TIC[0]);
Ps[i] = Ps[i] - Rs[i] * TIC[0] - (Ps[0] - Rs[0] * TIC[0]);
The step that "restores the scale-ambiguous 3D points to true scale" is removed:
for (auto &it_per_id : f_manager.feature)
{
it_per_id.used_num = it_per_id.feature_per_frame.size();
if (!(it_per_id.used_num >= 2 && it_per_id.start_frame < WINDOW_SIZE - 2))
continue;
it_per_id.estimated_depth *= s;
}
In bool Estimator::relativePose(),
when returning in corres the per-frame coordinates of the features seen by both frame_count_l and frame_count_r, the new getCorrespondingWithDepth() is used:
corres = f_manager.getCorresponding(i, WINDOW_SIZE);
corres = f_manager.getCorrespondingWithDepth(i, WINDOW_SIZE);
The points used to compute parallax change from pts_0(u0, v0), pts_1(u1, v1) to pts_0(u0/z0, v0/z0), pts_1(u1/z1, v1/z1); corres now carries 3D points, so dividing by z brings them back onto the normalized plane:
Vector2d pts_0(corres[j].first(0), corres[j].first(1));
Vector2d pts_1(corres[j].second(0), corres[j].second(1));
Vector2d pts_0(corres[j].first(0)/corres[j].first(2), corres[j].first(1)/corres[j].first(2));
Vector2d pts_1(corres[j].second(0)/corres[j].second(2), corres[j].second(1)/corres[j].second(2));
Originally, the 2D-2D essential matrix (epipolar geometry) recovered R and T from the current frame to the pivot frame; this now switches to PnP, because depth is available (stored in corres[?].second(2)), turning it into a 3D-2D problem:
if(average_parallax * 460 > 30 && m_estimator.solveRelativeRT(corres, relative_R, relative_T))
if(average_parallax * 460 > 30 && m_estimator.solveRelativeRT_PNP(corres, relative_R, relative_T))
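A sketch of what a 3D-2D relative-pose solve can look like under these conventions (an illustration, not the project's actual solveRelativeRT_PNP; the minimum-count and RANSAC thresholds are assumptions):

#include <Eigen/Dense>
#include <opencv2/opencv.hpp>
#include <opencv2/core/eigen.hpp>
#include <vector>

// corres pairs a 3D point in frame l (first) with an observation in
// frame r (second, normalized once divided by its z component).
bool solveRelativeRT_PNP(const std::vector<std::pair<Eigen::Vector3d, Eigen::Vector3d>> &corres,
                         Eigen::Matrix3d &rotation, Eigen::Vector3d &translation)
{
    if (corres.size() < 15)
        return false;
    std::vector<cv::Point3f> pts3d;
    std::vector<cv::Point2f> pts2d;
    for (const auto &c : corres)
    {
        pts3d.emplace_back(c.first(0), c.first(1), c.first(2));
        pts2d.emplace_back(c.second(0) / c.second(2), c.second(1) / c.second(2));
    }
    cv::Mat rvec, tvec;
    cv::Mat K = cv::Mat::eye(3, 3, CV_64F);  // identity: points are on the normalized plane
    if (!cv::solvePnPRansac(pts3d, pts2d, K, cv::Mat(), rvec, tvec,
                            false, 100, 1.0f / 460, 0.99))
        return false;
    cv::Mat R_cv;
    cv::Rodrigues(rvec, R_cv);
    Eigen::Matrix3d R;
    Eigen::Vector3d t;
    cv::cv2eigen(R_cv, R);
    cv::cv2eigen(tvec, t);
    // PnP yields the pose of frame l's points in frame r; invert so the
    // outputs match solveRelativeRT's convention (frame r w.r.t. frame l).
    rotation = R.transpose();
    translation = -R.transpose() * t;
    return true;
}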
In solveOdometry(),
when triangulating the features that should have been triangulated but have not been yet, a triangulateWithDepth() pass is added first; points beyond the depth sensor's measuring range fall back to the original triangulate():
f_manager.triangulate(Ps, tic, ric);
f_manager.triangulateWithDepth(Ps, tic, ric);
f_manager.triangulate(Ps, tic, ric);
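The gist of triangulateWithDepth() is to take the depth directly from the measurement instead of multi-view triangulation; a simplified sketch (the range bounds are illustrative, and the real function also cross-checks depth consistency across observations):

void FeatureManager::triangulateWithDepth(Vector3d Ps[], Vector3d tic[], Matrix3d ric[])
{
    for (auto &it_per_id : feature)
    {
        if (it_per_id.estimated_depth > 0)
            continue;                                    // already has a depth estimate
        double d = it_per_id.feature_per_frame[0].depth; // sensor depth at the first observation
        // Trust the sensor only inside its reliable range (illustrative bounds).
        if (d > 0.1 && d < 10.0)
            it_per_id.estimated_depth = d;
        // Points outside the range are left for the regular triangulate() that follows.
    }
}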
In optimization() (nonlinear optimization with Ceres),
when defining the parameter blocks to optimize (the analogue of g2o vertices), whether or not the sensor time offset is estimated, after adding the parameter blocks of the constrained variables (the feature's first observing frame and one other observing frame, plus the extrinsics and the feature's inverse depth), the feature depth para_Feature is now held constant during optimization whenever depth input is available:
if (ESTIMATE_TD)
{
ProjectionTdFactor *f_td = new ProjectionTdFactor(pts_i, pts_j, it_per_id.feature_per_frame[0].velocity, it_per_frame.velocity,
it_per_id.feature_per_frame[0].cur_td, it_per_frame.cur_td,
it_per_id.feature_per_frame[0].uv.y(), it_per_frame.uv.y());
problem.AddResidualBlock(f_td, loss_function, para_Pose[imu_i], para_Pose[imu_j], para_Ex_Pose[0], para_Feature[feature_index], para_Td[0]);
}
else
{
ProjectionFactor *f = new ProjectionFactor(pts_i, pts_j);
problem.AddResidualBlock(f, loss_function, para_Pose[imu_i], para_Pose[imu_j], para_Ex_Pose[0], para_Feature[feature_index]);
}
if (ESTIMATE_TD)
{
    ProjectionTdFactor *f_td = new ProjectionTdFactor(pts_i, pts_j, it_per_id.feature_per_frame[0].velocity, it_per_frame.velocity,
                                                      it_per_id.feature_per_frame[0].cur_td, it_per_frame.cur_td,
                                                      it_per_id.feature_per_frame[0].uv.y(), it_per_frame.uv.y());
    problem.AddResidualBlock(f_td, loss_function, para_Pose[imu_i], para_Pose[imu_j], para_Ex_Pose[0], para_Feature[feature_index], para_Td[0]);
    // Reconstructed beyond this point from the description above: features
    // whose depth came from the sensor have their depth frozen during the
    // optimization. The estimate_flag == 1 gate is an assumption based on
    // the VINS-RGBD source; the exact condition may differ.
    if (it_per_id.estimate_flag == 1)
        problem.SetParameterBlockConstant(para_Feature[feature_index]);
}
and the same SetParameterBlockConstant call goes into the else branch without td.