Reproducing the FoundationPose Algorithm

-俊后生-

Last modified 2024-08-17 22:01:27


Tags: computer vision, artificial intelligence, python

First published 2024-04-08 14:10:12

Article link: https://blog.youkuaiyun.com/qq_41977396/article/details/137501249


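This article documents how to reproduce FoundationPose, the top-ranked algorithm on the BOP leaderboard: downloading the source code, setting up the environment, installing dependencies, building the extensions, and resolving an OpenCV error encountered along the way.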


Notes on reproducing FoundationPose

FoundationPose GitHub repository
 Original paper

I reproduced FoundationPose, currently the top-ranked algorithm on the BOP leaderboard, and am briefly recording the process here.

First, download the source code:

git clone https://github.com/NVlabs/FoundationPose.git
cd FoundationPose

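Inside FoundationPose, create two folders, demo_data and weights. Put the downloaded weights into weights and extract them there, then put the test data into demo_data and extract it.
Environment setup: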
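The folder setup described above can also be scripted. The following is only a minimal sketch, assuming the weights and test data were downloaded as archives in a format that `shutil.unpack_archive` recognizes (for example .zip or .tar.gz). The helper name `prepare_dirs` and the archive paths are hypothetical and not part of the FoundationPose repository.

```python
import shutil
from pathlib import Path


def prepare_dirs(repo_root, archives):
    """Create folders under repo_root and unpack one archive into each.

    archives maps a folder name ("weights", "demo_data") to the path of a
    downloaded archive. Both the mapping and the paths are assumptions.
    """
    repo_root = Path(repo_root)
    created = []
    for folder, archive in archives.items():
        target = repo_root / folder
        target.mkdir(parents=True, exist_ok=True)
        # unpack_archive infers the archive format from the file extension
        shutil.unpack_archive(str(archive), str(target))
        created.append(target)
    return created
```

A call might look like `prepare_dirs("FoundationPose", {"weights": "weights.zip", "demo_data": "demo_data.zip"})`, where the two archive names are placeholders for whatever files you actually downloaded.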

# create conda environment
conda create -n foundationpose python=3.9

# activate conda environment
conda activate foundationpose

# install dependencies (requirements.txt lists many packages; consider installing them in several batches to avoid conflicts)
python -m pip install -r requirements.txt

# Install NVDiffRast
python -m pip install --quiet --no-cache-dir git+https://github.com/NVlabs/nvdiffrast.git
# If the command above fails, download the nvdiffrast source from that git URL and install it from the local checkout with python setup.py install



# Kaolin (Optional, needed if running model-free setup)
python -m pip install --quiet --no-cache-dir kaolin==0.15.0 -f https://nvidia-kaolin.s3.us-east-2.amazonaws.com/torch-2.0.0_cu118.html
# If the command above fails, download a matching kaolin wheel from that URL and install it with pip install "/path_to_whl"; the same approach works for PyTorch3D below

# PyTorch3D
python -m pip install --quiet --no-index --no-cache-dir pytorch3d -f https://dl.fbaipublicfiles.com/pytorch3d/packaging/wheels/py39_cu118_pyt200/download.html

# Build extensions
CMAKE_PREFIX_PATH=$CONDA_PREFIX/lib/python3.9/site-packages/pybind11/share/cmake/pybind11 bash build_all_conda.sh
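The PyTorch3D wheel URL above encodes the Python, CUDA, and PyTorch versions: py39_cu118_pyt200 corresponds to Python 3.9, CUDA 11.8, and PyTorch 2.0.0. If your environment differs, the tag has to change with it. The sketch below rebuilds that tag, assuming the naming pattern visible in the URL also holds for other version combinations; not every combination necessarily has a prebuilt wheel, and the function names are illustrative only.

```python
def pytorch3d_wheel_tag(python_minor, cuda_version, torch_version):
    """Build the directory tag used in the PyTorch3D wheel index URL.

    Example inputs: python_minor=9, cuda_version="11.8",
    torch_version="2.0.0+cu118" (a local suffix such as "+cu118" is dropped).
    """
    torch_plain = torch_version.split("+")[0].replace(".", "")
    cuda_plain = cuda_version.replace(".", "")
    return f"py3{python_minor}_cu{cuda_plain}_pyt{torch_plain}"


def pytorch3d_wheel_url(python_minor, cuda_version, torch_version):
    """Return the wheel index URL for the given version combination."""
    tag = pytorch3d_wheel_tag(python_minor, cuda_version, torch_version)
    return (
        "https://dl.fbaipublicfiles.com/pytorch3d/packaging/wheels/"
        f"{tag}/download.html"
    )
```

Inside the activated environment, the three arguments can be taken from `sys.version_info.minor`, `torch.version.cuda`, and `torch.__version__`.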

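Install the Eigen library (the C++ extensions depend on it, so if the build step cannot find Eigen, install it first and then rerun the build):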

cd $HOME && wget -q https://gitlab.com/libeigen/eigen/-/archive/3.4.0/eigen-3.4.0.tar.gz && \
tar -xzf eigen-3.4.0.tar.gz && \
cd eigen-3.4.0 && mkdir build && cd build
cmake .. -Wno-dev -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS=-std=c++14
sudo make install
cd $HOME && rm -rf eigen-3.4.0 eigen-3.4.0.tar.gz
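To confirm that the Eigen headers ended up where a later build can find them, a quick check like the following may help. It is a sketch that assumes the default install prefix /usr/local, under which Eigen's CMake install places headers in include/eigen3; the function name `find_eigen` and the list of prefixes are illustrative.

```python
from pathlib import Path


def find_eigen(prefixes=("/usr/local", "/usr")):
    """Return the first include/eigen3 directory containing Eigen/Core, or None."""
    for prefix in prefixes:
        include_dir = Path(prefix) / "include" / "eigen3"
        # Eigen/Core is a header file without an extension
        if (include_dir / "Eigen" / "Core").is_file():
            return include_dir
    return None
```

If `find_eigen()` returns None, Eigen was probably installed under a different prefix, which then needs to be passed to CMake explicitly.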

Run the demo

python run_demo.py

This produces a visualization of the estimated pose (the screenshot from the original post is not preserved here).
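Pose estimation with your own model
Here I used a LINEMOD-style dataset that I built myself (following the tutorial "linemod数据集制作与处理", on creating and processing a LINEMOD dataset).
Then change the corresponding data paths in run_demo.py: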

  parser.add_argument('--mesh_file', type=str, default='/root/FoundationPose/Linemod_preprocessed/models/***.ply') # path to the reconstructed .ply model
  parser.add_argument('--test_scene_dir', type=str, default='/root/FoundationPose/Linemod_preprocessed/data/01') # data path; it needs rgb, mask, depth, and the camera intrinsics K.txt (a 3x3 matrix)

In addition, the code in datareader.py that reads the mask and the camera intrinsics needs small changes; the error messages indicate what to adjust.
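As an illustration of the intrinsics part, the sketch below reads a 3x3 matrix from a text file and validates its shape. It is not the code in datareader.py: `load_intrinsics` is a hypothetical helper, and it assumes K.txt holds nine whitespace-separated numbers in row-major order. Plain Python is used to keep it dependency-free; with NumPy available, `np.loadtxt(path).reshape(3, 3)` does the same job.

```python
def load_intrinsics(path):
    """Read nine whitespace-separated numbers and return a 3x3 list of rows."""
    with open(path, "r", encoding="utf-8") as f:
        values = [float(tok) for tok in f.read().split()]
    if len(values) != 9:
        raise ValueError(f"expected 9 values for a 3x3 matrix, got {len(values)}")
    K = [values[0:3], values[3:6], values[6:9]]
    # A pinhole intrinsic matrix has the bottom row [0, 0, 1]
    if K[2] != [0.0, 0.0, 1.0]:
        raise ValueError(f"unexpected last row {K[2]}")
    return K
```

Failing early on a malformed file tends to be easier to debug than a shape error raised deep inside the pose estimator.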

Also note that the .ply model file must contain vertex normals; MeshLab can generate them.
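Whether a given .ply file already carries normals can be checked from its header, which is ASCII text even when the body is binary. The sketch below is a rough check, assuming the normals are stored as vertex properties named nx, ny, and nz (the common convention, and what MeshLab writes); `ply_has_normals` is a hypothetical helper name.

```python
def ply_has_normals(path):
    """Return True if the PLY header declares vertex properties nx, ny and nz."""
    names = set()
    in_vertex = False
    with open(path, "rb") as f:
        for raw in f:
            line = raw.decode("ascii", errors="replace").strip()
            if line == "end_header":
                break  # stop before the (possibly binary) body
            parts = line.split()
            if not parts:
                continue
            if parts[0] == "element":
                # only properties of the vertex element are of interest
                in_vertex = len(parts) > 1 and parts[1] == "vertex"
            elif parts[0] == "property" and in_vertex:
                names.add(parts[-1])
    return {"nx", "ny", "nz"} <= names
```

If it returns False for your model, recompute the normals in MeshLab and export again with normals enabled.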

TODO
Reconstruct the model with NeRF;
Run real-time pose estimation with a RealSense camera;

bug

OpenCV Error: Unspecified error (The function is not implemented. Rebuild the library with Windows, GTK+ 2.x or Carbon support. If you are on Ubuntu or Debian, install libgtk2.0-dev and pkg-config, then re-run cmake or configure script) in cvShowImage, file /io/opencv/modules/highgui/src/window.cpp, line 545

sudo apt install libgtk2.0-dev pkg-config as the prompt says for Ubuntu users -> [Same Error]
pip uninstall opencv-python-headless -> [Other Error]
pip uninstall opencv-python; pip install opencv-python -> [Solved]
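One plausible reading of the steps above, offered as an assumption rather than a confirmed diagnosis, is that opencv-python-headless (which is built without GUI support) was installed alongside opencv-python, so the cv2 module that got imported could not open windows. The sketch below lists the installed OpenCV distributions so that such a mix is easy to spot; the function names are illustrative.

```python
from importlib import metadata


def installed_opencv_packages():
    """Return sorted names of installed distributions starting with "opencv"."""
    names = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name and name.lower().startswith("opencv"):
            names.add(name.lower())
    return sorted(names)


def has_headless_conflict(names):
    """True if a headless build is installed alongside a GUI-enabled build."""
    headless = {n for n in names if n.endswith("-headless")}
    gui = set(names) - headless
    return bool(headless) and bool(gui)
```

Running `has_headless_conflict(installed_opencv_packages())` inside the foundationpose environment indicates whether both variants are present; if so, uninstalling all of them and reinstalling only opencv-python mirrors the fix that worked above.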