Notes on MMCV Core Components

License: CC 4.0 BY-SA

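3.1 MMCV Overview

MMCV provides the hook mechanism that upper-level frameworks need, along with a runner that can be used directly.

MMCV also provides many high-performance CUDA operators and their Python interfaces.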

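3.2 FileHandler

See https://zhuanlan.zhihu.com/p/336097883 for reference.

FileHandler is the core component of fileio and deals with reading and writing files.

mmcv provides low-level read/write handlers. The formats currently supported are .json, .yaml, .yml, .pickle, and .pkl.

# Usage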
import mmcv

# load data from a file
data = mmcv.load('test.json')
data = mmcv.load('test.yaml')
data = mmcv.load('test.pkl')

mmcv.dump(data, 'out.pkl')

mmcv also supports registering custom file formats, that is, formats not in the list above. The linked article gives a .npy example.
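To make the extension idea concrete, here is a minimal sketch of dispatching on the file extension. It is a simplified stand-in written with the standard library only, not mmcv's implementation: `HANDLERS`, `register_handler`, `load`, `dump`, and the line-based `.txt` format are all hypothetical names chosen for illustration.

```python
import json
import os
import pickle
import tempfile

# Hypothetical registry: extension -> (load function, dump function, binary mode?)
HANDLERS = {}


def register_handler(ext, load_fn, dump_fn, binary=False):
    """Register the load/dump callables used for one file extension."""
    HANDLERS[ext] = (load_fn, dump_fn, binary)


def _ext(path):
    return os.path.splitext(path)[1].lstrip('.')


def load(path):
    """Pick a handler from the file extension and read the file."""
    load_fn, _, binary = HANDLERS[_ext(path)]
    with open(path, 'rb' if binary else 'r') as f:
        return load_fn(f)


def dump(obj, path):
    """Pick a handler from the file extension and write the object."""
    _, dump_fn, binary = HANDLERS[_ext(path)]
    with open(path, 'wb' if binary else 'w') as f:
        dump_fn(obj, f)


# Built-in style handlers backed by the standard library.
register_handler('json', json.load, json.dump)
register_handler('pkl', pickle.load, pickle.dump, binary=True)


# A custom format added by the user: one string per line in a .txt file.
def load_lines(f):
    return [line.rstrip('\n') for line in f]


def dump_lines(lines, f):
    f.write('\n'.join(lines))


register_handler('txt', load_lines, dump_lines)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'classes.txt')
    dump(['cat', 'dog'], path)
    classes = load(path)  # read back through the custom handler
```

The point of the design is that callers keep using the same `load` and `dump` entry points; supporting a new format only means registering one more handler.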

3.3 FileClient

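FileClient exposes a unified API for fetching file contents and is mainly used to read data from a storage backend during training. By choosing the default backend or a custom one, users can easily add features such as file caching and accelerated file reading.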

https://zhuanlan.zhihu.com/p/339190576
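The backend idea can be sketched in a few lines. The classes below (`DiskBackend`, `CachedDiskBackend`, `SimpleFileClient`) are hypothetical, standard-library-only stand-ins meant to illustrate the pattern; they are not mmcv's classes, and the backend name 'cached_disk' is invented for this example.

```python
import os
import tempfile


class DiskBackend:
    """Read raw bytes from the local file system."""

    def get(self, filepath):
        with open(filepath, 'rb') as f:
            return f.read()


class CachedDiskBackend(DiskBackend):
    """Hypothetical custom backend: keep bytes in memory after the first read."""

    def __init__(self):
        self._cache = {}

    def get(self, filepath):
        if filepath not in self._cache:
            self._cache[filepath] = super().get(filepath)
        return self._cache[filepath]


class SimpleFileClient:
    """Front end exposing one get() API regardless of the chosen backend."""

    _backends = {'disk': DiskBackend, 'cached_disk': CachedDiskBackend}

    def __init__(self, backend='disk', **kwargs):
        if backend not in self._backends:
            raise ValueError(f'unsupported backend: {backend}')
        self.client = self._backends[backend](**kwargs)

    def get(self, filepath):
        return self.client.get(filepath)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'sample.bin')
    with open(path, 'wb') as f:
        f.write(b'raw image bytes')
    client = SimpleFileClient(backend='cached_disk')
    first = client.get(path)   # read from disk
    second = client.get(path)  # served from the in-memory cache
```

Because the data pipeline only ever calls `get`, switching the storage or caching strategy is a one-word change in the backend argument.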

The following is an example of FileClient usage. The actual call is in the LoadImageFromFile class in mmseg/datasets/pipelines/loading.py.

import os.path as osp

import mmcv
import numpy as np


class LoadImageFromFile(object):  # load an image into memory
    """Load an image from file.

    Required keys are "img_prefix" and "img_info" (a dict that must contain the
    key "filename"). Added or updated keys are "filename", "img", "img_shape",
    "ori_shape" (same as `img_shape`), "pad_shape" (same as `img_shape`),
    "scale_factor" (1.0) and "img_norm_cfg" (means=0 and stds=1).

    Args:
        to_float32 (bool): Whether to convert the loaded image to a float32
            numpy array. If set to False, the loaded image is an uint8 array.
            Defaults to False.
        color_type (str): The flag argument for :func:`mmcv.imfrombytes`.
            Defaults to 'color'.
        file_client_args (dict): Arguments to instantiate a FileClient.
            See :class:`mmcv.fileio.FileClient` for details.
            Defaults to ``dict(backend='disk')``.
        imdecode_backend (str): Backend for :func:`mmcv.imdecode`. Default:
            'cv2'
    """

    def __init__(self,
                 to_float32=False,
                 color_type='color',
                 file_client_args=dict(backend='disk'),
                 imdecode_backend='cv2'):
        self.to_float32 = to_float32
        self.color_type = color_type
        # the disk backend is the default
        self.file_client_args = file_client_args.copy()
        self.file_client = None
        self.imdecode_backend = imdecode_backend

    def __call__(self, results):
        """Call functions to load image and get image meta information.

        Args:
            results (dict): Result dict from :obj:`mmseg.CustomDataset`.

        Returns:
            dict: The dict contains loaded image and meta information.
        """

        if self.file_client is None:
            self.file_client = mmcv.FileClient(**self.file_client_args)

        if results.get('img_prefix') is not None:
            filename = osp.join(results['img_prefix'],
                                results['img_info']['filename'])
        else:
            filename = results['img_info']['filename']
        # read the image as raw bytes
        img_bytes = self.file_client.get(filename)
        # decode the bytes into an image array
        img = mmcv.imfrombytes(
            img_bytes, flag=self.color_type, backend=self.imdecode_backend)
        if self.to_float32:
            img = img.astype(np.float32)

        results['filename'] = filename
        results['ori_filename'] = results['img_info']['filename']
        results['img'] = img
        results['img_shape'] = img.shape
        results['ori_shape'] = img.shape
        # Set initial values for default meta_keys
        results['pad_shape'] = img.shape
        results['scale_factor'] = 1.0
        num_channels = 1 if len(img.shape) < 3 else img.shape[2]
        results['img_norm_cfg'] = dict(
            mean=np.zeros(num_channels, dtype=np.float32),
            std=np.ones(num_channels, dtype=np.float32),
            to_rgb=False)
        return results

    def __repr__(self):
        repr_str = self.__class__.__name__
        repr_str += f'(to_float32={self.to_float32},'
        repr_str += f"color_type='{self.color_type}',"
        repr_str += f"imdecode_backend='{self.imdecode_backend}')"
        return repr_str

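The extension development example in the linked article covers the case where the image files and the annotation files are not stored in the same place.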

3.4 Config

Config mainly provides parsing for configuration files in several formats, including py, json, yaml, and yml. It is a basic and very commonly used class.

https://zhuanlan.zhihu.com/p/346203167

3.4.1 Summary of Config Usage

3.4.1.1 Creating a config from a dict

Many of the files under the mmseg/configs directory are defined this way.

from mmcv import Config

cfg = Config(dict(a=1, b=dict(b1=[0, 1])))

# Values can be accessed as attributes, which is convenient
cfg.b.b1 # [0, 1]
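Attribute-style access on nested dicts can be sketched as follows. This is only an illustration of the idea under simplifying assumptions, not how Config is implemented; `AttrDict` and `to_attr_dict` are hypothetical names.

```python
class AttrDict(dict):
    """Dict whose keys can also be read as attributes (simplified)."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so dict methods
        # such as .items() keep working.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)


def to_attr_dict(obj):
    """Recursively wrap nested dicts so chained access like cfg.b.b1 works."""
    if isinstance(obj, dict):
        return AttrDict({k: to_attr_dict(v) for k, v in obj.items()})
    if isinstance(obj, list):
        return [to_attr_dict(v) for v in obj]
    return obj


cfg = to_attr_dict(dict(a=1, b=dict(b1=[0, 1])))
print(cfg.a)     # 1
print(cfg.b.b1)  # [0, 1]
```

The recursive wrapping is what makes the chained form work: without it, `cfg.b` would return a plain dict and `.b1` would fail.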

3.4.1.2 Creating a config from a config file

This is the most commonly used feature. The config file can be in py, yaml, yml, or json format.

cfg = Config.fromfile('tests/data/config/a.py')

cfg.filename
cfg.item4 # 'test'
cfg # prints the config path and the dict contents...

3.4.1.3 Automatic substitution of predefined variables

Suppose the file h.py contains the following:

cfg_dict = dict(
    item1='{{fileBasename}}',
    item2='{{fileDirname}}',
    item3='abc_{{fileBasenameNoExtension }}')

The use_predefined_variables parameter then substitutes the predefined variables automatically:

# cfg_file is named h.py
cfg = Config.fromfile(cfg_file, use_predefined_variables=True)
print(cfg.pretty_text)

# Output
item1 = 'h.py'
item2 = '<directory containing the config file>'
item3 = 'abc_h'

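The main purpose of this parameter is to replace the variable templates predefined in the Config class with their actual values, which is useful in some situations. Four variables are currently supported: fileDirname, fileBasename, fileBasenameNoExtension, and fileExtname. The predefined variables are modeled on those in VS Code.

If use_predefined_variables=False (the default is True), no substitution is performed.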
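The substitution step can be approximated with a regular expression. The sketch below is an illustration under stated assumptions rather than the library's code: `substitute_predefined_vars` is a hypothetical helper, and the path `/configs/h.py` is made up for the example.

```python
import os.path as osp
import re


def substitute_predefined_vars(text, filename):
    """Replace {{ name }} templates in text with values derived from filename."""
    base = osp.basename(filename)
    stem, ext = osp.splitext(base)
    values = {
        'fileDirname': osp.dirname(filename),
        'fileBasename': base,
        'fileBasenameNoExtension': stem,
        'fileExtname': ext,
    }

    def repl(match):
        key = match.group(1)
        # Leave unknown templates untouched rather than guessing a value.
        return values.get(key, match.group(0))

    # Optional whitespace is allowed inside the double braces.
    return re.sub(r'\{\{\s*(\w+)\s*\}\}', repl, text)


src = "item1='{{fileBasename}}', item3='abc_{{ fileBasenameNoExtension }}'"
print(substitute_predefined_vars(src, '/configs/h.py'))
# item1='h.py', item3='abc_h'
```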

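3.4.1.4 Importing custom modules

Besides filename and use_predefined_variables, Config.fromfile also takes import_custom_modules, which defaults to True. When the cfg contains a custom_imports key, the modules listed there are imported automatically. The imports value is either a str or a list[str] giving the paths of the modules to import. A typical use case follows.

When you create greenscreen.py under the mmseg/datasets directory, you need to add the following line to __init__.py: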

from .greenscreen import GreenScreenDataset

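This approach can be cumbersome in some scenarios. For example, if the module sits deep in the package hierarchy, __init__.py has to be modified at every level. With this parameter, you can do the following instead:

# The .py config file contains the following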
custom_imports = dict(
    imports=['mmdet.models.backbones.mobilenet'],
    allow_failed_imports=False)

# mmdet.models.backbones.mobilenet is imported automatically
Config.fromfile(cfg_file, import_custom_modules=True)
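What happens with the custom_imports entry can be sketched with importlib. The helper below, `import_modules_sketch`, is a hypothetical name and a simplified approximation, not the library's own function. The standard-library module json and a deliberately missing module stand in for real project modules.

```python
import importlib
import warnings


def import_modules_sketch(imports, allow_failed_imports=False):
    """Import each module path; return the modules (None for tolerated failures)."""
    if isinstance(imports, str):
        imports = [imports]  # accept a single str as well as list[str]
    imported = []
    for path in imports:
        try:
            imported.append(importlib.import_module(path))
        except ImportError:
            if not allow_failed_imports:
                raise
            warnings.warn(f'{path} failed to import and is ignored.')
            imported.append(None)
    return imported


# Same shape as the custom_imports dict in a config file.
custom_imports = dict(
    imports=['json', 'no_such_module_xyz'],
    allow_failed_imports=True)
modules = import_modules_sketch(**custom_imports)
```

Importing a module by path runs its top-level code, which is where registration decorators take effect; that is why no __init__.py changes are needed.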

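3.4.1.5 Merging multiple config files

(1) Merging from a base file. Config supports inheriting from a single base config file and then merging the remaining settings on top of it to produce one combined config. Upper-level frameworks use this feature heavily, and it greatly improves config reuse. A typical usage is: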

# Contents of base.py

item1 = [1, 2]
item2 = {'a': 0}
item3 = True
item4 = 'test'

# Contents of d.py
_base_ = './base.py'
item1 = [2, 3]
item2 = {'a': 1}
item3 = False
item4 = 'test_base'

# Usage
cfg = Config.fromfile('d.py'