[Reading] Think Python - CH3 Functions

This post walks through defining and calling functions in Python and importing modules, covering core concepts such as user-defined functions, argument passing, local variables, and return values, and introduces Python's built-in math module.

Function calls

In the context of programming, a function is a named sequence of statements that performs a computation. When you define a function, you specify the name and the sequence of statements. Later, you can call the function by name. We have already seen one example of a function call:

>>> type(32)
<type 'int'>

The name of the function is type. The expression in parentheses is called the argument of the function. The result, for this function, is the type of the argument.

It is common to say that a function "takes" an argument and "returns" a result. The result is called the return value.
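
For example, the return value can be captured in a variable and used later (a small session in the same Python 2 style as above):

>>> t = type(32)
>>> print t
<type 'int'>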

Type conversion functions

Python provides built-in functions that convert values from one type to another. The int function takes any value and converts it to an integer, if it can, or complains otherwise:

>>> int('32')
32
>>> int('Hello')
ValueError: invalid literal for int(): Hello

int can convert floating-point values to integers, but it doesn't round off; it chops off the fraction part:

>>> int(3.99999)
3
>>> int(-2.3)
-2

float converts integers and strings to floating-point numbers:

>>> float(32)
32.0
>>> float('3.14159')
3.14159

Finally, str converts its argument to a string:

>>> str(32)
'32'
>>> str(3.14159)
'3.14159'

Math functions

Python has a math module that provides most of the familiar mathematical functions. A module is a file that contains a collection of related functions.

Before we can use the module, we have to import it:

>>> import math

This statement creates a module object named math. If you print the module object, you get some information about it:

>>> print math
<module 'math' (built-in)>

The module object contains the functions and variables defined in the module. To access one of them, you have to specify the name of the module and the name of the function, separated by a dot (also known as a period). This format is called dot notation.

>>> ratio = signal_power / noise_power
>>> decibels = 10 * math.log10(ratio)
>>> radians = 0.7
>>> height = math.sin(radians)

The first example uses log10 to compute a signal-to-noise ratio in decibels (assuming that signal_power and noise_power are defined). The math module also provides log, which computes logarithms base e.
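
For instance, assuming math has already been imported as above:

>>> math.log(math.e)
1.0
>>> 10 * math.log10(100)
20.0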

The second example finds the sine of radians. The name of the variable is a hint that sin and the other trigonometric functions (cos, tan, etc.) take arguments in radians. To convert from degrees to radians, divide by 360 and multiply by 2π:

>>> degrees = 45
>>> radians = degrees / 360.0 * 2 * math.pi
>>> math.sin(radians)
0.707106781187

Composition

So far, we have looked at the elements of a program (variables, expressions, and statements) in isolation, without talking about how to combine them.

One of the most useful features of programming languages is their ability to take small building blocks and compose them. For example, the argument of a function can be any kind of expression, including arithmetic operators:

x = math.sin(degrees / 360.0 * 2 * math.pi)

And even function calls:

x = math.exp(math.log(x+1))

Almost anywhere you can put a value, you can put an arbitrary expression, with one exception: the left side of an assignment statement has to be a variable name. Any other expression on the left side is a syntax error (we will see exceptions to this rule later).

>>> minutes = hours * 60 # right
>>> hours * 60 = minutes # wrong!
SyntaxError: can't assign to operator

Adding new functions

So far, we have only been using the functions that come with Python, but it is also possible to add new functions. A function definition specifies the name of a new function and the sequence of statements that execute when the function is called.

Here is an example:

def print_lyrics():
    print "I'm a lumberjack, and I'm okay."
    print "I sleep all night and I work all day."

def is a keyword that indicates that this is a function definition. The name of the function is print_lyrics. The rules for function names are the same as for variable names: letters, numbers, and some punctuation marks are legal, but the first character can't be a number. You can't use a keyword as the name of a function, and you should avoid having a variable and a function with the same name.
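
For instance, trying to use a keyword as a function name fails right away (error output abbreviated, as in the earlier examples):

>>> def import():
SyntaxError: invalid syntax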

The empty parentheses after the name indicate that this function doesn't take any arguments.

The first line of the function definition is called the header; the rest is called the body. The header has to end with a colon and the body has to be indented. By convention, the indentation is always four spaces (see the Debugging section). The body can contain any number of statements.

The strings in the print statements are enclosed in double quotes. Single quotes and double quotes do the same thing; most people use single quotes, except in cases like this where a single quote (which is also an apostrophe) appears in the string.

If you type a function definition in interactive mode, the interpreter prints ellipses (...) to let you know that the definition isn't complete:  // I tried this with Python 2.7 on Windows 7 and didn't see the ... prompt

>>> def print_lyrics():
...     print "I'm a lumberjack, and I'm okay."
...     print "I sleep all night and I work all day."
...

To end the function, you have to enter an empty line (this is not necessary in a script).

Defining a function creates a variable with the same name:

>>> print print_lyrics
<function print_lyrics at 0xb7e99e9c>
>>> type(print_lyrics)
<type 'function'>

The value of print_lyrics is a function object, which has type 'function'.

The syntax for calling the new function is the same as for built-in functions:

>>> print_lyrics()
I'm a lumberjack, and I'm okay.
I sleep all night and I work all day.

Once you have defined a function, you can use it inside another function. For example, to repeat the previous refrain twice, we could write a function called repeat_lyrics:

def repeat_lyrics():
    print_lyrics()
    print_lyrics()

And then call repeat_lyrics:

>>> repeat_lyrics()
I'm a lumberjack, and I'm okay.
I sleep all night and I work all day.
I'm a lumberjack, and I'm okay.
I sleep all night and I work all day.

But that's not really how the song goes.

Definitions and uses

Pulling together the code fragments from the previous sections, the whole program looks like this:

def print_lyrics():
    print "I'm a lumberjack, and I'm okay."
    print "I sleep all night and I work all day."

def repeat_lyrics():
    print_lyrics()
    print_lyrics()

repeat_lyrics()

This program contains two function definitions: print_lyrics and repeat_lyrics. Function definitions get executed just like other statements, but the effect is to create function objects. The statements inside a function do not get executed until the function is called, and a function definition generates no output.

As you might expect, you have to create a function before you can execute it. In other words, the function definition has to be executed before the first time it is called.

Exercise 3-1

Move the last line of this program to the top, so the function call appears before the definitions. Run the program and see what error message you get.

//NameError: name 'repeat_lyrics' is not defined

Exercise 3-2

Move the function call back to the bottom and move the definition of print_lyrics after the definition of repeat_lyrics. What happens when you run this program?

// It runs successfully.

Flow of execution

To ensure that a function is defined before its first use, you have to know the order in which statements are executed, which is called the flow of execution.

Execution always begins at the first statement of the program. Statements are executed one at a time, in order from top to bottom.

Function definitions do not alter the flow of execution of the program, but remember that statements inside a function are not executed until the function is called.

A function call is like a detour in the flow of execution. Instead of going to the next statement, the flow jumps to the body of the function, executes all the statements there, and then comes back to pick up where it left off.
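
A small sketch makes the detour visible (the function name here is made up for illustration):

def print_message():
    print "inside the function"

print "before the call"
print_message()      # flow jumps into the body, then comes back
print "after the call"

Running this prints "before the call", then "inside the function", then "after the call".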

That sounds simple enough, until you remember that one function can call another. While in the middle of one function, the program might have to execute the statements in another function. And while executing that new function, the program might have to execute yet another one!

Fortunately, Python is good at keeping track of where it is, so each time a function completes, the program picks up where it left off in the function that called it. When it gets to the end of the program, it terminates.

What's the moral of this story? When you read a program, you don't always want to read from top to bottom. Sometimes it makes more sense to follow the flow of execution.

Parameters and arguments

Some of the built-in functions we have seen require arguments. For example, when you call math.sin you have to pass a number as the argument. Some functions take more than one argument; math.pow takes two, the base and the exponent.
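
For example, assuming math has been imported, math.pow(2, 3) computes 2 to the 3rd power:

>>> math.pow(2, 3)
8.0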

Inside the function, the arguments are assigned to variables called parameters. Here is an example of a user-defined function that takes one argument:

def print_twice(bruce):
    print bruce
    print bruce

This function assigns the argument to a parameter named bruce. When the function is called, it prints the value of the parameter (whatever it is) twice.

This function works with any value that can be printed:

>>> print_twice('Spam')
Spam
Spam
>>> print_twice(17)
17
17
>>> print_twice(math.pi)
3.14159265359
3.14159265359

The same rules of composition that apply to built-in functions also apply to user-defined functions, so we can use any kind of expression as an argument for print_twice:

>>> print_twice('Spam '*4)
Spam Spam Spam Spam
Spam Spam Spam Spam
>>> print_twice(math.cos(math.pi))
-1.0
-1.0

The argument is evaluated before the function is called, so in these examples the expressions 'Spam '*4 and math.cos(math.pi) are only evaluated once.

You can also use a variable as an argument:

>>> michael = 'Eric, the half a bee.'
>>> print_twice(michael)
Eric, the half a bee.
Eric, the half a bee.

(Omitted.)

Variables and parameters are local

When you create a variable inside a function, it is local, which means that it only exists inside the function. For example:

def cat_twice(part1, part2):
    cat = part1 + part2
    print_twice(cat)

This function takes two arguments, concatenates them, and prints the result twice. Here is an example that uses it:

>>> line1 = 'Bing tiddle '
>>> line2 = 'tiddle bang.'
>>> cat_twice(line1, line2)
Bing tiddle tiddle bang.
Bing tiddle tiddle bang.

When cat_twice terminates, the variable cat is destroyed. If we try to print it, we get an error:

>>> print cat
NameError: name 'cat' is not defined

Parameters are also local. For example, outside print_twice, there is no such thing as bruce.
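
You can check this in the same session (error output abbreviated as before):

>>> print bruce
NameError: name 'bruce' is not defined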

Stack diagrams

To keep track of which variables can be used where, it is sometimes useful to draw a stack diagram. Like state diagrams, stack diagrams show the value of each variable, but they also show the function each variable belongs to.

Each function is represented by a frame. A frame is a box with the name of a function beside it and the parameters and variables of the function inside it. The stack diagram for the previous example is shown in Figure 3-1.

The frames are arranged in a stack that indicates which function called which, and so on. In this example, print_twice was called by cat_twice, and cat_twice was called by __main__, which is a special name for the topmost frame. When you create a variable outside of any function, it belongs to __main__.

Each parameter refers to the same value as its corresponding argument. So, part1 has the same value as line1, part2 has the same value as line2, and bruce has the same value as cat.
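
Figure 3-1 is not reproduced here; a rough text rendering of the stack it shows, for the cat_twice example above, would be:

__main__       line1 --> 'Bing tiddle '
               line2 --> 'tiddle bang.'

cat_twice      part1 --> 'Bing tiddle '
               part2 --> 'tiddle bang.'
               cat   --> 'Bing tiddle tiddle bang.'

print_twice    bruce --> 'Bing tiddle tiddle bang.'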

If an error occurs during a function call, Python prints the name of the function, the name of the function that called it, and the name of the function that called that, all the way back to __main__.

For example, if you try to access cat from within print_twice, you get a NameError:

Traceback (innermost last):
  File "test.py", line 13, in __main__
    cat_twice(line1, line2)
  File "test.py", line 5, in cat_twice
    print_twice(cat)
  File "test.py", line 9, in print_twice
    print cat
NameError: name 'cat' is not defined

This list of functions is called a traceback. It tells you what program file the error occurred in, and what line, and what functions were executing at the time. It also shows the line of code that caused the error.

The order of the functions in the traceback is the same as the order of the frames in the stack diagram. The function that is currently running is at the bottom.

Fruitful functions and void functions

Some of the functions we use, such as the math functions, yield results; for lack of a better name, we call them fruitful functions. Other functions, like print_twice, perform an action but don't return a value. They are called void functions.

When you call a fruitful function, you almost always want to do something with the result; for example, you might assign it to a variable or use it as part of an expression:

x = math.cos(radians)
golden = (math.sqrt(5) + 1) / 2

When you call a function in interactive mode, Python displays the result:

>>> math.sqrt(5)
2.2360679774997898

But in a script, if you call a fruitful function all by itself, the return value is lost forever!

math.sqrt(5)

This script computes the square root of 5, but since it doesn't store or display the result, it is not very useful.

Void functions might display something on the screen or have some other effect, but they don't have a return value. If you try to assign the result to a variable, you get a special value called None:

>>> result = print_twice('Bing')
Bing
Bing
>>> print result
None

The value None is not the same as the string 'None'. It is a special value that has its own type:

>>> print type(None)
<type 'NoneType'>

The functions we have written so far are all void. We will start writing fruitful functions in a few chapters.

Why functions?

It may not be clear why it is worth the trouble to divide a program into functions. There are several reasons:

  • Creating a new function gives you an opportunity to name a group of statements, which makes your program easier to read and debug.
  • Functions can make a program smaller by eliminating repetitive code. Later, if you make a change, you only have to make it in one place.
  • Dividing a long program into functions allows you to debug the parts one at a time and then assemble them into a working whole.
  • Well-designed functions are often useful for many programs. Once you write and debug one, you can reuse it.

Importing with from

Python provides two ways to import modules; we have already seen one:

>>> import math
>>> print math
<module 'math' (built-in)>
>>> print math.pi
3.14159265359

If you import math, you get a module object named math. The module object contains constants like pi and functions like sin and exp.

But if you try to access pi directly, you get an error:

>>> print pi
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'pi' is not defined

As an alternative, you can import an object from a module like this:

>>> from math import pi

Now you can access pi directly, without dot notation:

>>> print pi
3.14159265359

Or you can use the star operator (*) to import everything from the module:

>>> from math import *
>>> cos(pi)
-1.0

The advantage of importing everything from the math module is that your code can be more concise. The disadvantage is that there might be conflicts between names defined in different modules, or between a name from a module and one of your own variables.
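
For example, a variable of your own can be silently shadowed (a contrived session to show the risk):

>>> pi = 3
>>> from math import *
>>> print pi
3.14159265359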

Debugging

If you are using a text editor to write your scripts, you might run into problems with spaces and tabs. The best way to avoid these problems is to use spaces exclusively (no tabs). Most text editors that know about Python do this by default, but some don't.

Tabs and spaces are usually invisible, which makes them hard to debug, so try to find an editor that manages indentation for you.

Also, don't forget to save your program before you run it. Some development environments do this automatically, but some don't. In that case, the program you are looking at in the text editor may not be the same as the program you are running.

Debugging can take a long time if you keep running the same incorrect program over and over!

Make sure that the code you are looking at is the code you are running. If you're not sure, put something like print 'Hello' at the beginning of the program and run it again. If you don't see Hello, you're not running the right program.

Glossary

function:

  A named sequence of statements that performs some useful operation. Functions may or may not take arguments and may or may not produce a result.

function definition:

  A statement that creates a new function, specifying its name, parameters, and the statements it executes.

function object:

  A value created by a function definition. The name of the function is a variable that refers to a function object.

header:

  The first line of a function definition.

body:

  The sequence of statements inside a function definition.

parameter:

  A name used inside a function to refer to the value passed as an argument.

function call:

  A statement that executes a function. It consists of the function name followed by an argument list.

argument:

  A value provided to a function when the function is called. This value is assigned to the corresponding parameter in the function.

local variable:

  A variable defined inside a function. A local variable can only be used inside its function.

return value:

  The result of a function. If a function call is used as an expression, the return value is the value of the expression.

fruitful function:

  A function that returns a value.

void function:

  A function that doesn't return a value.

module:

  A file that contains a collection of related functions and other definitions.

import statement:

  A statement that reads a module file and creates a module object.

module object:

  A value created by an import statement that provides access to the values defined in a module.

dot notation:

  The syntax for calling a function in another module by specifying the module name followed by a dot (period) and the function name.

composition:

  Using an expression as part of a larger expression, or a statement as part of a larger statement.

flow of execution:

  The order in which statements are executed during a program run.

stack diagram:

  A graphical representation of a stack of functions, the variables that belong to them, and the values they refer to.

frame:

  A box in a stack diagram that represents a function call. It contains the local variables and parameters of the function.

traceback:

  A list of the functions that are executing, printed when an exception occurs.

 

Reposted from: https://www.cnblogs.com/zhaoxy/p/5005066.html
