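1. Introduction

This article introduces iRMB (Inverted Residual Mobile Block), proposed in the paper "Rethinking Mobile Block for Efficient Attention-based Models". The paper also proposes a new backbone network, EMO (I will cover how to use that backbone later; this article starts with the attention mechanism from the paper). The core idea is to combine a lightweight CNN architecture with an attention-based model structure (somewhat similar to ACmix). I combined iRMB with C2PSA and also tried it in the detection head, and compare three sets of results. The variants serve different purposes, and in my experiments each of them gave some accuracy gain. The mechanism is also lightweight: it has few parameters and trains quickly. The rest of the article walks through each way of adding it so that you can reproduce it in your own model.

2. How iRMB works

The main idea of iRMB (Inverted Residual Mobile Block) is to combine a lightweight CNN architecture with an attention-based model structure (somewhat similar to ACmix) in order to build efficient mobile networks. By revisiting the Inverted Residual Block (IRB) and the effective components of the Transformer from a unified perspective, iRMB extends the CNN-based IRB to attention-based models. The design goal is to keep the model lightweight while using compute efficiently and reaching high accuracy. The paper validates the approach with extensive experiments on downstream tasks, where it performs well among lightweight models.

The main contributions of iRMB are:

1. It combines the lightweight nature of CNNs with the dynamic modelling ability of Transformers in a new iRMB structure suited to dense prediction tasks on mobile devices.
2. It uses an inverted residual design that extends the traditional CNN IRB to attention-based models, improving the handling of long-range information.
3. It proposes the Meta Mobile Block, which is instantiated with different expansion ratios and efficient operators, giving a modular design that makes the model more flexible and efficient.

2.1 iRMB structure

The key point of the iRMB structure is that it combines the lightweight nature of a convolutional neural network (CNN) with the dynamic processing ability of a Transformer. This makes it a good fit for dense prediction on mobile devices, where it has to deliver efficient performance with limited compute. Through its inverted residual design, iRMB improves the information flow and captures long-range dependencies while staying lightweight, which matters for image classification, object detection and semantic segmentation. As a result the model can run efficiently on resource-constrained devices while maintaining or improving prediction accuracy.

Figure 2 of the paper shows the design idea and structure of iRMB. On the left is the unified Meta Mobile Block abstracted from multi-head self-attention and the feed-forward network; it combines different expansion ratios and efficient operators to form specific modules. On the right is EMO, a ResNet-like efficient model built on iRMB. EMO consists only of the derived iRMB and is used for downstream tasks such as classification (CLS), detection (Det) and segmentation (Seg). The design keeps the model lightweight while retaining good accuracy and efficiency.

The structural diagram of iRMB in the paper shows a hybrid module that combines a depthwise convolution (3x3 DW-Conv) with self-attention. The 1x1 convolutions compress and expand the channel count to keep computation cheap. The depthwise convolution captures local spatial features, while the attention mechanism captures global dependencies between features.

2.2 Inverted residual block

In iRMB the Inverted Residual Block (IRB) concept is extended to attention-based models. This lets the model handle long-range information more effectively, because self-attention captures global dependencies between different parts of the input. A traditional CNN usually captures only local features. With attention added, iRMB can take the whole input space into account when extracting features, which strengthens its ability to model complex patterns, especially in visual and sequence data. Combining the lightness of a traditional CNN with the long-range modelling of a Transformer opens up new options for efficient deep learning models in resource-constrained settings (the paper does not include a structure diagram of the IRB).

2.3 Meta Mobile Block

The Meta Mobile Block achieves a modular design through different expansion ratios and efficient operators. The model's capacity can be adjusted as needed without redesigning the whole network. The core idea is to plug different operations (convolution, self-attention and so on) into one unified framework, which improves efficiency and flexibility. This gives a better trade-off between model complexity and compute cost, and suits applications that must run with limited resources.

In the paper's diagram of the Meta Mobile Block, 1x1 convolution layers change the number of feature-map channels and thereby control the capacity of the network. The "Efficient Operator" in the middle can be self-attention or any other efficient layer or operation. This lets the Meta Mobile Block adapt to different task requirements while staying computationally efficient, and a network built from such modules can be adjusted and tuned quickly for different environments and tasks.

3. Core iRMB code

See section 4 for how to use this code.

    import math
    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    from functools import partial
    from einops import rearrange  # requires the einops package
    from timm.models._efficientnet_blocks import SqueezeExcite  # requires the timm package
    from timm.models.layers import DropPath

    __all__ = ['iRMB', 'C2PSA_iRMB']

    inplace = True  # global variable


    class LayerNorm2d(nn.Module):

        def __init__(self, normalized_shape, eps=1e-6, elementwise_affine=True):
            super().__init__()
            self.norm = nn.LayerNorm(normalized_shape, eps, elementwise_affine)

        def forward(self, x):
            # LayerNorm normalises the last dimension, so move channels last and back again.
            x = rearrange(x, 'b c h w -> b h w c').contiguous()
            x = self.norm(x)
            x = rearrange(x, 'b h w c -> b c h w').contiguous()
            return x


    def get_norm(norm_layer='in_1d'):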
        eps = 1e-6
        norm_dict = {
            'none': nn.Identity,
            'in_1d': partial(nn.InstanceNorm1d, eps=eps),
            'in_2d': partial(nn.InstanceNorm2d, eps=eps),
            'in_3d': partial(nn.InstanceNorm3d, eps=eps),
            'bn_1d': partial(nn.BatchNorm1d, eps=eps),
            'bn_2d': partial(nn.BatchNorm2d, eps=eps),
            # 'bn_2d': partial(nn.SyncBatchNorm, eps=eps),
            'bn_3d': partial(nn.BatchNorm3d, eps=eps),
            'gn': partial(nn.GroupNorm, eps=eps),
            'ln_1d': partial(nn.LayerNorm, eps=eps),
            'ln_2d': partial(LayerNorm2d, eps=eps),
        }
        return norm_dict[norm_layer]


    def get_act(act_layer='relu'):
        act_dict = {
            'none': nn.Identity,
            'relu': nn.ReLU,
            'relu6': nn.ReLU6,
            'silu': nn.SiLU
        }
        return act_dict[act_layer]


    class ConvNormAct(nn.Module):

        def __init__(self, dim_in, dim_out, kernel_size, stride=1, dilation=1, groups=1, bias=False,
                     skip=False, norm_layer='bn_2d', act_layer='relu', inplace=True, drop_path_rate=0.):
            super(ConvNormAct, self).__init__()
            self.has_skip = skip and dim_in == dim_out
            padding = math.ceil((kernel_size - stride) / 2)
            self.conv = nn.Conv2d(dim_in, dim_out, kernel_size, stride, padding, dilation, groups, bias)
            self.norm = get_norm(norm_layer)(dim_out)
            self.act = get_act(act_layer)(inplace=inplace)
            self.drop_path = DropPath(drop_path_rate) if drop_path_rate else nn.Identity()

        def forward(self, x):
            shortcut = x
            x = self.conv(x)
            x = self.norm(x)
            x = self.act(x)
            if self.has_skip:
                x = self.drop_path(x) + shortcut
            return x


    class iRMB(nn.Module):

        def __init__(self, dim_in, norm_in=True, has_skip=True, exp_ratio=1.0, norm_layer='bn_2d',
                     act_layer='relu', v_proj=True, dw_ks=3, stride=1, dilation=1, se_ratio=0.0, dim_head=8,
                     window_size=7, attn_s=True, qkv_bias=False, attn_drop=0., drop=0., drop_path=0.,
                     v_group=False, attn_pre=False):
            super().__init__()
            dim_out = dim_in
            self.norm = get_norm(norm_layer)(dim_in) if norm_in else nn.Identity()
            dim_mid = int(dim_in * exp_ratio)
            self.has_skip = (dim_in == dim_out and stride == 1) and has_skip
            self.attn_s = attn_s
            if self.attn_s:
                assert dim_in % dim_head == 0, 'dim should be divisible by num_heads'
                self.dim_head = dim_head
                self.window_size = window_size
                self.num_head = dim_in // dim_head
                self.scale = self.dim_head ** -0.5
                self.attn_pre = attn_pre
                self.qk = ConvNormAct(dim_in, int(dim_in * 2), kernel_size=1, bias=qkv_bias,
                                      norm_layer='none', act_layer='none')
                self.v = ConvNormAct(dim_in, dim_mid, kernel_size=1, groups=self.num_head if v_group else 1,
                                     bias=qkv_bias, norm_layer='none', act_layer=act_layer, inplace=inplace)
                self.attn_drop = nn.Dropout(attn_drop)
            else:
                if v_proj:
                    self.v = ConvNormAct(dim_in, dim_mid, kernel_size=1, bias=qkv_bias,
                                         norm_layer='none', act_layer=act_layer, inplace=inplace)
                else:
                    self.v = nn.Identity()
            # Depthwise convolution (groups == channels) for local spatial features.
            self.conv_local = ConvNormAct(dim_mid, dim_mid, kernel_size=dw_ks, stride=stride, dilation=dilation,
                                          groups=dim_mid, norm_layer='bn_2d', act_layer='silu', inplace=inplace)
            self.se = SqueezeExcite(dim_mid, rd_ratio=se_ratio,
                                    act_layer=get_act(act_layer)) if se_ratio > 0.0 else nn.Identity()

            self.proj_drop = nn.Dropout(drop)
            self.proj = ConvNormAct(dim_mid, dim_out, kernel_size=1, norm_layer='none', act_layer='none',
                                    inplace=inplace)
            self.drop_path = DropPath(drop_path) if drop_path else nn.Identity()

        def forward(self, x):
            shortcut = x
            x = self.norm(x)
            B, C, H, W = x.shape
            if self.attn_s:
                # padding: make H and W multiples of the window size
                if self.window_size <= 0:
                    window_size_W, window_size_H = W, H
                else:
                    window_size_W, window_size_H = self.window_size, self.window_size
                pad_l, pad_t = 0, 0
                pad_r = (window_size_W - W % window_size_W) % window_size_W
                pad_b = (window_size_H - H % window_size_H) % window_size_H
                x = F.pad(x, (pad_l, pad_r, pad_t, pad_b, 0, 0,))
                n1, n2 = (H + pad_b) // window_size_H, (W + pad_r) // window_size_W
                x = rearrange(x, 'b c (h1 n1) (w1 n2) -> (b n1 n2) c h1 w1', n1=n1, n2=n2).contiguous()
                # attention
                b, c, h, w = x.shape
                qk = self.qk(x)
                qk = rearrange(qk, 'b (qk heads dim_head) h w -> qk b heads (h w) dim_head', qk=2,
                               heads=self.num_head, dim_head=self.dim_head).contiguous()
                q, k = qk[0], qk[1]
                attn_spa = (q @ k.transpose(-2, -1)) * self.scale
                attn_spa = attn_spa.softmax(dim=-1)
                attn_spa = self.attn_drop(attn_spa)
                if self.attn_pre:
                    x = rearrange(x, 'b (heads dim_head) h w -> b heads (h w) dim_head',
                                  heads=self.num_head).contiguous()
                    x_spa = attn_spa @ x
                    x_spa = rearrange(x_spa, 'b heads (h w) dim_head -> b (heads dim_head) h w',
                                      heads=self.num_head, h=h, w=w).contiguous()
                    x_spa = self.v(x_spa)
                else:
                    v = self.v(x)
                    v = rearrange(v, 'b (heads dim_head) h w -> b heads (h w) dim_head',
                                  heads=self.num_head).contiguous()
                    x_spa = attn_spa @ v
                    x_spa = rearrange(x_spa, 'b heads (h w) dim_head -> b (heads dim_head) h w',
                                      heads=self.num_head, h=h, w=w).contiguous()
                # unpadding
                x = rearrange(x_spa, '(b n1 n2) c h1 w1 -> b c (h1 n1) (w1 n2)', n1=n1, n2=n2).contiguous()
                if pad_r > 0 or pad_b > 0:
                    x = x[:, :, :H, :W].contiguous()
            else:
                x = self.v(x)

            x = x + self.se(self.conv_local(x)) if self.has_skip else self.se(self.conv_local(x))

            x = self.proj_drop(x)
            x = self.proj(x)

            x = (shortcut + self.drop_path(x)) if self.has_skip else x
            return x


    def autopad(k, p=None, d=1):  # kernel, padding, dilation
        """Pad to 'same' shape outputs."""
        if d > 1:
            k = d * (k - 1) + 1 if isinstance(k, int) else [d * (x - 1) + 1 for x in k]  # actual kernel-size
        if p is None:
            p = k // 2 if isinstance(k, int) else [x // 2 for x in k]  # auto-pad
        return p


    class Conv(nn.Module):
        """Standard convolution with args(ch_in, ch_out, kernel, stride, padding, groups, dilation, activation)."""

        default_act = nn.SiLU()  # default activation

        def __init__(self, c1, c2, k=1, s=1, p=None, g=1, d=1, act=True):
            """Initialize Conv layer with given arguments including activation."""
            super().__init__()
            self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p, d), groups=g, dilation=d, bias=False)
            self.bn = nn.BatchNorm2d(c2)
            self.act = self.default_act if act is True else act if isinstance(act, nn.Module) else nn.Identity()

        def forward(self, x):
            """Apply convolution, batch normalization and activation to input tensor."""
            return self.act(self.bn(self.conv(x)))

        def forward_fuse(self, x):
            """Apply convolution and activation only, skipping batch normalization (for fused layers)."""
            return self.act(self.conv(x))


    class PSABlock(nn.Module):
        """
        PSABlock class implementing a Position-Sensitive Attention block for neural networks.

        This class encapsulates the functionality for applying attention (an iRMB module in this variant) and
        feed-forward neural network layers with optional shortcut connections.
        """
        # Attributes:
        #     attn (iRMB): Attention module (iRMB is used here in place of multi-head attention).
        #     ffn (nn.Sequential): Feed-forward neural network module.
        #     add (bool): Flag indicating whether to add shortcut connections.
        #
        # Methods:
        #     forward: Performs a forward pass through the PSABlock, applying attention and feed-forward layers.
        #
        # Examples:
        #     Create a PSABlock and perform a forward pass
        #     >>> psablock = PSABlock(c=128, attn_ratio=0.5, num_heads=4, shortcut=True)
        #     >>> input_tensor = torch.randn(1, 128, 32, 32)
        #     >>> output_tensor = psablock(input_tensor)

        def __init__(self, c, attn_ratio=0.5, num_heads=4, shortcut=True) -> None:
            """Initializes the PSABlock with attention and feed-forward layers for enhanced feature extraction."""
            super().__init__()
            # attn_ratio and num_heads are kept for signature compatibility; iRMB does not use them.
            self.attn = iRMB(c)
            self.ffn = nn.Sequential(Conv(c, c * 2, 1), Conv(c * 2, c, 1, act=False))
            self.add = shortcut

        def forward(self, x):
            """Executes a forward pass through PSABlock, applying attention and feed-forward layers to the input tensor."""
            x = x + self.attn(x) if self.add else self.attn(x)
            x = x + self.ffn(x) if self.add else self.ffn(x)
            return x


    class C2PSA_iRMB(nn.Module):
        """
        C2PSA module with attention mechanism for enhanced feature extraction and processing.

        This module implements a convolutional block with attention mechanisms to enhance feature extraction and
        processing capabilities. It includes a series of PSABlock modules for attention and feed-forward operations.

        Attributes:
            c (int): Number of hidden channels.
            cv1 (Conv): 1x1 convolution layer mapping the input channels to 2*c.
            cv2 (Conv): 1x1 convolution layer mapping 2*c channels back to c1.
            m (nn.Sequential): Sequential container of PSABlock modules for attention and feed-forward operations.

        Methods:
            forward: Performs a forward pass through the C2PSA module, applying attention and feed-forward operations.

        Notes:
            This module essentially is the same as PSA module, but refactored to allow stacking more PSABlock modules.
        """
        # Examples:
        #     >>> c2psa = C2PSA_iRMB(c1=256, c2=256, n=3, e=0.5)
        #     >>> input_tensor = torch.randn(1, 256, 64, 64)
        #     >>> output_tensor = c2psa(input_tensor)

        def __init__(self, c1, c2, n=1, e=0.5):
            """Initializes the C2PSA module with specified input/output channels, number of layers, and expansion ratio."""
            super().__init__()
            assert c1 == c2
            self.c = int(c1 * e)
            self.cv1 = Conv(c1, 2 * self.c, 1, 1)
            self.cv2 = Conv(2 * self.c, c1, 1)

            self.m = nn.Sequential(*(PSABlock(self.c, attn_ratio=0.5, num_heads=self.c // 64) for _ in range(n)))

        def forward(self, x):
            """Processes the input tensor 'x' through a series of PSA blocks and returns the transformed tensor."""
            a, b = self.cv1(x).split((self.c, self.c), dim=1)
            b = self.m(b)
            return self.cv2(torch.cat((a, b), 1))


    if __name__ == "__main__":
        # Generating Sample image
        image_size = (1, 64, 224, 224)
        image = torch.rand(*image_size)

        # Model
        model = C2PSA_iRMB(64, 64)

        out = model(image)
        print(out.size())

4. Step by step: adding iRMB and C2PSA_iRMB

4.1 Change 1

First, create a new directory named Addmodules inside the ultralytics/nn folder.

4.2 Change 2

Inside the Addmodules folder, create a new .py file and paste in the core code from section 3.

4.3 Change 3

In the same directory, create another .py file named __init__.py and import the file from 4.2 in it.

4.4 Change 4

Open ultralytics/nn/tasks.py, then import and register the module there. This only has to be done once, even if you also use my other improvement mechanisms.

4.5 Change 5

In ultralytics/nn/tasks.py, inside the parse_model function (around line 1500), add the module at the matching position. You need to judge the exact location yourself, since it depends on your version of the file.

4.6 Change 6

In ultralytics/nn/tasks.py, inside the parse_model function (around line 1550), add the branch below. Make sure the position and indentation match the surrounding code, otherwise it is easy to get errors.

    elif m in {MODULE_NAME}:  # replace MODULE_NAME with the name of the module from this article
        c2 = ch[f]
        args = [c2, *args]

5. Training

5.1 yaml files

5.1.1 yaml file 1

Training info, as logged (the log label reads YOLO26-C2PSA-EMA): 259 layers, 2,455,484 parameters, 2,455,484 gradients, 5.7 GFLOPs

    # Ultralytics AGPL-3.0 License - https://ultralytics.com/license

    # Ultralytics YOLO26 object detection model with P3/8 - P5/32 outputs
    # Model docs: https://docs.ultralytics.com/models/yolo26
    # Task docs: https://docs.ultralytics.com/tasks/detect

    # Parameters
    nc: 80 # number of classes
    end2end: True # whether to use end-to-end mode
    reg_max: 1 # DFL bins
    scales: # model compound scaling constants,
      # i.e. 'model=yolo26n.yaml' will call yolo26.yaml with scale 'n'
      # [depth, width, max_channels]
      n: [0.50, 0.25, 1024] # summary: 260 layers, 2,572,280 parameters, 2,572,280 gradients, 6.1 GFLOPs
      s: [0.50, 0.50, 1024] # summary: 260 layers, 10,009,784 parameters, 10,009,784 gradients, 22.8 GFLOPs
      m: [0.50, 1.00, 512] # summary: 280 layers, 21,896,248 parameters, 21,896,248 gradients, 75.4 GFLOPs
      l: [1.00, 1.00, 512] # summary: 392 layers, 26,299,704 parameters, 26,299,704 gradients, 93.8 GFLOPs
      x: [1.00, 1.50, 512] # summary: 392 layers, 58,993,368 parameters, 58,993,368 gradients, 209.5 GFLOPs

    # YOLO26n backbone
    backbone:
      # [from, repeats, module, args]
      - [-1, 1, Conv, [64, 3, 2]] # 0-P1/2
      - [-1, 1, Conv, [128, 3, 2]] # 1-P2/4
      - [-1, 2, C3k2, [256, False, 0.25]]
      - [-1, 1, Conv, [256, 3, 2]] # 3-P3/8
      - [-1, 2, C3k2, [512, False, 0.25]]
      - [-1, 1, Conv, [512, 3, 2]] # 5-P4/16
      - [-1, 2, C3k2, [512, True]]
      - [-1, 1, Conv, [1024, 3, 2]] # 7-P5/32
      - [-1, 2, C3k2, [1024, True]]
      - [-1, 1, SPPF, [1024, 5, 3, True]] # 9
      - [-1, 2, C2PSA_iRMB, [1024]] # 10

    # YOLO26n head
    head:
      - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
      - [[-1, 6], 1, Concat, [1]] # cat backbone P4
      - [-1, 2, C3k2, [512, True]] # 13

      - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
      - [[-1, 4], 1, Concat, [1]] # cat backbone P3
      - [-1, 2, C3k2, [256, True]] # 16 (P3/8-small)

      - [-1, 1, Conv, [256, 3, 2]]
      - [[-1, 13], 1, Concat, [1]] # cat head P4
      - [-1, 2, C3k2, [512, True]] # 19 (P4/16-medium)

      - [-1, 1, Conv, [512, 3, 2]]
      - [[-1, 10], 1, Concat, [1]] # cat head P5
      - [-1, 1, C3k2, [1024, True, 0.5, True]] # 22 (P5/32-large)

      - [[16, 19, 22], 1, Detect, [nc]] # Detect(P3, P4, P5)

5.1.2 yaml file 2

Training info, as logged (the log label reads YOLO26-Att-EMA): 267 layers, 2,506,316 parameters, 2,506,316 gradients, 5.8 GFLOPs

    # Ultralytics AGPL-3.0 License - https://ultralytics.com/license

    # Ultralytics YOLO26 object detection model with P3/8 - P5/32 outputs
    # Model docs: https://docs.ultralytics.com/models/yolo26
    # Task docs: https://docs.ultralytics.com/tasks/detect

    # Parameters
    nc: 80 # number of classes
    end2end: True # whether to use end-to-end mode
    reg_max: 1 # DFL bins
    scales: # model compound scaling constants, i.e. 'model=yolo26n.yaml' will call yolo26.yaml with scale 'n'
      # [depth, width, max_channels]
      n: [0.50, 0.25, 1024] # summary: 260 layers, 2,572,280 parameters, 2,572,280 gradients, 6.1 GFLOPs
      s: [0.50, 0.50, 1024] # summary: 260 layers, 10,009,784 parameters, 10,009,784 gradients, 22.8 GFLOPs
      m: [0.50, 1.00, 512] # summary: 280 layers, 21,896,248 parameters, 21,896,248 gradients, 75.4 GFLOPs
      l: [1.00, 1.00, 512] # summary: 392 layers, 26,299,704 parameters, 26,299,704 gradients, 93.8 GFLOPs
      x: [1.00, 1.50, 512] # summary: 392 layers, 58,993,368 parameters, 58,993,368 gradients, 209.5 GFLOPs

    # YOLO26n backbone
    backbone:
      # [from, repeats, module, args]
      - [-1, 1, Conv, [64, 3, 2]] # 0-P1/2
      - [-1, 1, Conv, [128, 3, 2]] # 1-P2/4
      - [-1, 2, C3k2, [256, False, 0.25]]
      - [-1, 1, Conv, [256, 3, 2]] # 3-P3/8
      - [-1, 2, C3k2, [512, False, 0.25]]
      - [-1, 1, Conv, [512, 3, 2]] # 5-P4/16
      - [-1, 2, C3k2, [512, True]]
      - [-1, 1, Conv, [1024, 3, 2]] # 7-P5/32
      - [-1, 2, C3k2, [1024, True]]
      - [-1, 1, SPPF, [1024, 5, 3, True]] # 9
      - [-1, 2, C2PSA, [1024]] # 10

    # YOLO26n head
    head:
      - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
      - [[-1, 6], 1, Concat, [1]] # cat backbone P4
      - [-1, 2, C3k2, [512, True]] # 13

      - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
      - [[-1, 4], 1, Concat, [1]] # cat backbone P3
      - [-1, 2, C3k2, [256, True]] # 16 (P3/8-small)

      - [-1, 1, Conv, [256, 3, 2]]
      - [[-1, 13], 1, Concat, [1]] # cat head P4
      - [-1, 2, C3k2, [512, True]] # 19 (P4/16-medium)

      - [-1, 1, Conv, [512, 3, 2]]
      - [[-1, 10], 1, Concat, [1]] # cat head P5
      - [-1, 1, C3k2, [1024, True, 0.5, True]] # 22 (P5/32-large)

      - [16, 1, iRMB, []] # 23
      # - [19, 1, iRMB, []] # 24
      # - [22, 1, iRMB, []] # 25
      # Usage note: of the three attention layers above, only layer 23 is enabled.
      # To use layer 24 as well, uncomment it and change 19 to 24 in the Detect line below.
      # Likewise, to use layer 25, uncomment it and change 22 to 25 in the Detect line below.

      - [[23, 19, 22], 1, Detect, [nc]] # Detect(P3, P4, P5)

5.2 Training code

Create a .py file, paste in the code below, and fill in your own file paths. It can then be run directly.

    import warnings
    warnings.filterwarnings('ignore')
    from ultralytics import YOLO

    if __name__ == '__main__':
        # Path to the model config file, i.e. where you saved the yaml from 5.1.
        model = YOLO('<path to the model yaml from 5.1>')
        # To switch model scale, change the yaml name passed above: yolo26s.yaml selects 26s.
        # Likewise, for an improved config named yolo26-XXX.yaml, pass yolo26l-XXX.yaml to use the l scale.
        # Change only the name passed to YOLO(...), not the name of the config file on disk.
        # model.load('yolo26n.pt')  # Load pretrained weights. For research I do not recommend this,
        #                           # since it makes it hard to show an accuracy gain.
        model.train(
            data=r'<path to the dataset yaml>',
            # For other tasks, edit 'task' in ultralytics/cfg/default.yaml: detect, segment, classify, pose
            cache=False,
            imgsz=640,
            epochs=20,
            single_cls=False,  # whether this is single-class detection
            batch=16,
            close_mosaic=0,
            workers=0,
            device='0',
            optimizer='MuSGD',  # using SGD/MuSGD
            # resume='',  # fill in the path to last.pt here
            amp=True,  # set to False if the training loss becomes NaN
            project='runs/train',
            name='exp',
        )

5.3 Training screenshot

(Screenshot of the training run; image not preserved.)

6. Summary

That concludes this article. Going forward I will keep reproducing methods from recent top-conference papers and will also extend some of the older improvement mechanisms.
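A closing note on the window partition in `iRMB.forward` (section 3): before attention is computed, the feature map is padded on the right and bottom so that each side becomes a multiple of `window_size`. The sketch below reproduces only that arithmetic in plain Python, so the padding can be checked without building the model. The helper name `window_padding` is hypothetical and is not part of the original code.

```python
def window_padding(height, width, window_size=7):
    """Return (pad_b, pad_r, n1, n2) using the same arithmetic as iRMB.forward.

    `window_padding` is a hypothetical helper written for illustration only.
    pad_b / pad_r: rows / columns of padding added at the bottom / right.
    n1 / n2: number of windows along the height / width after padding.
    """
    # A non-positive window size means a single window covering the whole map.
    if window_size <= 0:
        ws_h, ws_w = height, width
    else:
        ws_h, ws_w = window_size, window_size
    # Pad each side up to the next multiple of the window size (0 if already aligned).
    pad_b = (ws_h - height % ws_h) % ws_h
    pad_r = (ws_w - width % ws_w) % ws_w
    n1 = (height + pad_b) // ws_h
    n2 = (width + pad_r) // ws_w
    return pad_b, pad_r, n1, n2


if __name__ == "__main__":
    # P5/32, P4/16 and P3/8 feature-map sizes for a 640x640 input.
    for size in (20, 40, 80):
        print(size, window_padding(size, size))
```

With a 640x640 input, the P5/32 feature map is 20x20, so with the default `window_size=7` each side is padded by one to 21 and split into 3x3 windows. The padding is cropped away again after attention.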