DefaultOptimWrapperConstructor
- class mmengine.optim.DefaultOptimWrapperConstructor(optim_wrapper_cfg, paramwise_cfg=None)[source]
Default constructor for optimizers.
By default, each parameter shares the same optimizer settings. The argument paramwise_cfg specifies parameter-wise settings. It is a dict and may contain the following fields:
- custom_keys (dict): Parameter-wise settings specified by keys. If one of the keys in custom_keys is a substring of the name of a parameter, the settings of that parameter are taken from custom_keys[key], and other settings such as bias_lr_mult are ignored. The key used is the longest key that is a substring of the parameter name. If multiple matched keys have the same length, the key that comes first in alphabetical order is chosen. custom_keys[key] should be a dict and may contain the fields lr_mult and decay_mult. See Example 2 below.
- bias_lr_mult (float): Multiplied to the learning rate of all bias parameters (except those in normalization layers and offset layers of DCN).
- bias_decay_mult (float): Multiplied to the weight decay of all bias parameters (except those in normalization layers, depthwise conv layers, and offset layers of DCN).
- norm_decay_mult (float): Multiplied to the weight decay of all weight and bias parameters of normalization layers.
- flat_decay_mult (float): Multiplied to the weight decay of all one-dimensional parameters.
- dwconv_decay_mult (float): Multiplied to the weight decay of all weight and bias parameters of depthwise conv layers.
- dcn_offset_lr_mult (float): Multiplied to the learning rate of the parameters of the offset layers in the deformable convs of a model.
- bypass_duplicate (bool): If true, duplicate parameters are not added to the optimizer. Defaults to False.
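The matching rule for custom_keys (longest substring match wins, ties broken alphabetically) can be sketched in plain Python. The helper select_custom_key and the parameter names below are hypothetical and are not part of mmengine; this is only an illustration of the documented rule, not the library's implementation.

```python
from typing import Optional


def select_custom_key(param_name: str, custom_keys: dict) -> Optional[str]:
    """Return the key that would govern ``param_name``, or None if none match.

    Hypothetical helper illustrating the documented rule; not an mmengine API.
    """
    # Sort alphabetically first, then by length (longest first). Python's sort
    # is stable, so keys of equal length stay in alphabetical order.
    ordered = sorted(sorted(custom_keys), key=len, reverse=True)
    for key in ordered:
        if key in param_name:  # substring test against the parameter name
            return key
    return None


custom_keys = {
    'backbone': dict(lr_mult=0.1),
    'backbone.layer1': dict(lr_mult=0.01),
    'head': dict(lr_mult=2.0),
}

chosen = select_custom_key('backbone.layer1.conv.weight', custom_keys)
fallback = select_custom_key('neck.conv.weight', custom_keys)
print(chosen)    # backbone.layer1
print(fallback)  # None
```

When no key matches, the parameter falls back to the other fields of paramwise_cfg (bias_lr_mult, norm_decay_mult, and so on).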
Note
1. If the option dcn_offset_lr_mult is used, the constructor will override the effect of bias_lr_mult on the bias of the offset layer. So be careful when using both bias_lr_mult and dcn_offset_lr_mult. If you wish to apply both of them to the offset layer in deformable convs, set dcn_offset_lr_mult to the original dcn_offset_lr_mult * bias_lr_mult.
2. If the option dcn_offset_lr_mult is used, the constructor will apply it to all the DCN layers in the model. So be careful when the model contains multiple DCN layers in places other than the backbone.
- Parameters:
optim_wrapper_cfg (dict) –
The config dict of the optimizer wrapper.
Required fields of optim_wrapper_cfg are:
- type: class name of the optimizer wrapper.
- optimizer: the configuration of the optimizer.
Optional fields of optim_wrapper_cfg are any arguments of the corresponding optimizer wrapper type, e.g., accumulative_counts, clip_grad, etc.
Required fields of optimizer are:
- type: class name of the optimizer.
Optional fields of optimizer are any arguments of the corresponding optimizer type, e.g., lr, weight_decay, momentum, etc.
paramwise_cfg (dict, optional) – Parameter-wise options.
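As a rough illustration of the required and optional fields listed above, a config might look like the sketch below. All numeric values are arbitrary placeholders, and the contents of the clip_grad dict (max_norm) are an assumption about what the chosen wrapper accepts; check the wrapper's own documentation before relying on them.

```python
# A config sketch; the values are placeholders, not recommendations.
optim_wrapper_cfg = dict(
    type='OptimWrapper',     # required: class name of the optimizer wrapper
    optimizer=dict(          # required: configuration of the optimizer
        type='SGD',          # required: class name of the optimizer
        lr=0.01,             # optional: arguments of the optimizer type
        momentum=0.9,
        weight_decay=0.0001,
    ),
    accumulative_counts=4,   # optional: argument of the wrapper type
    clip_grad=dict(max_norm=1.0),  # optional; sub-fields are assumed here
)

# Parameter-wise options are passed separately from the wrapper config.
paramwise_cfg = dict(bias_lr_mult=2.0, norm_decay_mult=0.0)
```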
- Example 1:
>>> import torch
>>> from mmengine.optim import DefaultOptimWrapperConstructor
>>> model = torch.nn.modules.Conv1d(1, 1, 1)
>>> optim_wrapper_cfg = dict(
...     type='OptimWrapper',
...     optimizer=dict(type='SGD', lr=0.01, momentum=0.9,
...                    weight_decay=0.0001))
>>> paramwise_cfg = dict(norm_decay_mult=0.)
>>> optim_wrapper_builder = DefaultOptimWrapperConstructor(
...     optim_wrapper_cfg, paramwise_cfg)
>>> optim_wrapper = optim_wrapper_builder(model)
- Example 2:
>>> # assume model has the attributes model.backbone and model.cls_head
>>> optim_wrapper_cfg = dict(type='OptimWrapper', optimizer=dict(
...     type='SGD', lr=0.01, weight_decay=0.95))
>>> paramwise_cfg = dict(custom_keys={
...     'backbone': dict(lr_mult=0.1, decay_mult=0.9)})
>>> optim_wrapper_builder = DefaultOptimWrapperConstructor(
...     optim_wrapper_cfg, paramwise_cfg)
>>> optim_wrapper = optim_wrapper_builder(model)
>>> # Then the `lr` and `weight_decay` for model.backbone are
>>> # (0.01 * 0.1, 0.95 * 0.9). `lr` and `weight_decay` for
>>> # model.cls_head are (0.01, 0.95).
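The arithmetic behind Example 2 can be reproduced in plain Python without building a model. The variable names below are illustrative only; the sketch assumes a missing multiplier is treated as 1.0, which matches the documented result for model.cls_head.

```python
# Base settings taken from the optimizer config in Example 2.
base_lr, base_wd = 0.01, 0.95

# Multipliers from custom_keys['backbone'] in Example 2.
backbone_cfg = dict(lr_mult=0.1, decay_mult=0.9)

# Parameters matched by 'backbone' get scaled values
# (assumption: an absent multiplier behaves like 1.0).
backbone_lr = base_lr * backbone_cfg.get('lr_mult', 1.0)
backbone_wd = base_wd * backbone_cfg.get('decay_mult', 1.0)

# Parameters under model.cls_head match no key and keep the base values.
head_lr, head_wd = base_lr, base_wd

print(backbone_lr, backbone_wd)
print(head_lr, head_wd)
```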
- add_params(params, module, prefix='', is_dcn_module=None)[source]
Add all parameters of module to the params list.
The parameters of the given module will be added to the list of param groups, with specific rules defined by paramwise_cfg.
- Parameters:
params (list[dict]) – A list of param groups; it will be modified in place.
module (nn.Module) – The module to be added.
prefix (str) – The prefix of the module. Defaults to ''.
is_dcn_module (int|float|None) – If the current module is a submodule of DCN, is_dcn_module will be passed to control conv_offset layer’s learning rate. Defaults to None.
- Return type:
None
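The calling convention of add_params (a list that is filled in place, a prefix that names the current module, and a return value of None) can be illustrated with a toy traversal. ToyModule and add_params_sketch are hypothetical stand-ins written for this sketch; how the prefix is joined with child names is an assumption, and the real method additionally applies the paramwise_cfg rules to each group.

```python
from dataclasses import dataclass, field


@dataclass
class ToyModule:
    """Hypothetical stand-in for nn.Module: own parameters plus named children."""
    own_params: dict = field(default_factory=dict)
    children: dict = field(default_factory=dict)


def add_params_sketch(params: list, module: ToyModule, prefix: str = '') -> None:
    """Append one param group per parameter to ``params`` (mutated in place)."""
    for name, value in module.own_params.items():
        full_name = f'{prefix}.{name}' if prefix else name
        params.append({'name': full_name, 'params': [value]})
    # Recurse into children, extending the prefix with the child's name.
    for child_name, child in module.children.items():
        child_prefix = f'{prefix}.{child_name}' if prefix else child_name
        add_params_sketch(params, child, prefix=child_prefix)


model = ToyModule(children={
    'backbone': ToyModule(own_params={'weight': 1.0, 'bias': 0.0}),
    'cls_head': ToyModule(own_params={'weight': 2.0}),
})

groups = []                                # filled in place by the call below
result = add_params_sketch(groups, model)  # returns None
names = [g['name'] for g in groups]
print(names)
```

Because the list is modified in place, the caller keeps a reference to groups and ignores the return value.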