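ClearML Parameter Search CLI (HPO)

Use the clearml-param-search CLI tool to launch ClearML's automated hyperparameter optimization (HPO). This process finds the hyperparameter values for your experiment that yield the best-performing model.

How Does clearml-param-search Work?

  1. Execute clearml-param-search, specifying the base task whose parameters will be optimized, and a set of parameter values and/or ranges to test. This creates an optimization task that manages the whole optimization process.
  2. clearml-param-search creates multiple clones of the base task: each clone's parameters are set to values from the specified parameter space.
  3. Each clone is enqueued for execution by a ClearML Agent.

The optimization task records and monitors the configuration and execution details of the cloned tasks, and returns a summary of the optimization results as tables and graphs.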
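
The clone-and-enqueue workflow can be pictured with a small, purely conceptual sketch. It is not ClearML's implementation: `make_clones`, the sample parameter names, and the plain list standing in for an agent queue are all hypothetical, and a real optimizer such as Optuna samples the space rather than enumerating every combination.

```python
import itertools


def make_clones(base_params, search_space):
    """Return one parameter dict per combination in search_space.

    Hypothetical helper: enumerates a grid for simplicity, whereas a real
    optimizer would decide which points of the space to try.
    """
    names = list(search_space)
    clones = []
    for combo in itertools.product(*(search_space[n] for n in names)):
        params = dict(base_params)        # copy the base task's parameters
        params.update(zip(names, combo))  # override only the searched ones
        clones.append(params)
    return clones


# Parameters of an imaginary base task
base = {"General/layer_1": 512, "General/batch_size": 128, "General/epochs": 30}

# Values to test for the parameters being optimized
space = {"General/layer_1": [128, 256], "General/batch_size": [96, 128, 160]}

# Stand-in for enqueueing each clone on an agent queue
queue = make_clones(base, space)
for params in queue:
    print(params)
```

Each dict corresponds to one cloned task; parameters that are not part of the search (here `General/epochs`) keep the base task's value.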

Execution Configuration

Command Line Options

| Name | Description | Optional |
| --- | --- | --- |
| --args | List of <argument>=<value> strings to pass to the remote execution. Currently only argparse/click/hydra/fire arguments are supported. Example: --args lr=0.003 batch_size=64 | Yes |
| --compute-time-limit | The maximum compute time, in minutes, that the experiments can consume. If this time limit is exceeded, all jobs are aborted. | Yes |
| --max-iteration-per-job | The maximum iterations (of the objective metric) per single job. When the maximum is exceeded, the job is aborted. | Yes |
| --max-number-of-concurrent-tasks | The maximum number of concurrent Tasks (experiments) running at the same time. | Yes |
| --min-iteration-per-job | The minimum iterations (of the objective metric) per single job. | Yes |
| --local | If set, run the experiments locally. Note that no new Python environment will be created. The --script parameter must point to a local file entry point and all arguments must be passed with --args. | Yes |
| --objective-metric-series | Objective metric series to maximize/minimize (e.g. 'loss'). | No |
| --objective-metric-sign | Optimization target: whether to maximize or minimize the value of the specified objective metric. Possible values: "min", "max", "min_global", "max_global". For more information, see Optimization Objective. | No |
| --objective-metric-title | Objective metric title to maximize/minimize (e.g. 'validation'). | No |
| --optimization-time-limit | The maximum time (minutes) for the optimization to run. The default is None, indicating no time limit. | Yes |
| --optimizer-class | The optimizer to use. Possible values are: OptimizerOptuna (default), OptimizerBOHB, GridSearch, RandomSearch. For more information, see Supported Optimizers. | No |
| --params-search | Parameter space for optimization. For more information, see Specifying the Parameter Space. | No |
| --params-override | Additional parameters of the base task to override for this parameter search. Use the following JSON format for each parameter: {"name": "param_name", "value": <new_value>}. Windows users, see JSON format note. | Yes |
| --pool-period-min | The time between two consecutive polls (minutes). | Yes |
| --project-name | Name of the project in which the optimization task will be created. If the project does not exist, it is created. If unspecified, the repository name is used. | Yes |
| --queue | Queue to enqueue the experiments on. | Yes |
| --save-top-k-tasks-only | Keep only the top <k> performing tasks, and archive the rest of the experiments. Input -1 to keep all tasks. Default: 10. | Yes |
| --script | Script to run the parameter search on. Required unless --task-id is specified. | Yes |
| --task-id | ID of a ClearML task whose hyperparameters will be optimized. Required unless --script is specified. | Yes |
| --task-name | Name of the optimization task. If unspecified, the base Python script's file name is used. | Yes |
| --time-limit-per-job | Maximum execution time per single job in minutes. When the time limit is exceeded, the job is aborted. Default: no time limit. | Yes |
| --total-max-jobs | The total maximum number of jobs for the optimization process. The default value is None, for unlimited. | Yes |
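
When the command is launched from a script, assembling the options programmatically can help avoid quoting mistakes. The following is a minimal sketch, assuming only the option names listed in the table; `build_command` is a hypothetical helper and not part of ClearML, and the required-option check reflects one reading of the "Optional" column.

```python
import json
import shlex

# Objective options marked as non-optional in the table above
REQUIRED = ("objective_metric_title", "objective_metric_series", "objective_metric_sign")


def build_command(params_search, *, script=None, task_id=None, **options):
    """Return an argv list for clearml-param-search (hypothetical helper)."""
    if script is None and task_id is None:
        raise ValueError("either script or task_id is required")
    missing = [name for name in REQUIRED if name not in options]
    if missing:
        raise ValueError(f"missing required options: {missing}")

    argv = ["clearml-param-search"]
    if script is not None:
        argv += ["--script", script]
    if task_id is not None:
        argv += ["--task-id", task_id]

    # Each parameter definition is passed as its own JSON string
    argv.append("--params-search")
    argv += [json.dumps(spec) for spec in params_search]

    for name, value in options.items():
        flag = "--" + name.replace("_", "-")
        if value is True:          # switches such as --local take no value
            argv.append(flag)
        else:
            argv += [flag, str(value)]
    return argv


argv = build_command(
    [{"type": "DiscreteParameterRange", "name": "General/batch_size", "values": [96, 128, 160]}],
    script="keras_simple.py",
    objective_metric_title="validation",
    objective_metric_series="epoch_accuracy",
    objective_metric_sign="max",
    queue="default",
)

# POSIX-shell rendering for display; the list itself can go to subprocess.run
print(shlex.join(argv))
```

Passing the `argv` list directly to `subprocess.run` sidesteps shell quoting altogether, which is why the helper returns a list rather than a string.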

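Specifying the Parameter Space

To configure the parameter values to test during the hyperparameter optimization process, pass the parameter search specification through the --params-search option as a list of parameter definitions.

Use the following JSON format for each parameter: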

{
    "name": str, # Name of the parameter you want to optimize
    "type": Union["LogUniformParameterRange", "UniformParameterRange", "UniformIntegerParameterRange", "DiscreteParameterRange"],
    # Additional fields depending on type - see below
}

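The following are the parameter type options and their corresponding fields:

  • LogUniformParameterRange

    • "min_value": float - The minimum exponent sample to use for log-uniform random sampling
    • "max_value": float - The maximum exponent sample to use for log-uniform random sampling
    • "base": Optional[float] - The base that is raised to the sampled exponent. Default: 10
    • "step_size": Optional[float] - Step size (quantization) for value sampling. Default: None
    • "include_max_value": Optional[bool] - Whether to include max_value in the range. Default: True
  • UniformParameterRange

    • "min_value": float - The minimum sample to use for uniform random sampling
    • "max_value": float - The maximum sample to use for uniform random sampling
    • "step_size": Optional[float] - Step size (quantization) for value sampling. Default: None
    • "include_max_value": Optional[bool] - Whether to include max_value in the range. Default: True
  • UniformIntegerParameterRange

    • "min_value": int - The minimum sample to use for uniform random sampling
    • "max_value": int - The maximum sample to use for uniform random sampling
    • "step_size": Optional[int] - Step size for value sampling. Default: 1
    • "include_max_value": Optional[bool] - Whether to include max_value in the range. Default: True
  • DiscreteParameterRange

    • "values": List[Any] - The list of valid parameter values to sample from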
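
The field descriptions above can be illustrated with a short sketch. It encodes one reading of the list, not ClearML's own code: `param_spec`, `integer_values`, and `log_uniform_bounds` are hypothetical helpers, and the enumeration and bounds they compute are assumptions drawn from the descriptions of step_size, include_max_value, and base.

```python
import json

# Fields each type needs, as listed above (optional fields are not checked)
REQUIRED_FIELDS = {
    "LogUniformParameterRange": {"min_value", "max_value"},
    "UniformParameterRange": {"min_value", "max_value"},
    "UniformIntegerParameterRange": {"min_value", "max_value"},
    "DiscreteParameterRange": {"values"},
}


def param_spec(name, type_, **fields):
    """Return one --params-search definition as a JSON string."""
    if type_ not in REQUIRED_FIELDS:
        raise ValueError(f"unknown parameter type: {type_}")
    missing = REQUIRED_FIELDS[type_] - fields.keys()
    if missing:
        raise ValueError(f"{type_} is missing fields: {sorted(missing)}")
    return json.dumps({"type": type_, "name": name, **fields})


def integer_values(min_value, max_value, step_size=1, include_max_value=True):
    """Candidate values of an integer range, assuming simple stepping from min_value."""
    stop = max_value + 1 if include_max_value else max_value
    return list(range(min_value, stop, step_size))


def log_uniform_bounds(min_value, max_value, base=10):
    """Value range of a log-uniform parameter: min_value/max_value are exponents."""
    return base ** min_value, base ** max_value


layer_spec = param_spec(
    "General/layer_1", "UniformIntegerParameterRange",
    min_value=128, max_value=512, step_size=128,
)
print(layer_spec)
print(integer_values(128, 512, 128))   # candidate layer sizes
print(log_uniform_bounds(-5, -1))      # e.g. a learning-rate range
```

Note that for LogUniformParameterRange the bounds are exponents, so a learning rate between 0.00001 and 0.1 is written with min_value -5 and max_value -1 at the default base of 10.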

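For example, to run a parameter search over uniform integer ranges from 128 to 512 (in steps of 128) for the layer_1 and layer_2 sizes, and over batch sizes of 96, 128, and 160, use the following command: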

clearml-param-search --script keras_simple.py --params-search '{"type": "UniformIntegerParameterRange", "name": "General/layer_1", "min_value": 128, "max_value": 512, "step_size": 128}' '{"type": "UniformIntegerParameterRange", "name": "General/layer_2", "min_value": 128, "max_value": 512, "step_size": 128}' '{"type": "DiscreteParameterRange", "name": "General/batch_size", "values": [96, 128, 160]}' --params-override '{"name": "epochs", "value": 30}' --objective-metric-title validation --objective-metric-series epoch_accuracy --objective-metric-sign max --optimizer-class OptimizerOptuna --queue default

JSON format for Windows Users

Windows users must add an escape character (\) before each quotation mark (") used in JSON-format input. For example:

clearml-param-search --script base_template_keras_simple.py --params-search "{\"type\": \"UniformIntegerParameterRange\", \"name\": \"General/layer_1\", \"min_value\": 128, \"max_value\": 512, \"step_size\": 128}" "{\"type\": \"UniformIntegerParameterRange\", \"name\": \"General/layer_2\", \"min_value\": 128, \"max_value\": 512, \"step_size\": 128}" "{\"type\": \"DiscreteParameterRange\", \"name\": \"General/batch_size\", \"values\": [96, 128, 160]}" --params-override "{\"name\": \"epochs\", \"value\": 30}"  --objective-metric-title validation --objective-metric-series epoch_accuracy --objective-metric-sign max --optimizer-class OptimizerOptuna --max-iteration-per-job 30 --queue default
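
Escaping every quotation mark by hand is error-prone, so the Windows form can be generated instead. The sketch below is a hedged illustration: it relies on `subprocess.list2cmdline`, a helper in Python's standard library that applies the Microsoft C runtime quoting rules but is not part of the module's documented public interface, and the parameter definitions are simply copied from the example above.

```python
import json
import subprocess

# Parameter definitions taken from the example command above
specs = [
    {"type": "UniformIntegerParameterRange", "name": "General/layer_1",
     "min_value": 128, "max_value": 512, "step_size": 128},
    {"type": "DiscreteParameterRange", "name": "General/batch_size",
     "values": [96, 128, 160]},
]
override = {"name": "epochs", "value": 30}

args = ["clearml-param-search", "--script", "base_template_keras_simple.py"]
args += ["--params-search"] + [json.dumps(spec) for spec in specs]
args += ["--params-override", json.dumps(override)]
args += ["--objective-metric-title", "validation",
         "--objective-metric-series", "epoch_accuracy",
         "--objective-metric-sign", "max",
         "--queue", "default"]

# Wraps each JSON argument in double quotes and backslash-escapes inner quotes
windows_command = subprocess.list2cmdline(args)
print(windows_command)
```

When the command is run from Python rather than pasted into a terminal, passing `args` as a list to `subprocess.run` performs the same quoting implicitly on Windows.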

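Optimization Objective

Use --objective-metric-sign to specify which value of the objective metric the optimization process should treat as best. The options are:

  • min - The lowest value of the specified objective metric, as reported at the end of the experiment
  • max - The highest value of the specified objective metric, as reported at the end of the experiment
  • min_global - The lowest value of the specified objective metric reported at any time during the experiment
  • max_global - The highest value of the specified objective metric reported at any time during the experiment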
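
The difference between the plain and the _global options can be sketched as follows. This illustrates the descriptions above rather than ClearML's implementation: `objective_value`, `best_experiment`, and the sample metric histories are hypothetical.

```python
def objective_value(reported, sign):
    """Value used to score one experiment, given its reported metric history."""
    if sign in ("min", "max"):
        return reported[-1]        # value reported at the end of the experiment
    if sign == "min_global":
        return min(reported)       # lowest value reported at any time
    if sign == "max_global":
        return max(reported)       # highest value reported at any time
    raise ValueError(f"unknown sign: {sign}")


def best_experiment(experiments, sign):
    """Name of the experiment that wins under the given sign."""
    pick = min if sign.startswith("min") else max
    return pick(experiments, key=lambda name: objective_value(experiments[name], sign))


# Hypothetical accuracy histories: "a" peaks early, "b" finishes higher
experiments = {
    "a": [0.70, 0.92, 0.85],
    "b": [0.60, 0.80, 0.88],
}

print(best_experiment(experiments, "max"))         # compares final values
print(best_experiment(experiments, "max_global"))  # compares peak values
```

With these histories, `max` favors experiment "b" because its final value is higher, while `max_global` favors "a" because of its earlier peak.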