ClearML Serving CLI

The clearml-serving utility is a CLI tool for model deployment and orchestration.

The following page provides a reference for the clearml-serving CLI commands:

  • list - List running Serving Services
  • create - Create a new Serving Service
  • metrics - Configure the inference metrics Service
  • config - Configure a new Serving Service
  • model - Configure model endpoints for a running Service

Global Parameters

clearml-serving [-h] [--debug] [--yes] [--id ID] {list,create,metrics,config,model} 
|Name|Description|Optional|
|---|---|---|
|--id|Serving Service (Control plane) Task ID to configure (if not provided, automatically detect the running control plane Task)|Yes|
|--debug|Print debug messages|Yes|
|--yes|Always answer YES on interactive inputs|Yes|
Service ID

The Serving Service ID (--id) is required to execute the metrics, config, and model commands.
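
For example, to list the model endpoints of a specific Serving Service (the service ID below is a placeholder):

clearml-serving --id <service_id> model list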

list

List running Serving Services.

clearml-serving list [-h]

create

Create a new Serving Service.

clearml-serving create [-h] [--name NAME] [--tags TAGS [TAGS ...]] [--project PROJECT]

Parameters

|Name|Description|Optional|
|---|---|---|
|--name|Serving service's name. Default: Serving-Service|Yes|
|--project|Serving service's project. Default: DevOps|Yes|
|--tags|Serving service's user tags. The serving service can be labeled, which can be useful for organizing|Yes|
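
For example, a minimal sketch of creating a service with a custom name, project, and tag (all values below are illustrative):

clearml-serving create --name "my serving service" --project "DevOps" --tags prod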

metrics

Configure the inference metrics Service.

clearml-serving metrics [-h] {add,remove,list}

add

Add/modify metrics for a specific endpoint.

clearml-serving metrics add [-h] --endpoint ENDPOINT [--log-freq LOG_FREQ]
[--variable-scalar VARIABLE_SCALAR [VARIABLE_SCALAR ...]]
[--variable-enum VARIABLE_ENUM [VARIABLE_ENUM ...]]
[--variable-value VARIABLE_VALUE [VARIABLE_VALUE ...]]

Parameters

|Name|Description|Optional|
|---|---|---|
|--endpoint|Metric endpoint name including version (e.g. "model/1" or a prefix "model/*"). Notice: it will override any previously logged endpoint metrics|No|
|--log-freq|Logging request frequency, between 0.0 and 1.0. Example: 1.0 means all requests are logged, 0.5 means half of the requests are logged. If not specified, the global logging frequency is used (see config --metric-log-freq)|Yes|
|--variable-scalar|Add float (scalar) argument to the metric logger, <name>=<histogram>. Example: with specific buckets: "x1=0,0.2,0.4,0.6,0.8,1" or with min/max/num_buckets "x1=0.0/1.0/5". Notice: when thousands of requests per second reach the serving service, it makes no sense to display every data point. Instead, scalars can be divided into buckets (per minute, for example), making it possible to calculate what percentage of the total traffic fell into bucket 1, bucket 2, bucket 3, etc. The Y axis represents the buckets, the color is the percentage of traffic in that bucket, and X is time.|Yes|
|--variable-enum|Add enum (string) argument to the metric logger, <name>=<optional_values>. Example: "detect=cat,dog,sheep"|Yes|
|--variable-value|Add non-sampled scalar argument to the metric logger, <name>. Example: "latency"|Yes|
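
For example, a hypothetical call that logs a scalar histogram, an enum, and a latency value for the "model/1" endpoint, reusing the example values from the table above:

clearml-serving --id <service_id> metrics add --endpoint "model/1" --variable-scalar "x1=0,0.2,0.4,0.6,0.8,1" --variable-enum "detect=cat,dog,sheep" --variable-value "latency"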

remove

Remove metrics from a specific endpoint.

clearml-serving metrics remove [-h] [--endpoint ENDPOINT]
[--variable VARIABLE [VARIABLE ...]]

Parameters

|Name|Description|Optional|
|---|---|---|
|--endpoint|Metric endpoint name including version (e.g. "model/1" or a prefix "model/*")|No|
|--variable|Remove (scalar/enum) argument from the metric logger, <name>. Example: "x1"|Yes|
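
For example, to stop logging the "x1" scalar on the "model/1" endpoint (placeholder names from the examples above):

clearml-serving --id <service_id> metrics remove --endpoint "model/1" --variable "x1"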

list

List the metrics logged on all endpoints.

clearml-serving metrics list [-h]

config

Configure a new Serving Service.

clearml-serving config [-h] [--base-serving-url BASE_SERVING_URL]
[--triton-grpc-server TRITON_GRPC_SERVER]
[--kafka-metric-server KAFKA_METRIC_SERVER]
[--metric-log-freq METRIC_LOG_FREQ]

Parameters

|Name|Description|Optional|
|---|---|---|
|--base-serving-url|External base serving service url. Example: http://127.0.0.1:8080/serve|Yes|
|--triton-grpc-server|External ClearML-Triton serving container gRPC address. Example: 127.0.0.1:9001|Yes|
|--kafka-metric-server|External Kafka service url. Example: 127.0.0.1:9092|Yes|
|--metric-log-freq|Set default metric logging frequency, between 0.0 and 1.0. 1.0 means that 100% of all requests are logged|Yes|
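
For example, a sketch that points the service at external serving, Triton, and Kafka addresses, reusing the illustrative values from the table above:

clearml-serving --id <service_id> config --base-serving-url http://127.0.0.1:8080/serve --triton-grpc-server 127.0.0.1:9001 --kafka-metric-server 127.0.0.1:9092 --metric-log-freq 1.0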

model

Configure model endpoints for an already running Service.

clearml-serving model [-h] {list,remove,upload,canary,auto-update,add}

list

List current models.

clearml-serving model list [-h]

remove

Remove a model by its endpoint name.

clearml-serving model remove [-h] [--endpoint ENDPOINT]

Parameters

|Name|Description|Optional|
|---|---|---|
|--endpoint|Model endpoint name|No|
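
For example, removing a hypothetical "my_model" endpoint:

clearml-serving --id <service_id> model remove --endpoint "my_model"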

upload

Upload and register model files/folder.

clearml-serving model upload [-h] --name NAME [--tags TAGS [TAGS ...]] --project PROJECT
[--framework {tensorflow,tensorflowjs,tensorflowlite,pytorch,torchscript,caffe,caffe2,onnx,keras,mknet,cntk,torch,darknet,paddlepaddle,scikitlearn,xgboost,lightgbm,parquet,megengine,catboost,tensorrt,openvino,custom}]
[--publish] [--path PATH] [--url URL]
[--destination DESTINATION]

Parameters

|Name|Description|Optional|
|---|---|---|
|--name|Specify the model name to be registered|No|
|--tags|Add tags to the newly created model|Yes|
|--project|Specify the project for the model to be registered in|No|
|--framework|Specify the model framework. Options are: 'tensorflow', 'tensorflowjs', 'tensorflowlite', 'pytorch', 'torchscript', 'caffe', 'caffe2', 'onnx', 'keras', 'mknet', 'cntk', 'torch', 'darknet', 'paddlepaddle', 'scikitlearn', 'xgboost', 'lightgbm', 'parquet', 'megengine', 'catboost', 'tensorrt', 'openvino', 'custom'|Yes|
|--publish|Publish the newly created model (change model state to "published", i.e. locked and ready to deploy)|Yes|
|--path|Specify a model file/folder to be uploaded and registered|Yes|
|--url|Specify an already uploaded model url (e.g. s3://bucket/model.bin, gs://bucket/model.bin)|Yes|
|--destination|Specify the target destination for the model to be uploaded. For example: s3://bucket/folder/, s3://host_addr:port/bucket (for non-AWS S3-like services like MinIO), gs://bucket-name/folder, azure://<account name>.blob.core.windows.net/path/to/file|Yes|
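
For example, a sketch of registering a local scikit-learn model file (the name, project, and file path are placeholders):

clearml-serving --id <service_id> model upload --name "manual sklearn model" --project "serving examples" --framework scikitlearn --path model.pkl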

canary

Add a model Canary/A/B endpoint.

clearml-serving model canary [-h] [--endpoint ENDPOINT] [--weights WEIGHTS [WEIGHTS ...]]
[--input-endpoints INPUT_ENDPOINTS [INPUT_ENDPOINTS ...]]
[--input-endpoint-prefix INPUT_ENDPOINT_PREFIX]

Parameters

|Name|Description|Optional|
|---|---|---|
|--endpoint|Model canary serving endpoint name (e.g. my_model/latest)|Yes|
|--weights|Model canary weights (order matching the model endpoints) (e.g. 0.2 0.8)|Yes|
|--input-endpoints|Model endpoint prefixes, can also include version (e.g. my_model, my_model/v1)|Yes|
|--input-endpoint-prefix|Model endpoint prefix, lexicographic order or by version <int> (e.g. my_model/1, my_model/v1), where the first weight matches the last version|Yes|
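
For example, a hypothetical A/B split that sends 10% of traffic to a newer model version and 90% to the previous one (all endpoint names are placeholders):

clearml-serving --id <service_id> model canary --endpoint "my_model/latest" --weights 0.1 0.9 --input-endpoints my_model/2 my_model/1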

auto-update

Add/modify the model auto-update service.

clearml-serving model auto-update [-h] [--endpoint ENDPOINT] --engine ENGINE
[--max-versions MAX_VERSIONS] [--name NAME]
[--tags TAGS [TAGS ...]] [--project PROJECT]
[--published] [--preprocess PREPROCESS]
[--input-size INPUT_SIZE [INPUT_SIZE ...]]
[--input-type INPUT_TYPE] [--input-name INPUT_NAME]
[--output-size OUTPUT_SIZE [OUTPUT_SIZE ...]]
[--output_type OUTPUT_TYPE] [--output-name OUTPUT_NAME]
[--aux-config AUX_CONFIG [AUX_CONFIG ...]]

Parameters

|Name|Description|Optional|
|---|---|---|
|--endpoint|Base model endpoint (must be unique)|No|
|--engine|Model endpoint serving engine (triton, sklearn, xgboost, lightgbm)|No|
|--max-versions|Max versions to store (and create endpoints) for the model. The highest number is the latest version|Yes|
|--name|Specify model name to be selected and auto-updated (notice regexp selection use "$name^" for exact match)|Yes|
|--tags|Specify tags to be selected and auto-updated|Yes|
|--project|Specify model project to be selected and auto-updated|Yes|
|--published|Only select published models for auto-update|Yes|
|--preprocess|Specify Pre/Post processing code to be used with the model (point to local file / folder) - this should hold for all the models|Yes|
|--input-size|Specify the model matrix input size [Rows x Columns x Channels etc.]|Yes|
|--input-type|Specify the model matrix input type. Examples: uint8, float32, int16, float16, etc.|Yes|
|--input-name|Specify the model layer to push input into. Example: layer_0|Yes|
|--output-size|Specify the model matrix output size [Rows x Columns x Channels etc.]|Yes|
|--output_type|Specify the model matrix output type. Examples: uint8, float32, int16, float16, etc.|Yes|
|--output-name|Specify the model layer to pull results from. Example: layer_99|Yes|
|--aux-config|Specify additional engine specific auxiliary configuration in the form of key=value. Example: platform=onnxruntime_onnx response_cache.enable=true max_batch_size=8. Notice: you can also pass a full configuration file (e.g. Triton "config.pbtxt")|Yes|
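
For example, a sketch that keeps endpoints for the two latest published versions of a Triton model, selected by name and project (all selection values, layer names, and sizes below are placeholders, with the flag spellings taken from the usage line above):

clearml-serving --id <service_id> model auto-update --engine triton --endpoint "my_model_auto" --name "train model" --project "examples" --published --max-versions 2 --input-size 1 28 28 --input-type float32 --input-name layer_0 --output-size -1 10 --output_type float32 --output-name layer_99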

add

Add/update a model.

clearml-serving model add [-h] --engine ENGINE --endpoint ENDPOINT [--version VERSION]
[--model-id MODEL_ID] [--preprocess PREPROCESS]
[--input-size INPUT_SIZE [INPUT_SIZE ...]]
[--input-type INPUT_TYPE] [--input-name INPUT_NAME]
[--output-size OUTPUT_SIZE [OUTPUT_SIZE ...]]
[--output-type OUTPUT_TYPE] [--output-name OUTPUT_NAME]
[--aux-config AUX_CONFIG [AUX_CONFIG ...]] [--name NAME]
[--tags TAGS [TAGS ...]] [--project PROJECT] [--published]

Parameters

|Name|Description|Optional|
|---|---|---|
|--engine|Model endpoint serving engine (triton, sklearn, xgboost, lightgbm)|No|
|--endpoint|Base model endpoint (must be unique)|No|
|--version|Model endpoint version (default: None)|Yes|
|--model-id|Specify a model ID to be served|No|
|--preprocess|Specify Pre/Post processing code to be used with the model (point to local file / folder) - this should hold for all the models|Yes|
|--input-size|Specify the model matrix input size [Rows x Columns x Channels etc.]|Yes|
|--input-type|Specify the model matrix input type. Examples: uint8, float32, int16, float16, etc.|Yes|
|--input-name|Specify the model layer to push input into. Example: layer_0|Yes|
|--output-size|Specify the model matrix output size [Rows x Columns x Channels etc.]|Yes|
|--output_type|Specify the model matrix output type. Examples: uint8, float32, int16, float16, etc.|Yes|
|--output-name|Specify the model layer to pull results from. Example: layer_99|Yes|
|--aux-config|Specify additional engine specific auxiliary configuration in the form of key=value. Example: platform=onnxruntime_onnx response_cache.enable=true max_batch_size=8. Notice: you can also pass a full configuration file (e.g. Triton "config.pbtxt")|Yes|
|--name|Instead of specifying --model-id, select based on model name|Yes|
|--tags|Specify tags to be selected and auto-updated|Yes|
|--project|Instead of specifying --model-id, select based on model project|Yes|
|--published|Instead of specifying --model-id, select based on whether the model is published|Yes|
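
For example, a minimal sketch that serves a specific model ID behind a new sklearn endpoint (the endpoint name, model ID, and preprocessing file are placeholders):

clearml-serving --id <service_id> model add --engine sklearn --endpoint "test_model_sklearn" --model-id <model_id> --preprocess "preprocess.py"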