FastAPI + Docker
Setup
In [ ]:
pip install ydf -U
In [1]:
import ydf
import pandas as pd
In [2]:
dataset_path = "https://raw.githubusercontent.com/google/yggdrasil-decision-forests/main/yggdrasil_decision_forests/test_data/dataset"
dataset = pd.read_csv(f"{dataset_path}/adult_train.csv")
dataset.head(5)
Out[2]:
| | age | workclass | fnlwgt | education | education_num | marital_status | occupation | relationship | race | sex | capital_gain | capital_loss | hours_per_week | native_country | income |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 44 | Private | 228057 | 7th-8th | 4 | Married-civ-spouse | Machine-op-inspct | Wife | White | Female | 0 | 0 | 40 | Dominican-Republic | <=50K |
| 1 | 20 | Private | 299047 | Some-college | 10 | Never-married | Other-service | Not-in-family | White | Female | 0 | 0 | 20 | United-States | <=50K |
| 2 | 40 | Private | 342164 | HS-grad | 9 | Separated | Adm-clerical | Unmarried | White | Female | 0 | 0 | 37 | United-States | <=50K |
| 3 | 30 | Private | 361742 | Some-college | 10 | Married-civ-spouse | Exec-managerial | Husband | White | Male | 0 | 0 | 50 | United-States | <=50K |
| 4 | 67 | Self-emp-inc | 171564 | HS-grad | 9 | Married-civ-spouse | Prof-specialty | Wife | White | Female | 20051 | 0 | 30 | England | >50K |
We train a model with the default hyperparameters:
In [3]:
model = ydf.GradientBoostedTreesLearner(label="income").train(dataset)
Train model on 22792 examples
Model trained in 0:00:01.420861
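We can generate predictions to make sure the model works.
Note that `model.predict` takes a batch of examples as input (i.e., a list of examples). To predict on a single example, pass a dictionary that maps each feature name to a list containing one value.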
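A single record is often available as a plain dictionary of scalar values. The sketch below shows one way to wrap it into the batch layout described above. The helper name `to_batch` and the shortened record are illustrative and not part of YDF.

```python
def to_batch(example):
    """Wrap each feature value of a single example in a one-element list."""
    return {feature: [value] for feature, value in example.items()}


# Shortened, illustrative record; a real request carries every model feature.
single_example = {"age": 44, "workclass": "Private", "education_num": 4}
batch = to_batch(single_example)
print(batch)
```

When the record contains all the features used by the model, the resulting dictionary has the same shape as the input of the `model.predict` call in the next cell.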
In [4]:
model.predict({'age': [44],
               'workclass': ['Private'],
               'fnlwgt': [228057],
               'education': ['7th-8th'],
               'education_num': [4],
               'marital_status': ['Married-civ-spouse'],
               'occupation': ['Machine-op-inspct'],
               'relationship': ['Wife'],
               'race': ['White'],
               'sex': ['Female'],
               'capital_gain': [0],
               'capital_loss': [0],
               'hours_per_week': [40],
               'native_country': ['Dominican-Republic']})
Out[4]:
array([0.02801839], dtype=float32)
For a binary classification model (i.e., a model that predicts one of two classes), the output is the probability of the positive class:
In [5]:
model.label_classes()[True]
Out[5]:
'>50K'
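Since the prediction is the probability of the positive class, turning it into a class name takes one comparison against a decision threshold. The sketch below assumes a threshold of 0.5 and a class list ordered like `model.label_classes()`, with the positive class second. The helper `decode_predictions` is hypothetical and not a YDF function.

```python
def decode_predictions(probabilities, classes, threshold=0.5):
    """Map positive-class probabilities to class names.

    `classes` is expected in the order of model.label_classes():
    classes[0] is the negative class and classes[1] the positive class.
    The 0.5 threshold is an assumption; adjust it to the application.
    """
    negative, positive = classes
    return [positive if p >= threshold else negative for p in probabilities]


# Probability taken from the prediction shown earlier in this notebook.
print(decode_predictions([0.02801839], ["<=50K", ">50K"]))
```

With the probability of about 0.028 obtained above, the example falls on the `<=50K` side of the threshold.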
Packaging the model with Docker
`model.to_docker(path)` exports the model to a Docker endpoint in the directory `path`.
In [6]:
model.to_docker("my_docker_model")
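You can inspect the content of the generated Docker directory. In some advanced cases, you may want to edit some of the automatically generated files.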
In [8]:
!ls -l my_docker_model
total 4
-rw-rw-r-- 1 gbm primarygroup  288 Jul 26 13:39 deploy_in_google_cloud.sh
-rw-rw-r-- 1 gbm primarygroup  211 Jul 26 13:39 Dockerfile
-rw-rw-r-- 1 gbm primarygroup 1313 Jul 26 13:39 main.py
drwxrwxr-x 1 gbm primarygroup    0 Jul 26 13:39 model
-rw-rw-r-- 1 gbm primarygroup  360 Jul 26 13:39 readme.txt
-rw-rw-r-- 1 gbm primarygroup   21 Jul 26 13:39 requirements.txt
-rw-rw-r-- 1 gbm primarygroup  485 Jul 26 13:39 test_locally.sh
The Docker endpoint can be built and tested locally with the following commands:
docker build -t ydf_predict_image ./my_docker_model
docker run --rm -p 8080:8080 -d ydf_predict_image
Note: Running these commands requires Docker to be installed.
In the generated Docker directory, the `test_locally.sh` script shows how to send a request to the local endpoint.
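The same kind of request can also be sent from Python. The following is a minimal sketch using only the standard library. Port 8080 comes from the `docker run` command above, but the `/predict` route and the flat JSON payload are assumptions: check `main.py` and `test_locally.sh` in the generated directory for the actual route and request format. The helper names `build_request` and `send` are illustrative.

```python
import json
import urllib.request


def build_request(example, url="http://localhost:8080/predict"):
    """Build a JSON POST request for one example.

    The default URL path and the payload layout are assumptions; the
    generated main.py defines what the server really expects.
    """
    body = json.dumps(example).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=10):
    """Send the request and decode the JSON response of the server."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Shortened, illustrative record; a real request carries every model feature.
request = build_request({"age": 44, "workclass": "Private"})
print(request.get_method(), request.full_url)
# With the container running, the request would be sent with: send(request)
```

Building the request separately from sending it makes it possible to inspect the payload before the container is started.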
Finally, the Docker endpoint can be deployed to Google Cloud with the following command:
gcloud run deploy ydf-predict --source ./my_docker_model
The deployed model can be monitored in the Google Cloud Console.
Note: Running this command requires the Google Cloud CLI to be installed and a project to be set up.