Utilities
verbose
verbose(level: Union[int, bool] = 2) -> int
Sets the verbose level of YDF.
The verbose levels are:
- 0 or False: Print no logs.
- 1 or True: Print a few logs in a colab or notebook cell; print all the logs in the console. This is the default verbose level.
- 2: Print all the logs on all surfaces.
Usage example:

```python
import pandas as pd
import ydf

save_verbose = ydf.verbose(0)  # Hide all logs
learner = ydf.RandomForestLearner(label="label")
model = learner.train(pd.DataFrame({"feature": [0, 1], "label": [0, 1]}))
ydf.verbose(save_verbose)  # Restore verbose level
```
Parameters:
Name | Type | Description | Default |
---|---|---|---|
level | Union[int, bool] | New verbose level. | 2 |
Returns:
Type | Description |
---|---|
int | The previous verbose level. |
load_model
load_model(directory: str, advanced_options: ModelIOOptions = ModelIOOptions()) -> ModelType
Load a YDF model from disk.
Usage example:

```python
import pandas as pd
import ydf

# Create a model
dataset = pd.DataFrame({"feature": [0, 1], "label": [0, 1]})
learner = ydf.RandomForestLearner(label="label")
model = learner.train(dataset)

# Save model
model.save("/tmp/my_model")

# Load model
loaded_model = ydf.load_model("/tmp/my_model")

# Make predictions
model.predict(dataset)
loaded_model.predict(dataset)
```
If a directory contains multiple YDF models, the models are uniquely identified by their prefix. The prefix to use can be specified in the advanced options. If the directory only contains a single model, the correct prefix is detected automatically.
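For example, the prefix can be set explicitly at load time (a minimal sketch; it assumes two models were previously saved in the same directory with the file prefixes "p1_" and "p2_"):

```python
import ydf

# Hypothetical directory containing two models saved with the file
# prefixes "p1_" and "p2_".
model_1 = ydf.load_model(
    "/tmp/my_models", advanced_options=ydf.ModelIOOptions(file_prefix="p1_")
)
model_2 = ydf.load_model(
    "/tmp/my_models", advanced_options=ydf.ModelIOOptions(file_prefix="p2_")
)
```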
Parameters:
Name | Type | Description | Default |
---|---|---|---|
directory | str | Directory containing the model. | required |
advanced_options | ModelIOOptions | Advanced options for model loading. | ModelIOOptions() |
Returns:
Type | Description |
---|---|
ModelType | Model to use for inference, evaluation or inspection. |
deserialize_model
deserialize_model(data: bytes) -> ModelType
Loads a serialized YDF model.
Usage example:

```python
import pandas as pd
import ydf

# Create a model
dataset = pd.DataFrame({"feature": [0, 1], "label": [0, 1]})
learner = ydf.RandomForestLearner(label="label")
model = learner.train(dataset)

# Serialize model
# Note: serialized_model is a bytes object.
serialized_model = model.serialize()

# Deserialize model
deserialized_model = ydf.deserialize_model(serialized_model)

# Make predictions
model.predict(dataset)
deserialized_model.predict(dataset)
```
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | bytes | Serialized model. | required |
Returns:
Type | Description |
---|---|
ModelType | Model to use for inference, evaluation or inspection. |
Feature
dataclass
Feature(name: str, semantic: Optional[Semantic] = None, max_vocab_count: Optional[int] = None, min_vocab_frequency: Optional[int] = None, num_discretized_numerical_bins: Optional[int] = None, monotonic: MonotonicConstraint = None)
Bases: object
Semantic and parameters for a single column.
This class allows you to:
- Limit the input features of the model.
- Manually specify the semantic of a feature.
- Specify feature-specific hyper-parameters.
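For example, a learner can be restricted to a subset of features with explicit semantics and per-feature hyper-parameters (a minimal sketch; the column names are hypothetical):

```python
import ydf

# Hypothetical dataset with columns "age", "city" and "label".
learner = ydf.RandomForestLearner(
    label="label",
    features=[
        # Only "age" and "city" are used as input features.
        ydf.Feature("age", ydf.Semantic.NUMERICAL),
        # Keep at most 100 unique city values; rarer values become
        # out-of-vocabulary.
        ydf.Feature("city", ydf.Semantic.CATEGORICAL, max_vocab_count=100),
    ],
)
```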
Attributes:
Name | Type | Description |
---|---|---|
name | str | The name of the column or feature. |
semantic | Optional[Semantic] | Semantic of the column. If None, the semantic is automatically determined. The semantic controls how a column is interpreted by a model. Using the wrong semantic (e.g. numerical instead of categorical) will hurt your model's quality. |
max_vocab_count | Optional[int] | For CATEGORICAL and CATEGORICAL_SET columns only. Number of unique categorical values stored as string. If more categorical values are present, the least frequent values are grouped into an Out-of-vocabulary item. Reducing the value can improve or hurt the model. If max_vocab_count = -1, the number of values in the column is not limited. |
min_vocab_frequency | Optional[int] | For CATEGORICAL and CATEGORICAL_SET columns only. Minimum number of occurrences of a categorical value. Values present fewer than "min_vocab_frequency" times in the training dataset are treated as "Out-of-vocabulary". |
num_discretized_numerical_bins | Optional[int] | For DISCRETIZED_NUMERICAL columns only. Number of bins used to discretize DISCRETIZED_NUMERICAL columns. |
monotonic | MonotonicConstraint | Monotonic constraints between the feature and the model output. Use None (default; no monotonic constraint), +1 (increasing constraint), or -1 (decreasing constraint). |
normalized_monotonic
property
normalized_monotonic: Optional[Monotonic]
Returns the normalized version of the "monotonic" attribute.
from_column_def
classmethod
Converts a ColumnDef to a Column.
Column
dataclass
Column(name: str, semantic: Optional[Semantic] = None, max_vocab_count: Optional[int] = None, min_vocab_frequency: Optional[int] = None, num_discretized_numerical_bins: Optional[int] = None, monotonic: MonotonicConstraint = None)
Bases: object
Semantic and parameters for a single column.
This class allows you to:
- Limit the input features of the model.
- Manually specify the semantic of a feature.
- Specify feature-specific hyper-parameters.
Attributes:
Name | Type | Description |
---|---|---|
name | str | The name of the column or feature. |
semantic | Optional[Semantic] | Semantic of the column. If None, the semantic is automatically determined. The semantic controls how a column is interpreted by a model. Using the wrong semantic (e.g. numerical instead of categorical) will hurt your model's quality. |
max_vocab_count | Optional[int] | For CATEGORICAL and CATEGORICAL_SET columns only. Number of unique categorical values stored as string. If more categorical values are present, the least frequent values are grouped into an Out-of-vocabulary item. Reducing the value can improve or hurt the model. If max_vocab_count = -1, the number of values in the column is not limited. |
min_vocab_frequency | Optional[int] | For CATEGORICAL and CATEGORICAL_SET columns only. Minimum number of occurrences of a categorical value. Values present fewer than "min_vocab_frequency" times in the training dataset are treated as "Out-of-vocabulary". |
num_discretized_numerical_bins | Optional[int] | For DISCRETIZED_NUMERICAL columns only. Number of bins used to discretize DISCRETIZED_NUMERICAL columns. |
monotonic | MonotonicConstraint | Monotonic constraints between the feature and the model output. Use None (default; no monotonic constraint), +1 (increasing constraint), or -1 (decreasing constraint). |
normalized_monotonic
property
normalized_monotonic: Optional[Monotonic]
Returns the normalized version of the "monotonic" attribute.
from_column_def
classmethod
Converts a ColumnDef to a Column.
Task
Bases: Enum
Task solved by a model.
Usage example:

```python
learner = ydf.RandomForestLearner(label="income",
                                  task=ydf.Task.CLASSIFICATION)
model = learner.train(dataset)
assert model.task() == ydf.Task.CLASSIFICATION
```
Attributes:
Name | Description |
---|---|
CLASSIFICATION | Predict a categorical label, i.e., an item of an enumeration. |
REGRESSION | Predict a numerical label, i.e., a quantity. |
RANKING | Rank items by label values. When using default NDCG settings, the label is expected to be between 0 and 4 with NDCG semantics (0: completely unrelated, 4: perfect match). |
CATEGORICAL_UPLIFT | Predicts the incremental impact of a treatment on a categorical outcome. |
NUMERICAL_UPLIFT | Predicts the incremental impact of a treatment on a numerical outcome. |
ANOMALY_DETECTION | Predicts if an instance is similar to the majority of the training data or anomalous (a.k.a. an outlier). An anomaly detection prediction is a value between 0 and 1, where 0 indicates the most normal instance and 1 indicates the most anomalous instance. |
Semantic
Bases: Enum
Semantic (e.g. numerical, categorical) of a column.
Determines how a column is interpreted by the model. Similar to the "ColumnType" of YDF's DataSpecification.
Attributes:
Name | Description |
---|---|
NUMERICAL | Numerical value. Generally for quantities or counts with full ordering. For example, the age of a person, or the number of items in a bag. Can be a float or an integer. Missing values are represented by math.nan. |
CATEGORICAL | A categorical value. Generally for a type/class in a finite set of possible values without ordering. For example, the color RED in the set {RED, BLUE, GREEN}. Can be a string or an integer. Missing values are represented by "" (empty string) or the value -2. An out-of-vocabulary value (i.e. a value that was never seen in training) is represented by any new string value or the value -1. Integer categorical values: (1) The training logic and model representation are optimized with the assumption that values are dense. (2) Internally, the value is stored as an int32. The values should be <~2B. (3) The number of possible values is computed automatically from the training dataset. During inference, integer values greater than any value seen during training are treated as out-of-vocabulary. (4) Minimum frequency and maximum vocabulary size constraints do not apply. |
HASH | The hash of a string value. Used when only the equality between values is important (not the value itself). Currently only used for groups in ranking problems, e.g. the query in a query/document problem. The hashing is computed with Google's farmhash and stored as a uint64. |
CATEGORICAL_SET | Set of categorical values. Great to represent tokenized texts. Can be a string. Unlike CATEGORICAL, the number of items in a CATEGORICAL_SET can change between examples. The order of values inside a feature value does not matter. |
BOOLEAN | Boolean value. Can be a float or an integer. Missing values are represented by math.nan. If a numerical tensor contains multiple values, its size should be constant, and each dimension is treated independently (and each dimension should always have the same "meaning"). |
DISCRETIZED_NUMERICAL | Numerical values automatically discretized into bins. Discretized numerical columns are faster to train than (non-discretized) numerical columns. If the number of unique values of these columns is lower than the number of bins, the discretization is lossless from the point of view of the model. If the number of unique values of these columns is greater than the number of bins, the discretization is lossy from the point of view of the model. Lossy discretization can reduce and sometimes increase (due to regularization) the quality of the model. |
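As an illustration of the missing-value conventions above, here is a minimal sketch of an in-memory dataset with one missing NUMERICAL value (math.nan) and one missing CATEGORICAL value (the empty string); min_vocab_frequency is lowered so the tiny example keeps its categorical dictionary:

```python
import math
import ydf

data = {
    "age": [31.0, math.nan, 23.0],   # NUMERICAL, one missing value
    "color": ["RED", "", "GREEN"],   # CATEGORICAL, one missing value
    "label": [0, 1, 0],
}
ds = ydf.create_vertical_dataset(data, label="label", min_vocab_frequency=1)
print(ds.data_spec())  # Inspect the inferred semantics and statistics.
```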
start_worker
start_worker(port: int, blocking: bool = True) -> Optional[Callable[[], None]]
Starts a worker locally on the given port.
The addresses of the workers are passed to learners with the workers argument.
Usage example:

```python
# On worker machine #0 at address 192.168.0.1
ydf.start_worker(9000)

# On worker machine #1 at address 192.168.0.2
ydf.start_worker(9000)

# On the manager
learner = ydf.DistributedGradientBoostedTreesLearner(
    label="my_label",
    working_dir="/shared/working_dir",
    resume_training=True,
    workers=["192.168.0.1:9000", "192.168.0.2:9000"],
).train(dataset)
```
Example with a non-blocking call:

```python
# On the worker machine
stop_worker = ydf.start_worker(9000, blocking=False)

# Do some work with the worker

stop_worker()  # Stops the worker
```
Parameters:
Name | Type | Description | Default |
---|---|---|---|
port | int | TCP port of the worker. | required |
blocking | bool | If true (default), the function is blocking until the worker is stopped (e.g., error, interruption by the manager). If false, the function is non-blocking and returns a callable that, when called, will stop the worker. | True |
Returns:
Type | Description |
---|---|
Optional[Callable[[], None]] | Callable to stop the worker. Only returned if blocking=False. |
strict
strict(value: bool = True) -> None
Sets the strict mode.
When strict mode is enabled, more warnings are displayed.
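Usage example (a minimal sketch):

```python
import ydf

ydf.strict()       # Enable strict mode: show more warnings.
ydf.strict(False)  # Restore the default behavior.
```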
Parameters:
Name | Type | Description | Default |
---|---|---|---|
value | bool | New value for the strict mode. | True |
ModelIOOptions
dataclass
Advanced options for saving and loading YDF models.
Attributes:
Name | Type | Description |
---|---|---|
file_prefix | Optional[str] | Optional prefix for the model. File prefixes allow multiple models to exist in the same folder. Doing so is heavily DISCOURAGED outside of edge cases. When loading a model, the prefix, if not specified, is auto-detected if possible. When saving a model, the empty string is used as the file prefix unless it is explicitly specified. |
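For example, two models can be stored in the same folder with distinct prefixes (a minimal sketch; it assumes model.save accepts the same advanced_options argument as ydf.load_model, and that model_1 and model_2 are previously trained models):

```python
import ydf

# Hypothetical: store two trained models in one folder using prefixes.
model_1.save("/tmp/models", advanced_options=ydf.ModelIOOptions(file_prefix="p1_"))
model_2.save("/tmp/models", advanced_options=ydf.ModelIOOptions(file_prefix="p2_"))

# Load one of them back by prefix.
loaded = ydf.load_model("/tmp/models", advanced_options=ydf.ModelIOOptions(file_prefix="p1_"))
```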
create_vertical_dataset
create_vertical_dataset(data: InputDataset, columns: ColumnDefs = None, include_all_columns: bool = False, max_vocab_count: int = 2000, min_vocab_frequency: int = 5, discretize_numerical_columns: bool = False, num_discretized_numerical_bins: int = 255, max_num_scanned_rows_to_infer_semantic: int = 100000, max_num_scanned_rows_to_compute_statistics: int = 100000, data_spec: Optional[DataSpecification] = None, required_columns: Optional[Sequence[str]] = None, dont_unroll_columns: Optional[Sequence[str]] = None, label: Optional[str] = None) -> VerticalDataset
Creates a VerticalDataset from various sources of data.
The feature semantics are automatically determined and can be explicitly set with the columns argument. The semantics of a dataset (or model) are available in its data_spec.
Note that the CATEGORICAL_SET semantic is not automatically inferred when reading from file. When reading from CSV files, setting the CATEGORICAL_SET semantic for a feature will have YDF tokenize the feature. When reading from in-memory datasets (e.g. pandas), YDF only accepts lists of lists for CATEGORICAL_SET features.
Usage example:

```python
import pandas as pd
import ydf

df = pd.read_csv("my_dataset.csv")

# Loads all the columns
ds = ydf.create_vertical_dataset(df)

# Only load columns "a" and "b". Ensure "b" is interpreted as a categorical
# feature.
ds = ydf.create_vertical_dataset(df,
    columns=[
        "a",
        ("b", ydf.Semantic.CATEGORICAL),
    ])
```
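As noted above, a CATEGORICAL_SET feature in an in-memory dataset must be provided as a list of lists (a minimal sketch with hypothetical column names):

```python
import ydf

# "tokens" is a pre-tokenized text feature: one list of strings per example.
data = {
    "tokens": [["hello", "world"], ["fast", "red", "car"], ["blue"]],
    "label": [0, 1, 0],
}
ds = ydf.create_vertical_dataset(
    data,
    columns=[("tokens", ydf.Semantic.CATEGORICAL_SET)],
    include_all_columns=True,
    label="label",
)
```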
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | InputDataset | Source dataset. Supported formats: VerticalDataset, (typed) path, list of (typed) paths, Pandas DataFrame, Xarray Dataset, TensorFlow Dataset, PyGrain DataLoader and Dataset (experimental, Linux only), dictionary of string to NumPy array or lists. If the data is already a VerticalDataset, it is returned unchanged. | required |
columns | ColumnDefs | If None, all columns are imported and their semantics are determined automatically. Otherwise, if include_all_columns=False (default), only the columns listed in columns are imported; if include_all_columns=True, all columns are imported and columns only overrides the semantics and parameters of the listed columns. | None |
include_all_columns | bool | See columns. | False |
max_vocab_count | int | Maximum size of the vocabulary of CATEGORICAL and CATEGORICAL_SET columns stored as strings. If more unique values exist, only the most frequent values are kept, and the remaining values are considered as out-of-vocabulary. If max_vocab_count = -1, the number of values in the column is not limited (not recommended). | 2000 |
min_vocab_frequency | int | Minimum number of occurrences of a value for CATEGORICAL and CATEGORICAL_SET columns. Values observed fewer than min_vocab_frequency times are considered out-of-vocabulary. | 5 |
discretize_numerical_columns | bool | If true, discretize all the numerical columns before training. Discretized numerical columns are faster to train with, but they can have a negative impact on the model quality. Using discretize_numerical_columns=True is equivalent to setting the DISCRETIZED_NUMERICAL semantic in the columns argument. | False |
num_discretized_numerical_bins | int | Number of bins used when discretizing numerical columns. | 255 |
max_num_scanned_rows_to_infer_semantic | int | Number of rows to scan when inferring the column's semantic if it is not explicitly specified. Only used when reading from file; in-memory datasets are always read in full. Setting this to a lower number will speed up dataset reading, but might result in incorrect column semantics. Set to -1 to scan the entire dataset. | 100000 |
max_num_scanned_rows_to_compute_statistics | int | Number of rows to scan when computing a column's statistics. Only used when reading from file; in-memory datasets are always read in full. A column's statistics include the dictionary for categorical features and the mean / min / max for numerical features. Setting this to a lower number will speed up dataset reading, but skew statistics in the dataspec, which can hurt model quality (e.g. if an important category of a categorical feature is considered OOV). Set to -1 to scan the entire dataset. | 100000 |
data_spec | Optional[DataSpecification] | Dataspec to be used for this dataset. If a data spec is given, all other arguments except data should not be provided. | None |
required_columns | Optional[Sequence[str]] | List of columns required in the data. If None, all columns mentioned in the data spec or columns are required. | None |
dont_unroll_columns | Optional[Sequence[str]] | List of columns that cannot be unrolled. If one such column needs to be unrolled, raise an error. | None |
label | Optional[str] | Name of the label column, if any. | None |
Returns:
Type | Description |
---|---|
VerticalDataset | Dataset to be ingested by the learner algorithms. |
Raises:
Type | Description |
---|---|
ValueError | If the dataset has an unsupported type. |
ModelMetadata
dataclass
ModelMetadata(owner: Optional[str] = None, created_date: Optional[int] = None, uid: Optional[int] = None, framework: Optional[str] = None)
Metadata information stored in the model.
Attributes:
Name | Type | Description |
---|---|---|
owner | Optional[str] | Owner of the model, defaults to empty string for the open-source build of YDF. |
created_date | Optional[int] | Unix timestamp of the model training (in seconds). |
uid | Optional[int] | Unique identifier of the model. |
framework | Optional[str] | Framework used to create the model. Defaults to "Python YDF" for models trained with the Python API. |
from_tensorflow_decision_forests
from_tensorflow_decision_forests(directory: str) -> ModelType
Load a TensorFlow Decision Forests model from disk.
Usage example:

```python
import pandas as pd
import ydf

# Import TF-DF model
loaded_model = ydf.from_tensorflow_decision_forests("/tmp/my_tfdf_model")

# Make predictions
dataset = pd.read_csv("my_dataset.csv")
loaded_model.predict(dataset)

# Show details about the model
loaded_model.describe()
```

The imported model produces the same predictions as the original TF-DF model.
Only TensorFlow Decision Forests models containing a single Decision Forest and nothing else are supported. That is, combined neural network / decision forest models cannot be imported. Unfortunately, importing such models may succeed but result in incorrect predictions, so check for prediction equality after importing.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
directory | str | Directory containing the TF-DF model. | required |
Returns:
Type | Description |
---|---|
ModelType | Model to use for inference, evaluation or inspection. |
from_sklearn
from_sklearn(sklearn_model: Any, label_name: str = 'label', feature_name: str = 'features') -> GenericModel
Converts a tree-based scikit-learn model to a YDF model.
Usage example:

```python
import ydf
from sklearn import datasets
from sklearn import tree

# Train a SKLearn model
X, y = datasets.make_classification()
skl_model = tree.DecisionTreeClassifier().fit(X, y)

# Convert the SKLearn model to a YDF model
ydf_model = ydf.from_sklearn(skl_model)

# Make predictions with the YDF model
ydf_predictions = ydf_model.predict({"features": X})

# Analyse the YDF model
ydf_model.analyze({"features": X})
```
Currently supported models are:
- sklearn.tree.DecisionTreeClassifier
- sklearn.tree.DecisionTreeRegressor
- sklearn.tree.ExtraTreeClassifier
- sklearn.tree.ExtraTreeRegressor
- sklearn.ensemble.RandomForestClassifier
- sklearn.ensemble.RandomForestRegressor
- sklearn.ensemble.ExtraTreesClassifier
- sklearn.ensemble.ExtraTreesRegressor
- sklearn.ensemble.GradientBoostingRegressor
- sklearn.ensemble.IsolationForest
Unlike YDF, scikit-learn does not name features and labels. Use the fields label_name and feature_name to specify the names of the columns in the YDF model.
Additionally, only single-label classification and scalar regression are supported (e.g. multivariate regression models will not convert).
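For instance, continuing the example above with custom column names (a minimal sketch):

```python
# Convert with explicit names for the label and the feature vector.
ydf_model = ydf.from_sklearn(skl_model, label_name="income", feature_name="inputs")
ydf_model.predict({"inputs": X})
```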
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sklearn_model | Any | The scikit-learn tree-based model to be converted. | required |
label_name | str | Name of the label in the output YDF model. | 'label' |
feature_name | str | Name of the multi-dimensional feature in the output YDF model. | 'features' |
Returns:
Type | Description |
---|---|
GenericModel | A YDF model that emulates the provided scikit-learn model. |
NodeFormat
Bases: Enum
Serialization format for a model.
Determines the storage format for nodes.
Attributes:
Name | Description |
---|---|
BLOB_SEQUENCE | Default format for the public version of YDF. |
RegressionLoss
dataclass
RegressionLoss(activation: Activation, initial_predictions: Callable[[NDArray[float32], NDArray[float32]], float32], loss: Callable[[NDArray[float32], NDArray[float32], NDArray[float32]], float32], gradient_and_hessian: Callable[[NDArray[float32], NDArray[float32]], Tuple[NDArray[float32], NDArray[float32]]], may_trigger_gc: bool = True)
Bases: AbstractCustomLoss
A user-provided loss function for regression problems.
Loss functions must never keep a reference to their arguments after returning.

Bad:

```python
mylabels = None
def initial_predictions(labels, weights):
  nonlocal mylabels
  mylabels = labels  # labels is now referenced outside the function
```

Good:

```python
mylabels = None
def initial_predictions(labels, weights):
  nonlocal mylabels
  mylabels = np.copy(labels)  # mylabels is a copy, not a reference.
```
Attributes:
- initial_predictions: The bias / initial predictions of the GBT model. Receives the label values and the weights, and outputs the initial prediction as a float.
- loss: The loss function controls the early stopping. The loss function receives the labels, the current predictions and the current weights, and must output the loss as a float. Note that the predictions provided to the loss function have not yet had an activation function applied to them.
- gradient_and_hessian: Gradient and hessian of the current predictions. Note that only the diagonal of the hessian must be provided. Receives as input the labels and the current predictions (without activation) and returns a tuple of the gradient and the hessian.
- activation: Activation function to be applied to the model. Regression models are expected to return a value in the same space as the labels after applying the activation function.
- may_trigger_gc: If True (default), YDF may trigger Python's garbage collection to determine if a NumPy array that is backed by YDF-internal data is used after its lifetime has ended. If False, checks for illegal memory accesses are disabled. This can be useful when training many small models or if the observed impact of triggering GC is large. If may_trigger_gc=False, it is very important that the user manually validates that no memory leakage occurs.
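As an illustration, here is a sketch of a squared-error loss implemented with this interface. It assumes the loss object can be passed to ydf.GradientBoostedTreesLearner through its loss argument and that the Activation enum is exposed as ydf.Activation; the gradient is expressed as the residuals (the negative gradient of the squared error), following the convention of YDF's custom-loss tutorial, and train_ds stands for a hypothetical regression dataset:

```python
import numpy as np
import ydf

def mse_initial_predictions(labels, weights):
  # Weighted mean of the labels as the initial prediction.
  return np.average(labels, weights=weights)

def mse_loss(labels, predictions, weights):
  # Root mean squared error, used for logging and early stopping.
  return np.sqrt(np.average((labels - predictions) ** 2, weights=weights))

def mse_gradient_and_hessian(labels, predictions):
  # Residuals as pseudo-responses; the hessian of the squared error is constant.
  return labels - predictions, np.ones(labels.shape, dtype=np.float32)

mse_custom_loss = ydf.RegressionLoss(
    activation=ydf.Activation.IDENTITY,
    initial_predictions=mse_initial_predictions,
    loss=mse_loss,
    gradient_and_hessian=mse_gradient_and_hessian,
)

# train_ds is a hypothetical dataset with a numerical "label" column.
model = ydf.GradientBoostedTreesLearner(
    label="label",
    task=ydf.Task.REGRESSION,
    loss=mse_custom_loss,
).train(train_ds)
```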
BinaryClassificationLoss
dataclass
BinaryClassificationLoss(activation: Activation, initial_predictions: Callable[[NDArray[int32], NDArray[float32]], float32], loss: Callable[[NDArray[int32], NDArray[float32], NDArray[float32]], float32], gradient_and_hessian: Callable[[NDArray[int32], NDArray[float32]], Tuple[NDArray[float32], NDArray[float32]]], may_trigger_gc: bool = True)
Bases: AbstractCustomLoss
A user-provided loss function for binary classification problems.
Note that the labels are binary but 1-based, i.e. the positive class is 2, the negative class is 1.
Loss functions must never keep a reference to their arguments after returning.

Bad:

```python
mylabels = None
def initial_predictions(labels, weights):
  nonlocal mylabels
  mylabels = labels  # labels is now referenced outside the function
```

Good:

```python
mylabels = None
def initial_predictions(labels, weights):
  nonlocal mylabels
  mylabels = np.copy(labels)  # mylabels is a copy, not a reference.
```
Attributes:
- initial_predictions: The bias / initial predictions of the GBT model. Receives the label values and the weights, and outputs the initial prediction as a float.
- loss: The loss function controls the early stopping. The loss function receives the labels, the current predictions and the current weights, and must output the loss as a float. Note that the predictions provided to the loss function have not yet had an activation function applied to them.
- gradient_and_hessian: Gradient and hessian of the current predictions. Note that only the diagonal of the hessian must be provided. Receives as input the labels and the current predictions (without activation). Returns a tuple of the gradient and the hessian.
- activation: Activation function to be applied to the model. Binary classification models are expected to return a probability after applying the activation function.
- may_trigger_gc: If True (default), YDF may trigger Python's garbage collection to determine if a NumPy array that is backed by YDF-internal data is used after its lifetime has ended. If False, checks for illegal memory accesses are disabled. Setting this parameter to False is dangerous, since illegal memory accesses will no longer be detected.
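A sketch of a binomial log-likelihood loss, under the same assumptions as the regression sketch above and additionally assuming the Activation enum exposes SIGMOID; note how the 1-based labels are converted to {0, 1} before computing the loss and gradient:

```python
import numpy as np
import ydf

def binary_initial_predictions(labels, weights):
  # Labels are in {1, 2}; convert to {0, 1} and return the initial log-odds.
  p = np.average(labels == 2, weights=weights)
  return np.log(p / (1 - p))

def binary_loss(labels, predictions, weights):
  # Binomial log-likelihood; predictions are logits (no activation applied yet).
  binary_labels = labels == 2
  return np.average(
      np.log(1.0 + np.exp(predictions)) - binary_labels * predictions,
      weights=weights,
  )

def binary_gradient_and_hessian(labels, predictions):
  probabilities = 1.0 / (1.0 + np.exp(-predictions))
  binary_labels = labels == 2
  # Pseudo-responses (negative gradient) and diagonal hessian of the log loss.
  return binary_labels - probabilities, probabilities * (1.0 - probabilities)

binary_custom_loss = ydf.BinaryClassificationLoss(
    activation=ydf.Activation.SIGMOID,
    initial_predictions=binary_initial_predictions,
    loss=binary_loss,
    gradient_and_hessian=binary_gradient_and_hessian,
)
```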
MultiClassificationLoss
dataclass
MultiClassificationLoss(activation: Activation, initial_predictions: Callable[[NDArray[int32], NDArray[float32]], NDArray[float32]], loss: Callable[[NDArray[int32], NDArray[float32], NDArray[float32]], float32], gradient_and_hessian: Callable[[NDArray[int32], NDArray[float32]], Tuple[NDArray[float32], NDArray[float32]]], may_trigger_gc: bool = True)
Bases: AbstractCustomLoss
A user-provided loss function for multi-class problems.
Note that the labels are 1-based. Predictions are given in a 2D array with one row per example. Initial predictions, gradients and hessians are expected for each class; e.g. for a 3-class classification problem, output 3 gradients and 3 hessians per example.
Loss functions must never keep a reference to their arguments after returning.

Bad:

```python
mylabels = None
def initial_predictions(labels, weights):
  nonlocal mylabels
  mylabels = labels  # labels is now referenced outside the function
```

Good:

```python
mylabels = None
def initial_predictions(labels, weights):
  nonlocal mylabels
  mylabels = np.copy(labels)  # mylabels is a copy, not a reference.
```
Attributes:
- initial_predictions: The bias / initial predictions of the GBT model. Receives the label values and the weights, and outputs the initial predictions as an array of floats (one initial prediction per class).
- loss: The loss function controls the early stopping. The loss function receives the labels, the current predictions and the current weights, and must output the loss as a float. Note that the predictions provided to the loss function have not yet had an activation function applied to them.
- gradient_and_hessian: Gradient and hessian of the current predictions with respect to each class. Note that only the diagonal of the hessian must be provided. Receives as input the labels and the current predictions (without activation). Returns a tuple of the gradient and the hessian. Both gradient and hessian must be arrays of shape (num_classes, num_examples).
- activation: Activation function to be applied to the model. Multi-class classification models are expected to return a probability distribution over the classes after applying the activation function.
- may_trigger_gc: If True (default), YDF may trigger Python's garbage collection to determine if a NumPy array that is backed by YDF-internal data is used after its lifetime has ended. If False, checks for illegal memory accesses are disabled. Setting this parameter to False is dangerous, since illegal memory accesses will no longer be detected.
Activation
Bases: Enum
Activation functions for custom losses.
Not all activation functions are supported for all custom losses. Activation function IDENTITY (i.e., no activation function applied) is always supported.