`langchain_community.utilities.tensorflow_datasets`.TensorflowDatasets¶

class langchain_community.utilities.tensorflow_datasets.TensorflowDatasets[source]¶

Bases: BaseModel

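Access to TensorFlow Datasets.

The current implementation works only with datasets that fit in memory.

TensorFlow Datasets is a collection of ready-to-use datasets for TensorFlow and other Python machine learning frameworks, such as Jax. All datasets are exposed as tf.data.Datasets. To get started, see the guide: https://www.tensorflow.org/datasets/overview and the list of datasets: https://www.tensorflow.org/datasets/catalog/overview#all_datasets

You must provide sample_to_document_function: a function that converts a sample from the dataset-specific format into a Document.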

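Attributes:

dataset_name: the name of the dataset to load
split_name: the name of the split to load. Defaults to "train".
load_max_docs: a limit on the number of documents to load. Defaults to 100.
sample_to_document_function: a function that converts a dataset sample into a Document

Example: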

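import tensorflow as tf
from langchain_core.documents import Document
from langchain_community.utilities import TensorflowDatasets

MAX_DOCS = 100  # cap on the number of documents to load (matches the default)

def decode_to_str(item: tf.Tensor) -> str:
    # Assumes string features arrive as eager tf.Tensor byte strings
    return item.numpy().decode("utf-8")

def mlqaen_example_to_document(example: dict) -> Document:
    # Map one mlqa/en sample to a Document: context as text, the rest as metadata
    return Document(
        page_content=decode_to_str(example["context"]),
        metadata={
            "id": decode_to_str(example["id"]),
            "title": decode_to_str(example["title"]),
            "question": decode_to_str(example["question"]),
            "answer": decode_to_str(example["answers"]["text"][0]),
        },
    )

tsds_client = TensorflowDatasets(
    dataset_name="mlqa/en",
    split_name="train",
    load_max_docs=MAX_DOCS,
    sample_to_document_function=mlqaen_example_to_document,
)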
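
A converter like the one in the example can be checked before any dataset is downloaded by calling it on a hand-built sample. The following is a minimal sketch under stated assumptions: the Document dataclass is a stand-in for langchain_core.documents.Document so the snippet runs without LangChain installed, to_text is a hypothetical helper, and the sample values are made up for illustration and shaped like the fields used in the example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Document:
    """Stand-in for langchain_core.documents.Document (assumption)."""

    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def to_text(value: Any) -> str:
    """Hypothetical helper: turn a leaf value into str."""
    if hasattr(value, "numpy"):  # e.g. an eager tensor exposing .numpy()
        value = value.numpy()
    if isinstance(value, bytes):
        return value.decode("utf-8")
    return str(value)


def example_to_document(example: Dict[str, Any]) -> Document:
    """Convert one question-answering sample into a Document."""
    return Document(
        page_content=to_text(example["context"]),
        metadata={
            "id": to_text(example["id"]),
            "title": to_text(example["title"]),
            "question": to_text(example["question"]),
            "answer": to_text(example["answers"]["text"][0]),
        },
    )


# Hand-built sample with made-up values, used only to exercise the converter.
fake_sample = {
    "context": b"Paris is the capital of France.",
    "id": b"q-001",
    "title": b"France",
    "question": b"What is the capital of France?",
    "answers": {"text": [b"Paris"]},
}

doc = example_to_document(fake_sample)
print(doc.page_content)
print(doc.metadata["answer"])
```

Testing the converter in isolation keeps schema mistakes, such as a missing key or an undecoded bytes value, separate from download or network problems.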

Create a new model by parsing and validating input data from keyword arguments.

Raises ValidationError if the input data cannot be parsed to form a valid model.

param dataset_name: str = ''¶

param load_max_docs: int = 100¶

param sample_to_document_function: Optional[Callable[[Dict], langchain_core.documents.base.Document]] = None¶

param split_name: str = 'train'¶

classmethod construct(_fields_set: Optional[SetStr] = None, **values: Any) → Model¶

Creates a new model, setting __dict__ and __fields_set__ from trusted or pre-validated data. Default values are respected, but no other validation is performed. Behaves as if Config.extra = 'allow' was set, since it adds all passed values.

Parameters

_fields_set (Optional[SetStr]) –
values (Any) –

Return type

Model

copy(*, include: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, exclude: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, update: Optional[DictStrAny] = None, deep: bool = False) → Model¶

Duplicate a model, optionally choose which fields to include, exclude and change.

Parameters

include (Optional[Union[AbstractSetIntStr, MappingIntStrAny]]) – fields to include in new model
exclude (Optional[Union[AbstractSetIntStr, MappingIntStrAny]]) – fields to exclude from new model, as with values this takes precedence over include
update (Optional[DictStrAny]) – values to change/add in the new model. Note: the data is not validated before creating the new model: you should trust this data
deep (bool) – set to True to make a deep copy of the model
self (Model) –

Returns

new model instance

Return type

Model

dict(*, include: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, exclude: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, by_alias: bool = False, skip_defaults: Optional[bool] = None, exclude_unset: bool = False, exclude_defaults: bool = False, exclude_none: bool = False) → DictStrAny¶

Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.

Parameters

include (Optional[Union[AbstractSetIntStr, MappingIntStrAny]]) –
exclude (Optional[Union[AbstractSetIntStr, MappingIntStrAny]]) –
by_alias (bool) –
skip_defaults (Optional[bool]) –
exclude_unset (bool) –
exclude_defaults (bool) –
exclude_none (bool) –

Return type

DictStrAny

classmethod from_orm(obj: Any) → Model¶

Parameters: obj (Any) –
Return type: Model

json(*, include: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, exclude: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, by_alias: bool = False, skip_defaults: Optional[bool] = None, exclude_unset: bool = False, exclude_defaults: bool = False, exclude_none: bool = False, encoder: Optional[Callable[[Any], Any]] = None, models_as_dict: bool = True, **dumps_kwargs: Any) → unicode¶

Generate a JSON representation of the model, include and exclude arguments as per dict().

encoder is an optional function to supply as default to json.dumps(), other arguments as per json.dumps().

Parameters

include (Optional[Union[AbstractSetIntStr, MappingIntStrAny]]) –
exclude (Optional[Union[AbstractSetIntStr, MappingIntStrAny]]) –
by_alias (bool) –
skip_defaults (Optional[bool]) –
exclude_unset (bool) –
exclude_defaults (bool) –
exclude_none (bool) –
encoder (Optional[Callable[[Any], Any]]) –
models_as_dict (bool) –
dumps_kwargs (Any) –

Return type

unicode

lazy_load() → Iterator[Document][source]¶

Download a selected dataset lazily.

Returns: an iterator of Documents.

Return type: Iterator[Document]

load() → List[Document][source]¶

Download a selected dataset.

Returns: a list of Documents.

Return type: List[Document]
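
The practical difference between an iterator-returning loader and a list-returning one can be sketched in plain Python. This illustrates the general pattern only and is not the library's implementation: lazy_convert, eager_convert, and sample_stream are hypothetical names, plain strings stand in for Documents, and max_docs plays the role of load_max_docs.

```python
from itertools import islice
from typing import Callable, Dict, Iterable, Iterator, List

Sample = Dict[str, str]


def lazy_convert(
    samples: Iterable[Sample],
    convert: Callable[[Sample], str],
    max_docs: int,
) -> Iterator[str]:
    """Yield at most max_docs converted samples, one at a time."""
    for sample in islice(samples, max_docs):
        yield convert(sample)


def eager_convert(
    samples: Iterable[Sample],
    convert: Callable[[Sample], str],
    max_docs: int,
) -> List[str]:
    """Materialize the same results as a list."""
    return list(lazy_convert(samples, convert, max_docs))


consumed: List[int] = []  # records which samples were actually read


def sample_stream() -> Iterator[Sample]:
    """Hypothetical sample source that logs every item it hands out."""
    for i in range(1000):
        consumed.append(i)
        yield {"text": f"sample {i}"}


# Lazy: only the first sample is read when the first result is requested.
lazy = lazy_convert(sample_stream(), lambda s: s["text"], max_docs=3)
first = next(lazy)
consumed_after_first = list(consumed)

# Eager: everything up to the limit is read and converted up front.
docs = eager_convert(sample_stream(), lambda s: s["text"], max_docs=3)
print(first, consumed_after_first, docs)
```

The lazy form suits pipelines that process documents one by one, while the list form is convenient when the whole capped result is needed at once.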

classmethod parse_file(path: Union[str, Path], *, content_type: unicode = None, encoding: unicode = 'utf8', proto: Protocol = None, allow_pickle: bool = False) → Model¶

Parameters

path (Union[str, Path]) –
content_type (unicode) –
encoding (unicode) –
proto (Protocol) –
allow_pickle (bool) –

Return type

Model

classmethod parse_obj(obj: Any) → Model¶

Parameters: obj (Any) –
Return type: Model

classmethod parse_raw(b: Union[str, bytes], *, content_type: unicode = None, encoding: unicode = 'utf8', proto: Protocol = None, allow_pickle: bool = False) → Model¶

Parameters

b (Union[str, bytes]) –
content_type (unicode) –
encoding (unicode) –
proto (Protocol) –
allow_pickle (bool) –

Return type

Model

classmethod schema(by_alias: bool = True, ref_template: unicode = '#/definitions/{model}') → DictStrAny¶

Parameters

by_alias (bool) –
ref_template (unicode) –

Return type

DictStrAny

classmethod schema_json(*, by_alias: bool = True, ref_template: unicode = '#/definitions/{model}', **dumps_kwargs: Any) → unicode¶

Parameters

by_alias (bool) –
ref_template (unicode) –
dumps_kwargs (Any) –

Return type

unicode

classmethod update_forward_refs(**localns: Any) → None¶

Try to update ForwardRefs on fields based on this Model, globalns and localns.

Parameters: localns (Any) –
Return type: None

classmethod validate(value: Any) → Model¶

Parameters: value (Any) –
Return type: Model
