langchain_community.document_transformers.doctran_text_extract.DoctranPropertyExtractor

class langchain_community.document_transformers.doctran_text_extract.DoctranPropertyExtractor(properties: List[dict], openai_api_key: Optional[str] = None, openai_api_model: Optional[str] = None)[source]

Extract properties from text documents using doctran.

Arguments:

properties: A list of the properties to extract.
openai_api_key: OpenAI API key. Can also be specified via the environment variable ``OPENAI_API_KEY``.

Example:
from langchain_community.document_transformers import DoctranPropertyExtractor

properties = [
    {
        "name": "category",
        "description": "What type of email this is.",
        "type": "string",
        "enum": ["update", "action_item", "customer_feedback", "announcement", "other"],
        "required": True,
    },
    {
        "name": "mentions",
        "description": "A list of all people mentioned in this email.",
        "type": "array",
        "items": {
            "name": "full_name",
            "description": "The full name of the person mentioned.",
            "type": "string",
        },
        "required": True,
    },
    {
        "name": "eli5",
        "description": "Explain this email to me like I'm 5 years old.",
        "type": "string",
        "required": True,
    },
]

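# Pass in openai_api_key or set the environment variable OPENAI_API_KEY
property_extractor = DoctranPropertyExtractor(properties)
# `await` must run inside an async context (an async function or a notebook cell)
transformed_document = await property_extractor.atransform_documents(documents)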
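
The property dictionaries in the example share a small set of keys: `name`, `description`, `type`, and optionally `enum`, `items`, and `required`. A minimal sketch of a check to run before constructing the extractor follows, assuming those keys are the ones that matter. `check_property_spec` and `REQUIRED_KEYS` are hypothetical names written for illustration. They are not part of langchain_community or doctran, and the rules are inferred from the example rather than from a published schema.

```python
from typing import Any, Dict, List

# Keys every property dict in the example carries (an assumption drawn from
# that example, not from the doctran specification).
REQUIRED_KEYS = ("name", "description", "type")


def check_property_spec(properties: List[Dict[str, Any]]) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems: List[str] = []
    for index, prop in enumerate(properties):
        label = prop.get("name", f"<property #{index}>")
        for key in REQUIRED_KEYS:
            if not isinstance(prop.get(key), str):
                problems.append(f"{label}: missing or non-string '{key}'")
        # The example pairs every "array" property with an "items" description.
        if prop.get("type") == "array" and not isinstance(prop.get("items"), dict):
            problems.append(f"{label}: type 'array' without an 'items' dict")
    return problems
```

Calling `check_property_spec(properties)` before `DoctranPropertyExtractor(properties)` would surface malformed entries locally, before any API request is made.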

Methods

__init__(properties[, openai_api_key, ...])

atransform_documents(documents, **kwargs)

Extracts properties from text documents using doctran.

transform_documents(documents, **kwargs)

Extracts properties from text documents using doctran.

Parameters
  • properties (List[dict]) –

  • openai_api_key (Optional[str]) –

  • openai_api_model (Optional[str]) –

Return type

None

__init__(properties: List[dict], openai_api_key: Optional[str] = None, openai_api_model: Optional[str] = None) None[source]
Parameters
  • properties (List[dict]) –

  • openai_api_key (Optional[str]) –

  • openai_api_model (Optional[str]) –

Return type

None

async atransform_documents(documents: Sequence[Document], **kwargs: Any) Sequence[Document][source]

Extracts properties from text documents using doctran.

Parameters
  • documents (Sequence[Document]) –

  • kwargs (Any) –

Return type

Sequence[Document]
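
Because `atransform_documents` is a coroutine, it has to be awaited from inside an event loop. A minimal sketch of calling it from synchronous code with `asyncio.run` follows. `extract_sync` is a hypothetical wrapper name, and `extractor` is assumed to be any object exposing an `atransform_documents(documents)` coroutine, such as a configured DoctranPropertyExtractor.

```python
import asyncio
from typing import Any, Sequence


def extract_sync(extractor: Any, documents: Sequence[Any]) -> Sequence[Any]:
    """Drive extractor.atransform_documents to completion from synchronous code."""
    # asyncio.run creates a fresh event loop, so this must not be called from
    # code that is already running inside one (for example, a Jupyter cell).
    return asyncio.run(extractor.atransform_documents(documents))
```

In a script, `extract_sync(property_extractor, documents)` would then stand in for the `await` expression. Inside an existing event loop, awaiting the method directly remains the appropriate form.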

transform_documents(documents: Sequence[Document], **kwargs: Any) Sequence[Document][source]

Extracts properties from text documents using doctran.

Parameters
  • documents (Sequence[Document]) –

  • kwargs (Any) –

Return type

Sequence[Document]

Examples using DoctranPropertyExtractor