`langchain_community.utilities.arxiv`.ArxivAPIWrapper¶

class langchain_community.utilities.arxiv.ArxivAPIWrapper[source]¶

Bases: BaseModel

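Wrapper around ArxivAPI.

To use, you should have the ``arxiv`` python package installed. https://lukasschwab.me/arxiv.py/index.html This wrapper uses the Arxiv API to run searches and fetch document summaries. By default, it returns the document summaries of the top-k results. If the query is in the form of an arxiv identifier (see https://info.arxiv.org/help/find/index.html), it returns the paper corresponding to that identifier. Document content is limited by doc_content_chars_max. Set doc_content_chars_max to None if you do not want to limit the content size.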
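The effect of doc_content_chars_max can be pictured as a plain string slice. The following is a minimal sketch, not the wrapper's own code: the helper name `truncate_content` is hypothetical, and it assumes the limit is applied as a simple character cut.

```python
from typing import Optional


def truncate_content(text: str, doc_content_chars_max: Optional[int]) -> str:
    """Cut text to at most doc_content_chars_max characters (hypothetical helper)."""
    # Slicing with None as the upper bound returns the whole string,
    # which matches "set doc_content_chars_max to None to disable the limit".
    return text[:doc_content_chars_max]


summary = "x" * 10
limited = truncate_content(summary, 4)       # keeps the first 4 characters
unlimited = truncate_content(summary, None)  # keeps everything
print(len(limited), len(unlimited))
```

With an integer limit the content is cut to that many characters; with None the slice is a no-op.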

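Attributes:

top_k_results: number of top-scored documents used for the arxiv tool
ARXIV_MAX_QUERY_LENGTH: the cut limit on the query used for the arxiv tool
continue_on_failure (bool): if True, continue loading other URLs on failure
load_max_docs: a limit on the number of loaded documents
load_all_available_meta: if True, the `metadata` of the loaded Documents contains all available meta info (see https://lukasschwab.me/arxiv.py/index.html#Result); if False, the `metadata` contains only the published date, title, authors and summary
doc_content_chars_max: an optional cut limit for the length of a document's content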
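As an illustration of the load_all_available_meta switch, the sketch below selects metadata from a single search result. It is an assumption-laden stand-in: the `result` dictionary, its keys, and the `build_metadata` helper are all hypothetical and only mirror the behaviour described above.

```python
from typing import Any, Dict

# Hypothetical keys for the four basic fields named in the description.
BASIC_KEYS = ("published", "title", "authors", "summary")


def build_metadata(result: Dict[str, Any], load_all_available_meta: bool) -> Dict[str, Any]:
    """Return every field when the flag is True, else only the basic four."""
    if load_all_available_meta:
        return dict(result)
    return {key: result[key] for key in BASIC_KEYS}


# Placeholder record standing in for one arXiv result (not real data).
result = {
    "published": "2024-01-01",
    "title": "Example Paper",
    "authors": ["A. Author", "B. Author"],
    "summary": "Placeholder summary.",
    "comment": "extra field",
    "primary_category": "cs.CL",
}

basic = build_metadata(result, load_all_available_meta=False)
full = build_metadata(result, load_all_available_meta=True)
print(sorted(basic), len(full))
```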

Example:

from langchain_community.utilities.arxiv import ArxivAPIWrapper
arxiv = ArxivAPIWrapper(
    top_k_results = 3,
    ARXIV_MAX_QUERY_LENGTH = 300,
    load_max_docs = 3,
    load_all_available_meta = False,
    doc_content_chars_max = 40000
)
arxiv.run("tree of thought llm")

Create a new model by parsing and validating input data from keyword arguments.

Raises ValidationError if the input data cannot be parsed to form a valid model.

param ARXIV_MAX_QUERY_LENGTH: int = 300¶

param arxiv_exceptions: Any = None¶

param continue_on_failure: bool = False¶

param doc_content_chars_max: Optional[int] = 4000¶

param load_all_available_meta: bool = False¶

param load_max_docs: int = 100¶

param top_k_results: int = 3¶

classmethod construct(_fields_set: Optional[SetStr] = None, **values: Any) → Model¶

Creates a new model setting __dict__ and __fields_set__ from trusted or pre-validated data. Default values are respected, but no other validation is performed. Behaves as if Config.extra = 'allow' was set, since it adds all passed values.

Parameters

_fields_set (Optional[SetStr]) –
values (Any) –

Return type

Model

copy(*, include: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, exclude: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, update: Optional[DictStrAny] = None, deep: bool = False) → Model¶

Duplicate a model, optionally choose which fields to include, exclude and change.

Parameters

include (Optional[Union[AbstractSetIntStr, MappingIntStrAny]]) – fields to include in new model
exclude (Optional[Union[AbstractSetIntStr, MappingIntStrAny]]) – fields to exclude from new model, as with values this takes precedence over include
update (Optional[DictStrAny]) – values to change/add in the new model. Note: the data is not validated before creating the new model: you should trust this data
deep (bool) – set to True to make a deep copy of the model
self (Model) –

Returns

new model instance

Return type

Model

dict(*, include: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, exclude: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, by_alias: bool = False, skip_defaults: Optional[bool] = None, exclude_unset: bool = False, exclude_defaults: bool = False, exclude_none: bool = False) → DictStrAny¶

Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.

Parameters

include (Optional[Union[AbstractSetIntStr, MappingIntStrAny]]) –
exclude (Optional[Union[AbstractSetIntStr, MappingIntStrAny]]) –
by_alias (bool) –
skip_defaults (Optional[bool]) –
exclude_unset (bool) –
exclude_defaults (bool) –
exclude_none (bool) –

Return type

DictStrAny

classmethod from_orm(obj: Any) → Model¶

Parameters: obj (Any) –
Return type: Model

get_summaries_as_docs(query: str) → List[Document][source]¶

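Performs an arxiv search and returns a list of documents, with the summaries as the content.

If an error occurs or no documents are found, error text is returned instead. Wrapper for https://lukasschwab.me/arxiv.py/index.html#Search

Parameters: query (str) – a plaintext search query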
Return type: List[Document]

is_arxiv_identifier(query: str) → bool[source]¶

Check whether a query is an arXiv identifier.

Parameters: query (str) –
Return type: bool
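A rough idea of such a check, as a sketch only: the pattern below is an assumed, simplified one covering new-style identifiers (for example 2305.10601 or 1605.08386v1) and may differ from the pattern the library uses. The function name `looks_like_arxiv_identifier` is hypothetical.

```python
import re

# Assumed pattern: two-digit year, two-digit month (01-12), a dot,
# 4-5 digits, and an optional version suffix such as "v2".
ARXIV_ID_PATTERN = re.compile(r"\d{2}(0[1-9]|1[0-2])\.\d{4,5}(v\d+)?")


def looks_like_arxiv_identifier(query: str) -> bool:
    """Return True only if every whitespace-separated token is an identifier."""
    tokens = query.split()
    # An empty query is not an identifier; otherwise each token must match fully.
    return bool(tokens) and all(ARXIV_ID_PATTERN.fullmatch(t) for t in tokens)


print(looks_like_arxiv_identifier("2305.10601"))
print(looks_like_arxiv_identifier("tree of thought llm"))
```

Requiring a full match on every token keeps free-text queries that merely contain digits from being treated as identifier lookups.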

json(*, include: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, exclude: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, by_alias: bool = False, skip_defaults: Optional[bool] = None, exclude_unset: bool = False, exclude_defaults: bool = False, exclude_none: bool = False, encoder: Optional[Callable[[Any], Any]] = None, models_as_dict: bool = True, **dumps_kwargs: Any) → unicode¶

Generate a JSON representation of the model, include and exclude arguments as per dict().

encoder is an optional function to supply as default to json.dumps(), other arguments as per json.dumps().

Parameters

include (Optional[Union[AbstractSetIntStr, MappingIntStrAny]]) –
exclude (Optional[Union[AbstractSetIntStr, MappingIntStrAny]]) –
by_alias (bool) –
skip_defaults (Optional[bool]) –
exclude_unset (bool) –
exclude_defaults (bool) –
exclude_none (bool) –
encoder (Optional[Callable[[Any], Any]]) –
models_as_dict (bool) –
dumps_kwargs (Any) –

Return type

unicode

lazy_load(query: str) → Iterator[Document][source]¶

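Runs an Arxiv search and gets the article texts plus the article meta information. See https://lukasschwab.me/arxiv.py/index.html#Search

Returns: documents with the document.page_content in text format

Performs an arxiv search, downloads the top k results as PDFs, loads them as Documents, and returns them.

Parameters: query (str) – a plaintext search query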
Return type: Iterator[Document]

load(query: str) → List[Document][source]¶

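Runs an Arxiv search and gets the article texts plus the article meta information. See https://lukasschwab.me/arxiv.py/index.html#Search

Returns: a list of documents with the document.page_content in text format

Performs an arxiv search, downloads the top k results as PDFs, loads them as Documents, and returns them in a list.

Parameters: query (str) – a plaintext search query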
Return type: List[Document]
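The difference between lazy_load (an Iterator) and load (a List) can be sketched with a generator. This is an illustration under assumptions, not the wrapper's code: plain strings stand in for Documents, no network access or PDF parsing happens, and the idea that the eager variant simply drains the lazy one is an assumption.

```python
from typing import Iterator, List


def lazy_load_sketch(texts: List[str], load_max_docs: int) -> Iterator[str]:
    """Yield documents one at a time, stopping at load_max_docs."""
    for text in texts[:load_max_docs]:
        # Stands in for the download-and-parse step of a single paper.
        yield text


def load_sketch(texts: List[str], load_max_docs: int) -> List[str]:
    """Eager variant: drain the generator into a list."""
    return list(lazy_load_sketch(texts, load_max_docs))


# Placeholder content, not real paper text.
fake_texts = ["text of paper 1", "text of paper 2", "text of paper 3"]
first = next(lazy_load_sketch(fake_texts, load_max_docs=2))
everything = load_sketch(fake_texts, load_max_docs=2)
print(first, len(everything))
```

The lazy form lets a caller stop after the first few papers without paying for the rest, which matters when each item involves a download.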

classmethod parse_file(path: Union[str, Path], *, content_type: unicode = None, encoding: unicode = 'utf8', proto: Protocol = None, allow_pickle: bool = False) → Model¶

Parameters

path (Union[str, Path]) –
content_type (unicode) –
encoding (unicode) –
proto (Protocol) –
allow_pickle (bool) –

Return type

Model

classmethod parse_obj(obj: Any) → Model¶

Parameters: obj (Any) –
Return type: Model

classmethod parse_raw(b: Union[str, bytes], *, content_type: unicode = None, encoding: unicode = 'utf8', proto: Protocol = None, allow_pickle: bool = False) → Model¶

Parameters

b (Union[str, bytes]) –
content_type (unicode) –
encoding (unicode) –
proto (Protocol) –
allow_pickle (bool) –

Return type

Model

run(query: str) → str[source]¶

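Performs an arxiv search and returns a single string with the published date, title, authors, and summary of each article, with articles separated by two newlines.

If an error occurs or no documents are found, error text is returned instead. Wrapper for https://lukasschwab.me/arxiv.py/index.html#Search

Parameters: query (str) – a plaintext search query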
Return type: str
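The shape of the returned string can be sketched as follows. The `Paper` class, the `format_results` helper, and the exact field labels ("Published:", "Title:", and so on) are assumptions made for illustration; only the four fields and the two-newline separator come from the description above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Paper:
    """Hypothetical stand-in for one arXiv search result."""
    published: str
    title: str
    authors: List[str]
    summary: str


def format_results(papers: List[Paper], doc_content_chars_max: Optional[int] = 4000) -> str:
    """Join one block per paper with two newlines, then apply the length limit."""
    blocks = [
        f"Published: {p.published}\n"
        f"Title: {p.title}\n"
        f"Authors: {', '.join(p.authors)}\n"
        f"Summary: {p.summary}"
        for p in papers
    ]
    return "\n\n".join(blocks)[:doc_content_chars_max]


# Placeholder records, not real papers.
papers = [
    Paper("2024-01-01", "First Example Paper", ["A. Author"], "Placeholder summary one."),
    Paper("2024-02-01", "Second Example Paper", ["B. Author", "C. Author"], "Placeholder summary two."),
]
text = format_results(papers)
print(text)
```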

classmethod schema(by_alias: bool = True, ref_template: unicode = '#/definitions/{model}') → DictStrAny¶

Parameters

by_alias (bool) –
ref_template (unicode) –

Return type

DictStrAny

classmethod schema_json(*, by_alias: bool = True, ref_template: unicode = '#/definitions/{model}', **dumps_kwargs: Any) → unicode¶

Parameters

by_alias (bool) –
ref_template (unicode) –
dumps_kwargs (Any) –

Return type

unicode

classmethod update_forward_refs(**localns: Any) → None¶

Try to update ForwardRefs on fields based on this Model, globalns and localns.

Parameters: localns (Any) –
Return type: None

classmethod validate(value: Any) → Model¶

Parameters: value (Any) –
Return type: Model

Examples using ArxivAPIWrapper¶

arxiv.md
