langchain_community.utilities.arxiv.ArxivAPIWrapper
- class langchain_community.utilities.arxiv.ArxivAPIWrapper
Bases: BaseModel
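Wrapper around the Arxiv API.
To use it, you need the ``arxiv`` Python package installed: https://lukasschwab.me/arxiv.py/index.html
This wrapper uses the Arxiv API to run searches and fetch document summaries. By default, it returns the summaries of the top-k results. If the query is in the form of an arxiv identifier (see https://info.arxiv.org/help/find/index.html), it returns the paper corresponding to that identifier. Document content is limited by doc_content_chars_max. Set doc_content_chars_max to None if you do not want to limit the content size.
- Attributes:
top_k_results: the number of top-scored documents used by the arxiv tool.
ARXIV_MAX_QUERY_LENGTH: the truncation limit for queries used by the arxiv tool.
continue_on_failure (bool): if True, continue loading other URLs when one fails.
load_max_docs: a limit on the number of loaded documents.
load_all_available_meta: if True, the "metadata" of the loaded documents contains all available meta information (see https://lukasschwab.me/arxiv.py/index.html#Result); if False, the "metadata" contains only the published date, title, authors, and summary.
doc_content_chars_max: an optional truncation limit for the length of a document's content.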
- Example:
from langchain_community.utilities.arxiv import ArxivAPIWrapper

arxiv = ArxivAPIWrapper(
    top_k_results=3,
    ARXIV_MAX_QUERY_LENGTH=300,
    load_max_docs=3,
    load_all_available_meta=False,
    doc_content_chars_max=40000,
)
arxiv.run("tree of thought llm")
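The effect of doc_content_chars_max described above can be illustrated with a minimal sketch. The helper name truncate_content is hypothetical and is not part of the library; the sketch only assumes that the limit acts as a plain character cut-off and that None means "no limit".

```python
from typing import Optional


def truncate_content(text: str, max_chars: Optional[int] = 4000) -> str:
    """Hypothetical helper: keep at most max_chars characters of text.

    Passing None disables the limit, mirroring how the docstring
    describes doc_content_chars_max. This is not the library's code.
    """
    if max_chars is None:
        return text
    return text[:max_chars]


abstract = "x" * 5000
limited = truncate_content(abstract)          # uses the default limit of 4000
unlimited = truncate_content(abstract, None)  # no limit applied
```

With a limit in place, long content is cut silently, so a caller that needs full text would pass None explicitly.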
Create a new model by parsing and validating input data from keyword arguments.
Raises ValidationError if the input data cannot be parsed to form a valid model.
- param ARXIV_MAX_QUERY_LENGTH: int = 300
- param arxiv_exceptions: Any = None
- param continue_on_failure: bool = False
- param doc_content_chars_max: Optional[int] = 4000
- param load_all_available_meta: bool = False
- param load_max_docs: int = 100
- param top_k_results: int = 3
- classmethod construct(_fields_set: Optional[SetStr] = None, **values: Any) -> Model
Creates a new model, setting __dict__ and __fields_set__ from trusted or pre-validated data. Default values are respected, but no other validation is performed. Behaves as if Config.extra = 'allow' were set, since it adds all passed values.
- Parameters
_fields_set (Optional[SetStr]) –
values (Any) –
- Return type
Model
- copy(*, include: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, exclude: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, update: Optional[DictStrAny] = None, deep: bool = False) -> Model
Duplicate a model, optionally choosing which fields to include, exclude, and change.
- Parameters
include (Optional[Union[AbstractSetIntStr, MappingIntStrAny]]) – fields to include in new model
exclude (Optional[Union[AbstractSetIntStr, MappingIntStrAny]]) – fields to exclude from new model, as with values this takes precedence over include
update (Optional[DictStrAny]) – values to change/add in the new model. Note: the data is not validated before creating the new model: you should trust this data
deep (bool) – set to True to make a deep copy of the model
self (Model) –
- Returns
new model instance
- Return type
Model
- dict(*, include: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, exclude: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, by_alias: bool = False, skip_defaults: Optional[bool] = None, exclude_unset: bool = False, exclude_defaults: bool = False, exclude_none: bool = False) -> DictStrAny
Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.
- Parameters
include (Optional[Union[AbstractSetIntStr, MappingIntStrAny]]) –
exclude (Optional[Union[AbstractSetIntStr, MappingIntStrAny]]) –
by_alias (bool) –
skip_defaults (Optional[bool]) –
exclude_unset (bool) –
exclude_defaults (bool) –
exclude_none (bool) –
- Return type
DictStrAny
- classmethod from_orm(obj: Any) -> Model
- Parameters
obj (Any) –
- Return type
Model
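- get_summaries_as_docs(query: str) -> List[Document]
Performs an arxiv search and returns a list of documents with the paper summaries as their content.
If an error occurs or no documents are found, error text is returned instead. Wrapper for https://lukasschwab.me/arxiv.py/index.html#Search
- Args:
query: a plain-text search query.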
- Parameters
query (str) –
- Return type
List[Document]
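The idea of "documents with summaries as the content" can be sketched without the library. Everything below is hypothetical: SimpleDocument stands in for the real Document class, the result dictionaries stand in for search results, and the choice of metadata keys is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class SimpleDocument:
    """Hypothetical stand-in for a Document: text plus metadata."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def summaries_as_docs(results: List[Dict[str, Any]]) -> List[SimpleDocument]:
    """Build one document per result, using the abstract as the content."""
    docs = []
    for result in results:
        # Assumption: every field except the abstract is kept as metadata.
        metadata = {k: v for k, v in result.items() if k != "summary"}
        docs.append(SimpleDocument(page_content=result["summary"], metadata=metadata))
    return docs


sample = [{"title": "Paper A", "authors": ["Ada"], "summary": "Abstract of A."}]
docs = summaries_as_docs(sample)
```

Because only abstracts are used, an approach like this avoids downloading and parsing PDFs.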
- is_arxiv_identifier(query: str) -> bool
Check whether a query is an arXiv identifier.
- Parameters
query (str) –
- Return type
bool
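As a rough illustration of what such a check might involve, here is a simplified sketch that recognizes only new-style identifiers (YYMM.NNNN or YYMM.NNNNN, with an optional version suffix). The pattern, the function name looks_like_arxiv_ids, and the rule that every whitespace-separated token must match are all assumptions; the library's own check may differ, for example in how it handles old-style identifiers.

```python
import re

# Assumed pattern: two-digit year, month 01-12, a dot, 4-5 digits,
# and an optional version suffix such as "v2".
NEW_STYLE_ID = re.compile(r"\d{2}(0[1-9]|1[0-2])\.\d{4,5}(v\d+)?")


def looks_like_arxiv_ids(query: str) -> bool:
    """Return True if every whitespace-separated token is a new-style ID."""
    tokens = query.split()
    return bool(tokens) and all(NEW_STYLE_ID.fullmatch(t) for t in tokens)
```

Using fullmatch rather than match means a token such as "1605.08386abc" is rejected instead of being accepted on its prefix.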
- json(*, include: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, exclude: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, by_alias: bool = False, skip_defaults: Optional[bool] = None, exclude_unset: bool = False, exclude_defaults: bool = False, exclude_none: bool = False, encoder: Optional[Callable[[Any], Any]] = None, models_as_dict: bool = True, **dumps_kwargs: Any) -> unicode
Generate a JSON representation of the model, include and exclude arguments as per dict().
encoder is an optional function to supply as default to json.dumps(), other arguments as per json.dumps().
- Parameters
include (Optional[Union[AbstractSetIntStr, MappingIntStrAny]]) –
exclude (Optional[Union[AbstractSetIntStr, MappingIntStrAny]]) –
by_alias (bool) –
skip_defaults (Optional[bool]) –
exclude_unset (bool) –
exclude_defaults (bool) –
exclude_none (bool) –
encoder (Optional[Callable[[Any], Any]]) –
models_as_dict (bool) –
dumps_kwargs (Any) –
- Return type
unicode
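- lazy_load(query: str) -> Iterator[Document]
Runs an Arxiv search and fetches the article texts together with their meta information. See https://lukasschwab.me/arxiv.py/index.html#Search
Returns: documents with document.page_content in text format.
Performs an Arxiv search, downloads the top k results as PDFs, loads them as Documents, and returns them.
- Args:
query: a plain-text search query.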
- Parameters
query (str) –
- Return type
Iterator[Document]
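The lazy, one-at-a-time loading pattern, combined with a continue_on_failure flag like the one listed in the class attributes, can be sketched as a generator. This is an illustration under stated assumptions, not the library's implementation: lazy_fetch and fake_fetch are hypothetical names, and fake_fetch replaces the real PDF download so the example runs offline.

```python
from typing import Callable, Iterable, Iterator


def lazy_fetch(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    continue_on_failure: bool = False,
) -> Iterator[str]:
    """Yield fetched text one item at a time (hypothetical helper)."""
    for url in urls:
        try:
            text = fetch(url)
        except Exception:
            if not continue_on_failure:
                raise
            continue  # skip the failed item and move on to the next URL
        yield text


def fake_fetch(url: str) -> str:
    """Stand-in for a PDF download; fails for URLs containing 'bad'."""
    if "bad" in url:
        raise IOError(f"cannot fetch {url}")
    return f"text of {url}"


texts = list(lazy_fetch(["a", "bad", "b"], fake_fetch, continue_on_failure=True))
```

Wrapping the generator in list(...) gives the eager, list-returning counterpart, while iterating over it directly lets a caller stop early without fetching the remaining items.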
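- load(query: str) -> List[Document]
Runs an Arxiv search and fetches the article texts together with their meta information. See https://lukasschwab.me/arxiv.py/index.html#Search
Returns: a list of documents with document.page_content in text format.
Performs an Arxiv search, downloads the top k results as PDFs, loads them as Documents, and returns them in a list.
- Args:
query: a plain-text search query.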
- Parameters
query (str) –
- Return type
List[Document]
- classmethod parse_file(path: Union[str, Path], *, content_type: unicode = None, encoding: unicode = 'utf8', proto: Protocol = None, allow_pickle: bool = False) -> Model
- Parameters
path (Union[str, Path]) –
content_type (unicode) –
encoding (unicode) –
proto (Protocol) –
allow_pickle (bool) –
- Return type
Model
- classmethod parse_obj(obj: Any) -> Model
- Parameters
obj (Any) –
- Return type
Model
- classmethod parse_raw(b: Union[str, bytes], *, content_type: unicode = None, encoding: unicode = 'utf8', proto: Protocol = None, allow_pickle: bool = False) -> Model
- Parameters
b (Union[str, bytes]) –
content_type (unicode) –
encoding (unicode) –
proto (Protocol) –
allow_pickle (bool) –
- Return type
Model
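- run(query: str) -> str
Performs an arxiv search and returns a single string containing the publication date, title, authors, and summary of each article, with articles separated by two newlines.
If an error occurs or no documents are found, error text is returned instead. Wrapper for https://lukasschwab.me/arxiv.py/index.html#Search
- Args:
query: a plain-text search query.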
- Parameters
query (str) –
- Return type
str
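The output shape described for run (one block per article, blocks separated by two newlines) can be sketched as follows. PaperSummary and format_results are hypothetical names, and the field labels such as "Published:" are assumptions chosen for illustration; the real output format may use different labels.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class PaperSummary:
    """Hypothetical stand-in for one search result."""
    published: date
    title: str
    authors: List[str]
    summary: str


def format_results(results: List[PaperSummary]) -> str:
    """Join one text block per paper, separated by two newlines."""
    entries = [
        f"Published: {r.published}\n"
        f"Title: {r.title}\n"
        f"Authors: {', '.join(r.authors)}\n"
        f"Summary: {r.summary}"
        for r in results
    ]
    return "\n\n".join(entries)


papers = [
    PaperSummary(date(2023, 5, 17), "Paper A", ["Ada", "Bob"], "Abstract of A."),
    PaperSummary(date(2023, 6, 1), "Paper B", ["Cy"], "Abstract of B."),
]
report = format_results(papers)
```

A caller can recover the per-article blocks with report.split("\n\n"), provided no summary itself contains a blank line.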
- classmethod schema(by_alias: bool = True, ref_template: unicode = '#/definitions/{model}') -> DictStrAny
- Parameters
by_alias (bool) –
ref_template (unicode) –
- Return type
DictStrAny
- classmethod schema_json(*, by_alias: bool = True, ref_template: unicode = '#/definitions/{model}', **dumps_kwargs: Any) -> unicode
- Parameters
by_alias (bool) –
ref_template (unicode) –
dumps_kwargs (Any) –
- Return type
unicode
- classmethod update_forward_refs(**localns: Any) -> None
Try to update ForwardRefs on fields based on this Model, globalns and localns.
- Parameters
localns (Any) –
- Return type
None
- classmethod validate(value: Any) -> Model
- Parameters
value (Any) –
- Return type
Model