langchain_community.document_loaders.parsers.grobid.GrobidParser
- class langchain_community.document_loaders.parsers.grobid.GrobidParser(segment_sentences: bool, grobid_server: str = 'http://localhost:8070/api/processFulltextDocument')
Load `PDF` files of articles using `Grobid`.
Methods
__init__(segment_sentences[, grobid_server])
lazy_parse(blob) – Lazy parsing interface.
parse(blob) – Eagerly parse the blob into a document or documents.
process_xml(file_path, xml_data, ...) – Process the XML file from Grobid.
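The difference between `lazy_parse` and `parse` listed above can be illustrated with a minimal sketch. `ToyParser` and `ToyDocument` are hypothetical names invented for this illustration, and a plain string stands in for the blob; this is not the `GrobidParser` implementation and it does not contact a Grobid server.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class ToyDocument:
    """Hypothetical stand-in for a parsed document chunk."""

    page_content: str


class ToyParser:
    """Hypothetical parser used only to contrast lazy and eager parsing."""

    def lazy_parse(self, blob: str) -> Iterator[ToyDocument]:
        # Yield one document per non-empty line, only when the caller asks for it.
        for line in blob.splitlines():
            if line.strip():
                yield ToyDocument(page_content=line.strip())

    def parse(self, blob: str) -> List[ToyDocument]:
        # Eager variant: exhaust the lazy iterator and return everything as a list.
        return list(self.lazy_parse(blob))


parser = ToyParser()
text = "Title\n\nFirst paragraph.\nSecond paragraph."

first = next(parser.lazy_parse(text))  # only the first chunk is produced
everything = parser.parse(text)        # all chunks are produced up front
```

The lazy form is useful when a caller wants to stream or stop early; the eager form is convenient when the full list is needed anyway.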
- Parameters
segment_sentences (bool)
grobid_server (str)
- Return type
None
- __init__(segment_sentences: bool, grobid_server: str = 'http://localhost:8070/api/processFulltextDocument') -> None
- Parameters
segment_sentences (bool)
grobid_server (str)
- Return type
None
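A minimal sketch of constructing the parser with a non-default server follows. The helpers `grobid_endpoint` and `make_parser` are hypothetical names introduced here, not part of the library; the endpoint path is taken from the default value in the signature above, and the sketch assumes `langchain_community` is installed and that a Grobid server is reachable at the given address.

```python
# Path component copied from the default `grobid_server` value in the signature.
FULLTEXT_PATH = "/api/processFulltextDocument"


def grobid_endpoint(base_url: str = "http://localhost:8070") -> str:
    """Hypothetical helper: build the full-text endpoint URL from a base URL."""
    return base_url.rstrip("/") + FULLTEXT_PATH


def make_parser(base_url: str = "http://localhost:8070", segment_sentences: bool = False):
    """Hypothetical helper: construct a GrobidParser for the given server."""
    # Imported lazily so that this sketch can be loaded without the library.
    from langchain_community.document_loaders.parsers.grobid import GrobidParser

    return GrobidParser(
        segment_sentences=segment_sentences,
        grobid_server=grobid_endpoint(base_url),
    )
```

Calling `make_parser()` with no arguments should be equivalent to `GrobidParser(segment_sentences=False)`, since the helper reproduces the documented default URL.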