Tokenizer#
- class langchain_text_splitters.base.Tokenizer(chunk_overlap: int, tokens_per_chunk: int, decode: Callable[[List[int]], str], encode: Callable[[str], List[int]])[source]#
Tokenizer data class.
Attributes
Methods
__init__(chunk_overlap, tokens_per_chunk, ...)
- Parameters:
chunk_overlap (int)
tokens_per_chunk (int)
decode (Callable[[List[int]], str])
encode (Callable[[str], List[int]])
- __init__(chunk_overlap: int, tokens_per_chunk: int, decode: Callable[[List[int]], str], encode: Callable[[str], List[int]]) None #
- Parameters:
chunk_overlap (int)
tokens_per_chunk (int)
decode (Callable[[List[int]], str])
encode (Callable[[str], List[int]])
- Return type:
None
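As a sketch of how the four fields fit together, the stand-in dataclass below mirrors the signature documented above and pairs it with a toy character-level encode/decode (real code would import `Tokenizer` from `langchain_text_splitters.base`; the character-level codec is purely illustrative, not part of the library):

```python
from dataclasses import dataclass
from typing import Callable, List

# Stand-in mirroring the documented fields of
# langchain_text_splitters.base.Tokenizer; real code would import it instead.
@dataclass
class Tokenizer:
    chunk_overlap: int                  # overlap, in tokens, between chunks
    tokens_per_chunk: int               # maximum number of tokens per chunk
    decode: Callable[[List[int]], str]  # turns token ids back into text
    encode: Callable[[str], List[int]]  # turns text into token ids

# Toy character-level codec: each character's code point is one "token".
tokenizer = Tokenizer(
    chunk_overlap=2,
    tokens_per_chunk=10,
    decode=lambda ids: "".join(chr(i) for i in ids),
    encode=lambda text: [ord(c) for c in text],
)

ids = tokenizer.encode("hello")
text = tokenizer.decode(ids)  # round-trips back to "hello"
```

Splitting utilities in the same module consume such a `Tokenizer` to cut text into chunks of at most `tokens_per_chunk` tokens, with `chunk_overlap` tokens shared between consecutive chunks.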