Tokenizer#

class langchain_text_splitters.base.Tokenizer(chunk_overlap: int, tokens_per_chunk: int, decode: Callable[[List[int]], str], encode: Callable[[str], List[int]])[source]#

Tokenizer data class.

Attributes

Methods

__init__(chunk_overlap, tokens_per_chunk, ...)

Parameters:
  • chunk_overlap (int)

  • tokens_per_chunk (int)

  • decode (Callable[[List[int]], str])

  • encode (Callable[[str], List[int]])

__init__(chunk_overlap: int, tokens_per_chunk: int, decode: Callable[[List[int]], str], encode: Callable[[str], List[int]]) → None#
Parameters:
  • chunk_overlap (int)

  • tokens_per_chunk (int)

  • decode (Callable[[List[int]], str])

  • encode (Callable[[str], List[int]])

Return type:
  None
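
A minimal usage sketch, assuming the companion helper split_text_on_tokens from the same module (langchain_text_splitters.base) is available; the whitespace encode/decode pair below is a hypothetical stand-in for a real tokenizer such as tiktoken:

```python
from typing import Dict, List

from langchain_text_splitters.base import Tokenizer, split_text_on_tokens

# Toy whitespace tokenizer: words map to integer ids and back.
# Hypothetical stand-in for a real tokenizer (e.g. tiktoken).
vocab: Dict[str, int] = {}
inv_vocab: Dict[int, str] = {}


def encode(text: str) -> List[int]:
    ids = []
    for word in text.split():
        if word not in vocab:
            idx = len(vocab)
            vocab[word] = idx
            inv_vocab[idx] = word
        ids.append(vocab[word])
    return ids


def decode(ids: List[int]) -> str:
    return " ".join(inv_vocab[i] for i in ids)


tokenizer = Tokenizer(
    chunk_overlap=2,      # tokens shared between consecutive chunks
    tokens_per_chunk=10,  # hard cap on tokens per chunk
    decode=decode,
    encode=encode,
)

text = "Tokenizer splits long text into overlapping chunks of at most ten tokens per chunk here."
chunks = split_text_on_tokens(text=text, tokenizer=tokenizer)
print(chunks)  # two chunks; the second repeats the last 2 tokens of the first
```

Note that chunk_overlap should be smaller than tokens_per_chunk, since each step advances by tokens_per_chunk - chunk_overlap tokens.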