
Translation


The Translation pipeline translates text between languages. It supports over 100 languages and has automatic source language detection built in. The pipeline detects the language of each input text, loads a model for the source-target language combination, and translates the text into the target language.

Example

The following shows a simple example using this pipeline.

from txtai.pipeline import Translation

# Create and run pipeline
translate = Translation()
translate("This is a test translation into Spanish", "es")

See the link below for a more detailed example.

Notebook  Description
Translate text between languages  Streamline machine translation and language detection  Open In Colab

Configuration-driven example

Pipelines can be run with Python or configuration. Pipelines can be instantiated in configuration using the lowercase name of the pipeline class. Configuration-driven pipelines can be run with workflows or the API.

config.yml

# Create pipeline using lower case class name
translation:

# Run pipeline with workflow
workflow:
  translate:
    tasks:
      - action: translation
        args: ["es"]

Run with Workflows

from txtai import Application

# Create and run pipeline with workflow
app = Application("config.yml")
list(app.workflow("translate", ["This is a test translation into Spanish"]))

Run with API

CONFIG=config.yml uvicorn "txtai.api:app" &

curl \
  -X POST "http://localhost:8000/workflow" \
  -H "Content-Type: application/json" \
  -d '{"name":"translate", "elements":["This is a test translation into Spanish"]}'

Methods

Python documentation for the pipeline.

__init__(path=None, quantize=False, gpu=True, batch=64, langdetect=None, findmodels=True)

Constructs a new language translation pipeline.

Parameters:

path: optional path to model, accepts Hugging Face model hub id or local path, uses default model for task if not provided (default: None)
quantize: if model should be quantized (default: False)
gpu: True/False if GPU should be enabled, also supports a GPU device id (default: True)
batch: batch size used to incrementally process content (default: 64)
langdetect: set a custom language detection function, method must take a list of strings and return language codes for each, uses default language detector if not provided (default: None)
findmodels: True/False if the Hugging Face Hub will be searched for source-target translation models (default: True)
Source code in txtai/pipeline/text/translation.py
def __init__(self, path=None, quantize=False, gpu=True, batch=64, langdetect=None, findmodels=True):
    """
    Constructs a new language translation pipeline.

    Args:
        path: optional path to model, accepts Hugging Face model hub id or local path,
              uses default model for task if not provided
        quantize: if model should be quantized, defaults to False
        gpu: True/False if GPU should be enabled, also supports a GPU device id
        batch: batch size used to incrementally process content
        langdetect: set a custom language detection function, method must take a list of strings and return
                    language codes for each, uses default language detector if not provided
        findmodels: True/False if the Hugging Face Hub will be searched for source-target translation models
    """

    # Call parent constructor
    super().__init__(path if path else "facebook/m2m100_418M", quantize, gpu, batch)

    # Language detection
    self.detector = None
    self.langdetect = langdetect
    self.findmodels = findmodels

    # Language models
    self.models = {}
    self.ids = self.modelids()

__call__(texts, target='en', source=None, showmodels=False)

Translates text from source language into target language.

This method supports texts as a string or a list. If the input is a string, the return type is a string. If the input is a list, the return type is a list.

Parameters:

texts: text|list (required)
target: target language code (default: "en")
source: source language code, detects language if not provided (default: None)

Returns:

list of translated text

Source code in txtai/pipeline/text/translation.py
def __call__(self, texts, target="en", source=None, showmodels=False):
    """
    Translates text from source language into target language.

    This method supports texts as a string or a list. If the input is a string,
    the return type is string. If text is a list, the return type is a list.

    Args:
        texts: text|list
        target: target language code, defaults to "en"
        source: source language code, detects language if not provided

    Returns:
        list of translated text
    """

    values = [texts] if not isinstance(texts, list) else texts

    # Detect source languages
    languages = self.detect(values) if not source else [source] * len(values)
    unique = set(languages)

    # Build a dict from language to list of (index, text)
    langdict = {}
    for x, lang in enumerate(languages):
        if lang not in langdict:
            langdict[lang] = []
        langdict[lang].append((x, values[x]))

    results = {}
    for language in unique:
        # Get all indices and text values for a language
        inputs = langdict[language]

        # Translate text in batches
        outputs = []
        for chunk in self.batch([text for _, text in inputs], self.batchsize):
            outputs.extend(self.translate(chunk, language, target, showmodels))

        # Store output value
        for y, (x, _) in enumerate(inputs):
            if showmodels:
                model, op = outputs[y]
                results[x] = (op.strip(), language, model)
            else:
                results[x] = outputs[y].strip()

    # Return results in same order as input
    results = [results[x] for x in sorted(results)]
    return results[0] if isinstance(texts, str) else results
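The bookkeeping in `__call__` can be isolated: inputs are bucketed by detected language, translated one bucket at a time, then restored to the original input order. Below is a standalone sketch of that logic, with an identity function standing in for the model call; `group_and_restore` is a name introduced here for illustration.

```python
def group_and_restore(values, languages, translate=lambda batch, lang: batch):
    # Bucket inputs by language, keeping each text's original index
    langdict = {}
    for x, lang in enumerate(languages):
        langdict.setdefault(lang, []).append((x, values[x]))

    # "Translate" each bucket, then map outputs back to input positions
    results = {}
    for lang, inputs in langdict.items():
        outputs = translate([text for _, text in inputs], lang)
        for y, (x, _) in enumerate(inputs):
            results[x] = outputs[y]

    # Results come back in the same order as the inputs
    return [results[x] for x in sorted(results)]
```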