Transformers documentation

Pipelines

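Pipelines are a simple and effective way to use models for inference. These pipeline objects abstract away most of the complex code in the library and offer a simple API dedicated to several tasks, including named entity recognition, masked language modeling, sentiment analysis, feature extraction, and question answering. See the task summary for usage examples.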

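There are two categories of pipeline abstractions to be aware of:

pipeline(), the most powerful object, which encapsulates all the other pipelines.
Task-specific pipelines, which are available for audio, computer vision, natural language processing, and multimodal tasks.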
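
The relationship between the two categories can be pictured as a factory that maps a task name to a task-specific class. The sketch below is a toy illustration in plain Python, not the library's code: `TASK_REGISTRY`, `make_pipeline`, and the two `Toy...` classes are hypothetical names invented for this example.

```python
# Toy illustration only: none of these names exist in Transformers.
from typing import Callable, Dict


class ToyUppercasePipeline:
    """Hypothetical task-specific pipeline that upper-cases its input."""

    def __call__(self, text: str) -> str:
        return text.upper()


class ToyWordCountPipeline:
    """Hypothetical task-specific pipeline that counts whitespace-separated words."""

    def __call__(self, text: str) -> int:
        return len(text.split())


# Maps a task name to the class that handles it.
TASK_REGISTRY: Dict[str, Callable[[], object]] = {
    "uppercase": ToyUppercasePipeline,
    "word-count": ToyWordCountPipeline,
}


def make_pipeline(task: str):
    """Return an instance of the task-specific class registered for `task`."""
    if task not in TASK_REGISTRY:
        raise ValueError(f"Unknown task {task!r}; expected one of {sorted(TASK_REGISTRY)}")
    return TASK_REGISTRY[task]()


pipe = make_pipeline("word-count")
print(type(pipe).__name__, pipe("This restaurant is awesome"))
```

The caller only names a task; the factory decides which class to build, which is the convenience the wrapper is described as providing.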

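The pipeline abstraction

The pipeline abstraction is a wrapper around all the other available pipelines. It is instantiated like any other pipeline but provides additional conveniences.

Simple call on a single item: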

>>> from transformers import pipeline
>>> pipe = pipeline("text-classification")
>>> pipe("This restaurant is awesome")
[{'label': 'POSITIVE', 'score': 0.9998743534088135}]

If you want to use a specific model from the Hub, you can omit the task when the model on the Hub already defines it:

>>> pipe = pipeline(model="FacebookAI/roberta-large-mnli")
>>> pipe("This restaurant is awesome")
[{'label': 'NEUTRAL', 'score': 0.7313136458396912}]

To call a pipeline on several items, pass them as a list:

>>> pipe = pipeline("text-classification")
>>> pipe(["This restaurant is awesome", "This restaurant is awful"])
[{'label': 'POSITIVE', 'score': 0.9998743534088135},
 {'label': 'NEGATIVE', 'score': 0.9996669292449951}]

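To iterate over a full dataset, it is recommended to pass the dataset directly to the pipeline. This means you do not need to allocate the whole dataset at once, nor do you need to do the batching yourself. On GPU this should run as fast as a custom loop. If it does not, please open an issue.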

import datasets
from transformers import pipeline
from transformers.pipelines.pt_utils import KeyDataset
from tqdm.auto import tqdm

pipe = pipeline("automatic-speech-recognition", model="facebook/wav2vec2-base-960h", device=0)
dataset = datasets.load_dataset("superb", name="asr", split="test")

# KeyDataset (only *pt*) will simply return the item in the dict returned by the dataset item
# as we're not interested in the *target* part of the dataset. For sentence pair use KeyPairDataset
for out in tqdm(pipe(KeyDataset(dataset, "file"))):
    print(out)
    # {"text": "NUMBER TEN FRESH NELLY IS WAITING ON YOU GOOD NIGHT HUSBAND"}
    # {"text": ....}
    # ....
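
The point about not allocating the whole dataset and not batching by hand can be illustrated without the library. The sketch below is not the Transformers implementation; it is a minimal plain-Python picture of reading items lazily and grouping them into batches. `lazy_batches` is a hypothetical helper name introduced for this example.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def lazy_batches(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield lists of up to `batch_size` items, reading `items` only as needed."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        # islice pulls at most `batch_size` items from the shared iterator.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# The input is a generator, so the full sequence is never held in memory.
squares = (n * n for n in range(5))
batches = list(lazy_batches(squares, batch_size=2))
print(batches)  # [[0, 1], [4, 9], [16]]
```

Only one batch exists in memory at a time, which is the property the paragraph above attributes to passing a dataset directly.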

For ease of use, a generator can also be used:

from transformers import pipeline

pipe = pipeline("text-classification")


def data():
    while True:
        # This could come from a dataset, a database, a queue or HTTP request
        # in a server
        # Caveat: because this is iterative, you cannot use `num_workers > 1` variable
        # to use multiple threads to preprocess data. You can still have 1 thread that
        # does the preprocessing while the main runs the big inference
        yield "This is a test"


for out in pipe(data()):
    print(out)
    # {"label": ..., "score": ...}
    # ....
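
The comments in `data()` mention that items could come from a queue. As a rough sketch of that case, the generator below drains a standard-library `queue.Queue` until it sees a sentinel, so the stream ends instead of looping forever. `texts_from_queue` and `_STOP` are hypothetical names for this example, and the pipeline call itself is left out so the snippet runs without downloading a model.

```python
import queue
from typing import Iterator

# Hypothetical sentinel object that marks the end of the stream.
_STOP = object()


def texts_from_queue(q: "queue.Queue") -> Iterator[str]:
    """Yield items from `q` in order until the sentinel is received."""
    while True:
        item = q.get()  # blocks until an item is available
        if item is _STOP:
            return
        yield item


q = queue.Queue()
for text in ["This restaurant is awesome", "This restaurant is awful"]:
    q.put(text)
q.put(_STOP)

received = list(texts_from_queue(q))
print(received)
```

A generator of this shape could be passed to the pipeline in place of `data()`, for example `pipe(texts_from_queue(q))`, assuming a producer elsewhere fills the queue.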


