Transformers

GPT-J

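Overview

The GPT-J model was released by Ben Wang and Aran Komatsuzaki in the kingoflolz/mesh-transformer-jax repository. It is a GPT-2-like causal language model trained on the Pile dataset.

Usage tips

Loading GPT-J in float32 requires RAM of at least 2x the model size: 1x for the initial weights and another 1x to load the checkpoint. For GPT-J, this means at least 48GB of RAM just to load the model. There are a few options for reducing RAM usage. The torch_dtype argument can be used to initialize the model in half precision, on a CUDA device only. There is also an fp16 branch (the float16 revision) that stores the fp16 weights, which can reduce RAM usage further: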

>>> from transformers import GPTJForCausalLM
>>> import torch

>>> device = "cuda"
>>> model = GPTJForCausalLM.from_pretrained(
...     "EleutherAI/gpt-j-6B",
...     revision="float16",
...     torch_dtype=torch.float16,
... ).to(device)

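The model should fit on a 16GB GPU for inference. Training or fine-tuning requires considerably more GPU memory. For example, the Adam optimizer keeps four copies of the model: the model itself, the gradients, the average of the gradients, and the squared average of the gradients. So even with mixed precision, at least 4x the model size in GPU memory is needed, because gradient updates are performed in fp32. This does not include activations and data batches, which require additional GPU memory. Solutions such as DeepSpeed should therefore be explored for training or fine-tuning the model. Another option is to train or fine-tune the model on TPU using the original codebase and then convert it to Transformers format for inference. Instructions for that can be found here.
Although the embedding matrix has a size of 50400, the GPT-2 tokenizer uses only 50257 entries. The extra tokens were added for efficiency on TPUs. To avoid a mismatch between the embedding matrix size and the vocabulary size, the GPT-J tokenizer contains 143 extra tokens <|extratoken_1|>... <|extratoken_143|>, so the tokenizer's vocab_size also becomes 50400.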
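The numbers in the two notes above can be checked with a few lines of arithmetic. This is a rough sketch that assumes a nominal 6 billion parameters (from the "6B" in the model name) and counts only the four fp32 model-sized copies listed above, not activations or data batches.

```python
# Assumption: ~6e9 parameters (nominal); 4 bytes per fp32 value.
NOMINAL_PARAMS = 6_000_000_000
FP32_BYTES = 4

# Adam keeps four model-sized tensors: weights, gradients, and the running
# averages of the gradients and of the squared gradients.
ADAM_COPIES = 4
training_gb = ADAM_COPIES * NOMINAL_PARAMS * FP32_BYTES / 1e9

# Tokenizer padding: embedding rows minus the entries the GPT-2 tokenizer uses.
EMBEDDING_ROWS = 50400
GPT2_VOCAB = 50257
n_extra = EMBEDDING_ROWS - GPT2_VOCAB
extra_tokens = [f"<|extratoken_{i}|>" for i in range(1, n_extra + 1)]

print(f"fp32 training state: at least {training_gb:.0f} GB")
print(n_extra, extra_tokens[0], extra_tokens[-1])
```

Under these assumptions the fp32 training state alone comes to roughly 96 GB (decimal), and the list holds the 143 extra token names.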

Usage example

The generate() method can be used to generate text with the GPT-J model.

>>> from transformers import AutoModelForCausalLM, AutoTokenizer

>>> model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6B")
>>> tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")

>>> prompt = (
...     "In a shocking finding, scientists discovered a herd of unicorns living in a remote, "
...     "previously unexplored valley, in the Andes Mountains. Even more surprising to the "
...     "researchers was the fact that the unicorns spoke perfect English."
... )

>>> input_ids = tokenizer(prompt, return_tensors="pt").input_ids

>>> gen_tokens = model.generate(
...     input_ids,
...     do_sample=True,
...     temperature=0.9,
...     max_length=100,
... )
>>> gen_text = tokenizer.batch_decode(gen_tokens)[0]

...or in float16 precision:

>>> from transformers import GPTJForCausalLM, AutoTokenizer
>>> import torch

>>> device = "cuda"
>>> model = GPTJForCausalLM.from_pretrained("EleutherAI/gpt-j-6B", torch_dtype=torch.float16).to(device)
>>> tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")

>>> prompt = (
...     "In a shocking finding, scientists discovered a herd of unicorns living in a remote, "
...     "previously unexplored valley, in the Andes Mountains. Even more surprising to the "
...     "researchers was the fact that the unicorns spoke perfect English."
... )

>>> input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)

>>> gen_tokens = model.generate(
...     input_ids,
...     do_sample=True,
...     temperature=0.9,
...     max_length=100,
... )
>>> gen_text = tokenizer.batch_decode(gen_tokens)[0]
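For decoder-only models such as GPT-J, generate() returns the prompt tokens followed by the newly generated ones, and max_length counts both. The helper below, continuation_only, is a hypothetical convenience (not part of Transformers) sketching how to keep only the continuation. It is shown on plain lists of token ids so that it runs without downloading the model.

```python
from typing import List, Sequence


def continuation_only(sequences: Sequence[Sequence[int]], prompt_len: int) -> List[List[int]]:
    """Drop the first `prompt_len` token ids from each generated sequence."""
    return [list(seq[prompt_len:]) for seq in sequences]


# Toy ids standing in for generate() output: 3 prompt tokens + 2 new tokens.
toy_output = [[101, 102, 103, 7, 8]]
new_tokens = continuation_only(toy_output, prompt_len=3)
print(new_tokens)  # [[7, 8]]

# With the objects from the examples above, the equivalent tensor slice is:
#   new_tokens = gen_tokens[:, input_ids.shape[1]:]
#   new_text = tokenizer.batch_decode(new_tokens)[0]
```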

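Resources

A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with GPT-J. If you are interested in submitting a resource to be included here, feel free to open a Pull Request and we will review it! Ideally, a resource should demonstrate something new rather than duplicate an existing one.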

Text Generation

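Description of GPT-J.
A blog on how to deploy GPT-J 6B for inference using Hugging Face Transformers and Amazon SageMaker.
A blog on how to accelerate GPT-J inference with DeepSpeed-Inference on GPUs.
A blog post introducing GPT-J-6B: 6B JAX-Based Transformer. 🌎
A notebook for a GPT-J-6B inference demo. 🌎
Another notebook demonstrating inference with GPT-J-6B.
Causal language modeling chapter of the 🤗 Hugging Face Course.
GPTJForCausalLM is supported by this causal language modeling example script, text generation example script, and notebook.
TFGPTJForCausalLM is supported by this causal language modeling example script and notebook.
FlaxGPTJForCausalLM is supported by this causal language modeling example script and notebook.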

Documentation resources

GPTJConfig

class transformers.GPTJConfig

GPTJModel

class transformers.GPTJModel

forward

GPTJForCausalLM

class transformers.GPTJForCausalLM

forward

GPTJForSequenceClassification

class transformers.GPTJForSequenceClassification

forward

GPTJForQuestionAnswering

class transformers.GPTJForQuestionAnswering

forward

TFGPTJModel

class transformers.TFGPTJModel

call

TFGPTJForCausalLM

class transformers.TFGPTJForCausalLM

call

TFGPTJForSequenceClassification

class transformers.TFGPTJForSequenceClassification

call

TFGPTJForQuestionAnswering

class transformers.TFGPTJForQuestionAnswering

call

FlaxGPTJModel

class transformers.FlaxGPTJModel

__call__

FlaxGPTJForCausalLM

class transformers.FlaxGPTJForCausalLM

__call__
