Anthropic Haiku Cookbook¶
Anthropic has released Claude 3 Haiku. This notebook gives you a quick start with the Haiku model and helps you explore its capabilities on text and vision tasks.
Installation¶
In [ ]:
!pip install llama-index
!pip install llama-index-llms-anthropic
!pip install llama-index-multi-modal-llms-anthropic
In [ ]:
from llama_index.llms.anthropic import Anthropic
from llama_index.multi_modal_llms.anthropic import AnthropicMultiModal
Set API keys¶
In [ ]:
import os
os.environ["ANTHROPIC_API_KEY"] = "YOUR ANTHROPIC API KEY"
Chat/Complete using the model¶
In [ ]:
llm = Anthropic(model="claude-3-haiku-20240307")
In [ ]:
response = llm.complete("LlamaIndex is ")
print(response)
LlamaIndex is an open-source library that provides a set of tools and interfaces for building knowledge-based applications using large language models (LLMs) like GPT-3, GPT-J, and GPT-Neo. It is designed to make it easier to work with LLMs by providing a high-level API for tasks such as:

1. **Data Ingestion**: LlamaIndex supports ingesting a variety of data sources, including text files, PDFs, web pages, and databases, and organizing them into a knowledge graph.
2. **Query Handling**: LlamaIndex provides a simple and intuitive interface for querying the knowledge graph, allowing users to ask questions and get relevant information from the underlying data.
3. **Retrieval and Ranking**: LlamaIndex uses advanced retrieval and ranking algorithms to identify the most relevant information for a given query, leveraging the capabilities of the underlying LLM.
4. **Summarization and Synthesis**: LlamaIndex can generate summaries and synthesize new information based on the content of the knowledge graph, helping users to quickly understand and extract insights from large amounts of data.
5. **Extensibility**: LlamaIndex is designed to be highly extensible, allowing developers to integrate custom data sources, retrieval algorithms, and other functionality as needed.

The primary goal of LlamaIndex is to make it easier for developers to build knowledge-based applications that leverage the power of large language models, without having to worry about the low-level details of working with these models directly. By providing a high-level API and a set of reusable components, LlamaIndex aims to accelerate the development of a wide range of applications, from chatbots and virtual assistants to knowledge management systems and research tools.
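The call above uses `complete`. The same `Anthropic` LLM also exposes a chat interface; here is a minimal sketch using LlamaIndex's `ChatMessage` (the message contents below are illustrative, not from the cookbook):

In [ ]:
from llama_index.core.llms import ChatMessage

# Illustrative chat-style call against the same Haiku model
messages = [
    ChatMessage(role="system", content="You are a helpful, concise assistant."),
    ChatMessage(role="user", content="What is LlamaIndex in one sentence?"),
]
response = llm.chat(messages)
print(response)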
Using the Multi-Modal model¶
In this example, we will show how to use the Multi-Modal model. A multi-modal model can handle several types of input data, such as images, text, and audio. Here we use it to process image and text data together.
Download the image¶
In [ ]:
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/images/prometheus_paper_card.png' -O 'prometheus_paper_card.png'
--2024-03-14 03:27:01--  https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/images/prometheus_paper_card.png
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 2606:50c0:8000::154, 2606:50c0:8001::154, 2606:50c0:8002::154, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|2606:50c0:8000::154|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1002436 (979K) [image/png]
Saving to: 'prometheus_paper_card.png'

prometheus_paper_ca 100%[===================>] 978.94K  --.-KB/s    in 0.07s

2024-03-14 03:27:01 (13.3 MB/s) - 'prometheus_paper_card.png' saved [1002436/1002436]
In [ ]:
from PIL import Image
import matplotlib.pyplot as plt
img = Image.open("prometheus_paper_card.png")
plt.imshow(img)
Out[ ]:
<matplotlib.image.AxesImage at 0x167e83290>
Load the image¶
In [ ]:
from llama_index.core import SimpleDirectoryReader

# put your local directory here
image_documents = SimpleDirectoryReader(
    input_files=["prometheus_paper_card.png"]
).load_data()

# Initialize the Anthropic MultiModal class
anthropic_mm_llm = AnthropicMultiModal(
    model="claude-3-haiku-20240307", max_tokens=300
)
Test the image query¶
In [ ]:
response = anthropic_mm_llm.complete(
    prompt="Describe the images as an alternative text",
    image_documents=image_documents,
)

print(response)
The image is a poster titled "Prometheus: Inducing Fine-Grained Evaluation Capability In Language Models". It provides information about the Prometheus project, which is an open-source LLM (LLama2) evaluator specializing in fine-grained evaluations using custom rubrics. The poster is divided into three main sections: Contributions, Results, and Technical Bits. The Contributions section introduces Prometheus as an open-source LLM evaluator that uses custom rubrics for fine-grained evaluations. The Feedback Collection section describes a dataset designed for fine-tuning evaluator LLMs with custom, fine-grained score rubrics. The Results section highlights three key findings: 1) Prometheus matches or outperforms GPT-4 on three datasets, and its written feedback was preferred over GPT-4 by human annotators 58.6% of the time; 2) Prometheus can function as a reward model, achieving high levels of agreement with human evaluators when re-purposed for ranking/grading tasks; and 3) reference answers are crucial for LLM evaluations, as excluding them and then using feedback distillation led to performance degradations against all other considered factors. The Technical Bits section provides a visual overview of the Feedback Collection process, which involves using GPT-4 to generate score rubrics and
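The description above cuts off mid-sentence because of the `max_tokens=300` cap. For longer answers it can be nicer to stream tokens as they arrive; here is a minimal sketch, assuming `AnthropicMultiModal` implements LlamaIndex's standard `stream_complete` interface:

In [ ]:
# Stream the image description chunk by chunk (assumed stream_complete support)
stream = anthropic_mm_llm.stream_complete(
    prompt="Describe the images as an alternative text",
    image_documents=image_documents,
)
for chunk in stream:
    print(chunk.delta, end="")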
Let's compare the response speed of the different models¶
We will randomly generate 10 prompts and check the average response time.
Generate 10 random prompts¶
In [ ]:
import random

# Lists of potential subjects and actions
subjects = ["a cat", "an astronaut", "a teacher", "a robot", "a pirate"]
actions = [
    "exploring a mysterious cave",
    "discovering a hidden treasure",
    "solving a complex puzzle",
    "inventing a new gadget",
    "discovering a new planet",
]

prompts = []

# Generate 10 random prompts
for _ in range(10):
    subject = random.choice(subjects)
    action = random.choice(actions)
    prompt = f"{subject} {action}"
    prompts.append(prompt)
In [ ]:
import time

# Compute the average response time for a given model across the prompts
def average_response_time(model, prompts):
    total_time_taken = 0
    llm = Anthropic(model=model, max_tokens=300)
    for prompt in prompts:
        start_time = time.time()
        _ = llm.complete(prompt)
        end_time = time.time()
        total_time_taken = total_time_taken + end_time - start_time

    return total_time_taken / len(prompts)
In [ ]:
haiku_avg_response_time = average_response_time(
    "claude-3-haiku-20240307", prompts
)
In [ ]:
opus_avg_response_time = average_response_time(
    "claude-3-opus-20240229", prompts
)
In [ ]:
sonnet_avg_response_time = average_response_time(
    "claude-3-sonnet-20240229", prompts
)
In [ ]:
print(f"Avg. time taken by Haiku model: {haiku_avg_response_time} seconds")
print(f"Avg. time taken by Opus model: {opus_avg_response_time} seconds")
print(f"Avg. time taken by Sonnet model: {sonnet_avg_response_time} seconds")
print(f"Avg. time taken by Haiku model: {haiku_avg_response_time} seconds")
print(f"Avg. time taken by Opus model: {opus_avg_response_time} seconds")
print(f"Avg. time taken by Sonnet model: {sonnet_avg_response_time} seconds")
Avg. time taken by Haiku model: 3.87667396068573 seconds
Avg. time taken by Opus model: 18.772309136390685 seconds
Avg. time taken by Sonnet model: 47.86884641647339 seconds
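Average end-to-end latency hides time-to-first-token, which often matters more for interactive applications. Below is a minimal sketch that measures it via `stream_complete`; the `time_to_first_token` helper is our own illustration, not part of the cookbook:

In [ ]:
import time

# Hypothetical helper: time until the first streamed chunk arrives
def time_to_first_token(model, prompt):
    llm = Anthropic(model=model, max_tokens=300)
    start_time = time.time()
    for _ in llm.stream_complete(prompt):
        return time.time() - start_time  # stop at the first chunk

print(time_to_first_token("claude-3-haiku-20240307", prompts[0]))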