使用 LlamaIndex 进行 LlaVa 演示¶

在这个示例中，我们将演示如何使用 LlaVa 进行以下任务：

检索增强图像字幕
Pydantic 结构化输出
使用 Llava-13b 进行多模态检索增强生成（RAG）

LLaVA 的背景：大型语言和视觉助手

网站
论文
Github
LLaVA 13b 现在在 Replicate 中得到支持：点击这里

关于 LlamaIndex： LlaVa+Replicate 让我们能够在本地运行图像理解，并将多模态知识与我们的基于 RAG 知识的系统相结合。

使用Llava-13b进行检索增强图像字幕¶

使用复制服务通过LlamaIndex提供LLaVa模型¶

In [ ]:

Copied!

%pip install llama-index-vector-stores-qdrant
%pip install llama-index-readers-file
%pip install llama-index-multi-modal-llms-replicate
%pip install llama-index-vector-stores-qdrant
%pip install llama-index-readers-file
%pip install llama-index-multi-modal-llms-replicate

In [ ]:

Copied!





%pip install unstructured replicate
%pip install llama_index ftfy regex tqdm
%pip install git+https://github.com/openai/CLIP.git
%pip install torch torchvision
%pip install matplotlib scikit-image
%pip install -U qdrant_client
%pip install unstructured replicate
%pip install llama_index ftfy regex tqdm
%pip install git+https://github.com/openai/CLIP.git
%pip install torch torchvision
%pip install matplotlib scikit-image
%pip install -U qdrant_client

UsageError: Line magic function `%` not found.

In [ ]:

Copied!

import os

REPLICATE_API_TOKEN = "..."  # 在此处填入你的Relicate API令牌
os.environ["REPLICATE_API_TOKEN"] = REPLICATE_API_TOKEN

import os

REPLICATE_API_TOKEN = "..."  # 在此处填入你的Relicate API令牌
os.environ["REPLICATE_API_TOKEN"] = REPLICATE_API_TOKEN

从特斯拉10K文件中提取数据¶

在这些部分中，我们使用Unstructured来解析表格和非表格元素。

提取元素¶

我们使用Unstructured来从10-K报告中提取表格和非表格元素。

In [ ]:

Copied!

!wget "https://www.dropbox.com/scl/fi/mlaymdy1ni1ovyeykhhuk/tesla_2021_10k.htm?rlkey=qf9k4zn0ejrbm716j0gg7r802&dl=1" -O tesla_2021_10k.htm
!wget "https://docs.google.com/uc?export=download&id=1UU0xc3uLXs-WG0aDQSXjGacUkp142rLS" -O texas.jpg
!wget "https://www.dropbox.com/scl/fi/mlaymdy1ni1ovyeykhhuk/tesla_2021_10k.htm?rlkey=qf9k4zn0ejrbm716j0gg7r802&dl=1" -O tesla_2021_10k.htm
!wget "https://docs.google.com/uc?export=download&id=1UU0xc3uLXs-WG0aDQSXjGacUkp142rLS" -O texas.jpg

In [ ]:

Copied!





from llama_index.readers.file import FlatReader
from pathlib import Path
from llama_index.core.node_parser import UnstructuredElementNodeParser

reader = FlatReader()
docs_2021 = reader.load_data(Path("tesla_2021_10k.htm"))
node_parser = UnstructuredElementNodeParser()
from llama_index.readers.file import FlatReader
from pathlib import Path
from llama_index.core.node_parser import UnstructuredElementNodeParser

reader = FlatReader()
docs_2021 = reader.load_data(Path("tesla_2021_10k.htm"))
node_parser = UnstructuredElementNodeParser()

In [ ]:

Copied!

import openai

OPENAI_API_KEY = "..."
openai.api_key = OPENAI_API_KEY  # 在这里添加你的openai api密钥
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY
import openai

OPENAI_API_KEY = "..."
openai.api_key = OPENAI_API_KEY  # 在这里添加你的openai api密钥
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY

In [ ]:

Copied!





import os
import pickle

if not os.path.exists("2021_nodes.pkl"):
    raw_nodes_2021 = node_parser.get_nodes_from_documents(docs_2021)
    pickle.dump(raw_nodes_2021, open("2021_nodes.pkl", "wb"))
else:
    raw_nodes_2021 = pickle.load(open("2021_nodes.pkl", "rb"))
import os
import pickle

if not os.path.exists("2021_nodes.pkl"):
    raw_nodes_2021 = node_parser.get_nodes_from_documents(docs_2021)
    pickle.dump(raw_nodes_2021, open("2021_nodes.pkl", "wb"))
else:
    raw_nodes_2021 = pickle.load(open("2021_nodes.pkl", "rb"))

In [ ]:

Copied!

nodes_2021, objects_2021 = node_parser.get_nodes_and_objects(raw_nodes_2021)
nodes_2021, objects_2021 = node_parser.get_nodes_and_objects(raw_nodes_2021)

设置可组合的检索器¶

现在我们已经提取了表格及其摘要，我们可以在LlamaIndex中设置一个可组合的检索器来查询这些表格。

构建检索器¶

In [ ]:

Copied!

from llama_index.core import VectorStoreIndex

# 构建顶层向量索引 + 查询引擎
vector_index = VectorStoreIndex(nodes=nodes_2021, objects=objects_2021)
query_engine = vector_index.as_query_engine(similarity_top_k=5, verbose=True)
from llama_index.core import VectorStoreIndex

# 构建顶层向量索引 + 查询引擎
vector_index = VectorStoreIndex(nodes=nodes_2021, objects=objects_2021)
query_engine = vector_index.as_query_engine(similarity_top_k=5, verbose=True)

In [ ]:

Copied!





from PIL import Image
import matplotlib.pyplot as plt

imageUrl = "./texas.jpg"
image = Image.open(imageUrl).convert("RGB")

plt.figure(figsize=(16, 5))
plt.imshow(image)
from PIL import Image
import matplotlib.pyplot as plt

imageUrl = "./texas.jpg"
image = Image.open(imageUrl).convert("RGB")

plt.figure(figsize=(16, 5))
plt.imshow(image)

Out[ ]:

<matplotlib.image.AxesImage at 0x7f1b2e09b790>

No description has been provided for this image

使用Replicate通过LlamaIndex运行LLaVa模型进行图像理解¶

LLaVa模型是一种用于图像理解的模型，可以通过Replicate和LlamaIndex来运行。Replicate是一个用于复制和并行化任务的工具，而LlamaIndex是一个用于图像理解的工具。通过结合使用这两个工具，可以有效地运行LLaVa模型并进行图像理解。

In [ ]:

Copied!





from llama_index.multi_modal_llms.replicate import ReplicateMultiModal
from llama_index.core.schema import ImageDocument
from llama_index.multi_modal_llms.replicate.base import (
    REPLICATE_MULTI_MODAL_LLM_MODELS,
)

print(imageUrl)

llava_multi_modal_llm = ReplicateMultiModal(
    model=REPLICATE_MULTI_MODAL_LLM_MODELS["llava-13b"],
    max_new_tokens=200,
    temperature=0.1,
)

prompt = "which Tesla factory is shown in the image? Please answer just the name of the factory."

llava_response = llava_multi_modal_llm.complete(
    prompt=prompt,
    image_documents=[ImageDocument(image_path=imageUrl)],
)
from llama_index.multi_modal_llms.replicate import ReplicateMultiModal
from llama_index.core.schema import ImageDocument
from llama_index.multi_modal_llms.replicate.base import (
    REPLICATE_MULTI_MODAL_LLM_MODELS,
)

print(imageUrl)

llava_multi_modal_llm = ReplicateMultiModal(
    model=REPLICATE_MULTI_MODAL_LLM_MODELS["llava-13b"],
    max_new_tokens=200,
    temperature=0.1,
)

prompt = "which Tesla factory is shown in the image? Please answer just the name of the factory."

llava_response = llava_multi_modal_llm.complete(
    prompt=prompt,
    image_documents=[ImageDocument(image_path=imageUrl)],
)

./texas.jpg

In [ ]:

Copied!

print(llava_response.text)
print(llava_response.text)

Gigafactory

从LlamaIndex知识库中检索与LLaVa图像理解相关的信息，以增强“图像字幕生成”功能¶

In [ ]:

Copied!

rag_response = query_engine.query(llava_response.text)
rag_response = query_engine.query(llava_response.text)

In [ ]:

Copied!

print(rag_response)
print(rag_response)

Gigafactory is a term used by Tesla to describe its expansive manufacturing facilities that are strategically located in various regions worldwide. These factories are specifically designed to produce a range of Tesla products, including electric vehicles, battery cells, and energy storage solutions. Currently, Tesla operates Gigafactories in Nevada, New York, Shanghai, and Berlin, with plans to establish another one in Texas. The primary objective of these Gigafactories is to significantly enhance Tesla's production capabilities, drive down costs, and optimize operational efficiency across its manufacturing operations.

使用LLaVa进行多模态Pydantic程序¶

初始化Instagram广告Pydantic类¶

In [ ]:

Copied!

input_image_path = Path("instagram_images")
if not input_image_path.exists():
    Path.mkdir(input_image_path)
input_image_path = Path("instagram_images")
if not input_image_path.exists():
    Path.mkdir(input_image_path)

In [ ]:

Copied!

!wget "https://docs.google.com/uc?export=download&id=12ZpBBFkYu-jzz1iz356U5kMikn4uN9ww" -O ./instagram_images/jordan.png
!wget "https://docs.google.com/uc?export=download&id=12ZpBBFkYu-jzz1iz356U5kMikn4uN9ww" -O ./instagram_images/jordan.png

Will not apply HSTS. The HSTS database must be a regular and non-world-writable file.
ERROR: could not open HSTS store at '/home/loganm/.wget-hsts'. HSTS will be disabled.
--2024-01-15 14:39:59--  https://docs.google.com/uc?export=download&id=12ZpBBFkYu-jzz1iz356U5kMikn4uN9ww
Resolving docs.google.com (docs.google.com)... 142.251.32.78, 2607:f8b0:400b:807::200e
Connecting to docs.google.com (docs.google.com)|142.251.32.78|:443... connected.
HTTP request sent, awaiting response... 303 See Other
Location: https://drive.usercontent.google.com/download?id=12ZpBBFkYu-jzz1iz356U5kMikn4uN9ww&export=download [following]
--2024-01-15 14:40:00--  https://drive.usercontent.google.com/download?id=12ZpBBFkYu-jzz1iz356U5kMikn4uN9ww&export=download
Resolving drive.usercontent.google.com (drive.usercontent.google.com)... 142.251.32.65, 2607:f8b0:400b:802::2001
Connecting to drive.usercontent.google.com (drive.usercontent.google.com)|142.251.32.65|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2722061 (2.6M) [image/png]
Saving to: ‘./instagram_images/jordan.png’

./instagram_images/ 100%[===================>]   2.60M  10.7MB/s    in 0.2s    

2024-01-15 14:40:01 (10.7 MB/s) - ‘./instagram_images/jordan.png’ saved [2722061/2722061]

In [ ]:

Copied!





from pydantic import BaseModel


class InsAds(BaseModel):
    """Ins广告的数据模型。"""

    account: str
    brand: str
    product: str
    category: str
    discount: str
    price: str
    comments: str
    review: str
    description: str

from pydantic import BaseModel


class InsAds(BaseModel):
    """Ins广告的数据模型。"""

    account: str
    brand: str
    product: str
    category: str
    discount: str
    price: str
    comments: str
    review: str
    description: str

In [ ]:

Copied!





from PIL import Image
import matplotlib.pyplot as plt

ins_imageUrl = "./instagram_images/jordan.png"
image = Image.open(ins_imageUrl).convert("RGB")

plt.figure(figsize=(16, 5))
plt.imshow(image)
from PIL import Image
import matplotlib.pyplot as plt

ins_imageUrl = "./instagram_images/jordan.png"
image = Image.open(ins_imageUrl).convert("RGB")

plt.figure(figsize=(16, 5))
plt.imshow(image)

Out[ ]:

<matplotlib.image.AxesImage at 0x7f1b2e0fe710>

使用多模态Pydantic程序使用Llava-13b生成结构化输出¶

In [ ]:

Copied!





from llama_index.multi_modal_llms.replicate import ReplicateMultiModal
from llama_index.core.program import MultiModalLLMCompletionProgram
from llama_index.core.output_parsers import PydanticOutputParser
from llama_index.multi_modal_llms.replicate.base import (
    REPLICATE_MULTI_MODAL_LLM_MODELS,
)

prompt_template_str = """\
    你能总结一下图片中的内容，并以json格式返回答案\
"""


def pydantic_llava(
    model_name, output_class, image_documents, prompt_template_str
):
    mm_llm = ReplicateMultiModal(
        model=REPLICATE_MULTI_MODAL_LLM_MODELS["llava-13b"],
        max_new_tokens=1000,
    )

    llm_program = MultiModalLLMCompletionProgram.from_defaults(
        output_parser=PydanticOutputParser(output_class),
        image_documents=image_documents,
        prompt_template_str=prompt_template_str,
        multi_modal_llm=mm_llm,
        verbose=True,
    )

    response = llm_program()
    print(f"模型：{model_name}")
    for res in response:
        print(res)
    return response
from llama_index.multi_modal_llms.replicate import ReplicateMultiModal
from llama_index.core.program import MultiModalLLMCompletionProgram
from llama_index.core.output_parsers import PydanticOutputParser
from llama_index.multi_modal_llms.replicate.base import (
    REPLICATE_MULTI_MODAL_LLM_MODELS,
)

prompt_template_str = """\
    你能总结一下图片中的内容，并以json格式返回答案\
"""


def pydantic_llava(
    model_name, output_class, image_documents, prompt_template_str
):
    mm_llm = ReplicateMultiModal(
        model=REPLICATE_MULTI_MODAL_LLM_MODELS["llava-13b"],
        max_new_tokens=1000,
    )

    llm_program = MultiModalLLMCompletionProgram.from_defaults(
        output_parser=PydanticOutputParser(output_class),
        image_documents=image_documents,
        prompt_template_str=prompt_template_str,
        multi_modal_llm=mm_llm,
        verbose=True,
    )

    response = llm_program()
    print(f"模型：{model_name}")
    for res in response:
        print(res)
    return response

输出结构化的Pydantic输出¶

In [ ]:

Copied!

from llama_index.core import SimpleDirectoryReader

ins_image_documents = SimpleDirectoryReader("./instagram_images").load_data()

pydantic_response = pydantic_llava(
    "llava-13b", InsAds, ins_image_documents, prompt_template_str
)
from llama_index.core import SimpleDirectoryReader

ins_image_documents = SimpleDirectoryReader("./instagram_images").load_data()

pydantic_response = pydantic_llava(
    "llava-13b", InsAds, ins_image_documents, prompt_template_str
)

Model: llava-13b
('account', 'airjordan')
('brand', 'Air Jordan')
('product', 'Air Jordan 2')
('category', 'Sneakers')
('discount', '15%')
('price', '$175')
('comments', '')
('review', '')
('description', 'Italy.')

In [ ]:

Copied!

print(pydantic_response.brand)
print(pydantic_response.brand)

Air Jordan

使用GPT4V和多模态索引/检索器/查询引擎进行高级多模态检索¶

该项目旨在利用GPT4V和多模态索引/检索器/查询引擎进行高级多模态检索。

从原始文件[Wikipedia]下载文本、图像数据，用于多模态索引/检索¶

In [ ]:

Copied!





from pathlib import Path
import requests

wiki_titles = [
    "batman",
    "Vincent van Gogh",
    "San Francisco",
    "iPhone",
    "Tesla Model S",
    "BTS",
    "Air Jordan",
]


data_path = Path("data_wiki")


for title in wiki_titles:
    response = requests.get(
        "https://en.wikipedia.org/w/api.php",
        params={
            "action": "query",
            "format": "json",
            "titles": title,
            "prop": "extracts",
            "explaintext": True,
        },
    ).json()
    page = next(iter(response["query"]["pages"].values()))
    wiki_text = page["extract"]

    if not data_path.exists():
        Path.mkdir(data_path)

    with open(data_path / f"{title}.txt", "w") as fp:
        fp.write(wiki_text)
from pathlib import Path
import requests

wiki_titles = [
    "batman",
    "Vincent van Gogh",
    "San Francisco",
    "iPhone",
    "Tesla Model S",
    "BTS",
    "Air Jordan",
]


data_path = Path("data_wiki")


for title in wiki_titles:
    response = requests.get(
        "https://en.wikipedia.org/w/api.php",
        params={
            "action": "query",
            "format": "json",
            "titles": title,
            "prop": "extracts",
            "explaintext": True,
        },
    ).json()
    page = next(iter(response["query"]["pages"].values()))
    wiki_text = page["extract"]

    if not data_path.exists():
        Path.mkdir(data_path)

    with open(data_path / f"{title}.txt", "w") as fp:
        fp.write(wiki_text)

In [ ]:

Copied!





import wikipedia
import urllib.request

image_path = Path("data_wiki")
image_uuid = 0
# image_metadata_dict 存储包括图像uuid、文件名和路径在内的图像元数据
image_metadata_dict = {}
MAX_IMAGES_PER_WIKI = 30

wiki_titles = [
    "Air Jordan",
    "San Francisco",
    "Batman",
    "Vincent van Gogh",
    "iPhone",
    "Tesla Model S",
    "BTS band",
]

# 仅为图像创建文件夹
if not image_path.exists():
    Path.mkdir(image_path)

# 为维基页面下载图像
# 为每个图像分配UUID
for title in wiki_titles:
    images_per_wiki = 0
    print(title)
    try:
        page_py = wikipedia.page(title)
        list_img_urls = page_py.images
        for url in list_img_urls:
            if url.endswith(".jpg") or url.endswith(".png"):
                image_uuid += 1
                image_file_name = title + "_" + url.split("/")[-1]

                # img_path 可能是指向未来原始图像文件的s3路径
                image_metadata_dict[image_uuid] = {
                    "filename": image_file_name,
                    "img_path": "./" + str(image_path / f"{image_uuid}.jpg"),
                }
                urllib.request.urlretrieve(
                    url, image_path / f"{image_uuid}.jpg"
                )
                images_per_wiki += 1
                # 限制每个维基页面下载的图像数量为15
                if images_per_wiki > MAX_IMAGES_PER_WIKI:
                    break
    except:
        print(str(Exception("未在维基百科页面找到图像：")) + title)
        continue
import wikipedia
import urllib.request

image_path = Path("data_wiki")
image_uuid = 0
# image_metadata_dict 存储包括图像uuid、文件名和路径在内的图像元数据
image_metadata_dict = {}
MAX_IMAGES_PER_WIKI = 30

wiki_titles = [
    "Air Jordan",
    "San Francisco",
    "Batman",
    "Vincent van Gogh",
    "iPhone",
    "Tesla Model S",
    "BTS band",
]

# 仅为图像创建文件夹
if not image_path.exists():
    Path.mkdir(image_path)

# 为维基页面下载图像
# 为每个图像分配UUID
for title in wiki_titles:
    images_per_wiki = 0
    print(title)
    try:
        page_py = wikipedia.page(title)
        list_img_urls = page_py.images
        for url in list_img_urls:
            if url.endswith(".jpg") or url.endswith(".png"):
                image_uuid += 1
                image_file_name = title + "_" + url.split("/")[-1]

                # img_path 可能是指向未来原始图像文件的s3路径
                image_metadata_dict[image_uuid] = {
                    "filename": image_file_name,
                    "img_path": "./" + str(image_path / f"{image_uuid}.jpg"),
                }
                urllib.request.urlretrieve(
                    url, image_path / f"{image_uuid}.jpg"
                )
                images_per_wiki += 1
                # 限制每个维基页面下载的图像数量为15
                if images_per_wiki > MAX_IMAGES_PER_WIKI:
                    break
    except:
        print(str(Exception("未在维基百科页面找到图像：")) + title)
        continue

构建多模态索引和向量存储，以索引文本和图像¶

In [ ]:

Copied!





import qdrant_client
from llama_index.core import SimpleDirectoryReader
from llama_index.vector_stores.qdrant import QdrantVectorStore
from llama_index.core import VectorStoreIndex, StorageContext
from llama_index.core.indices import MultiModalVectorStoreIndex

# 创建一个本地的Qdrant向量存储
client = qdrant_client.QdrantClient(path="qdrant_mm_db")

text_store = QdrantVectorStore(
    client=client, collection_name="text_collection"
)
image_store = QdrantVectorStore(
    client=client, collection_name="image_collection"
)
storage_context = StorageContext.from_defaults(
    vector_store=text_store, image_store=image_store
)

# 创建多模态索引
documents = SimpleDirectoryReader("./data_wiki/").load_data()
index = MultiModalVectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context,
)
import qdrant_client
from llama_index.core import SimpleDirectoryReader
from llama_index.vector_stores.qdrant import QdrantVectorStore
from llama_index.core import VectorStoreIndex, StorageContext
from llama_index.core.indices import MultiModalVectorStoreIndex

# 创建一个本地的Qdrant向量存储
client = qdrant_client.QdrantClient(path="qdrant_mm_db")

text_store = QdrantVectorStore(
    client=client, collection_name="text_collection"
)
image_store = QdrantVectorStore(
    client=client, collection_name="image_collection"
)
storage_context = StorageContext.from_defaults(
    vector_store=text_store, image_store=image_store
)

# 创建多模态索引
documents = SimpleDirectoryReader("./data_wiki/").load_data()
index = MultiModalVectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context,
)

In [ ]:

Copied!





from PIL import Image
import matplotlib.pyplot as plt
import os


def plot_images(image_metadata_dict):
    original_images_urls = []
    images_shown = 0
    for image_id in image_metadata_dict:
        img_path = image_metadata_dict[image_id]["img_path"]
        if os.path.isfile(img_path):
            filename = image_metadata_dict[image_id]["filename"]
            image = Image.open(img_path).convert("RGB")

            plt.subplot(8, 8, len(original_images_urls) + 1)
            plt.imshow(image)
            plt.xticks([])
            plt.yticks([])

            original_images_urls.append(filename)
            images_shown += 1
            if images_shown >= 64:
                break

    plt.tight_layout()


plot_images(image_metadata_dict)
from PIL import Image
import matplotlib.pyplot as plt
import os


def plot_images(image_metadata_dict):
    original_images_urls = []
    images_shown = 0
    for image_id in image_metadata_dict:
        img_path = image_metadata_dict[image_id]["img_path"]
        if os.path.isfile(img_path):
            filename = image_metadata_dict[image_id]["filename"]
            image = Image.open(img_path).convert("RGB")

            plt.subplot(8, 8, len(original_images_urls) + 1)
            plt.imshow(image)
            plt.xticks([])
            plt.yticks([])

            original_images_urls.append(filename)
            images_shown += 1
            if images_shown >= 64:
                break

    plt.tight_layout()


plot_images(image_metadata_dict)

使用LlaVa pydantic结构化输出进行多模态RAG检索和查询¶

In [ ]:

Copied!





# 生成检索结果
# 创建一个检索器
retriever = index.as_retriever(similarity_top_k=3, image_similarity_top_k=5)
# 进行检索
retrieval_results = retriever.retrieve(pydantic_response.brand)
# 生成检索结果
# 创建一个检索器
retriever = index.as_retriever(similarity_top_k=3, image_similarity_top_k=5)
# 进行检索
retrieval_results = retriever.retrieve(pydantic_response.brand)

In [ ]:

Copied!





from llama_index.core.response.notebook_utils import (
    display_source_node,
    display_image_uris,
)
from llama_index.core.schema import ImageNode

retrieved_image = []
for res_node in retrieval_results:
    if isinstance(res_node.node, ImageNode):
        retrieved_image.append(res_node.node.metadata["file_path"])
    else:
        display_source_node(res_node, source_length=200)


display_image_uris(retrieved_image)
from llama_index.core.response.notebook_utils import (
    display_source_node,
    display_image_uris,
)
from llama_index.core.schema import ImageNode

retrieved_image = []
for res_node in retrieval_results:
    if isinstance(res_node.node, ImageNode):
        retrieved_image.append(res_node.node.metadata["file_path"])
    else:
        display_source_node(res_node, source_length=200)


display_image_uris(retrieved_image)

Node ID: 88152a5f-b72d-4b97-915d-f4d27a233bc2
Similarity: 0.8671320192132904
Text: Air Jordan is a line of basketball shoes produced by Nike, Inc. Related apparel and accessories are marketed under Jordan Brand. The first Air Jordan shoe was produced for basketball player Michael...

Node ID: b11f9bd5-ad86-4a53-8622-85418e6bb70b
Similarity: 0.8659507837211329
Text: === Jordan "6 Rings" shoe === The Jordan 6 Rings (aka Jordan Six Rings, Or Montells) is a combination of the seven Air Jordan shoes that Michael Jordan wore during his 6 championship seasons. That ...

Node ID: 0d7d5283-d4d9-476b-a4cd-81d9949cc60b
Similarity: 0.8545030450835027
Text: === Sneaker collecting ===

The shoes have had a large impact on the rise of "sneakerhead" culture. In the 1980s, collecting sneakers became more common, as well as trading and reselling them. As n...

使用检索到的文本和图像合成RAG结果¶

In [ ]:

Copied!





from llama_index.core import PromptTemplate
from llama_index.core.query_engine import SimpleMultiModalQueryEngine

qa_tmpl_str = (
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, "
    "answer the query.\n"
    "Query: {query_str}\n"
    "Answer: "
)
qa_tmpl = PromptTemplate(qa_tmpl_str)

query_engine = index.as_query_engine(
    multi_modal_llm=llava_multi_modal_llm,
    text_qa_template=qa_tmpl,
    similarity_top_k=2,
    image_similarity_top_k=1,
)

query_str = "Tell me more about the " + pydantic_response.brand + " brand."
response = query_engine.query(query_str)
from llama_index.core import PromptTemplate
from llama_index.core.query_engine import SimpleMultiModalQueryEngine

qa_tmpl_str = (
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, "
    "answer the query.\n"
    "Query: {query_str}\n"
    "Answer: "
)
qa_tmpl = PromptTemplate(qa_tmpl_str)

query_engine = index.as_query_engine(
    multi_modal_llm=llava_multi_modal_llm,
    text_qa_template=qa_tmpl,
    similarity_top_k=2,
    image_similarity_top_k=1,
)

query_str = "Tell me more about the " + pydantic_response.brand + " brand."
response = query_engine.query(query_str)

In [ ]:

Copied!

print(response)
print(response)

The Air Jordan brand is a line of basketball shoes produced by Nike, Inc. It was created for Michael Jordan, a basketball player who played for the Chicago Bulls during the 1980s and 1990s. The first Air Jordan shoe was released in 1985, and it has since become one of the most iconic and successful shoe lines in history. The shoes are known for their distinctive design, high-quality materials, and innovative technology, which has helped to establish the Air Jordan brand as a leader in the athletic footwear industry. The brand has also expanded to include apparel, accessories, and other products, and has become a cultural phenomenon, with a significant impact on fashion, music, and popular culture.