Model caches

This notebook covers how to cache the results of individual LLM calls using different caches.

First, let's install some dependencies:

%pip install -qU langchain-openai langchain-community

import os
from getpass import getpass

if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass()

from langchain.globals import set_llm_cache
from langchain_openai import OpenAI

# To make the caching really obvious, let's use a slower and older model.
# Caching supports newer chat models as well.
llm = OpenAI(model="gpt-3.5-turbo-instruct", n=2, best_of=2)

API Reference:set_llm_cache | OpenAI

`In Memory` cache

from langchain_community.cache import InMemoryCache

set_llm_cache(InMemoryCache())

API Reference:InMemoryCache
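As a mental model for exact-match caching, the sketch below keeps responses in a dictionary keyed by the prompt and a string identifying the model configuration. It is a minimal illustration, not LangChain's implementation: TinyExactCache, cached_call, and fake_llm are hypothetical names, and the lookup/update method names only mirror the general shape of a cache interface.

```python
from typing import Callable, Dict, List, Optional, Tuple


class TinyExactCache:
    """Illustrative exact-match cache keyed by (prompt, llm_string)."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, str], str] = {}

    def lookup(self, prompt: str, llm_string: str) -> Optional[str]:
        # Returns None on a miss.
        return self._store.get((prompt, llm_string))

    def update(self, prompt: str, llm_string: str, return_val: str) -> None:
        self._store[(prompt, llm_string)] = return_val


def cached_call(
    cache: TinyExactCache,
    llm_string: str,
    prompt: str,
    generate: Callable[[str], str],
) -> str:
    """Return the cached response if present, otherwise generate and store it."""
    hit = cache.lookup(prompt, llm_string)
    if hit is not None:
        return hit
    result = generate(prompt)
    cache.update(prompt, llm_string, result)
    return result


calls: List[str] = []


def fake_llm(prompt: str) -> str:
    # Stand-in for a slow model call; records every real invocation.
    calls.append(prompt)
    return f"joke #{len(calls)}"


cache = TinyExactCache()
first = cached_call(cache, "model-a", "Tell me a joke", fake_llm)
second = cached_call(cache, "model-a", "Tell me a joke", fake_llm)  # served from cache
other = cached_call(cache, "model-b", "Tell me a joke", fake_llm)   # different model, so a miss
```

Because the model string is part of the key, the same prompt sent to a differently configured model does not reuse the first model's answer.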

%%time
# The first time, it is not yet in cache, so it should take longer
llm.invoke("Tell me a joke")

CPU times: user 7.57 ms, sys: 8.22 ms, total: 15.8 ms
Wall time: 649 ms

"\n\nWhy couldn't the bicycle stand up by itself? Because it was two-tired!"

%%time
# The second time it is, so it goes faster
llm.invoke("Tell me a joke")

CPU times: user 551 µs, sys: 221 µs, total: 772 µs
Wall time: 1.23 ms

"\n\nWhy couldn't the bicycle stand up by itself? Because it was two-tired!"

`SQLite` cache

!rm -f .langchain.db

# We can do the same thing with a SQLite cache
from langchain_community.cache import SQLiteCache

set_llm_cache(SQLiteCache(database_path=".langchain.db"))

API Reference:SQLiteCache
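To give a rough picture of what a database-backed cache involves, the sketch below stores responses in a table using only the standard-library sqlite3 module. The table layout and the helper names (open_cache, lookup, update) are assumptions made for illustration; they are not the schema or API that SQLiteCache uses.

```python
import sqlite3
from typing import Optional


def open_cache(path: str) -> sqlite3.Connection:
    """Open (or create) a cache database with one row per (prompt, llm) pair."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS llm_cache ("
        "prompt TEXT NOT NULL, llm TEXT NOT NULL, response TEXT NOT NULL, "
        "PRIMARY KEY (prompt, llm))"
    )
    return conn


def lookup(conn: sqlite3.Connection, prompt: str, llm: str) -> Optional[str]:
    row = conn.execute(
        "SELECT response FROM llm_cache WHERE prompt = ? AND llm = ?",
        (prompt, llm),
    ).fetchone()
    return row[0] if row else None


def update(conn: sqlite3.Connection, prompt: str, llm: str, response: str) -> None:
    conn.execute(
        "INSERT OR REPLACE INTO llm_cache (prompt, llm, response) VALUES (?, ?, ?)",
        (prompt, llm, response),
    )
    conn.commit()


# ":memory:" keeps the demo self-contained; a file path would persist across runs.
conn = open_cache(":memory:")
miss = lookup(conn, "Tell me a joke", "model-a")
update(conn, "Tell me a joke", "model-a", "Why did the chicken cross the road?")
hit = lookup(conn, "Tell me a joke", "model-a")
```

With a file path instead of ":memory:", entries survive a kernel restart, which is the main practical difference from an in-memory cache.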

%%time
# The first time, it is not yet in cache, so it should take longer
llm.invoke("Tell me a joke")

CPU times: user 12.6 ms, sys: 3.51 ms, total: 16.1 ms
Wall time: 486 ms

"\n\nWhy couldn't the bicycle stand up by itself? Because it was two-tired!"

%%time
# The second time it is, so it goes faster
llm.invoke("Tell me a joke")

CPU times: user 52.6 ms, sys: 57.7 ms, total: 110 ms
Wall time: 113 ms

"\n\nWhy couldn't the bicycle stand up by itself? Because it was two-tired!"

`Upstash Redis` cache

Standard cache

Use Upstash Redis to cache prompts and responses with a serverless HTTP API.

%pip install -qU upstash_redis

from langchain.globals import set_llm_cache
from langchain_community.cache import UpstashRedisCache
from upstash_redis import Redis

URL = "<UPSTASH_REDIS_REST_URL>"
TOKEN = "<UPSTASH_REDIS_REST_TOKEN>"

# Register the Upstash-backed cache globally.
set_llm_cache(UpstashRedisCache(redis_=Redis(url=URL, token=TOKEN)))

API Reference:UpstashRedisCache

%%time
# The first time, it is not yet in cache, so it should take longer
llm.invoke("Tell me a joke")

CPU times: user 7.56 ms, sys: 2.98 ms, total: 10.5 ms
Wall time: 1.14 s

'\n\nWhy did the chicken cross the road?\n\nTo get to the other side!'

%%time
# The second time it is, so it goes faster
llm.invoke("Tell me a joke")

CPU times: user 2.78 ms, sys: 1.95 ms, total: 4.73 ms
Wall time: 82.9 ms

'\n\nWhy did the chicken cross the road?\n\nTo get to the other side!'

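Semantic cache

Use Upstash Vector to run a semantic similarity search and return the most similar cached response from the database. Vectorization is handled automatically by the embedding model selected when the Upstash Vector database is created.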

%pip install upstash-semantic-cache

from langchain.globals import set_llm_cache
from upstash_semantic_cache import SemanticCache

API Reference:set_llm_cache

UPSTASH_VECTOR_REST_URL = "<UPSTASH_VECTOR_REST_URL>"
UPSTASH_VECTOR_REST_TOKEN = "<UPSTASH_VECTOR_REST_TOKEN>"

cache = SemanticCache(
    url=UPSTASH_VECTOR_REST_URL, token=UPSTASH_VECTOR_REST_TOKEN, min_proximity=0.7
)

set_llm_cache(cache)

%%time
llm.invoke("Which city is the most crowded city in the USA?")

CPU times: user 28.4 ms, sys: 3.93 ms, total: 32.3 ms
Wall time: 1.89 s

'\n\nNew York City is the most crowded city in the USA.'

%%time
llm.invoke("Which city has the highest population in the USA?")

CPU times: user 3.22 ms, sys: 940 μs, total: 4.16 ms
Wall time: 97.7 ms

'\n\nNew York City is the most crowded city in the USA.'
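The second prompt above is worded differently but still returns the cached answer, because a semantic cache compares embeddings against a proximity threshold rather than comparing strings. The sketch below illustrates that idea under strong simplifications: a word-count vector stands in for a real embedding model, and ToySemanticCache, embed, and cosine are hypothetical names unrelated to the upstash_semantic_cache API.

```python
import math
import re
from collections import Counter
from typing import List, Optional, Tuple


def embed(text: str) -> Counter:
    # Toy "embedding": lowercase word counts. Real caches use a learned model.
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToySemanticCache:
    """Returns the closest stored response if it is at least min_proximity similar."""

    def __init__(self, min_proximity: float) -> None:
        self.min_proximity = min_proximity
        self._entries: List[Tuple[Counter, str]] = []

    def update(self, prompt: str, response: str) -> None:
        self._entries.append((embed(prompt), response))

    def lookup(self, prompt: str) -> Optional[str]:
        query = embed(prompt)
        best_score, best_response = 0.0, None
        for vector, response in self._entries:
            score = cosine(query, vector)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.min_proximity else None


cache = ToySemanticCache(min_proximity=0.7)
cache.update("Which city is the most crowded city in the USA?", "New York City")

similar = cache.lookup("Which city has the highest population in the USA?")
unrelated = cache.lookup("What is the capital of France?")
```

Raising min_proximity makes hits rarer but safer; lowering it saves more calls at the risk of returning an answer to a different question.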

`Redis` cache

See the main Redis cache docs for details.

Standard cache

Use Redis to cache prompts and responses.

%pip install -qU redis

# We can do the same thing with a Redis cache
# (make sure your local Redis instance is running first before running this example)
from langchain_community.cache import RedisCache
from redis import Redis

set_llm_cache(RedisCache(redis_=Redis()))

API Reference:RedisCache

%%time
# The first time, it is not yet in cache, so it should take longer
llm.invoke("Tell me a joke")

CPU times: user 6.88 ms, sys: 8.75 ms, total: 15.6 ms
Wall time: 1.04 s

'\n\nWhy did the chicken cross the road?\n\nTo get to the other side!'

%%time
# The second time it is, so it goes faster
llm.invoke("Tell me a joke")

CPU times: user 1.59 ms, sys: 610 µs, total: 2.2 ms
Wall time: 5.58 ms

'\n\nWhy did the chicken cross the road?\n\nTo get to the other side!'

Semantic cache

Use Redis to cache prompts and responses and evaluate hits based on semantic similarity.

%pip install -qU redis

from langchain_community.cache import RedisSemanticCache
from langchain_openai import OpenAIEmbeddings

set_llm_cache(
    RedisSemanticCache(redis_url="redis://localhost:6379", embedding=OpenAIEmbeddings())
)

API Reference:RedisSemanticCache | OpenAIEmbeddings

%%time
# The first time, it is not yet in cache, so it should take longer
llm.invoke("Tell me a joke")

CPU times: user 351 ms, sys: 156 ms, total: 507 ms
Wall time: 3.37 s

"\n\nWhy don't scientists trust atoms?\nBecause they make up everything."

%%time
# The second time, while not a direct hit, the question is semantically similar to the original question,
# so it uses the cached result!
llm.invoke("Tell me one joke")

CPU times: user 6.25 ms, sys: 2.72 ms, total: 8.97 ms
Wall time: 262 ms

"\n\nWhy don't scientists trust atoms?\nBecause they make up everything."

`GPTCache`

We can use GPTCache for exact-match caching or to cache results based on semantic similarity.

Let's first start with an example of exact match:

%pip install -qU gptcache

import hashlib

from gptcache import Cache
from gptcache.manager.factory import manager_factory
from gptcache.processor.pre import get_prompt
from langchain_community.cache import GPTCache


def get_hashed_name(name):
    return hashlib.sha256(name.encode()).hexdigest()


def init_gptcache(cache_obj: Cache, llm: str):
    hashed_llm = get_hashed_name(llm)
    cache_obj.init(
        pre_embedding_func=get_prompt,
        data_manager=manager_factory(manager="map", data_dir=f"map_cache_{hashed_llm}"),
    )


set_llm_cache(GPTCache(init_gptcache))

API Reference:GPTCache

%%time
# The first time, it is not yet in cache, so it should take longer
llm.invoke("Tell me a joke")

CPU times: user 21.5 ms, sys: 21.3 ms, total: 42.8 ms
Wall time: 6.2 s

'\n\nWhy did the chicken cross the road?\n\nTo get to the other side!'

%%time
# The second time it is, so it goes faster
llm.invoke("Tell me a joke")

CPU times: user 571 µs, sys: 43 µs, total: 614 µs
Wall time: 635 µs

'\n\nWhy did the chicken cross the road?\n\nTo get to the other side!'
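In the cell above, init_gptcache hashes the llm string it receives and uses the digest in the data directory name, so each model configuration gets its own cache directory. The sketch below isolates that naming step; the two llm strings are made up for illustration and do not reflect the real string LangChain passes in.

```python
import hashlib


def get_hashed_name(name: str) -> str:
    # SHA-256 gives a fixed-length, filesystem-safe identifier for any input string.
    return hashlib.sha256(name.encode()).hexdigest()


# Hypothetical llm strings standing in for two different model configurations.
llm_a = "model=gpt-3.5-turbo-instruct, n=2"
llm_b = "model=gpt-3.5-turbo-instruct, n=1"

dir_a = f"map_cache_{get_hashed_name(llm_a)}"
dir_a_again = f"map_cache_{get_hashed_name(llm_a)}"
dir_b = f"map_cache_{get_hashed_name(llm_b)}"
```

The same configuration always maps to the same directory, while any change in the configuration string yields a different one.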

Let's now show an example of similarity caching:

import hashlib

from gptcache import Cache
from gptcache.adapter.api import init_similar_cache
from langchain_community.cache import GPTCache

def get_hashed_name(name):
    return hashlib.sha256(name.encode()).hexdigest()

def init_gptcache(cache_obj: Cache, llm: str):
    hashed_llm = get_hashed_name(llm)
    init_similar_cache(cache_obj=cache_obj, data_dir=f"similar_cache_{hashed_llm}")

set_llm_cache(GPTCache(init_gptcache))

API Reference:GPTCache

%%time
# The first time, it is not yet in cache, so it should take longer
llm.invoke("Tell me a joke")

CPU times: user 1.42 s, sys: 279 ms, total: 1.7 s
Wall time: 8.44 s

'\n\nWhy did the chicken cross the road?\n\nTo get to the other side.'

%%time
# This is an exact match, so it finds it in the cache
llm.invoke("Tell me a joke")

CPU times: user 866 ms, sys: 20 ms, total: 886 ms
Wall time: 226 ms

'\n\nWhy did the chicken cross the road?\n\nTo get to the other side.'

%%time
# This is not an exact match, but semantically within distance so it hits!
llm.invoke("Tell me joke")

CPU times: user 853 ms, sys: 14.8 ms, total: 868 ms
Wall time: 224 ms

'\n\nWhy did the chicken cross the road?\n\nTo get to the other side.'

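`MongoDB Atlas` cache

MongoDB Atlas is a fully managed cloud database available in AWS, Azure, and GCP. It has native support for vector search on MongoDB document data. Use MongoDB Atlas Vector Search to semantically cache prompts and responses.

Standard cache

The standard cache is a simple cache in MongoDB. It does not use semantic caching, nor does it require an index to be created on the collection before generation.

To import this cache, first install the required dependency: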

%pip install -qU langchain-mongodb

from langchain_mongodb.cache import MongoDBCache

API Reference:MongoDBCache

To use this cache with your LLMs:

from langchain_core.globals import set_llm_cache

mongodb_atlas_uri = "<YOUR_CONNECTION_STRING>"
COLLECTION_NAME="<YOUR_CACHE_COLLECTION_NAME>"
DATABASE_NAME="<YOUR_DATABASE_NAME>"

set_llm_cache(MongoDBCache(
    connection_string=mongodb_atlas_uri,
    collection_name=COLLECTION_NAME,
    database_name=DATABASE_NAME,
))

API Reference:set_llm_cache

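Semantic cache

Semantic caching allows retrieval of cached prompts based on the semantic similarity between the user input and previously cached results. Under the hood, it uses MongoDB Atlas as both a cache and a vector store. MongoDBAtlasSemanticCache inherits from MongoDBAtlasVectorSearch and needs an Atlas Vector Search index defined in order to work. See the usage example for how to set up the index.

To import this cache: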

from langchain_mongodb.cache import MongoDBAtlasSemanticCache

API Reference:MongoDBAtlasSemanticCache

To use this cache with your LLMs:

from langchain_core.globals import set_llm_cache

# use any embedding provider; OpenAIEmbeddings is used here as an example
from langchain_openai import OpenAIEmbeddings

mongodb_atlas_uri = "<YOUR_CONNECTION_STRING>"
COLLECTION_NAME="<YOUR_CACHE_COLLECTION_NAME>"
DATABASE_NAME="<YOUR_DATABASE_NAME>"

set_llm_cache(MongoDBAtlasSemanticCache(
    embedding=OpenAIEmbeddings(),
    connection_string=mongodb_atlas_uri,
    collection_name=COLLECTION_NAME,
    database_name=DATABASE_NAME,
))

API Reference:set_llm_cache

To find more resources about using MongoDBAtlasSemanticCache, visit here

`Momento` cache

Use Momento to cache prompts and responses.

Requires the momento package:

%pip install -qU momento

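You'll need a Momento auth token to use this class. You can pass it to a momento.CacheClient if you'd like to instantiate that directly, pass it as the named parameter auth_token to MomentoCache.from_client_params, or set it as the environment variable MOMENTO_AUTH_TOKEN.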

from datetime import timedelta

from langchain_community.cache import MomentoCache

cache_name = "langchain"
ttl = timedelta(days=1)
set_llm_cache(MomentoCache.from_client_params(cache_name, ttl))

API Reference:MomentoCache

%%time
# The first time, it is not yet in cache, so it should take longer
llm.invoke("Tell me a joke")

CPU times: user 40.7 ms, sys: 16.5 ms, total: 57.2 ms
Wall time: 1.73 s

'\n\nWhy did the chicken cross the road?\n\nTo get to the other side!'

%%time
# The second time it is, so it goes faster
# When run in the same region as the cache, latencies are single digit ms
llm.invoke("Tell me a joke")

CPU times: user 3.16 ms, sys: 2.98 ms, total: 6.14 ms
Wall time: 57.9 ms

'\n\nWhy did the chicken cross the road?\n\nTo get to the other side!'
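The ttl passed to MomentoCache.from_client_params above bounds how long an entry stays valid. The sketch below shows the general idea of time-to-live expiry with an injectable clock so the behaviour can be checked without waiting; TtlCache is a hypothetical in-process class and says nothing about how Momento implements expiry on the server side.

```python
from datetime import datetime, timedelta
from typing import Callable, Dict, Optional, Tuple


class TtlCache:
    """Illustrative cache whose entries expire after a fixed time-to-live."""

    def __init__(
        self, ttl: timedelta, now: Callable[[], datetime] = datetime.now
    ) -> None:
        self._ttl = ttl
        self._now = now
        self._store: Dict[str, Tuple[str, datetime]] = {}

    def update(self, key: str, value: str) -> None:
        # Remember the value together with its expiry time.
        self._store[key] = (value, self._now() + self._ttl)

    def lookup(self, key: str) -> Optional[str]:
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._now() >= expires_at:
            # Expired entries are dropped and treated as misses.
            del self._store[key]
            return None
        return value


# A mutable one-element list acts as a controllable clock for the demo.
clock = [datetime(2024, 1, 1)]
cache = TtlCache(ttl=timedelta(days=1), now=lambda: clock[0])

cache.update("Tell me a joke", "cached joke")
fresh = cache.lookup("Tell me a joke")

clock[0] += timedelta(days=1, seconds=1)
expired = cache.lookup("Tell me a joke")
```

A short TTL keeps answers current at the cost of more model calls; a long one does the opposite.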

`SQLAlchemy` cache

You can use SQLAlchemyCache to cache with any SQL database supported by SQLAlchemy.

Standard cache

from langchain_community.cache import SQLAlchemyCache
from sqlalchemy import create_engine

engine = create_engine("postgresql://postgres:postgres@localhost:5432/postgres")
set_llm_cache(SQLAlchemyCache(engine))

API Reference:SQLAlchemyCache

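Custom SQLAlchemy schemas

You can define your own declarative model class and pass it to SQLAlchemyCache to customize the schema used for caching. For example, to support high-speed fulltext prompt indexing with Postgres, use: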

from langchain_community.cache import SQLAlchemyCache
from sqlalchemy import Column, Computed, Index, Integer, Sequence, String, create_engine
from sqlalchemy.orm import declarative_base
from sqlalchemy_utils import TSVectorType

Base = declarative_base()


class FulltextLLMCache(Base):  # type: ignore
    """Postgres table for fulltext-indexed LLM Cache"""

    __tablename__ = "llm_cache_fulltext"
    id = Column(Integer, Sequence("cache_id"), primary_key=True)
    prompt = Column(String, nullable=False)
    llm = Column(String, nullable=False)
    idx = Column(Integer)
    response = Column(String)
    prompt_tsv = Column(
        TSVectorType(),
        Computed("to_tsvector('english', llm || ' ' || prompt)", persisted=True),
    )
    __table_args__ = (
        Index("idx_fulltext_prompt_tsv", prompt_tsv, postgresql_using="gin"),
    )


engine = create_engine("postgresql://postgres:postgres@localhost:5432/postgres")
set_llm_cache(SQLAlchemyCache(engine, FulltextLLMCache))

API Reference:SQLAlchemyCache

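`Cassandra` cache

Apache Cassandra® is a NoSQL, row-oriented, highly scalable and highly available database. Starting with version 5.0, the database ships with vector search capabilities.

You can use Cassandra to cache LLM responses, choosing between the exact-match CassandraCache and the vector-similarity-based CassandraSemanticCache.

Let's see both in action. The next cells guide you through the (little) required setup, and the following cells showcase the two available cache classes.

Required dependency: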

%pip install -qU "cassio>=0.1.4"

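Connecting to the DB

The Cassandra caches shown on this page can be used with Cassandra as well as other derived databases, such as Astra DB, which use the CQL (Cassandra Query Language) protocol.

DataStax Astra DB is a managed serverless database built on Cassandra, offering the same interface and strengths.

Depending on whether you connect to a Cassandra cluster or to Astra DB through CQL, you will provide different parameters when instantiating the cache (through initialization of a CassIO connection).

Connecting to a Cassandra cluster

You first need to create a cassandra.cluster.Session object, as described in the Cassandra driver documentation. The details vary (e.g. with network settings and authentication), but this might be something like: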

from cassandra.cluster import Cluster

cluster = Cluster(["127.0.0.1"])
session = cluster.connect()

You can now set the session, along with your desired keyspace name, as a global CassIO parameter:

import cassio

CASSANDRA_KEYSPACE = input("CASSANDRA_KEYSPACE = ")

cassio.init(session=session, keyspace=CASSANDRA_KEYSPACE)

CASSANDRA_KEYSPACE =  demo_keyspace

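Connecting to Astra DB through CQL

In this case you initialize CassIO with the following connection parameters:

the Database ID, e.g. 01234567-89ab-cdef-0123-456789abcdef
the Token, e.g. AstraCS:6gBhNmsk135.... (it must be a "Database Administrator" token)
Optionally a Keyspace name (if omitted, the default one for the database will be used)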

import getpass

ASTRA_DB_ID = input("ASTRA_DB_ID = ")
ASTRA_DB_APPLICATION_TOKEN = getpass.getpass("ASTRA_DB_APPLICATION_TOKEN = ")

desired_keyspace = input("ASTRA_DB_KEYSPACE (optional, can be left empty) = ")
if desired_keyspace:
    ASTRA_DB_KEYSPACE = desired_keyspace
else:
    ASTRA_DB_KEYSPACE = None

ASTRA_DB_ID =  01234567-89ab-cdef-0123-456789abcdef
ASTRA_DB_APPLICATION_TOKEN =  ········
ASTRA_DB_KEYSPACE (optional, can be left empty) =  my_keyspace

import cassio

cassio.init(
    database_id=ASTRA_DB_ID,
    token=ASTRA_DB_APPLICATION_TOKEN,
    keyspace=ASTRA_DB_KEYSPACE,
)

Standard cache

This will avoid invoking the LLM when the supplied prompt is exactly the same as one encountered already:

from langchain_community.cache import CassandraCache
from langchain_core.globals import set_llm_cache

set_llm_cache(CassandraCache())

API Reference:CassandraCache | set_llm_cache

%%time

print(llm.invoke("Why is the Moon always showing the same side?"))

The Moon is tidally locked with the Earth, which means that its rotation on its own axis is synchronized with its orbit around the Earth. This results in the Moon always showing the same side to the Earth. This is because the gravitational forces between the Earth and the Moon have caused the Moon's rotation to slow down over time, until it reached a point where it takes the same amount of time for the Moon to rotate on its axis as it does to orbit around the Earth. This phenomenon is common among satellites in close orbits around their parent planets and is known as tidal locking.
CPU times: user 92.5 ms, sys: 8.89 ms, total: 101 ms
Wall time: 1.98 s

%%time

print(llm.invoke("Why is the Moon always showing the same side?"))

The Moon is tidally locked with the Earth, which means that its rotation on its own axis is synchronized with its orbit around the Earth. This results in the Moon always showing the same side to the Earth. This is because the gravitational forces between the Earth and the Moon have caused the Moon's rotation to slow down over time, until it reached a point where it takes the same amount of time for the Moon to rotate on its axis as it does to orbit around the Earth. This phenomenon is common among satellites in close orbits around their parent planets and is known as tidal locking.
CPU times: user 5.51 ms, sys: 0 ns, total: 5.51 ms
Wall time: 5.78 ms

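Semantic cache

This cache will do a semantic similarity search and return a hit if it finds a cached entry that is similar enough. For this, you need to provide an Embeddings instance of your choice.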

from langchain_openai import OpenAIEmbeddings

embedding = OpenAIEmbeddings()

API Reference:OpenAIEmbeddings

from langchain_community.cache import CassandraSemanticCache
from langchain_core.globals import set_llm_cache

set_llm_cache(
    CassandraSemanticCache(
        embedding=embedding,
        table_name="my_semantic_cache",
    )
)

API Reference:CassandraSemanticCache | set_llm_cache

%%time

print(llm.invoke("Why is the Moon always showing the same side?"))

The Moon is always showing the same side because of a phenomenon called synchronous rotation. This means that the Moon rotates on its axis at the same rate that it orbits around the Earth, which takes approximately 27.3 days. This results in the same side of the Moon always facing the Earth. This is due to the gravitational forces between the Earth and the Moon, which have caused the Moon's rotation to gradually slow down and become synchronized with its orbit. This is a common occurrence among many moons in our solar system.
CPU times: user 49.5 ms, sys: 7.38 ms, total: 56.9 ms
Wall time: 2.55 s

%%time

print(llm.invoke("How come we always see one face of the moon?"))

The Moon is always showing the same side because of a phenomenon called synchronous rotation. This means that the Moon rotates on its axis at the same rate that it orbits around the Earth, which takes approximately 27.3 days. This results in the same side of the Moon always facing the Earth. This is due to the gravitational forces between the Earth and the Moon, which have caused the Moon's rotation to gradually slow down and become synchronized with its orbit. This is a common occurrence among many moons in our solar system.
CPU times: user 21.2 ms, sys: 3.38 ms, total: 24.6 ms
Wall time: 532 ms

Attribution statement:

Apache Cassandra, Cassandra and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.

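`Astra DB` cache

You can easily use Astra DB as an LLM cache, with either the "exact" or the "semantic-based" cache.

Make sure you have a running database (it must be a Vector-enabled database to use the semantic cache) and get the required credentials on your Astra dashboard:

the API Endpoint looks like https://01234567-89ab-cdef-0123-456789abcdef-us-east1.apps.astra.datastax.com
the Token looks like AstraCS:6gBhNmsk135....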

%pip install -qU langchain_astradb

import getpass

ASTRA_DB_API_ENDPOINT = input("ASTRA_DB_API_ENDPOINT = ")
ASTRA_DB_APPLICATION_TOKEN = getpass.getpass("ASTRA_DB_APPLICATION_TOKEN = ")

ASTRA_DB_API_ENDPOINT =  https://01234567-89ab-cdef-0123-456789abcdef-us-east1.apps.astra.datastax.com
ASTRA_DB_APPLICATION_TOKEN =  ········

Standard cache

This will avoid invoking the LLM when the supplied prompt is exactly the same as one encountered already:

from langchain.globals import set_llm_cache
from langchain_astradb import AstraDBCache

set_llm_cache(
    AstraDBCache(
        api_endpoint=ASTRA_DB_API_ENDPOINT,
        token=ASTRA_DB_APPLICATION_TOKEN,
    )
)

API Reference:set_llm_cache | AstraDBCache

%%time

print(llm.invoke("Is a true fakery the same as a fake truth?"))

There is no definitive answer to this question as it depends on the interpretation of the terms "true fakery" and "fake truth". However, one possible interpretation is that a true fakery is a counterfeit or imitation that is intended to deceive, whereas a fake truth is a false statement that is presented as if it were true.
CPU times: user 70.8 ms, sys: 4.13 ms, total: 74.9 ms
Wall time: 2.06 s

%%time

print(llm.invoke("Is a true fakery the same as a fake truth?"))

There is no definitive answer to this question as it depends on the interpretation of the terms "true fakery" and "fake truth". However, one possible interpretation is that a true fakery is a counterfeit or imitation that is intended to deceive, whereas a fake truth is a false statement that is presented as if it were true.
CPU times: user 15.1 ms, sys: 3.7 ms, total: 18.8 ms
Wall time: 531 ms

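Semantic cache

This cache will do a semantic similarity search and return a hit if it finds a cached entry that is similar enough. For this, you need to provide an Embeddings instance of your choice.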

from langchain_openai import OpenAIEmbeddings

embedding = OpenAIEmbeddings()

API Reference:OpenAIEmbeddings

from langchain_astradb import AstraDBSemanticCache

set_llm_cache(
    AstraDBSemanticCache(
        api_endpoint=ASTRA_DB_API_ENDPOINT,
        token=ASTRA_DB_APPLICATION_TOKEN,
        embedding=embedding,
        collection_name="demo_semantic_cache",
    )
)

API Reference:AstraDBSemanticCache

%%time

print(llm.invoke("Are there truths that are false?"))

There is no definitive answer to this question since it presupposes a great deal about the nature of truth itself, which is a matter of considerable philosophical debate. It is possible, however, to construct scenarios in which something could be considered true despite being false, such as if someone sincerely believes something to be true even though it is not.
CPU times: user 65.6 ms, sys: 15.3 ms, total: 80.9 ms
Wall time: 2.72 s

%%time

print(llm.invoke("Is it possible that something false can be also true?"))

There is no definitive answer to this question since it presupposes a great deal about the nature of truth itself, which is a matter of considerable philosophical debate. It is possible, however, to construct scenarios in which something could be considered true despite being false, such as if someone sincerely believes something to be true even though it is not.
CPU times: user 29.3 ms, sys: 6.21 ms, total: 35.5 ms
Wall time: 1.03 s

`Azure Cosmos DB` semantic cache

You can use this integrated vector database for caching.

from langchain_community.cache import AzureCosmosDBSemanticCache
from langchain_community.vectorstores.azure_cosmos_db import (
    CosmosDBSimilarityType,
    CosmosDBVectorSearchType,
)
from langchain_openai import OpenAIEmbeddings

# Read more about Azure CosmosDB Mongo vCore vector search here https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/vcore/vector-search

NAMESPACE = "langchain_test_db.langchain_test_collection"
CONNECTION_STRING = (
    "Please provide your azure cosmos mongo vCore vector db connection string"
)

DB_NAME, COLLECTION_NAME = NAMESPACE.split(".")

# Default value for these params
num_lists = 3
dimensions = 1536
similarity_algorithm = CosmosDBSimilarityType.COS
kind = CosmosDBVectorSearchType.VECTOR_IVF
m = 16
ef_construction = 64
ef_search = 40
score_threshold = 0.9
application_name = "LANGCHAIN_CACHING_PYTHON"


set_llm_cache(
    AzureCosmosDBSemanticCache(
        cosmosdb_connection_string=CONNECTION_STRING,
        cosmosdb_client=None,
        embedding=OpenAIEmbeddings(),
        database_name=DB_NAME,
        collection_name=COLLECTION_NAME,
        num_lists=num_lists,
        similarity=similarity_algorithm,
        kind=kind,
        dimensions=dimensions,
        m=m,
        ef_construction=ef_construction,
        ef_search=ef_search,
        score_threshold=score_threshold,
        application_name=application_name,
    )
)

API Reference:AzureCosmosDBSemanticCache | CosmosDBSimilarityType | CosmosDBVectorSearchType | OpenAIEmbeddings

%%time
# The first time, it is not yet in cache, so it should take longer
llm.invoke("Tell me a joke")

CPU times: user 45.6 ms, sys: 19.7 ms, total: 65.3 ms
Wall time: 2.29 s

'\n\nWhy was the math book sad? Because it had too many problems.'

%%time
# The second time it is, so it goes faster
llm.invoke("Tell me a joke")

CPU times: user 9.61 ms, sys: 3.42 ms, total: 13 ms
Wall time: 474 ms

'\n\nWhy was the math book sad? Because it had too many problems.'

`Azure Cosmos DB NoSql` semantic cache

You can use this integrated vector database for caching.

from typing import Any, Dict

from azure.cosmos import CosmosClient, PartitionKey
from langchain_community.cache import AzureCosmosDBNoSqlSemanticCache
from langchain_openai import OpenAIEmbeddings

HOST = "COSMOS_DB_URI"
KEY = "COSMOS_DB_KEY"

cosmos_client = CosmosClient(HOST, KEY)


def get_vector_indexing_policy() -> dict:
    return {
        "indexingMode": "consistent",
        "includedPaths": [{"path": "/*"}],
        "excludedPaths": [{"path": '/"_etag"/?'}],
        "vectorIndexes": [{"path": "/embedding", "type": "diskANN"}],
    }


def get_vector_embedding_policy() -> dict:
    return {
        "vectorEmbeddings": [
            {
                "path": "/embedding",
                "dataType": "float32",
                "dimensions": 1536,
                "distanceFunction": "cosine",
            }
        ]
    }


cosmos_container_properties_test = {"partition_key": PartitionKey(path="/id")}
cosmos_database_properties_test: Dict[str, Any] = {}

set_llm_cache(
    AzureCosmosDBNoSqlSemanticCache(
        cosmos_client=cosmos_client,
        embedding=OpenAIEmbeddings(),
        vector_embedding_policy=get_vector_embedding_policy(),
        indexing_policy=get_vector_indexing_policy(),
        cosmos_container_properties=cosmos_container_properties_test,
        cosmos_database_properties=cosmos_database_properties_test,
    )
)

API Reference:AzureCosmosDBNoSqlSemanticCache | OpenAIEmbeddings
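In the configuration above, the indexing policy and the vector embedding policy both refer to the path /embedding, and dimensions is set to 1536. The sketch below is a purely local sanity check for that kind of agreement, written on the assumption that a vector index should point at a path declared in the embedding policy and that dimensions should equal the length of the vectors the embedding model returns. check_vector_policies is a hypothetical helper and is not part of LangChain or the Azure SDK.

```python
from typing import Any, Dict, List


def check_vector_policies(
    indexing_policy: Dict[str, Any],
    embedding_policy: Dict[str, Any],
    embedding_length: int,
) -> List[str]:
    """Return a list of human-readable problems; an empty list means consistent."""
    problems: List[str] = []
    declared = {
        entry["path"]: entry
        for entry in embedding_policy.get("vectorEmbeddings", [])
    }
    # Every vector index should target a path that has an embedding definition.
    for index in indexing_policy.get("vectorIndexes", []):
        if index["path"] not in declared:
            problems.append(f"index path {index['path']} has no embedding definition")
    # The declared dimension should match what the embedding model produces.
    for path, spec in declared.items():
        if spec["dimensions"] != embedding_length:
            problems.append(
                f"{path}: policy declares {spec['dimensions']} dimensions, "
                f"embeddings have {embedding_length}"
            )
    return problems


indexing_policy = {"vectorIndexes": [{"path": "/embedding", "type": "diskANN"}]}
embedding_policy = {
    "vectorEmbeddings": [
        {
            "path": "/embedding",
            "dataType": "float32",
            "dimensions": 1536,
            "distanceFunction": "cosine",
        }
    ]
}

ok = check_vector_policies(indexing_policy, embedding_policy, embedding_length=1536)
bad = check_vector_policies(indexing_policy, embedding_policy, embedding_length=768)
```

If you switch to an embedding model with a different output size, the dimensions value is the setting most likely to need a matching change.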

%%time
# The first time, it is not yet in cache, so it should take longer
llm.invoke("Tell me a joke")

CPU times: user 374 ms, sys: 34.2 ms, total: 408 ms
Wall time: 3.15 s

"\n\nWhy couldn't the bicycle stand up by itself? Because it was two-tired!"

%%time
# The second time it is, so it goes faster
llm.invoke("Tell me a joke")

CPU times: user 17.7 ms, sys: 2.88 ms, total: 20.6 ms
Wall time: 373 ms

"\n\nWhy couldn't the bicycle stand up by itself? Because it was two-tired!"

`Elasticsearch` cache

A caching layer for LLMs that uses Elasticsearch.

First install the LangChain integration with Elasticsearch.

%pip install -qU langchain-elasticsearch

Standard cache

Use the class ElasticsearchCache.

Simple example:

from langchain.globals import set_llm_cache
from langchain_elasticsearch import ElasticsearchCache

set_llm_cache(
    ElasticsearchCache(
        es_url="http://localhost:9200",
        index_name="llm-chat-cache",
        metadata={"project": "my_chatgpt_project"},
    )
)

API Reference:set_llm_cache | ElasticsearchCache

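The index_name parameter can also accept aliases. This lets you use ILM (index lifecycle management), which we suggest considering for managing retention and controlling cache growth.

Look at the class docstring for all parameters.

Index the generated text

The cached data is not searchable by default. You can customize how the Elasticsearch document is built in order to add indexed text fields, for example to hold the text generated by the LLM.

This can be done by subclassing and overriding methods. The new cache class can also be applied to a pre-existing cache index: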

import json
from typing import Any, Dict, List

from langchain.globals import set_llm_cache
from langchain_core.caches import RETURN_VAL_TYPE
from langchain_elasticsearch import ElasticsearchCache


class SearchableElasticsearchCache(ElasticsearchCache):
    @property
    def mapping(self) -> Dict[str, Any]:
        mapping = super().mapping
        mapping["mappings"]["properties"]["parsed_llm_output"] = {
            "type": "text",
            "analyzer": "english",
        }
        return mapping

    def build_document(
        self, prompt: str, llm_string: str, return_val: RETURN_VAL_TYPE
    ) -> Dict[str, Any]:
        body = super().build_document(prompt, llm_string, return_val)
        body["parsed_llm_output"] = self._parse_output(body["llm_output"])
        return body

    @staticmethod
    def _parse_output(data: List[str]) -> List[str]:
        return [
            json.loads(output)["kwargs"]["message"]["kwargs"]["content"]
            for output in data
        ]


set_llm_cache(
    SearchableElasticsearchCache(
        es_url="http://localhost:9200", index_name="llm-chat-cache"
    )
)

API Reference:set_llm_cache | RETURN_VAL_TYPE | ElasticsearchCache

When overriding the mapping and the document building, please only make additive modifications, keeping the base mapping intact.
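To make the _parse_output step in SearchableElasticsearchCache more concrete, the sketch below runs the same nested lookup on a hand-built JSON string. The sample payload is hypothetical and contains only the keys that the method reads; real cached entries carry more fields, and the path shown assumes chat-style generations that wrap their text in a message.

```python
import json
from typing import List


def parse_output(data: List[str]) -> List[str]:
    # Same key path as _parse_output above: each item is a JSON string.
    return [
        json.loads(output)["kwargs"]["message"]["kwargs"]["content"]
        for output in data
    ]


# Hypothetical, minimal stand-in for one serialized generation.
sample = json.dumps(
    {
        "kwargs": {
            "message": {"kwargs": {"content": "Why did the chicken cross the road?"}}
        }
    }
)

parsed = parse_output([sample])
```

The extracted strings are what end up in the parsed_llm_output field, where the english analyzer makes them searchable.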

Embedding cache

An Elasticsearch store for caching embeddings.

from langchain_elasticsearch import ElasticsearchEmbeddingsCache

API Reference:ElasticsearchEmbeddingsCache

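Optional caching per LLM

You can also turn off caching for specific LLMs. In the example below, even though global caching is enabled, we turn it off for a specific LLM.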

llm = OpenAI(model="gpt-3.5-turbo-instruct", n=2, best_of=2, cache=False)

%%time
llm.invoke("Tell me a joke")

CPU times: user 5.8 ms, sys: 2.71 ms, total: 8.51 ms
Wall time: 745 ms

'\n\nWhy did the chicken cross the road?\n\nTo get to the other side!'

%%time
llm.invoke("Tell me a joke")

CPU times: user 4.91 ms, sys: 2.64 ms, total: 7.55 ms
Wall time: 623 ms

'\n\nTwo guys stole a calendar. They got six months each.'
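The two different jokes above are what you would expect when a model bypasses the cache: every call reaches the model again. The toy model below reproduces that behaviour with a module-level dictionary standing in for the global cache. FakeModel and GLOBAL_CACHE are hypothetical names, and the sketch only covers the on/off case shown here, not every value the cache parameter may accept.

```python
import itertools
from typing import Dict

# Stand-in for the globally registered cache.
GLOBAL_CACHE: Dict[str, str] = {}


class FakeModel:
    """Toy model whose answers change on every real call."""

    def __init__(self, cache: bool = True) -> None:
        self.cache = cache
        self._counter = itertools.count(1)

    def invoke(self, prompt: str) -> str:
        if self.cache and prompt in GLOBAL_CACHE:
            return GLOBAL_CACHE[prompt]
        result = f"joke #{next(self._counter)}"
        if self.cache:
            GLOBAL_CACHE[prompt] = result
        return result


cached_model = FakeModel()
uncached_model = FakeModel(cache=False)

cached_results = [cached_model.invoke("Tell me a joke") for _ in range(2)]
uncached_results = [uncached_model.invoke("Tell me a joke") for _ in range(2)]
```

The uncached model neither reads from nor writes to the shared cache, so other models that do use it are unaffected.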

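Optional caching in chains

You can also turn off caching for particular nodes in chains. Note that because of certain interfaces, it's often easier to construct the chain first, and then edit the LLM afterwards.

As an example, we will load a summarizer map-reduce chain. We will cache results for the map step, but then not freeze it for the combine step.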

llm = OpenAI(model="gpt-3.5-turbo-instruct")
no_cache_llm = OpenAI(model="gpt-3.5-turbo-instruct", cache=False)

from langchain_text_splitters import CharacterTextSplitter

text_splitter = CharacterTextSplitter()

API Reference:CharacterTextSplitter

with open("../how_to/state_of_the_union.txt") as f:
    state_of_the_union = f.read()
texts = text_splitter.split_text(state_of_the_union)

from langchain.chains.summarize import load_summarize_chain
from langchain_core.documents import Document

docs = [Document(page_content=t) for t in texts[:3]]

API Reference:Document | load_summarize_chain

chain = load_summarize_chain(llm, chain_type="map_reduce", reduce_llm=no_cache_llm)
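The effect of mixing a cached map step with an uncached combine step can be pictured with plain functions. This is a simplified sketch under stated assumptions: map_llm, reduce_llm, and summarize are hypothetical stand-ins for the two models and the chain, and the "summaries" are just the first sentence of each chunk.

```python
from typing import Dict, List, Tuple

map_cache: Dict[str, str] = {}
map_calls: List[str] = []
reduce_calls: List[Tuple[str, ...]] = []


def map_llm(chunk: str) -> str:
    """Cached step: each distinct chunk is summarized only once."""
    if chunk in map_cache:
        return map_cache[chunk]
    map_calls.append(chunk)
    summary = chunk.split(".")[0]
    map_cache[chunk] = summary
    return summary


def reduce_llm(summaries: List[str]) -> str:
    """Uncached step: runs on every invocation."""
    reduce_calls.append(tuple(summaries))
    return " | ".join(summaries)


def summarize(chunks: List[str]) -> str:
    return reduce_llm([map_llm(chunk) for chunk in chunks])


chunks = ["First point. Details.", "Second point. More details."]
run_1 = summarize(chunks)
run_2 = summarize(chunks)
```

On the second run the per-chunk work is skipped while the combine step still executes, which is why a repeated chain call gets faster without freezing the final answer.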

%%time
chain.invoke(docs)

CPU times: user 176 ms, sys: 23.2 ms, total: 199 ms
Wall time: 4.42 s

{'input_documents': [Document(page_content='Madam Speaker, Madam Vice President, our First Lady and Second Gentleman. Members of Congress and the Cabinet. Justices of the Supreme Court. My fellow Americans.  \n\nLast year COVID-19 kept us apart. This year we are finally together again. \n\nTonight, we meet as Democrats Republicans and Independents. But most importantly as Americans. \n\nWith a duty to one another to the American people to the Constitution. \n\nAnd with an unwavering resolve that freedom will always triumph over tyranny. \n\nSix days ago, Russia’s Vladimir Putin sought to shake the foundations of the free world thinking he could make it bend to his menacing ways. But he badly miscalculated. \n\nHe thought he could roll into Ukraine and the world would roll over. Instead he met a wall of strength he never imagined. \n\nHe met the Ukrainian people. \n\nFrom President Zelenskyy to every Ukrainian, their fearlessness, their courage, their determination, inspires the world. \n\nGroups of citizens blocking tanks with their bodies. Everyone from students to retirees teachers turned soldiers defending their homeland. \n\nIn this struggle as President Zelenskyy said in his speech to the European Parliament “Light will win over darkness.” The Ukrainian Ambassador to the United States is here tonight. \n\nLet each of us here tonight in this Chamber send an unmistakable signal to Ukraine and to the world. \n\nPlease rise if you are able and show that, Yes, we the United States of America stand with the Ukrainian people. \n\nThroughout our history we’ve learned this lesson when dictators do not pay a price for their aggression they cause more chaos.   \n\nThey keep moving.   \n\nAnd the costs and the threats to America and the world keep rising.   \n\nThat’s why the NATO Alliance was created to secure peace and stability in Europe after World War 2. \n\nThe United States is a member along with 29 other nations. \n\nIt matters. American diplomacy matters. 
American resolve matters. \n\nPutin’s latest attack on Ukraine was premeditated and unprovoked. \n\nHe rejected repeated efforts at diplomacy. \n\nHe thought the West and NATO wouldn’t respond. And he thought he could divide us at home. Putin was wrong. We were ready.  Here is what we did.   \n\nWe prepared extensively and carefully. \n\nWe spent months building a coalition of other freedom-loving nations from Europe and the Americas to Asia and Africa to confront Putin. \n\nI spent countless hours unifying our European allies. We shared with the world in advance what we knew Putin was planning and precisely how he would try to falsely justify his aggression.  \n\nWe countered Russia’s lies with truth.   \n\nAnd now that he has acted the free world is holding him accountable. \n\nAlong with twenty-seven members of the European Union including France, Germany, Italy, as well as countries like the United Kingdom, Canada, Japan, Korea, Australia, New Zealand, and many others, even Switzerland. \n\nWe are inflicting pain on Russia and supporting the people of Ukraine. Putin is now isolated from the world more than ever. \n\nTogether with our allies –we are right now enforcing powerful economic sanctions. \n\nWe are cutting off Russia’s largest banks from the international financial system.  \n\nPreventing Russia’s central bank from defending the Russian Ruble making Putin’s $630 Billion “war fund” worthless.   \n\nWe are choking off Russia’s access to technology that will sap its economic strength and weaken its military for years to come.  \n\nTonight I say to the Russian oligarchs and corrupt leaders who have bilked billions of dollars off this violent regime no more. \n\nThe U.S. Department of Justice is assembling a dedicated task force to go after the crimes of Russian oligarchs.  \n\nWe are joining with our European allies to find and seize your yachts your luxury apartments your private jets. We are coming for your ill-begotten gains.'),
  Document(page_content='We are joining with our European allies to find and seize your yachts your luxury apartments your private jets. We are coming for your ill-begotten gains. \n\nAnd tonight I am announcing that we will join our allies in closing off American air space to all Russian flights – further isolating Russia – and adding an additional squeeze –on their economy. The Ruble has lost 30% of its value. \n\nThe Russian stock market has lost 40% of its value and trading remains suspended. Russia’s economy is reeling and Putin alone is to blame. \n\nTogether with our allies we are providing support to the Ukrainians in their fight for freedom. Military assistance. Economic assistance. Humanitarian assistance. \n\nWe are giving more than $1 Billion in direct assistance to Ukraine. \n\nAnd we will continue to aid the Ukrainian people as they defend their country and to help ease their suffering.  \n\nLet me be clear, our forces are not engaged and will not engage in conflict with Russian forces in Ukraine.  \n\nOur forces are not going to Europe to fight in Ukraine, but to defend our NATO Allies – in the event that Putin decides to keep moving west.  \n\nFor that purpose we’ve mobilized American ground forces, air squadrons, and ship deployments to protect NATO countries including Poland, Romania, Latvia, Lithuania, and Estonia. \n\nAs I have made crystal clear the United States and our Allies will defend every inch of territory of NATO countries with the full force of our collective power.  \n\nAnd we remain clear-eyed. The Ukrainians are fighting back with pure courage. But the next few days weeks, months, will be hard on them.  \n\nPutin has unleashed violence and chaos.  But while he may make gains on the battlefield – he will pay a continuing high price over the long run. \n\nAnd a proud Ukrainian people, who have known 30 years  of independence, have repeatedly shown that they will not tolerate anyone who tries to take their country backwards.  
\n\nTo all Americans, I will be honest with you, as I’ve always promised. A Russian dictator, invading a foreign country, has costs around the world. \n\nAnd I’m taking robust action to make sure the pain of our sanctions  is targeted at Russia’s economy. And I will use every tool at our disposal to protect American businesses and consumers. \n\nTonight, I can announce that the United States has worked with 30 other countries to release 60 Million barrels of oil from reserves around the world.  \n\nAmerica will lead that effort, releasing 30 Million barrels from our own Strategic Petroleum Reserve. And we stand ready to do more if necessary, unified with our allies.  \n\nThese steps will help blunt gas prices here at home. And I know the news about what’s happening can seem alarming. \n\nBut I want you to know that we are going to be okay. \n\nWhen the history of this era is written Putin’s war on Ukraine will have left Russia weaker and the rest of the world stronger. \n\nWhile it shouldn’t have taken something so terrible for people around the world to see what’s at stake now everyone sees it clearly. \n\nWe see the unity among leaders of nations and a more unified Europe a more unified West. And we see unity among the people who are gathering in cities in large crowds around the world even in Russia to demonstrate their support for Ukraine.  \n\nIn the battle between democracy and autocracy, democracies are rising to the moment, and the world is clearly choosing the side of peace and security. \n\nThis is a real test. It’s going to take time. So let us continue to draw inspiration from the iron will of the Ukrainian people. \n\nTo our fellow Ukrainian Americans who forge a deep bond that connects our two nations we stand with you. \n\nPutin may circle Kyiv with tanks, but he will never gain the hearts and souls of the Ukrainian people. \n\nHe will never extinguish their love of freedom. He will never weaken the resolve of the free world. 
\n\nWe meet tonight in an America that has lived through two of the hardest years this nation has ever faced.'),
  Document(page_content='We meet tonight in an America that has lived through two of the hardest years this nation has ever faced. \n\nThe pandemic has been punishing. \n\nAnd so many families are living paycheck to paycheck, struggling to keep up with the rising cost of food, gas, housing, and so much more. \n\nI understand. \n\nI remember when my Dad had to leave our home in Scranton, Pennsylvania to find work. I grew up in a family where if the price of food went up, you felt it. \n\nThat’s why one of the first things I did as President was fight to pass the American Rescue Plan.  \n\nBecause people were hurting. We needed to act, and we did. \n\nFew pieces of legislation have done more in a critical moment in our history to lift us out of crisis. \n\nIt fueled our efforts to vaccinate the nation and combat COVID-19. It delivered immediate economic relief for tens of millions of Americans.  \n\nHelped put food on their table, keep a roof over their heads, and cut the cost of health insurance. \n\nAnd as my Dad used to say, it gave people a little breathing room. \n\nAnd unlike the $2 Trillion tax cut passed in the previous administration that benefitted the top 1% of Americans, the American Rescue Plan helped working people—and left no one behind. \n\nAnd it worked. It created jobs. Lots of jobs. \n\nIn fact—our economy created over 6.5 Million new jobs just last year, more jobs created in one year  \nthan ever before in the history of America. \n\nOur economy grew at a rate of 5.7% last year, the strongest growth in nearly 40 years, the first step in bringing fundamental change to an economy that hasn’t worked for the working people of this nation for too long.  \n\nFor the past 40 years we were told that if we gave tax breaks to those at the very top, the benefits would trickle down to everyone else. 
\n\nBut that trickle-down theory led to weaker economic growth, lower wages, bigger deficits, and the widest gap between those at the top and everyone else in nearly a century. \n\nVice President Harris and I ran for office with a new economic vision for America. \n\nInvest in America. Educate Americans. Grow the workforce. Build the economy from the bottom up  \nand the middle out, not from the top down.  \n\nBecause we know that when the middle class grows, the poor have a ladder up and the wealthy do very well. \n\nAmerica used to have the best roads, bridges, and airports on Earth. \n\nNow our infrastructure is ranked 13th in the world. \n\nWe won’t be able to compete for the jobs of the 21st Century if we don’t fix that. \n\nThat’s why it was so important to pass the Bipartisan Infrastructure Law—the most sweeping investment to rebuild America in history. \n\nThis was a bipartisan effort, and I want to thank the members of both parties who worked to make it happen. \n\nWe’re done talking about infrastructure weeks. \n\nWe’re going to have an infrastructure decade. \n\nIt is going to transform America and put us on a path to win the economic competition of the 21st Century that we face with the rest of the world—particularly with China.  \n\nAs I’ve told Xi Jinping, it is never a good bet to bet against the American people. \n\nWe’ll create good jobs for millions of Americans, modernizing roads, airports, ports, and waterways all across America. \n\nAnd we’ll do it all to withstand the devastating effects of the climate crisis and promote environmental justice. \n\nWe’ll build a national network of 500,000 electric vehicle charging stations, begin to replace poisonous lead pipes—so every child—and every American—has clean water to drink at home and at school, provide affordable high-speed internet for every American—urban, suburban, rural, and tribal communities. \n\n4,000 projects have already been announced. 
\n\nAnd tonight, I’m announcing that this year we will start fixing over 65,000 miles of highway and 1,500 bridges in disrepair. \n\nWhen we use taxpayer dollars to rebuild America – we are going to Buy American: buy American products to support American jobs.')],
 'output_text': " The speaker addresses the unity and strength of Americans and discusses the recent conflict with Russia and actions taken by the US and its allies. They announce closures of airspace, support for Ukraine, and measures to target corrupt Russian leaders. President Biden reflects on past hardships and highlights efforts to pass the American Rescue Plan. He criticizes the previous administration's policies and shares plans for the economy, including investing in America, education, rebuilding infrastructure, and supporting American jobs. "}

When we run it again, it runs substantially faster, but the final answer is different. This is because the map steps are cached, while the reduce step is not.

%%time
chain.invoke(docs)

CPU times: user 7 ms, sys: 1.94 ms, total: 8.94 ms
Wall time: 1.06 s

{'input_documents': [Document(page_content='Madam Speaker, Madam Vice President, our First Lady and Second Gentleman. Members of Congress and the Cabinet. Justices of the Supreme Court. My fellow Americans.  \n\nLast year COVID-19 kept us apart. This year we are finally together again. \n\nTonight, we meet as Democrats Republicans and Independents. But most importantly as Americans. \n\nWith a duty to one another to the American people to the Constitution. \n\nAnd with an unwavering resolve that freedom will always triumph over tyranny. \n\nSix days ago, Russia’s Vladimir Putin sought to shake the foundations of the free world thinking he could make it bend to his menacing ways. But he badly miscalculated. \n\nHe thought he could roll into Ukraine and the world would roll over. Instead he met a wall of strength he never imagined. \n\nHe met the Ukrainian people. \n\nFrom President Zelenskyy to every Ukrainian, their fearlessness, their courage, their determination, inspires the world. \n\nGroups of citizens blocking tanks with their bodies. Everyone from students to retirees teachers turned soldiers defending their homeland. \n\nIn this struggle as President Zelenskyy said in his speech to the European Parliament “Light will win over darkness.” The Ukrainian Ambassador to the United States is here tonight. \n\nLet each of us here tonight in this Chamber send an unmistakable signal to Ukraine and to the world. \n\nPlease rise if you are able and show that, Yes, we the United States of America stand with the Ukrainian people. \n\nThroughout our history we’ve learned this lesson when dictators do not pay a price for their aggression they cause more chaos.   \n\nThey keep moving.   \n\nAnd the costs and the threats to America and the world keep rising.   \n\nThat’s why the NATO Alliance was created to secure peace and stability in Europe after World War 2. \n\nThe United States is a member along with 29 other nations. \n\nIt matters. American diplomacy matters. 
American resolve matters. \n\nPutin’s latest attack on Ukraine was premeditated and unprovoked. \n\nHe rejected repeated efforts at diplomacy. \n\nHe thought the West and NATO wouldn’t respond. And he thought he could divide us at home. Putin was wrong. We were ready.  Here is what we did.   \n\nWe prepared extensively and carefully. \n\nWe spent months building a coalition of other freedom-loving nations from Europe and the Americas to Asia and Africa to confront Putin. \n\nI spent countless hours unifying our European allies. We shared with the world in advance what we knew Putin was planning and precisely how he would try to falsely justify his aggression.  \n\nWe countered Russia’s lies with truth.   \n\nAnd now that he has acted the free world is holding him accountable. \n\nAlong with twenty-seven members of the European Union including France, Germany, Italy, as well as countries like the United Kingdom, Canada, Japan, Korea, Australia, New Zealand, and many others, even Switzerland. \n\nWe are inflicting pain on Russia and supporting the people of Ukraine. Putin is now isolated from the world more than ever. \n\nTogether with our allies –we are right now enforcing powerful economic sanctions. \n\nWe are cutting off Russia’s largest banks from the international financial system.  \n\nPreventing Russia’s central bank from defending the Russian Ruble making Putin’s $630 Billion “war fund” worthless.   \n\nWe are choking off Russia’s access to technology that will sap its economic strength and weaken its military for years to come.  \n\nTonight I say to the Russian oligarchs and corrupt leaders who have bilked billions of dollars off this violent regime no more. \n\nThe U.S. Department of Justice is assembling a dedicated task force to go after the crimes of Russian oligarchs.  \n\nWe are joining with our European allies to find and seize your yachts your luxury apartments your private jets. We are coming for your ill-begotten gains.'),
  Document(page_content='We are joining with our European allies to find and seize your yachts your luxury apartments your private jets. We are coming for your ill-begotten gains. \n\nAnd tonight I am announcing that we will join our allies in closing off American air space to all Russian flights – further isolating Russia – and adding an additional squeeze –on their economy. The Ruble has lost 30% of its value. \n\nThe Russian stock market has lost 40% of its value and trading remains suspended. Russia’s economy is reeling and Putin alone is to blame. \n\nTogether with our allies we are providing support to the Ukrainians in their fight for freedom. Military assistance. Economic assistance. Humanitarian assistance. \n\nWe are giving more than $1 Billion in direct assistance to Ukraine. \n\nAnd we will continue to aid the Ukrainian people as they defend their country and to help ease their suffering.  \n\nLet me be clear, our forces are not engaged and will not engage in conflict with Russian forces in Ukraine.  \n\nOur forces are not going to Europe to fight in Ukraine, but to defend our NATO Allies – in the event that Putin decides to keep moving west.  \n\nFor that purpose we’ve mobilized American ground forces, air squadrons, and ship deployments to protect NATO countries including Poland, Romania, Latvia, Lithuania, and Estonia. \n\nAs I have made crystal clear the United States and our Allies will defend every inch of territory of NATO countries with the full force of our collective power.  \n\nAnd we remain clear-eyed. The Ukrainians are fighting back with pure courage. But the next few days weeks, months, will be hard on them.  \n\nPutin has unleashed violence and chaos.  But while he may make gains on the battlefield – he will pay a continuing high price over the long run. \n\nAnd a proud Ukrainian people, who have known 30 years  of independence, have repeatedly shown that they will not tolerate anyone who tries to take their country backwards.  
\n\nTo all Americans, I will be honest with you, as I’ve always promised. A Russian dictator, invading a foreign country, has costs around the world. \n\nAnd I’m taking robust action to make sure the pain of our sanctions  is targeted at Russia’s economy. And I will use every tool at our disposal to protect American businesses and consumers. \n\nTonight, I can announce that the United States has worked with 30 other countries to release 60 Million barrels of oil from reserves around the world.  \n\nAmerica will lead that effort, releasing 30 Million barrels from our own Strategic Petroleum Reserve. And we stand ready to do more if necessary, unified with our allies.  \n\nThese steps will help blunt gas prices here at home. And I know the news about what’s happening can seem alarming. \n\nBut I want you to know that we are going to be okay. \n\nWhen the history of this era is written Putin’s war on Ukraine will have left Russia weaker and the rest of the world stronger. \n\nWhile it shouldn’t have taken something so terrible for people around the world to see what’s at stake now everyone sees it clearly. \n\nWe see the unity among leaders of nations and a more unified Europe a more unified West. And we see unity among the people who are gathering in cities in large crowds around the world even in Russia to demonstrate their support for Ukraine.  \n\nIn the battle between democracy and autocracy, democracies are rising to the moment, and the world is clearly choosing the side of peace and security. \n\nThis is a real test. It’s going to take time. So let us continue to draw inspiration from the iron will of the Ukrainian people. \n\nTo our fellow Ukrainian Americans who forge a deep bond that connects our two nations we stand with you. \n\nPutin may circle Kyiv with tanks, but he will never gain the hearts and souls of the Ukrainian people. \n\nHe will never extinguish their love of freedom. He will never weaken the resolve of the free world. 
\n\nWe meet tonight in an America that has lived through two of the hardest years this nation has ever faced.'),
  Document(page_content='We meet tonight in an America that has lived through two of the hardest years this nation has ever faced. \n\nThe pandemic has been punishing. \n\nAnd so many families are living paycheck to paycheck, struggling to keep up with the rising cost of food, gas, housing, and so much more. \n\nI understand. \n\nI remember when my Dad had to leave our home in Scranton, Pennsylvania to find work. I grew up in a family where if the price of food went up, you felt it. \n\nThat’s why one of the first things I did as President was fight to pass the American Rescue Plan.  \n\nBecause people were hurting. We needed to act, and we did. \n\nFew pieces of legislation have done more in a critical moment in our history to lift us out of crisis. \n\nIt fueled our efforts to vaccinate the nation and combat COVID-19. It delivered immediate economic relief for tens of millions of Americans.  \n\nHelped put food on their table, keep a roof over their heads, and cut the cost of health insurance. \n\nAnd as my Dad used to say, it gave people a little breathing room. \n\nAnd unlike the $2 Trillion tax cut passed in the previous administration that benefitted the top 1% of Americans, the American Rescue Plan helped working people—and left no one behind. \n\nAnd it worked. It created jobs. Lots of jobs. \n\nIn fact—our economy created over 6.5 Million new jobs just last year, more jobs created in one year  \nthan ever before in the history of America. \n\nOur economy grew at a rate of 5.7% last year, the strongest growth in nearly 40 years, the first step in bringing fundamental change to an economy that hasn’t worked for the working people of this nation for too long.  \n\nFor the past 40 years we were told that if we gave tax breaks to those at the very top, the benefits would trickle down to everyone else. 
\n\nBut that trickle-down theory led to weaker economic growth, lower wages, bigger deficits, and the widest gap between those at the top and everyone else in nearly a century. \n\nVice President Harris and I ran for office with a new economic vision for America. \n\nInvest in America. Educate Americans. Grow the workforce. Build the economy from the bottom up  \nand the middle out, not from the top down.  \n\nBecause we know that when the middle class grows, the poor have a ladder up and the wealthy do very well. \n\nAmerica used to have the best roads, bridges, and airports on Earth. \n\nNow our infrastructure is ranked 13th in the world. \n\nWe won’t be able to compete for the jobs of the 21st Century if we don’t fix that. \n\nThat’s why it was so important to pass the Bipartisan Infrastructure Law—the most sweeping investment to rebuild America in history. \n\nThis was a bipartisan effort, and I want to thank the members of both parties who worked to make it happen. \n\nWe’re done talking about infrastructure weeks. \n\nWe’re going to have an infrastructure decade. \n\nIt is going to transform America and put us on a path to win the economic competition of the 21st Century that we face with the rest of the world—particularly with China.  \n\nAs I’ve told Xi Jinping, it is never a good bet to bet against the American people. \n\nWe’ll create good jobs for millions of Americans, modernizing roads, airports, ports, and waterways all across America. \n\nAnd we’ll do it all to withstand the devastating effects of the climate crisis and promote environmental justice. \n\nWe’ll build a national network of 500,000 electric vehicle charging stations, begin to replace poisonous lead pipes—so every child—and every American—has clean water to drink at home and at school, provide affordable high-speed internet for every American—urban, suburban, rural, and tribal communities. \n\n4,000 projects have already been announced. 
\n\nAnd tonight, I’m announcing that this year we will start fixing over 65,000 miles of highway and 1,500 bridges in disrepair. \n\nWhen we use taxpayer dollars to rebuild America – we are going to Buy American: buy American products to support American jobs.')],
 'output_text': '\n\nThe speaker addresses the unity of Americans and discusses the conflict with Russia and support for Ukraine. The US and allies are taking action against Russia and targeting corrupt leaders. There is also support and assurance for the American people. President Biden reflects on recent hardships and highlights efforts to pass the American Rescue Plan. He also shares plans for economic growth and investment in America. '}
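
The difference between the two runs can be reproduced with a toy model. The sketch below is not LangChain code: `FakeLLM`, `SHARED_CACHE`, and `map_reduce` are hypothetical names, and the "model" simply numbers its answers so that two uncached calls with the same prompt differ. The map-step model consults a shared cache, while the reduce-step model opts out.

```python
from itertools import count

# Hypothetical shared cache: maps a prompt to a previously generated answer.
SHARED_CACHE: dict[str, str] = {}


class FakeLLM:
    """Stand-in for a model; every real call returns a slightly different text."""

    def __init__(self, use_cache: bool) -> None:
        self.use_cache = use_cache
        self.real_calls = 0
        self._counter = count(1)

    def invoke(self, prompt: str) -> str:
        if self.use_cache and prompt in SHARED_CACHE:
            return SHARED_CACHE[prompt]  # cache hit: the model is not called
        self.real_calls += 1
        answer = f"summary#{next(self._counter)} of [{prompt}]"
        if self.use_cache:
            SHARED_CACHE[prompt] = answer
        return answer


def map_reduce(docs: list[str], map_llm: FakeLLM, reduce_llm: FakeLLM) -> str:
    partials = [map_llm.invoke(doc) for doc in docs]  # map step
    return reduce_llm.invoke(" | ".join(partials))    # reduce step


map_llm = FakeLLM(use_cache=True)
reduce_llm = FakeLLM(use_cache=False)
docs = ["chunk one", "chunk two", "chunk three"]

first = map_reduce(docs, map_llm, reduce_llm)
second = map_reduce(docs, map_llm, reduce_llm)
```

On the second run the three map calls are served from the cache, so only the reduce call reaches the model. That one uncached call is why the run is faster yet the final text changes.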

!rm .langchain.db sqlite.db

rm: sqlite.db: No such file or directory

`OpenSearch` Semantic Cache

Use OpenSearch as a semantic cache to cache prompts and responses, and evaluate hits based on semantic similarity.
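
To make "hits based on semantic similarity" concrete, here is a minimal standalone sketch. It assumes a toy word-count "embedding" and cosine similarity. `ToySemanticCache`, `embed`, and the 0.7 threshold are hypothetical and are not how `OpenSearchSemanticCache` is implemented; a real semantic cache would call an embedding model and a vector index.

```python
import math
from collections import Counter
from typing import Optional


def embed(text: str) -> Counter:
    """Toy 'embedding': lower-cased word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToySemanticCache:
    def __init__(self, score_threshold: float) -> None:
        self.score_threshold = score_threshold
        self._entries: list[tuple[Counter, str]] = []

    def update(self, prompt: str, response: str) -> None:
        self._entries.append((embed(prompt), response))

    def lookup(self, prompt: str) -> Optional[str]:
        """Return the response of the most similar cached prompt, if similar enough."""
        query = embed(prompt)
        best_score, best_response = 0.0, None
        for vector, response in self._entries:
            score = cosine(query, vector)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.score_threshold else None


cache = ToySemanticCache(score_threshold=0.7)
cache.update("Tell me a joke", "joke-1")

hit = cache.lookup("Tell me one joke")         # shares 3 of 4 words with the cached prompt
miss = cache.lookup("How long do dogs live?")  # shares no words with the cached prompt
```

Raising the threshold makes the cache stricter (fewer but safer hits); lowering it trades accuracy for a higher hit rate.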

from langchain_community.cache import OpenSearchSemanticCache
from langchain_openai import OpenAIEmbeddings

set_llm_cache(
    OpenSearchSemanticCache(
        opensearch_url="http://localhost:9200", embedding=OpenAIEmbeddings()
    )
)

API Reference:OpenSearchSemanticCache | OpenAIEmbeddings

%%time
# The first time, it is not yet in cache, so it should take longer
llm.invoke("Tell me a joke")

CPU times: user 39.4 ms, sys: 11.8 ms, total: 51.2 ms
Wall time: 1.55 s

"\n\nWhy don't scientists trust atoms?\n\nBecause they make up everything."

%%time
# The second time, while not a direct hit, the question is semantically similar to the original question,
# so it uses the cached result!
llm.invoke("Tell me one joke")

CPU times: user 4.66 ms, sys: 1.1 ms, total: 5.76 ms
Wall time: 113 ms

"\n\nWhy don't scientists trust atoms?\n\nBecause they make up everything."

`SingleStoreDB` Semantic Cache

You can use SingleStoreDB as a semantic cache to cache prompts and responses.

from langchain_community.cache import SingleStoreDBSemanticCache
from langchain_openai import OpenAIEmbeddings

set_llm_cache(
    SingleStoreDBSemanticCache(
        embedding=OpenAIEmbeddings(),
        host="root:pass@localhost:3306/db",
    )
)

API Reference:SingleStoreDBSemanticCache | OpenAIEmbeddings

`Memcached` Cache

You can use Memcached as a cache to cache prompts and responses through pymemcache.

This cache requires the pymemcache dependency to be installed:

%pip install -qU pymemcache

from langchain_community.cache import MemcachedCache
from pymemcache.client.base import Client

set_llm_cache(MemcachedCache(Client("localhost")))

API Reference:MemcachedCache

%%time
# The first time, it is not yet in cache, so it should take longer
llm.invoke("Tell me a joke")

CPU times: user 32.8 ms, sys: 21 ms, total: 53.8 ms
Wall time: 343 ms

'\n\nWhy did the chicken cross the road?\n\nTo get to the other side!'

%%time
# The second time it is, so it goes faster
llm.invoke("Tell me a joke")

CPU times: user 2.31 ms, sys: 850 µs, total: 3.16 ms
Wall time: 6.43 ms

'\n\nWhy did the chicken cross the road?\n\nTo get to the other side!'
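
Memcached keys are limited to 250 bytes and may not contain whitespace or control characters, so a raw prompt cannot be used as a key. How `MemcachedCache` derives its keys is not shown here; the sketch below is just one reasonable approach, with a hypothetical `cache_key` helper and a made-up `llm_string` value standing in for the serialized model settings.

```python
import hashlib


def cache_key(prompt: str, llm_string: str) -> str:
    """Derive a fixed-length, whitespace-free key from the prompt and model settings."""
    digest = hashlib.sha256()
    digest.update(prompt.encode("utf-8"))
    digest.update(b"\x00")  # separator, so ("ab", "c") and ("a", "bc") hash differently
    digest.update(llm_string.encode("utf-8"))
    return digest.hexdigest()


# A prompt far longer than 250 bytes, containing spaces and a newline.
long_prompt = "Summarize the following text:\n" + "lorem ipsum " * 200
key = cache_key(long_prompt, "model=gpt-3.5-turbo-instruct;n=2")
```

Including the model settings in the key keeps answers from one model configuration from being returned for another.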

`Couchbase` Caches

Use Couchbase as a cache for prompts and responses.

Standard Cache

The standard cache looks for an exact match of the user prompt.
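
As a rough illustration of what "exact match" means, the standalone toy below stores responses under the pair (prompt, model settings). `ToyExactCache` and the `"model-a"`/`"model-b"` strings are hypothetical. The method names `lookup` and `update` are borrowed for readability and the class does not depend on any library.

```python
from typing import Optional


class ToyExactCache:
    """Dictionary-backed cache keyed on the exact prompt text and model settings."""

    def __init__(self) -> None:
        self._store: dict[tuple[str, str], str] = {}

    def lookup(self, prompt: str, llm_string: str) -> Optional[str]:
        return self._store.get((prompt, llm_string))

    def update(self, prompt: str, llm_string: str, response: str) -> None:
        self._store[(prompt, llm_string)] = response


cache = ToyExactCache()
cache.update("Tell me a joke", "model-a", "Because it was two-tired!")

exact_hit = cache.lookup("Tell me a joke", "model-a")    # same prompt, same settings
near_miss = cache.lookup("Tell me one joke", "model-a")  # one word differs
other_model = cache.lookup("Tell me a joke", "model-b")  # same prompt, other settings
```

Only the first lookup returns the stored response. Even a one-word change in the prompt is a miss, which is the gap a semantic cache is meant to close.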

%pip install -qU langchain_couchbase couchbase

# Create couchbase connection object
from datetime import timedelta

from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.options import ClusterOptions
from langchain_couchbase.cache import CouchbaseCache
from langchain_openai import ChatOpenAI

COUCHBASE_CONNECTION_STRING = (
    "couchbase://localhost"  # or "couchbases://localhost" if using TLS
)
DB_USERNAME = "Administrator"
DB_PASSWORD = "Password"

auth = PasswordAuthenticator(DB_USERNAME, DB_PASSWORD)
options = ClusterOptions(auth)
cluster = Cluster(COUCHBASE_CONNECTION_STRING, options)

# Wait until the cluster is ready for use.
cluster.wait_until_ready(timedelta(seconds=5))

API Reference:CouchbaseCache | ChatOpenAI

# Specify the bucket, scope and collection to store the cached documents
BUCKET_NAME = "langchain-testing"
SCOPE_NAME = "_default"
COLLECTION_NAME = "_default"

set_llm_cache(
    CouchbaseCache(
        cluster=cluster,
        bucket_name=BUCKET_NAME,
        scope_name=SCOPE_NAME,
        collection_name=COLLECTION_NAME,
    )
)

%%time
# The first time, it is not yet in the cache, so it should take longer
llm.invoke("Tell me a joke")

CPU times: user 22.2 ms, sys: 14 ms, total: 36.2 ms
Wall time: 938 ms

"\n\nWhy couldn't the bicycle stand up by itself? Because it was two-tired!"

%%time
# The second time, it is in the cache, so it should be much faster
llm.invoke("Tell me a joke")

CPU times: user 25.9 ms, sys: 15.3 ms, total: 41.3 ms
Wall time: 144 ms

"\n\nWhy couldn't the bicycle stand up by itself? Because it was two-tired!"

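Time to Live (TTL) for Cached Entries

Cached documents can be deleted automatically after a specified time by passing the `ttl` parameter when the cache is initialized.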

from datetime import timedelta

set_llm_cache(
    CouchbaseCache(
        cluster=cluster,
        bucket_name=BUCKET_NAME,
        scope_name=SCOPE_NAME,
        collection_name=COLLECTION_NAME,
        ttl=timedelta(minutes=5),
    )
)
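
The effect of a TTL can be sketched without a database. The toy below is an assumption-laden illustration: `ToyTTLCache` and its injectable `clock` are hypothetical, and in Couchbase the expiry is presumably handled by the server rather than checked on lookup as it is here.

```python
import time
from datetime import timedelta
from typing import Callable, Optional


class ToyTTLCache:
    """In-memory cache whose entries stop being returned once their TTL has elapsed."""

    def __init__(self, ttl: timedelta, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl_seconds = ttl.total_seconds()
        self._clock = clock
        self._store: dict[str, tuple[float, str]] = {}

    def update(self, prompt: str, response: str) -> None:
        expires_at = self._clock() + self._ttl_seconds
        self._store[prompt] = (expires_at, response)

    def lookup(self, prompt: str) -> Optional[str]:
        entry = self._store.get(prompt)
        if entry is None:
            return None
        expires_at, response = entry
        if self._clock() >= expires_at:
            del self._store[prompt]  # expired: drop the entry and report a miss
            return None
        return response


# A fake clock (seconds) makes the expiry observable without waiting five minutes.
now = [0.0]
cache = ToyTTLCache(ttl=timedelta(minutes=5), clock=lambda: now[0])
cache.update("Tell me a joke", "cached joke")

now[0] = 299.0
fresh = cache.lookup("Tell me a joke")    # one second before expiry

now[0] = 300.0
expired = cache.lookup("Tell me a joke")  # exactly at the five-minute mark
```

After expiry the next call with the same prompt would go to the model again and repopulate the cache.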

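Semantic Cache

Semantic caching allows users to retrieve cached prompts based on the semantic similarity between the user input and previously cached inputs. Under the hood, it uses Couchbase as both a cache and a vector store. This requires an appropriate vector search index to be defined. See the usage example for how to set up the index.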

# Create Couchbase connection object
from datetime import timedelta

from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.options import ClusterOptions
from langchain_couchbase.cache import CouchbaseSemanticCache
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

COUCHBASE_CONNECTION_STRING = (
    "couchbase://localhost"  # or "couchbases://localhost" if using TLS
)
DB_USERNAME = "Administrator"
DB_PASSWORD = "Password"

auth = PasswordAuthenticator(DB_USERNAME, DB_PASSWORD)
options = ClusterOptions(auth)
cluster = Cluster(COUCHBASE_CONNECTION_STRING, options)

# Wait until the cluster is ready for use.
cluster.wait_until_ready(timedelta(seconds=5))

API Reference:CouchbaseSemanticCache | ChatOpenAI | OpenAIEmbeddings

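Notes:

The search index for the semantic cache needs to be defined before the semantic cache is used.
The optional `score_threshold` parameter of the semantic cache can be used to tune the results of the semantic search.

Full Text Search Service Index

How do you import an index into the Full Text Search service?

Couchbase Server
- Click on Search -> Add Index -> Import
- Copy the following index definition into the Import screen
- Click on Create Index to create the index.
Couchbase Capella
- Copy the index definition to a new file index.json
- Import the file in Capella using the instructions in the documentation.
- Click on Create Index to create the index.

Example index for the vector search: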

{
  "type": "fulltext-index",
  "name": "langchain-testing._default.semantic-cache-index",
  "sourceType": "gocbcore",
  "sourceName": "langchain-testing",
  "planParams": {
    "maxPartitionsPerPIndex": 1024,
    "indexPartitions": 16
  },
  "params": {
    "doc_config": {
      "docid_prefix_delim": "",
      "docid_regexp": "",
      "mode": "scope.collection.type_field",
      "type_field": "type"
    },
    "mapping": {
      "analysis": {},
      "default_analyzer": "standard",
      "default_datetime_parser": "dateTimeOptional",
      "default_field": "_all",
      "default_mapping": {
        "dynamic": true,
        "enabled": false
      },
      "default_type": "_default",
      "docvalues_dynamic": false,
      "index_dynamic": true,
      "store_dynamic": true,
      "type_field": "_type",
      "types": {
        "_default.semantic-cache": {
          "dynamic": false,
          "enabled": true,
          "properties": {
            "embedding": {
              "dynamic": false,
              "enabled": true,
              "fields": [
                {
                  "dims": 1536,
                  "index": true,
                  "name": "embedding",
                  "similarity": "dot_product",
                  "type": "vector",
                  "vector_index_optimized_for": "recall"
                }
              ]
            },
            "metadata": {
              "dynamic": true,
              "enabled": true
            },
            "text": {
              "dynamic": false,
              "enabled": true,
              "fields": [
                {
                  "index": true,
                  "name": "text",
                  "store": true,
                  "type": "text"
                }
              ]
            }
          }
        }
      }
    },
    "store": {
      "indexType": "scorch",
      "segmentVersion": 16
    }
  },
  "sourceParams": {}
}
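
The `dims` value of the vector field in this definition has to agree with the length of the vectors the embedding model produces (1536 here). A small helper can check that before the index is imported. The sketch below is hypothetical: `vector_fields` and `EXPECTED_DIMS` are illustrative names, and the dictionary is a trimmed copy of the definition above that keeps only the keys the helper reads.

```python
from typing import Iterator


def vector_fields(index_def: dict) -> Iterator[tuple[str, dict]]:
    """Yield (type_name, field) for every field of type 'vector' in an index definition."""
    types = index_def.get("params", {}).get("mapping", {}).get("types", {})
    for type_name, type_def in types.items():
        for prop in type_def.get("properties", {}).values():
            for field in prop.get("fields", []):
                if field.get("type") == "vector":
                    yield type_name, field


# Trimmed copy of the index definition; only the parts the helper inspects.
index_def = {
    "name": "langchain-testing._default.semantic-cache-index",
    "params": {
        "mapping": {
            "types": {
                "_default.semantic-cache": {
                    "properties": {
                        "embedding": {
                            "fields": [
                                {
                                    "dims": 1536,
                                    "name": "embedding",
                                    "similarity": "dot_product",
                                    "type": "vector",
                                }
                            ]
                        },
                        "text": {"fields": [{"name": "text", "type": "text"}]},
                    }
                }
            }
        }
    },
}

EXPECTED_DIMS = 1536  # assumption: vector length of the embedding model in use

found = list(vector_fields(index_def))
mismatched = [field for _, field in found if field["dims"] != EXPECTED_DIMS]
```

With the full definition saved as index.json, the same helper can be run on the result of `json.load`. A non-empty `mismatched` list would indicate that the index and the embedding model disagree.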

BUCKET_NAME = "langchain-testing"
SCOPE_NAME = "_default"
COLLECTION_NAME = "semantic-cache"
INDEX_NAME = "semantic-cache-index"
embeddings = OpenAIEmbeddings()

cache = CouchbaseSemanticCache(
    cluster=cluster,
    embedding=embeddings,
    bucket_name=BUCKET_NAME,
    scope_name=SCOPE_NAME,
    collection_name=COLLECTION_NAME,
    index_name=INDEX_NAME,
    score_threshold=0.8,
)

set_llm_cache(cache)

%%time
# The first time, it is not yet in the cache, so it should take longer
print(llm.invoke("How long do dogs live?"))

The average lifespan of a dog is around 12 years, but this can vary depending on the breed, size, and overall health of the individual dog. Some smaller breeds may live longer, while larger breeds may have shorter lifespans. Proper care, diet, and exercise can also play a role in extending a dog's lifespan.
CPU times: user 826 ms, sys: 2.46 s, total: 3.28 s
Wall time: 2.87 s

%%time
# The second time, it is in the cache, so it should be much faster
print(llm.invoke("What is the expected lifespan of a dog?"))

The average lifespan of a dog is around 12 years, but this can vary depending on the breed, size, and overall health of the individual dog. Some smaller breeds may live longer, while larger breeds may have shorter lifespans. Proper care, diet, and exercise can also play a role in extending a dog's lifespan.
CPU times: user 9.82 ms, sys: 2.61 ms, total: 12.4 ms
Wall time: 311 ms

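Time to Live (TTL) for Cached Entries

Cached documents can be deleted automatically after a specified time by passing the `ttl` parameter when the cache is initialized.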

from datetime import timedelta

set_llm_cache(
    CouchbaseSemanticCache(
        cluster=cluster,
        embedding=embeddings,
        bucket_name=BUCKET_NAME,
        scope_name=SCOPE_NAME,
        collection_name=COLLECTION_NAME,
        index_name=INDEX_NAME,
        score_threshold=0.8,
        ttl=timedelta(minutes=5),
    )
)

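Cache Classes: Summary Table

Cache classes are implemented by inheriting from the BaseCache class.

This table lists all derived classes with links to the API Reference.

Namespace	Class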
langchain_astradb.cache	AstraDBCache
langchain_astradb.cache	AstraDBSemanticCache
langchain_community.cache	AstraDBCache
langchain_community.cache	AstraDBSemanticCache
langchain_community.cache	AzureCosmosDBSemanticCache
langchain_community.cache	CassandraCache
langchain_community.cache	CassandraSemanticCache
langchain_couchbase.cache	CouchbaseCache
langchain_couchbase.cache	CouchbaseSemanticCache
langchain_elasticsearch.cache	ElasticsearchCache
langchain_elasticsearch.cache	ElasticsearchEmbeddingsCache
langchain_community.cache	GPTCache
langchain_core.caches	InMemoryCache
langchain_community.cache	InMemoryCache
langchain_community.cache	MomentoCache
langchain_mongodb.cache	MongoDBAtlasSemanticCache
langchain_mongodb.cache	MongoDBCache
langchain_community.cache	OpenSearchSemanticCache
langchain_community.cache	RedisSemanticCache
langchain_community.cache	SingleStoreDBSemanticCache
langchain_community.cache	SQLAlchemyCache
langchain_community.cache	SQLAlchemyMd5Cache
langchain_community.cache	UpstashRedisCache
