
Graph

NodeType

Bases: str, Enum

Enumeration of node types in the knowledge graph.

Currently supported node types are: UNKNOWN, DOCUMENT, CHUNK
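Because the enumeration derives from both str and Enum, its members behave like plain strings. The sketch below is a stand-in rather than the library class: the name `NodeTypeSketch` and the member values are assumptions made for illustration, since the source only lists the member names.

```python
from enum import Enum


class NodeTypeSketch(str, Enum):
    # Hypothetical values; only the member names come from the documentation.
    UNKNOWN = "unknown"
    DOCUMENT = "document"
    CHUNK = "chunk"


# The str mixin lets members compare equal to ordinary strings.
is_document = NodeTypeSketch.DOCUMENT == "document"

# Members can be recovered from their value, e.g. after reading JSON.
restored = NodeTypeSketch("chunk")

# Every member is also a real str instance.
is_string = isinstance(NodeTypeSketch.UNKNOWN, str)
```

This string behaviour is what makes such an enum convenient to store in JSON and read back.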

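Node

Bases: BaseModel

Represents a node in the knowledge graph.

Attributes:

Name Type Description
id UUID

Unique identifier of the node.

properties dict

Dictionary of properties associated with the node.

type NodeType

Type of the node.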

add_property

add_property(key: str, value: Any)

Adds a property to the node. The key is lower-cased before it is stored.

Raises:

Type Description
ValueError

If the property already exists.

Source code in ragas/src/ragas/testset/graph.py
def add_property(self, key: str, value: t.Any):
    """
    Adds a property to the node.

    Raises
    ------
    ValueError
        If the property already exists.
    """
    if key.lower() in self.properties:
        raise ValueError(f"Property {key} already exists")
    self.properties[key.lower()] = value

get_property

get_property(key: str) -> Optional[Any]

Retrieves a property value by key, or None if the key is not present.

Notes

The key is case-insensitive.

Source code in ragas/src/ragas/testset/graph.py
def get_property(self, key: str) -> t.Optional[t.Any]:
    """
    Retrieves a property value by key.

    Notes
    -----
    The key is case-insensitive.
    """
    return self.properties.get(key.lower(), None)
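The case handling in add_property and get_property can be tried in isolation. The following is a minimal sketch using a hypothetical `PropertyBag` dataclass instead of the pydantic-based Node, so it runs with the standard library only; the two method bodies mirror the listings above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class PropertyBag:
    """Hypothetical stand-in that mirrors the lower-casing shown above."""

    properties: Dict[str, Any] = field(default_factory=dict)

    def add_property(self, key: str, value: Any) -> None:
        # Keys are normalised to lower case before the duplicate check.
        if key.lower() in self.properties:
            raise ValueError(f"Property {key} already exists")
        self.properties[key.lower()] = value

    def get_property(self, key: str) -> Optional[Any]:
        # Lookups are normalised the same way; a missing key yields None.
        return self.properties.get(key.lower(), None)


bag = PropertyBag()
bag.add_property("Summary", "intro text")
found = bag.get_property("SUMMARY")        # same key, different case
missing = bag.get_property("keyphrases")   # absent key, no exception

try:
    bag.add_property("summary", "other")   # collides with "Summary"
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```

One consequence worth keeping in mind: a key placed directly into the properties dict with upper-case letters would not be found by get_property, because only the lookup key is lower-cased.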

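Relationship

Bases: BaseModel

Represents a relationship between two nodes in the knowledge graph.

Attributes:

Name Type Description
id (UUID, optional)

Unique identifier of the relationship. Defaults to a new UUID.

type str

The type of the relationship.

source Node

The source node of the relationship.

target Node

The target node of the relationship.

bidirectional (bool, optional)

Whether the relationship is bidirectional. Defaults to False.

properties (dict, optional)

Dictionary of properties associated with the relationship. Defaults to an empty dict.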

get_property

get_property(key: str) -> Optional[Any]

Retrieves a property value by key. The key is case-insensitive.

Source code in ragas/src/ragas/testset/graph.py
def get_property(self, key: str) -> t.Optional[t.Any]:
    """
    Retrieves a property value by key. The key is case-insensitive.
    """
    return self.properties.get(key.lower(), None)

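KnowledgeGraph dataclass

KnowledgeGraph(nodes: List[Node] = list(), relationships: List[Relationship] = list())

Represents a knowledge graph containing nodes and relationships.

Attributes:

Name Type Description
nodes List[Node]

List of nodes in the knowledge graph.

relationships List[Relationship]

List of relationships in the knowledge graph.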

add

add(item: Union[Node, Relationship])

Adds a node or relationship to the knowledge graph.

Raises:

Type Description
ValueError

If the item type is not Node or Relationship.

Source code in ragas/src/ragas/testset/graph.py
def add(self, item: t.Union[Node, Relationship]):
    """
    Adds a node or relationship to the knowledge graph.

    Raises
    ------
    ValueError
        If the item type is not Node or Relationship.
    """
    if isinstance(item, Node):
        self._add_node(item)
    elif isinstance(item, Relationship):
        self._add_relationship(item)
    else:
        raise ValueError(f"Invalid item type: {type(item)}")
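The type dispatch in add can be illustrated with plain dataclasses. In this sketch `MiniNode`, `MiniRelationship` and `MiniGraph` are hypothetical stand-ins, and the private helpers `_add_node` and `_add_relationship` are assumed to simply append to the matching list, which the listing above does not show.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class MiniNode:
    name: str


@dataclass
class MiniRelationship:
    source: MiniNode
    target: MiniNode
    type: str


@dataclass
class MiniGraph:
    nodes: List[MiniNode] = field(default_factory=list)
    relationships: List[MiniRelationship] = field(default_factory=list)

    def add(self, item: Union[MiniNode, MiniRelationship]) -> None:
        # Route the item to the matching list; reject anything else.
        if isinstance(item, MiniNode):
            self.nodes.append(item)  # assumed behaviour of _add_node
        elif isinstance(item, MiniRelationship):
            self.relationships.append(item)  # assumed behaviour of _add_relationship
        else:
            raise ValueError(f"Invalid item type: {type(item)}")


graph = MiniGraph()
a, b = MiniNode("a"), MiniNode("b")
graph.add(a)
graph.add(b)
graph.add(MiniRelationship(source=a, target=b, type="next"))

try:
    graph.add("not a node")  # type: ignore[arg-type]
    rejected = False
except ValueError:
    rejected = True
```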

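save

save(path: Union[str, Path])

Saves the knowledge graph to a JSON file.

Parameters:

Name Type Description Default
path Union[str, Path]

Path where the JSON file should be saved.

required

Notes

The file is saved using UTF-8 encoding to ensure proper handling of Unicode characters across different platforms.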

Source code in ragas/src/ragas/testset/graph.py
def save(self, path: t.Union[str, Path]):
    """Saves the knowledge graph to a JSON file.

    Parameters
    ----------
    path : Union[str, Path]
        Path where the JSON file should be saved.

    Notes
    -----
    The file is saved using UTF-8 encoding to ensure proper handling of Unicode characters
    across different platforms.
    """
    if isinstance(path, str):
        path = Path(path)

    data = {
        "nodes": [node.model_dump() for node in self.nodes],
        "relationships": [rel.model_dump() for rel in self.relationships],
    }
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, cls=UUIDEncoder, indent=2, ensure_ascii=False)
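The effect of combining a UUID-aware encoder with `encoding="utf-8"` and `ensure_ascii=False` can be seen with the standard library alone. `UUIDEncoderSketch` below is an assumption about what the library's UUIDEncoder does (turning UUID objects into strings); its real definition is not part of the listing above, and the sample data is made up.

```python
import json
import tempfile
import uuid
from pathlib import Path


class UUIDEncoderSketch(json.JSONEncoder):
    """Assumed behaviour: serialise UUID objects as their string form."""

    def default(self, o):
        if isinstance(o, uuid.UUID):
            return str(o)
        return super().default(o)


node_id = uuid.uuid4()
data = {
    "nodes": [{"id": node_id, "properties": {"title": "知识图谱"}, "type": "document"}],
    "relationships": [],
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "kg.json"
    # Same call shape as in save(): UTF-8 file, non-ASCII characters kept as-is.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, cls=UUIDEncoderSketch, indent=2, ensure_ascii=False)
    text = path.read_text(encoding="utf-8")
    reloaded = json.loads(text)
```

With `ensure_ascii=False` the title is written as readable characters rather than `\u` escapes, which is why the explicit UTF-8 encoding matters on platforms whose default encoding differs.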

load classmethod

load(path: Union[str, Path]) -> KnowledgeGraph

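Loads a knowledge graph from a path.

Parameters:

Name Type Description Default
path Union[str, Path]

Path to the JSON file containing the knowledge graph.

required

Returns:

Type Description
KnowledgeGraph

The loaded knowledge graph.

Notes

The file is read using UTF-8 encoding to ensure proper handling of Unicode characters across different platforms.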

Source code in ragas/src/ragas/testset/graph.py
@classmethod
def load(cls, path: t.Union[str, Path]) -> "KnowledgeGraph":
    """Loads a knowledge graph from a path.

    Parameters
    ----------
    path : Union[str, Path]
        Path to the JSON file containing the knowledge graph.

    Returns
    -------
    KnowledgeGraph
        The loaded knowledge graph.

    Notes
    -----
    The file is read using UTF-8 encoding to ensure proper handling of Unicode characters
    across different platforms.
    """
    if isinstance(path, str):
        path = Path(path)

    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)

    nodes = [Node(**node_data) for node_data in data["nodes"]]

    nodes_map = {str(node.id): node for node in nodes}
    relationships = [
        Relationship(
            id=rel_data["id"],
            type=rel_data["type"],
            source=nodes_map[rel_data["source"]],
            target=nodes_map[rel_data["target"]],
            bidirectional=rel_data["bidirectional"],
            properties=rel_data["properties"],
        )
        for rel_data in data["relationships"]
    ]

    kg = cls()
    kg.nodes.extend(nodes)
    kg.relationships.extend(relationships)
    return kg
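The listing above implies that each stored relationship refers to its endpoints by node id, and that load resolves those ids back to the freshly built Node objects through nodes_map. A minimal sketch of that resolution step, using hypothetical `MiniNode` and `MiniRelationship` dataclasses and made-up data:

```python
import uuid
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class MiniNode:
    id: uuid.UUID
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class MiniRelationship:
    type: str
    source: MiniNode
    target: MiniNode


a_id, b_id = str(uuid.uuid4()), str(uuid.uuid4())
data = {
    "nodes": [{"id": a_id, "properties": {}}, {"id": b_id, "properties": {}}],
    "relationships": [{"type": "next", "source": a_id, "target": b_id}],
}

# Rebuild nodes first, then index them by the string form of their id.
nodes = [MiniNode(id=uuid.UUID(n["id"]), properties=n["properties"]) for n in data["nodes"]]
nodes_map = {str(node.id): node for node in nodes}

# Relationships point at the very same node objects, not at copies.
relationships = [
    MiniRelationship(type=r["type"], source=nodes_map[r["source"]], target=nodes_map[r["target"]])
    for r in data["relationships"]
]

# An id that is not in the file's node list cannot be resolved.
try:
    nodes_map[str(uuid.uuid4())]
    dangling_resolved = True
except KeyError:
    dangling_resolved = False
```

By the same logic, a file whose relationships mention a node id that is missing from "nodes" would make the dictionary lookup in load raise a KeyError.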

find_indirect_clusters

find_indirect_clusters(relationship_condition: Callable[[Relationship], bool] = lambda _: True, depth_limit: int = 3) -> List[Set[Node]]

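Finds indirect clusters of nodes in the knowledge graph based on a relationship condition. If A -> B -> C -> D, then A, B, C, and D form a cluster. If there is also a path A -> B -> C -> E, it forms a separate cluster. Each path is followed only until its depth reaches depth_limit.

Parameters:

Name Type Description Default
relationship_condition Callable[[Relationship], bool]

A function that takes a Relationship and returns a boolean. Only relationships for which it returns True are traversed.

lambda _: True
depth_limit int

Depth at which the traversal stops extending a path.

3

Returns:

Type Description
List[Set[Node]]

A list of sets, where each set contains the nodes that form a cluster.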

Source code in ragas/src/ragas/testset/graph.py
def find_indirect_clusters(
    self,
    relationship_condition: t.Callable[[Relationship], bool] = lambda _: True,
    depth_limit: int = 3,
) -> t.List[t.Set[Node]]:
    """
    Finds indirect clusters of nodes in the knowledge graph based on a relationship condition.
    Here if A -> B -> C -> D, then A, B, C, and D form a cluster. If there's also a path A -> B -> C -> E,
    it will form a separate cluster.

    Parameters
    ----------
    relationship_condition : Callable[[Relationship], bool], optional
        A function that takes a Relationship and returns a boolean, by default lambda _: True

    Returns
    -------
    List[Set[Node]]
        A list of sets, where each set contains nodes that form a cluster.
    """
    clusters = []
    visited_paths = set()

    relationships = [
        rel for rel in self.relationships if relationship_condition(rel)
    ]

    def dfs(node: Node, cluster: t.Set[Node], depth: int, path: t.Tuple[Node, ...]):
        if depth >= depth_limit or path in visited_paths:
            return
        visited_paths.add(path)
        cluster.add(node)

        for rel in relationships:
            neighbor = None
            if rel.source == node and rel.target not in cluster:
                neighbor = rel.target
            elif (
                rel.bidirectional
                and rel.target == node
                and rel.source not in cluster
            ):
                neighbor = rel.source

            if neighbor is not None:
                dfs(neighbor, cluster.copy(), depth + 1, path + (neighbor,))

        # Add completed path-based cluster
        if len(cluster) > 1:
            clusters.append(cluster)

    for node in self.nodes:
        initial_cluster = set()
        dfs(node, initial_cluster, 0, (node,))

    # Remove duplicates by converting clusters to frozensets
    unique_clusters = [
        set(cluster) for cluster in set(frozenset(c) for c in clusters)
    ]

    return unique_clusters
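To see what the traversal returns on the A -> B -> C -> D / A -> B -> C -> E example, the following sketch ports the listing above to plain string labels. The function name `find_indirect_clusters_sketch` and the `(source, target, bidirectional)` edge tuples are illustrative assumptions; the control flow follows the source line by line, except that the result is returned as a set of frozensets for easy comparison.

```python
from typing import FrozenSet, List, Set, Tuple

Edge = Tuple[str, str, bool]  # (source, target, bidirectional)


def find_indirect_clusters_sketch(
    nodes: List[str], edges: List[Edge], depth_limit: int = 3
) -> Set[FrozenSet[str]]:
    clusters: List[Set[str]] = []
    visited_paths: Set[Tuple[str, ...]] = set()

    def dfs(node: str, cluster: Set[str], depth: int, path: Tuple[str, ...]) -> None:
        # Stop once the depth limit is reached or the exact path was seen.
        if depth >= depth_limit or path in visited_paths:
            return
        visited_paths.add(path)
        cluster.add(node)

        for source, target, bidirectional in edges:
            neighbor = None
            if source == node and target not in cluster:
                neighbor = target
            elif bidirectional and target == node and source not in cluster:
                neighbor = source
            if neighbor is not None:
                # Each branch continues with its own copy of the cluster.
                dfs(neighbor, cluster.copy(), depth + 1, path + (neighbor,))

        if len(cluster) > 1:
            clusters.append(cluster)

    for node in nodes:
        dfs(node, set(), 0, (node,))

    # Deduplicate clusters that contain the same nodes.
    return {frozenset(c) for c in clusters}


labels = ["A", "B", "C", "D", "E"]
links = [("A", "B", False), ("B", "C", False), ("C", "D", False), ("C", "E", False)]

default_clusters = find_indirect_clusters_sketch(labels, links)             # depth_limit=3
deeper_clusters = find_indirect_clusters_sketch(labels, links, depth_limit=4)
```

In this port the default limit of 3 produces clusters of at most three labels, such as {A, B, C} and {B, C, D}, together with shorter prefixes like {A, B}. The four-node clusters {A, B, C, D} and {A, B, C, E} from the description only appear once depth_limit is raised to 4. Whether the library behaves identically depends on this port being faithful to the listing.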

remove_node

remove_node(node: Node, inplace: bool = True) -> Optional[KnowledgeGraph]

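Removes a node and its associated relationships from the knowledge graph.

Parameters:

Name Type Description Default
node Node

The node to be removed from the knowledge graph.

required
inplace bool

If True, modifies the knowledge graph in place. If False, returns a modified copy with the node removed.

True

Returns:

Type Description
KnowledgeGraph or None

Returns a modified copy of the knowledge graph if inplace is False. Returns None if inplace is True.

Raises:

Type Description
ValueError

If the node is not present in the knowledge graph.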

Source code in ragas/src/ragas/testset/graph.py
def remove_node(
    self, node: Node, inplace: bool = True
) -> t.Optional["KnowledgeGraph"]:
    """
    Removes a node and its associated relationships from the knowledge graph.

    Parameters
    ----------
    node : Node
        The node to be removed from the knowledge graph.
    inplace : bool, optional
        If True, modifies the knowledge graph in place.
        If False, returns a modified copy with the node removed.

    Returns
    -------
    KnowledgeGraph or None
        Returns a modified copy of the knowledge graph if `inplace` is False.
        Returns None if `inplace` is True.

    Raises
    ------
    ValueError
        If the node is not present in the knowledge graph.
    """
    if node not in self.nodes:
        raise ValueError("Node is not present in the knowledge graph.")

    if inplace:
        # Modify the current instance
        self.nodes.remove(node)
        self.relationships = [
            rel
            for rel in self.relationships
            if rel.source != node and rel.target != node
        ]
    else:
        # Create a deep copy and modify it
        new_graph = deepcopy(self)
        new_graph.nodes.remove(node)
        new_graph.relationships = [
            rel
            for rel in new_graph.relationships
            if rel.source != node and rel.target != node
        ]
        return new_graph
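The difference between the two inplace modes is easiest to see side by side. This sketch uses hypothetical `MiniNode` and `MiniGraph` dataclasses with relationships reduced to `(source, target)` tuples; the two branches of the listing above are folded into one code path, which is a restructuring for brevity rather than the library's own layout.

```python
from copy import deepcopy
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class MiniNode:
    name: str


@dataclass
class MiniGraph:
    nodes: List[MiniNode] = field(default_factory=list)
    relationships: List[Tuple[MiniNode, MiniNode]] = field(default_factory=list)

    def remove_node(self, node: MiniNode, inplace: bool = True) -> Optional["MiniGraph"]:
        if node not in self.nodes:
            raise ValueError("Node is not present in the knowledge graph.")
        # Work on this graph directly, or on a deep copy of it.
        graph = self if inplace else deepcopy(self)
        graph.nodes.remove(node)
        # Drop every relationship that touches the removed node.
        graph.relationships = [
            (s, t) for s, t in graph.relationships if s != node and t != node
        ]
        return None if inplace else graph


a, b, c = MiniNode("a"), MiniNode("b"), MiniNode("c")
graph = MiniGraph(nodes=[a, b, c], relationships=[(a, b), (b, c)])

pruned = graph.remove_node(b, inplace=False)   # original graph is left untouched
original_size_after_copy = len(graph.nodes)

result = graph.remove_node(a)                  # in-place removal returns None
```

Note that the copy path relies on value equality: the deep-copied graph holds new node objects, so removing the caller's node from it only works if nodes compare equal to their copies, as dataclasses do here.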

find_two_nodes_single_rel

find_two_nodes_single_rel(relationship_condition: Callable[[Relationship], bool] = lambda _: True) -> List[Tuple[Node, Relationship, Node]]

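Finds pairs of nodes in the knowledge graph that are connected by a single relationship satisfying a condition. Each result is a (NodeA, Rel, NodeB) triple, ordered so that the node with the smaller id comes first; relationships whose source and target are the same node are skipped.

Parameters:

Name Type Description Default
relationship_condition Callable[[Relationship], bool]

A function that takes a Relationship and returns a boolean.

lambda _: True

Returns:

Type Description
List[Tuple[Node, Relationship, Node]]

A list of tuples, each containing two nodes and the relationship that connects them.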

Source code in ragas/src/ragas/testset/graph.py
def find_two_nodes_single_rel(
    self, relationship_condition: t.Callable[[Relationship], bool] = lambda _: True
) -> t.List[t.Tuple[Node, Relationship, Node]]:
    """
    Finds nodes in the knowledge graph based on a relationship condition.
    (NodeA, NodeB, Rel) triples are considered as multi-hop nodes.

    Parameters
    ----------
    relationship_condition : Callable[[Relationship], bool], optional
        A function that takes a Relationship and returns a boolean, by default lambda _: True

    Returns
    -------
    List[Set[Node, Relationship, Node]]
        A list of sets, where each set contains two nodes and a relationship forming a multi-hop node.
    """

    relationships = [
        relationship
        for relationship in self.relationships
        if relationship_condition(relationship)
    ]

    triplets = set()

    for relationship in relationships:
        if relationship.source != relationship.target:
            node_a = relationship.source
            node_b = relationship.target
            # Ensure the smaller ID node is always first
            if node_a.id < node_b.id:
                normalized_tuple = (node_a, relationship, node_b)
            else:
                normalized_relationship = Relationship(
                    source=node_b,
                    target=node_a,
                    type=relationship.type,
                    properties=relationship.properties,
                )
                normalized_tuple = (node_b, normalized_relationship, node_a)

            triplets.add(normalized_tuple)

    return list(triplets)
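The ordering rule in the listing above (smaller id first, self-loops skipped, results collected in a set) can be sketched with integers standing in for node ids. The function name `normalized_pairs` and the `(source_id, relationship_type, target_id)` tuples are illustrative assumptions, not part of the library.

```python
from typing import List, Set, Tuple

Edge = Tuple[int, str, int]  # (source_id, relationship_type, target_id)


def normalized_pairs(edges: List[Edge]) -> List[Edge]:
    triplets: Set[Edge] = set()
    for source, rel_type, target in edges:
        if source == target:
            continue  # a relationship from a node to itself is ignored
        if source < target:
            triplets.add((source, rel_type, target))
        else:
            # Flip the edge so the smaller id always comes first.
            triplets.add((target, rel_type, source))
    # Sorted only to make the result deterministic for this example.
    return sorted(triplets)


pairs = normalized_pairs(
    [(2, "similar", 1), (1, "similar", 2), (3, "next", 3), (1, "next", 3)]
)
```

Here the two opposite "similar" edges collapse into one entry because plain tuples compare by value. In the library, the flipped edge is a newly constructed Relationship that receives default values for id and bidirectional, so whether such duplicates collapse depends on how Relationship defines equality and hashing, which the listing does not show.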