Text2Cypher, Retrieval-Augmented Generation (RAG), Graph RAG and Graph+Vector RAG¶

To enable cognitive-intelligence applications over private data, RAG + LLM and Knowledge Graphs are state-of-the-art approaches.

In this demo, we will explain the know-how of four such approaches and compare their trade-offs and performance.

| QueryEngine | Knowledge Graph query engine | Graph RAG query engine | Vector RAG query engine | Graph+Vector RAG query engine |
| --- | --- | --- | --- | --- |
| Mechanism | 1. Text-to-GraphQuery based on KG<br>2. Query KG with the result<br>3. Answer synthesis based on query result | 1. Get related entities of the question<br>2. Get n-depth SubGraphs of related entities from KG<br>3. Answer synthesis based on related SubGraphs | 1. Create embedding of question<br>2. Semantic search top-k related doc chunks<br>3. Answer synthesis based on related doc chunks | 1. Do retrieval as Vector and Graph RAG<br>2. Answer synthesis based on both related chunks and SubGraphs |

Background: RAG¶

The diagram below shows how RAG works:

                  RAG with Llama Index
                  ┌────┬────┬────┬────┐                  
                  │ 1  │ 2  │ 3  │ 4  │                  
                  ├────┴────┴────┴────┤                  
                  │  Docs/Knowledge   │                  
┌───────┐         │        ...        │       ┌─────────┐
│       │         ├────┬────┬────┬────┤       │         │
│       │         │ 95 │ 96 │    │    │       │         │
│       │         └────┴────┴────┴────┘       │         │
│ User  │─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─▶   LLM   │
│       │                                     │         │
│       │                                     │         │
└───────┘    ┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┐  └─────────┘
    │          ┌──────────────────────────┐        ▲     
    └────────┼▶│  Tell me ....., please   │├───────┘     
               └──────────────────────────┘              
             │ ┌────┐ ┌────┐               │             
               │ 3  │ │ 96 │                             
             │ └────┘ └────┘               │             
              ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ 

In VectorDB-based RAG, we create an embedding for each node (chunk) and, at query time, find the top-K nodes most related to the given question. In the diagram above, nodes 3 and 96 were fetched as the top-K related nodes and used to help answer the user query.
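To make the retrieval step concrete, here is a minimal sketch (plain NumPy, not the LlamaIndex API) of what "find the top-K related chunks" means; `embed()` is a hypothetical stand-in for whatever embedding model is used:

# Minimal sketch of VectorDB-based RAG retrieval: embed chunks once,
# then rank them by cosine similarity against the question embedding.
# NOTE: embed() below is a hypothetical placeholder for an embedding model.
import numpy as np

def cosine_sim(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def top_k_chunks(question_emb, chunk_embs, chunks, k=2):
    """Return the k chunks whose embeddings are closest to the question."""
    scores = [cosine_sim(question_emb, e) for e in chunk_embs]
    ranked = sorted(zip(scores, chunks), key=lambda p: p[0], reverse=True)
    return [chunk for _, chunk in ranked[:k]]

# Usage sketch:
# chunk_embs = [embed(c) for c in chunks]                      # at index time
# context = top_k_chunks(embed(question), chunk_embs, chunks)  # at query time
# answer = llm(f"Context: {context}\nQuestion: {question}")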

Background: Graph RAG¶

In Graph RAG, we extract relationships between entities from each node, representing concise facts. It would look something like this:

Node Split and Embedding

┌────┬────┬────┬────┐
│ 1  │ 2  │ 3  │ 4  │
├────┴────┴────┴────┤
│  Docs/Knowledge   │
│        ...        │
├────┬────┬────┬────┤
│ 95 │ 96 │    │    │
└────┴────┴────┴────┘

Then, if we zoom in on it:

       Node Split and Embedding, with Knowledge Graph being extracted

┌──────────────────┬──────────────────┬──────────────────┬──────────────────┐
│ .─.       .─.    │  .─.       .─.   │            .─.   │  .─.       .─.   │
│( x )─────▶ y )   │ ( x )─────▶ a )  │           ( j )  │ ( m )◀────( x )  │
│ `▲'       `─'    │  `─'       `─'   │            `─'   │  `─'       `─'   │
│  │     1         │        2         │        3    │    │        4         │
│ .─.              │                  │            .▼.   │                  │
│( z )─────────────┼──────────────────┼──────────▶( i )─┐│                  │
│ `◀────┐          │                  │            `─'  ││                  │
├───────┼──────────┴──────────────────┴─────────────────┼┴──────────────────┤
│       │                      Docs/Knowledge           │                   │
│       │                            ...                │                   │
│       │                                               │                   │
├───────┼──────────┬──────────────────┬─────────────────┼┬──────────────────┤
│  .─.  └──────.   │  .─.             │                 ││  .─.             │
│ ( x ◀─────( b )  │ ( x )            │                 └┼▶( n )            │
│  `─'       `─'   │  `─'             │                  │  `─'             │
│        95   │    │   │    96        │                  │   │    98        │
│            .▼.   │  .▼.             │                  │   ▼              │
│           ( c )  │ ( d )            │                  │  .─.             │
│            `─'   │  `─'             │                  │ ( x )            │
└──────────────────┴──────────────────┴──────────────────┴──`─'─────────────┘

Here, knowledge is split at a finer granularity and carries a higher information density; optionally, multi-hop relationships such as x -> y or i -> j -> z -> x, spanning many more nodes (chunks) than K (in top-K search), can be included by the retrievers. And we believe there are cases where this additional work matters.
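As a toy illustration of why this can matter (plain Python, not the LlamaIndex API), the sketch below expands a couple of hops from the entities mentioned in a question and thereby touches facts from chunks that a vector top-K search might never return; the entities and chunk numbers loosely mirror the diagram above:

# Illustrative sketch only: multi-hop expansion over extracted triplets.
from collections import defaultdict

# (subject, predicate, object) triplets extracted from different chunks
triplets = [
    ("x", "relates_to", "y"),  # from chunk 1
    ("x", "relates_to", "z"),  # from chunk 1
    ("z", "relates_to", "i"),  # spans chunk 1 -> chunk 3
    ("i", "relates_to", "j"),  # from chunk 3
    ("b", "relates_to", "x"),  # from chunk 95
]

# Build a simple adjacency map over entities (edges traversable both ways)
graph = defaultdict(set)
for subj, _, obj in triplets:
    graph[subj].add(obj)
    graph[obj].add(subj)

def expand(entities, depth=2):
    """Collect all entities reachable within `depth` hops."""
    frontier, seen = set(entities), set(entities)
    for _ in range(depth):
        frontier = {nbr for e in frontier for nbr in graph[e]} - seen
        seen |= frontier
    return seen

# Starting from {"x"} (mentioned in the question), a 2-hop expansion reaches
# entities drawn from chunks 1, 3, and 95 -- regardless of any top-K cutoff.
print(expand({"x"}, depth=2))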

But how exactly, and how well, does it work? Let's see in this demo.

1. Preparation¶

1.1 Prepare for LLM¶

In [1]:
# For OpenAI

import os

os.environ["OPENAI_API_KEY"] = "INSERT YOUR KEY"

import logging
import sys

logging.basicConfig(
    stream=sys.stdout, level=logging.INFO
)  # logging.DEBUG for more verbose output

from llama_index import (
    KnowledgeGraphIndex,
    ServiceContext,
    SimpleDirectoryReader,
)
from llama_index.storage.storage_context import StorageContext
from llama_index.graph_stores import NebulaGraphStore


from llama_index.llms import OpenAI
from IPython.display import Markdown, display


# define LLM
# NOTE: at the time of demo, text-davinci-002 did not have rate-limit errors
llm = OpenAI(temperature=0, model="text-davinci-002")
service_context = ServiceContext.from_defaults(llm=llm, chunk_size=512)
INFO:numexpr.utils:NumExpr defaulting to 8 threads.
In [ ]:
# For Azure OpenAI

import os
import json
import openai
from langchain.embeddings import OpenAIEmbeddings
from llama_index.llms import AzureOpenAI
from llama_index import LangchainEmbedding
from llama_index import (
    VectorStoreIndex,
    SimpleDirectoryReader,
    KnowledgeGraphIndex,
    ServiceContext,
)
from llama_index import set_global_service_context

from llama_index.storage.storage_context import StorageContext
from llama_index.graph_stores import NebulaGraphStore

import logging
import sys

from IPython.display import Markdown, display

logging.basicConfig(
    stream=sys.stdout, level=logging.INFO
)  # logging.DEBUG for more verbose output
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

openai.api_type = "azure"
openai.api_base = "https://<foo-bar>.openai.azure.com"
openai.api_version = "2022-12-01"
os.environ["OPENAI_API_KEY"] = "youcannottellanyone"
openai.api_key = os.getenv("OPENAI_API_KEY")

llm = AzureOpenAI(
    engine="<foo-bar-deployment>",
    temperature=0,
    openai_api_version=openai.api_version,
    model_kwargs={
        "api_key": openai.api_key,
        "api_base": openai.api_base,
        "api_type": openai.api_type,
        "api_version": openai.api_version,
    },
)

# You need to deploy your own embedding model as well as your own chat completion model
embedding_llm = LangchainEmbedding(
    OpenAIEmbeddings(
        model="text-embedding-ada-002",
        deployment="<foo-bar-deployment>",
        openai_api_key=openai.api_key,
        openai_api_base=openai.api_base,
        openai_api_type=openai.api_type,
        openai_api_version=openai.api_version,
    ),
    embed_batch_size=1,
)

service_context = ServiceContext.from_defaults(
    llm=llm,
    embed_model=embedding_llm,
)

set_global_service_context(service_context)

1.2. Prepare for NebulaGraph as Graph Store¶

❗Access NebulaGraph Console to create space and graph schema

CREATE SPACE guardians(vid_type=FIXED_STRING(256), partition_num=1, replica_factor=1);
:sleep 10;
USE guardians;
CREATE TAG entity(name string);
CREATE EDGE relationship(relationship string);
:sleep 10;
CREATE TAG INDEX entity_index ON entity(name(256));
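
If you prefer to run these statements from Python rather than the NebulaGraph console, a minimal sketch with the nebula3-python client (installed in the next cell) could look like the following; the address and root/nebula credentials are the same local-deployment assumptions used below:

# Sketch: create the space and schema via nebula3-python instead of the console.
# Assumes a local NebulaGraph at 127.0.0.1:9669 with default root/nebula credentials.
from time import sleep

from nebula3.Config import Config
from nebula3.gclient.net import ConnectionPool

pool = ConnectionPool()
pool.init([("127.0.0.1", 9669)], Config())
session = pool.get_session("root", "nebula")

session.execute(
    "CREATE SPACE IF NOT EXISTS guardians"
    "(vid_type=FIXED_STRING(256), partition_num=1, replica_factor=1);"
)
sleep(10)  # wait for the new space to be ready on all hosts
session.execute("USE guardians;")
session.execute("CREATE TAG IF NOT EXISTS entity(name string);")
session.execute("CREATE EDGE IF NOT EXISTS relationship(relationship string);")
sleep(10)  # wait for the schema to take effect
session.execute("CREATE TAG INDEX IF NOT EXISTS entity_index ON entity(name(256));")

session.release()
pool.close()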
In [2]:
%pip install nebula3-python ipython-ngql

os.environ['NEBULA_USER'] = "root"
os.environ['NEBULA_PASSWORD'] = "nebula" # default password
os.environ['NEBULA_ADDRESS'] = "127.0.0.1:9669" # assumed we have NebulaGraph installed locally

space_name = "guardians"
edge_types, rel_prop_names = ["relationship"], ["relationship"] # default, can be omitted if creating from an empty KG
tags = ["entity"] # default, can be omitted if creating from an empty KG

graph_store = NebulaGraphStore(
    space_name=space_name,
    edge_types=edge_types,
    rel_prop_names=rel_prop_names,
    tags=tags,
)
storage_context = StorageContext.from_defaults(graph_store=graph_store)
Requirement already satisfied: nebula3-python in /opt/homebrew/lib/python3.11/site-packages/nebula3_python-3.4.0-py3.11.egg (3.4.0)
Requirement already satisfied: ipython-ngql in /opt/homebrew/lib/python3.11/site-packages/ipython_ngql-0.5-py3.11.egg (0.5)
Requirement already satisfied: httplib2>=0.20.0 in /opt/homebrew/lib/python3.11/site-packages/httplib2-0.22.0-py3.11.egg (from nebula3-python) (0.22.0)
Requirement already satisfied: future>=0.18.0 in /opt/homebrew/lib/python3.11/site-packages/future-0.18.3-py3.11.egg (from nebula3-python) (0.18.3)
Requirement already satisfied: six>=1.16.0 in /opt/homebrew/lib/python3.11/site-packages (from nebula3-python) (1.16.0)
Requirement already satisfied: pytz>=2021.1 in /opt/homebrew/lib/python3.11/site-packages (from nebula3-python) (2022.7.1)
Requirement already satisfied: Jinja2 in /opt/homebrew/lib/python3.11/site-packages/Jinja2-3.1.2-py3.11.egg (from ipython-ngql) (3.1.2)
Requirement already satisfied: pandas in /opt/homebrew/lib/python3.11/site-packages (from ipython-ngql) (1.5.3)
Requirement already satisfied: pyparsing!=3.0.0,!=3.0.1,!=3.0.2,!=3.0.3,<4,>=2.4.2 in /opt/homebrew/lib/python3.11/site-packages/pyparsing-3.1.0b1-py3.11.egg (from httplib2>=0.20.0->nebula3-python) (3.1.0b1)
Requirement already satisfied: MarkupSafe>=2.0 in /opt/homebrew/lib/python3.11/site-packages/MarkupSafe-2.1.2-py3.11-macosx-13-arm64.egg (from Jinja2->ipython-ngql) (2.1.2)
Requirement already satisfied: python-dateutil>=2.8.1 in /opt/homebrew/lib/python3.11/site-packages (from pandas->ipython-ngql) (2.8.2)
Requirement already satisfied: numpy>=1.21.0 in /opt/homebrew/lib/python3.11/site-packages (from pandas->ipython-ngql) (1.24.0)
Note: you may need to restart the kernel to use updated packages.

2. Build the Knowledge Graph¶

In our demo, the Knowledge Graph is created with the LLM.

We simply do so by leveraging the KnowledgeGraphIndex from LlamaIndex: when creating it, triplets will be extracted with the LLM and eventually persisted into the NebulaGraphStore.

2.1 Preprocess Data¶

We will download and preprocess data from: https://en.wikipedia.org/wiki/Guardians_of_the_Galaxy_Vol._3

In [4]:
from llama_index import download_loader

WikipediaReader = download_loader("WikipediaReader")

loader = WikipediaReader()

documents = loader.load_data(pages=['Guardians of the Galaxy Vol. 3'], auto_suggest=False)

2.2 Extract Triplets and Save to NebulaGraph¶

This call will take some time; it will extract entities and relationships and store them into NebulaGraph.

In [6]:
kg_index = KnowledgeGraphIndex.from_documents(
    documents,
    storage_context=storage_context,
    service_context=service_context,
    max_triplets_per_chunk=10,
    space_name=space_name,
    edge_types=edge_types,
    rel_prop_names=rel_prop_names,
    tags=tags,
    include_embeddings=True,
)

3. Create VectorStoreIndex for RAG¶

To compare with/work together with VectorDB based RAG, let's also create a VectorStoreIndex.

During the creation, the same data source will be split into chunks and embeddings will be created for them; at RAG query time, the top-k related chunks will be found by vector search against the embedding of the question.

                  RAG with Llama Index
                  ┌────┬────┬────┬────┐                  
                  │ 1  │ 2  │ 3  │ 4  │                  
                  ├────┴────┴────┴────┤                  
                  │  Docs/Knowledge   │                  
┌───────┐         │        ...        │       ┌─────────┐
│       │         ├────┬────┬────┬────┤       │         │
│       │         │ 95 │ 96 │    │    │       │         │
│       │         └────┴────┴────┴────┘       │         │
│ User  │─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─▶   LLM   │
│       │                                     │         │
│       │                                     │         │
└───────┘    ┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┐  └─────────┘
    │          ┌──────────────────────────┐        ▲     
    └────────┼▶│  Tell me ....., please   │├───────┘     
               └──────────────────────────┘              
             │ ┌────┐ ┌────┐               │             
               │ 3  │ │ 96 │                             
             │ └────┘ └────┘               │             
              ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ 

In Llama Index, this can be done with one line of code:

In [5]:
vector_index = VectorStoreIndex.from_documents(
    documents,
    service_context=service_context
)

4. Persist and Load Llama Indexes from Disk (Optional)¶

Both the KnowledgeGraphIndex and VectorStoreIndex only need to be created once; afterwards, we can persist their in-memory context so they can be reloaded from disk anytime.

Persist¶

# persist KG Index(Only MetaData will be persisted, KG is in NebulaGraph)
kg_index.storage_context.persist(persist_dir='./storage_graph')

# persist Vector Index
vector_index.storage_context.persist(persist_dir='./storage_vector')

Then the files are created:

$ ls -l ./storage_*

storage_graph:
total 6384
-rw-r--r--@ 1 weyl  staff    44008 Jul 14 11:06 docstore.json
-rw-r--r--@ 1 weyl  staff  3219385 Jul 14 11:06 index_store.json
-rw-r--r--@ 1 weyl  staff       51 Jul 14 11:06 vector_store.json

storage_vector:
total 712
-rw-r--r--@ 1 weyl  staff   44008 Jul 14 11:06 docstore.json
-rw-r--r--@ 1 weyl  staff      18 Jul 14 11:06 graph_store.json
-rw-r--r--@ 1 weyl  staff    1003 Jul 14 11:06 index_store.json
-rw-r--r--@ 1 weyl  staff  311028 Jul 14 11:06 vector_store.json

Restore¶

So we can restore the indexes from disk like this:

from llama_index import load_index_from_storage

storage_context = StorageContext.from_defaults(persist_dir='./storage_graph', graph_store=graph_store)
kg_index = load_index_from_storage(
    storage_context=storage_context,
    service_context=service_context,
    max_triplets_per_chunk=10,
    space_name=space_name,
    edge_types=edge_types,
    rel_prop_names=rel_prop_names,
    tags=tags,
    include_embeddings=True,
)

storage_context_vector = StorageContext.from_defaults(persist_dir='./storage_vector')
vector_index = load_index_from_storage(
    service_context=service_context,
    storage_context=storage_context_vector
)
In [7]:
# vector_index.storage_context.persist(persist_dir='./storage_vector')

from llama_index import load_index_from_storage

storage_context = StorageContext.from_defaults(persist_dir='./storage_graph', graph_store=graph_store)
kg_index = load_index_from_storage(
    storage_context=storage_context,
    service_context=service_context,
    max_triplets_per_chunk=10,
    space_name=space_name,
    edge_types=edge_types,
    rel_prop_names=rel_prop_names,
    tags=tags,
    include_embeddings=True,
)

storage_context_vector = StorageContext.from_defaults(persist_dir='./storage_vector')
vector_index = load_index_from_storage(
    service_context=service_context,
    storage_context=storage_context_vector
)
INFO:llama_index.indices.loading:Loading all indices.
Loading all indices.
INFO:llama_index.indices.loading:Loading all indices.
Loading all indices.

5. Prepare for different query approaches¶

We will do 4 types of query approaches with LLM, KG, VectorDB:

| QueryEngine | Knowledge Graph query engine | Graph RAG query engine | Vector RAG query engine | Graph+Vector RAG query engine |
| --- | --- | --- | --- | --- |
| Mechanism | 1. Text-to-GraphQuery based on KG<br>2. Query KG with the result<br>3. Answer synthesis based on query result | 1. Get related entities of the question<br>2. Get n-depth SubGraphs of related entities from KG<br>3. Answer synthesis based on related SubGraphs | 1. Create embedding of question<br>2. Semantic search top-k related doc chunks<br>3. Answer synthesis based on related doc chunks | 1. Do retrieval as Vector and Graph RAG<br>2. Answer synthesis based on both related chunks and SubGraphs |

5.1 text-to-NebulaGraphCypher¶

The Text-to-NebulaGraphCypher approach translates the task/question into a graph Cypher query and answers based on its query result.

In [8]:
from llama_index.query_engine import KnowledgeGraphQueryEngine

from llama_index.storage.storage_context import StorageContext
from llama_index.graph_stores import NebulaGraphStore

nl2kg_query_engine = KnowledgeGraphQueryEngine(
    storage_context=storage_context,
    service_context=service_context,
    llm=llm,
    verbose=True,
)

5.2 Graph RAG query engine¶

Graph RAG takes the SubGraphs related to the entities of the task/question as context.

               Graph RAG with Llama Index
                  ┌────┬────┬────┬────┐                  
                  │ 1  │ 2  │ 3  │ 4  │                  
                  ├────┴────┴────┴────┤                  
                  │  Docs/Knowledge   │                  
┌───────┐         │        ...        │       ┌─────────┐
│       │         ├────┬────┬────┬────┤       │         │
│       │         │ 95 │ 96 │    │    │       │         │
│       │         └────┴────┴────┴────┘       │         │
│ User  │─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─▶   LLM   │
│       │                                     │         │
│       │                                     │         │
└───────┘    ┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┐  └─────────┘
    │          ┌──────────────────────────┐        ▲     
    └────────┼▶│  Tell me about x, please │├───────┘     
               └──────────────────────────┘              
             │ Below are knowledge about x │             
               x->y<-z,x->h->i, m<-n,...                            
             │ Please answer based on them │             
              ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ 
In [10]:
kg_rag_query_engine = kg_index.as_query_engine(
    include_text=False,
    retriever_mode="keyword",
    response_mode="tree_summarize",
)
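
To see which triplets this engine will actually feed to the LLM for a given question, you can call the underlying retriever directly. Below is a small sketch using the same KGTableRetriever that backs this query engine (also used in section 5.4); the exact content format of the returned nodes may vary across llama_index versions:

# Sketch: inspect the SubGraph context that Graph RAG retrieves, before synthesis.
from llama_index.retrievers import KGTableRetriever

kg_retriever_preview = KGTableRetriever(
    index=kg_index,
    retriever_mode="keyword",  # extract keywords/entities from the question
    include_text=False,        # return only triplets, not the source chunks
)

for node_with_score in kg_retriever_preview.retrieve("Tell me about Peter Quill."):
    # each node carries the flattened knowledge triplets used as LLM context
    print(node_with_score.node.get_content())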

5.3 Vector RAG query engine¶

Vector RAG is the common approach: find the top-K semantically related doc chunks and use them as context to synthesize the answer.

In [9]:
vector_rag_query_engine = vector_index.as_query_engine()

5.4 Graph+Vector RAG query engine¶

This is a combined Graph+Vector based RAG, where we retrieve both VectorDB chunks and KG SubGraphs as the context for synthesizing the answer.

           Graph + Vector RAG with Llama Index
                  ┌────┬────┬────┬────┐                  
                  │ 1  │ 2  │ 3  │ 4  │                  
                  ├────┴────┴────┴────┤                  
                  │  Docs/Knowledge   │                  
┌───────┐         │        ...        │       ┌─────────┐
│       │         ├────┬────┬────┬────┤       │         │
│       │         │ 95 │ 96 │    │    │       │         │
│       │         └────┴────┴────┴────┘       │         │
│ User  │─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─▶   LLM   │
│       │                                     │         │
│       │                                     │         │
└───────┘    ┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┐  └─────────┘
    │          ┌──────────────────────────┐        ▲     
    └────────┼▶│  Tell me ....., please   │├───────┘     
               └──────────────────────────┘              
             │ ┌────┐┌────┐               │             
               │ 3  ││ 96 │ x->y<-z,x->h...                            
             │ └────┘└────┘               │             
              ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ 

To implement that in Llama Index, we create a CustomRetriever to combine the two:

In [11]:
# import QueryBundle
from llama_index import QueryBundle

# import NodeWithScore
from llama_index.schema import NodeWithScore

# Retrievers
from llama_index.retrievers import BaseRetriever, VectorIndexRetriever, KGTableRetriever

from typing import List


class CustomRetriever(BaseRetriever):
    """Custom retriever that performs both Vector search and Knowledge Graph search"""

    def __init__(
        self,
        vector_retriever: VectorIndexRetriever,
        kg_retriever: KGTableRetriever,
        mode: str = "OR",
    ) -> None:
        """Init params."""

        self._vector_retriever = vector_retriever
        self._kg_retriever = kg_retriever
        if mode not in ("AND", "OR"):
            raise ValueError("Invalid mode.")
        self._mode = mode

    def _retrieve(self, query_bundle: QueryBundle) -> List[NodeWithScore]:
        """Retrieve nodes given query."""

        vector_nodes = self._vector_retriever.retrieve(query_bundle)
        kg_nodes = self._kg_retriever.retrieve(query_bundle)

        vector_ids = {n.node.node_id for n in vector_nodes}
        kg_ids = {n.node.node_id for n in kg_nodes}

        combined_dict = {n.node.node_id: n for n in vector_nodes}
        combined_dict.update({n.node.node_id: n for n in kg_nodes})

        if self._mode == "AND":
            retrieve_ids = vector_ids.intersection(kg_ids)
        else:
            retrieve_ids = vector_ids.union(kg_ids)

        retrieve_nodes = [combined_dict[rid] for rid in retrieve_ids]
        return retrieve_nodes

Next, we will create instances of the Vector and KG retrievers, which will be used in the instantiation of the Custom Retriever.

In [12]:
from llama_index import get_response_synthesizer
from llama_index.query_engine import RetrieverQueryEngine

# create custom retriever
vector_retriever = VectorIndexRetriever(index=vector_index)
kg_retriever = KGTableRetriever(
    index=kg_index, retriever_mode="keyword", include_text=False
)
custom_retriever = CustomRetriever(vector_retriever, kg_retriever)

# create response synthesizer
response_synthesizer = get_response_synthesizer(
    service_context=service_context,
    response_mode="tree_summarize",
)

And the query engine:

In [13]:
graph_vector_rag_query_engine = RetrieverQueryEngine(
    retriever=custom_retriever,
    response_synthesizer=response_synthesizer,
)

6. Query with all the Engines¶

6.1 Text-to-GraphQuery¶

In [16]:
response_nl2kg = nl2kg_query_engine.query("Tell me about Peter Quill.")


display(Markdown(f"<b>{response_nl2kg}</b>"))

# Cypher:

print("Cypher Query:")

graph_query = nl2kg_query_engine.generate_query(
    "Tell me about Peter Quill?",
)
graph_query = graph_query.replace("WHERE", "\n  WHERE").replace("RETURN", "\nRETURN")

display(
    Markdown(
        f"""
```cypher
{graph_query}
```
"""
    )
)
Graph Store Query: MATCH (p:`entity`)-[:relationship]->(e:`entity`) WHERE p.`entity`.`name` == 'Peter Quill' RETURN e.`entity`.`name`;
INFO:llama_index.query_engine.knowledge_graph_query_engine:Graph Store Query: MATCH (p:`entity`)-[:relationship]->(e:`entity`) WHERE p.`entity`.`name` == 'Peter Quill' RETURN e.`entity`.`name`;
Graph Store Query: MATCH (p:`entity`)-[:relationship]->(e:`entity`) WHERE p.`entity`.`name` == 'Peter Quill' RETURN e.`entity`.`name`;
Graph Store Response: {'e.entity.name': ['Guardians of the Galaxy']}
INFO:llama_index.query_engine.knowledge_graph_query_engine:Graph Store Response: {'e.entity.name': ['Guardians of the Galaxy']}
Graph Store Response: {'e.entity.name': ['Guardians of the Galaxy']}
Final Response: 
Peter Quill is a member of the superhero team known as the Guardians of the Galaxy.
Peter Quill is a member of the superhero team known as the Guardians of the Galaxy.
Cypher Query:
MATCH (p:`entity`)-[:relationship]->(e:`entity`) 
  WHERE p.`entity`.`name` == 'Peter Quill' 
RETURN e.`entity`.`name`;

6.2 Graph RAG¶

In [17]:
response_graph_rag = kg_rag_query_engine.query("Tell me about Peter Quill.")

display(Markdown(f"<b>{response_graph_rag}</b>"))
INFO:llama_index.indices.knowledge_graph.retriever:> Starting query: Tell me about Peter Quill.
> Starting query: Tell me about Peter Quill.
INFO:llama_index.indices.knowledge_graph.retriever:> Query keywords: ['biography', 'Peter Quill', 'history', 'Peter', 'Quill']
> Query keywords: ['biography', 'Peter Quill', 'history', 'Peter', 'Quill']
INFO:llama_index.indices.knowledge_graph.retriever:> Extracted relationships: The following are knowledge triplets in max depth 2 in the form of `subject [predicate, object, predicate_next_hop, object_next_hop ...]`
Peter Quill ['is leader of', 'Guardians of the Galaxy', 'released in', '2014']
Peter Quill ['portrays', 'Peter Quill']
Peter Quill ['is leader of', 'Guardians of the Galaxy', 'reprised role from', 'Guardians of the Galaxy']
Peter Quill ['is leader of', 'Guardians of the Galaxy']
Peter Quill ['is leader of', 'Guardians of the Galaxy', 'directed', 'Guardians of the Galaxy']
Peter Quill ['is leader of', 'Guardians of the Galaxy', 'wrote', 'Guardians of the Galaxy']
Peter Quill ['is leader of', 'Guardians of the Galaxy', 'sequel to', 'Guardians of the Galaxy']
Quill ['speaks', ' fuck ']
> Extracted relationships: The following are knowledge triplets in max depth 2 in the form of `subject [predicate, object, predicate_next_hop, object_next_hop ...]`
Peter Quill ['is leader of', 'Guardians of the Galaxy', 'released in', '2014']
Peter Quill ['portrays', 'Peter Quill']
Peter Quill ['is leader of', 'Guardians of the Galaxy', 'reprised role from', 'Guardians of the Galaxy']
Peter Quill ['is leader of', 'Guardians of the Galaxy']
Peter Quill ['is leader of', 'Guardians of the Galaxy', 'directed', 'Guardians of the Galaxy']
Peter Quill ['is leader of', 'Guardians of the Galaxy', 'wrote', 'Guardians of the Galaxy']
Peter Quill ['is leader of', 'Guardians of the Galaxy', 'sequel to', 'Guardians of the Galaxy']
Quill ['speaks', ' fuck ']
Peter Quill is the leader of the Guardians of the Galaxy, a superhero team from Marvel Comics. He was portrayed by Chris Pratt in the 2014 movie of the same name, and reprised his role in the 2017 sequel. He also wrote and directed the first movie. Quill is known for his foul language, often using the word "fuck".

6.3 Vector RAG¶

In [18]:
response_vector_rag = vector_rag_query_engine.query("Tell me about Peter Quill.")

display(Markdown(f"<b>{response_vector_rag}</b>"))
Peter Quill, also known as Star-Lord, is the half-human, half-Celestial leader of the Guardians of the Galaxy. He was abducted from Earth as a child and raised by a group of alien thieves and smugglers, the Ravagers. In the film, Quill is in a "state of depression" following the appearance of a variant of his dead lover Gamora, who does not share the same affection for Quill as her older version had for him, which in turn affects his leadership of the Guardians.
In [32]:
display(
    Markdown(
        llm(f"""
Compare the two QA result on "Tell me about Peter Quill.", list the differences between them, to help evalute them. Output in markdown table.

Result from Graph: {response_graph_rag}
---
Result from Vector: {response_vector_rag}

"""
           )
    )
)
| Result | Graph | Vector |
| --- | --- | --- |
| Character | Leader of the Guardians of the Galaxy | Half-human, half-Celestial leader of the Guardians of the Galaxy |
| Movie | Portrayed by Chris Pratt in the 2014 movie of the same name, and reprised his role in the 2017 sequel. He also wrote and directed the first movie. | In the film, Quill is in a "state of depression" following the appearance of a variant of his dead lover Gamora, who does not share the same affection for Quill as her older version had for him, which in turn affects his leadership of the Guardians. |
| Personality | Known for his foul language, often using the word "fuck". | No mention |

6.4 Graph + Vector RAG¶

In [33]:
response_graph_vector_rag = graph_vector_rag_query_engine.query("Tell me about Peter Quill.")

display(Markdown(f"<b>{response_graph_vector_rag}</b>"))
INFO:llama_index.indices.knowledge_graph.retriever:> Starting query: Tell me about Peter Quill.
> Starting query: Tell me about Peter Quill.
INFO:llama_index.indices.knowledge_graph.retriever:> Query keywords: ['biography', 'Peter Quill', 'history', 'Peter', 'Quill']
> Query keywords: ['biography', 'Peter Quill', 'history', 'Peter', 'Quill']
INFO:llama_index.indices.knowledge_graph.retriever:> Extracted relationships: The following are knowledge triplets in max depth 2 in the form of `subject [predicate, object, predicate_next_hop, object_next_hop ...]`
Peter Quill ['is leader of', 'Guardians of the Galaxy', 'released in', '2014']
Peter Quill ['portrays', 'Peter Quill']
Peter Quill ['is leader of', 'Guardians of the Galaxy', 'reprised role from', 'Guardians of the Galaxy']
Peter Quill ['is leader of', 'Guardians of the Galaxy']
Peter Quill ['is leader of', 'Guardians of the Galaxy', 'directed', 'Guardians of the Galaxy']
Peter Quill ['is leader of', 'Guardians of the Galaxy', 'wrote', 'Guardians of the Galaxy']
Peter Quill ['is leader of', 'Guardians of the Galaxy', 'sequel to', 'Guardians of the Galaxy']
Quill ['speaks', ' fuck ']
> Extracted relationships: The following are knowledge triplets in max depth 2 in the form of `subject [predicate, object, predicate_next_hop, object_next_hop ...]`
Peter Quill ['is leader of', 'Guardians of the Galaxy', 'released in', '2014']
Peter Quill ['portrays', 'Peter Quill']
Peter Quill ['is leader of', 'Guardians of the Galaxy', 'reprised role from', 'Guardians of the Galaxy']
Peter Quill ['is leader of', 'Guardians of the Galaxy']
Peter Quill ['is leader of', 'Guardians of the Galaxy', 'directed', 'Guardians of the Galaxy']
Peter Quill ['is leader of', 'Guardians of the Galaxy', 'wrote', 'Guardians of the Galaxy']
Peter Quill ['is leader of', 'Guardians of the Galaxy', 'sequel to', 'Guardians of the Galaxy']
Quill ['speaks', ' fuck ']
Peter Quill is the half-human, half-Celestial leader of the Guardians of the Galaxy, a group of alien thieves and smugglers. He was abducted from Earth as a child and raised by the Ravagers. Quill reprised his role from the 2014 film Guardians of the Galaxy, which he also directed and wrote. He is the sequel to the original Guardians of the Galaxy and is known to speak with a foul mouth.

7. Conclusion¶

7.1 Overall Comparison¶

Let's compare their results.

First, check the information covered by the different approaches:

In [43]:
display(
    Markdown(
        llm(f"""
Compare the QA results on "Tell me about Peter Quill.", list the knowledge facts between them, to help evalute them. Output in markdown table.

Result text2GraphQuery: {response_nl2kg}
---
Result Graph: {response_graph_rag}
---
Result Vector: {response_vector_rag}
---
Result Graph+Vector: {response_graph_vector_rag}
---

"""
           )
    )
)
| Knowledge Fact | text2GraphQuery | Graph | Vector | Graph+Vector |
| --- | --- | --- | --- | --- |
| Member of Guardians of the Galaxy | Yes | Yes | No | Yes |
| Leader of Guardians of the Galaxy | No | Yes | No | Yes |
| Portrayed by Chris Pratt | No | Yes | No | Yes |
| Wrote and directed first movie | No | Yes | No | Yes |
| Foul language | No | Yes | No | Yes |
| Half-human, half-Celestial | No | No | Yes | Yes |
| Abducted from Earth as a child | No | No | Yes | Yes |
| Raised by a group of alien thieves and smugglers | No | No | Yes | Yes |
| In a "state of depression" | No | No | Yes | No |

Conclusion

  • The pure KG approaches (both text2GraphQuery and Graph RAG) come with concise results and a much lower cost (for a cost comparison, see our previous result here).
  • The Graph+Vector RAG could be more comprehensive in case the question involves knowledge that is fine-grained and spread across more chunks than top-K searching covers.
| QueryEngine | Knowledge Graph query engine | Graph RAG query engine | Vector RAG query engine | Graph+Vector RAG query engine |
| --- | --- | --- | --- | --- |
| Mechanism | 1. Text-to-GraphQuery based on KG<br>2. Query KG with the result<br>3. Answer synthesis based on query result | 1. Get related entities of the question<br>2. Get n-depth SubGraphs of related entities from KG<br>3. Answer synthesis based on related SubGraphs | 1. Create embedding of question<br>2. Semantic search top-k related doc chunks<br>3. Answer synthesis based on related doc chunks | 1. Do retrieval as Vector and Graph RAG<br>2. Answer synthesis based on both related chunks and SubGraphs |
| Performance | Concise | Concise | Fruitful | Fruitful, could be more comprehensive |
| Cost | Low | Low | High | High |

7.2 Text2GraphQuery vs Graph RAG¶

In Text2GraphQuery, we leverage the LLM to compose a graph query whose RETURN fields try to provide the answer directly. Graph RAG, on the other hand, finds all related knowledge and uses it as context.

So when the answer is, by nature, a small piece of information, Text2GraphQuery could do the job best; in other cases, Graph RAG could be better:

In [31]:
display(
    Markdown(
        llm(f"""
Compare the two QA result on "Tell me about Peter Quill.", list the differences between them, to help evalute them. Output in markdown table.

Result from text-to-Graph: {response_nl2kg}
---
Result from Graph RAG: {response_graph_rag}

"""
           )
    )
)
| Feature | Text-to-Graph | Graph RAG |
| --- | --- | --- |
| Superhero Team | Guardians of the Galaxy | Guardians of the Galaxy |
| Movie | N/A | 2014 and 2017 |
| Role | N/A | Portrayed by Chris Pratt |
| Writing/Directing | N/A | Wrote and Directed the first movie |
| Character Trait | N/A | Foul language, often using the word "fuck" |

We can visualize the knowledge context covered by the two approaches to see their difference:

In [58]:
graph_text2cypher = _
In [55]:
%%ngql
MATCH path_0=(p:`entity`)-[*1..2]-()
  WHERE p.`entity`.`name` == 'Peter Quill' 
RETURN path_0
INFO:nebula3.logger:Get connection to ('127.0.0.1', 9669)
Get connection to ('127.0.0.1', 9669)
Out[55]:
path_0
0 ("Peter Quill" :entity{name: "Peter Quill"})-[...
1 ("Peter Quill" :entity{name: "Peter Quill"})<-...
2 ("Peter Quill" :entity{name: "Peter Quill"})-[...
3 ("Peter Quill" :entity{name: "Peter Quill"})-[...
4 ("Peter Quill" :entity{name: "Peter Quill"})-[...
5 ("Peter Quill" :entity{name: "Peter Quill"})-[...
In [56]:
graph_rag = _
In [71]:
from IPython.display import HTML  # needed for display(HTML(table)) below

text2Cypher = """
<iframe
    src="https://www.siwei.io/demo-dumps/kg-llm/nebulagraph_draw_nl2cypher.html"
    width=450
    height=400>
</iframe>
"""
graphRAG = """
<iframe
    src="https://www.siwei.io/demo-dumps/kg-llm/nebulagraph_draw_rag.html"
    width=450
    height=400>
</iframe>
"""

table = f"""
<table>
    <tr>
        <th>Text2Cypher Traversed knowledge</th>
        <th>Graph Rag Traversed knowledge</th>
    </tr>
    <tr>
        <td>{text2Cypher}</td>
        <td>{graphRAG}</td>
    </tr>
</table>
"""

display(HTML(table))
Text2Cypher Traversed knowledge Graph Rag Traversed knowledge

Conclusion

For tasks where:

  • more related knowledge potentially matters
  • the KG schema is sophisticated enough that text2cypher struggles to express the task
  • the KG quality isn't good enough
  • multiple "starting entities" are involved

Graph RAG could be a better approach to start with.