To build cognitive-intelligence applications on top of private data, combining LLMs with RAG and Knowledge Graphs is a state-of-the-art approach.
In this demo, we will walk through four query approaches and compare their trade-offs and performance.
QueryEngine | Knowledge Graph query engine | Graph RAG query engine | Vector RAG query engine | Graph Vector RAG query engine |
---|---|---|---|---|
Mechanism | 1. Text-to-GraphQuery based on KG 2. Query KG with the result 3. Answer synthesis based on query result | 1. Get related entities of the question 2. Get n-depth SubGraphs of related entities from KG 3. Answer synthesis based on related SubGraphs | 1. Create embedding of question 2. Semantic search top-k related doc chunks 3. Answer synthesis based on related doc chunks | 1. Do retrieval as Vector and Graph RAG 2. Answer synthesis based on both related chunks and SubGraphs |
The diagram below shows how RAG works:
RAG with Llama Index
┌────┬────┬────┬────┐
│ 1 │ 2 │ 3 │ 4 │
├────┴────┴────┴────┤
│ Docs/Knowledge │
┌───────┐ │ ... │ ┌─────────┐
│ │ ├────┬────┬────┬────┤ │ │
│ │ │ 95 │ 96 │ │ │ │ │
│ │ └────┴────┴────┴────┘ │ │
│ User │─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─▶ LLM │
│ │ │ │
│ │ │ │
└───────┘ ┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┐ └─────────┘
│ ┌──────────────────────────┐ ▲
└────────┼▶│ Tell me ....., please │├───────┘
└──────────────────────────┘
│ ┌────┐ ┌────┐ │
│ 3 │ │ 96 │
│ └────┘ └────┘ │
─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─
In VectorDB-based RAG, we create an embedding for each node (chunk) and, at query time, find the top-k nodes most related to a given question. In the diagram above, nodes 3 and 96 were fetched as the top-k related nodes and used to help answer the user query.
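Conceptually, the vector retrieval step boils down to a nearest-neighbor search over chunk embeddings. Here is a minimal sketch, assuming a hypothetical embed() helper that maps a text to a vector (this is not the LlamaIndex implementation, just the idea):
import numpy as np

def top_k_chunks(question, chunks, embed, k=2):
    """Return the k chunks whose embeddings are most similar to the question."""
    q = np.asarray(embed(question))
    scores = []
    for chunk in chunks:
        c = np.asarray(embed(chunk))
        # cosine similarity between the question and chunk embeddings
        scores.append(float(q @ c / (np.linalg.norm(q) * np.linalg.norm(c))))
    top = np.argsort(scores)[::-1][:k]
    return [chunks[i] for i in top]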
In Graph RAG, we will extract relationships between entities, representing concise facts from each node. It would look something like this:
Node Split and Embedding
┌────┬────┬────┬────┐
│ 1 │ 2 │ 3 │ 4 │
├────┴────┴────┴────┤
│ Docs/Knowledge │
│ ... │
├────┬────┬────┬────┤
│ 95 │ 96 │ │ │
└────┴────┴────┴────┘
Then, if we zoom in on it:
Node Split and Embedding, with Knowledge Graph being extracted
┌──────────────────┬──────────────────┬──────────────────┬──────────────────┐
│ .─. .─. │ .─. .─. │ .─. │ .─. .─. │
│( x )─────▶ y ) │ ( x )─────▶ a ) │ ( j ) │ ( m )◀────( x ) │
│ `▲' `─' │ `─' `─' │ `─' │ `─' `─' │
│ │ 1 │ 2 │ 3 │ │ 4 │
│ .─. │ │ .▼. │ │
│( z )─────────────┼──────────────────┼──────────▶( i )─┐│ │
│ `◀────┐ │ │ `─' ││ │
├───────┼──────────┴──────────────────┴─────────────────┼┴──────────────────┤
│ │ Docs/Knowledge │ │
│ │ ... │ │
│ │ │ │
├───────┼──────────┬──────────────────┬─────────────────┼┬──────────────────┤
│ .─. └──────. │ .─. │ ││ .─. │
│ ( x ◀─────( b ) │ ( x ) │ └┼▶( n ) │
│ `─' `─' │ `─' │ │ `─' │
│ 95 │ │ │ 96 │ │ │ 98 │
│ .▼. │ .▼. │ │ ▼ │
│ ( c ) │ ( d ) │ │ .─. │
│ `─' │ `─' │ │ ( x ) │
└──────────────────┴──────────────────┴──────────────────┴──`─'─────────────┘
Here, knowledge is split at a more granular level and carries higher information density. Optionally, multi-hop relationships such as x -> y or i -> j -> z -> x, spanning many more nodes (chunks) than the K in a top-k search, can be included by the retrievers. We believe there are cases where this additional work matters.
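To make the idea concrete, here is a small illustrative sketch (not the LlamaIndex implementation) of triplets stored as (subject, predicate, object) tuples and a simple n-hop expansion around the entities of a question:
from collections import defaultdict

# toy knowledge graph as (subject, predicate, object) triplets
triplets = [
    ("x", "relates_to", "y"),
    ("i", "points_to", "j"),
    ("j", "points_to", "z"),
    ("z", "points_to", "x"),
]

# adjacency list over both directions so we can walk the graph
neighbors = defaultdict(set)
for s, p, o in triplets:
    neighbors[s].add(o)
    neighbors[o].add(s)

def n_hop_subgraph(entities, depth=2):
    """Collect all entities reachable within `depth` hops of the seed entities."""
    frontier, seen = set(entities), set(entities)
    for _ in range(depth):
        frontier = {n for e in frontier for n in neighbors[e]} - seen
        seen |= frontier
    return seen

print(n_hop_subgraph({"i"}, depth=2))  # e.g. {'i', 'j', 'z'}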
But how exactly does it work, and how well? Let's see in this demo.
# For OpenAI
import os
os.environ["OPENAI_API_KEY"] = "INSERT YOUR KEY"
import logging
import sys
logging.basicConfig(
stream=sys.stdout, level=logging.INFO
) # logging.DEBUG for more verbose output
from llama_index import (
KnowledgeGraphIndex,
ServiceContext,
SimpleDirectoryReader,
)
from llama_index.storage.storage_context import StorageContext
from llama_index.graph_stores import NebulaGraphStore
from llama_index.llms import OpenAI
from IPython.display import Markdown, display
# define LLM
# NOTE: at the time of demo, text-davinci-002 did not have rate-limit errors
llm = OpenAI(temperature=0, model="text-davinci-002")
service_context = ServiceContext.from_defaults(llm=llm, chunk_size=512)
INFO:numexpr.utils:NumExpr defaulting to 8 threads.
# For Azure OpenAI
import os
import json
import openai
from langchain.embeddings import OpenAIEmbeddings
from llama_index.llms import AzureOpenAI
from llama_index import LangchainEmbedding
from llama_index import (
VectorStoreIndex,
SimpleDirectoryReader,
KnowledgeGraphIndex,
ServiceContext,
)
from llama_index import set_global_service_context
from llama_index.storage.storage_context import StorageContext
from llama_index.graph_stores import NebulaGraphStore
import logging
import sys
from IPython.display import Markdown, display
logging.basicConfig(
stream=sys.stdout, level=logging.INFO
) # logging.DEBUG for more verbose output
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))
openai.api_type = "azure"
openai.api_base = "https://<foo-bar>.openai.azure.com"
openai.api_version = "2022-12-01"
os.environ["OPENAI_API_KEY"] = "youcannottellanyone"
openai.api_key = os.getenv("OPENAI_API_KEY")
llm = AzureOpenAI(
engine="<foo-bar-deployment>",
temperature=0,
openai_api_version=openai.api_version,
model_kwargs={
"api_key": openai.api_key,
"api_base": openai.api_base,
"api_type": openai.api_type,
"api_version": openai.api_version,
},
)
# You need to deploy your own embedding model as well as your own chat completion model
embedding_llm = LangchainEmbedding(
OpenAIEmbeddings(
model="text-embedding-ada-002",
deployment="<foo-bar-deployment>",
openai_api_key=openai.api_key,
openai_api_base=openai.api_base,
openai_api_type=openai.api_type,
openai_api_version=openai.api_version,
),
embed_batch_size=1,
)
service_context = ServiceContext.from_defaults(
llm=llm,
embed_model=embedding_llm,
)
set_global_service_context(service_context)
❗Access NebulaGraph Console to create space and graph schema
CREATE SPACE guardians(vid_type=FIXED_STRING(256), partition_num=1, replica_factor=1);
:sleep 10;
USE guardians;
CREATE TAG entity(name string);
CREATE EDGE relationship(relationship string);
:sleep 10;
CREATE TAG INDEX entity_index ON entity(name(256));
%pip install nebula3-python ipython-ngql
os.environ['NEBULA_USER'] = "root"
os.environ['NEBULA_PASSWORD'] = "nebula"  # default password
os.environ['NEBULA_ADDRESS'] = "127.0.0.1:9669"  # assuming NebulaGraph is installed locally
space_name = "guardians"
edge_types, rel_prop_names = ["relationship"], ["relationship"]  # defaults; can be omitted when creating from an empty KG
tags = ["entity"]  # default; can be omitted when creating from an empty KG
graph_store = NebulaGraphStore(
space_name=space_name,
edge_types=edge_types,
rel_prop_names=rel_prop_names,
tags=tags,
)
storage_context = StorageContext.from_defaults(graph_store=graph_store)
Requirement already satisfied: nebula3-python in /opt/homebrew/lib/python3.11/site-packages/nebula3_python-3.4.0-py3.11.egg (3.4.0) Requirement already satisfied: ipython-ngql in /opt/homebrew/lib/python3.11/site-packages/ipython_ngql-0.5-py3.11.egg (0.5) Requirement already satisfied: httplib2>=0.20.0 in /opt/homebrew/lib/python3.11/site-packages/httplib2-0.22.0-py3.11.egg (from nebula3-python) (0.22.0) Requirement already satisfied: future>=0.18.0 in /opt/homebrew/lib/python3.11/site-packages/future-0.18.3-py3.11.egg (from nebula3-python) (0.18.3) Requirement already satisfied: six>=1.16.0 in /opt/homebrew/lib/python3.11/site-packages (from nebula3-python) (1.16.0) Requirement already satisfied: pytz>=2021.1 in /opt/homebrew/lib/python3.11/site-packages (from nebula3-python) (2022.7.1) Requirement already satisfied: Jinja2 in /opt/homebrew/lib/python3.11/site-packages/Jinja2-3.1.2-py3.11.egg (from ipython-ngql) (3.1.2) Requirement already satisfied: pandas in /opt/homebrew/lib/python3.11/site-packages (from ipython-ngql) (1.5.3) Requirement already satisfied: pyparsing!=3.0.0,!=3.0.1,!=3.0.2,!=3.0.3,<4,>=2.4.2 in /opt/homebrew/lib/python3.11/site-packages/pyparsing-3.1.0b1-py3.11.egg (from httplib2>=0.20.0->nebula3-python) (3.1.0b1) Requirement already satisfied: MarkupSafe>=2.0 in /opt/homebrew/lib/python3.11/site-packages/MarkupSafe-2.1.2-py3.11-macosx-13-arm64.egg (from Jinja2->ipython-ngql) (2.1.2) Requirement already satisfied: python-dateutil>=2.8.1 in /opt/homebrew/lib/python3.11/site-packages (from pandas->ipython-ngql) (2.8.2) Requirement already satisfied: numpy>=1.21.0 in /opt/homebrew/lib/python3.11/site-packages (from pandas->ipython-ngql) (1.24.0) Note: you may need to restart the kernel to use updated packages.
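As an optional sanity check before building the index, we can confirm the space and schema are reachable. This is a sketch using nebula3-python directly (outside of LlamaIndex), assuming the local connection details set above:
from nebula3.gclient.net import ConnectionPool
from nebula3.Config import Config

# connect to the local NebulaGraph instance configured above
pool = ConnectionPool()
pool.init([("127.0.0.1", 9669)], Config())
with pool.session_context("root", "nebula") as session:
    session.execute("USE guardians;")
    result = session.execute("SHOW TAGS;")
    assert result.is_succeeded(), result.error_msg()
    print(result)
pool.close()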
In our demo, the Knowledge Graph is created with the LLM. We simply leverage the KnowledgeGraphIndex from LlamaIndex: when it is created, triplets are extracted with the LLM and eventually persisted into the NebulaGraphStore.
We will download and preprocess data from: https://en.wikipedia.org/wiki/Guardians_of_the_Galaxy_Vol._3
from llama_index import download_loader
WikipediaReader = download_loader("WikipediaReader")
loader = WikipediaReader()
documents = loader.load_data(pages=['Guardians of the Galaxy Vol. 3'], auto_suggest=False)
This call will take some time; it extracts entities and relationships and stores them into NebulaGraph.
kg_index = KnowledgeGraphIndex.from_documents(
documents,
storage_context=storage_context,
service_context=service_context,
max_triplets_per_chunk=10,
space_name=space_name,
edge_types=edge_types,
rel_prop_names=rel_prop_names,
tags=tags,
include_embeddings=True,
)
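Optionally, we can take a quick look at what was extracted. Assuming this version of LlamaIndex exposes get_networkx_graph() on the KG index and pyvis is installed, a rough visualization sketch could look like:
from pyvis.network import Network

# pull the extracted triplets back as a NetworkX graph and render it
g = kg_index.get_networkx_graph()
net = Network(notebook=True, cdn_resources="in_line", directed=True)
net.from_nx(g)
net.show("guardians_kg.html")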
To compare with, and work together with, VectorDB-based RAG, let's also create a VectorStoreIndex.
During its creation, the same data source is split into chunks and embeddings are created for them; at RAG query time, the top-k related chunks are found by vector-searching with the embedding of the question.
RAG with Llama Index
┌────┬────┬────┬────┐
│ 1 │ 2 │ 3 │ 4 │
├────┴────┴────┴────┤
│ Docs/Knowledge │
┌───────┐ │ ... │ ┌─────────┐
│ │ ├────┬────┬────┬────┤ │ │
│ │ │ 95 │ 96 │ │ │ │ │
│ │ └────┴────┴────┴────┘ │ │
│ User │─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─▶ LLM │
│ │ │ │
│ │ │ │
└───────┘ ┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┐ └─────────┘
│ ┌──────────────────────────┐ ▲
└────────┼▶│ Tell me ....., please │├───────┘
└──────────────────────────┘
│ ┌────┐ ┌────┐ │
│ 3 │ │ 96 │
│ └────┘ └────┘ │
─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─
In LlamaIndex, this can be done with one line of code:
vector_index = VectorStoreIndex.from_documents(
documents,
service_context=service_context
)
Both the KnowledgeGraphIndex and VectorStoreIndex only need to be created once; afterwards, we can persist their in-memory storage context to disk and reload them anytime.
# persist KG Index(Only MetaData will be persisted, KG is in NebulaGraph)
kg_index.storage_context.persist(persist_dir='./storage_graph')
# persist Vector Index
vector_index.storage_context.persist(persist_dir='./storage_vector')
Then the files are created:
$ ls -l ./storage_*
storage_graph:
total 6384
-rw-r--r--@ 1 weyl staff 44008 Jul 14 11:06 docstore.json
-rw-r--r--@ 1 weyl staff 3219385 Jul 14 11:06 index_store.json
-rw-r--r--@ 1 weyl staff 51 Jul 14 11:06 vector_store.json
storage_vector:
total 712
-rw-r--r--@ 1 weyl staff 44008 Jul 14 11:06 docstore.json
-rw-r--r--@ 1 weyl staff 18 Jul 14 11:06 graph_store.json
-rw-r--r--@ 1 weyl staff 1003 Jul 14 11:06 index_store.json
-rw-r--r--@ 1 weyl staff 311028 Jul 14 11:06 vector_store.json
So we can restore the indices from disk like this:
from llama_index import load_index_from_storage
storage_context = StorageContext.from_defaults(persist_dir='./storage_graph', graph_store=graph_store)
kg_index = load_index_from_storage(
storage_context=storage_context,
service_context=service_context,
max_triplets_per_chunk=10,
space_name=space_name,
edge_types=edge_types,
rel_prop_names=rel_prop_names,
tags=tags,
include_embeddings=True,
)
storage_context_vector = StorageContext.from_defaults(persist_dir='./storage_vector')
vector_index = load_index_from_storage(
service_context=service_context,
storage_context=storage_context_vector
)
INFO:llama_index.indices.loading:Loading all indices. Loading all indices. INFO:llama_index.indices.loading:Loading all indices. Loading all indices.
We will run 4 types of query approaches with the LLM, KG, and VectorDB:
QueryEngine | Knowledge Graph query engine | Graph RAG query engine | Vector RAG query engine | Graph Vector RAG query engine |
---|---|---|---|---|
Mechanism | 1. Text-to-GraphQuery based on KG 2. Query KG with the result 3. Answer synthesis based on query result | 1. Get related entities of the question 2. Get n-depth SubGraphs of related entities from KG 3. Answer synthesis based on related SubGraphs | 1. Create embedding of question 2. Semantic search top-k related doc chunks 3. Answer synthesis based on related doc chunks | 1. Do retrieval as Vector and Graph RAG 2. Answer synthesis based on both related chunks and SubGraphs |
Text-to-NebulaGraphCypher approach: translate the task/question into a graph Cypher query, run it against the KG, and answer based on the query result.
from llama_index.query_engine import KnowledgeGraphQueryEngine
from llama_index.storage.storage_context import StorageContext
from llama_index.graph_stores import NebulaGraphStore
nl2kg_query_engine = KnowledgeGraphQueryEngine(
storage_context=storage_context,
service_context=service_context,
llm=llm,
verbose=True,
)
Graph RAG takes SubGraphs related to entities of the task/question as Context.
Graph RAG with Llama Index
┌────┬────┬────┬────┐
│ 1 │ 2 │ 3 │ 4 │
├────┴────┴────┴────┤
│ Docs/Knowledge │
┌───────┐ │ ... │ ┌─────────┐
│ │ ├────┬────┬────┬────┤ │ │
│ │ │ 95 │ 96 │ │ │ │ │
│ │ └────┴────┴────┴────┘ │ │
│ User │─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─▶ LLM │
│ │ │ │
│ │ │ │
└───────┘ ┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┐ └─────────┘
│ ┌──────────────────────────┐ ▲
└────────┼▶│ Tell me about x, please │├───────┘
└──────────────────────────┘
│ Below are knowledge about x │
x->y<-z,x->h->i, m<-n,...
│ Please answer based on them │
─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─
kg_rag_query_engine = kg_index.as_query_engine(
include_text=False,
retriever_mode="keyword",
response_mode="tree_summarize",
)
Vector RAG is the common approach: find the top-k semantically related doc chunks and use them as context to synthesize the answer.
vector_rag_query_engine = vector_index.as_query_engine()
This is a combined Graph + Vector RAG, where we retrieve both VectorDB chunks and KG SubGraphs as the context for synthesizing the answer.
Graph + Vector RAG with Llama Index
┌────┬────┬────┬────┐
│ 1 │ 2 │ 3 │ 4 │
├────┴────┴────┴────┤
│ Docs/Knowledge │
┌───────┐ │ ... │ ┌─────────┐
│ │ ├────┬────┬────┬────┤ │ │
│ │ │ 95 │ 96 │ │ │ │ │
│ │ └────┴────┴────┴────┘ │ │
│ User │─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─▶ LLM │
│ │ │ │
│ │ │ │
└───────┘ ┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┐ └─────────┘
│ ┌──────────────────────────┐ ▲
└────────┼▶│ Tell me ....., please │├───────┘
└──────────────────────────┘
│ ┌────┐┌────┐ │
│ 3 ││ 96 │ x->y<-z,x->h...
│ └────┘└────┘ │
─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─
To implement that in LlamaIndex, we create a CustomRetriever to combine the two:
# import QueryBundle
from llama_index import QueryBundle
# import NodeWithScore
from llama_index.schema import NodeWithScore
# Retrievers
from llama_index.retrievers import BaseRetriever, VectorIndexRetriever, KGTableRetriever
from typing import List
class CustomRetriever(BaseRetriever):
"""Custom retriever that performs both Vector search and Knowledge Graph search"""
def __init__(
self,
vector_retriever: VectorIndexRetriever,
kg_retriever: KGTableRetriever,
mode: str = "OR",
) -> None:
"""Init params."""
self._vector_retriever = vector_retriever
self._kg_retriever = kg_retriever
if mode not in ("AND", "OR"):
raise ValueError("Invalid mode.")
self._mode = mode
def _retrieve(self, query_bundle: QueryBundle) -> List[NodeWithScore]:
"""Retrieve nodes given query."""
vector_nodes = self._vector_retriever.retrieve(query_bundle)
kg_nodes = self._kg_retriever.retrieve(query_bundle)
vector_ids = {n.node.node_id for n in vector_nodes}
kg_ids = {n.node.node_id for n in kg_nodes}
combined_dict = {n.node.node_id: n for n in vector_nodes}
combined_dict.update({n.node.node_id: n for n in kg_nodes})
if self._mode == "AND":
retrieve_ids = vector_ids.intersection(kg_ids)
else:
retrieve_ids = vector_ids.union(kg_ids)
retrieve_nodes = [combined_dict[rid] for rid in retrieve_ids]
return retrieve_nodes
Next, we will create instances of the Vector and KG retrievers, which will be used in the instantiation of the Custom Retriever.
from llama_index import get_response_synthesizer
from llama_index.query_engine import RetrieverQueryEngine
# create custom retriever
vector_retriever = VectorIndexRetriever(index=vector_index)
kg_retriever = KGTableRetriever(
index=kg_index, retriever_mode="keyword", include_text=False
)
custom_retriever = CustomRetriever(vector_retriever, kg_retriever)
# create response synthesizer
response_synthesizer = get_response_synthesizer(
service_context=service_context,
response_mode="tree_summarize",
)
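By default the custom retriever uses "OR" mode, taking the union of vector-retrieved chunks and KG-retrieved nodes. If stricter grounding is preferred, the class defined above also supports an "AND" mode that keeps only the nodes returned by both retrievers, for example:
# stricter variant: only keep nodes found by BOTH the vector and KG retrievers
strict_retriever = CustomRetriever(vector_retriever, kg_retriever, mode="AND")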
And the query engine:
graph_vector_rag_query_engine = RetrieverQueryEngine(
retriever=custom_retriever,
response_synthesizer=response_synthesizer,
)
response_nl2kg = nl2kg_query_engine.query("Tell me about Peter Quill.")
display(Markdown(f"<b>{response_nl2kg}</b>"))
# Cypher:
print("Cypher Query:")
graph_query = nl2kg_query_engine.generate_query(
"Tell me about Peter Quill?",
)
graph_query = graph_query.replace("WHERE", "\n WHERE").replace("RETURN", "\nRETURN")
display(
Markdown(
f"""
```cypher
{graph_query}
```
"""
)
)
INFO:llama_index.query_engine.knowledge_graph_query_engine:Graph Store Query: MATCH (p:`entity`)-[:relationship]->(e:`entity`) WHERE p.`entity`.`name` == 'Peter Quill' RETURN e.`entity`.`name`;
INFO:llama_index.query_engine.knowledge_graph_query_engine:Graph Store Response: {'e.entity.name': ['Guardians of the Galaxy']}
Final Response: Peter Quill is a member of the superhero team known as the Guardians of the Galaxy.
Cypher Query:
MATCH (p:`entity`)-[:relationship]->(e:`entity`)
WHERE p.`entity`.`name` == 'Peter Quill'
RETURN e.`entity`.`name`;
response_graph_rag = kg_rag_query_engine.query("Tell me about Peter Quill.")
display(Markdown(f"<b>{response_graph_rag}</b>"))
INFO:llama_index.indices.knowledge_graph.retriever:> Starting query: Tell me about Peter Quill.
INFO:llama_index.indices.knowledge_graph.retriever:> Query keywords: ['biography', 'Peter Quill', 'history', 'Peter', 'Quill']
INFO:llama_index.indices.knowledge_graph.retriever:> Extracted relationships: The following are knowledge triplets in max depth 2 in the form of `subject [predicate, object, predicate_next_hop, object_next_hop ...]`
Peter Quill ['is leader of', 'Guardians of the Galaxy', 'released in', '2014']
Peter Quill ['portrays', 'Peter Quill']
Peter Quill ['is leader of', 'Guardians of the Galaxy', 'reprised role from', 'Guardians of the Galaxy']
Peter Quill ['is leader of', 'Guardians of the Galaxy']
Peter Quill ['is leader of', 'Guardians of the Galaxy', 'directed', 'Guardians of the Galaxy']
Peter Quill ['is leader of', 'Guardians of the Galaxy', 'wrote', 'Guardians of the Galaxy']
Peter Quill ['is leader of', 'Guardians of the Galaxy', 'sequel to', 'Guardians of the Galaxy']
Quill ['speaks', ' fuck ']
response_vector_rag = vector_rag_query_engine.query("Tell me about Peter Quill.")
display(Markdown(f"<b>{response_vector_rag}</b>"))
display(
Markdown(
llm(f"""
Compare the two QA results on "Tell me about Peter Quill.", list the differences between them, to help evaluate them. Output in markdown table.
Result from Graph: {response_graph_rag}
---
Result from Vector: {response_vector_rag}
"""
)
)
)
Result | Graph | Vector |
---|---|---|
Character | Leader of the Guardians of the Galaxy | Half-human, half-Celestial leader of the Guardians of the Galaxy |
Movie | Portrayed by Chris Pratt in the 2014 movie of the same name, and reprised his role in the 2017 sequel. He also wrote and directed the first movie. | In the film, Quill is in a "state of depression" following the appearance of a variant of his dead lover Gamora, who does not share the same affection for Quill as her older version had for him, which in turn affects his leadership of the Guardians. |
Personality | Known for his foul language, often using the word "fuck". | No mention |
response_graph_vector_rag = graph_vector_rag_query_engine.query("Tell me about Peter Quill.")
display(Markdown(f"<b>{response_graph_vector_rag}</b>"))
INFO:llama_index.indices.knowledge_graph.retriever:> Starting query: Tell me about Peter Quill.
INFO:llama_index.indices.knowledge_graph.retriever:> Query keywords: ['biography', 'Peter Quill', 'history', 'Peter', 'Quill']
INFO:llama_index.indices.knowledge_graph.retriever:> Extracted relationships: The following are knowledge triplets in max depth 2 in the form of `subject [predicate, object, predicate_next_hop, object_next_hop ...]`
Peter Quill ['is leader of', 'Guardians of the Galaxy', 'released in', '2014']
Peter Quill ['portrays', 'Peter Quill']
Peter Quill ['is leader of', 'Guardians of the Galaxy', 'reprised role from', 'Guardians of the Galaxy']
Peter Quill ['is leader of', 'Guardians of the Galaxy']
Peter Quill ['is leader of', 'Guardians of the Galaxy', 'directed', 'Guardians of the Galaxy']
Peter Quill ['is leader of', 'Guardians of the Galaxy', 'wrote', 'Guardians of the Galaxy']
Peter Quill ['is leader of', 'Guardians of the Galaxy', 'sequel to', 'Guardians of the Galaxy']
Quill ['speaks', ' fuck ']
Let's compare their results.
First, check the information that was covered by the different approaches:
display(
Markdown(
llm(f"""
Compare the QA results on "Tell me about Peter Quill.", list the knowledge facts between them, to help evaluate them. Output in markdown table.
Result text2GraphQuery: {response_nl2kg}
---
Result Graph: {response_graph_rag}
---
Result Vector: {response_vector_rag}
---
Result Graph+Vector: {response_graph_vector_rag}
---
"""
)
)
)
Knowledge Fact | text2GraphQuery | Graph | Vector | Graph+Vector |
---|---|---|---|---|
Member of Guardians of the Galaxy | Yes | Yes | No | Yes |
Leader of Guardians of the Galaxy | No | Yes | No | Yes |
Portrayed by Chris Pratt | No | Yes | No | Yes |
Wrote and directed first movie | No | Yes | No | Yes |
Foul language | No | Yes | No | Yes |
Half-human, half-Celestial | No | No | Yes | Yes |
Abducted from Earth as a child | No | No | Yes | Yes |
Raised by a group of alien thieves and smugglers | No | No | Yes | Yes |
In a "state of depression" | No | No | Yes | No |
Conclusion
QueryEngine | Knowledge Graph query engine | Graph RAG query engine | Vector RAG query engine | Graph Vector RAG query engine |
---|---|---|---|---|
Mechanism | 1. Text-to-GraphQuery based on KG 2. Query KG with the result 3. Answer synthesis based on query result | 1. Get related entities of the question 2. Get n-depth SubGraphs of related entities from KG 3. Answer synthesis based on related SubGraphs | 1. Create embedding of question 2. Semantic search top-k related doc chunks 3. Answer synthesis based on related doc chunks | 1. Do retrieval as Vector and Graph RAG 2. Answer synthesis based on both related chunks and SubGraphs |
Performance | Concise | Concise | Fruitful | Fruitful, could be more comprehensive |
Cost | Low | Low | High | High |
In Text2GraphQuery, we leverage the LLM to compose a graph query that tries to provide the answer in its RETURN fields. Graph RAG, on the other hand, finds all related knowledge to use as context.
So when the answer is by nature a small, precise piece of information, Text2GraphQuery can do the job best; in other cases, Graph RAG could be better (see the quick illustration after the comparison below):
display(
Markdown(
llm(f"""
Compare the two QA results on "Tell me about Peter Quill.", list the differences between them, to help evaluate them. Output in markdown table.
Result from text-to-Graph: {response_nl2kg}
---
Result from Graph RAG: {response_graph_rag}
"""
)
)
)
Feature | Text-to-Graph | Graph RAG |
---|---|---|
Superhero Team | Guardians of the Galaxy | Guardians of the Galaxy |
Movie | N/A | 2014 and 2017 |
Role | N/A | Portrayed by Chris Pratt |
Writing/Directing | N/A | Wrote and Directed the first movie |
Character Trait | N/A | Foul language, often using the word "fuck" |
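As a quick illustration of this trade-off (results will vary with the extracted KG), a narrowly scoped, fact-lookup style question is the kind of case where Text2GraphQuery tends to shine, while a broad question benefits from the wider Graph RAG context:
# narrow fact lookup vs. broad question, against the two engines built above
narrow_q = "Who is the leader of the Guardians of the Galaxy?"
broad_q = "Tell me about Peter Quill."
print(nl2kg_query_engine.query(narrow_q))
print(kg_rag_query_engine.query(broad_q))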
We can visualize the knowledge context that was covered by the two approaches to see their difference:
graph_text2cypher = _  # `_` holds the previous cell's output
%%ngql
MATCH path_0=(p:`entity`)-[*1..2]-()
WHERE p.`entity`.`name` == 'Peter Quill'
RETURN path_0
INFO:nebula3.logger:Get connection to ('127.0.0.1', 9669)
path_0 | |
---|---|
0 | ("Peter Quill" :entity{name: "Peter Quill"})-[... |
1 | ("Peter Quill" :entity{name: "Peter Quill"})<-... |
2 | ("Peter Quill" :entity{name: "Peter Quill"})-[... |
3 | ("Peter Quill" :entity{name: "Peter Quill"})-[... |
4 | ("Peter Quill" :entity{name: "Peter Quill"})-[... |
5 | ("Peter Quill" :entity{name: "Peter Quill"})-[... |
graph_rag = _  # `_` holds the previous cell's output (the subgraph query result)
text2Cypher = """
<iframe
src="https://www.siwei.io/demo-dumps/kg-llm/nebulagraph_draw_nl2cypher.html"
width=450
height=400>
</iframe>
"""
graphRAG = """
<iframe
src="https://www.siwei.io/demo-dumps/kg-llm/nebulagraph_draw_rag.html"
width=450
height=400>
</iframe>
"""
table = f"""
<table>
<tr>
<th>Text2Cypher Traversed knowledge</th>
<th>Graph Rag Traversed knowledge</th>
</tr>
<tr>
<td>{text2Cypher}</td>
<td>{graphRAG}</td>
</tr>
</table>
"""
from IPython.display import HTML
display(HTML(table))
Text2Cypher Traversed knowledge | Graph Rag Traversed knowledge |
---|---|
Conclusion
For tasks like these, Graph RAG could be a better approach to start with.