Query Planning in LlamaIndex¶
Query planning uses an LLM to restructure a user input into multiple queries, run either sequentially or in parallel, before answering the question. This method improves the response by decomposing the question into smaller, more answerable questions.
Sub-question queries are one such method. Sub-question queries decompose the user input into multiple different sub-questions. This is great for answering complex questions that require knowledge from different documents.
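The decompose, answer, and combine flow described above can be sketched in plain Python. This is only an illustration of the idea, not how SubQuestionQueryEngine is implemented: the names `SubQuestion`, `answer_with_sub_questions`, `toy_decompose`, `toy_tools`, and `toy_synthesize` are hypothetical, and the stand-in functions return canned strings where a real engine would call an LLM and a retriever.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SubQuestion:
    """One decomposed question, routed to a named tool."""

    tool_name: str
    text: str


def answer_with_sub_questions(
    question: str,
    decompose: Callable[[str], List[SubQuestion]],
    tools: Dict[str, Callable[[str], str]],
    synthesize: Callable[[str, List[str]], str],
) -> str:
    """Decompose the question, answer each part with its tool, then combine."""
    sub_questions = decompose(question)
    partial_answers = [tools[sq.tool_name](sq.text) for sq in sub_questions]
    return synthesize(question, partial_answers)


# Hypothetical stand-ins: a real engine would use an LLM for these two steps.
def toy_decompose(question: str) -> List[SubQuestion]:
    return [
        SubQuestion("alice", "Who is the White Rabbit?"),
        SubQuestion("alice", "Who is the Queen of Hearts?"),
    ]


def toy_synthesize(question: str, answers: List[str]) -> str:
    return " ".join(answers)


# Hypothetical tool: a real tool would query an index over the documents.
toy_tools = {"alice": lambda q: f"[retrieved answer to: {q}]"}

result = answer_with_sub_questions(
    "Compare the White Rabbit and the Queen of Hearts.",
    toy_decompose,
    toy_tools,
    toy_synthesize,
)
print(result)
```

Each sub-question names the tool that should answer it, which is what lets a single user question draw on several documents.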
This style of application also has many configuration choices, such as the embedding model and the query engine type. In this example, we'll iterate through several of these choices and evaluate each with TruLens.
Import from LlamaIndex and TruLens¶
# !pip install trulens trulens-apps-llamaindex trulens-providers-openai llama_index==0.10.11
from llama_index.core import ServiceContext
from llama_index.core import VectorStoreIndex
from llama_index.core.query_engine import SubQuestionQueryEngine
from llama_index.core.tools import QueryEngineTool
from llama_index.core.tools import ToolMetadata
from llama_index.readers.web import SimpleWebPageReader
from trulens.core import Feedback
from trulens.core import TruSession
from trulens.apps.llamaindex import TruLlama
session = TruSession()
# NOTE: This is ONLY necessary in a Jupyter notebook.
# Details: Jupyter runs an event loop behind the scenes.
# This results in nested event loops when we start an event loop to make async queries.
# This is normally not allowed, so we use nest_asyncio to allow it for convenience.
import nest_asyncio
nest_asyncio.apply()
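To see why nested event loops are a problem, the following standard-library sketch simulates a notebook cell that tries to start a second loop while one is already running. The function names `fetch_answer` and `notebook_cell` are hypothetical stand-ins, not part of LlamaIndex or TruLens.

```python
import asyncio


async def fetch_answer() -> str:
    """Hypothetical stand-in for an async query."""
    return "answer"


async def notebook_cell() -> str:
    """Simulates code running inside an already-active event loop, as in Jupyter."""
    coro = fetch_answer()
    try:
        # Attempt to start a second event loop from inside the running one.
        asyncio.run(coro)
    except RuntimeError as exc:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return str(exc)
    return "no error"


# The outer asyncio.run plays the role of Jupyter's own event loop.
message = asyncio.run(notebook_cell())
print(message)
```

Without patching, the inner call fails with a RuntimeError; nest_asyncio patches asyncio so that such nested calls are permitted.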
Set keys¶
For this example we need an OpenAI API key.
import os
os.environ["OPENAI_API_KEY"] = "..."
Set up evaluation¶
Here we'll use agreement with GPT-4 as our evaluation metric.
from trulens.providers.openai import OpenAI
openai = OpenAI()
model_agreement = Feedback(openai.model_agreement).on_input_output()
Run the dashboard¶
By starting the dashboard ahead of time, we can watch as the evaluations get logged. This is especially useful for longer-running applications.
from trulens.dashboard import run_dashboard
run_dashboard(session)
Load Data¶
# load data
documents = SimpleWebPageReader(html_to_text=True).load_data(
    ["https://www.gutenberg.org/files/11/11-h/11-h.htm"]
)
Set configuration space¶
# iterate through embeddings and query engine types, evaluating each response's agreement with GPT-4 using TruLens
embeddings = ["text-embedding-ada-001", "text-embedding-ada-002"]
query_engine_types = ["VectorStoreIndex", "SubQuestionQueryEngine"]
chunk_size = 512  # chunk size, held fixed at 512 in this example
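As a quick check on the size of this configuration space, the sketch below enumerates every combination and the app name each one would be logged under, following the `{query_engine_type}_{embedding}` naming pattern used later in this example. The variable names `embedding_names`, `engine_names`, and `app_names` are illustrative only.

```python
from itertools import product

# Same values as the configuration lists above, under illustrative names.
embedding_names = ["text-embedding-ada-001", "text-embedding-ada-002"]
engine_names = ["VectorStoreIndex", "SubQuestionQueryEngine"]

# Embeddings vary in the outer position and engine types in the inner one,
# mirroring a nested loop over embeddings and then query engine types.
app_names = [
    f"{engine}_{embedding}"
    for embedding, engine in product(embedding_names, engine_names)
]

for name in app_names:
    print(name)
```

Two embeddings and two query engine types give four configurations, each evaluated on every test prompt.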
Set test prompts¶
# set test prompts
prompts = [
    "Describe Alice's growth from meeting the White Rabbit to challenging the Queen of Hearts?",
    "Relate aspects of enchantment to the nostalgia that Alice experiences in Wonderland. Why is Alice both fascinated and frustrated by her encounters below-ground?",
    "Describe the White Rabbit's function in Alice.",
    "Describe some of the ways that Carroll achieves humor at Alice's expense.",
    "Compare the Duchess' lullaby to the 'You Are Old, Father William' verse",
    "Compare the sentiment of the Mouse's long tale, the Mock Turtle's story and the Lobster-Quadrille.",
    "Summarize the role of the mad hatter in Alice's journey",
    "How does the Mad Hatter influence the arc of the story throughout?",
]
Iterate through configuration space¶
for embedding in embeddings:
    for query_engine_type in query_engine_types:
        # build index and query engine
        index = VectorStoreIndex.from_documents(documents)
        # create embedding-based query engine from index
        query_engine = index.as_query_engine(embed_model=embedding)
        if query_engine_type == "SubQuestionQueryEngine":
            service_context = ServiceContext.from_defaults(chunk_size=512)
            # setup base query engine as tool
            query_engine_tools = [
                QueryEngineTool(
                    query_engine=query_engine,
                    metadata=ToolMetadata(
                        name="Alice in Wonderland",
                        description="THE MILLENNIUM FULCRUM EDITION 3.0",
                    ),
                )
            ]
            query_engine = SubQuestionQueryEngine.from_defaults(
                query_engine_tools=query_engine_tools,
                service_context=service_context,
            )
        tru_query_engine_recorder = TruLlama(
            app_name=f"{query_engine_type}_{embedding}",
            app=query_engine,
            feedbacks=[model_agreement],
        )
        # tru_query_engine_recorder as context manager
        with tru_query_engine_recorder as recording:
            for prompt in prompts:
                query_engine.query(prompt)