LlamaIndex Quickstart with OTel
In this quickstart you will create a simple LlamaIndex app and learn how to log it and get feedback on an LLM response.
You'll also learn how feedback functions can be used as guardrails by filtering retrieved context.
For evaluation, we will leverage the RAG triad of groundedness, context relevance, and answer relevance.
In [ ]:
# !pip install trulens trulens-apps-llamaindex trulens-providers-openai llama_index openai
Add API keys
For this quickstart, you will need an OpenAI API key. The OpenAI key is used for embeddings, completions, and evaluation.
In [ ]:
import os

if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = "sk-proj-..."

os.environ["TRULENS_OTEL_TRACING"] = "1"
Import from TruLens
In [ ]:
from trulens.core import TruSession
session = TruSession()
session.reset_database()
Download data
This example uses the text of Paul Graham's essay, "What I Worked On", and is the canonical LlamaIndex example.
The easiest way to get it is to download it from the LlamaIndex repository and save it in a folder called data. The following code does this for you:
In [ ]:
import os
import urllib.request

url = "https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt"
file_path = "data/paul_graham_essay.txt"

if not os.path.exists("data"):
    os.makedirs("data")

if not os.path.exists(file_path):
    urllib.request.urlretrieve(url, file_path)
Create a Simple LLM Application
This example uses LlamaIndex, which internally uses an OpenAI LLM.
In [ ]:
from llama_index.core import Settings
from llama_index.core import SimpleDirectoryReader
from llama_index.core import VectorStoreIndex
from llama_index.llms.openai import OpenAI

# Small chunks keep each retrieved context narrowly scoped.
Settings.chunk_size = 128
Settings.chunk_overlap = 16
Settings.llm = OpenAI()

# Load the essay, build a vector index, and expose it as a query engine.
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine(similarity_top_k=3)
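Before querying end to end, it can help to peek at what the retriever returns for a question. This optional sanity check is a minimal sketch using standard LlamaIndex calls (as_retriever and retrieve); the sample question is illustrative only.

In [ ]:
# Optional sanity check: inspect the top-k chunks retrieved for a question.
retriever = index.as_retriever(similarity_top_k=3)
for node_with_score in retriever.retrieve("What did the author do growing up?"):
    # Print the similarity score and the first 80 characters of each chunk.
    print(round(node_with_score.score, 3), node_with_score.node.get_content()[:80])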
Send your first request
In [ ]:
response = query_engine.query("What did the author do growing up?")
print(response)
Initialize Feedback Function(s)
In [ ]:
import numpy as np
from trulens.core import Feedback
from trulens.providers.openai import OpenAI

provider = OpenAI(model_engine="gpt-4.1-mini")

# Define a groundedness feedback function
f_groundedness = (
    Feedback(
        provider.groundedness_measure_with_cot_reasons, name="Groundedness"
    )
    .on_context(collect_list=True)
    .on_output()
)

# Question/answer relevance between overall question and answer.
f_answer_relevance = (
    Feedback(provider.relevance_with_cot_reasons, name="Answer Relevance")
    .on_input()
    .on_output()
)

# Context relevance between question and each context chunk.
f_context_relevance = (
    Feedback(
        provider.context_relevance_with_cot_reasons, name="Context Relevance"
    )
    .on_input()
    .on_context(collect_list=False)
    .aggregate(np.mean)  # choose a different aggregation method if you wish
)
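You can also sanity-check a provider method directly before wiring it into the app. The _with_cot_reasons variants return a score between 0 and 1 along with chain-of-thought reasoning; the question and answer below are illustrative only.

In [ ]:
# Call a feedback method directly on a hand-written question/answer pair.
score, reasons = provider.relevance_with_cot_reasons(
    "What did the author do growing up?",
    "Before college, the author wrote short stories and programmed on an IBM 1401.",
)
print(f"Answer relevance: {score:.2f}")
print(reasons)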
Instrument app for logging with TruLens
In [ ]:
from trulens.apps.llamaindex import TruLlama

tru_query_engine_recorder = TruLlama(
    query_engine,
    app_name="LlamaIndex_App",
    app_version="base",
    feedbacks=[f_groundedness, f_answer_relevance, f_context_relevance],
)
In [ ]:
# Run the app inside the recorder's context manager to log the query.
with tru_query_engine_recorder as recording:
    query_engine.query("What did the author do growing up?")
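After the recording completes, you can retrieve the logged record and wait for the feedback scores. recording.get() returns the captured record, and wait_for_feedback_results() blocks until the feedback functions finish; a minimal sketch:

In [ ]:
# Retrieve the record captured by the context manager above.
rec = recording.get()

# Wait for the feedback functions to finish, then print each score.
for feedback, feedback_result in rec.wait_for_feedback_results().items():
    print(feedback.name, feedback_result.result)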
Explore in a Dashboard
In [ ]:
from trulens.dashboard import run_dashboard
run_dashboard(session) # open a local streamlit app to explore
# stop_dashboard(session) # stop if needed
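If the default port is already in use, run_dashboard accepts a port argument; you can also read aggregate feedback scores programmatically via session.get_leaderboard(), which returns a DataFrame. A short sketch, assuming the session created above:

In [ ]:
# Launch the dashboard on an explicit port if the default one is taken.
run_dashboard(session, port=8502)

# Or inspect aggregate feedback scores directly as a DataFrame.
print(session.get_leaderboard())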