Instrumentation Overview¶
TruLens is a framework designed to help you instrument and evaluate LLM applications, including RAGs and agents. TruLens instrumentation is OpenTelemetry compatible, allowing you to interoperate with other observability systems.
Note

To turn on OpenTelemetry tracing, set the environment variable TRULENS_OTEL_TRACING to "1".
This instrumentation capability allows you to track the entire execution flow of your app, including inputs, outputs, internal operations, and performance metrics.
Instrumenting Applications with @instrument¶
For applications whose source code you can edit, TruLens provides a framework-agnostic instrument decorator to annotate methods with their span type and attributes. TruLens semantic conventions lay out how to emit spans.
In the example below, you can see how we use TruLens semantic conventions to instrument the span types RETRIEVAL, GENERATION, and RECORD_ROOT.
In the retrieve method, we also associate the query argument with the span attribute RETRIEVAL.QUERY_TEXT, and the method's return value with RETRIEVAL.RETRIEVED_CONTEXTS. We follow a similar process for the query method.
Example
from trulens.core.otel.instrument import instrument
from trulens.otel.semconv.trace import SpanAttributes


class RAG:
    @instrument(
        span_type=SpanAttributes.SpanType.RETRIEVAL,
        attributes={
            SpanAttributes.RETRIEVAL.QUERY_TEXT: "query",
            SpanAttributes.RETRIEVAL.RETRIEVED_CONTEXTS: "return",
        },
    )
    def retrieve(self, query: str) -> list:
        """
        Retrieve relevant text from vector store.
        """

    @instrument(span_type=SpanAttributes.SpanType.GENERATION)
    def generate_completion(self, query: str, context_str: list) -> str:
        """
        Generate answer from context.
        """

    @instrument(
        span_type=SpanAttributes.SpanType.RECORD_ROOT,
        attributes={
            SpanAttributes.RECORD_ROOT.INPUT: "query",
            SpanAttributes.RECORD_ROOT.OUTPUT: "return",
        },
    )
    def query(self, query: str) -> str:
        """
        Retrieve relevant text given a query, and then generate an answer from the context.
        """
Instrumenting Common App Frameworks¶
In cases where you are leveraging frameworks like LangChain or LlamaIndex, TruLens instruments the framework for you. To take advantage of this instrumentation, simply wrap your application with TruChain (read more) for LangChain apps or TruLlama (read more) for LlamaIndex apps.
Example
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from trulens.apps.langchain import TruChain

# filtered_retriever, format_docs, prompt and llm are assumed to be defined earlier.
rag_chain = (
    {"context": filtered_retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

tru_recorder = TruChain(
    rag_chain,
    app_name="ChatApplication",
    app_version="Base",
)

from trulens.apps.llamaindex import TruLlama

# index is assumed to be an existing LlamaIndex index.
query_engine = index.as_query_engine(similarity_top_k=3)

tru_query_engine_recorder = TruLlama(
    query_engine,
    app_name="LlamaIndex_App",
    app_version="base",
)
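Either recorder can then be used as a context manager so that calls made inside the with block are traced; the question strings below are placeholders.

with tru_recorder as recording:
    rag_chain.invoke("What is TruLens?")

with tru_query_engine_recorder as recording:
    query_engine.query("What is TruLens?")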
Instrumenting Input/Output Apps¶
TruBasicApp is a simple interface to capture the input and output of a basic LLM app. Using TruBasicApp requires no direct instrumentation; simply wrap your app with the TruBasicApp class.
Example
from openai import OpenAI
from trulens.apps.basic import TruBasicApp

client = OpenAI()


def chat(prompt):
    return (
        client.chat.completions.create(
            model="gpt-4.1",
            messages=[
                {
                    "role": "system",
                    "content": "You are a helpful assistant.",
                },
                {"role": "user", "content": prompt},
            ],
        )
        .choices[0]
        .message.content
    )


tru_recorder = TruBasicApp(
    chat,
    app_name="base",
)
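To record calls, invoke the wrapped function through the recorder inside a with block; a sketch following the usual TruBasicApp pattern, with a placeholder prompt.

with tru_recorder as recording:
    tru_recorder.app("What is TruLens?")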
Instrumenting apps via instrument_method()¶
In cases where you do not have access to modify the source code of a class directly (e.g. to add decorators for tracking), you can use static instrumentation methods instead: for example, the alternative for making sure a custom retriever gets instrumented is instrument_method. See a usage example below:
Using instrument_method
from trulens.core.otel.instrument import instrument_method
from trulens.otel.semconv.trace import SpanAttributes

from somepackage.custom_retriever import CustomRetriever

instrument_method(
    cls=CustomRetriever,
    method_name="retrieve",
    span_type=SpanAttributes.SpanType.RETRIEVAL,
    attributes={
        SpanAttributes.RETRIEVAL.QUERY_TEXT: "query",
        SpanAttributes.RETRIEVAL.RETRIEVED_CONTEXTS: "return",
    },
)

# ... rest of the custom class follows ...
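After instrument_method patches the class, ordinary calls to the method emit RETRIEVAL spans. A hypothetical usage sketch (CustomRetriever and its constructor arguments are placeholders carried over from the example above):

retriever = CustomRetriever()
# This call now produces a RETRIEVAL span carrying the query text and the
# returned contexts as span attributes.
contexts = retriever.retrieve(query="What is TruLens?")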
Tracking Usage Metrics¶
TruLens tracks the following usage metrics by capturing them from LLM spans.
Usage Metrics¶
- Number of requests (n_requests)
- Number of successful ones (n_successful_requests)
- Number of class scores retrieved (n_classes)
- Total tokens processed (n_tokens)
- In streaming mode, number of chunks produced (n_stream_chunks)
- Number of prompt tokens supplied (n_prompt_tokens)
- Number of completion tokens generated (n_completion_tokens)
- Cost in USD (cost)
Read more about Usage Tracking in Cost API Reference.
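Once records have been captured, the aggregated usage and cost metrics can be inspected from the session. A minimal sketch, assuming a TruSession backed by the default database and the get_leaderboard helper available in recent TruLens releases:

from trulens.core import TruSession

session = TruSession()

# The leaderboard aggregates latency, cost and token counts per app version.
print(session.get_leaderboard())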