📓 🦙 LlamaIndex Integration¶
TruLens provides TruLlama, a deep integration with LlamaIndex to allow you to inspect and evaluate the internals of your application built using LlamaIndex. This is done through the instrumentation of key LlamaIndex classes and methods. To see all classes and methods instrumented, see Appendix: LlamaIndex Instrumented Classes and Methods.
In addition to the default instrumentation, TruLlama exposes the select_context and select_source_nodes methods for evaluations that require access to retrieved context or source nodes. These methods remove the need to know the JSON structure of your app ahead of time, and make your evaluations reusable across different apps.
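Conceptually, a selector resolves to a JSON-path-like lens into the recorded app trace. The sketch below is illustrative only (the record shape and `resolve` helper are made up, not TruLens internals); it shows the kind of path that select_context spares you from writing by hand.

```python
# Illustrative sketch: a miniature "lens" that resolves a dotted path
# (with [:] meaning "every element of a list") against a recorded trace.
# TruLens' real Select/Lens machinery is richer; this only shows the idea.

def resolve(record, path):
    """Resolve a path like 'query.rets.source_nodes[:].node.text'."""
    parts = path.replace("[:]", ".[:]").split(".")
    results = [record]
    for part in parts:
        if part == "[:]":
            # Fan out over every element of each list seen so far.
            results = [item for lst in results for item in lst]
        else:
            results = [r[part] for r in results]
    return results

# A toy record shaped loosely like a retrieval trace.
record = {
    "query": {
        "rets": {
            "source_nodes": [
                {"node": {"text": "chunk one"}},
                {"node": {"text": "chunk two"}},
            ]
        }
    }
}

print(resolve(record, "query.rets.source_nodes[:].node.text"))
```

Writing such paths by hand is brittle and framework-specific; select_context computes the right one for your app.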
Example usage¶
Below is a quick example of usage. First, we'll create a standard LlamaIndex query engine from Paul Graham's essay, What I Worked On.
from llama_index.core import VectorStoreIndex
from llama_index.readers.web import SimpleWebPageReader
documents = SimpleWebPageReader(html_to_text=True).load_data(
["http://paulgraham.com/worked.html"]
)
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
To instrument a LlamaIndex query engine, all that's required is to wrap it using TruLlama.
from trulens_eval import TruLlama
tru_query_engine_recorder = TruLlama(query_engine)
with tru_query_engine_recorder as recording:
print(query_engine.query("What did the author do growing up?"))
🦑 Tru initialized with db url sqlite:///default.sqlite . 🛑 Secret keys may be written to the database. See the `database_redact_keys` option of Tru` to prevent this. The author, growing up, worked on writing short stories and programming.
To properly evaluate LLM apps we often need to point our evaluation at an internal step of our application, such as the retrieved context. Doing so allows us to evaluate for metrics including context relevance and groundedness.
For LlamaIndex applications where the source nodes are used, select_context can be used to access the retrieved text for evaluation.
from trulens_eval.feedback.provider import OpenAI
from trulens_eval.feedback import Feedback
import numpy as np
provider = OpenAI()
context = TruLlama.select_context(query_engine)
f_context_relevance = (
Feedback(provider.context_relevance)
.on_input()
.on(context)
.aggregate(np.mean)
)
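The .on(context) selector typically yields one score per retrieved chunk, and aggregate(np.mean) collapses those into a single record-level score. A toy illustration in plain Python, with made-up scores:

```python
# Toy illustration: a feedback like context_relevance is scored once per
# retrieved chunk, then aggregated into one number per record.
from statistics import fmean

per_chunk_scores = [0.9, 0.4, 0.8]  # hypothetical relevance of each chunk
record_score = fmean(per_chunk_scores)  # what aggregate(np.mean) produces
print(record_score)
```

In a full app you would attach the feedback to the recorder, e.g. by passing `feedbacks=[f_context_relevance]` when constructing TruLlama.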
For added flexibility, the select_context method is also available through trulens_eval.app.App. This allows you to switch between frameworks without changing your context selector:
from trulens_eval.app import App
context = App.select_context(query_engine)
You can find the full quickstart here: LlamaIndex Quickstart
Async Support¶
TruLlama also provides async support for LlamaIndex through the aquery, achat, and astream_chat methods. This allows you to track and evaluate async applications.
As an example, below is a LlamaIndex async chat engine (achat).
# Imports main tools:
from trulens_eval import TruLlama, Tru
tru = Tru()
from llama_index.core import VectorStoreIndex
from llama_index.readers.web import SimpleWebPageReader
documents = SimpleWebPageReader(html_to_text=True).load_data(
["http://paulgraham.com/worked.html"]
)
index = VectorStoreIndex.from_documents(documents)
chat_engine = index.as_chat_engine()
To instrument a LlamaIndex achat engine, all that's required is to wrap it using TruLlama - just like with the query engine.
tru_chat_recorder = TruLlama(chat_engine)
with tru_chat_recorder as recording:
llm_response_async = await chat_engine.achat("What did the author do growing up?")
print(llm_response_async)
A new object of type ChatMemoryBuffer at 0x2bf581210 is calling an instrumented method put. The path of this call may be incorrect. Guessing path of new object is app.memory based on other object (0x2bf5e5050) using this function. Could not determine main output from None. Could not determine main output from None. Could not determine main output from None. Could not determine main output from None.
The author worked on writing short stories and programming while growing up.
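The key pattern above is an async call awaited inside the recorder's with block. It can be mimicked without any LlamaIndex dependency; the recorder and chat engine below are illustrative stand-ins, not TruLens code:

```python
import asyncio
from contextlib import contextmanager

calls = []  # what our pretend instrumentation captures

@contextmanager
def recorder():
    """Stand-in for TruLlama-as-context-manager: exposes captured calls."""
    yield calls

async def achat(message):
    """Stand-in for an async chat engine method."""
    await asyncio.sleep(0)   # simulate async I/O
    calls.append(message)    # "instrumented" side effect
    return f"response to: {message}"

async def main():
    # Same shape as the real usage: enter the recorder, await the call.
    with recorder() as recording:
        reply = await achat("What did the author do growing up?")
    return reply, recording

reply, recording = asyncio.run(main())
print(reply)
print(recording)
```

The recorder captures the call because instrumentation hooks fire during the awaited method, even though the with statement itself is synchronous.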
Streaming Support¶
TruLlama also provides streaming support for LlamaIndex. This allows you to track and evaluate streaming applications.
As an example, below is an LlamaIndex query engine with streaming.
from llama_index.core import VectorStoreIndex
from llama_index.readers.web import SimpleWebPageReader
from trulens_eval import TruLlama
documents = SimpleWebPageReader(html_to_text=True).load_data(
["http://paulgraham.com/worked.html"]
)
index = VectorStoreIndex.from_documents(documents)
chat_engine = index.as_chat_engine(streaming=True)
Just like with the other methods, wrap your streaming chat engine with TruLlama and operate like before.
You can also print the response tokens as they are generated using the response_gen attribute.
tru_chat_engine_recorder = TruLlama(chat_engine)
with tru_chat_engine_recorder as recording:
response = chat_engine.stream_chat("What did the author do growing up?")
for c in response.response_gen:
print(c)
A new object of type ChatMemoryBuffer at 0x2c1df9950 is calling an instrumented method put. The path of this call may be incorrect. Guessing path of new object is app.memory based on other object (0x2c08b04f0) using this function. Could not find usage information in openai response: <openai.Stream object at 0x2bf5f3ed0> Could not find usage information in openai response: <openai.Stream object at 0x2bf5f3ed0>
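The response_gen loop can be sketched in isolation: it is an ordinary generator of tokens, which you can print incrementally while also assembling the full response. The token stream below is made up:

```python
def token_stream():
    # Hypothetical token generator standing in for response.response_gen.
    for tok in ["The ", "author ", "wrote ", "short ", "stories."]:
        yield tok

pieces = []
for c in token_stream():
    print(c, end="")      # print tokens as they arrive
    pieces.append(c)
full_response = "".join(pieces)
```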
For more usage examples, check out the LlamaIndex examples directory.
Appendix: LlamaIndex Instrumented Classes and Methods¶
The modules, classes, and methods that trulens instruments can be retrieved from the appropriate Instrument subclass.
from trulens_eval.tru_llama import LlamaInstrument
LlamaInstrument().print_instrumentation()
Module langchain* Class langchain.agents.agent.BaseMultiActionAgent Method plan: (self, intermediate_steps: 'List[Tuple[AgentAction, str]]', callbacks: 'Callbacks' = None, **kwargs: 'Any') -> 'Union[List[AgentAction], AgentFinish]' Method aplan: (self, intermediate_steps: 'List[Tuple[AgentAction, str]]', callbacks: 'Callbacks' = None, **kwargs: 'Any') -> 'Union[List[AgentAction], AgentFinish]' Class langchain.agents.agent.BaseSingleActionAgent Method plan: (self, intermediate_steps: 'List[Tuple[AgentAction, str]]', callbacks: 'Callbacks' = None, **kwargs: 'Any') -> 'Union[AgentAction, AgentFinish]' Method aplan: (self, intermediate_steps: 'List[Tuple[AgentAction, str]]', callbacks: 'Callbacks' = None, **kwargs: 'Any') -> 'Union[AgentAction, AgentFinish]' Class langchain.chains.base.Chain Method invoke: (self, input: Dict[str, Any], config: Optional[langchain_core.runnables.config.RunnableConfig] = None, **kwargs: Any) -> Dict[str, Any] Method ainvoke: (self, input: Dict[str, Any], config: Optional[langchain_core.runnables.config.RunnableConfig] = None, **kwargs: Any) -> Dict[str, Any] Method run: (self, *args: Any, callbacks: Union[List[langchain_core.callbacks.base.BaseCallbackHandler], langchain_core.callbacks.base.BaseCallbackManager, NoneType] = None, tags: Optional[List[str]] = None, metadata: Optional[Dict[str, Any]] = None, **kwargs: Any) -> Any Method arun: (self, *args: Any, callbacks: Union[List[langchain_core.callbacks.base.BaseCallbackHandler], langchain_core.callbacks.base.BaseCallbackManager, NoneType] = None, tags: Optional[List[str]] = None, metadata: Optional[Dict[str, Any]] = None, **kwargs: Any) -> Any Method _call: (self, inputs: Dict[str, Any], run_manager: Optional[langchain_core.callbacks.manager.CallbackManagerForChainRun] = None) -> Dict[str, Any] Method _acall: (self, inputs: Dict[str, Any], run_manager: Optional[langchain_core.callbacks.manager.AsyncCallbackManagerForChainRun] = None) -> Dict[str, Any] Class 
langchain.memory.chat_memory.BaseChatMemory Method save_context: (self, inputs: Dict[str, Any], outputs: Dict[str, str]) -> None Method clear: (self) -> None Class langchain_core.chat_history.BaseChatMessageHistory Class langchain_core.documents.base.Document Class langchain_core.language_models.base.BaseLanguageModel Class langchain_core.language_models.llms.BaseLLM Class langchain_core.load.serializable.Serializable Class langchain_core.memory.BaseMemory Method save_context: (self, inputs: 'Dict[str, Any]', outputs: 'Dict[str, str]') -> 'None' Method clear: (self) -> 'None' Class langchain_core.prompts.base.BasePromptTemplate Class langchain_core.retrievers.BaseRetriever Method _get_relevant_documents: (self, query: 'str', *, run_manager: 'CallbackManagerForRetrieverRun') -> 'List[Document]' Method get_relevant_documents: (self, query: 'str', *, callbacks: 'Callbacks' = None, tags: 'Optional[List[str]]' = None, metadata: 'Optional[Dict[str, Any]]' = None, run_name: 'Optional[str]' = None, **kwargs: 'Any') -> 'List[Document]' Method aget_relevant_documents: (self, query: 'str', *, callbacks: 'Callbacks' = None, tags: 'Optional[List[str]]' = None, metadata: 'Optional[Dict[str, Any]]' = None, run_name: 'Optional[str]' = None, **kwargs: 'Any') -> 'List[Document]' Method _aget_relevant_documents: (self, query: 'str', *, run_manager: 'AsyncCallbackManagerForRetrieverRun') -> 'List[Document]' Class langchain_core.runnables.base.RunnableSerializable Class langchain_core.tools.BaseTool Method _arun: (self, *args: 'Any', **kwargs: 'Any') -> 'Any' Method _run: (self, *args: 'Any', **kwargs: 'Any') -> 'Any' Module llama_hub.* Module llama_index.* Class llama_index.core.base.base_query_engine.BaseQueryEngine Method query: (self, str_or_query_bundle: Union[str, llama_index.core.schema.QueryBundle]) -> Union[llama_index.core.base.response.schema.Response, llama_index.core.base.response.schema.StreamingResponse, llama_index.core.base.response.schema.PydanticResponse] Method 
aquery: (self, str_or_query_bundle: Union[str, llama_index.core.schema.QueryBundle]) -> Union[llama_index.core.base.response.schema.Response, llama_index.core.base.response.schema.StreamingResponse, llama_index.core.base.response.schema.PydanticResponse] Method retrieve: (self, query_bundle: llama_index.core.schema.QueryBundle) -> List[llama_index.core.schema.NodeWithScore] Method synthesize: (self, query_bundle: llama_index.core.schema.QueryBundle, nodes: List[llama_index.core.schema.NodeWithScore], additional_source_nodes: Optional[Sequence[llama_index.core.schema.NodeWithScore]] = None) -> Union[llama_index.core.base.response.schema.Response, llama_index.core.base.response.schema.StreamingResponse, llama_index.core.base.response.schema.PydanticResponse] Class llama_index.core.base.base_query_engine.QueryEngineComponent Method _run_component: (self, **kwargs: Any) -> Any Class llama_index.core.base.base_retriever.BaseRetriever Method retrieve: (self, str_or_query_bundle: Union[str, llama_index.core.schema.QueryBundle]) -> List[llama_index.core.schema.NodeWithScore] Method _retrieve: (self, query_bundle: llama_index.core.schema.QueryBundle) -> List[llama_index.core.schema.NodeWithScore] Method _aretrieve: (self, query_bundle: llama_index.core.schema.QueryBundle) -> List[llama_index.core.schema.NodeWithScore] Class llama_index.core.base.embeddings.base.BaseEmbedding Class llama_index.core.base.llms.types.LLMMetadata Class llama_index.core.chat_engine.types.BaseChatEngine Method chat: (self, message: str, chat_history: Optional[List[llama_index.core.base.llms.types.ChatMessage]] = None) -> Union[llama_index.core.chat_engine.types.AgentChatResponse, llama_index.core.chat_engine.types.StreamingAgentChatResponse] Method achat: (self, message: str, chat_history: Optional[List[llama_index.core.base.llms.types.ChatMessage]] = None) -> Union[llama_index.core.chat_engine.types.AgentChatResponse, llama_index.core.chat_engine.types.StreamingAgentChatResponse] Method 
stream_chat: (self, message: str, chat_history: Optional[List[llama_index.core.base.llms.types.ChatMessage]] = None) -> llama_index.core.chat_engine.types.StreamingAgentChatResponse Class llama_index.core.indices.base.BaseIndex Class llama_index.core.indices.prompt_helper.PromptHelper Class llama_index.core.memory.types.BaseMemory Method put: (self, message: llama_index.core.base.llms.types.ChatMessage) -> None Class llama_index.core.node_parser.interface.NodeParser Class llama_index.core.postprocessor.types.BaseNodePostprocessor Method _postprocess_nodes: (self, nodes: List[llama_index.core.schema.NodeWithScore], query_bundle: Optional[llama_index.core.schema.QueryBundle] = None) -> List[llama_index.core.schema.NodeWithScore] Class llama_index.core.question_gen.types.BaseQuestionGenerator Class llama_index.core.response_synthesizers.base.BaseSynthesizer Class llama_index.core.response_synthesizers.refine.Refine Method get_response: (self, query_str: str, text_chunks: Sequence[str], prev_response: Union[pydantic.v1.main.BaseModel, str, Generator[str, NoneType, NoneType], NoneType] = None, **response_kwargs: Any) -> Union[pydantic.v1.main.BaseModel, str, Generator[str, NoneType, NoneType]] Class llama_index.core.schema.BaseComponent Class llama_index.core.tools.types.BaseTool Method __call__: (self, input: Any) -> llama_index.core.tools.types.ToolOutput Class llama_index.core.tools.types.ToolMetadata Class llama_index.core.vector_stores.types.VectorStore Class llama_index.legacy.llm_predictor.base.BaseLLMPredictor Method predict: (self, prompt: llama_index.legacy.prompts.base.BasePromptTemplate, **prompt_args: Any) -> str Class llama_index.legacy.llm_predictor.base.LLMPredictor Method predict: (self, prompt: llama_index.legacy.prompts.base.BasePromptTemplate, output_cls: Optional[pydantic.v1.main.BaseModel] = None, **prompt_args: Any) -> str Module trulens_eval.* Class trulens_eval.feedback.feedback.Feedback Method __call__: (self, *args, **kwargs) -> 'Any' Class 
trulens_eval.utils.imports.llama_index.core.llms.base.BaseLLM WARNING: this class could not be imported. It may have been (re)moved. Error: > No module named 'llama_index.core.llms.base' Class trulens_eval.utils.langchain.WithFeedbackFilterDocuments Method _get_relevant_documents: (self, query: str, *, run_manager) -> List[langchain_core.documents.base.Document] Method get_relevant_documents: (self, query: 'str', *, callbacks: 'Callbacks' = None, tags: 'Optional[List[str]]' = None, metadata: 'Optional[Dict[str, Any]]' = None, run_name: 'Optional[str]' = None, **kwargs: 'Any') -> 'List[Document]' Method aget_relevant_documents: (self, query: 'str', *, callbacks: 'Callbacks' = None, tags: 'Optional[List[str]]' = None, metadata: 'Optional[Dict[str, Any]]' = None, run_name: 'Optional[str]' = None, **kwargs: 'Any') -> 'List[Document]' Method _aget_relevant_documents: (self, query: 'str', *, run_manager: 'AsyncCallbackManagerForRetrieverRun') -> 'List[Document]' Class trulens_eval.utils.llama.WithFeedbackFilterNodes WARNING: this class could not be imported. It may have been (re)moved. Error: > No module named 'llama_index.indices.vector_store' Class trulens_eval.utils.python.EmptyType
Instrumenting other classes/methods¶
Additional classes and methods can be instrumented by use of the trulens_eval.instruments.Instrument methods and decorators. Examples of such usage can be found in the custom app used in the custom_example.ipynb notebook, which can be found in trulens_eval/examples/expositional/end2end_apps/custom_app/custom_app.py. More information about these decorators can be found in the docs/trulens_eval/tracking/instrumentation/index.ipynb notebook.
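The decorator approach can be sketched with a plain-Python stand-in: a wrapper that records each call to a method, which is roughly what instrumentation does. The `instrument` below is a toy (the real trulens_eval decorators also capture arguments, timing, and nest calls into a record tree):

```python
import functools

call_log = []  # toy stand-in for a record of instrumented calls

def instrument(fn):
    """Toy instrumentation decorator: log each call and its result."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        result = fn(*args, **kwargs)
        call_log.append((fn.__qualname__, result))
        return result
    return wrapper

class CustomRetriever:
    @instrument
    def retrieve(self, query: str) -> list:
        # A trivial custom retrieval step.
        return [f"chunk for {query!r}"]

CustomRetriever().retrieve("growing up")
print(call_log)
```

Decorating at the class-method level like this is why instrumenting a new framework mostly amounts to listing which classes and methods matter.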
Inspecting instrumentation¶
The specific objects (of the above classes) and methods instrumented for a
particular app can be inspected using the App.print_instrumented
as
exemplified in the next cell. Unlike Instrument.print_instrumentation
, this
function only shows what in an app was actually instrumented.
tru_chat_engine_recorder.print_instrumented()
Components: TruLlama (Other) at 0x2bf5d5d10 with path __app__ OpenAIAgent (Other) at 0x2bf535a10 with path __app__.app ChatMemoryBuffer (Other) at 0x2bf537210 with path __app__.app.memory SimpleChatStore (Other) at 0x2be6ef710 with path __app__.app.memory.chat_store Methods: Object at 0x2bf537210: <function ChatMemoryBuffer.put at 0x2b14c19e0> with path __app__.app.memory <function BaseMemory.put at 0x2b1448f40> with path __app__.app.memory Object at 0x2bf535a10: <function BaseQueryEngine.query at 0x2b137dc60> with path __app__.app <function BaseQueryEngine.aquery at 0x2b137e2a0> with path __app__.app <function AgentRunner.chat at 0x2bf5aa160> with path __app__.app <function AgentRunner.achat at 0x2bf5aa2a0> with path __app__.app <function AgentRunner.stream_chat at 0x2bf5aa340> with path __app__.app <function BaseQueryEngine.retrieve at 0x2b137e340> with path __app__.app <function BaseQueryEngine.synthesize at 0x2b137e3e0> with path __app__.app <function BaseChatEngine.chat at 0x2b1529f80> with path __app__.app <function BaseChatEngine.achat at 0x2b152a0c0> with path __app__.app <function BaseAgent.stream_chat at 0x2beb437e0> with path __app__.app <function BaseChatEngine.stream_chat at 0x2b152a020> with path __app__.app Object at 0x2c1df9950: <function ChatMemoryBuffer.put at 0x2b14c19e0> with path __app__.app.memory