Tru Virtual

trulens_eval.tru_virtual.VirtualRecord

Bases: Record

The TruVirtual module facilitates the ingestion and evaluation of application logs that were generated outside of TruLens. It allows for the creation of a virtual representation of your application, enabling the evaluation of logged data within the TruLens framework.

To begin, construct a virtual application representation. This can be achieved through a simple dictionary or by utilizing the VirtualApp class, which allows for a more structured approach to storing application information relevant for feedback evaluation.

Constructing a Virtual Application

virtual_app = {
    'llm': {'modelname': 'some llm component model name'},
    'template': 'information about the template used in the app',
    'debug': 'optional fields for additional debugging information'
}
# Converting the dictionary to a VirtualApp instance
from trulens_eval import Select
from trulens_eval.tru_virtual import VirtualApp

virtual_app = VirtualApp(virtual_app)
virtual_app[Select.RecordCalls.llm.maxtokens] = 1024

Incorporate components into the virtual app for evaluation by utilizing the Select class. This approach allows for the reuse of setup configurations when defining feedback functions.

Incorporating Components into the Virtual App

# Setting up a virtual app with a retriever component
from trulens_eval import Select
retriever_component = Select.RecordCalls.retriever
virtual_app[retriever_component] = 'this is the retriever component'
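
Conceptually, a selector such as Select.RecordCalls.llm.maxtokens names a nested location in the app dictionary. The sketch below illustrates that idea in plain Python only; `set_by_path` and the list-of-keys path are hypothetical stand-ins for illustration, not the library's implementation.

```python
from typing import Any, Dict, Sequence


def set_by_path(app: Dict[str, Any], path: Sequence[str], value: Any) -> None:
    """Set `value` at the nested location named by `path`, creating dicts as needed."""
    node = app
    for key in path[:-1]:
        # Descend into (or create) each intermediate component.
        node = node.setdefault(key, {})
    node[path[-1]] = value


# Plain-dict stand-in for a virtual app (hypothetical, for illustration only).
app_sketch = {"llm": {"modelname": "some llm component model name"}}
set_by_path(app_sketch, ["llm", "maxtokens"], 1024)
set_by_path(app_sketch, ["retriever"], "this is the retriever component")
```

After these calls, `app_sketch` holds `maxtokens` next to `modelname` under `llm`, plus a top-level `retriever` entry.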

With your virtual app configured, it's ready to store logged data. VirtualRecord offers a structured way to build records from your data for ingestion into TruLens, distinguishing itself from direct Record creation by specifying calls through selectors.

Below is an example of adding records for a context retrieval component, emphasizing that only the data intended for tracking or evaluation needs to be provided.

Adding Records for a Context Retrieval Component

from trulens_eval.tru_virtual import VirtualRecord

# Selector for the context retrieval component's `get_context` call
context_call = retriever_component.get_context

# Creating virtual records
rec1 = VirtualRecord(
    main_input='Where is Germany?',
    main_output='Germany is in Europe',
    calls={
        context_call: {
            'args': ['Where is Germany?'],
            'rets': ['Germany is a country located in Europe.']
        }
    }
)
rec2 = VirtualRecord(
    main_input='Where is Germany?',
    main_output='Poland is in Europe',
    calls={
        context_call: {
            'args': ['Where is Germany?'],
            'rets': ['Poland is a country located in Europe.']
        }
    }
)

data = [rec1, rec2]

For existing datasets, such as a dataframe of prompts, contexts, and responses, iterate through the dataframe to create virtual records for each entry.

Creating Virtual Records from a DataFrame

import pandas as pd

# Example dataframe
data = {
    'prompt': ['Where is Germany?', 'What is the capital of France?'],
    'response': ['Germany is in Europe', 'The capital of France is Paris'],
    'context': [
        'Germany is a country located in Europe.',
        'France is a country in Europe and its capital is Paris.'
    ]
}
df = pd.DataFrame(data)

# Ingesting data from the dataframe into virtual records
data_dict = df.to_dict('records')
data = []

for record in data_dict:
    rec = VirtualRecord(
        main_input=record['prompt'],
        main_output=record['response'],
        calls={
            context_call: {
                'args': [record['prompt']],
                'rets': [record['context']]
            }
        }
    )
    data.append(rec)
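
If the logs live in a CSV file rather than a dataframe, the same per-row mapping can be prepared with the standard library alone. This is a minimal sketch assuming the same three column names as above; `rows_to_record_kwargs` is a hypothetical helper, and the string `"retriever.get_context"` merely stands in for the `context_call` selector. Each resulting dictionary has the shape of the keyword arguments passed to VirtualRecord in the loop above.

```python
import csv
import io
from typing import Any, Dict, List

# Hypothetical string stand-in for the `context_call` selector.
CONTEXT_CALL = "retriever.get_context"


def rows_to_record_kwargs(csv_text: str) -> List[Dict[str, Any]]:
    """Turn CSV rows into dictionaries shaped like VirtualRecord keyword arguments."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {
            "main_input": row["prompt"],
            "main_output": row["response"],
            "calls": {
                CONTEXT_CALL: {
                    "args": [row["prompt"]],
                    "rets": [row["context"]],
                }
            },
        }
        for row in reader
    ]


csv_text = (
    "prompt,response,context\n"
    "Where is Germany?,Germany is in Europe,Germany is a country located in Europe.\n"
)
record_kwargs = rows_to_record_kwargs(csv_text)
```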

After constructing the virtual records, feedback functions can be developed in the same manner as with non-virtual applications, using the newly added context_call selector for reference.

Developing Feedback Functions

from trulens_eval.feedback.provider import OpenAI
from trulens_eval.feedback.feedback import Feedback

# Initializing the feedback provider
openai = OpenAI()

# Defining the context for feedback using the virtual `get_context` call
context = context_call.rets[:]

# Creating a feedback function for context relevance
f_context_relevance = Feedback(openai.qs_relevance).on_input().on(context)

These feedback functions are then integrated into TruVirtual to construct the recorder, which can handle most configurations applicable to non-virtual apps.

Integrating Feedback Functions into TruVirtual

from trulens_eval.tru_virtual import TruVirtual

# Setting up the virtual recorder
virtual_recorder = TruVirtual(
    app_id='a virtual app',
    app=virtual_app,
    feedbacks=[f_context_relevance]
)

To process the records and run any feedback functions associated with the recorder, use the add_record method.

Logging records and running feedback functions

# Ingesting records into the virtual recorder
for record in data:
    virtual_recorder.add_record(record)

Metadata about your application can also be included in the VirtualApp for evaluation purposes, offering a flexible way to store additional information about the components of an LLM app.

Storing metadata in a VirtualApp

# Example of storing metadata in a VirtualApp
virtual_app = {
    'llm': {'modelname': 'some llm component model name'},
    'template': 'information about the template used in the app',
    'debug': 'optional debugging information'
}

from trulens_eval.schema import Select
from trulens_eval.tru_virtual import VirtualApp

virtual_app = VirtualApp(virtual_app)
virtual_app[Select.RecordCalls.llm.maxtokens] = 1024

This approach is particularly beneficial for evaluating the components of an LLM app.

Evaluating components of an LLM application

# Adding a retriever component to the virtual app
retriever_component = Select.RecordCalls.retriever
virtual_app[retriever_component] = 'this is the retriever component'

Functions

__init__

__init__(calls: Dict[Lens, Union[Dict, Sequence[Dict]]], cost: Optional[Cost] = None, perf: Optional[Perf] = None, **kwargs: dict)

Create a record for a virtual app.

Many arguments are filled in with default values if not provided. See Record for all arguments. Only the arguments that are required by this method or filled in with default values are listed here.

PARAMETER DESCRIPTION
calls

A dictionary of calls to be recorded. The keys are selectors and the values are dictionaries with the keys listed in the next section.

TYPE: Dict[Lens, Union[Dict, Sequence[Dict]]]

cost

Defaults to zero cost.

TYPE: Optional[Cost] DEFAULT: None

perf

Defaults to time spanning the processing of this virtual record. Note that individual calls also include perf. Time span is extended to make sure it is not of duration zero.

TYPE: Optional[Perf] DEFAULT: None
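
The note about extending the time span can be pictured with a small sketch. The helper name `ensure_nonzero_span` and the one-microsecond minimum are assumptions made for illustration; the source only states that the span is extended so that its duration is not zero.

```python
import datetime
from typing import Tuple


def ensure_nonzero_span(
    start: datetime.datetime,
    end: datetime.datetime,
    min_delta: datetime.timedelta = datetime.timedelta(microseconds=1),
) -> Tuple[datetime.datetime, datetime.datetime]:
    """Return (start, end), pushing `end` forward if the span would be empty."""
    if end <= start:
        end = start + min_delta
    return start, end


# A virtual record processed "instantly" still gets a positive duration.
moment = datetime.datetime(2024, 1, 1, 12, 0, 0)
span_start, span_end = ensure_nonzero_span(moment, moment)
```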

Call values are dictionaries containing arguments to the RecordAppCall constructor. Values can also be lists of such dictionaries; in non-virtual apps this happens when the same method is called multiple times in a single app invocation. The following defaults are used if not provided.

PARAMETER TYPE DEFAULT
stack List[RecordAppCallMethod] Two frames: a root call followed by a call by virtual_object, method name derived from the last element of the selector of this call.
args JSON []
rets JSON []
perf Perf Time spanning the processing of this virtual call.
pid int 0
tid int 0
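
The way a call value may be a single dictionary or a list of dictionaries, each completed with defaults, can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `normalize_calls` is a hypothetical helper, selectors are represented by plain strings, and only the simple defaults from the table (`args`, `rets`, `pid`, `tid`) are filled in; `stack` and `perf` are omitted.

```python
import copy
from typing import Any, Dict, List, Sequence, Union

# Simple defaults from the table above; `stack` and `perf` are left out of this sketch.
CALL_DEFAULTS: Dict[str, Any] = {"args": [], "rets": [], "pid": 0, "tid": 0}

CallValue = Union[Dict[str, Any], Sequence[Dict[str, Any]]]


def normalize_calls(calls: Dict[str, CallValue]) -> Dict[str, List[Dict[str, Any]]]:
    """Wrap single call dicts in a list and fill in missing fields with defaults."""
    normalized: Dict[str, List[Dict[str, Any]]] = {}
    for selector, value in calls.items():
        call_list = [value] if isinstance(value, dict) else list(value)
        normalized[selector] = [
            # Deep-copy the defaults so calls never share the same list objects.
            {**copy.deepcopy(CALL_DEFAULTS), **call}
            for call in call_list
        ]
    return normalized


normalized = normalize_calls(
    {
        "retriever.get_context": {"args": ["Where is Germany?"]},
        "llm.generate": [{"rets": ["first"]}, {"rets": ["second"]}],
    }
)
```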

trulens_eval.tru_virtual.VirtualApp

Bases: dict

A dictionary meant to represent the components of a virtual app.

TruVirtual will refer to this class as the wrapped app. All calls will be under VirtualApp.root.

Functions

root

root()

All virtual calls will have this on top of the stack as if their app was called using this as the main/root method.

trulens_eval.tru_virtual.TruVirtual

Bases: App

Recorder for virtual apps.

Virtual apps are data-only: they cannot be executed, but previously computed results can be added to them using add_record. The VirtualRecord class may be useful for creating such records. Fields used by non-virtual apps can also be specified here.

See App and AppDefinition for constructor arguments.

The app field.

You can store any information you would like by passing in a dictionary to TruVirtual in the app field. This may involve an index of components or versions, or anything else. You can refer to these values for evaluating feedback.

Usage

You can use VirtualApp to create the app structure or a plain dictionary. Using VirtualApp lets you use Selectors to define components:

virtual_app = VirtualApp()
virtual_app[Select.RecordCalls.llm.maxtokens] = 1024

Example

virtual_app = dict(
    llm=dict(
        modelname="some llm component model name"
    ),
    template="information about the template I used in my app",
    debug="all of these fields are completely optional"
)

virtual = TruVirtual(
    app_id="my_virtual_app",
    app=virtual_app
)

Attributes

app_id instance-attribute

app_id: AppID = app_id

Unique identifier for this app.

tags instance-attribute

tags: Tags = tags

Tags for the app.

metadata instance-attribute

metadata: Metadata = metadata

Metadata for the app.

feedback_definitions class-attribute instance-attribute

feedback_definitions: Sequence[FeedbackDefinition] = []

Feedback functions to evaluate on each record.

feedback_mode class-attribute instance-attribute

feedback_mode: FeedbackMode = WITH_APP_THREAD

How to evaluate feedback functions upon producing a record.

initial_app_loader_dump class-attribute instance-attribute

initial_app_loader_dump: Optional[SerialBytes] = None

Serialization of a function that loads an app.

Dump is of the initial app state before any invocations. This can be used to create a new session.

Warning

Experimental work in progress.

app_extra_json instance-attribute

app_extra_json: JSON

Info to store about the app and to display in dashboard.

This can be used even if app itself cannot be serialized. app_extra_json, then, can stand in place for whatever data the user might want to keep track of about the app.

feedbacks class-attribute instance-attribute

feedbacks: List[Feedback] = Field(exclude=True, default_factory=list)

Feedback functions to evaluate on each record.

tru class-attribute instance-attribute

tru: Optional[Tru] = Field(default=None, exclude=True)

Workspace manager.

If this is not provided, a singleton Tru will be made (if not already) and used.

db class-attribute instance-attribute

db: Optional[DB] = Field(default=None, exclude=True)

Database interface.

If this is not provided, a singleton SQLAlchemyDB will be made (if not already) and used.

recording_contexts class-attribute instance-attribute

recording_contexts: ContextVar[RecordingContext] = Field(None, exclude=True)

Sequences of records produced by this class when used as a context manager are stored in a RecordingContext.

Using a context var so that context managers can be nested.

instrumented_methods class-attribute instance-attribute

instrumented_methods: Dict[int, Dict[Callable, Lens]] = Field(exclude=True, default_factory=dict)

Mapping of instrumented methods (by id(.) of owner object and the function) to their path in this app.

records_with_pending_feedback_results class-attribute instance-attribute

records_with_pending_feedback_results: Queue[Record] = Field(exclude=True, default_factory=lambda: Queue(maxsize=1024))

Records produced by this app which might have yet to finish feedback runs.

manage_pending_feedback_results_thread class-attribute instance-attribute

manage_pending_feedback_results_thread: Optional[Thread] = Field(exclude=True, default=None)

Thread for manager of pending feedback results queue.

See _manage_pending_feedback_results.

selector_check_warning class-attribute instance-attribute

selector_check_warning: bool = False

Selector checking is disabled for virtual apps.

selector_nocheck class-attribute instance-attribute

selector_nocheck: bool = True

The selector check must be disabled for virtual apps.

This is because methods that could be called are not known in advance of creating virtual records.

Functions

on_method_instrumented

on_method_instrumented(obj: object, func: Callable, path: Lens)

Called by instrumentation system for every function requested to be instrumented by this app.

get_method_path

get_method_path(obj: object, func: Callable) -> Lens

Get the path of the instrumented function method relative to this app.

get_methods_for_func

get_methods_for_func(func: Callable) -> Iterable[Tuple[int, Callable, Lens]]

Get the methods (rather the inner functions) matching the given func and the path of each.

See WithInstrumentCallbacks.get_methods_for_func.
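
The instrumented_methods mapping and the three methods above fit together roughly as in the following sketch. It is a simplified illustration rather than the library's code: paths are plain strings instead of Lens objects, the functions are module-level instead of methods, and the `Retriever` class is hypothetical.

```python
from typing import Callable, Dict, Iterable, Tuple

# id(owner object) -> {function -> path of that method within the app}
instrumented_methods: Dict[int, Dict[Callable, str]] = {}


def on_method_instrumented(obj: object, func: Callable, path: str) -> None:
    """Remember where `func`, owned by `obj`, lives in the app."""
    instrumented_methods.setdefault(id(obj), {})[func] = path


def get_method_path(obj: object, func: Callable) -> str:
    """Look up the path recorded for `func` on this particular owner object."""
    return instrumented_methods[id(obj)][func]


def get_methods_for_func(func: Callable) -> Iterable[Tuple[int, Callable, str]]:
    """Yield every (owner id, function, path) entry that matches `func`."""
    for obj_id, funcs in instrumented_methods.items():
        for candidate, path in funcs.items():
            if candidate is func:
                yield obj_id, candidate, path


class Retriever:  # hypothetical component used only for this sketch
    def get_context(self, query: str) -> list:
        return []


primary, backup = Retriever(), Retriever()
on_method_instrumented(primary, Retriever.get_context, "app.retriever.get_context")
on_method_instrumented(backup, Retriever.get_context, "app.backup_retriever.get_context")
matches = list(get_methods_for_func(Retriever.get_context))
```

Keying by the owner's id lets the same function appear at several paths, one per component instance.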

on_new_record

on_new_record(func) -> Iterable[RecordingContext]

Called at the start of record creation.

See WithInstrumentCallbacks.on_new_record.

on_add_record

on_add_record(ctx: RecordingContext, func: Callable, sig: Signature, bindings: BoundArguments, ret: Any, error: Any, perf: Perf, cost: Cost, existing_record: Optional[Record] = None) -> Record

Called by instrumented methods if they use _new_record to construct a record call list.

See WithInstrumentCallbacks.on_add_record.

load staticmethod

load(obj, *args, **kwargs)

Deserialize/load this object using the class information in tru_class_info to lookup the actual class that will do the deserialization.

model_validate classmethod

model_validate(*args, **kwargs) -> Any

Deserialize a jsonized version of the app into an instance of the class it was serialized from.

Note

This process uses extra information stored in the jsonized object and handled by WithClassInfo.

continue_session staticmethod

continue_session(app_definition_json: JSON, app: Any) -> AppDefinition

Instantiate the given app with the given state app_definition_json.

Warning

This is an experimental feature with ongoing work.

PARAMETER DESCRIPTION
app_definition_json

The json serialized app.

TYPE: JSON

app

The app to continue the session with.

TYPE: Any

RETURNS DESCRIPTION
AppDefinition

A new AppDefinition instance with the given app and the given app_definition_json state.

new_session staticmethod

new_session(app_definition_json: JSON, initial_app_loader: Optional[Callable] = None) -> AppDefinition

Create an app instance at the start of a session.

Warning

This is an experimental feature with ongoing work.

Create a copy of the json serialized app with the enclosed app being initialized to its initial state before any records are produced (i.e. blank memory).

get_loadable_apps staticmethod

get_loadable_apps()

Gets a list of all of the loadable apps.

Warning

This is an experimental feature with ongoing work.

These are the apps that have initial_app_loader_dump set.

select_inputs classmethod

select_inputs() -> Lens

Get the path to the main app's call inputs.

select_outputs classmethod

select_outputs() -> Lens

Get the path to the main app's call outputs.

wait_for_feedback_results

wait_for_feedback_results() -> None

Wait for all feedback functions to complete.

This applies to all feedbacks on all records produced by this app. This call will block until finished and if new records are produced while this is running, it will include them.

select_context classmethod

select_context(app: Optional[Any] = None) -> Lens

Try to find retriever components in the given app and return a lens to access the retrieved contexts that would appear in a record were these components to execute.

main_call

main_call(human: str) -> str

If available, a single text to a single text invocation of this app.

main_acall async

main_acall(human: str) -> str

If available, a single text to a single text invocation of this app.

main_input

main_input(func: Callable, sig: Signature, bindings: BoundArguments) -> JSON

Determine the main input string for the given function func with signature sig if it is to be called with the given bindings bindings.

main_output

main_output(func: Callable, sig: Signature, bindings: BoundArguments, ret: Any) -> JSON

Determine the main output string for the given function func with signature sig after it is called with the given bindings and has returned ret.

json

json(*args, **kwargs)

Create a json string representation of this app.

awith_ async

awith_(func: CallableMaybeAwaitable[A, T], *args, **kwargs) -> T

Call the given async func with the given *args and **kwargs while recording, producing func results. The record of the computation is available through other means like the database or dashboard. If you need a record of this execution immediately, you can use awith_record or the App as a context manager instead.

with_ async

with_(func: Callable[[A], T], *args, **kwargs) -> T

Call the given func with the given *args and **kwargs while recording, producing func results. The record of the computation is available through other means like the database or dashboard. If you need a record of this execution immediately, you can use with_record or the App as a context manager instead.

with_record

with_record(func: Callable[[A], T], *args, record_metadata: JSON = None, **kwargs) -> Tuple[T, Record]

Call the given func with the given *args and **kwargs, producing its results as well as a record of the execution.

awith_record async

awith_record(func: Callable[[A], Awaitable[T]], *args, record_metadata: JSON = None, **kwargs) -> Tuple[T, Record]

Call the given func with the given *args and **kwargs, producing its results as well as a record of the execution.

dummy_record

dummy_record(cost: Cost = mod_schema.Cost(), perf: Perf = mod_schema.Perf.now(), ts: datetime = datetime.datetime.now(), main_input: str = 'main_input are strings.', main_output: str = 'main_output are strings.', main_error: str = 'main_error are strings.', meta: Dict = {'metakey': 'meta are dicts'}, tags: str = 'tags are strings') -> Record

Create a dummy record with some of the expected structure without actually invoking the app.

The record is a guess of what an actual record might look like but will be missing information that can only be determined after a call is made.

All args are Record fields except these:

- `record_id` is generated using the default id naming schema.
- `app_id` is taken from this recorder.
- `calls` field is constructed based on instrumented methods.

instrumented

instrumented() -> Iterable[Tuple[Lens, ComponentView]]

Iteration over instrumented components and their categories.

print_instrumented

print_instrumented() -> None

Print the instrumented components and methods.

format_instrumented_methods

format_instrumented_methods() -> str

Build a string containing a listing of instrumented methods.

print_instrumented_methods

print_instrumented_methods() -> None

Print instrumented methods.

print_instrumented_components

print_instrumented_components() -> None

Print instrumented components and their categories.

__init__

__init__(app: Optional[Union[VirtualApp, JSON]] = None, **kwargs: dict)

Virtual app for logging existing app results.

add_record

add_record(record: Record, feedback_mode: Optional[FeedbackMode] = None) -> Record

Add the given record to the database and evaluate any pre-specified feedbacks on it.

The class VirtualRecord may be useful for creating records for virtual apps. If feedback_mode is specified, that mode is used for this record only.

trulens_eval.tru_virtual.virtual_module module-attribute

virtual_module = Module(package_name='trulens_eval', module_name='trulens_eval.tru_virtual')

Module to represent the module of virtual apps.

Virtual apps will record this as their module.

trulens_eval.tru_virtual.virtual_class module-attribute

virtual_class = Class(module=virtual_module, name='VirtualApp')

Class to represent the class of virtual apps.

Virtual apps will record this as their class.

trulens_eval.tru_virtual.virtual_object module-attribute

virtual_object = Obj(cls=virtual_class, id=0)

Object to represent instances of virtual apps.

Virtual apps will record this as their instance.

trulens_eval.tru_virtual.virtual_method_root module-attribute

virtual_method_root = Method(cls=virtual_class, obj=virtual_object, name='root')

Method call to represent the root call of virtual apps.

Virtual apps will record this as their root call.

trulens_eval.tru_virtual.virtual_method_call module-attribute

virtual_method_call = Method(cls=virtual_class, obj=virtual_object, name='method_name_not_set')

Method call to represent virtual app calls that do not provide this information.

Method name will be replaced by the last attribute in the selector provided by user.
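
The default two-frame stack and the way the method name is taken from the selector can be sketched in plain Python. The `Frame` dataclass, the `default_stack` helper, and the dotted-string selector are hypothetical simplifications of RecordAppCallMethod, Method, and Lens, used here only to show the shape of the default.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Frame:
    """Simplified stand-in for one stack entry: where the call lives and its method."""
    path: str
    method_name: str


def default_stack(selector: str) -> List[Frame]:
    """Build a root frame followed by a frame named after the selector's last attribute."""
    owner_path, _, method_name = selector.rpartition(".")
    return [
        Frame(path="", method_name="root"),
        Frame(path=owner_path, method_name=method_name),
    ]


stack = default_stack("retriever.get_context")
```

For the selector `retriever.get_context`, the second frame is attributed to the `retriever` component with method name `get_context`, replacing the `method_name_not_set` placeholder.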