Feedback¶

Feedback functions are stored as instances of Feedback, which itself extends FeedbackDefinition. The parent FeedbackDefinition class contains the serializable fields, while the Feedback subclass adds the non-serializable parts.

trulens_eval.feedback.feedback.Feedback ¶

Bases: FeedbackDefinition

Feedback function container.

Typical usage is to specify a feedback implementation function from a Provider and the mapping of selectors describing how to construct the arguments to the implementation:

Example

from trulens_eval import Feedback
from trulens_eval import Huggingface
hugs = Huggingface()

# Create a feedback function from a provider:
feedback = Feedback(
    hugs.language_match # the implementation
).on_input_output() # selectors shorthand

Attributes¶

imp `class-attribute` `instance-attribute` ¶

imp: Optional[ImpCallable] = imp

Implementation callable.

A serialized version is stored at FeedbackDefinition.implementation.

agg `class-attribute` `instance-attribute` ¶

agg: Optional[AggCallable] = agg

Aggregator method for feedback functions that produce more than one result.

A serialized version is stored at FeedbackDefinition.aggregator.

sig `property` ¶

sig: Signature

Signature of the feedback function implementation.

name `property` ¶

name: str

Name of the feedback function.

Derived from the name of the implementing function if no name was supplied.

Functions¶

on_input_output ¶

on_input_output() -> Feedback

Specifies that the feedback implementation arguments are to be the main app input and output in that order.

Returns a new Feedback object with the specification.

on_default ¶

on_default() -> Feedback

Specifies that one-argument feedbacks should be evaluated on the main app output and two-argument feedbacks should be evaluated on the main input and main output, in that order.

Returns a new Feedback object with this specification.
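
The arity rule described for on_default can be illustrated with a small standalone sketch. It does not use trulens_eval: `default_argument_sources`, the two example functions, and the string labels are hypothetical names chosen for illustration, and the library's actual logic may differ.

```python
import inspect
from typing import Callable, Dict


def default_argument_sources(imp: Callable) -> Dict[str, str]:
    """Map each parameter of `imp` to a hypothetical source label.

    One-argument functions read the main output; two-argument functions
    read the main input and then the main output.
    """
    params = list(inspect.signature(imp).parameters)
    if len(params) == 1:
        return {params[0]: "main_output"}
    if len(params) == 2:
        return {params[0]: "main_input", params[1]: "main_output"}
    raise ValueError(
        f"Defaults cover only one- or two-argument functions, got {len(params)}."
    )


def conciseness(response: str) -> float:  # hypothetical one-argument feedback
    return 1.0 / (1 + len(response.split()))


def overlap(prompt: str, response: str) -> float:  # hypothetical two-argument feedback
    prompt_words = set(prompt.split())
    shared = prompt_words & set(response.split())
    return len(shared) / max(len(prompt_words), 1)


print(default_argument_sources(conciseness))
print(default_argument_sources(overlap))
```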

evaluate_deferred `staticmethod` ¶

evaluate_deferred(
    tru: Tru,
    limit: Optional[int] = None,
    shuffle: bool = False,
) -> List[Tuple[Series, Future[FeedbackResult]]]

Evaluates feedback functions that were specified to be deferred.

Returns a list of tuples with the DB row containing the Feedback and initial FeedbackResult as well as the Future which will contain the actual result.

PARAMETER	DESCRIPTION
`limit`	The maximum number of evals to start. TYPE: `Optional[int]` DEFAULT: `None`
`shuffle`	Shuffle the order of the feedbacks to evaluate. TYPE: `bool` DEFAULT: `False`

Constants that govern behaviour:

Tru.RETRY_RUNNING_SECONDS: How long to wait before restarting a feedback that was started but never finished (or that failed without recording that fact).
Tru.RETRY_FAILED_SECONDS: How long to wait to retry a failed feedback.
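
One way to read the two retry constants is as age thresholds applied to the last recorded timestamp of a result. The sketch below is an assumption about how such a rule could look, not the library's scheduler: `should_start` is a hypothetical helper and the two numeric values are placeholders rather than the real defaults.

```python
from datetime import datetime, timedelta

# Placeholder values; the real constants live on Tru and may differ.
RETRY_RUNNING_SECONDS = 60.0
RETRY_FAILED_SECONDS = 5 * 60.0


def should_start(status: str, last_ts: datetime, now: datetime) -> bool:
    """Decide whether a deferred feedback is due to be (re)started."""
    age = (now - last_ts).total_seconds()
    if status == "none":
        return True  # never started
    if status == "running":
        return age > RETRY_RUNNING_SECONDS  # presumed stalled
    if status == "failed":
        return age > RETRY_FAILED_SECONDS  # retry after a back-off
    return False  # "done" and "skipped" are treated as final


now = datetime(2024, 1, 1, 12, 0, 0)
# A run marked "running" 90 seconds ago is older than the 60-second threshold.
print(should_start("running", now - timedelta(seconds=90), now))
```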

aggregate ¶

aggregate(
    func: Optional[AggCallable] = None,
    combinations: Optional[FeedbackCombinations] = None,
) -> Feedback

Specify the aggregation function in case the selectors for this feedback generate more than one value for implementation argument(s). Can also specify the method of producing combinations of values in such cases.

Returns a new Feedback object with the given aggregation function and/or the given combination mode.

on_prompt ¶

on_prompt(arg: Optional[str] = None) -> Feedback

Create a variant of self that will take in the main app input or "prompt" as input, sending it as an argument arg to implementation.

on_response ¶

on_response(arg: Optional[str] = None) -> Feedback

Create a variant of self that will take in the main app output or "response" as input, sending it as an argument arg to implementation.

on ¶

on(*args, **kwargs) -> Feedback

Create a variant of self with the same implementation but the given selectors. Those provided positionally get their implementation argument name guessed and those provided as kwargs get their name from the kwargs key.
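
The difference between positional and keyword selectors can be sketched without the library. Here selectors are stood in for by plain strings, and `bind_selectors` and `relevance` are hypothetical names; the guessing rule shown (fill unclaimed parameters in declaration order) is an assumption about what "guessed" means.

```python
import inspect
from typing import Any, Callable, Dict


def bind_selectors(imp: Callable, *args: Any, **kwargs: Any) -> Dict[str, Any]:
    """Assign selectors to the parameter names of `imp`.

    Keyword selectors keep their key. Positional selectors are matched, in
    order, to the parameters that no keyword selector already claimed.
    """
    free = [p for p in inspect.signature(imp).parameters if p not in kwargs]
    if len(args) > len(free):
        raise TypeError("More positional selectors than unassigned parameters.")
    bound = dict(kwargs)
    bound.update(zip(free, args))
    return bound


def relevance(question: str, statement: str) -> float:  # hypothetical implementation
    return 1.0 if question and statement else 0.0


# Positional selectors take the parameter names in declaration order.
print(bind_selectors(relevance, "input-selector", "output-selector"))
# A keyword selector claims its name; the positional one takes what is left.
print(bind_selectors(relevance, "context-selector", question="input-selector"))
```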

check_selectors ¶

check_selectors(
    app: Union[AppDefinition, JSON],
    record: Record,
    source_data: Optional[Dict[str, Any]] = None,
    warning: bool = False,
) -> bool

Check that the selectors are valid for the given app and record.

PARAMETER	DESCRIPTION
`app`	The app that produced the record. TYPE: `Union[AppDefinition, JSON]`
`record`	The record that the feedback will run on. This can be a mostly empty record for checking ahead of producing one. The utility method App.dummy_record is built for this purpose. TYPE: `Record`
`source_data`	Additional data to select from when extracting feedback function arguments. TYPE: `Optional[Dict[str, Any]]` DEFAULT: `None`
`warning`	Issue a warning instead of raising an error if a selector is invalid. As some parts of a Record cannot be known ahead of producing it, it may be necessary to issue only a warning here rather than raise an exception. TYPE: `bool` DEFAULT: `False`

RETURNS	DESCRIPTION
`bool`	True if the selectors are valid. False if not (if warning is set).

RAISES	DESCRIPTION
`ValueError`	If a selector is invalid and warning is not set.
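
The warn-or-raise contract of check_selectors can be mimicked on plain nested data. This is a sketch under stated assumptions: selectors are represented as sequences of keys and indices rather than Lens objects, and `path_exists` and `check_paths` are hypothetical helpers, not library functions.

```python
import warnings
from typing import Any, Dict, Sequence


def path_exists(data: Any, path: Sequence[Any]) -> bool:
    """Return True if following `path` (keys or indices) through `data` succeeds."""
    current = data
    for step in path:
        try:
            current = current[step]
        except (KeyError, IndexError, TypeError):
            return False
    return True


def check_paths(
    data: Dict[str, Any],
    paths: Dict[str, Sequence[Any]],
    warning: bool = False,
) -> bool:
    """Return True if every path resolves; otherwise warn or raise."""
    missing = [name for name, path in paths.items() if not path_exists(data, path)]
    if not missing:
        return True
    message = f"Selectors for {missing} do not match the given data."
    if warning:
        warnings.warn(message)
        return False
    raise ValueError(message)


record = {"main_input": "hi", "calls": [{"rets": "x"}]}
print(check_paths(record, {"text": ["main_input"], "ret": ["calls", 0, "rets"]}))
```

With `warning=True` the same call on a missing path returns False after emitting a warning, which suits checks against a mostly empty record.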

run ¶

run(
    app: Optional[Union[AppDefinition, JSON]] = None,
    record: Optional[Record] = None,
    source_data: Optional[Dict] = None,
    **kwargs: Dict[str, Any]
) -> FeedbackResult

Run the feedback function on the given record. The app that produced the record is also required to determine input/output argument names.

PARAMETER	DESCRIPTION
`app`	The app that produced the record. This can be AppDefinition or a jsonized AppDefinition. It will be jsonized if it is not already. TYPE: `Optional[Union[AppDefinition, JSON]]` DEFAULT: `None`
`record`	The record to evaluate the feedback on. TYPE: `Optional[Record]` DEFAULT: `None`
`source_data`	Additional data to select from when extracting feedback function arguments. TYPE: `Optional[Dict]` DEFAULT: `None`
`**kwargs`	Any additional keyword arguments are used to set or override selected feedback function inputs. TYPE: `Dict[str, Any]` DEFAULT: `{}`

RETURNS	DESCRIPTION
`FeedbackResult`	A FeedbackResult object with the result of the feedback function.

extract_selection ¶

extract_selection(
    app: Optional[Union[AppDefinition, JSON]] = None,
    record: Optional[Record] = None,
    source_data: Optional[Dict] = None,
) -> Iterable[Dict[str, Any]]

Given the app that produced the given record, extract from record the values that will be sent as arguments to the implementation as specified by self.selectors. Additional data to select from can be provided in source_data. All args are optional. If a Record is specified, its calls are laid out as app (see layout_calls_as_app).

Feedback-defining utilities¶

trulens_eval.feedback.feedback.rag_triad ¶

rag_triad(
    provider: LLMProvider,
    question: Optional[Lens] = None,
    answer: Optional[Lens] = None,
    context: Optional[Lens] = None,
) -> Dict[str, Feedback]

Create a triad of feedback functions for evaluating context retrieval generation steps.

If a particular lens is not provided, the relevant selectors will be missing. These can be filled in later, or the triad can be used for rails feedback actions, which fill in the selectors based on specification from within colang.

PARAMETER	DESCRIPTION
`provider`	The provider to use for implementing the feedback functions. TYPE: `LLMProvider`
`question`	Selector for the question part. TYPE: `Optional[Lens]` DEFAULT: `None`
`answer`	Selector for the answer part. TYPE: `Optional[Lens]` DEFAULT: `None`
`context`	Selector for the context part. TYPE: `Optional[Lens]` DEFAULT: `None`

trulens_eval.feedback.feedback.ImpCallable `module-attribute` ¶

ImpCallable = Callable[
    [A], Union[float, Tuple[float, Dict[str, Any]]]
]

Signature of feedback implementations.

Those take in any number of arguments and return either a single float or a float and a dictionary (of metadata).
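
Both return shapes can be shown with two toy implementations. The functions below are hypothetical examples that only follow the documented signature; `split_result` is an illustrative helper for normalizing the two shapes and is not part of the library.

```python
from typing import Any, Dict, Tuple, Union

ImpResult = Union[float, Tuple[float, Dict[str, Any]]]


def length_ratio(prompt: str, response: str) -> float:
    """Return a bare score in [0, 1]."""
    longest = max(len(prompt), len(response), 1)
    return min(len(prompt), len(response)) / longest


def keyword_hit(response: str) -> Tuple[float, Dict[str, Any]]:
    """Return a score together with metadata explaining it."""
    found = [w for w in ("please", "thanks") if w in response.lower()]
    return (1.0 if found else 0.0), {"found": found}


def split_result(result: ImpResult) -> Tuple[float, Dict[str, Any]]:
    """Normalize either return shape to (score, metadata)."""
    if isinstance(result, tuple):
        score, meta = result
        return float(score), dict(meta)
    return float(result), {}


print(split_result(length_ratio("abcd", "ab")))
print(split_result(keyword_hit("Thanks a lot")))
```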

trulens_eval.feedback.feedback.AggCallable `module-attribute` ¶

AggCallable = Callable[[Iterable[float]], float]

Signature of aggregation functions.
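
Any callable that reduces an iterable of floats to one float fits this signature. A minimal sketch, in which `worst_case` and `aggregate_scores` are hypothetical names and the alias is redefined locally so the block stands alone:

```python
import statistics
from typing import Callable, Iterable

AggCallable = Callable[[Iterable[float]], float]


def worst_case(scores: Iterable[float]) -> float:
    """Aggregate by taking the lowest score."""
    return min(scores)


def aggregate_scores(
    scores: Iterable[float], agg: AggCallable = statistics.mean
) -> float:
    """Apply an aggregator to the per-call scores."""
    return float(agg(list(scores)))


scores = [0.25, 0.75, 0.5]
print(aggregate_scores(scores))              # arithmetic mean of the scores
print(aggregate_scores(scores, worst_case))  # lowest score
```

Choosing `min` over the mean makes a single bad result dominate, which can be the safer summary when every selected value has to pass.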

trulens_eval.feedback.feedback.SkipEval ¶

Bases: Exception

Raised when evaluating a feedback function implementation to skip it so it is not aggregated with other non-skipped results.

PARAMETER	DESCRIPTION
`reason`	Optional reason for why this evaluation was skipped. TYPE: `Optional[str]` DEFAULT: `None`
`feedback`	The Feedback instance this run corresponds to. TYPE: `Optional[Feedback]` DEFAULT: `None`
`ins`	The arguments to this run. TYPE: `Optional[Dict[str, Any]]` DEFAULT: `None`
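
The effect of skipping on aggregation can be sketched with a local stand-in exception. `SkipEvalSketch`, `score_chunk`, and `run_all` are hypothetical names; the real SkipEval also accepts `feedback` and `ins`, which this sketch omits.

```python
from typing import Callable, Iterable, List, Optional


class SkipEvalSketch(Exception):
    """Local stand-in for SkipEval; carries an optional reason."""

    def __init__(self, reason: Optional[str] = None):
        super().__init__(reason)
        self.reason = reason


def score_chunk(chunk: str) -> float:  # hypothetical implementation
    if not chunk.strip():
        raise SkipEvalSketch(reason="empty chunk")
    return min(len(chunk) / 10.0, 1.0)


def run_all(
    chunks: Iterable[str], agg: Callable[[List[float]], float]
) -> Optional[float]:
    """Score each chunk, leave skipped ones out, then aggregate."""
    scores: List[float] = []
    for chunk in chunks:
        try:
            scores.append(score_chunk(chunk))
        except SkipEvalSketch:
            continue  # skipped results never reach the aggregator
    return agg(scores) if scores else None


# The blank chunk is skipped, so only two scores are averaged.
print(run_all(["12345", "   ", "1234567890"], lambda s: sum(s) / len(s)))
```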

trulens_eval.feedback.feedback.InvalidSelector ¶

Bases: Exception

Raised when a selector names something that is missing in a record/app.

trulens_eval.schema.feedback ¶

Serializable feedback-related classes.

Classes¶

Select ¶

Utilities for creating selectors using Lens and aliases/shortcuts.

Attributes¶

Query `class-attribute` `instance-attribute` ¶

Query = Lens

Selector type.

Tru `class-attribute` `instance-attribute` ¶

Tru: Lens = Query()

Selector for the tru wrapper (TruLlama, TruChain, etc.).

Record `class-attribute` `instance-attribute` ¶

Record: Query = __record__

Selector for the record.

App `class-attribute` `instance-attribute` ¶

App: Query = __app__

Selector for the app.

RecordInput `class-attribute` `instance-attribute` ¶

RecordInput: Query = main_input

Selector for the main app input.

RecordOutput `class-attribute` `instance-attribute` ¶

RecordOutput: Query = main_output

Selector for the main app output.

RecordCalls `class-attribute` `instance-attribute` ¶

RecordCalls: Query = app

Selector for the calls made by the wrapped app.

Laid out by path into components.

RecordCall `class-attribute` `instance-attribute` ¶

RecordCall: Query = calls[-1]

Selector for the first called method (last to return).

RecordArgs `class-attribute` `instance-attribute` ¶

RecordArgs: Query = args

Selector for the whole set of inputs/arguments to the first called / last returned method call.

RecordRets `class-attribute` `instance-attribute` ¶

RecordRets: Query = rets

Selector for the whole output of the first called / last returned method call.

Functions¶

path_and_method `staticmethod` ¶

path_and_method(select: Query) -> Tuple[Query, str]

If select names a method as its last attribute, extract the method name and the selector without the final method name.

dequalify `staticmethod` ¶

dequalify(select: Query) -> Query

If the given selector qualifies record or app, remove that qualification.

render_for_dashboard `staticmethod` ¶

render_for_dashboard(query: Query) -> str

Render the given query for use in dashboard to help user specify feedback functions.

FeedbackMode ¶

Bases: str, Enum

Mode of feedback evaluation.

Specify this using the feedback_mode argument to App constructors.

Attributes¶

NONE `class-attribute` `instance-attribute` ¶

NONE = 'none'

No evaluation will happen even if feedback functions are specified.

WITH_APP `class-attribute` `instance-attribute` ¶

WITH_APP = 'with_app'

Try to run feedback functions immediately and before app returns a record.

WITH_APP_THREAD `class-attribute` `instance-attribute` ¶

WITH_APP_THREAD = 'with_app_thread'

Try to run feedback functions in the same process as the app but after it produces a record.

DEFERRED `class-attribute` `instance-attribute` ¶

DEFERRED = 'deferred'

Evaluate later via the process started by tru.start_deferred_feedback_evaluator.
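
Because the enum mixes in str, its members compare equal to their string values and serialize as plain strings. The sketch below mirrors the documented members in a local class (`FeedbackModeSketch` is a hypothetical name) so it runs without the library.

```python
import json
from enum import Enum


class FeedbackModeSketch(str, Enum):
    """Local mirror of the documented members, for illustration only."""

    NONE = "none"
    WITH_APP = "with_app"
    WITH_APP_THREAD = "with_app_thread"
    DEFERRED = "deferred"


mode = FeedbackModeSketch("deferred")  # look up a member by its value

print(mode is FeedbackModeSketch.DEFERRED)  # lookup returns the member itself
print(mode == "deferred")                   # str mixin allows string comparison
print(json.dumps({"feedback_mode": mode}))  # serializes as the plain value
```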

FeedbackResultStatus ¶

Bases: Enum

For deferred feedback evaluation, these values indicate status of evaluation.

Attributes¶

NONE `class-attribute` `instance-attribute` ¶

NONE = 'none'

Initial value is none.

RUNNING `class-attribute` `instance-attribute` ¶

RUNNING = 'running'

Once queued/started, status is updated to "running".

FAILED `class-attribute` `instance-attribute` ¶

FAILED = 'failed'

Run failed.

DONE `class-attribute` `instance-attribute` ¶

DONE = 'done'

Run completed successfully.

SKIPPED `class-attribute` `instance-attribute` ¶

SKIPPED = 'skipped'

This feedback was skipped.

This can be because it had an if_exists selector that did not select anything, or because it has a selector that did not select anything and on_missing was set to warn or ignore.

FeedbackOnMissingParameters ¶

Bases: str, Enum

How to handle missing parameters in feedback function calls.

This is specifically for the case where a feedback function has a selector that selects something that does not exist in a record/app.

Attributes¶

ERROR `class-attribute` `instance-attribute` ¶

ERROR = 'error'

Raise an error if a parameter is missing.

The result status will be set to FAILED.

WARN `class-attribute` `instance-attribute` ¶

WARN = 'warn'

Warn if a parameter is missing.

The result status will be set to SKIPPED.

IGNORE `class-attribute` `instance-attribute` ¶

IGNORE = 'ignore'

Do nothing.

No warning or error message will be shown. The result status will be set to SKIPPED.
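
The three policies and the result status each one leads to can be summarized in a short sketch. `OnMissing` and `status_for_missing` are hypothetical local names; unlike the library, the ERROR branch here only reports the status instead of raising.

```python
import warnings
from enum import Enum


class OnMissing(str, Enum):
    """Local mirror of FeedbackOnMissingParameters, for illustration only."""

    ERROR = "error"
    WARN = "warn"
    IGNORE = "ignore"


def status_for_missing(policy: OnMissing, name: str) -> str:
    """Return the documented result status when parameter `name` is missing."""
    if policy is OnMissing.ERROR:
        # The library raises an error here; this sketch only reports the status.
        return "failed"
    if policy is OnMissing.WARN:
        warnings.warn(f"Selector for {name!r} did not match anything.")
    return "skipped"  # WARN and IGNORE both skip the evaluation


for policy in OnMissing:
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")  # keep the demo output quiet
        print(policy.value, "->", status_for_missing(policy, "context"))
```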

FeedbackCall ¶

Bases: SerialModel

Each invocation of a feedback function results in one of these instances.

Note that a single Feedback instance might require more than one call.

Attributes¶

args `instance-attribute` ¶

args: Dict[str, Optional[JSON]]

Arguments to the feedback function.

ret `instance-attribute` ¶

ret: float

Return value.

meta `class-attribute` `instance-attribute` ¶

meta: Dict[str, Any] = Field(default_factory=dict)

Any additional data a feedback function returns to display alongside its float result.

FeedbackResult ¶

Bases: SerialModel

Feedback results for a single Feedback instance.

This might involve multiple feedback function calls. Typically you should not be constructing these objects yourself except for the cases where you'd like to log human feedback.

ATTRIBUTE	DESCRIPTION
`feedback_result_id`	Unique identifier for this result. TYPE: `str`
`record_id`	Record over which the feedback was evaluated. TYPE: `str`
`feedback_definition_id`	The id of the FeedbackDefinition which was evaluated to get this result. TYPE: `str`
`last_ts`	Last timestamp involved in the evaluation. TYPE: `datetime`
`status`	For deferred feedback evaluation, the status of the evaluation. TYPE: `FeedbackResultStatus`
`cost`	Cost of the evaluation. TYPE: `Cost`
`name`	Given name of the feedback. TYPE: `str`
`calls`	Individual feedback function invocations. TYPE: `List[FeedbackCall]`
`result`	Final result, potentially aggregating multiple calls. TYPE: `float`
`error`	Error information if there was an error. TYPE: `str`
`multi_result`	TODO: doc TYPE: `str`

Attributes¶

status `class-attribute` `instance-attribute` ¶

status: FeedbackResultStatus = NONE

For deferred feedback evaluation, the status of the evaluation.

FeedbackCombinations ¶

Bases: str, Enum

How to collect arguments for feedback function calls.

Note that this applies only to cases where selectors pick out more than one thing for feedback function arguments. This option is used for the field combinations of FeedbackDefinition and can be specified with Feedback.aggregate.

Attributes¶

ZIP `class-attribute` `instance-attribute` ¶

ZIP = 'zip'

Match argument values per position in produced values.

Example

If the selector for arg1 generates values 0, 1, 2 and one for arg2 generates values "a", "b", "c", the feedback function will be called 3 times with kwargs:

{'arg1': 0, 'arg2': "a"},
{'arg1': 1, 'arg2': "b"},
{'arg1': 2, 'arg2': "c"}

If the quantities of items in the various generators do not match, the result will have only as many combinations as the generator with the fewest items as per python zip (strict mode is not used).

Note that selectors can use Lens collect() to name a single (list) value instead of multiple values.
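
The positional matching, including the truncation to the shortest generator, follows directly from Python's built-in zip. A minimal sketch in which `zip_combinations` is a hypothetical helper operating on already-selected values:

```python
from typing import Any, Dict, Iterable, List


def zip_combinations(selected: Dict[str, Iterable[Any]]) -> List[Dict[str, Any]]:
    """Pair up the i-th value of every argument; stop at the shortest."""
    names = list(selected)
    return [dict(zip(names, values)) for values in zip(*selected.values())]


# arg2 has only two values, so the third value of arg1 is dropped.
calls = zip_combinations({"arg1": [0, 1, 2], "arg2": ["a", "b"]})
for kwargs in calls:
    print(kwargs)
```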

PRODUCT `class-attribute` `instance-attribute` ¶

PRODUCT = 'product'

Evaluate feedback on all combinations of feedback function arguments.

Example

If the selector for arg1 generates values 0, 1 and the one for arg2 generates values "a", "b", the feedback function will be called 4 times with kwargs:

{'arg1': 0, 'arg2': "a"},
{'arg1': 0, 'arg2': "b"},
{'arg1': 1, 'arg2': "a"},
{'arg1': 1, 'arg2': "b"}

See itertools.product for more.

Note that selectors can use Lens collect() to name a single (list) value instead of multiple values.
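
The same expansion can be reproduced with itertools.product on already-selected values. `product_combinations` is a hypothetical helper, shown only to make the call count and ordering concrete:

```python
import itertools
from typing import Any, Dict, Iterable, List


def product_combinations(selected: Dict[str, Iterable[Any]]) -> List[Dict[str, Any]]:
    """Build one kwargs dict per element of the Cartesian product."""
    names = list(selected)
    return [
        dict(zip(names, values))
        for values in itertools.product(*selected.values())
    ]


calls = product_combinations({"arg1": [0, 1], "arg2": ["a", "b"]})
for kwargs in calls:
    print(kwargs)
```

The number of calls is the product of the value counts per argument, so broad selectors can multiply evaluation cost quickly.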

FeedbackDefinition ¶

Bases: WithClassInfo, SerialModel, Hashable

Serialized parts of a feedback function.

The non-serialized parts are in the Feedback class.

Attributes¶

implementation `class-attribute` `instance-attribute` ¶

implementation: Optional[Union[Function, Method]] = None

Implementation serialization.

aggregator `class-attribute` `instance-attribute` ¶

aggregator: Optional[Union[Function, Method]] = None

Aggregator method serialization.

combinations `class-attribute` `instance-attribute` ¶

combinations: Optional[FeedbackCombinations] = PRODUCT

Mode of combining selected values to produce arguments to each feedback function call.

if_exists `class-attribute` `instance-attribute` ¶

if_exists: Optional[Lens] = None

Only execute the feedback function if the following selector names something that exists in a record/app.

Can use this to evaluate conditionally on presence of some calls, for example. Feedbacks skipped this way will have a status of FeedbackResultStatus.SKIPPED.

if_missing `class-attribute` `instance-attribute` ¶

if_missing: FeedbackOnMissingParameters = ERROR

How to handle missing parameters in feedback function calls.

selectors `instance-attribute` ¶

selectors: Dict[str, Lens]

Selectors; pointers into Records of where to get arguments for imp.

supplied_name `class-attribute` `instance-attribute` ¶

supplied_name: Optional[str] = None

An optional name. Affects only displayed tables.

higher_is_better `class-attribute` `instance-attribute` ¶

higher_is_better: Optional[bool] = None

Feedback result magnitude interpretation.

feedback_definition_id `instance-attribute` ¶

feedback_definition_id: FeedbackDefinitionID = (
    feedback_definition_id
)

ID. If not given, it is uniquely determined from content.

name `property` ¶

name: str

Name of the feedback function.

Derived from the name of the serialized implementation function if name was not provided.
