trulens.core.feedback.feedback

DEPRECATED: This module is deprecated. Use trulens.core.metric instead.

The Feedback class is now a deprecated alias for Metric. This module is maintained for backward compatibility.

Classes

InvalidSelector

Bases: Exception

Raised when a selector names something that is missing in a record/app.

SkipEval

Bases: Exception

Raised within a metric function implementation to skip that evaluation so it is not aggregated with other non-skipped results.

PARAMETER DESCRIPTION
reason

Optional reason for why this evaluation was skipped.

TYPE: Optional[str] DEFAULT: None

metric

The Metric instance this run corresponds to.

TYPE: Optional[Metric] DEFAULT: None

ins

The arguments to this run.

TYPE: Optional[Dict[str, Any]] DEFAULT: None
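The skip semantics can be sketched in plain Python (a stand-in SkipEval and hypothetical metric function, not the trulens implementation): the evaluation loop catches the exception and excludes that run from aggregation.

```python
from typing import Any, Dict, List, Optional

class SkipEval(Exception):
    """Stand-in for trulens' SkipEval: carries an optional reason."""
    def __init__(self, reason: Optional[str] = None):
        self.reason = reason
        super().__init__(reason or "evaluation skipped")

def sentiment_if_long(text: str) -> float:
    # Hypothetical metric: skip inputs too short to judge.
    if len(text) < 5:
        raise SkipEval(reason="input too short")
    return 1.0 if "good" in text else 0.0

def evaluate_all(inputs: List[Dict[str, Any]]) -> float:
    results = []
    for ins in inputs:
        try:
            results.append(sentiment_if_long(**ins))
        except SkipEval:
            continue  # skipped runs are not aggregated
    return sum(results) / len(results)

# "ok" is skipped, so the mean is over the other two runs only.
print(evaluate_all([{"text": "good answer"}, {"text": "ok"}, {"text": "bad answer"}]))
```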

Feedback

Bases: Metric

DEPRECATED: Use Metric instead.

This class is maintained for backward compatibility. All functionality has been moved to the Metric class.

Example of migrating to Metric
# Old way (deprecated):
from trulens.core import Feedback
feedback = Feedback(provider.relevance).on_input_output()

# New way (recommended):
from trulens.core import Metric
metric = Metric(implementation=provider.relevance).on_input_output()
Attributes
tru_class_info instance-attribute
tru_class_info: Class

Class information of this pydantic object for use in deserialization.

Using this odd key to not pollute attribute names in whatever class we mix this into. Should be the same as CLASS_INFO.

implementation class-attribute instance-attribute
implementation: Optional[Union[Function, Method]] = None

Implementation serialization.

aggregator class-attribute instance-attribute
aggregator: Optional[Union[Function, Method]] = None

Aggregator method serialization.

examples class-attribute instance-attribute
examples: Optional[List[Tuple]] = examples

Examples to use when evaluating the metric.

criteria class-attribute instance-attribute
criteria: Optional[str] = criteria

Criteria for the metric.

additional_instructions class-attribute instance-attribute
additional_instructions: Optional[str] = (
    additional_instructions
)

Additional instructions for the metric.

combinations class-attribute instance-attribute

Mode of combining selected values to produce arguments to each feedback function call.

feedback_definition_id instance-attribute
feedback_definition_id: FeedbackDefinitionID = (
    feedback_definition_id
)

Unique identifier; if not given, it is determined from the content.

if_exists class-attribute instance-attribute
if_exists: Optional[Lens] = None

Only execute the feedback function if the following selector names something that exists in a record/app.

Can use this to evaluate conditionally on presence of some calls, for example. Feedbacks skipped this way will have a status of FeedbackResultStatus.SKIPPED.
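The conditional-execution idea can be sketched with a plain dictionary lookup (illustrative only; trulens' Lens selectors are richer than dotted paths):

```python
def maybe_run(record, if_exists_path, run):
    """Sketch of if_exists: run the feedback only when the selector resolves in the record."""
    obj = record
    try:
        for step in if_exists_path.split("."):
            obj = obj[step]
    except (KeyError, TypeError):
        return "SKIPPED"  # analogous to FeedbackResultStatus.SKIPPED
    return run()

rec = {"calls": {"retrieve": {"rets": ["doc1"]}}}
print(maybe_run(rec, "calls.retrieve.rets", lambda: "DONE"))  # DONE
print(maybe_run(rec, "calls.rerank.rets", lambda: "DONE"))    # SKIPPED
```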

if_missing class-attribute instance-attribute

How to handle missing parameters in feedback function calls.

run_location instance-attribute

Where the feedback evaluation takes place (e.g., locally, on a Snowflake server, etc.).

selectors instance-attribute
selectors: Dict[str, Union[Lens, Selector]]

Selectors; pointers into Records of where to get arguments for imp. In OTEL mode, these are Selector objects; in legacy mode, these are Lens objects.
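The role selectors play can be sketched with plain dictionaries (an illustrative stand-in; trulens' Lens and Selector objects support far more than dotted paths): each implementation argument name is mapped to the value its selector points at inside the record.

```python
from typing import Any, Dict

def select(record: Dict[str, Any], path: str) -> Any:
    """Resolve a dotted path like 'main_input' or 'calls.retrieve.rets' in a nested dict."""
    value: Any = record
    for step in path.split("."):
        value = value[step]
    return value

record = {
    "main_input": "What is TruLens?",
    "calls": {"retrieve": {"rets": ["doc1", "doc2"]}},
}
selectors = {"question": "main_input", "contexts": "calls.retrieve.rets"}

# Build the kwargs that would be passed to the metric implementation.
kwargs = {arg: select(record, path) for arg, path in selectors.items()}
print(kwargs["question"])
```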

supplied_name class-attribute instance-attribute
supplied_name: Optional[str] = None

An optional name. Only affects displayed tables.

higher_is_better class-attribute instance-attribute
higher_is_better: Optional[bool] = None

Feedback result magnitude interpretation.

metric_type class-attribute instance-attribute
metric_type: Optional[str] = None

Implementation identifier for this metric.

E.g., "relevance", "groundedness", "text2sql". If not provided, defaults to the function name. This allows the same metric implementation to be used multiple times with different configurations and names.

description class-attribute instance-attribute
description: Optional[str] = None

Human-readable description of what this metric measures.

name property
name: str

Name of the metric.

Derived from the name of the function implementing it if no supplied name provided.

imp class-attribute instance-attribute
imp: Optional[ImpCallable] = implementation

Implementation callable.

A serialized version is stored at FeedbackDefinition.implementation.

agg class-attribute instance-attribute
agg: Optional[AggCallable] = agg

Aggregator method for metrics that produce more than one result.

A serialized version is stored at FeedbackDefinition.aggregator.

min_score_val class-attribute instance-attribute
min_score_val: Optional[int] = min_score_val

Minimum score value for the metric.

max_score_val class-attribute instance-attribute
max_score_val: Optional[int] = max_score_val

Maximum score value for the metric.

temperature class-attribute instance-attribute
temperature: Optional[float] = temperature

Temperature parameter for the metric.

groundedness_configs class-attribute instance-attribute
groundedness_configs: Optional[GroundednessConfigs] = (
    groundedness_configs
)

Optional groundedness configuration parameters.

enable_trace_compression class-attribute instance-attribute
enable_trace_compression: Optional[bool] = (
    enable_trace_compression
)

Whether to compress trace data to reduce token usage when sending traces to metrics.

When True, traces are compressed to preserve essential information while removing redundant data. When False, full uncompressed traces are used. When None (default), the metric's default behavior is used. This flag is only applicable to metrics that take 'trace' as an input parameter.

sig property
sig: Signature

Signature of the metric function implementation.

Functions
__init__
__init__(
    imp: Optional[Callable] = None,
    agg: Optional[Callable] = None,
    examples: Optional[List[Tuple]] = None,
    criteria: Optional[str] = None,
    additional_instructions: Optional[str] = None,
    min_score_val: Optional[int] = 0,
    max_score_val: Optional[int] = 3,
    temperature: Optional[float] = 0.0,
    groundedness_configs: Optional[
        GroundednessConfigs
    ] = None,
    enable_trace_compression: Optional[bool] = None,
    **kwargs
)

Initialize a Feedback (deprecated, use Metric instead).

PARAMETER DESCRIPTION
imp

The feedback function to execute. DEPRECATED: Use implementation instead.

TYPE: Optional[Callable] DEFAULT: None

agg

Aggregator function for combining multiple feedback results.

TYPE: Optional[Callable] DEFAULT: None

examples

User-supplied examples for this feedback function.

TYPE: Optional[List[Tuple]] DEFAULT: None

criteria

Criteria for the feedback evaluation.

TYPE: Optional[str] DEFAULT: None

additional_instructions

Custom instructions for the feedback function.

TYPE: Optional[str] DEFAULT: None

min_score_val

Minimum score value (default: 0).

TYPE: Optional[int] DEFAULT: 0

max_score_val

Maximum score value (default: 3).

TYPE: Optional[int] DEFAULT: 3

temperature

Temperature parameter for LLM-based feedback (default: 0.0).

TYPE: Optional[float] DEFAULT: 0.0

groundedness_configs

Optional groundedness configuration.

TYPE: Optional[GroundednessConfigs] DEFAULT: None

enable_trace_compression

Whether to compress trace data.

TYPE: Optional[bool] DEFAULT: None

**kwargs

Additional arguments passed to parent class.

DEFAULT: {}

__rich_repr__
__rich_repr__() -> Result

Requirement for pretty printing using the rich package.

load staticmethod
load(obj, *args, **kwargs)

Deserialize/load this object using the class information in tru_class_info to lookup the actual class that will do the deserialization.

model_validate classmethod
model_validate(*args, **kwargs) -> Any

Deserialize a jsonized version of the app into an instance of the class it was serialized from.

Note

This process uses extra information stored in the jsonized object and handled by WithClassInfo.

on_input_output
on_input_output() -> Metric

Specifies that the metric implementation arguments are to be the main app input and output in that order.

Returns a new Metric object with the specification.

on_default
on_default() -> Metric

Specifies that one-argument metrics should be evaluated on the main app output and two-argument metrics should be evaluated on the main input and main output, in that order.

Returns a new Metric object with this specification.
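The arity-based dispatch described above can be sketched as follows (a simplified stand-in, not the trulens implementation):

```python
import inspect

def on_default_args(func, main_input, main_output):
    """Sketch: choose default arguments by the metric function's arity."""
    n = len(inspect.signature(func).parameters)
    if n == 1:
        return (main_output,)
    if n == 2:
        return (main_input, main_output)
    raise TypeError("on_default expects a 1- or 2-argument metric")

print(on_default_args(lambda out: out, "question", "answer"))            # ('answer',)
print(on_default_args(lambda inp, out: out, "question", "answer"))       # ('question', 'answer')
```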

evaluate_deferred staticmethod
evaluate_deferred(
    session: TruSession,
    limit: Optional[int] = None,
    shuffle: bool = False,
    run_location: Optional[FeedbackRunLocation] = None,
) -> List[Tuple[Series, Future[FeedbackResult]]]

Evaluates metrics that were specified to be deferred.

Returns a list of tuples with the DB row containing the Metric and initial FeedbackResult as well as the Future which will contain the actual result.

PARAMETER DESCRIPTION
limit

The maximum number of evals to start.

TYPE: Optional[int] DEFAULT: None

shuffle

Shuffle the order of the metrics to evaluate.

TYPE: bool DEFAULT: False

run_location

Only run metrics with this run_location.

TYPE: Optional[FeedbackRunLocation] DEFAULT: None

Constants that govern behavior:

  • TruSession.RETRY_RUNNING_SECONDS: How long to wait before restarting a metric that was started but never finished (or failed without recording that fact).

  • TruSession.RETRY_FAILED_SECONDS: How long to wait to retry a failed metric.
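The deferred-evaluation pattern, returning the pending row alongside a Future that will hold the result, can be sketched like this (stand-in rows and a hypothetical status field; the real method reads rows from the session database):

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import List, Optional, Tuple

def evaluate_deferred(rows, limit: Optional[int] = None) -> List[Tuple[dict, Future]]:
    """Sketch: start each pending metric in a thread pool, returning (row, future) pairs."""
    pending = [r for r in rows if r["status"] == "pending"][:limit]
    pool = ThreadPoolExecutor(max_workers=4)
    return [(row, pool.submit(row["metric"], **row["ins"])) for row in pending]

rows = [
    {"status": "pending", "metric": lambda text: len(text), "ins": {"text": "hi"}},
    {"status": "done", "metric": None, "ins": {}},
]
started = evaluate_deferred(rows)
print(started[0][1].result())  # 2
```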

aggregate
aggregate(
    func: Optional[AggCallable] = None,
    combinations: Optional[FeedbackCombinations] = None,
) -> Metric

Specify the aggregation function in case the selectors for this metric generate more than one value for implementation argument(s). Can also specify the method of producing combinations of values in such cases.

Returns a new Metric object with the given aggregation function and/or the given combination mode.
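When a selector names several values, the metric runs once per combination and the aggregator collapses the per-call scores. A sketch of that flow with a hypothetical containment metric (the real combination modes come from FeedbackCombinations; "zip" here is only illustrative):

```python
from itertools import product
from statistics import mean

def run_combinations(imp, selected, mode="product"):
    """Sketch: call imp once per combination of selected values, then aggregate with mean."""
    names = list(selected)
    if mode == "zip":
        combos = zip(*selected.values())       # pair values positionally
    else:
        combos = product(*selected.values())   # cartesian product of all selected values
    return mean(imp(**dict(zip(names, c))) for c in combos)

# Hypothetical relevance of each retrieved context chunk to one question:
score = run_combinations(
    lambda question, context: float(question in context),
    {"question": ["cats"], "context": ["cats purr", "dogs bark"]},
)
print(score)  # 0.5
```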

on_prompt
on_prompt(arg: Optional[str] = None) -> Metric

Create a variant of self that will take in the main app input or "prompt" as input, sending it as an argument arg to implementation.

on_response
on_response(arg: Optional[str] = None) -> Metric

Create a variant of self that will take in the main app output or "response" as input, sending it as an argument arg to implementation.

on_context
on_context(
    arg: Optional[str] = None, *, collect_list: bool
)

Create a variant of self that will attempt to take in the context from a context retrieval as input, sending it as an argument arg to implementation.

on
on(*args, **kwargs) -> Metric

Create a variant of self with the same implementation but the given selectors. Those provided positionally get their implementation argument name guessed and those provided as kwargs get their name from the kwargs key.
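The name-guessing behavior can be sketched with inspect (a simplified stand-in; bind_selectors and the dotted-path strings are illustrative, not trulens APIs):

```python
import inspect

def bind_selectors(imp, *paths, **named_paths):
    """Sketch of .on(): positional selectors take parameter names from imp's signature."""
    params = list(inspect.signature(imp).parameters)
    selectors = dict(zip(params, paths))  # positional: guess names from the signature
    selectors.update(named_paths)         # kwargs: keep the given names
    return selectors

def relevance(question, context):
    ...

sel = bind_selectors(relevance, "record.main_input", context="calls.retrieve.rets")
print(sel)  # {'question': 'record.main_input', 'context': 'calls.retrieve.rets'}
```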

check_selectors
check_selectors(
    app: Union[AppDefinition, JSON],
    record: Record,
    source_data: Optional[Dict[str, Any]] = None,
    warning: bool = False,
) -> bool

Check that the selectors are valid for the given app and record.

PARAMETER DESCRIPTION
app

The app that produced the record.

TYPE: Union[AppDefinition, JSON]

record

The record that the metric will run on. This can be a mostly empty record for checking ahead of producing one. The utility method App.dummy_record is built for this purpose.

TYPE: Record

source_data

Additional data to select from when extracting metric function arguments.

TYPE: Optional[Dict[str, Any]] DEFAULT: None

warning

Issue a warning instead of raising an error if a selector is invalid. As some parts of a Record cannot be known ahead of producing it, it may be necessary to not raise exception here and only issue a warning.

TYPE: bool DEFAULT: False

RETURNS DESCRIPTION
bool

True if the selectors are valid; False otherwise (only possible when warning is set).

RAISES DESCRIPTION
ValueError

If a selector is invalid and warning is not set.

run
run(
    app: Optional[Union[AppDefinition, JSON]] = None,
    record: Optional[Record] = None,
    source_data: Optional[Dict] = None,
    **kwargs: Dict[str, Any]
) -> FeedbackResult

Run the metric on the given record. The app that produced the record is also required to determine input/output argument names.

PARAMETER DESCRIPTION
app

The app that produced the record. This can be AppDefinition or a jsonized AppDefinition. It will be jsonized if it is not already.

TYPE: Optional[Union[AppDefinition, JSON]] DEFAULT: None

record

The record to evaluate the metric on.

TYPE: Optional[Record] DEFAULT: None

source_data

Additional data to select from when extracting metric function arguments.

TYPE: Optional[Dict] DEFAULT: None

**kwargs

Any additional keyword arguments are used to set or override selected metric function inputs.

TYPE: Dict[str, Any] DEFAULT: {}

RETURNS DESCRIPTION
FeedbackResult

A FeedbackResult object with the result of the metric.

extract_selection
extract_selection(
    app: Optional[Union[AppDefinition, JSON]] = None,
    record: Optional[Record] = None,
    source_data: Optional[Dict] = None,
) -> Iterable[Dict[str, Any]]

Given the app that produced the given record, extract from the record the values that will be sent as arguments to the implementation, as specified by self.selectors. Additional data to select from can be provided in source_data. All arguments are optional. If a Record is specified, its calls are laid out as an app (see layout_calls_as_app).

SnowflakeFeedback

Bases: Feedback

[DEPRECATED] Similar to the parent class Feedback except this ensures the feedback is run only on the Snowflake server.

This class is deprecated and will be removed in the next major release. Please use Metric or Snowflake AI Observability instead.

Attributes
tru_class_info instance-attribute
tru_class_info: Class

Class information of this pydantic object for use in deserialization.

Using this odd key to not pollute attribute names in whatever class we mix this into. Should be the same as CLASS_INFO.

implementation class-attribute instance-attribute
implementation: Optional[Union[Function, Method]] = None

Implementation serialization.

aggregator class-attribute instance-attribute
aggregator: Optional[Union[Function, Method]] = None

Aggregator method serialization.

examples class-attribute instance-attribute
examples: Optional[List[Tuple]] = examples

Examples to use when evaluating the metric.

criteria class-attribute instance-attribute
criteria: Optional[str] = criteria

Criteria for the metric.

additional_instructions class-attribute instance-attribute
additional_instructions: Optional[str] = (
    additional_instructions
)

Additional instructions for the metric.

combinations class-attribute instance-attribute

Mode of combining selected values to produce arguments to each feedback function call.

feedback_definition_id instance-attribute
feedback_definition_id: FeedbackDefinitionID = (
    feedback_definition_id
)

Unique identifier; if not given, it is determined from the content.

if_exists class-attribute instance-attribute
if_exists: Optional[Lens] = None

Only execute the feedback function if the following selector names something that exists in a record/app.

Can use this to evaluate conditionally on presence of some calls, for example. Feedbacks skipped this way will have a status of FeedbackResultStatus.SKIPPED.

if_missing class-attribute instance-attribute

How to handle missing parameters in feedback function calls.

selectors instance-attribute
selectors: Dict[str, Union[Lens, Selector]]

Selectors; pointers into Records of where to get arguments for imp. In OTEL mode, these are Selector objects; in legacy mode, these are Lens objects.

supplied_name class-attribute instance-attribute
supplied_name: Optional[str] = None

An optional name. Only affects displayed tables.

higher_is_better class-attribute instance-attribute
higher_is_better: Optional[bool] = None

Feedback result magnitude interpretation.

metric_type class-attribute instance-attribute
metric_type: Optional[str] = None

Implementation identifier for this metric.

E.g., "relevance", "groundedness", "text2sql". If not provided, defaults to the function name. This allows the same metric implementation to be used multiple times with different configurations and names.

description class-attribute instance-attribute
description: Optional[str] = None

Human-readable description of what this metric measures.

name property
name: str

Name of the metric.

Derived from the name of the function implementing it if no supplied name provided.

imp class-attribute instance-attribute
imp: Optional[ImpCallable] = implementation

Implementation callable.

A serialized version is stored at FeedbackDefinition.implementation.

agg class-attribute instance-attribute
agg: Optional[AggCallable] = agg

Aggregator method for metrics that produce more than one result.

A serialized version is stored at FeedbackDefinition.aggregator.

min_score_val class-attribute instance-attribute
min_score_val: Optional[int] = min_score_val

Minimum score value for the metric.

max_score_val class-attribute instance-attribute
max_score_val: Optional[int] = max_score_val

Maximum score value for the metric.

temperature class-attribute instance-attribute
temperature: Optional[float] = temperature

Temperature parameter for the metric.

groundedness_configs class-attribute instance-attribute
groundedness_configs: Optional[GroundednessConfigs] = (
    groundedness_configs
)

Optional groundedness configuration parameters.

enable_trace_compression class-attribute instance-attribute
enable_trace_compression: Optional[bool] = (
    enable_trace_compression
)

Whether to compress trace data to reduce token usage when sending traces to metrics.

When True, traces are compressed to preserve essential information while removing redundant data. When False, full uncompressed traces are used. When None (default), the metric's default behavior is used. This flag is only applicable to metrics that take 'trace' as an input parameter.

sig property
sig: Signature

Signature of the metric function implementation.

Functions
__rich_repr__
__rich_repr__() -> Result

Requirement for pretty printing using the rich package.

load staticmethod
load(obj, *args, **kwargs)

Deserialize/load this object using the class information in tru_class_info to lookup the actual class that will do the deserialization.

model_validate classmethod
model_validate(*args, **kwargs) -> Any

Deserialize a jsonized version of the app into an instance of the class it was serialized from.

Note

This process uses extra information stored in the jsonized object and handled by WithClassInfo.

on_input_output
on_input_output() -> Metric

Specifies that the metric implementation arguments are to be the main app input and output in that order.

Returns a new Metric object with the specification.

on_default
on_default() -> Metric

Specifies that one-argument metrics should be evaluated on the main app output and two-argument metrics should be evaluated on the main input and main output, in that order.

Returns a new Metric object with this specification.

evaluate_deferred staticmethod
evaluate_deferred(
    session: TruSession,
    limit: Optional[int] = None,
    shuffle: bool = False,
    run_location: Optional[FeedbackRunLocation] = None,
) -> List[Tuple[Series, Future[FeedbackResult]]]

Evaluates metrics that were specified to be deferred.

Returns a list of tuples with the DB row containing the Metric and initial FeedbackResult as well as the Future which will contain the actual result.

PARAMETER DESCRIPTION
limit

The maximum number of evals to start.

TYPE: Optional[int] DEFAULT: None

shuffle

Shuffle the order of the metrics to evaluate.

TYPE: bool DEFAULT: False

run_location

Only run metrics with this run_location.

TYPE: Optional[FeedbackRunLocation] DEFAULT: None

Constants that govern behavior:

  • TruSession.RETRY_RUNNING_SECONDS: How long to wait before restarting a metric that was started but never finished (or failed without recording that fact).

  • TruSession.RETRY_FAILED_SECONDS: How long to wait to retry a failed metric.

aggregate
aggregate(
    func: Optional[AggCallable] = None,
    combinations: Optional[FeedbackCombinations] = None,
) -> Metric

Specify the aggregation function in case the selectors for this metric generate more than one value for implementation argument(s). Can also specify the method of producing combinations of values in such cases.

Returns a new Metric object with the given aggregation function and/or the given combination mode.

on_prompt
on_prompt(arg: Optional[str] = None) -> Metric

Create a variant of self that will take in the main app input or "prompt" as input, sending it as an argument arg to implementation.

on_response
on_response(arg: Optional[str] = None) -> Metric

Create a variant of self that will take in the main app output or "response" as input, sending it as an argument arg to implementation.

on_context
on_context(
    arg: Optional[str] = None, *, collect_list: bool
)

Create a variant of self that will attempt to take in the context from a context retrieval as input, sending it as an argument arg to implementation.

on
on(*args, **kwargs) -> Metric

Create a variant of self with the same implementation but the given selectors. Those provided positionally get their implementation argument name guessed and those provided as kwargs get their name from the kwargs key.

check_selectors
check_selectors(
    app: Union[AppDefinition, JSON],
    record: Record,
    source_data: Optional[Dict[str, Any]] = None,
    warning: bool = False,
) -> bool

Check that the selectors are valid for the given app and record.

PARAMETER DESCRIPTION
app

The app that produced the record.

TYPE: Union[AppDefinition, JSON]

record

The record that the metric will run on. This can be a mostly empty record for checking ahead of producing one. The utility method App.dummy_record is built for this purpose.

TYPE: Record

source_data

Additional data to select from when extracting metric function arguments.

TYPE: Optional[Dict[str, Any]] DEFAULT: None

warning

Issue a warning instead of raising an error if a selector is invalid. As some parts of a Record cannot be known ahead of producing it, it may be necessary to not raise exception here and only issue a warning.

TYPE: bool DEFAULT: False

RETURNS DESCRIPTION
bool

True if the selectors are valid; False otherwise (only possible when warning is set).

RAISES DESCRIPTION
ValueError

If a selector is invalid and warning is not set.

run
run(
    app: Optional[Union[AppDefinition, JSON]] = None,
    record: Optional[Record] = None,
    source_data: Optional[Dict] = None,
    **kwargs: Dict[str, Any]
) -> FeedbackResult

Run the metric on the given record. The app that produced the record is also required to determine input/output argument names.

PARAMETER DESCRIPTION
app

The app that produced the record. This can be AppDefinition or a jsonized AppDefinition. It will be jsonized if it is not already.

TYPE: Optional[Union[AppDefinition, JSON]] DEFAULT: None

record

The record to evaluate the metric on.

TYPE: Optional[Record] DEFAULT: None

source_data

Additional data to select from when extracting metric function arguments.

TYPE: Optional[Dict] DEFAULT: None

**kwargs

Any additional keyword arguments are used to set or override selected metric function inputs.

TYPE: Dict[str, Any] DEFAULT: {}

RETURNS DESCRIPTION
FeedbackResult

A FeedbackResult object with the result of the metric.

extract_selection
extract_selection(
    app: Optional[Union[AppDefinition, JSON]] = None,
    record: Optional[Record] = None,
    source_data: Optional[Dict] = None,
) -> Iterable[Dict[str, Any]]

Given the app that produced the given record, extract from the record the values that will be sent as arguments to the implementation, as specified by self.selectors. Additional data to select from can be provided in source_data. All arguments are optional. If a Record is specified, its calls are laid out as an app (see layout_calls_as_app).