trulens.feedback.v2.feedback

BACKWARD-COMPATIBILITY SHIM.

All template classes have moved to trulens.feedback.templates. This module re-exports them so that existing imports of the form from trulens.feedback.v2.feedback import Groundedness keep working.

Prefer importing from trulens.feedback.templates in new code.
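The re-export pattern described above can be sketched without installing anything. The example below is a minimal illustration, not the library's actual source: the module names `demo_templates` and `demo_v2_feedback` and the placeholder `Groundedness` class are hypothetical stand-ins for `trulens.feedback.templates`, `trulens.feedback.v2.feedback`, and the real template class. It assumes the shim re-exports the same class objects rather than copies.

```python
import sys
import types

# Hypothetical stand-in for the new canonical module.
new_mod = types.ModuleType("demo_templates")


class Groundedness:
    """Placeholder for a template class defined in the new location."""


new_mod.Groundedness = Groundedness
sys.modules["demo_templates"] = new_mod

# Hypothetical stand-in for the shim: it only re-exports the name
# from the new location and defines nothing of its own.
shim_mod = types.ModuleType("demo_v2_feedback")
shim_mod.Groundedness = new_mod.Groundedness
sys.modules["demo_v2_feedback"] = shim_mod

# Old-style and new-style imports resolve to the same class object.
from demo_v2_feedback import Groundedness as OldPathGroundedness
from demo_templates import Groundedness as NewPathGroundedness

same_object = OldPathGroundedness is NewPathGroundedness
```

Under that assumption, `isinstance` checks and subclassing behave identically whichever path the class was imported from, which is why switching old imports to the new path should be a mechanical change.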

Classes

LogicalConsistency dataclass

Bases: Semantics, WithPrompt, CriteriaOutputSpaceMixin

Evaluates the logical consistency of the agentic system's plan and execution.

ExecutionEfficiency dataclass

Bases: Semantics, WithPrompt, CriteriaOutputSpaceMixin

Evaluates the efficiency of the agentic system's execution.

PlanAdherence dataclass

Bases: Semantics, WithPrompt, CriteriaOutputSpaceMixin

Evaluates the adherence of the agentic system's execution to the agentic system's plan.

PlanQuality dataclass

Bases: Semantics, WithPrompt, CriteriaOutputSpaceMixin

Evaluates the quality of the agentic system's plan to address the user's query.

ToolSelection dataclass

Bases: Semantics, WithPrompt, CriteriaOutputSpaceMixin

Evaluates the agent's choice of tools for its tasks/subtasks given tool descriptions. Mapped to PLAN (lower-level complement to Plan Quality). Excludes execution efficiency and adherence; focuses on suitability of selection.

ToolCalling dataclass

Bases: Semantics, WithPrompt, CriteriaOutputSpaceMixin

Evaluates the agent's tool invocation quality that is within the agent's control: argument validity/completeness, semantic appropriateness, preconditions/postconditions, and output interpretation. Mapped to ACT (specialized complement to Plan Adherence). Excludes selection and efficiency.

ToolQuality dataclass

Bases: Semantics, WithPrompt, CriteriaOutputSpaceMixin

Evaluates the tool/system side quality and reliability observed in the trace (external errors, availability, stability, domain-specific output quality like search relevance). Independent of agent behavior; complements GPA by isolating tool-side failures.

Criteria

Bases: str, Enum

A criterion to evaluate.
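Because the class derives from both str and Enum, its members are also plain strings. The sketch below shows that general Python behavior with a hypothetical `CriteriaSketch` enum; the member names and values are invented for illustration and are not the library's actual criteria.

```python
from enum import Enum


class CriteriaSketch(str, Enum):
    # Hypothetical members, for illustration only.
    RELEVANCE = "relevance"
    COHERENCE = "coherence"


# Look a member up by its value.
member = CriteriaSketch("relevance")

# Members of a str-mixin enum are instances of str and compare
# equal to their underlying string value.
is_plain_string = isinstance(member, str)
equals_value = member == "relevance"
```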

OutputSpace

Bases: Enum

Enum for valid output spaces of scores.

FewShotExamples

Bases: BaseModel

Functions
from_examples_list classmethod
from_examples_list(
    examples_list: List[Tuple[Dict[str, str], int]]
) -> FewShotExamples

Create a FewShotExamples instance from a list of examples.

PARAMETER DESCRIPTION
examples_list

A list of tuples where the first element is the feedback_args and the second element is the score.

TYPE: List[Tuple[Dict[str, str], int]]

RETURNS DESCRIPTION
FewShotExamples

An instance of FewShotExamples with the provided examples.
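The shape of the examples_list argument can be illustrated with a small stand-in. This is a sketch under stated assumptions, not the library's implementation: `ExampleSketch` and `FewShotExamplesSketch` are hypothetical dataclasses (the real class is a pydantic BaseModel), and the `question` and `answer` keys are invented feedback arguments.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ExampleSketch:
    # Hypothetical container for one example: arguments plus a score.
    feedback_args: Dict[str, str]
    score: int


@dataclass
class FewShotExamplesSketch:
    # Hypothetical stand-in for FewShotExamples.
    examples: List[ExampleSketch]

    @classmethod
    def from_examples_list(
        cls, examples_list: List[Tuple[Dict[str, str], int]]
    ) -> "FewShotExamplesSketch":
        # Unpack each (feedback_args, score) tuple into one example.
        return cls(
            examples=[
                ExampleSketch(feedback_args=args, score=score)
                for args, score in examples_list
            ]
        )


# Each tuple pairs the feedback arguments with the score for that example.
examples_list = [
    ({"question": "What is 2 + 2?", "answer": "4"}, 3),
    ({"question": "What is 2 + 2?", "answer": "5"}, 0),
]
few_shot = FewShotExamplesSketch.from_examples_list(examples_list)
```

The input stays a plain list of tuples, so examples can be written inline or loaded from data before being wrapped.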

FeedbackOutput

Bases: BaseModel

Feedback functions produce at least a floating score.

ClassificationModel

Bases: Model

Functions
of_prompt staticmethod
of_prompt(model: CompletionModel, prompt: str) -> None

Define a classification model from a completion model, a prompt, and optional examples.

Sentiment dataclass

Bases: Semantics, WithPrompt, CriteriaOutputSpaceMixin

Evaluates the positive sentiment of either the prompt or the response.

Harmfulness

Bases: Moderation, WithPrompt

Examples of Harmfulness:

Insensitivity

Bases: Semantics, WithPrompt

Examples and categorization of racial insensitivity: https://sph.umn.edu/site/docs/hewg/microaggressions.pdf .

Maliciousness

Bases: Moderation, WithPrompt

Examples of maliciousness:

HateThreatening

Bases: Hate

Examples of (not) Threatening Hate metrics:

  • openai package: openai.moderation category hate/threatening.

SelfHarm

Bases: Moderation

Examples of (not) Self Harm metrics:

  • openai package: openai.moderation category self-harm.

Sexual

Bases: Moderation

Examples of (not) Sexual metrics:

  • openai package: openai.moderation category sexual.

SexualMinors

Bases: Sexual

Examples of (not) Sexual Minors metrics:

  • openai package: openai.moderation category sexual/minors.

Violence

Bases: Moderation

Examples of (not) Violence metrics:

  • openai package: openai.moderation category violence.

GraphicViolence

Bases: Violence

Examples of (not) Graphic Violence:

  • openai package: openai.moderation category violence/graphic.