trulens.feedback.templates.agent
Agentic evaluation templates: logical consistency, execution efficiency, plan adherence, plan quality, tool selection, tool calling, tool quality.
Classes
LogicalConsistency (dataclass)
Bases: Semantics, WithPrompt, CriteriaOutputSpaceMixin
Evaluates the logical consistency of the agentic system's plan and execution.
ExecutionEfficiency (dataclass)
Bases: Semantics, WithPrompt, CriteriaOutputSpaceMixin
Evaluates the efficiency of the agentic system's execution.
PlanAdherence (dataclass)
Bases: Semantics, WithPrompt, CriteriaOutputSpaceMixin
Evaluates how closely the agentic system's execution adheres to its own plan.
PlanQuality (dataclass)
Bases: Semantics, WithPrompt, CriteriaOutputSpaceMixin
Evaluates the quality of the agentic system's plan to address the user's query.
ToolSelection (dataclass)
Bases: Semantics, WithPrompt, CriteriaOutputSpaceMixin
Evaluates the agent's choice of tools for its tasks and subtasks, given the tool descriptions. Mapped to PLAN as a lower-level complement to Plan Quality. Focuses on the suitability of the selection; excludes execution efficiency and adherence.
ToolCalling (dataclass)
Bases: Semantics, WithPrompt, CriteriaOutputSpaceMixin
Evaluates the aspects of tool invocation that are within the agent's control: argument validity and completeness, semantic appropriateness, preconditions and postconditions, and output interpretation. Mapped to ACT as a specialized complement to Plan Adherence. Excludes selection and efficiency.
ToolQuality (dataclass)
Bases: Semantics, WithPrompt, CriteriaOutputSpaceMixin
Evaluates tool-side and system-side quality and reliability as observed in the trace: external errors, availability, stability, and domain-specific output quality such as search relevance. Independent of agent behavior; complements GPA by isolating tool-side failures.
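The three tool-related templates divide responsibility along the lines stated in their descriptions. The split can be summarized with a small lookup, sketched below in plain Python. This is not part of the TruLens API: `TemplateScope`, `SCOPES`, `ISSUE_TO_TEMPLATE`, `template_for_issue`, and the issue labels are all hypothetical names invented for illustration. Only the class names, the PLAN/ACT mappings, and the exclusions come from the descriptions above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TemplateScope:
    """Hypothetical summary record; not a TruLens class."""

    name: str                  # class name as listed in this module
    phase: Optional[str]       # "PLAN" or "ACT" where the docs state a mapping
    excludes: Tuple[str, ...]  # concerns the docs describe as out of scope


# Scope facts copied from the class descriptions above.
SCOPES = {
    "ToolSelection": TemplateScope(
        "ToolSelection", "PLAN", ("execution efficiency", "adherence")
    ),
    "ToolCalling": TemplateScope(
        "ToolCalling", "ACT", ("selection", "efficiency")
    ),
    # ToolQuality has no stated PLAN/ACT mapping; it is tool-side only.
    "ToolQuality": TemplateScope("ToolQuality", None, ("agent behavior",)),
}

# Hypothetical issue labels, invented for this sketch.
ISSUE_TO_TEMPLATE = {
    "unsuitable_tool_chosen": "ToolSelection",
    "invalid_or_missing_arguments": "ToolCalling",
    "tool_output_misread": "ToolCalling",
    "external_tool_error": "ToolQuality",
    "tool_unavailable": "ToolQuality",
}


def template_for_issue(issue: str) -> TemplateScope:
    """Return the scope record of the template that covers an issue label."""
    try:
        return SCOPES[ISSUE_TO_TEMPLATE[issue]]
    except KeyError:
        raise ValueError(f"unknown issue label: {issue!r}") from None


if __name__ == "__main__":
    for issue in ISSUE_TO_TEMPLATE:
        scope = template_for_issue(issue)
        print(f"{issue} -> {scope.name} (phase: {scope.phase})")
```

Keeping the mapping as data rather than branching logic makes the non-overlap between the three templates easy to audit: each issue label resolves to exactly one template.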