TruLens: scale up and accelerate LLM app evaluation
Create credible and powerful LLM apps, faster. TruLens is a software tool that helps you
objectively measure the quality and effectiveness of your LLM-based applications using feedback
functions. Feedback functions programmatically evaluate the quality of inputs, outputs, and
intermediate results, so that you can speed up and scale up experiment evaluation. TruLens
supports a wide variety of use cases, including question answering, retrieval-augmented
generation, and agent-based applications.
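To make the idea of a feedback function concrete, here is a minimal sketch in plain Python. It does not use the TruLens API: the Record type, the two scoring functions, and the evaluate helper are all hypothetical names invented for illustration, and the word-overlap score is a deliberately crude stand-in for a real quality metric.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set


@dataclass
class Record:
    """Hypothetical record of one app call (not a TruLens type)."""
    prompt: str     # input to the app
    context: str    # intermediate result, e.g. retrieved text
    response: str   # final output


# Assumption: a feedback function maps a record to a score in [0, 1].
FeedbackFn = Callable[[Record], float]


def _words(text: str) -> Set[str]:
    """Lowercased words with surrounding punctuation stripped."""
    return {w.strip(".,!?;:").lower() for w in text.split()} - {""}


def response_overlap_with_context(record: Record) -> float:
    """Fraction of distinct response words that also appear in the context."""
    response_words = _words(record.response)
    if not response_words:
        return 0.0
    return len(response_words & _words(record.context)) / len(response_words)


def response_is_concise(record: Record, max_words: int = 50) -> float:
    """1.0 if the response has at most max_words words, else 0.0."""
    return 1.0 if len(record.response.split()) <= max_words else 0.0


def evaluate(record: Record, feedbacks: Dict[str, FeedbackFn]) -> Dict[str, float]:
    """Run every feedback function on one record and collect the scores."""
    return {name: fn(record) for name, fn in feedbacks.items()}


if __name__ == "__main__":
    record = Record(
        prompt="What is the capital of France?",
        context="Paris is the capital of France.",
        response="The capital is Paris.",
    )
    feedbacks = {
        "overlap": response_overlap_with_context,
        "concise": response_is_concise,
    }
    print(evaluate(record, feedbacks))
```

Because each score is computed by code rather than by manual review, the same checks can be rerun over every record an experiment produces.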
Evaluate how your choices perform across multiple feedback functions.
Use and extend a library of built-in feedback functions. Identify where apps are weak to
guide iteration on prompts, hyperparameters, and more.
Compare different LLM chains on a metrics leaderboard to pick the best-performing one.
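The leaderboard comparison can be sketched in a few lines of plain Python. This is an illustration only, not how TruLens builds its leaderboard: the leaderboard function, the chain names, the metric names, and the scores are all made up, and ranking by the unweighted mean of per-metric means is an assumption.

```python
from statistics import mean
from typing import Dict, List, Tuple

# Hypothetical input shape: chain name -> metric name -> per-record scores.
Scores = Dict[str, Dict[str, List[float]]]
Row = Tuple[str, Dict[str, float], float]


def leaderboard(scores: Scores) -> List[Row]:
    """Rank chains by the unweighted mean of their per-metric mean scores.

    Every metric must have at least one score; statistics.mean raises
    StatisticsError on an empty list.
    """
    rows: List[Row] = []
    for chain, by_metric in scores.items():
        metric_means = {m: mean(vals) for m, vals in by_metric.items()}
        overall = mean(list(metric_means.values()))
        rows.append((chain, metric_means, overall))
    # Highest overall score first.
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows


if __name__ == "__main__":
    # Made-up scores for two hypothetical chains.
    example: Scores = {
        "chain_a": {"relevance": [0.8, 0.6], "groundedness": [0.9, 0.7]},
        "chain_b": {"relevance": [0.5, 0.5], "groundedness": [1.0, 0.6]},
    }
    for chain, metric_means, overall in leaderboard(example):
        print(f"{chain}: overall={overall:.2f} {metric_means}")
```

With these example numbers chain_a ranks first. A real comparison may weight metrics differently or also track cost and latency, so the ranking rule should be treated as a design choice.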