trulens.benchmark.benchmark_frameworks