Claude 3 Quickstart
In this quickstart you will learn how to use Anthropic's Claude 3 to run feedback functions by using LiteLLM as the feedback provider.
Anthropic is an AI safety and research company working to build reliable, interpretable, and steerable AI systems. Claude is Anthropic's AI assistant, and Claude 3 is the latest model family. Claude 3 comes in three varieties, Haiku, Sonnet, and Opus, all of which can be used to run feedback functions.
# !pip install trulens trulens-providers-litellm chromadb openai
import os
os.environ["OPENAI_API_KEY"] = "sk-..." # for running application only
os.environ["ANTHROPIC_API_KEY"] = "sk-..." # for running feedback functions
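Both keys must be set before any of the later cells run: the OpenAI key for the application and the Anthropic key for the feedback functions. A minimal sketch of a pre-flight check follows; `missing_keys` is a hypothetical helper written for this example, not part of TruLens or LiteLLM.

```python
import os


def missing_keys(names, env=None):
    """Return the names that are unset or empty in the given environment.

    Hypothetical helper for illustration; defaults to os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


required = ["OPENAI_API_KEY", "ANTHROPIC_API_KEY"]
missing = missing_keys(required)
if missing:
    print("Missing API keys:", ", ".join(missing))
```

Note that the check only confirms the variables are non-empty; it cannot tell whether a key is valid.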
import os
from litellm import completion
messages = [{"role": "user", "content": "Hey! how's it going?"}]
response = completion(model="claude-3-haiku-20240307", messages=messages)
print(response)
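Printing the response shows the whole response object. To read only the reply text, the sketch below assumes LiteLLM returns an OpenAI-style response, with the text at `choices[0].message.content`, the same access path the OpenAI client uses later in this quickstart. `first_reply_text` and `fake_response` are hypothetical names; the stand-in object lets the sketch run without an API key.

```python
from types import SimpleNamespace


def first_reply_text(response):
    """Return the text of the first choice in an OpenAI-style chat response."""
    return response.choices[0].message.content


# Stand-in with the assumed response shape; not real model output.
fake_response = SimpleNamespace(
    choices=[
        SimpleNamespace(message=SimpleNamespace(content="Doing well, thanks!"))
    ]
)

print(first_reply_text(fake_response))
```

With a live key, passing the `response` from the call above to `first_reply_text` should work the same way, provided the response has this shape.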
Get Data
In this case, we'll just initialize some simple text in the notebook.
university_info = """
The University of Washington, founded in 1861 in Seattle, is a public research university
with over 45,000 students across three campuses in Seattle, Tacoma, and Bothell.
As the flagship institution of the six public universities in Washington state,
UW encompasses over 500 buildings and 20 million square feet of space,
including one of the largest library systems in the world.
"""
Create Vector Store
Create a chromadb vector store in memory.
from openai import OpenAI
oai_client = OpenAI()
oai_client.embeddings.create(
model="text-embedding-ada-002", input=university_info
)
import chromadb
from chromadb.utils.embedding_functions import OpenAIEmbeddingFunction
embedding_function = OpenAIEmbeddingFunction(
api_key=os.environ.get("OPENAI_API_KEY"),
model_name="text-embedding-ada-002",
)
chroma_client = chromadb.Client()
vector_store = chroma_client.get_or_create_collection(
name="Universities", embedding_function=embedding_function
)
Add the university_info to the embedding database.
vector_store.add("uni_info", documents=university_info)
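To make the retrieval step less opaque, here is a toy sketch of what an in-memory vector store does conceptually: it keeps one embedding per document and ranks documents by similarity to the query embedding. This is an illustration only. The three-dimensional vectors are invented, cosine similarity is used as a stand-in, and Chroma's actual index and distance metric may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, store, k=2):
    """Return the ids of the k documents most similar to query_vec.

    `store` maps a document id to its embedding vector.
    """
    ranked = sorted(
        store,
        key=lambda doc_id: cosine_similarity(query_vec, store[doc_id]),
        reverse=True,
    )
    return ranked[:k]


# Invented toy embeddings; real embeddings have many more dimensions.
toy_store = {
    "uni_info": [0.9, 0.1, 0.0],
    "weather": [0.0, 0.2, 0.9],
}

print(top_k([1.0, 0.0, 0.0], toy_store, k=1))
```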
Build RAG from scratch
Build a custom RAG from scratch, and add TruLens custom instrumentation.
from trulens.core import TruSession
from trulens.apps.custom import instrument
session = TruSession()
session.reset_database()
class RAG_from_scratch:
    @instrument
    def retrieve(self, query: str) -> list:
        """
        Retrieve relevant text from vector store.
        """
        results = vector_store.query(query_texts=query, n_results=2)
        return results["documents"][0]

    @instrument
    def generate_completion(self, query: str, context_str: list) -> str:
        """
        Generate answer from context.
        """
        completion = (
            oai_client.chat.completions.create(
                model="gpt-3.5-turbo",
                temperature=0,
                messages=[
                    {
                        "role": "user",
                        "content": f"We have provided context information below. \n"
                        f"---------------------\n"
                        f"{context_str}"
                        f"\n---------------------\n"
                        f"Given this information, please answer the question: {query}",
                    }
                ],
            )
            .choices[0]
            .message.content
        )
        return completion

    @instrument
    def query(self, query: str) -> str:
        context_str = self.retrieve(query)
        completion = self.generate_completion(query, context_str)
        return completion
rag = RAG_from_scratch()
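In generate_completion, context_str is the list returned by retrieve, so the f-string interpolates the list's Python representation, brackets and quotes included. If plain text is preferred in the prompt, one option is to join the chunks first. The sketch below is an optional variation, not part of the original notebook; `build_prompt` is a hypothetical helper that reuses the same prompt wording.

```python
def build_prompt(query, context_chunks):
    """Assemble the RAG prompt with the context chunks joined as plain text.

    Hypothetical helper; the wording mirrors the prompt used in
    generate_completion.
    """
    context = "\n".join(context_chunks)
    return (
        "We have provided context information below. \n"
        "---------------------\n"
        f"{context}"
        "\n---------------------\n"
        f"Given this information, please answer the question: {query}"
    )


print(build_prompt("When was UW founded?", ["UW was founded in 1861.", "It is in Seattle."]))
```

Whether this changes answer quality depends on the model; the point is only that the prompt then contains the chunk text without list syntax.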
Set up feedback functions.
Here we'll use groundedness, answer relevance, and context relevance to detect hallucination, along with a coherence check on the output.
import numpy as np
from trulens.core import Feedback
from trulens.core import Select
from trulens.feedback.v2.feedback import Groundedness
from trulens.providers.litellm import LiteLLM
# Initialize LiteLLM-based feedback function collection class:
provider = LiteLLM(model_engine="claude-3-opus-20240229")
grounded = Groundedness(groundedness_provider=provider)
# Define a groundedness feedback function
f_groundedness = (
Feedback(
provider.groundedness_measure_with_cot_reasons, name="Groundedness"
)
.on(Select.RecordCalls.retrieve.rets.collect())
.on_output()
)
# Question/answer relevance between overall question and answer.
f_answer_relevance = (
Feedback(provider.relevance_with_cot_reasons, name="Answer Relevance")
.on(Select.RecordCalls.retrieve.args.query)
.on_output()
)
# Question/statement relevance between question and each context chunk.
f_context_relevance = (
Feedback(
provider.context_relevance_with_cot_reasons, name="Context Relevance"
)
.on(Select.RecordCalls.retrieve.args.query)
.on(Select.RecordCalls.retrieve.rets.collect())
.aggregate(np.mean)
)
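The `.aggregate(np.mean)` call reduces a collection of scores to a single number by taking their arithmetic mean. A minimal sketch of that reduction step follows, using the standard library instead of NumPy. The scores are invented for illustration, and `aggregate` is a hypothetical function, not the TruLens method.

```python
from statistics import mean

# Invented relevance scores in [0, 1]; not real model output.
chunk_scores = [1.0, 0.5, 0.0]


def aggregate(scores, reducer=mean):
    """Reduce a list of scores to one value; return None for an empty list."""
    if not scores:
        return None
    return reducer(scores)


print(aggregate(chunk_scores))       # arithmetic mean of the scores
print(aggregate(chunk_scores, min))  # a stricter choice: the weakest score
```

The mean rewards retrievals that are relevant on average, while a reducer such as `min` penalizes any single irrelevant chunk.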
f_coherence = Feedback(
provider.coherence_with_cot_reasons, name="coherence"
).on_output()
grounded.groundedness_measure_with_cot_reasons(
    """The University of Washington, founded in 1861 in Seattle, is a public research university
with over 45,000 students across three campuses in Seattle, Tacoma, and Bothell.
As the flagship institution of the six public universities in Washington state,
UW encompasses over 500 buildings and 20 million square feet of space,
including one of the largest library systems in the world.""",
    "The University of Washington was founded in 1861. It is the flagship institution of the state of Washington.",
)
Construct the app
Wrap the custom RAG with TruCustomApp and pass the list of feedback functions to use for evaluation.
from trulens.apps.custom import TruCustomApp
tru_rag = TruCustomApp(
rag,
app_name="RAG",
app_version="v1",
feedbacks=[
f_groundedness,
f_answer_relevance,
f_context_relevance,
f_coherence,
],
)
Run the app
Use tru_rag as a context manager for the custom RAG-from-scratch app.
with tru_rag as recording:
    rag.query("Give me a long history of U Dub")
session.get_leaderboard(app_ids=[tru_rag.app_id])
from trulens.dashboard import run_dashboard
run_dashboard(session)