📓 Logging Human Feedback¶
In many situations, it is useful to log human feedback from your users about your LLM app's performance. Combining human feedback with automated feedback can help you drill down on subsets of your app that underperform and uncover new failure modes. This example walks you through recording human feedback with TruLens.
# ! pip install trulens_eval openai
import os
from trulens_eval import Tru
from trulens_eval import TruCustomApp
tru = Tru()
Set Keys¶
For this example, you need an OpenAI key.
os.environ["OPENAI_API_KEY"] = "sk-..."
Set up your app¶
Here we set up a custom application using just an OpenAI chat completion. The process for logging human feedback is the same however you choose to set up your app.
from openai import OpenAI
from trulens_eval.tru_custom_app import instrument

oai_client = OpenAI()

class APP:
    @instrument
    def completion(self, prompt):
        # Send the prompt as a single user message and return the reply text
        completion = oai_client.chat.completions.create(
            model="gpt-3.5-turbo",
            temperature=0,
            messages=[
                {
                    "role": "user",
                    "content": f"Please answer the question: {prompt}",
                }
            ],
        ).choices[0].message.content
        return completion

llm_app = APP()

# add trulens as a context manager for llm_app
tru_app = TruCustomApp(llm_app, app_id="LLM App v1")
Run the app¶
with tru_app as recording:
    llm_app.completion("Give me 10 names for a colorful sock company")
# Get the record to add the feedback to.
record = recording.get()
Create a mechanism for recording human feedback.¶
Be sure to click one of the emoji buttons so that human_feedback holds a value to log.
from ipywidgets import Button, HBox

thumbs_up_button = Button(description='👍')
thumbs_down_button = Button(description='👎')

# Stays None until one of the buttons is clicked
human_feedback = None

def on_thumbs_up_button_clicked(b):
    global human_feedback
    human_feedback = 1

def on_thumbs_down_button_clicked(b):
    global human_feedback
    human_feedback = 0

thumbs_up_button.on_click(on_thumbs_up_button_clicked)
thumbs_down_button.on_click(on_thumbs_down_button_clicked)

HBox([thumbs_up_button, thumbs_down_button])
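The callbacks above store the vote in a module-level global. One possible alternative, sketched here in plain Python, is to keep the vote on a small object and register its bound methods as the click handlers. `FeedbackState` is a hypothetical helper introduced for illustration and is not part of TruLens or ipywidgets.

```python
class FeedbackState:
    """Holds the latest thumbs-up / thumbs-down vote (hypothetical helper)."""

    def __init__(self):
        # None means no button has been clicked yet
        self.value = None

    def thumbs_up(self, _button=None):
        # Click handlers receive the clicked button; it is unused here
        self.value = 1

    def thumbs_down(self, _button=None):
        self.value = 0


state = FeedbackState()
state.thumbs_up()
print(state.value)  # 1
```

With the buttons defined earlier, the handlers would be attached as `thumbs_up_button.on_click(state.thumbs_up)` and `thumbs_down_button.on_click(state.thumbs_down)`, and `state.value` would take the place of `human_feedback`.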
# add the human feedback to a particular app and record
tru.add_feedback(
    name="Human Feedback",
    record_id=record.record_id,
    app_id=tru_app.app_id,
    result=human_feedback
)
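Because `human_feedback` starts as `None`, running the cell above before clicking a button would log an empty result. A minimal sketch of a guard, assuming votes are always 0 or 1 as in this example; `require_feedback` is a hypothetical helper, not a TruLens function.

```python
def require_feedback(value):
    """Return the vote if it is a valid 0/1 score, otherwise raise ValueError."""
    if value is None:
        raise ValueError("No human feedback recorded yet; click a button first.")
    if value not in (0, 1):
        raise ValueError(f"Expected 0 or 1, got {value!r}")
    return value


print(require_feedback(1))  # 1
```

It could then be used as `result=require_feedback(human_feedback)` in the `tru.add_feedback` call.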
See the result logged with your app.¶
tru.get_leaderboard(app_ids=[tru_app.app_id])
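The leaderboard summarizes feedback per app. Assuming it reports the mean of the logged results (an assumption about the aggregation, not something this example states), a series of 1/0 votes reduces to an approval rate. A rough sketch in plain Python, where `approval_rate` is a hypothetical helper and not part of TruLens:

```python
def approval_rate(votes):
    """Mean of 1/0 votes, skipping records with no vote (None)."""
    recorded = [v for v in votes if v is not None]
    if not recorded:
        return None  # no human feedback logged yet
    return sum(recorded) / len(recorded)


# Three thumbs up, one thumbs down, one record without a vote
print(approval_rate([1, 1, 0, None, 1]))  # 0.75
```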