Feedback functions in NeMo Guardrails apps¶
This notebook demonstrates how to use feedback functions from within rails apps.
The integration in the other direction, monitoring rails apps using trulens, is
shown in the nemoguardrails_trurails_example.ipynb
notebook.
We feature two examples of how to integrate feedback in rails apps. This
notebook goes over the more complex but ultimately more concise of the two. The
simpler example is shown in nemoguardrails_custom_action_feedback_example.ipynb.
# Install NeMo Guardrails if not already installed.
# !pip install trulens trulens-apps-nemo trulens-providers-openai trulens-providers-huggingface nemoguardrails
Setup keys and trulens¶
# This notebook uses openai and huggingface providers which need some keys set.
# You can set them here:
from trulens.core import TruSession
from trulens.core.utils.keys import check_or_set_keys
check_or_set_keys(OPENAI_API_KEY="to fill in", HUGGINGFACE_API_KEY="to fill in")
# Load trulens, reset the database:
session = TruSession()
session.reset_database()
Feedback functions setup¶
Let's consider some feedback functions. We will define two types. The first is a
simple language match that checks whether the output of the app is in the same
language as the input. The second is a set of three functions for evaluating
context retrieval. The setup for these is similar to that for other app types
such as langchain, except that we use the utility rag_triad to create the three
context retrieval functions instead of creating them separately.
from pprint import pprint
from trulens.core import Feedback
from trulens.feedback.feedback import rag_triad
from trulens.providers.huggingface import Huggingface
from trulens.providers.openai import OpenAI
# Initialize provider classes
openai = OpenAI()
hugs = Huggingface()
# Note that we do not specify the selectors (where the inputs to the feedback
# functions come from):
f_language_match = Feedback(hugs.language_match)
fs_triad = rag_triad(provider=openai)
# Overview of the 4 feedback functions defined.
pprint(f_language_match)
pprint(fs_triad)
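To make the shape of these objects concrete, the following is a minimal, library-free sketch of what a feedback function amounts to: a callable that maps named text arguments to a score between 0 and 1. The names toy_language_match, toy_overlap and toy_triad are hypothetical stand-ins, not trulens APIs, and the scoring heuristics are crude placeholders for the model-based checks the real providers perform.

```python
from typing import Callable, Dict


def toy_language_match(text1: str, text2: str) -> float:
    """Crude placeholder: 1.0 if both texts are ASCII-only or both are not."""
    return 1.0 if text1.isascii() == text2.isascii() else 0.0


def toy_overlap(a: str, b: str) -> float:
    """Fraction of the words in `a` that also appear in `b` (placeholder metric)."""
    words_a = set(a.lower().split())
    words_b = set(b.lower().split())
    return len(words_a & words_b) / len(words_a) if words_a else 0.0


# A "triad" here is just a mapping from a name to a scoring function.
# The argument names mirror the roles each check compares (assumed for illustration).
toy_triad: Dict[str, Callable[..., float]] = {
    "groundedness": lambda source, statement: toy_overlap(statement, source),
    "relevance": lambda prompt, response: toy_overlap(prompt, response),
    "context_relevance": lambda question, context: toy_overlap(question, context),
}

# An English question answered in Spanish fails the toy language check.
score = toy_language_match("What does trulens do?", "¿Qué hace trulens?")
```

Note that none of these toy functions knows where its inputs come from, which mirrors the comment above about leaving the selectors unspecified at this stage.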
Feedback functions registration¶
To make feedback functions available to rails apps, we first need to register them with the FeedbackActions class.
from trulens.apps.nemo.tru_rails import FeedbackActions
FeedbackActions.register_feedback_functions(**fs_triad)
FeedbackActions.register_feedback_functions(f_language_match)
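The two call styles above (unpacking a dict with ** and passing a single object positionally) suggest a registry keyed by name. The sketch below shows that pattern in isolation. ToyFeedbackRegistry and its register method are hypothetical and are not the trulens implementation; they only illustrate how both call styles can feed one name-to-function table that an action can later look up by the function="..." argument.

```python
from typing import Callable, Dict


class ToyFeedbackRegistry:
    """Hypothetical stand-in for a class-level registry of feedback functions."""

    registered: Dict[str, Callable[..., float]] = {}

    @classmethod
    def register(
        cls,
        *functions: Callable[..., float],
        **named: Callable[..., float],
    ) -> None:
        # Positional functions are stored under their own __name__ ...
        for fn in functions:
            cls.registered[fn.__name__] = fn
        # ... while keyword arguments supply the name explicitly.
        cls.registered.update(named)


def language_match(text1: str, text2: str) -> float:
    return 1.0 if text1.isascii() == text2.isascii() else 0.0


def relevance(prompt: str, response: str) -> float:
    return 1.0 if prompt and response else 0.0


# Positional registration, analogous to passing f_language_match directly.
ToyFeedbackRegistry.register(language_match)
# Dict unpacking, analogous to passing **fs_triad.
ToyFeedbackRegistry.register(**{"relevance": relevance})

names = sorted(ToyFeedbackRegistry.registered)
```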
Rails app setup¶
The files created below define a configuration of a rails app adapted from various examples in the NeMo-Guardrails repository. There is nothing unusual about the app beyond the knowledge base here being the TruLens documentation. This means you should be able to ask the resulting bot questions regarding trulens instead of the fictional company handbook as was the case in the originating example.
Note the new additions to the output rail flows in the configuration below. These are set up to run our feedback functions, but their definitions come in the colang file that follows.
from trulens.dashboard.notebook_utils import writefileinterpolated
%%writefileinterpolated config.yaml
# Adapted from NeMo-Guardrails/nemoguardrails/examples/bots/abc/config.yml
instructions:
  - type: general
    content: |
      Below is a conversation between a user and a bot called the trulens Bot.
      The bot is designed to answer questions about the trulens python library.
      The bot is knowledgeable about python.
      If the bot does not know the answer to a question, it truthfully says it does not know.
sample_conversation: |
  user "Hi there. Can you help me with some questions I have about trulens?"
    express greeting and ask for assistance
  bot express greeting and confirm and offer assistance
    "Hi there! I'm here to help answer any questions you may have about the trulens. What would you like to know?"
models:
  - type: main
    engine: openai
    model: gpt-3.5-turbo-instruct
rails:
  output:
    flows:
      - check language match
      # triad defined separately so hopefully they can be executed in parallel
      - check rag triad groundedness
      - check rag triad relevance
      - check rag triad context_relevance
Output flows with feedback¶
Next we define output flows that include checks using all 4 feedback functions we registered above. We need to tell the feedback action where the feedback function arguments come from. The selectors for those can be specified manually or by way of the utility container RailsActionSelect. The data structure from which selectors pick our feedback inputs contains all of the arguments of NeMo Guardrails custom action methods:
async def feedback(
    events: Optional[List[Dict]] = None,
    context: Optional[Dict] = None,
    llm: Optional[BaseLanguageModel] = None,
    config: Optional[RailsConfig] = None,
    ...
)
    ...
    source_data = dict(
        action=dict(
            events=events,
            context=context,
            llm=llm,
            config=config
        )
    )
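As a rough illustration of how a selector picks a value out of this structure, the sketch below walks a dotted path through nested dicts. The select helper is hypothetical, and so are the sample values; the context keys last_user_message and bot_message are assumed here for illustration, and the real structure is populated by NeMo Guardrails at run time.

```python
from typing import Any, Dict


def select(source: Dict[str, Any], path: str) -> Any:
    """Follow a dotted path such as 'action.context.bot_message' through nested dicts."""
    value: Any = source
    for key in path.split("."):
        value = value[key]
    return value


# Hypothetical contents standing in for what the action receives at run time.
source_data = {
    "action": {
        "events": [],
        "context": {
            "last_user_message": "What does trulens do?",
            "bot_message": "It evaluates LLM apps.",
        },
        "llm": None,
        "config": None,
    }
}

question = select(source_data, "action.context.last_user_message")
answer = select(source_data, "action.context.bot_message")
```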
from trulens.apps.nemo import RailsActionSelect
# Will need to refer to these selectors/lenses to define triad checks. We can
# use these shorthands to make things a bit easier. If you are writing
# non-temporary config files, you can print these lenses to help with the
# selectors:
question_lens = RailsActionSelect.LastUserMessage
answer_lens = RailsActionSelect.BotMessage # not LastBotMessage as the flow is evaluated before LastBotMessage is available
contexts_lens = RailsActionSelect.RetrievalContexts
# Inspect the values of the shorthands:
print(list(map(str, [question_lens, answer_lens, contexts_lens])))
Action invocation¶
We can now define output flows that evaluate feedback functions. These are the four "subflow"s in the colang below.
%%writefileinterpolated config.co
# Adapted from NeMo-Guardrails/tests/test_configs/with_kb_openai_embeddings/config.co
define user ask capabilities
    "What can you do?"
    "What can you help me with?"
    "tell me what you can do"
    "tell me about you"
define bot inform language mismatch
    "I may not be able to answer in your language."
define bot inform triad failure
    "I may have made a mistake interpreting your question or my knowledge base."
define flow
    user ask trulens
    bot inform trulens
define parallel subflow check language match
    $result = execute feedback(\
        function="language_match",\
        selectors={{\
            "text1":"{question_lens}",\
            "text2":"{answer_lens}"\
        }},\
        verbose=True\
    )
    if $result < 0.8
        bot inform language mismatch
        stop
define parallel subflow check rag triad groundedness
    $result = execute feedback(\
        function="groundedness_measure_with_cot_reasons",\
        selectors={{\
            "statement":"{answer_lens}",\
            "source":"{contexts_lens}"\
        }},\
        verbose=True\
    )
    if $result < 0.7
        bot inform triad failure
        stop
define parallel subflow check rag triad relevance
    # Answer relevance compares the question with the bot's answer.
    $result = execute feedback(\
        function="relevance",\
        selectors={{\
            "prompt":"{question_lens}",\
            "response":"{answer_lens}"\
        }},\
        verbose=True\
    )
    if $result < 0.7
        bot inform triad failure
        stop
define parallel subflow check rag triad context_relevance
    # Context relevance compares the question with the retrieved contexts.
    $result = execute feedback(\
        function="context_relevance",\
        selectors={{\
            "question":"{question_lens}",\
            "statement":"{contexts_lens}"\
        }},\
        verbose=True\
    )
    if $result < 0.7
        bot inform triad failure
        stop
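The doubled braces around the selectors above are there because the cell is interpolated before it is written to disk. Assuming the writefileinterpolated magic follows the same rules as Python's str.format (an assumption, not something confirmed here), the sketch below shows how {{ and }} become literal braces while single-brace fields such as {question_lens} are filled in. The lens strings are placeholder values; the real ones come from printing the RailsActionSelect shorthands.

```python
# Placeholder lens strings, used only to show the substitution.
question_lens = "<question lens>"
answer_lens = "<answer lens>"

template = (
    'selectors={{\n'
    '    "text1":"{question_lens}",\n'
    '    "text2":"{answer_lens}"\n'
    '}}'
)

# str.format turns {{ and }} into literal braces and fills in the named fields.
rendered = template.format(question_lens=question_lens, answer_lens=answer_lens)
print(rendered)
```

Under that assumption, a single brace left unescaped in the colang source would be read as a field name rather than as part of the selectors dictionary.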
Rails app instantiation¶
Instantiating the app does not differ from the steps presented in the NeMo Guardrails documentation.
from nemoguardrails import LLMRails
from nemoguardrails import RailsConfig
config = RailsConfig.from_path(".")
rails = LLMRails(config)
Feedback action registration¶
We need to register the method FeedbackActions.feedback_action as an action to be able to make use of it inside the flows we defined above.
rails.register_action(FeedbackActions.feedback_action)
Optional TruRails recorder instantiation¶
Though not required, we can also use a trulens recorder to monitor our app.
from trulens.apps.nemo import TruRails
tru_rails = TruRails(rails)
Language match test invocation¶
Let's try to make the app respond in a different language than the question to get the language match flow to abort the output. Note that the verbose flag in the feedback action we set up in the colang above makes it print out the inputs and output of the function.
# This may fail the language match:
with tru_rails as recorder:
    response = await rails.generate_async(
        messages=[
            {
                "role": "user",
                "content": "Please answer in Spanish: what does trulens do?",
            }
        ]
    )
print(response["content"])
# Note that the feedbacks involved in the flow are NOT record feedbacks hence
# not available in the usual place:
record = recorder.get()
print(record.feedback_results)
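The cell above relies on top-level await, which works in a notebook but not in a plain Python script. The sketch below shows the usual way to drive a coroutine from a script with asyncio.run. The function fake_generate_async is a hypothetical stand-in that only echoes its input; it is not part of NeMo Guardrails.

```python
import asyncio
from typing import Dict, List


async def fake_generate_async(messages: List[Dict[str, str]]) -> Dict[str, str]:
    """Hypothetical stand-in for an async generate call; echoes the last user message."""
    await asyncio.sleep(0)  # yield control once, as a real network call would
    return {"role": "assistant", "content": f"echo: {messages[-1]['content']}"}


async def main() -> Dict[str, str]:
    return await fake_generate_async(
        messages=[{"role": "user", "content": "what does trulens do?"}]
    )


# In a script there is no running event loop, so start one explicitly.
response = asyncio.run(main())
```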
# This should be ok, though the app sometimes answers in English, and the RAG
# triad may fail after the language match passes.
with tru_rails as recorder:
    response = rails.generate(
        messages=[
            {
                "role": "user",
                "content": "Por favor responda en español: ¿qué hace trulens?",
            }
        ]
    )
print(response["content"])
RAG triad Test¶
Let's check that all 3 RAG feedback functions run and hopefully pass. Note that the "stop" in their flow definitions means that if any one of them fails, no subsequent ones will be tested.
# Should invoke retrieval:
with tru_rails as recorder:
    response = rails.generate(
        messages=[
            {
                "role": "user",
                "content": "Does trulens support AzureOpenAI as a provider?",
            }
        ]
    )
print(response["content"])
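The effect of the thresholds and the "stop" statements can be pictured with the following plain-Python sketch. It is a sequential simplification under stated assumptions: the checks run in a fixed order, the scores are made-up constants rather than real feedback results, and run_output_checks is a hypothetical helper, not how NeMo Guardrails schedules its flows.

```python
from typing import Callable, List, Tuple

# (name, scorer, threshold, failure message)
Check = Tuple[str, Callable[[], float], float, str]

TRIAD_FAILURE = "I may have made a mistake interpreting your question or my knowledge base."


def run_output_checks(checks: List[Check], bot_message: str) -> Tuple[str, List[str]]:
    """Run checks in order; on the first score below its threshold, swap the output and stop."""
    executed: List[str] = []
    for name, scorer, threshold, failure_message in checks:
        executed.append(name)
        if scorer() < threshold:
            return failure_message, executed
    return bot_message, executed


# Made-up scores: the groundedness check falls below its 0.7 threshold.
checks: List[Check] = [
    ("language match", lambda: 0.95, 0.8, "I may not be able to answer in your language."),
    ("groundedness", lambda: 0.40, 0.7, TRIAD_FAILURE),
    ("relevance", lambda: 0.90, 0.7, TRIAD_FAILURE),
]

final_message, executed = run_output_checks(checks, "TruLens supports several providers.")
```

In this run the relevance check never executes, because the groundedness failure ends the loop first.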