🤗 Huggingface Provider

trulens_eval.feedback.provider.hugs.Huggingface

Bases: `Provider`

Out-of-the-box feedback functions calling Huggingface APIs.
Functions

__init__

Create a Huggingface Provider with out-of-the-box feedback functions.

Example

```python
from trulens_eval.feedback.provider.hugs import Huggingface

huggingface_provider = Huggingface()
```
language_match

Uses Huggingface's papluca/xlm-roberta-base-language-detection model. A function that runs language detection on `text1` and `text2` and calculates the difference in the detection probability of the language detected on `text1`. The score is: `1.0 - |probit_language_text1(text1) - probit_language_text1(text2)|`

Example

```python
from trulens_eval import Feedback
from trulens_eval.feedback.provider.hugs import Huggingface

huggingface_provider = Huggingface()

feedback = Feedback(huggingface_provider.language_match).on_input_output()
```

The `on_input_output()` selector can be changed. See the Feedback Function Guide.
PARAMETER | DESCRIPTION
---|---
`text1` | Text to evaluate. TYPE: `str`
`text2` | Comparative text to evaluate. TYPE: `str`

RETURNS | DESCRIPTION
---|---
`float` | A value between 0 and 1, 0 being "different languages" and 1 being "same language".
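The formula above can be sketched in a few lines (an illustrative sketch, not the provider's implementation; the `probs` dictionaries stand in for the language-detection model's per-language probabilities):

```python
def language_match_score(probs1: dict, probs2: dict) -> float:
    """Score two language-probability distributions per the formula above."""
    # Language detected on text1: the highest-probability language.
    lang = max(probs1, key=probs1.get)
    # 1.0 - |probit_language_text1(text1) - probit_language_text1(text2)|
    return 1.0 - abs(probs1[lang] - probs2.get(lang, 0.0))
```

Identical distributions score 1.0; fully disjoint detections score 0.0.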
groundedness_measure_with_nli

A measure of whether the source material supports each sentence in the statement, using an NLI model.

First the statement is split into sentences using a sentence tokenizer. The NLI model then evaluates each sentence against the entire source.

Example

```python
from trulens_eval.feedback import Feedback
from trulens_eval.feedback.provider.hugs import Huggingface

huggingface_provider = Huggingface()

f_groundedness = (
    Feedback(huggingface_provider.groundedness_measure_with_nli)
    .on(context)
    .on_output()
)
```
PARAMETER | DESCRIPTION
---|---
`source` | The source that should support the statement. TYPE: `str`
`statement` | The statement to check groundedness. TYPE: `str`

RETURNS | DESCRIPTION
---|---
`Tuple[float, dict]` | A tuple containing a value between 0.0 (not grounded) and 1.0 (grounded) and a dictionary containing the reasons for the evaluation.
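The splitting-and-scoring flow described above can be sketched as follows (a simplified illustration of the stated behavior; `nli_entailment_score` is a hypothetical stand-in for the NLI model call, and the naive period split stands in for a real sentence tokenizer):

```python
from typing import Callable, Dict, Tuple

def groundedness_sketch(
    source: str,
    statement: str,
    nli_entailment_score: Callable[..., float],
) -> Tuple[float, Dict[str, float]]:
    # Split the statement into sentences (naive split for illustration).
    sentences = [s.strip() for s in statement.split(".") if s.strip()]
    reasons: Dict[str, float] = {}
    for sent in sentences:
        # Each sentence is checked against the entire source.
        reasons[sent] = nli_entailment_score(premise=source, hypothesis=sent)
    # Aggregate per-sentence scores into one groundedness value.
    overall = sum(reasons.values()) / len(reasons) if reasons else 0.0
    return overall, reasons
```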
context_relevance

Uses Huggingface's truera/context_relevance model, which computes the relevance of a given context to the prompt. The model can be found at https://huggingface.co/truera/context_relevance.

Example

```python
import numpy as np

from trulens_eval import Feedback
from trulens_eval.feedback.provider.hugs import Huggingface

huggingface_provider = Huggingface()

feedback = (
    Feedback(huggingface_provider.context_relevance)
    .on_input()
    .on(context)
    .aggregate(np.mean)
)
```
PARAMETER | DESCRIPTION
---|---
`prompt` | The given prompt. TYPE: `str`
`context` | Comparative contextual information. TYPE: `str`

RETURNS | DESCRIPTION
---|---
`float` | A value between 0 and 1, 0 being irrelevant and 1 being a relevant context for addressing the prompt.
positive_sentiment

Uses Huggingface's cardiffnlp/twitter-roberta-base-sentiment model. A function that runs a sentiment classifier on `text`.

Example

```python
from trulens_eval import Feedback
from trulens_eval.feedback.provider.hugs import Huggingface

huggingface_provider = Huggingface()

feedback = Feedback(huggingface_provider.positive_sentiment).on_output()
```
PARAMETER | DESCRIPTION
---|---
`text` | Text to evaluate. TYPE: `str`

RETURNS | DESCRIPTION
---|---
`float` | A value between 0 (negative sentiment) and 1 (positive sentiment).
toxic

Uses Huggingface's martin-ha/toxic-comment-model model. A function that runs a toxic-comment classifier on `text`.

Example

```python
from trulens_eval import Feedback
from trulens_eval.feedback.provider.hugs import Huggingface

huggingface_provider = Huggingface()

feedback = Feedback(huggingface_provider.toxic).on_output()
```
PARAMETER | DESCRIPTION
---|---
`text` | Text to evaluate. TYPE: `str`

RETURNS | DESCRIPTION
---|---
`float` | A value between 0 (not toxic) and 1 (toxic).
pii_detection

NER model to detect PII.

Example

```python
from trulens_eval import Feedback
from trulens_eval.feedback.provider.hugs import Huggingface

hugs = Huggingface()

# Define a pii_detection feedback function using Huggingface.
f_pii_detection = Feedback(hugs.pii_detection).on_input()
```

The `on(...)` selector can be changed. See Feedback Function Guide: Selectors.
PARAMETER | DESCRIPTION
---|---
`text` | A text prompt that may contain PII. TYPE: `str`

RETURNS | DESCRIPTION
---|---
`float` | The likelihood that PII is contained in the input text.
pii_detection_with_cot_reasons

pii_detection_with_cot_reasons(text: str)

NER model to detect PII, with reasons.

Example

```python
from trulens_eval import Feedback
from trulens_eval.feedback.provider.hugs import Huggingface

hugs = Huggingface()

# Define a pii_detection feedback function using Huggingface.
f_pii_detection = Feedback(hugs.pii_detection_with_cot_reasons).on_input()
```

The `on(...)` selector can be changed. See Feedback Function Guide: Selectors.

PARAMETER | DESCRIPTION
---|---
`text` | A text prompt that may contain a name. TYPE: `str`

RETURNS | DESCRIPTION
---|---
`Tuple[float, str]` | A tuple containing the likelihood that PII is contained in the input text and a string describing what PII was detected (if any).
hallucination_evaluator

Evaluates the hallucination score for a combined input of two statements, returned as a float between 0 and 1 and interpreted as a boolean. If the score is greater than 0.5, the statement is evaluated as true; if it is less than 0.5, the statement is evaluated as a hallucination.

Example

```python
from trulens_eval.feedback.provider.hugs import Huggingface

huggingface_provider = Huggingface()

score = huggingface_provider.hallucination_evaluator(
    "The sky is blue. [SEP] Apples are red, the grass is green."
)
```
PARAMETER | DESCRIPTION
---|---
`model_output` | What the LLM returns based on the text chunks retrieved during RAG. TYPE: `str`
`retrieved_text_chunk` | The text chunks retrieved during RAG. TYPE: `str`

RETURNS | DESCRIPTION
---|---
`float` | Hallucination score.
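The thresholding rule above can be expressed as small helpers (an illustrative sketch, not part of the trulens_eval API; the argument order in the `[SEP]`-joined input is an assumption based on the example):

```python
def combine_for_evaluator(model_output: str, retrieved_text_chunk: str) -> str:
    # The two statements are joined with a [SEP] token, as in the example
    # above (argument order assumed).
    return f"{model_output} [SEP] {retrieved_text_chunk}"

def is_hallucination(score: float) -> bool:
    # Scores above 0.5 mean the statement is evaluated as true;
    # scores below 0.5 mean it is evaluated as a hallucination.
    return score < 0.5
```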