🤗 Huggingface Provider

trulens_eval.feedback.provider.hugs.Huggingface

Bases: Provider

Out of the box feedback functions calling Huggingface APIs.

Functions

__init__

__init__(
    name: Optional[str] = None,
    endpoint: Optional[Endpoint] = None,
    **kwargs
)

Create a Huggingface Provider with out of the box feedback functions.

Example

from trulens_eval.feedback.provider.hugs import Huggingface
huggingface_provider = Huggingface()

language_match

language_match(
    text1: str, text2: str
) -> Tuple[float, Dict]

Uses Huggingface's papluca/xlm-roberta-base-language-detection model. Runs language detection on text1 and text2 and computes the probit difference for the language detected on text1. The function is: 1.0 - |probit_language_text1(text1) - probit_language_text1(text2)|
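The scoring formula above can be sketched as follows. The probability dictionaries here are hypothetical stand-ins for the language-detection model's output, not real API responses:

```python
# Sketch of the language_match score, assuming hypothetical
# language-probability outputs from the detection model.

def language_match_score(probs1: dict, probs2: dict) -> float:
    """Score 1.0 when both texts get the same probability for the
    dominant language of text1, lower as the probabilities diverge."""
    # Language the model considers most likely for text1.
    lang = max(probs1, key=probs1.get)
    # Probit difference on that language across the two texts.
    return 1.0 - abs(probs1[lang] - probs2.get(lang, 0.0))

# Hypothetical detection outputs for two English sentences.
probs_text1 = {"en": 0.98, "fr": 0.02}
probs_text2 = {"en": 0.95, "fr": 0.05}
print(language_match_score(probs_text1, probs_text2))  # approximately 0.97
```

Two texts in the same language score near 1.0; texts in different languages score near 0.0 because the probability mass for text1's detected language is low in text2.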

Example

from trulens_eval import Feedback
from trulens_eval.feedback.provider.hugs import Huggingface
huggingface_provider = Huggingface()

feedback = Feedback(huggingface_provider.language_match).on_input_output() 

The on_input_output() selector can be changed. See Feedback Function Guide

PARAMETER DESCRIPTION
text1

Text to evaluate.

TYPE: str

text2

Comparative text to evaluate.

TYPE: str

RETURNS DESCRIPTION
float

A value between 0 and 1, where 0 means "different languages" and 1 means "same language".

TYPE: Tuple[float, Dict]

groundedness_measure_with_nli

groundedness_measure_with_nli(
    source: str, statement: str
) -> Tuple[float, dict]

A measure to track if the source material supports each sentence in the statement using an NLI model.

First the statement is split into sentences using a sentence tokenizer. Each sentence is then scored against the entire source using a natural language inference (NLI) model.
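The per-sentence aggregation can be sketched as follows. `mock_nli_entailment` is a hypothetical stand-in for the NLI model, and the simple period split stands in for the sentence tokenizer:

```python
# Sketch of per-sentence groundedness aggregation, assuming a
# hypothetical NLI entailment scorer in place of the real model.

def mock_nli_entailment(source: str, sentence: str) -> float:
    # Hypothetical: score 1.0 if every word of the sentence appears
    # in the source, else 0.0. A real NLI model returns a probability.
    words = sentence.lower().rstrip(".").split()
    return 1.0 if all(w in source.lower() for w in words) else 0.0

def groundedness(source: str, statement: str) -> float:
    # Split the statement into sentences, score each sentence against
    # the entire source, and average the per-sentence scores.
    sentences = [s.strip() for s in statement.split(".") if s.strip()]
    scores = [mock_nli_entailment(source, s + ".") for s in sentences]
    return sum(scores) / len(scores)

source = "the sky is blue and the grass is green"
print(groundedness(source, "The sky is blue. The grass is purple."))  # 0.5
```

One supported sentence and one unsupported sentence average to 0.5, which is why partially grounded statements land between the extremes.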

Example

from trulens_eval.feedback import Feedback
from trulens_eval.feedback.provider.hugs import Huggingface

huggingface_provider = Huggingface()

f_groundedness = (
    Feedback(huggingface_provider.groundedness_measure_with_nli)
    .on(context)
    .on_output()
)
PARAMETER DESCRIPTION
source

The source that should support the statement

TYPE: str

statement

The statement to check groundedness

TYPE: str

RETURNS DESCRIPTION
Tuple[float, dict]

A tuple containing a value between 0.0 (not grounded) and 1.0 (grounded), and a dict containing the reasons for the evaluation.

context_relevance

context_relevance(prompt: str, context: str) -> float

Uses Huggingface's truera/context_relevance model, which computes the relevance of a given context to the prompt. The model can be found at https://huggingface.co/truera/context_relevance.

Example

from trulens_eval import Feedback
from trulens_eval.feedback.provider.hugs import Huggingface
huggingface_provider = Huggingface()

feedback = (
    Feedback(huggingface_provider.context_relevance)
    .on_input()
    .on(context)
    .aggregate(np.mean)
    )
PARAMETER DESCRIPTION
prompt

The given prompt.

TYPE: str

context

Comparative contextual information.

TYPE: str

RETURNS DESCRIPTION
float

A value between 0 and 1. 0 being irrelevant and 1 being a relevant context for addressing the prompt.

TYPE: float

positive_sentiment

positive_sentiment(text: str) -> float

Uses Huggingface's cardiffnlp/twitter-roberta-base-sentiment model. A function that uses a sentiment classifier on text.

Example

from trulens_eval import Feedback
from trulens_eval.feedback.provider.hugs import Huggingface
huggingface_provider = Huggingface()

feedback = Feedback(huggingface_provider.positive_sentiment).on_output() 
PARAMETER DESCRIPTION
text

Text to evaluate.

TYPE: str

RETURNS DESCRIPTION
float

A value between 0 (negative sentiment) and 1 (positive sentiment).

TYPE: float

toxic

toxic(text: str) -> float

Uses Huggingface's martin-ha/toxic-comment-model model. A function that uses a toxic comment classifier on text.

Example

from trulens_eval import Feedback
from trulens_eval.feedback.provider.hugs import Huggingface
huggingface_provider = Huggingface()

feedback = Feedback(huggingface_provider.toxic).on_output() 
PARAMETER DESCRIPTION
text

Text to evaluate.

TYPE: str

RETURNS DESCRIPTION
float

A value between 0 (not toxic) and 1 (toxic).

TYPE: float

pii_detection

pii_detection(text: str) -> float

NER model to detect PII.

Example

from trulens_eval import Feedback
from trulens_eval.feedback.provider.hugs import Huggingface

hugs = Huggingface()

# Define a pii_detection feedback function using HuggingFace.
f_pii_detection = Feedback(hugs.pii_detection).on_input()

The on(...) selector can be changed. See Feedback Function Guide: Selectors

PARAMETER DESCRIPTION
text

A text prompt that may contain PII.

TYPE: str

RETURNS DESCRIPTION
float

The likelihood that PII is contained in the input text.

TYPE: float

pii_detection_with_cot_reasons

pii_detection_with_cot_reasons(text: str)

NER model to detect PII, with reasons.

Example

from trulens_eval import Feedback
from trulens_eval.feedback.provider.hugs import Huggingface

hugs = Huggingface()

# Define a pii_detection_with_cot_reasons feedback function using HuggingFace.
f_pii_detection = Feedback(hugs.pii_detection_with_cot_reasons).on_input()

The on(...) selector can be changed. See Feedback Function Guide: Selectors

PARAMETER DESCRIPTION
text

A text prompt that may contain a name.

TYPE: str

RETURNS DESCRIPTION
Tuple[float, str]

A tuple containing the likelihood that PII is contained in the input text and a string describing what PII was detected (if any).

hallucination_evaluator

hallucination_evaluator(
    model_output: str, retrieved_text_chunks: str
) -> float

Evaluates a hallucination score for the combined input of two statements as a float between 0 and 1, representing a true/false boolean. If the score is greater than 0.5, the statement is evaluated as true; if it is less than 0.5, the statement is evaluated as a hallucination.
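The thresholding described above can be sketched as follows; the scores passed in are hypothetical model outputs, not real API calls:

```python
# Sketch of interpreting the hallucination score as a boolean,
# using hypothetical scores in place of a live model call.

def is_grounded(score: float) -> bool:
    # A score above 0.5 means the model output is supported by the
    # retrieved chunks; below 0.5 it is flagged as a hallucination.
    return score > 0.5

print(is_grounded(0.73))  # True: treated as supported
print(is_grounded(0.21))  # False: treated as a hallucination
```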

Example

from trulens_eval.feedback.provider.hugs import Huggingface
huggingface_provider = Huggingface()

score = huggingface_provider.hallucination_evaluator("The sky is blue.", "Apples are red. The grass is green.")
PARAMETER DESCRIPTION
model_output

This is what an LLM returns based on the text chunks retrieved during RAG

TYPE: str

retrieved_text_chunks

These are the text chunks you have retrieved during RAG

TYPE: str

RETURNS DESCRIPTION
float

Hallucination score

TYPE: float