
Endpoint

trulens_eval.feedback.provider.endpoint.base

Attributes

DEFAULT_RPM module-attribute

DEFAULT_RPM = 60

Default requests per minute for endpoints.

Classes

EndpointCallback

Bases: SerialModel

Callbacks to be invoked after various API requests and track various metrics like token usage.

Attributes
endpoint class-attribute instance-attribute
endpoint: Endpoint = Field(exclude=True)

The endpoint owning this callback.

cost class-attribute instance-attribute
cost: Cost = Field(default_factory=Cost)

Costs tracked by this callback.

Functions
handle
handle(response: Any) -> None

Called after each request.

handle_chunk
handle_chunk(response: Any) -> None

Called after receiving a chunk from a request.

handle_generation
handle_generation(response: Any) -> None

Called after each completion request.

handle_generation_chunk
handle_generation_chunk(response: Any) -> None

Called after receiving a chunk from a completion request.

handle_classification
handle_classification(response: Any) -> None

Called after each classification response.

Endpoint

Bases: WithClassInfo, SerialModel, SingletonPerName

API usage, pacing, and utilities for API endpoints.

Attributes
instrumented_methods class-attribute
instrumented_methods: Dict[Any, List[Tuple[Callable, Callable, Type[Endpoint]]]] = defaultdict(list)

Mapping of class/module methods that have been instrumented for cost tracking, along with the wrapper methods and the class that instrumented them.

Key is the class or module owning the instrumented method. Tuple value has:

  • original function,

  • wrapped version,

  • endpoint that did the wrapping.
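The shape of this mapping can be sketched as follows (the functions and class names here are hypothetical, for illustration only):

```python
from collections import defaultdict

# Illustrative sketch of the instrumented_methods mapping shape.
instrumented_methods = defaultdict(list)

def original_completion(prompt): ...

def wrapped_completion(prompt): ...

class SomeEndpoint: ...

# Key: the class (or module) owning the instrumented method.
# Value: (original function, wrapped version, endpoint class that wrapped it).
instrumented_methods[SomeEndpoint].append(
    (original_completion, wrapped_completion, SomeEndpoint)
)

entry = instrumented_methods[SomeEndpoint][0]
```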

name instance-attribute
name: str

API/endpoint name.

rpm class-attribute instance-attribute
rpm: float = DEFAULT_RPM

Requests per minute.

retries class-attribute instance-attribute
retries: int = 3

Retries (if performing requests using this class).

post_headers class-attribute instance-attribute
post_headers: Dict[str, str] = Field(default_factory=dict, exclude=True)

Optional post headers for post requests if done by this class.

pace class-attribute instance-attribute
pace: Pace = Field(default_factory=lambda: Pace(marks_per_second=DEFAULT_RPM / 60.0, seconds_per_period=60.0), exclude=True)

Pacing instance to maintain a desired rpm.

global_callback class-attribute instance-attribute
global_callback: EndpointCallback = Field(exclude=True)

Callback that tracks costs of requests not made within a track_cost invocation.

Also note that Endpoints are singletons (one for each unique name argument); hence this global callback will track all requests for the named API even if you try to create multiple endpoints with the same name.
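A minimal sketch of the singleton-per-name behavior described above (not the actual SingletonPerName implementation):

```python
class SingletonPerName:
    """Return the same instance for the same (class, name) pair."""

    _instances: dict = {}

    def __new__(cls, name: str, *args, **kwargs):
        key = (cls, name)
        if key not in SingletonPerName._instances:
            SingletonPerName._instances[key] = super().__new__(cls)
        return SingletonPerName._instances[key]

class FakeEndpoint(SingletonPerName):
    """Stand-in endpoint class for demonstration."""
    def __init__(self, name: str):
        self.name = name

a = FakeEndpoint("openai")
b = FakeEndpoint("openai")  # same name -> same instance
c = FakeEndpoint("other")   # different name -> different instance
```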

callback_class class-attribute instance-attribute
callback_class: Type[EndpointCallback] = Field(exclude=True)

Callback class to use for usage tracking.

callback_name class-attribute instance-attribute
callback_name: str = Field(exclude=True)

Name of variable that stores the callback noted above.

Classes
EndpointSetup dataclass

Class for storing supported endpoint information.

See track_all_costs for usage.

Functions
pace_me
pace_me() -> float

Block until we can make a request to this endpoint to keep pace with maximum rpm. Returns time in seconds since last call to this method returned.
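The default pace settings above imply 60 requests over a 60-second window. A simplified sliding-window sketch of this kind of pacing (not the actual Pace implementation) might look like:

```python
import time
from collections import deque

class SimplePace:
    """Sliding-window pacer: allow at most marks_per_second * seconds_per_period
    marks within any window of seconds_per_period seconds."""

    def __init__(self, marks_per_second: float, seconds_per_period: float):
        self.max_marks = int(marks_per_second * seconds_per_period)
        self.period = seconds_per_period
        self.marks: deque = deque()

    def mark(self) -> float:
        """Block until a mark is allowed; return seconds since the last mark."""
        now = time.time()
        # Drop marks that have fallen outside the window.
        while self.marks and now - self.marks[0] > self.period:
            self.marks.popleft()
        if len(self.marks) >= self.max_marks:
            wait = self.period - (now - self.marks[0])
            if wait > 0:
                time.sleep(wait)
            now = time.time()
        since_last = now - self.marks[-1] if self.marks else 0.0
        self.marks.append(now)
        return since_last

# 600 rpm -> 10 marks per second over a 60-second window.
pace = SimplePace(marks_per_second=10.0, seconds_per_period=60.0)
intervals = [pace.mark() for _ in range(3)]
```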

run_in_pace
run_in_pace(func: Callable[[A], B], *args, **kwargs) -> B

Run the given func on the given args and kwargs at pace with the endpoint-specified rpm. Failures will be retried self.retries times.
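The retry behavior can be sketched as a simple loop (simplified; the real method also paces each attempt via pace_me):

```python
def run_with_retries(func, *args, retries: int = 3, **kwargs):
    """Try func up to retries + 1 times, re-raising the last error."""
    last_err = None
    for _attempt in range(retries + 1):
        try:
            return func(*args, **kwargs)
        except Exception as e:
            last_err = e
    raise last_err

calls = []

def flaky():
    """Fails twice, then succeeds."""
    calls.append(1)
    if len(calls) < 3:
        raise RuntimeError("transient")
    return "ok"

result = run_with_retries(flaky)
```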

run_me
run_me(thunk: Thunk[T]) -> T

Run the given thunk, returning its output, on pace with the api. Retries the request multiple times if self.retries > 0.

DEPRECATED: Use run_in_pace instead.

print_instrumented classmethod
print_instrumented()

Print out all of the methods that have been instrumented for cost tracking. This is organized by the classes/modules containing them.

track_all_costs staticmethod
track_all_costs(__func: CallableMaybeAwaitable[A, T], *args, with_openai: bool = True, with_hugs: bool = True, with_litellm: bool = True, with_bedrock: bool = True, **kwargs) -> Tuple[T, Sequence[EndpointCallback]]

Track costs of all of the APIs we can currently track, over the execution of the given function.

track_all_costs_tally staticmethod
track_all_costs_tally(__func: CallableMaybeAwaitable[A, T], *args, with_openai: bool = True, with_hugs: bool = True, with_litellm: bool = True, with_bedrock: bool = True, **kwargs) -> Tuple[T, Cost]

Track costs of all of the APIs we can currently track, over the execution of the given function. Returns the result alongside the tallied total Cost.

track_cost
track_cost(__func: CallableMaybeAwaitable[T], *args, **kwargs) -> Tuple[T, EndpointCallback]

Tally only the usage performed within the execution of the given thunk. Returns the thunk's result alongside the EndpointCallback object that includes the usage information.
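The return-a-result-with-its-callback pattern can be illustrated with a self-contained sketch (the Cost and EndpointCallback stand-ins here are simplified, not the real classes):

```python
from typing import Any, Callable, Tuple

class Cost:
    """Simplified stand-in for the real Cost model."""
    def __init__(self):
        self.n_requests = 0

class EndpointCallback:
    """Simplified stand-in: counts handled responses."""
    def __init__(self):
        self.cost = Cost()

    def handle(self, response: Any) -> None:
        self.cost.n_requests += 1

def track_cost(func: Callable[..., Any], *args, **kwargs) -> Tuple[Any, EndpointCallback]:
    """Run func and return its result alongside a callback holding usage info.
    In the real class, instrumented methods invoke the callback internally."""
    cb = EndpointCallback()
    result = func(*args, **kwargs)
    cb.handle(result)
    return result, cb

result, cb = track_cost(lambda x: x * 2, 21)
```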

handle_wrapped_call
handle_wrapped_call(func: Callable, bindings: BoundArguments, response: Any, callback: Optional[EndpointCallback]) -> None

This gets called with the results of every instrumented method. This should be implemented by each subclass.

PARAMETER DESCRIPTION
func

the wrapped method.

TYPE: Callable

bindings

the inputs to the wrapped method.

TYPE: BoundArguments

response

whatever the wrapped function returned.

TYPE: Any

callback

the callback set up by track_cost if the wrapped method was called and returned within an invocation of track_cost.

TYPE: Optional[EndpointCallback]

wrap_function
wrap_function(func)

Create a wrapper of the given function to perform cost tracking.
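A hedged sketch of the wrapping idea (not the actual implementation): every return value of the wrapped function is routed to a handler, analogous to how responses reach handle_wrapped_call for cost tracking.

```python
import functools

def wrap_function(func, on_response):
    """Wrap func so each return value is passed to on_response.
    A real endpoint would update its Cost in the handler."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        response = func(*args, **kwargs)
        on_response(response)  # route the response to the tracking handler
        return response

    return wrapper

responses = []
add = wrap_function(lambda a, b: a + b, responses.append)
result = add(1, 2)
```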

DummyEndpoint

Bases: Endpoint

Endpoint for testing purposes.

Does not make any network calls and just pretends to.

Attributes
loading_prob instance-attribute
loading_prob: float

How often to produce the "model loading" response that huggingface api sometimes produces.

loading_time class-attribute instance-attribute
loading_time: Callable[[], float] = Field(exclude=True, default_factory=lambda: lambda: uniform(0.73, 3.7))

How much time to indicate as needed to load the model in the above response.

error_prob instance-attribute
error_prob: float

How often to produce an error response.

freeze_prob instance-attribute
freeze_prob: float

How often to freeze instead of producing a response.

overloaded_prob instance-attribute
overloaded_prob: float

How often to produce the overloaded message that huggingface sometimes produces.

alloc instance-attribute
alloc: int

How much data in bytes to allocate when making requests.

delay class-attribute instance-attribute
delay: float = 0.0

How long to delay each request.

Functions
handle_wrapped_call
handle_wrapped_call(func: Callable, bindings: BoundArguments, response: Any, callback: Optional[EndpointCallback]) -> None

Dummy handler does nothing.

post
post(url: str, payload: JSON, timeout: Optional[float] = None) -> Any

Pretend to make a classification request similar to huggingface API.

Simulates overloaded, model loading, frozen, error as configured:

requests.post(
    url, json=payload, timeout=timeout, headers=self.post_headers
)
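The simulated outcome selection can be sketched as follows (probabilities and payload shapes here are illustrative, not the actual DummyEndpoint logic):

```python
import random

def dummy_post(loading_prob=0.1, error_prob=0.1, overloaded_prob=0.1):
    """Pick one simulated outcome per call, the way a dummy endpoint might."""
    r = random.random()
    if r < loading_prob:
        # Mimic the huggingface "model loading" response.
        return {"error": "Model is currently loading",
                "estimated_time": random.uniform(0.73, 3.7)}
    if r < loading_prob + error_prob:
        return {"error": "simulated error"}
    if r < loading_prob + error_prob + overloaded_prob:
        return {"error": "overloaded"}
    # Normal classification-style response.
    return [{"label": "dummy", "score": 1.0}]

random.seed(0)
resp = dummy_post()
```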

Functions