Endpoint¶
trulens_eval.feedback.provider.endpoint.base
¶
Attributes¶
Classes¶
EndpointCallback
¶
Bases: SerialModel
Callbacks invoked after various API requests to track metrics such as token usage.
Attributes¶
endpoint
class-attribute
instance-attribute
¶
endpoint: Endpoint = Field(exclude=True)
The endpoint owning this callback.
cost
class-attribute
instance-attribute
¶
Costs tracked by this callback.
Functions¶
Endpoint
¶
Bases: WithClassInfo, SerialModel, SingletonPerName
API usage, pacing, and utilities for API endpoints.
Attributes¶
instrumented_methods
class-attribute
¶
instrumented_methods: Dict[Any, List[Tuple[Callable, Callable, Type[Endpoint]]]] = defaultdict(list)
Mapping of class/module methods that have been instrumented for cost tracking, along with the wrapper methods and the class that instrumented them.
Key is the class or module owning the instrumented method. Tuple value has:
- original function,
- wrapped version,
- endpoint that did the wrapping.
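The shape of this mapping can be sketched in plain Python. This is a standalone illustration rather than the library's code: `registry`, `register`, `FakeClient`, and `FakeEndpoint` are hypothetical names chosen for the example.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List, Tuple, Type


class FakeEndpoint:
    """Hypothetical stand-in for an Endpoint subclass."""


class FakeClient:
    """Hypothetical class whose method gets instrumented."""

    def complete(self, prompt: str) -> str:
        return prompt.upper()


# Same shape as `instrumented_methods`: owner -> list of
# (original function, wrapped version, endpoint class that wrapped it).
registry: Dict[Any, List[Tuple[Callable, Callable, Type]]] = defaultdict(list)


def register(owner: Any, name: str, endpoint_cls: Type) -> None:
    """Wrap `owner.name` and record the (original, wrapper, endpoint) triple."""
    original = getattr(owner, name)

    def wrapper(*args, **kwargs):
        # A real wrapper would do cost tracking here; this one only forwards.
        return original(*args, **kwargs)

    setattr(owner, name, wrapper)
    registry[owner].append((original, wrapper, endpoint_cls))


register(FakeClient, "complete", FakeEndpoint)
```

Keeping the original function in the tuple makes it possible to tell which methods were replaced and by which endpoint class.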
retries
class-attribute
instance-attribute
¶
retries: int = 3
Number of retries (if performing requests using this class).
post_headers
class-attribute
instance-attribute
¶
Optional post headers for post requests if done by this class.
pace
class-attribute
instance-attribute
¶
pace: Pace = Field(default_factory=lambda: Pace(marks_per_second=DEFAULT_RPM / 60.0, seconds_per_period=60.0), exclude=True)
Pacing instance used to maintain a desired rate in requests per minute (rpm).
global_callback
class-attribute
instance-attribute
¶
global_callback: EndpointCallback = Field(exclude=True)
Tracks the costs of requests made outside of "track_cost".
Endpoints are singletons (one for each unique name argument), so this global callback tracks all requests for the named API even if you try to create multiple endpoints with the same name.
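The one-instance-per-name behaviour can be illustrated with a small sketch. It is not how SingletonPerName is implemented; `NamedSingleton` and its `requests` counter are hypothetical and only show why a shared tally sees every request made under the same name.

```python
from typing import Dict, Tuple


class NamedSingleton:
    """Hypothetical base: one instance per (class, name) pair."""

    _instances: Dict[Tuple[type, str], "NamedSingleton"] = {}

    def __new__(cls, name: str):
        key = (cls, name)
        if key not in cls._instances:
            instance = super().__new__(cls)
            instance.requests = 0  # shared tally, initialized only once
            cls._instances[key] = instance
        return cls._instances[key]

    def __init__(self, name: str) -> None:
        self.name = name


a = NamedSingleton("openai")
b = NamedSingleton("openai")       # same object as `a`
c = NamedSingleton("huggingface")  # different name, different object
a.requests += 1
b.requests += 1                    # increments the same counter as `a`
```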
callback_class
class-attribute
instance-attribute
¶
callback_class: Type[EndpointCallback] = Field(exclude=True)
Callback class to use for usage tracking.
callback_name
class-attribute
instance-attribute
¶
callback_name: str = Field(exclude=True)
Name of variable that stores the callback noted above.
Classes¶
EndpointSetup
dataclass
¶
Functions¶
pace_me
¶
pace_me() -> float
Block until a request can be made to this endpoint without exceeding its maximum rpm. Returns the time in seconds since the last call to this method returned.
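A minimal sketch of this blocking behaviour, assuming simple even spacing between calls. The library's Pace class may use a different scheme (its constructor takes marks_per_second and seconds_per_period); `SimplePacer` is a hypothetical name used only here.

```python
import time


class SimplePacer:
    """Hypothetical pacer: at most `rpm` calls per minute, evenly spaced."""

    def __init__(self, rpm: float) -> None:
        self.min_interval = 60.0 / rpm  # seconds between permitted calls
        self.last_return = time.monotonic()
        self.next_allowed = self.last_return

    def pace_me(self) -> float:
        """Block until a call is permitted; return seconds since the last return."""
        now = time.monotonic()
        if now < self.next_allowed:
            time.sleep(self.next_allowed - now)
        now = time.monotonic()
        elapsed = now - self.last_return
        self.last_return = now
        self.next_allowed = now + self.min_interval
        return elapsed


pacer = SimplePacer(rpm=1200)  # one call every 0.05 s
pacer.pace_me()                # first call does not need to wait
waited = pacer.pace_me()       # second call blocks for roughly 0.05 s
```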
run_in_pace
¶
run_in_pace(func: Callable[[A], B], *args, **kwargs) -> B
Run the given func on the given args and kwargs at pace with the endpoint-specified rpm. Failures will be retried self.retries times.
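The retry behaviour can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: `run_with_retries` is a hypothetical helper, the `delay` sleep stands in for waiting on the pacer, and the sketch assumes one initial attempt plus `retries` further attempts, which the text above does not specify.

```python
import time
from typing import Callable, TypeVar

B = TypeVar("B")


def run_with_retries(
    func: Callable[..., B], *args, retries: int = 3, delay: float = 0.0, **kwargs
) -> B:
    """Call func(*args, **kwargs), retrying up to `retries` times on failure."""
    last_error = None
    for _ in range(retries + 1):  # one initial try plus `retries` retries
        try:
            return func(*args, **kwargs)
        except Exception as err:  # broad on purpose for the sketch
            last_error = err
            time.sleep(delay)  # stand-in for waiting on the pacer
    raise RuntimeError(f"failed after {retries + 1} attempts") from last_error


calls = {"count": 0}


def flaky(x: int) -> int:
    """Fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ValueError("transient failure")
    return x * 2


result = run_with_retries(flaky, 21, retries=3)
```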
run_me
¶
run_me(thunk: Thunk[T]) -> T
DEPRECATED: Use run_in_pace instead.
Run the given thunk, returning its output, on pace with the API. Retries the request multiple times if self.retries > 0.
print_instrumented
classmethod
¶
print_instrumented()
Print out all of the methods that have been instrumented for cost tracking. This is organized by the classes/modules containing them.
track_all_costs
staticmethod
¶
track_all_costs(__func: CallableMaybeAwaitable[A, T], *args, with_openai: bool = True, with_hugs: bool = True, with_litellm: bool = True, with_bedrock: bool = True, **kwargs) -> Tuple[T, Sequence[EndpointCallback]]
Track the costs of all APIs that can currently be tracked over the execution of the given function. Returns the function's result alongside a sequence of EndpointCallback objects.
track_all_costs_tally
staticmethod
¶
track_all_costs_tally(__func: CallableMaybeAwaitable[A, T], *args, with_openai: bool = True, with_hugs: bool = True, with_litellm: bool = True, with_bedrock: bool = True, **kwargs) -> Tuple[T, Cost]
Track the costs of all APIs that can currently be tracked over the execution of the given function. Returns the function's result alongside a single Cost tally.
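The two return types suggest that the tally variant folds several per-endpoint costs into one. A minimal sketch of such a fold, assuming a cost record that supports addition; `SimpleCost` and its fields are hypothetical and are not the library's Cost class.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class SimpleCost:
    """Hypothetical cost record with additive fields."""

    n_requests: int = 0
    n_tokens: int = 0
    cost: float = 0.0

    def __add__(self, other: "SimpleCost") -> "SimpleCost":
        return SimpleCost(
            self.n_requests + other.n_requests,
            self.n_tokens + other.n_tokens,
            self.cost + other.cost,
        )


def tally(costs: Sequence[SimpleCost]) -> SimpleCost:
    """Fold per-endpoint costs into one total, starting from an empty cost."""
    return sum(costs, SimpleCost())


per_endpoint = [SimpleCost(2, 300, 0.25), SimpleCost(1, 50, 0.5)]
total = tally(per_endpoint)
```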
track_cost
¶
track_cost(__func: CallableMaybeAwaitable[T], *args, **kwargs) -> Tuple[T, EndpointCallback]
Tally only the usage performed within the execution of the given thunk. Returns the thunk's result alongside the EndpointCallback object that includes the usage information.
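The difference between scoped tracking and the global callback can be sketched with a toy endpoint. Everything here is hypothetical (`TinyEndpoint`, `UsageCallback`, `record`); it illustrates the idea of returning a result together with a callback scoped to one run and is not the library's mechanism.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple, TypeVar

T = TypeVar("T")


@dataclass
class UsageCallback:
    """Hypothetical callback that tallies requests and tokens."""

    n_requests: int = 0
    n_tokens: int = 0


class TinyEndpoint:
    """Hypothetical endpoint with a global callback plus scoped callbacks."""

    def __init__(self) -> None:
        self.global_callback = UsageCallback()
        self._active: List[UsageCallback] = []  # stack of open scopes

    def record(self, tokens: int) -> None:
        """Report one request to the global callback and every open scope."""
        for cb in [self.global_callback, *self._active]:
            cb.n_requests += 1
            cb.n_tokens += tokens

    def track_cost(
        self, func: Callable[..., T], *args, **kwargs
    ) -> Tuple[T, UsageCallback]:
        """Run func and return its result with a callback scoped to that run."""
        cb = UsageCallback()
        self._active.append(cb)
        try:
            result = func(*args, **kwargs)
        finally:
            self._active.pop()  # close the scope opened above
        return result, cb


endpoint = TinyEndpoint()
endpoint.record(tokens=5)  # outside any scope: only the global callback sees it


def fake_request(prompt: str) -> str:
    endpoint.record(tokens=len(prompt.split()))
    return prompt[::-1]


result, cb = endpoint.track_cost(fake_request, "hello brave new world")
```

The scoped callback sees only the request made inside `fake_request`, while the global callback sees both requests.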
handle_wrapped_call
¶
handle_wrapped_call(func: Callable, bindings: BoundArguments, response: Any, callback: Optional[EndpointCallback]) -> None
This gets called with the results of every instrumented method. This should be implemented by each subclass.
PARAMETER | DESCRIPTION |
---|---|
func | the wrapped method. TYPE: Callable |
bindings | the inputs to the wrapped method. TYPE: BoundArguments |
response | whatever the wrapped function returned. TYPE: Any |
callback | the callback set up by TYPE: Optional[EndpointCallback] |
wrap_function
¶
wrap_function(func)
Create a wrapper of the given function to perform cost tracking.
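How a wrapper might bind a call's inputs and hand them to a handler can be sketched with the standard library's `inspect` module. This is an assumption-laden illustration: the module-level `handle_wrapped_call` below omits the callback argument and simply records calls in the hypothetical `calls_seen` list, whereas a real handler would extract usage from the response.

```python
import functools
import inspect
from typing import Any, Callable, List, Tuple

calls_seen: List[Tuple[str, dict, Any]] = []


def handle_wrapped_call(
    func: Callable, bindings: inspect.BoundArguments, response: Any
) -> None:
    """Hypothetical handler: record the call instead of extracting usage."""
    calls_seen.append((func.__name__, dict(bindings.arguments), response))


def wrap_function(func: Callable) -> Callable:
    """Return a wrapper that reports each call of `func` to the handler."""
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bindings = sig.bind(*args, **kwargs)  # map inputs to parameter names
        response = func(*args, **kwargs)
        handle_wrapped_call(func, bindings, response)
        return response

    return wrapper


@wrap_function
def complete(prompt: str, max_tokens: int = 16) -> str:
    return prompt[:max_tokens]


output = complete("hello world", max_tokens=5)
```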
DummyEndpoint
¶
Bases: Endpoint
Endpoint for testing purposes.
Does not make any network calls and just pretends to.
Attributes¶
loading_prob
instance-attribute
¶
loading_prob: float
How often to produce the "model loading" response that the Hugging Face API sometimes produces.
loading_time
class-attribute
instance-attribute
¶
loading_time: Callable[[], float] = Field(exclude=True, default_factory=lambda: lambda: uniform(0.73, 3.7))
How much time to indicate as needed to load the model in the above response.
freeze_prob
instance-attribute
¶
freeze_prob: float
How often to freeze instead of producing a response.
overloaded_prob
instance-attribute
¶
overloaded_prob: float
How often to produce the overloaded message that Hugging Face sometimes produces.
Functions¶
handle_wrapped_call
¶
handle_wrapped_call(func: Callable, bindings: BoundArguments, response: Any, callback: Optional[EndpointCallback]) -> None
Dummy handler does nothing.
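The probability attributes above suggest a simulator that picks a canned response at random. A minimal sketch of that idea follows; `fake_response` and the response dictionaries are hypothetical and do not reproduce real Hugging Face payloads, and freezing (freeze_prob) is left out to keep the example fast.

```python
import random
from typing import Callable


def fake_response(
    rng: random.Random,
    loading_prob: float,
    overloaded_prob: float,
    loading_time: Callable[[], float],
) -> dict:
    """Hypothetical simulator: pick a canned response by probability."""
    r = rng.random()  # uniform in [0.0, 1.0)
    if r < loading_prob:
        return {"error": "model loading", "estimated_time": loading_time()}
    if r < loading_prob + overloaded_prob:
        return {"error": "overloaded"}
    return {"ok": True}


rng = random.Random(0)  # seeded so repeated runs match
responses = [
    fake_response(rng, 0.2, 0.1, lambda: rng.uniform(0.73, 3.7))
    for _ in range(1000)
]
n_loading = sum(1 for r in responses if r.get("error") == "model loading")
```

Passing the random generator in explicitly keeps the simulation reproducible in tests.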