Advanced topics related to self-hosting
This guide covers advanced topics related to self-hosting.
Data plane vs. Control plane
Braintrust's architecture has two main components: the data plane and the control plane. The data plane handles the actual data, while the control plane serves the UI along with metadata.
API vs. Full configuration
Braintrust offers two modes for self-hosting: API and Full. In API mode, you host the data plane (API) in your own environment, while the control plane (web app and metadata database) is hosted by Braintrust. In Full mode, you host both the data plane and the control plane in your own environment.
The primary difference between the two options is how much you host and maintain yourself. The API configuration lets you host the most sensitive data in your environment, behind an API which changes infrequently (you need to update roughly 1-2 times per month), while the Braintrust team hosts the web app and metadata database, which are updated multiple times per day. The Full configuration lets you host the web app and metadata database in your own environment as well, ensuring total isolation, but with additional maintenance overhead.
The following table shows where each kind of data is stored in each configuration:
Data | API | Full |
---|---|---|
Experiment records (input, output, expected, scores, metadata, traces, spans) | Your env | Your env |
Log records (input, output, expected, scores, metadata, traces, spans) | Your env | Your env |
Dataset records (input, output, metadata) | Your env | Your env |
Prompt playground prompts | Your env | Your env |
Prompt playground completions | Your env | Your env |
Human review scores | Your env | Your env |
Experiment and dataset names | Global | Your env |
Project names | Global | Your env |
Project settings | Global | Your env |
Git metadata about experiments | Global | Your env |
Organization info (name, settings) | Global | Your env |
Login info (name, email, avatar URL) | Global | Your env |
Auth credentials | AWS Cognito (SSO, passwords) | Your env (password) |
API keys (hashed) | Global | Your env |
LLM provider secrets (encrypted) | Global | Your env |
Customizing the webapp URL
The SDKs guide users to https://www.braintrustdata.com (or the BRAINTRUST_APP_URL variable) to view their experiments. However, in certain advanced configurations, you may want to reverse proxy traffic to the BRAINTRUST_APP_URL from the SDKs while pointing users to a different URL.
To do this, you can set the BRAINTRUST_APP_PUBLIC_URL environment variable to the URL of your webapp. By default, this variable is set to the value of BRAINTRUST_APP_URL, but you can customize it as you wish. This variable is only used to display information, so its destination does not even need to be accessible from the SDK.
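The fallback between the two variables can be sketched as a small helper. This is only an illustration of the documented default, not the SDKs' actual code: the function name resolve_app_urls and the example hostnames are hypothetical.

```python
import os
from typing import Mapping, Tuple

DEFAULT_APP_URL = "https://www.braintrustdata.com"


def resolve_app_urls(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (app_url, public_url) following the defaults described above.

    Hypothetical helper for illustration; the SDKs' internal logic may differ.
    """
    # URL the SDK sends requests to (this may be a reverse proxy).
    app_url = env.get("BRAINTRUST_APP_URL", DEFAULT_APP_URL)
    # URL shown to users; falls back to app_url when unset.
    public_url = env.get("BRAINTRUST_APP_PUBLIC_URL", app_url)
    return app_url, public_url


if __name__ == "__main__":
    # Example hostnames are placeholders, not real endpoints.
    example_env = {
        "BRAINTRUST_APP_URL": "https://proxy.internal.example.com",
        "BRAINTRUST_APP_PUBLIC_URL": "https://braintrust.example.com",
    }
    print(resolve_app_urls(example_env))
```

In this sketch, SDK traffic would go through the proxy while any printed links would point at the public URL.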
Constraining SDK to the data plane
If you're self-hosting the data plane, it may also be advantageous to constrain the SDKs to only communicate with your data plane. Normally, they communicate with the control plane to:
- Get your data plane's URL
- Register and retrieve metadata (e.g. about experiments)
- Print URLs to the webapp
The data plane can proxy the endpoints that the SDKs use to communicate with the control plane, allowing your SDKs to communicate only with the data plane. To do this, set the BRAINTRUST_APP_URL environment variable to the URL of your data plane and BRAINTRUST_APP_PUBLIC_URL to "https://www.braintrustdata.com" (or the URL of your webapp).
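As a minimal sketch, the two variables could be set from Python before the SDK reads its configuration. The data plane URL below and the helper name data_plane_only_env are hypothetical placeholders; only the two environment variable names come from this guide.

```python
import os

# Hypothetical data plane URL; replace with your own deployment's address.
DATA_PLANE_URL = "https://braintrust-api.internal.example.com"


def data_plane_only_env(
    data_plane_url: str,
    public_url: str = "https://www.braintrustdata.com",
) -> dict:
    """Build the two environment variables described above."""
    return {
        # SDK requests (including proxied control plane endpoints) go here.
        "BRAINTRUST_APP_URL": data_plane_url,
        # Only used when printing links for users.
        "BRAINTRUST_APP_PUBLIC_URL": public_url,
    }


# Apply the settings early, before the SDK reads its configuration.
os.environ.update(data_plane_only_env(DATA_PLANE_URL))
```

The same two variables can equally be exported in a shell profile or a deployment manifest; setting them in code is just one option.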
Allow-list URLs
In some cases, you may want to restrict the URLs that the SDKs or API server can communicate with. If so, you should include the following URLs:
Configuring rate limits
By default, the Braintrust API server imposes rate limits on requests to any external domains it reaches out to, such as the BRAINTRUST_APP_URL. The purpose of rate limiting is to prevent unintentionally overloading any external domains, which may block the API server IP in response.
By default, the rate limit is 100 requests per minute per user auth token. The API server exposes the following variables to configure the rate limits:
- OUTBOUND_RATE_LIMIT_MAX_REQUESTS: Configure the number of requests per time window. This can be set to 0 to disable rate limiting. In the braintrust CLI, this variable can be set with the --outbound-rate-limit-max-requests flag, or the OutboundRateLimitMaxRequests CloudFormation template parameter.
- OUTBOUND_RATE_LIMIT_WINDOW_MINUTES: Configure the time window in minutes before the rate limit resets. In the braintrust CLI, this variable can be set with the --outbound-rate-limit-window-minutes flag, or the OutboundRateLimitWindowMinutes CloudFormation template parameter.
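To make the interaction between the two settings concrete, here is a minimal fixed-window limiter keyed by auth token. It is an assumption about the semantics described above (a per-token counter that resets after each window, with 0 disabling the limit), not the API server's actual implementation; the class name and the injectable clock are hypothetical.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowRateLimiter:
    """Illustrative fixed-window limiter keyed by auth token.

    An assumption about the semantics, not the API server's implementation.
    """

    def __init__(
        self,
        max_requests: int = 100,    # mirrors OUTBOUND_RATE_LIMIT_MAX_REQUESTS
        window_minutes: float = 1,  # mirrors OUTBOUND_RATE_LIMIT_WINDOW_MINUTES
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_minutes * 60
        self.clock = clock
        # token -> (window start time, requests seen in that window)
        self._windows: Dict[str, Tuple[float, int]] = {}

    def allow(self, token: str) -> bool:
        # A limit of 0 disables rate limiting entirely.
        if self.max_requests == 0:
            return True
        now = self.clock()
        start, count = self._windows.get(token, (now, 0))
        # Start a fresh window once the previous one has elapsed.
        if now - start >= self.window_seconds:
            start, count = now, 0
        if count >= self.max_requests:
            self._windows[token] = (start, count)
            return False
        self._windows[token] = (start, count + 1)
        return True
```

With the defaults, each token would be allowed 100 outbound requests per one-minute window. Raising the maximum or lengthening the window changes the allowance accordingly.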