Commit 465441d

chore(nox): centralize dependency pins in pyproject.toml
Migrate all nox session dependency pins and package metadata into py/pyproject.toml as the single source of truth:

- Replace setup.py with pyproject.toml (setuptools backend)
- Move base test deps, session auxiliary deps, lint deps, build deps, and dev deps into PEP 735 dependency groups
- Move provider version matrices into [tool.braintrust.matrix] custom table (read by noxfile at import time)
- Generate and commit uv.lock for reproducible auxiliary dep resolution
- Pin all previously-floating 'latest' versions to explicit versions
- Remove requirements-dev.txt, requirements-build.txt, requirements-optional.txt, and the install-optional Makefile target
- Switch Makefile and CI from uv pip install to uv sync
- Bump uv to 0.11.6 in .tool-versions
- Add weekly dependency-updates.yml workflow that runs uv lock --upgrade and labels PRs based on whether provider SDKs changed
- Exclude uv.lock from codespell checks (false positive on astroid)
- Update AGENTS.md, CONTRIBUTING.md, and agent skill docs to reflect the new layout

1 parent c81ccaf commit 465441d

47 files changed: 7981 additions, 7620 deletions

.agents/skills/sdk-benchmarking/SKILL.md

Lines changed: 2 additions & 2 deletions
@@ -24,7 +24,7 @@ Read when relevant:

 - `py/benchmarks/benches/bench_bt_json.py` for the module pattern
 - `py/benchmarks/fixtures.py` for shared payload builders
-- `py/setup.py` when benchmarking the optional `orjson` fast path
+- `py/pyproject.toml` when benchmarking the optional `orjson` fast path (see `[project.optional-dependencies]`)
 - `references/benchmark-patterns.md` in this skill for command and module templates

 ## Workflow
@@ -56,7 +56,7 @@ If the benchmark should measure the optional `orjson` path, install the performa

 ```bash
 cd py
-python -m uv pip install -e '.[performance]'
+uv sync --extra performance
 ```

 ## Adding Benchmarks

.agents/skills/sdk-ci-triage/SKILL.md

Lines changed: 1 addition & 1 deletion
@@ -192,7 +192,7 @@ The smoke job validates install + import across OS and Python versions.
 Local equivalents:

 ```bash
-mise exec python@3.10 -- uv pip install -e ./py[all]
+mise exec python@3.10 -- uv sync --project ./py --all-extras
 mise exec python@3.10 -- uv run --active --no-project python -c 'import braintrust'
 ```

.agents/skills/sdk-vcr-workflows/SKILL.md

Lines changed: 2 additions & 1 deletion
@@ -14,6 +14,7 @@ This repo prefers real recorded integration coverage over mocks for provider beh
 Always read:

 - `AGENTS.md`
+- `py/pyproject.toml` (`[tool.braintrust.matrix]` for provider version pins)
 - `py/noxfile.py`
 - `py/src/braintrust/conftest.py`
 - `py/src/braintrust/integrations/conftest.py` (shared version-aware cassette directory resolution)
@@ -37,7 +38,7 @@ These rules must stay aligned with `AGENTS.md`:

 - Work from `py/` for SDK tasks.
 - Use `mise` as the source of truth for tools and environment.
-- Do not guess nox session names or provider/version coverage.
+- Do not guess nox session names or provider/version coverage. Provider version pins (including what "latest" resolves to) are in `py/pyproject.toml` `[tool.braintrust.matrix]`.
 - Default bug-fix workflow is red -> green.
 - Prefer VCR-backed provider tests over mocks or fakes whenever practical.
 - Treat mock/fake tests for provider behavior as an exception that requires justification, not as a neutral alternative.

.github/workflows/checks.yaml

Lines changed: 3 additions & 3 deletions
@@ -49,7 +49,7 @@ jobs:
       - name: Run pylint and type tests
         shell: bash
         run: |
-          mise exec python@${{ matrix.python-version }} -- nox -f ./py/noxfile.py -s pylint test_types
+          mise exec python@${{ matrix.python-version }} -- uv run --project ./py nox -f ./py/noxfile.py -s pylint test_types

   smoke:
     runs-on: ${{ matrix.os }}
@@ -71,10 +71,10 @@ jobs:
         run: |
           # This is already done by make install-dev, but we're keeping this as a separate step
           # to explicitly verify that installation works
-          mise exec python@${{ matrix.python-version }} -- uv pip install -e ./py[all]
+          mise exec python@${{ matrix.python-version }} -- uv sync --project ./py --all-extras
       - name: Test whether the Python SDK can be imported
         run: |
-          mise exec python@${{ matrix.python-version }} -- uv run --active --no-project python -c 'import braintrust'
+          mise exec python@${{ matrix.python-version }} -- uv run --project ./py python -c 'import braintrust'

   nox:
     runs-on: ${{ matrix.os }}

.github/workflows/dependency-updates.yml

Lines changed: 83 additions & 0 deletions
@@ -0,0 +1,83 @@
+name: Dependency updates
+on:
+  schedule:
+    - cron: "0 6 * * *" # daily 6am UTC
+  workflow_dispatch:
+
+permissions:
+  contents: write
+  pull-requests: write
+
+jobs:
+  update:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+      - name: Set up mise
+        uses: jdx/mise-action@1648a7812b9aeae629881980618f079932869151 # v4.0.1
+        with:
+          cache: true
+          experimental: true
+
+      - name: Upgrade lockfile
+        working-directory: py
+        run: uv lock --upgrade
+
+      - name: Determine labels
+        id: labels
+        working-directory: py
+        run: |
+          python3 << 'PYEOF' >> "$GITHUB_OUTPUT"
+          import subprocess, sys
+
+          if sys.version_info >= (3, 11):
+              import tomllib
+          else:
+              try:
+                  import tomllib
+              except ModuleNotFoundError:
+                  import tomli as tomllib
+
+          diff = subprocess.check_output(["git", "diff", "uv.lock"], text=True)
+          if not diff:
+              print("changed=false")
+              raise SystemExit(0)
+
+          # Read pyproject.toml to find provider SDK packages from the matrix table
+          with open("pyproject.toml", "rb") as f:
+              pyproject = tomllib.load(f)
+
+          matrix = pyproject.get("tool", {}).get("braintrust", {}).get("matrix", {})
+
+          # Extract the base package name from each matrix requirement string
+          provider_pkgs = set()
+          for _prefix, versions in matrix.items():
+              for req in versions.values():
+                  # req looks like "openai==1.92.0" or "pydantic-ai==1.82.0"
+                  pkg = req.split("==")[0].split(">=")[0].split("<=")[0].strip()
+                  provider_pkgs.add(pkg)
+
+          # Check if any provider package changed in the lockfile diff
+          needs_rerecord = any(pkg in diff for pkg in provider_pkgs)
+
+          print("changed=true")
+          print(f"needs_rerecord={str(needs_rerecord).lower()}")
+          PYEOF
+
+      - name: Get date
+        id: date
+        run: echo "date=$(date +%Y-%m-%d)" >> "$GITHUB_OUTPUT"
+
+      - name: Open PR
+        if: steps.labels.outputs.changed == 'true'
+        uses: peter-evans/create-pull-request@271a8d0340265f705b14b31e8c0e067c3b0d45ef # v7.0.8
+        with:
+          title: "chore(deps): daily dependency update"
+          body: |
+            Automated daily dependency update via `uv lock --upgrade`.
+
+            ${{ steps.labels.outputs.needs_rerecord == 'true' && '⚠️ **Provider SDK packages changed.** A human needs to re-record cassettes locally before merging.' || '✅ Only test infrastructure deps changed. Safe to merge if CI passes.' }}
+          branch: deps/daily-update-${{ steps.date.outputs.date }}
+          labels: |
+            dependencies
+            ${{ steps.labels.outputs.needs_rerecord == 'true' && 'needs-cassette-rerecord' || 'auto-merge-candidate' }}

.pre-commit-config.yaml

Lines changed: 9 additions & 0 deletions
@@ -19,6 +19,14 @@ repos:
       - id: ruff-format
       - id: ruff-check
         args: [--fix, --exit-non-zero-on-fix]
+  - repo: local
+    hooks:
+      - id: check-stale-cassettes
+        name: check stale cassettes
+        entry: python py/scripts/check-stale-cassettes.py
+        language: python
+        pass_filenames: false
+        files: (pyproject\.toml|cassettes/)
   - repo: https://github.com/codespell-project/codespell
     rev: v2.2.5
     hooks:
@@ -28,6 +36,7 @@ repos:
               .*\.(json|prisma|svg)
               |.*.yaml
               |.*/cassettes/.*
+              |.*uv\.lock
           )$
         args:
           - "-L"
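
Review note on the new `check-stale-cassettes` hook: `py/scripts/check-stale-cassettes.py` is not shown in this view. Per the AGENTS.md change in this commit, it compares on-disk cassette version directories against `[tool.braintrust.matrix]`, using `[tool.braintrust.cassette-dirs]` to map integration directory names to matrix keys. The sketch below illustrates that comparison under loudly stated assumptions: the directory layout (`<root>/<integration>/<version>/`), the rule that a version directory is named after its matrix version key, and every name and value in the example are hypothetical and may differ from the real script.

```python
import tempfile
from pathlib import Path


def find_stale_dirs(
    root: Path,
    cassette_dirs: dict[str, list[str]],
    matrix: dict[str, dict[str, str]],
) -> list[Path]:
    """Return version directories on disk that no mapped matrix entry covers.

    Assumed layout (hypothetical): <root>/<integration>/<version_key>/
    """
    stale = []
    for integration, matrix_keys in cassette_dirs.items():
        # Union of version keys across every matrix entry mapped to this directory.
        allowed = {version for key in matrix_keys for version in matrix.get(key, {})}
        base = root / integration
        if not base.is_dir():
            continue
        for child in sorted(base.iterdir()):
            if child.is_dir() and child.name not in allowed:
                stale.append(child)
    return stale


# Hypothetical matrix and directory mapping, for illustration only.
matrix = {"openai": {"latest": "openai==1.92.0", "1.77.0": "openai==1.77.0"}}
cassette_dirs = {"openai": ["openai"]}

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for version in ("latest", "1.77.0", "1.71.0"):
        (root / "openai" / version).mkdir(parents=True)
    stale_names = [p.name for p in find_stale_dirs(root, cassette_dirs, matrix)]

print(stale_names)  # ['1.71.0']
```

Here `1.71.0` has a directory on disk but no entry in the matrix, so it is reported as stale; a `--clean` mode like the one AGENTS.md mentions could delete the returned paths.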

.tool-versions

Lines changed: 1 addition & 1 deletion
@@ -1,3 +1,3 @@
 python 3.14.3
 pre-commit 4.2.0
-uv 0.7.8
+uv 0.11.6

AGENTS.md

Lines changed: 29 additions & 7 deletions
@@ -15,7 +15,8 @@ Use this file as the default playbook for work in this repository.
 2. **Use `mise` as the source of truth for tools and environment.**

 3. **Do not guess test commands or version coverage.**
-   - `py/noxfile.py` is the source of truth for nox session names, provider/version matrices, and local reproduction commands.
+   - `py/pyproject.toml` `[tool.braintrust.matrix]` is the source of truth for provider version pins (including what "latest" resolves to).
+   - `py/noxfile.py` reads those pins and is the source of truth for nox session names, parametrized version matrices, and local reproduction commands.
    - `.github/workflows/checks.yaml` is the source of truth for which sessions run in CI, on which Python versions, and outside vs. inside the nox shard matrix.
    - For provider and integration work, also check `py/src/braintrust/integrations/versioning.py`.

@@ -42,6 +43,8 @@ Use this file as the default playbook for work in this repository.
 ## Repo Map

 - `py/`: main Python package, tests, examples, nox sessions, build/release workflow
+- `py/pyproject.toml`: single source of truth for package metadata, dependency groups, and provider version matrix
+- `py/uv.lock`: committed lockfile for reproducible auxiliary dep resolution
 - `py/src/braintrust/`: SDK source
   - top-level package files: core SDK
   - `wrappers/`: wrappers
@@ -70,12 +73,7 @@ cd py
 make install-dev
 ```

-Install optional provider dependencies only when needed:
-
-```bash
-cd py
-make install-optional
-```
+Note: `install-dev` uses `uv sync` under the hood, reading dependency groups from `py/pyproject.toml`. There is no separate `requirements-dev.txt`.

 ## Default Workflow

@@ -236,6 +234,16 @@ BRAINTRUST_CLAUDE_AGENT_SDK_RECORD_MODE=all nox -s "test_claude_agent_sdk(latest

 Only re-record HTTP or subprocess cassettes when the behavior change is intentional. If unsure, ask the user.

+Stale cassette detection:
+
+- When versions are dropped from `[tool.braintrust.matrix]`, their cassette subdirectories become orphaned.
+- `py/scripts/check-stale-cassettes.py` compares on-disk cassette version directories against the matrix.
+- The mapping from integration directory name to matrix key(s) lives in `[tool.braintrust.cassette-dirs]` in `py/pyproject.toml`.
+- Runs automatically via pre-commit hook (triggers on `pyproject.toml` or `cassettes/` changes).
+- Manual check: `cd py && make check-stale-cassettes`
+- To auto-delete stale dirs: `cd py && python scripts/check-stale-cassettes.py --clean`
+- When adding a new integration with versioned cassettes, add an entry to `[tool.braintrust.cassette-dirs]`.
+
 ## Benchmarks

 If you touch a hot path such as serialization, deep-copy, span creation, or logging, consider benchmarks.
@@ -270,13 +278,27 @@ cd py
 make build
 ```

+The build uses `pyproject.toml` with the setuptools backend. There is no `setup.py`.
+
 Caveat:

 - `py/scripts/template-version.py` rewrites `py/src/braintrust/version.py` during build
 - `py/Makefile` restores that file afterward with `git checkout`

 Avoid editing `py/src/braintrust/version.py` while also running build commands.

+## Dependency Pinning
+
+All nox session dependency pins are centralized in `py/pyproject.toml`:
+
+- **`[dependency-groups]`** (PEP 735): base test deps, session-specific auxiliary deps (e.g. `test-litellm`, `test-langchain`), lint deps, build deps, and dev deps. These participate in `uv lock` and produce `py/uv.lock`.
+- **`[tool.braintrust.matrix]`**: provider version matrix pins (e.g. `openai`, `anthropic`). Each provider has a `latest` key pinned to an explicit version and additional older version keys. The noxfile reads these at import time to derive the parametrized version tuples.
+- **`[tool.uv.conflicts]`**: declares which dependency groups cannot coexist in one resolution (e.g. groups pinning different `openai` versions).
+
+The `LATEST` sentinel in the noxfile maps to the `latest` key in the matrix table — it is no longer a floating install.
+
+A daily GitHub Actions workflow (`.github/workflows/dependency-updates.yml`) runs `uv lock --upgrade`, classifies changes by reading group names from `pyproject.toml`, and opens a PR labeled `needs-cassette-rerecord` (provider SDK bumps) or `auto-merge-candidate` (infra-only bumps).
+
 ## Editing Guidelines

 - Keep tests close to the code they cover.

CONTRIBUTING.md

Lines changed: 8 additions & 9 deletions
@@ -23,10 +23,12 @@ If you use `mise activate` in your shell, entering the repo will automatically e
 ## Repo Layout

 - `py/`: main Python SDK
+- `py/pyproject.toml`: single source of truth for package metadata, dependency groups, and provider version matrix
+- `py/uv.lock`: committed lockfile for reproducible auxiliary dep resolution
 - `integrations/`: separate integration packages such as LangChain and ADK
 - `docs/`: supporting docs

-Most SDK changes should happen under `py/`.
+Most SDK changes should happen under `py/`. There is no `setup.py`; `pyproject.toml` is the build configuration.

 ## Common Workflows

@@ -40,6 +42,8 @@ make lint
 nox -l
 ```

+`make install-dev` uses `uv sync` under the hood, reading dependency groups from `py/pyproject.toml`. There are no separate `requirements-*.txt` files.
+
 Run a focused session:

 ```bash
@@ -54,12 +58,7 @@ cd py
 nox -s "test_openai(latest)" -- -k "test_chat_metrics"
 ```

-Install optional provider packages only when you need them:
-
-```bash
-cd py
-make install-optional
-```
+Optional provider packages are installed automatically by each nox session — there is no separate `install-optional` target.

 ### Repo-Level Commands

@@ -95,7 +94,7 @@ uv run pytest

 ## Testing Notes

-The SDK uses [nox](https://nox.thea.codes/) for compatibility testing across optional providers and versions. `py/noxfile.py` is the source of truth for available sessions.
+The SDK uses [nox](https://nox.thea.codes/) for compatibility testing across optional providers and versions. Provider version pins live in `py/pyproject.toml` under `[tool.braintrust.matrix]`; the noxfile reads them at import time. `py/noxfile.py` is the source of truth for available sessions and their auxiliary deps.

 ### VCR Tests

@@ -187,7 +186,7 @@ To benchmark with the optional `orjson` fast-path installed:

 ```bash
 cd py
-python -m uv pip install -e '.[performance]'
+uv sync --extra performance
 make bench
 ```

README.md

Lines changed: 3 additions & 0 deletions
@@ -13,6 +13,9 @@ This repository contains Braintrust's Python SDKs and integrations, including:
 Install the main SDK and scorer package:

 ```bash
+# uv
+uv add braintrust autoevals
+# pip
 pip install braintrust autoevals
 ```
