Add proposal for per-tenant cardinality API #7335

Merged
CharlieTLe merged 6 commits into cortexproject:master from CharlieTLe:proposal/per-tenant-tsdb-status-api on Mar 30, 2026

Member

@CharlieTLe CharlieTLe commented Mar 7, 2026

Summary

Proposal for a per-tenant cardinality API (GET /api/v1/cardinality) that exposes cardinality statistics (top metrics by series count, top labels by value count, top label-value pairs by series count) across two data sources:

  • source=head: Fans out to ingesters via the distributor, aggregates TSDB head stats with RF-based deduplication.
  • source=blocks: Fans out to store gateways via BlocksFinder + GetClientsFor, computes cardinality from block indexes with per-block caching.
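The RF-based deduplication on the head path can be sketched as follows. This is a minimal illustration, not the proposal's implementation: the names `MetricCount` and `mergeHeadStats` are hypothetical, and both the round-to-nearest division and the assumption that every series lives on exactly `rf` ingesters are simplifications, which is why the result is only a best-effort approximation.

```go
package main

import (
	"fmt"
	"sort"
)

// MetricCount pairs a metric name with its (approximate) series count.
// Hypothetical type, used only for this sketch.
type MetricCount struct {
	Name   string
	Series uint64
}

// mergeHeadStats sums per-ingester series counts and divides each total by
// the replication factor. It assumes every series is held by exactly rf
// ingesters, so the result is an approximation whenever that does not hold.
func mergeHeadStats(perIngester []map[string]uint64, rf uint64, limit int) []MetricCount {
	if rf == 0 {
		rf = 1 // avoid division by zero; treat as "no replication"
	}
	totals := make(map[string]uint64)
	for _, stats := range perIngester {
		for name, n := range stats {
			totals[name] += n
		}
	}
	out := make([]MetricCount, 0, len(totals))
	for name, n := range totals {
		// Round to nearest instead of truncating (an assumption of this sketch).
		out = append(out, MetricCount{Name: name, Series: (n + rf/2) / rf})
	}
	// Highest series count first; ties broken by name for stable output.
	sort.Slice(out, func(i, j int) bool {
		if out[i].Series != out[j].Series {
			return out[i].Series > out[j].Series
		}
		return out[i].Name < out[j].Name
	})
	if limit > 0 && len(out) > limit {
		out = out[:limit]
	}
	return out
}

func main() {
	ingesters := []map[string]uint64{
		{"http_requests_total": 300, "up": 10},
		{"http_requests_total": 300, "up": 10},
		{"http_requests_total": 300, "up": 10},
	}
	for _, m := range mergeHeadStats(ingesters, 3, 10) {
		fmt.Printf("%s %d\n", m.Name, m.Series)
	}
}
```

Applying the limit only after merging matters here: truncating each ingester's list first could drop a metric that is large in aggregate but not in any single ingester's top entries.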

Key design points:

  • start/end required for blocks path, rejected for head path (head cannot sub-filter)
  • Per-tenant limits: cardinality_api_enabled, cardinality_max_query_range, cardinality_max_concurrent_requests, cardinality_query_timeout
  • Standard {status, data} Prometheus response envelope with approximated field for block overlap / partial results
  • Phased rollout: head path first, blocks path second, behind per-tenant feature flag
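The parameter rules above could be enforced with a check along these lines. This is a sketch under stated assumptions: `params`, `validate`, `at`, and `defaultMaxRange` are hypothetical names, the 24h default and the 512 upper bound on `limit` come from the commit notes in this PR, and the treatment of a missing `source` as an error is an assumption of the example. Per the review changes, a failed check would map to HTTP 400.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

const (
	maxLimit        = 512            // upper bound on limit noted in the PR
	defaultMaxRange = 24 * time.Hour // default for cardinality_max_query_range
)

// params holds the parsed query parameters (hypothetical type).
// A zero Start or End means the parameter was not supplied.
type params struct {
	Source     string
	Start, End time.Time
	Limit      int
}

// validate returns a non-nil error for any request that should be rejected.
func validate(p params, maxRange time.Duration) error {
	if p.Limit < 1 || p.Limit > maxLimit {
		return fmt.Errorf("limit must be between 1 and %d", maxLimit)
	}
	switch p.Source {
	case "head":
		// The head cannot be sub-filtered by time, so a range is rejected
		// rather than silently ignored.
		if !p.Start.IsZero() || !p.End.IsZero() {
			return errors.New("start/end are not supported for source=head")
		}
	case "blocks":
		if p.Start.IsZero() || p.End.IsZero() {
			return errors.New("start and end are required for source=blocks")
		}
		if !p.End.After(p.Start) {
			return errors.New("end must be after start")
		}
		if maxRange > 0 && p.End.Sub(p.Start) > maxRange {
			return fmt.Errorf("query range exceeds the limit of %s", maxRange)
		}
	default:
		return fmt.Errorf("unknown source %q", p.Source)
	}
	return nil
}

// at returns a fixed reference time plus h hours (helper for examples).
func at(h int) time.Time {
	return time.Date(2026, time.March, 1, 0, 0, 0, 0, time.UTC).Add(time.Duration(h) * time.Hour)
}

func main() {
	fmt.Println(validate(params{Source: "head", Limit: 10}, defaultMaxRange))
	fmt.Println(validate(params{Source: "blocks", Start: at(0), End: at(48), Limit: 10}, defaultMaxRange))
}
```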

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>
Comment thread docs/proposals/per-tenant-tsdb-status-api.md Outdated
| `labelValueCountByLabelName` | No | Portable to block storage |
| `seriesCountByLabelValuePair` | No | Portable to block storage |
| `memoryInBytesByLabelName` | **Yes** | In-memory byte usage has no analogue in object storage |
| `minTime` / `maxTime` | **Yes** | Reflects head time range, not total storage |
Contributor

Do we need to add those head-specific fields?

CharlieTLe and others added 5 commits March 13, 2026 12:38
Extend TSDB status proposal with long-term storage cardinality via store gateways

Add source=blocks query parameter to analyze cardinality from compacted
blocks in object storage. The blocks path fans out to store gateways,
which compute statistics from block index headers (cheap label value
counts) and posting list expansion (exact series counts per metric).
Results are cached per immutable block.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>
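Because compacted blocks are immutable, per-block results can be cached indefinitely. The sketch below illustrates the idea of caching the full result keyed by block ULID and applying `limit` only when answering a request, as described in a later commit on this PR. The names `blockCache`, `metricCount`, and `topMetrics` are hypothetical, and holding the mutex during computation is a simplification that serializes concurrent misses.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// metricCount pairs a metric name with its series count (hypothetical type).
type metricCount struct {
	Name   string
	Series uint64
}

// blockCache stores the full, sorted per-block result keyed by block ULID.
type blockCache struct {
	mu       sync.Mutex
	entries  map[string][]metricCount
	computes int // number of cache misses, exposed for illustration
}

func newBlockCache() *blockCache {
	return &blockCache{entries: make(map[string][]metricCount)}
}

// topMetrics returns at most limit entries for the block. The expensive
// compute function runs only on the first request for a given ULID; the
// full result is cached so later requests may ask for a larger limit.
func (c *blockCache) topMetrics(ulid string, limit int, compute func() map[string]uint64) []metricCount {
	c.mu.Lock()
	defer c.mu.Unlock()

	full, ok := c.entries[ulid]
	if !ok {
		for name, n := range compute() {
			full = append(full, metricCount{Name: name, Series: n})
		}
		sort.Slice(full, func(i, j int) bool {
			if full[i].Series != full[j].Series {
				return full[i].Series > full[j].Series
			}
			return full[i].Name < full[j].Name
		})
		c.entries[ulid] = full
		c.computes++
	}
	if limit < 0 {
		limit = 0
	}
	if limit > len(full) {
		limit = len(full)
	}
	// Copy so callers cannot mutate the cached slice.
	return append([]metricCount(nil), full[:limit]...)
}

func main() {
	cache := newBlockCache()
	scan := func() map[string]uint64 {
		return map[string]uint64{"http_requests_total": 500, "up": 20, "go_goroutines": 40}
	}
	fmt.Println(cache.topMetrics("block-A", 2, scan)) // first call computes
	fmt.Println(cache.topMetrics("block-A", 3, scan)) // served from cache
	fmt.Println("computes:", cache.computes)
}
```

Caching the truncated result instead would force a recomputation whenever a later request asks for a larger `limit`, which is the problem the "limit at response time" change avoids.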
Update proposal based on PR review: rename to Cardinality API and simplify

Address feedback from PR cortexproject#7335 review:
- Rename endpoint from /api/v1/status/tsdb to /api/v1/cardinality
- Drop Prometheus compatibility as a goal
- Add start/end time range query parameters
- Drop head-specific fields (numLabelPairs, memoryInBytesByLabelName,
  minTime, maxTime) to unify response across both sources
- Remove API Compatibility and Field Portability sections

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>
Require start/end for blocks path and add per-tenant max query range limit

Make start/end required for source=blocks to prevent unbounded block
scanning. Add cardinality_max_query_range per-tenant limit (default 24h)
to give operators control over the blast radius.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>
Address all review findings from proposal review

Critical:
- Fix blocks path aggregation: no SG RF division since GetClientsFor
  routes each block to exactly one store gateway

Significant:
- Add min_time, max_time, block_ids to store gateway CardinalityRequest
- Specify MaxErrors=0 for head path with availability implications
- Add consistency check and retry logic for blocks path
- Document RF division as best-effort approximation

Moderate:
- Wrap responses in standard {status, data} Prometheus envelope
- Change HTTP 422 to HTTP 400 for limit violations
- Add Error Responses section with all validation scenarios
- Add approximated field for block overlap and partial results
- Add Observability section with metrics
- Add per-tenant concurrency limit and query timeout
- Reject start/end for source=head instead of silently ignoring

Low:
- Add Rollout Plan with phased approach and feature flag
- Document rolling upgrade compatibility (Unimplemented handling)
- Document Query Frontend bypass
- Improve caching: full results keyed by ULID, limit at response time
- Add missing files to implementation section
- Move shared proto to pkg/cortexpb/cardinality.proto
- Rename TSDBStatus* to Cardinality* throughout
- Add limit upper bound (max 512)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>
Rename proposal file to per-tenant-cardinality-api.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>
@CharlieTLe changed the title from "Add proposal for per-tenant TSDB status API" to "Add proposal for per-tenant cardinality API" on Mar 13, 2026
Member

@friedrichg friedrichg left a comment

Looks like a great feature. Thanks for working on this!

- Zero read-time cost — statistics are available immediately from block metadata.
- The compactor already reads the full block index during compaction and validation (`GatherIndexHealthStats`).

**Cons:**
Member

I think one extra con here is that we would need to align with block ranges.

@dosubot (bot) added the `lgtm` label ("This PR has been approved by a maintainer") on Mar 16, 2026
Member

@SungJin1212 SungJin1212 left a comment

LGTM, I think it would be good to configure the Admin UI using those APIs.

@CharlieTLe CharlieTLe merged commit 38a0c36 into cortexproject:master Mar 30, 2026
64 of 65 checks passed

The HTTP handler parses the `source` parameter and delegates to the appropriate backend. The endpoint is registered via `NewQuerierHandler` in `pkg/api/handlers.go` and does **not** go through the Query Frontend — it is served directly by the Querier. The Query Frontend's splitting, caching, and retry logic is designed for PromQL queries and does not apply to cardinality statistics. The Querier's own per-tenant concurrency limit provides sufficient request control (see [Per-Tenant Limits](#per-tenant-limits)).
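As a rough illustration of the dispatch described in this paragraph, the sketch below parses `source`, delegates to a per-source backend, and wraps the result in the `{status, data}` envelope. It is not the code from `pkg/api/handlers.go`: the types `cardinalityData`, `envelope`, and `backend`, the helpers `newHandler`, `demoHandler`, and `statusFor`, and the choice to reject a missing `source` are all assumptions of this example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// cardinalityData is a placeholder payload; the field names are illustrative.
type cardinalityData struct {
	Source       string `json:"source"`
	Approximated bool   `json:"approximated"`
}

// envelope mirrors the Prometheus-style {status, data} response wrapper.
type envelope struct {
	Status string           `json:"status"`
	Data   *cardinalityData `json:"data,omitempty"`
	Error  string           `json:"error,omitempty"`
}

// backend computes statistics for one data source (head or blocks).
type backend func(r *http.Request) (*cardinalityData, error)

func writeJSON(w http.ResponseWriter, code int, body envelope) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(code)
	_ = json.NewEncoder(w).Encode(body)
}

// newHandler picks a backend from the source query parameter and answers
// with HTTP 400 when the parameter does not name a known source.
func newHandler(backends map[string]backend) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		source := r.URL.Query().Get("source")
		b, ok := backends[source]
		if !ok {
			writeJSON(w, http.StatusBadRequest,
				envelope{Status: "error", Error: fmt.Sprintf("invalid source %q", source)})
			return
		}
		data, err := b(r)
		if err != nil {
			writeJSON(w, http.StatusInternalServerError,
				envelope{Status: "error", Error: err.Error()})
			return
		}
		writeJSON(w, http.StatusOK, envelope{Status: "success", Data: data})
	}
}

// demoHandler wires up stub backends that return fixed data.
func demoHandler() http.Handler {
	return newHandler(map[string]backend{
		"head": func(*http.Request) (*cardinalityData, error) {
			return &cardinalityData{Source: "head"}, nil
		},
		"blocks": func(*http.Request) (*cardinalityData, error) {
			return &cardinalityData{Source: "blocks", Approximated: true}, nil
		},
	})
}

// statusFor issues an in-process request and returns the HTTP status code
// together with the envelope's status field.
func statusFor(h http.Handler, target string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, target, nil))
	var env envelope
	_ = json.Unmarshal(rec.Body.Bytes(), &env)
	return rec.Code, env.Status
}

func main() {
	h := demoHandler()
	fmt.Println(statusFor(h, "/api/v1/cardinality?source=head"))
	fmt.Println(statusFor(h, "/api/v1/cardinality?source=bogus"))
}
```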

#### Head Path (`source=head`)
Contributor

`head` vs. `blocks` seems more like an implementation detail of the underlying storage. Would it be better to just select the source based on the time range?
From a user's perspective, I think it is very confusing to see these storage details.


#### Blocks Path (`source=blocks`)

The request flows through the Querier's HTTP handler, which fans out to store gateways using the same `BlocksFinder` + `GetClientsFor` pattern used by `LabelNames`, `LabelValues`, and `Series`:
Contributor

I am confused: I thought head and blocks both live in the ingesters, as the in-memory part and the on-disk part respectively. Should we clarify this? These names feel a bit confusing to me.


#### Store Gateway Service

A new `Cardinality` RPC is added to the StoreGateway service in `pkg/storegateway/storegatewaypb/gateway.proto`:
Contributor

Would this require changes to Thanos, since we use the Thanos Store Gateway underneath? If so, we would need a proposal there as well.

CharlieTLe added a commit to CharlieTLe/cortex that referenced this pull request Apr 6, 2026
* Add proposal for per-tenant TSDB status API

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>

* Extend TSDB status proposal with long-term storage cardinality via store gateways

Add source=blocks query parameter to analyze cardinality from compacted
blocks in object storage. The blocks path fans out to store gateways,
which compute statistics from block index headers (cheap label value
counts) and posting list expansion (exact series counts per metric).
Results are cached per immutable block.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>

* Update proposal based on PR review: rename to Cardinality API and simplify

Address feedback from PR cortexproject#7335 review:
- Rename endpoint from /api/v1/status/tsdb to /api/v1/cardinality
- Drop Prometheus compatibility as a goal
- Add start/end time range query parameters
- Drop head-specific fields (numLabelPairs, memoryInBytesByLabelName,
  minTime, maxTime) to unify response across both sources
- Remove API Compatibility and Field Portability sections

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>

* Require start/end for blocks path and add per-tenant max query range limit

Make start/end required for source=blocks to prevent unbounded block
scanning. Add cardinality_max_query_range per-tenant limit (default 24h)
to give operators control over the blast radius.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>

* Address all review findings from proposal review

Critical:
- Fix blocks path aggregation: no SG RF division since GetClientsFor
  routes each block to exactly one store gateway

Significant:
- Add min_time, max_time, block_ids to store gateway CardinalityRequest
- Specify MaxErrors=0 for head path with availability implications
- Add consistency check and retry logic for blocks path
- Document RF division as best-effort approximation

Moderate:
- Wrap responses in standard {status, data} Prometheus envelope
- Change HTTP 422 to HTTP 400 for limit violations
- Add Error Responses section with all validation scenarios
- Add approximated field for block overlap and partial results
- Add Observability section with metrics
- Add per-tenant concurrency limit and query timeout
- Reject start/end for source=head instead of silently ignoring

Low:
- Add Rollout Plan with phased approach and feature flag
- Document rolling upgrade compatibility (Unimplemented handling)
- Document Query Frontend bypass
- Improve caching: full results keyed by ULID, limit at response time
- Add missing files to implementation section
- Move shared proto to pkg/cortexpb/cardinality.proto
- Rename TSDBStatus* to Cardinality* throughout
- Add limit upper bound (max 512)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>

* Rename proposal file to per-tenant-cardinality-api.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>

---------

Signed-off-by: Charlie Le <charlie_le@apple.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>
Labels: lgtm, size/L, type/feature

4 participants