Add proposal for per-tenant cardinality API #7335
CharlieTLe merged 6 commits into cortexproject:master
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>
| `labelValueCountByLabelName` | No | Portable to block storage |
| `seriesCountByLabelValuePair` | No | Portable to block storage |
| `memoryInBytesByLabelName` | **Yes** | In-memory byte usage has no analogue in object storage |
| `minTime` / `maxTime` | **Yes** | Reflects head time range, not total storage |
Do we need to add those head-specific fields?
…ore gateways

Add source=blocks query parameter to analyze cardinality from compacted blocks in object storage. The blocks path fans out to store gateways, which compute statistics from block index headers (cheap label value counts) and posting list expansion (exact series counts per metric). Results are cached per immutable block.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>
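Per-block caching of this kind could be sketched roughly as follows. This is an illustration under stated assumptions, not the proposal's code: `metricCount`, `blockStatsCache`, `get`, and `topN` are hypothetical names, and the cache is a plain in-memory map. It stores the full result per block ULID and trims to the requested limit only when building a response, which matches the caching refinement described in the later review follow-up.

```go
package main

import (
	"sort"
	"sync"
)

// metricCount pairs a metric name with its series count in one block.
// Hypothetical type, introduced only for this sketch.
type metricCount struct {
	Name   string
	Series uint64
}

// blockStatsCache memoizes full per-block results, keyed by block ULID.
// Blocks are immutable, so an entry never needs invalidation.
type blockStatsCache struct {
	mu      sync.Mutex
	byBlock map[string][]metricCount
}

func newBlockStatsCache() *blockStatsCache {
	return &blockStatsCache{byBlock: make(map[string][]metricCount)}
}

// get returns the cached stats for blockID, calling compute only on a miss.
// Holding the lock during compute keeps the sketch simple, at the cost of
// serializing cache misses.
func (c *blockStatsCache) get(blockID string, compute func() []metricCount) []metricCount {
	c.mu.Lock()
	defer c.mu.Unlock()
	if stats, ok := c.byBlock[blockID]; ok {
		return stats
	}
	stats := compute()
	// Sort by series count descending, breaking ties by name for stable output.
	sort.Slice(stats, func(i, j int) bool {
		if stats[i].Series != stats[j].Series {
			return stats[i].Series > stats[j].Series
		}
		return stats[i].Name < stats[j].Name
	})
	c.byBlock[blockID] = stats
	return stats
}

// topN trims an already sorted result to the caller's limit at response time,
// so one cached entry can serve requests with different limits.
func topN(stats []metricCount, limit int) []metricCount {
	if limit < 0 {
		limit = 0
	}
	if limit > len(stats) {
		limit = len(stats)
	}
	return stats[:limit]
}
```

Caching the untrimmed result means a request with `limit=10` and a later one with `limit=100` can both be answered from the same entry.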
…plify

Address feedback from PR cortexproject#7335 review:
- Rename endpoint from /api/v1/status/tsdb to /api/v1/cardinality
- Drop Prometheus compatibility as a goal
- Add start/end time range query parameters
- Drop head-specific fields (numLabelPairs, memoryInBytesByLabelName, minTime, maxTime) to unify response across both sources
- Remove API Compatibility and Field Portability sections

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>
…limit

Make start/end required for source=blocks to prevent unbounded block scanning. Add cardinality_max_query_range per-tenant limit (default 24h) to give operators control over the blast radius.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>
Critical:
- Fix blocks path aggregation: no SG RF division since GetClientsFor
routes each block to exactly one store gateway
Significant:
- Add min_time, max_time, block_ids to store gateway CardinalityRequest
- Specify MaxErrors=0 for head path with availability implications
- Add consistency check and retry logic for blocks path
- Document RF division as best-effort approximation
Moderate:
- Wrap responses in standard {status, data} Prometheus envelope
- Change HTTP 422 to HTTP 400 for limit violations
- Add Error Responses section with all validation scenarios
- Add approximated field for block overlap and partial results
- Add Observability section with metrics
- Add per-tenant concurrency limit and query timeout
- Reject start/end for source=head instead of silently ignoring
Low:
- Add Rollout Plan with phased approach and feature flag
- Document rolling upgrade compatibility (Unimplemented handling)
- Document Query Frontend bypass
- Improve caching: full results keyed by ULID, limit at response time
- Add missing files to implementation section
- Move shared proto to pkg/cortexpb/cardinality.proto
- Rename TSDBStatus* to Cardinality* throughout
- Add limit upper bound (max 512)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>
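The RF-division point in the commit above can be illustrated with a small sketch. It assumes each responder reports a map from metric name to series count, and `mergeSeriesCounts` is a hypothetical helper, not a function from the proposal. On the head path each series is replicated to roughly RF ingesters, so the summed counts are divided by RF as a best-effort approximation; on the blocks path each block is routed to exactly one store gateway, so no division applies (divisor 1). How counts from different blocks are reconciled for overlapping series is outside this sketch.

```go
package main

// mergeSeriesCounts sums per-metric series counts across responders and
// divides each total by replicationFactor. Integer division truncates, which
// is acceptable here because the result is only a best-effort approximation.
// Pass replicationFactor = 1 when each result is reported exactly once.
func mergeSeriesCounts(perNode []map[string]uint64, replicationFactor int) map[string]uint64 {
	totals := make(map[string]uint64)
	for _, counts := range perNode {
		for metric, n := range counts {
			totals[metric] += n
		}
	}
	if replicationFactor > 1 {
		rf := uint64(replicationFactor)
		for metric, n := range totals {
			totals[metric] = n / rf
		}
	}
	return totals
}
```

For example, with RF=3 and three ingesters each reporting 100 series for `http_requests_total`, the merged estimate is 300 / 3 = 100. The estimate drifts when a series is held by fewer than RF ingesters, which is why the division is documented as approximate.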
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>
friedrichg left a comment
Looks like a great feature. Thanks for working on this!
- Zero read-time cost: statistics are available immediately from block metadata.
- The compactor already reads the full block index during compaction and validation (`GatherIndexHealthStats`).

**Cons:**
I think one extra con here is that we would need to align with block ranges.
SungJin1212 left a comment
LGTM. I think it would be good to configure the Admin UI using these APIs.
The HTTP handler parses the `source` parameter and delegates to the appropriate backend. The endpoint is registered via `NewQuerierHandler` in `pkg/api/handlers.go` and does **not** go through the Query Frontend; it is served directly by the Querier. The Query Frontend's splitting, caching, and retry logic is designed for PromQL queries and does not apply to cardinality statistics. The Querier's own per-tenant concurrency limit provides sufficient request control (see [Per-Tenant Limits](#per-tenant-limits)).

#### Head Path (`source=head`)
`head` vs. `blocks` seems more like an implementation detail of the underlying storage. Would it be better to select the source by time range instead?
From a user's perspective, it is confusing to see these storage details.
#### Blocks Path (`source=blocks`)

The request flows through the Querier's HTTP handler, which fans out to store gateways using the same `BlocksFinder` + `GetClientsFor` pattern used by `LabelNames`, `LabelValues`, and `Series`:
I am confused: I thought head and blocks both referred to ingesters, as the in-memory and on-disk parts respectively. Should we clarify this? These names feel a bit confusing to me.
#### Store Gateway Service

A new `Cardinality` RPC is added to the StoreGateway service in `pkg/storegateway/storegatewaypb/gateway.proto`:
Would this require changes to Thanos, since we use the Thanos Store Gateway underneath? If so, we need a proposal there as well.
* Add proposal for per-tenant TSDB status API

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
  Signed-off-by: Charlie Le <charlie_le@apple.com>

* Extend TSDB status proposal with long-term storage cardinality via store gateways

  Add source=blocks query parameter to analyze cardinality from compacted blocks in object storage. The blocks path fans out to store gateways, which compute statistics from block index headers (cheap label value counts) and posting list expansion (exact series counts per metric). Results are cached per immutable block.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
  Signed-off-by: Charlie Le <charlie_le@apple.com>

* Update proposal based on PR review: rename to Cardinality API and simplify

  Address feedback from PR cortexproject#7335 review:
  - Rename endpoint from /api/v1/status/tsdb to /api/v1/cardinality
  - Drop Prometheus compatibility as a goal
  - Add start/end time range query parameters
  - Drop head-specific fields (numLabelPairs, memoryInBytesByLabelName, minTime, maxTime) to unify response across both sources
  - Remove API Compatibility and Field Portability sections

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
  Signed-off-by: Charlie Le <charlie_le@apple.com>

* Require start/end for blocks path and add per-tenant max query range limit

  Make start/end required for source=blocks to prevent unbounded block scanning. Add cardinality_max_query_range per-tenant limit (default 24h) to give operators control over the blast radius.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
  Signed-off-by: Charlie Le <charlie_le@apple.com>

* Address all review findings from proposal review

  Critical:
  - Fix blocks path aggregation: no SG RF division since GetClientsFor routes each block to exactly one store gateway

  Significant:
  - Add min_time, max_time, block_ids to store gateway CardinalityRequest
  - Specify MaxErrors=0 for head path with availability implications
  - Add consistency check and retry logic for blocks path
  - Document RF division as best-effort approximation

  Moderate:
  - Wrap responses in standard {status, data} Prometheus envelope
  - Change HTTP 422 to HTTP 400 for limit violations
  - Add Error Responses section with all validation scenarios
  - Add approximated field for block overlap and partial results
  - Add Observability section with metrics
  - Add per-tenant concurrency limit and query timeout
  - Reject start/end for source=head instead of silently ignoring

  Low:
  - Add Rollout Plan with phased approach and feature flag
  - Document rolling upgrade compatibility (Unimplemented handling)
  - Document Query Frontend bypass
  - Improve caching: full results keyed by ULID, limit at response time
  - Add missing files to implementation section
  - Move shared proto to pkg/cortexpb/cardinality.proto
  - Rename TSDBStatus* to Cardinality* throughout
  - Add limit upper bound (max 512)

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
  Signed-off-by: Charlie Le <charlie_le@apple.com>

* Rename proposal file to per-tenant-cardinality-api.md

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
  Signed-off-by: Charlie Le <charlie_le@apple.com>

---------

Signed-off-by: Charlie Le <charlie_le@apple.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>
Summary
Proposal for a per-tenant cardinality API (`GET /api/v1/cardinality`) that exposes cardinality statistics (top metrics by series count, top labels by value count, top label-value pairs by series count) across two data sources:

- `source=head`: Fans out to ingesters via the distributor, aggregates TSDB head stats with RF-based deduplication.
- `source=blocks`: Fans out to store gateways via `BlocksFinder` + `GetClientsFor`, computes cardinality from block indexes with per-block caching.

Key design points:

- `start`/`end` required for blocks path, rejected for head path (head cannot sub-filter)
- `cardinality_api_enabled`, `cardinality_max_query_range`, `cardinality_max_concurrent_requests`, `cardinality_query_timeout`
- `{status, data}` Prometheus response envelope with `approximated` field for block overlap / partial results

🤖 Generated with Claude Code
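The parameter rules in the summary could be checked along these lines. This is a minimal sketch under several assumptions that are not confirmed by the PR: `validateCardinalityQuery` is a hypothetical helper, timestamps are assumed to be RFC 3339 strings, an empty `limit` is assumed to mean "use the server default", and an empty or unknown `source` is treated as an error. The 512 upper bound and the 24h default for `cardinality_max_query_range` come from the commit messages; a handler would presumably map any returned error to HTTP 400.

```go
package main

import (
	"errors"
	"strconv"
	"time"
)

const (
	maxLimit             = 512            // upper bound on limit, per the review follow-up
	defaultMaxQueryRange = 24 * time.Hour // stated default of cardinality_max_query_range
)

// validateCardinalityQuery checks raw query parameter values.
// Hypothetical helper: names, error texts, and the RFC 3339 format are assumptions.
func validateCardinalityQuery(source, start, end, limit string, maxRange time.Duration) error {
	if limit != "" {
		n, err := strconv.Atoi(limit)
		if err != nil || n < 1 || n > maxLimit {
			return errors.New("limit must be an integer between 1 and 512")
		}
	}
	switch source {
	case "head":
		// The head cannot sub-filter by time, so a range is rejected rather than ignored.
		if start != "" || end != "" {
			return errors.New("start and end are not supported for source=head")
		}
		return nil
	case "blocks":
		// A range is mandatory to prevent unbounded block scanning.
		if start == "" || end == "" {
			return errors.New("start and end are required for source=blocks")
		}
		s, err := time.Parse(time.RFC3339, start)
		if err != nil {
			return errors.New("invalid start")
		}
		e, err := time.Parse(time.RFC3339, end)
		if err != nil {
			return errors.New("invalid end")
		}
		if !e.After(s) {
			return errors.New("end must be after start")
		}
		if e.Sub(s) > maxRange {
			return errors.New("time range exceeds cardinality_max_query_range")
		}
		return nil
	default:
		return errors.New("source must be head or blocks")
	}
}
```

Under these assumptions, `validateCardinalityQuery("blocks", "2024-01-01T00:00:00Z", "2024-01-02T00:00:00Z", "10", defaultMaxQueryRange)` returns `nil`, while the same call with a 48-hour range or with `source=head` returns an error.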