storage: introduce StorageScalarExpr for postgres source casts#36047
Open
antiguru wants to merge 8 commits into MaterializeInc:main from
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy the exact eval implementations from mz_expr's CastStringTo* functions. Each CastFunc variant delegates to the same strconv::parse_* functions to ensure bit-for-bit identical behavior. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace MirScalarExpr with StorageScalarExpr in the full column cast pipeline:
* Change PostgresSourceExportDetails.column_casts field type
* Rewrite generate_column_casts to build StorageScalarExpr directly from PG type metadata, bypassing plan_cast/lower_uncorrelated
* Update cast_row and SourceOutputInfo in storage to use the new type

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add a `#[cfg(test)] mod tests` block to `src/storage-types/src/sources/casts.rs` covering all categories: simple scalar casts (Bool, Int16, Int32, Int64, Float32, Float64, Date, Uuid, Bytes, Jsonb), parameterized casts (Numeric with/without scale, Timestamp with/without precision, Char, VarChar), null propagation, ErrorIfNull (fires on null, passes through non-null), error cases (invalid input, fail_on_len), Literal unpacking, and Column extraction. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Handle Json as jsonb (matching old plan_cast behavior) and produce TableContainsUningestableTypes errors for reg* types with proper table/column context. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add 22 parity tests that compare StorageScalarExpr::eval against MirScalarExpr::eval for identical inputs, asserting structural equality of both Ok and Err results. Also fix error handling for uningestable PG types (reg*, json): produce TableContainsUningestableTypes with table/column context, and treat json as jsonb. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
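A minimal sketch of the parity-testing idea in this commit, assuming two independent evaluators that must agree on both `Ok` and `Err` results. The names `reference_cast`, `storage_cast`, `CastError`, and `parity_mismatches` are hypothetical stand-ins and are not taken from the PR; the real tests compare `StorageScalarExpr::eval` against `MirScalarExpr::eval`.

```rust
// Hypothetical error type; the real code uses mz_expr's EvalError.
#[derive(Debug, PartialEq)]
pub enum CastError {
    InvalidInt(String),
}

// Stand-in for the existing (reference) evaluator.
pub fn reference_cast(input: Option<&str>) -> Result<Option<i32>, CastError> {
    match input {
        // NULL in, NULL out.
        None => Ok(None),
        Some(s) => s
            .trim()
            .parse::<i32>()
            .map(Some)
            .map_err(|_| CastError::InvalidInt(s.to_string())),
    }
}

// Stand-in for the new evaluator, written independently of the reference.
pub fn storage_cast(input: Option<&str>) -> Result<Option<i32>, CastError> {
    let Some(s) = input else { return Ok(None) };
    match s.trim().parse::<i32>() {
        Ok(v) => Ok(Some(v)),
        Err(_) => Err(CastError::InvalidInt(s.to_string())),
    }
}

// Returns a debug rendering of every input on which the two evaluators
// disagree. Whole Results are compared, so Err payloads must match too.
pub fn parity_mismatches(inputs: &[Option<&str>]) -> Vec<String> {
    inputs
        .iter()
        .filter(|i| reference_cast(**i) != storage_cast(**i))
        .map(|i| format!("{i:?}"))
        .collect()
}
```

Comparing whole `Result` values, rather than only checking `is_err()`, is what lets such a test catch a divergence in the error payload as well as in the success value.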
Assert exact EvalError variants for each CastFunc on invalid input. These tests lock down error behavior independently of MirScalarExpr — parity tests can be removed in a follow-up PR while these remain. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Error variants, error messages, and output Datum types must not change across releases. Document external crate dependencies whose version bumps could alter parse behavior. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
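One way to lock down both the error variant and its rendered message is a test that asserts on each explicitly. The sketch below uses a hypothetical two-variant `EvalError` with made-up message text; the real variants and messages live in Materialize's crates and are not reproduced here.

```rust
use std::fmt;

// Hypothetical error type with illustrative message text only.
#[derive(Debug, Clone, PartialEq)]
pub enum EvalError {
    InvalidBool(String),
    InvalidInt(String),
}

impl fmt::Display for EvalError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            EvalError::InvalidBool(s) => {
                write!(f, "invalid input syntax for type boolean: {s:?}")
            }
            EvalError::InvalidInt(s) => {
                write!(f, "invalid input syntax for type integer: {s:?}")
            }
        }
    }
}

// Simplified cast: accepts only a few spellings, unlike a real parser.
pub fn cast_bool(s: &str) -> Result<bool, EvalError> {
    match s.trim() {
        "t" | "true" => Ok(true),
        "f" | "false" => Ok(false),
        _ => Err(EvalError::InvalidBool(s.to_string())),
    }
}

pub fn cast_int(s: &str) -> Result<i32, EvalError> {
    s.trim()
        .parse::<i32>()
        .map_err(|_| EvalError::InvalidInt(s.to_string()))
}
```

A test would then assert the exact variant with `assert_eq!(cast_bool("maybe"), Err(EvalError::InvalidBool("maybe".to_string())))` and the exact message via `cast_int("1x").unwrap_err().to_string()`, so that a change to either fails the build.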
Replace `MirScalarExpr` in `PostgresSourceExportDetails.column_casts` with a new closed-enum `StorageScalarExpr` in `mz-storage-types`.

Storage needs stable cast evaluation that does not change when the expression framework evolves.

* Add `StorageScalarExpr` (4 variants: Column, Literal, CallUnary, ErrorIfNull) and `CastFunc` (24 cast variants) in `mz-storage-types`
* Copy the eval implementations from the `mz_expr` CastStringTo* functions, delegating to the same `strconv::parse_*` functions
* Rewrite `generate_column_casts` to build `StorageScalarExpr` directly from PG type metadata, bypassing the HIR/MIR planning pipeline
* Update `cast_row` in storage to use `StorageScalarExpr::eval`
* `column_casts` is never durably persisted, always recomputed at plan time
* Add parity tests against `MirScalarExpr`
* Add tests asserting exact `EvalError` variants for each cast function

Stability contract

Error variants, error messages, and output Datum types produced by `StorageScalarExpr::eval` must not change across releases.

The eval implementations delegate to `mz_repr::strconv::parse_*`, which depend on these external crates whose version bumps may alter parse behavior:

* `chrono` / `chrono-tz`: date, time, timestamp, timestamptz parsing
* `dec` (`decnumber-sys`): numeric parsing and rescaling
* `serde_json`: jsonb parsing
* `uuid`: uuid parsing
* `hex`: bytea hex decoding
* `regex`: float special-value detection (inf, NaN)
* `ordered-float`: float Datum representation

Additionally, Materialize-internal crates control behavior for:

* `mz-repr` (strconv, adt types): core parsing logic
* `mz-pgtz`: timezone resolution
* `mz-ore`: lexing for container types (array, list, map, range)
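As a rough illustration of the closed-enum design described in this PR, the sketch below models the four expression variants and a small evaluator. Only the variant names (Column, Literal, CallUnary, ErrorIfNull) come from the description. The `Datum` and `EvalError` types, the field names, and the two `CastFunc` variants are simplified assumptions, not the definitions in `mz-storage-types`.

```rust
// Simplified stand-ins for mz_repr::Datum and mz_expr::EvalError.
#[derive(Debug, Clone, PartialEq)]
pub enum Datum {
    Null,
    Bool(bool),
    Int32(i32),
    String(String),
}

#[derive(Debug, Clone, PartialEq)]
pub enum EvalError {
    InvalidInput { target: &'static str, input: String },
    NullNotAllowed(String),
}

// Hypothetical subset; the PR describes 24 cast variants.
#[derive(Debug, Clone, PartialEq)]
pub enum CastFunc {
    StringToBool,
    StringToInt32,
}

impl CastFunc {
    fn cast(&self, s: &str) -> Result<Datum, EvalError> {
        match self {
            CastFunc::StringToBool => match s.trim() {
                "t" | "true" => Ok(Datum::Bool(true)),
                "f" | "false" => Ok(Datum::Bool(false)),
                _ => Err(EvalError::InvalidInput { target: "boolean", input: s.to_string() }),
            },
            CastFunc::StringToInt32 => s
                .trim()
                .parse::<i32>()
                .map(Datum::Int32)
                .map_err(|_| EvalError::InvalidInput { target: "integer", input: s.to_string() }),
        }
    }
}

// Closed enum: adding behavior requires an explicit new variant.
#[derive(Debug, Clone, PartialEq)]
pub enum StorageScalarExpr {
    Column(usize),
    Literal(Datum),
    CallUnary { func: CastFunc, expr: Box<StorageScalarExpr> },
    ErrorIfNull { expr: Box<StorageScalarExpr>, message: String },
}

impl StorageScalarExpr {
    pub fn eval(&self, row: &[Datum]) -> Result<Datum, EvalError> {
        match self {
            // Panics if the index is out of bounds; acceptable for a sketch.
            StorageScalarExpr::Column(i) => Ok(row[*i].clone()),
            StorageScalarExpr::Literal(d) => Ok(d.clone()),
            StorageScalarExpr::CallUnary { func, expr } => match expr.eval(row)? {
                // Casts propagate NULL rather than erroring.
                Datum::Null => Ok(Datum::Null),
                Datum::String(s) => func.cast(&s),
                other => Err(EvalError::InvalidInput {
                    target: "string",
                    input: format!("{other:?}"),
                }),
            },
            StorageScalarExpr::ErrorIfNull { expr, message } => match expr.eval(row)? {
                Datum::Null => Err(EvalError::NullNotAllowed(message.clone())),
                d => Ok(d),
            },
        }
    }
}
```

Because `eval` matches exhaustively on a fixed set of variants, a change elsewhere in the expression framework cannot alter what a stored cast evaluates to without touching this code, which is the stability property the description is after.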