Filed here because issues are disabled on `Embucket/iceberg-rust`. The fix lives in that fork.
## Summary
When writing to an Iceberg table whose partition spec uses `bucket(N, col)`, the iceberg-rust manifest writer builds the Avro schema for the `data_file.partition` struct with the bucket field typed as a nullable string. But `transform_arrow()` produces an `Int32` hash for `Bucket`, so serialization fails at commit time:
```
Iceberg error: Failed to serialize field 'data_file' for record
Record(RecordSchema { name: Name { name: "manifest_entry", ... },
fields: [... RecordField { name: "data_file", ... schema: Record(RecordSchema {
name: Name { name: "r2", ... },
fields: [... RecordField { name: "partition", ... schema: Record(RecordSchema {
name: Name { name: "r102", ... },
fields: [RecordField {
name: "id_bucket", ...,
schema: Union(UnionSchema { schemas: [Null, String], ... }), ← wrong type
...
}],
...
```
Expected partition field type: `Union(Null, Int)` (matching the `Int32Type` returned by `transform_arrow` for `Bucket`).
## Repro
1. Create an Iceberg table partitioned by `bucket(N, col)` via Athena: `WITH (table_type = 'ICEBERG', partitioning = ARRAY['bucket(4, id)'])`.
2. Seed one row.
3. From Embucket, run any MERGE or UPDATE against that target.

Seen against the probe table `atomic.merge_test_bucket` on S3 Tables while verifying Embucket/iceberg-rust#57.
## Likely location
Wherever `iceberg-rust` derives the partition-struct Avro schema from the Iceberg partition spec. The per-field result type for each transform should match what `transform_arrow` produces: `Bucket(N) → Int`, `Truncate(W) → same as source`, `Day/Month/Year → Int`, `Hour → Int`, `Identity → source type`. The bug is almost certainly a single `Transform → Iceberg result type` mapping that is missing or wrong for `Bucket`.
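The fix likely amounts to making that mapping explicit. Below is a minimal sketch of the expected `Transform → result type` table; all names are illustrative stand-ins, not the actual iceberg-rust types or API:

```rust
// Hypothetical sketch of the per-transform result-type mapping that the
// partition-struct Avro schema should be derived from. Types and names
// are illustrative only, not the real iceberg-rust definitions.

#[derive(Debug, Clone, PartialEq)]
enum PrimitiveType {
    Int,
    Long,
    String,
    Date,
}

#[derive(Debug, Clone)]
enum Transform {
    Identity,
    Bucket(u32),
    Truncate(u32),
    Year,
    Month,
    Day,
    Hour,
}

/// Result type of applying a transform to a source column type.
/// `Bucket` must always yield Int (the 32-bit hash), regardless of the
/// source type -- the reported bug is consistent with it falling through
/// to a String default instead.
fn result_type(transform: &Transform, source: &PrimitiveType) -> PrimitiveType {
    match transform {
        Transform::Identity | Transform::Truncate(_) => source.clone(),
        Transform::Bucket(_) => PrimitiveType::Int,
        Transform::Year | Transform::Month | Transform::Day | Transform::Hour => {
            PrimitiveType::Int
        }
    }
}

fn main() {
    // bucket(4, id) over a Long column must partition as Int, not String.
    assert_eq!(
        result_type(&Transform::Bucket(4), &PrimitiveType::Long),
        PrimitiveType::Int
    );
    // truncate keeps the source type; identity is a pass-through.
    assert_eq!(
        result_type(&Transform::Truncate(10), &PrimitiveType::String),
        PrimitiveType::String
    );
    assert_eq!(
        result_type(&Transform::Identity, &PrimitiveType::Date),
        PrimitiveType::Date
    );
    println!("ok");
}
```

With a mapping like this in place, the Avro schema for the `id_bucket` partition field would come out as `Union(Null, Int)`, matching the `Int32` values the writer actually produces.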
## Related
Unmasked once Embucket/iceberg-rust#57 landed; before that, the projection schema mismatch on partitioned targets short-circuited every MERGE before the manifest writer ran.