test: add pk compaction inte test #203

Open
lxy-9602 wants to merge 4 commits into alibaba:main from lxy-9602:add-pk-compact-inte-test

lxy-9602 (Collaborator) commented Mar 30, 2026

Purpose

Add integration tests for PK compaction.
Also add the file.compression.per.level option.

Linked issue: #93

Tests

MergeFileSplitReadTest, TestGenerateKeyValueReadSchema2
PkCompactionInteTest

API and Format

Add file.compression.per.level option.

Documentation

Generative AI tooling

Partially Generated-by: Claude-4.6-Opus

Copilot AI left a comment

Pull request overview

Adds broader PK compaction integration testing (including DV, row-kinds, nested types, schema evolution, branch scenario) and introduces per-level file compression selection to better simulate/validate compaction outputs across levels.

Changes:

  • Add new PK compaction integration tests and extend scan verification to handle multiple splits per (partition, bucket).
  • Introduce file.compression.per.level option and wire per-level compression into writers (append/compact paths).
  • Extend compression mappings to support none in ORC and treat none as null in Avro; add/adjust related unit tests.
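
The last bullet's Avro mapping can be illustrated with a minimal sketch. Avro names its "no compression" codec null, so a writer that receives none has to translate it. The helper name ToAvroCodecName is hypothetical and does not come from the PR. The case-insensitive match and the pass-through of all other names are assumptions of this sketch, and the real mapping in avro_writer_builder.h may differ.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper (not the PR's code): normalizes a compression name
// to the codec name used by Avro. Avro calls the uncompressed codec "null",
// so "none" is translated to it.
// Assumptions: matching is case-insensitive, and any other name is returned
// lowercased but otherwise unchanged.
std::string ToAvroCodecName(std::string compression) {
    std::transform(compression.begin(), compression.end(), compression.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    if (compression == "none") {
        return "null";
    }
    return compression;
}
```

Keeping the translation in one small function means the option value none can stay format-neutral while each format writer maps it to its own name.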

Reviewed changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 10 comments.

| File | Description |
| --- | --- |
| test/inte/pk_compaction_inte_test.cpp | Adds multiple PK compaction integration tests (DV, nested types, schema evolution, per-level format/compression, background compact, row kinds, branch). |
| src/paimon/format/orc/orc_format_writer.cpp | Adds ORC compression mapping for none. |
| src/paimon/format/avro/avro_writer_builder.h | Treats none as null codec for Avro. |
| src/paimon/core/postpone/postpone_bucket_writer.cpp | Switches writer compression selection to per-level compression API. |
| src/paimon/core/operation/merge_file_split_read_test.cpp | Adds a new schema-generation test case for compaction mode. |
| src/paimon/core/operation/key_value_file_store_scan.h | Updates TODO attribution comment. |
| src/paimon/core/mergetree/merge_tree_writer.cpp | Switches writer compression selection to per-level compression API. |
| src/paimon/core/mergetree/lookup_levels.cpp | Forces prefetch parallelism to 1 when building read context for lookup. |
| src/paimon/core/mergetree/lookup_file.h | Minor TODO comment tweak. |
| src/paimon/core/mergetree/levels.h | Minor TODO comment tweak. |
| src/paimon/core/mergetree/compact/merge_tree_compact_rewriter_test.cpp | Removes outdated TODO test-plan comment block. |
| src/paimon/core/mergetree/compact/merge_tree_compact_rewriter.cpp | Forces prefetch parallelism to 1; uses per-level compression for compact output. |
| src/paimon/core/mergetree/compact/lookup_merge_tree_compact_rewriter.cpp | Forces prefetch parallelism to 1. |
| src/paimon/core/mergetree/compact/compact_strategy.h | Narrows TODO comment scope. |
| src/paimon/core/io/key_value_data_file_writer.cpp | Updates TODO comment text. |
| src/paimon/core/io/data_file_writer.cpp | Updates TODO comment text. |
| src/paimon/core/global_index/global_index_scan_impl.cpp | Removes an obsolete TODO comment. |
| src/paimon/core/core_options_test.cpp | Adds coverage for GetWriteFileCompression() and invalid per-level compression parsing. |
| src/paimon/core/core_options.h | Adds GetWriteFileCompression(level) API. |
| src/paimon/core/core_options.cpp | Parses file.compression.per.level and implements GetWriteFileCompression(level). |
| src/paimon/common/defs.cpp | Adds Options::FILE_COMPRESSION_PER_LEVEL constant. |
| include/paimon/defs.h | Documents file.compression.per.level option. |
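
To make the core_options rows more concrete, here is a rough sketch of how a per-level compression option could be parsed and resolved. It is not the PR's implementation. The function names ParsePerLevelCompression and ResolveCompression are hypothetical, the value format "level:codec" separated by commas (for example 0:lz4,1:zstd) is an assumption, and so is the fallback to a default compression for levels without an entry.

```cpp
#include <cctype>
#include <map>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical sketch, not the PR's code.
// Assumed value format: comma-separated "level:codec" pairs, e.g. "0:lz4,1:zstd".
// Returns std::nullopt when any entry is malformed. Whitespace is not trimmed.
std::optional<std::map<int, std::string>> ParsePerLevelCompression(const std::string& value) {
    std::map<int, std::string> result;
    std::stringstream stream(value);
    std::string entry;
    while (std::getline(stream, entry, ',')) {
        const auto colon = entry.find(':');
        // Reject entries without a colon, without a level, or without a codec.
        if (colon == std::string::npos || colon == 0 || colon + 1 == entry.size()) {
            return std::nullopt;
        }
        const std::string level_text = entry.substr(0, colon);
        // Keep the level short so std::stoi cannot overflow.
        if (level_text.size() > 6) {
            return std::nullopt;
        }
        for (char c : level_text) {
            if (!std::isdigit(static_cast<unsigned char>(c))) {
                return std::nullopt;
            }
        }
        result[std::stoi(level_text)] = entry.substr(colon + 1);
    }
    return result;
}

// Returns the codec configured for `level`, or `default_compression`
// when the level has no entry (assumed fallback behavior).
std::string ResolveCompression(const std::map<int, std::string>& per_level,
                               const std::string& default_compression, int level) {
    const auto it = per_level.find(level);
    return it != per_level.end() ? it->second : default_compression;
}
```

With this shape, a writer for a given level would call ResolveCompression once when it opens its output file, and an invalid option value is reported at parse time instead of at write time.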

