Pull request overview
Adds broader PK compaction integration testing (including DV, row-kinds, nested types, schema evolution, branch scenario) and introduces per-level file compression selection to better simulate/validate compaction outputs across levels.
Changes:
- Add new PK compaction integration tests and extend scan verification to handle multiple splits per (partition, bucket).
- Introduce `file.compression.per.level` option and wire per-level compression into writers (append/compact paths).
- Extend compression mappings to support `none` in ORC and treat `none` as `null` in Avro; add/adjust related unit tests.
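The `none` to `null` mapping for Avro can be pictured with a small sketch. Avro names its no-compression codec `null`, so a writer that accepts `none` has to translate it before handing the name to Avro. The function name `ToAvroCodecName` is hypothetical, and the case-insensitive matching is an assumption, not something this PR confirms.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper, not the code in avro_writer_builder.h.
// Lower-cases the configured compression name (an assumption) and maps
// "none" to Avro's "null" codec name; every other name passes through.
std::string ToAvroCodecName(std::string compression) {
    std::transform(compression.begin(), compression.end(), compression.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return compression == "none" ? "null" : compression;
}
```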
Reviewed changes
Copilot reviewed 22 out of 22 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| test/inte/pk_compaction_inte_test.cpp | Adds multiple PK compaction integration tests (DV, nested types, schema evolution, per-level format/compression, background compact, row kinds, branch). |
| src/paimon/format/orc/orc_format_writer.cpp | Adds ORC compression mapping for `none`. |
| src/paimon/format/avro/avro_writer_builder.h | Treats `none` as the `null` codec for Avro. |
| src/paimon/core/postpone/postpone_bucket_writer.cpp | Switches writer compression selection to per-level compression API. |
| src/paimon/core/operation/merge_file_split_read_test.cpp | Adds a new schema-generation test case for compaction mode. |
| src/paimon/core/operation/key_value_file_store_scan.h | Updates TODO attribution comment. |
| src/paimon/core/mergetree/merge_tree_writer.cpp | Switches writer compression selection to per-level compression API. |
| src/paimon/core/mergetree/lookup_levels.cpp | Forces prefetch parallelism to 1 when building read context for lookup. |
| src/paimon/core/mergetree/lookup_file.h | Minor TODO comment tweak. |
| src/paimon/core/mergetree/levels.h | Minor TODO comment tweak. |
| src/paimon/core/mergetree/compact/merge_tree_compact_rewriter_test.cpp | Removes outdated TODO test-plan comment block. |
| src/paimon/core/mergetree/compact/merge_tree_compact_rewriter.cpp | Forces prefetch parallelism to 1; uses per-level compression for compact output. |
| src/paimon/core/mergetree/compact/lookup_merge_tree_compact_rewriter.cpp | Forces prefetch parallelism to 1. |
| src/paimon/core/mergetree/compact/compact_strategy.h | Narrows TODO comment scope. |
| src/paimon/core/io/key_value_data_file_writer.cpp | Updates TODO comment text. |
| src/paimon/core/io/data_file_writer.cpp | Updates TODO comment text. |
| src/paimon/core/global_index/global_index_scan_impl.cpp | Removes an obsolete TODO comment. |
| src/paimon/core/core_options_test.cpp | Adds coverage for `GetWriteFileCompression()` and invalid per-level compression parsing. |
| src/paimon/core/core_options.h | Adds `GetWriteFileCompression(level)` API. |
| src/paimon/core/core_options.cpp | Parses `file.compression.per.level` and implements `GetWriteFileCompression(level)`. |
| src/paimon/common/defs.cpp | Adds `Options::FILE_COMPRESSION_PER_LEVEL` constant. |
| include/paimon/defs.h | Documents `file.compression.per.level` option. |
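
To make the per-level lookup concrete, here is a minimal sketch of how a value for `file.compression.per.level` might be parsed and resolved. It assumes the `level:codec` comma-separated form used by Java Paimon (for example `0:lz4,1:zstd`) and assumes that levels without an entry fall back to the table-wide compression; neither is confirmed by this PR. `ParsePerLevelCompression` and `ResolveWriteFileCompression` are hypothetical names, not the actual `CoreOptions` implementation, and whitespace around entries is not trimmed.

```cpp
#include <map>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical parser, not the PR's code. Turns "0:lz4,1:zstd" into
// {0: "lz4", 1: "zstd"}. Returns std::nullopt if any entry is malformed
// (missing colon, empty level or codec, non-numeric or overlong level).
std::optional<std::map<int, std::string>> ParsePerLevelCompression(const std::string& value) {
    std::map<int, std::string> result;
    std::istringstream stream(value);
    std::string entry;
    while (std::getline(stream, entry, ',')) {
        const auto colon = entry.find(':');
        if (colon == std::string::npos || colon == 0 || colon + 1 == entry.size()) {
            return std::nullopt;
        }
        const std::string level_text = entry.substr(0, colon);
        if (level_text.size() > 9 ||
            level_text.find_first_not_of("0123456789") != std::string::npos) {
            return std::nullopt;
        }
        result[std::stoi(level_text)] = entry.substr(colon + 1);
    }
    return result;
}

// Hypothetical resolver: use the per-level codec when one is configured,
// otherwise fall back to the default compression (assumed behavior).
std::string ResolveWriteFileCompression(const std::map<int, std::string>& per_level,
                                        const std::string& default_compression,
                                        int level) {
    const auto it = per_level.find(level);
    return it == per_level.end() ? default_compression : it->second;
}
```

With this shape, a writer producing a level-0 file and a compaction rewriter producing a higher-level file can call the same resolver with different `level` arguments, which is the behavior the per-level integration tests would need to exercise.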
Purpose
Add integration tests for PK compaction.
Also add the `file.compression.per.level` option.
Linked issue: #93
Tests
MergeFileSplitReadTest, TestGenerateKeyValueReadSchema2
PkCompactionInteTest
API and Format
Add the `file.compression.per.level` option.
Documentation
Generative AI tooling
Partially Generated-by: Claude-4.6-Opus