[CORE-8532] Disallow certain schema evolution actions if a field appears in the partition spec #25114

Merged

oleiman
Member

@oleiman oleiman commented Feb 19, 2025

General idea is as follows (quoted from the spec):

Type promotion is not allowed for a field that is referenced by source-id or source-ids of a partition field if the partition transform would produce a different value after promoting the type.

In general, the type promotion rules preclude this, but date -> timestamp promotions are called out specifically as potentially rule-violating if the promoted field appears in the partition spec. This diff prevents that by wiring the current partition spec into the compat module and adding a check to the validation path.
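
To illustrate why a date -> timestamp promotion can change a partition value, here is a minimal Python sketch. It assumes the Iceberg encodings (date as days since the Unix epoch, timestamp as microseconds since the epoch) and assumes the promoted value lands on midnight of the same day. The `identity` function is a hypothetical stand-in for an identity partition transform; none of these names come from the Redpanda code.

```python
from datetime import date

EPOCH = date(1970, 1, 1)
MICROS_PER_DAY = 86_400 * 1_000_000


def date_value(d: date) -> int:
    """Iceberg-style date value: days since the Unix epoch."""
    return (d - EPOCH).days


def promote_date_to_timestamp(days: int) -> int:
    """Assumed promotion: microseconds since the epoch at midnight of that day."""
    return days * MICROS_PER_DAY


def identity(value: int) -> int:
    """Hypothetical identity partition transform: passes the source value through."""
    return value


days = date_value(date(2025, 2, 19))
micros = promote_date_to_timestamp(days)

# Same calendar day, but the transform now produces a different partition value.
print(identity(days), identity(micros))
```

Files written before the promotion would carry the day count as their partition value, while files written afterwards would carry a microsecond count, which is the mismatch the quoted rule is meant to exclude.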

This PR also disallows removal of any data field that appears in the current partition spec at the time of removal. This is technically outside the spec and needs follow-up ecosystem investigation, but it appears that dropping a column in this way can cause downstream catalog queries to fail. The safest thing at this stage is to simply disable the capability for now.

This PR also fixes a bug in iceberg/schema_avro whereby we would try to serialize date_type to an AVRO_LONG with logical type "date", which itself is restricted to AVRO_INT. Fix includes a unit test.

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.3.x
  • v24.2.x
  • v24.1.x

Release Notes

  • none

@oleiman oleiman self-assigned this Feb 19, 2025
@oleiman oleiman changed the title from "Disallow certain schema evolution actions if a field appears in the partition spec" to "[CORE-8532] Disallow certain schema evolution actions if a field appears in the partition spec" Feb 19, 2025
@oleiman oleiman marked this pull request as ready for review February 19, 2025 21:07
@vbotbuildovich
Collaborator

vbotbuildovich commented Feb 20, 2025

CI test results

test results on build#62039
test_id test_kind job_url test_status passed
rptest.tests.compaction_recovery_test.CompactionRecoveryTest.test_index_recovery ducktape https://buildkite.com/redpanda/redpanda/builds/62039#0195204f-d6c6-4734-b7e5-99f3c602c57d FLAKY 1/6
rptest.tests.compaction_recovery_test.CompactionRecoveryTest.test_index_recovery ducktape https://buildkite.com/redpanda/redpanda/builds/62039#01952069-fe70-4baa-be48-2de4b36457af FLAKY 1/3
rptest.tests.log_compaction_test.LogCompactionTest.compaction_stress_test.cleanup_policy=compact.delete.key_set_cardinality=1000.storage_compaction_key_map_memory_kb=10 ducktape https://buildkite.com/redpanda/redpanda/builds/62039#01952069-fe68-40ed-8cbb-4d657447c169 FLAKY 1/2
rptest.tests.log_compaction_test.LogCompactionTest.compaction_stress_test.cleanup_policy=compact.delete.key_set_cardinality=1000.storage_compaction_key_map_memory_kb=3 ducktape https://buildkite.com/redpanda/redpanda/builds/62039#01952069-fe6c-486a-ac2b-79b21753a320 FLAKY 1/3
rptest.tests.log_compaction_test.LogCompactionTest.compaction_stress_test.cleanup_policy=compact.key_set_cardinality=1000.storage_compaction_key_map_memory_kb=3 ducktape https://buildkite.com/redpanda/redpanda/builds/62039#01952069-fe68-40ed-8cbb-4d657447c169 FLAKY 1/2
test results on build#62052
test_id test_kind job_url test_status passed
rptest.tests.log_compaction_test.LogCompactionTest.compaction_stress_test.cleanup_policy=compact.delete.key_set_cardinality=1000.storage_compaction_key_map_memory_kb=3 ducktape https://buildkite.com/redpanda/redpanda/builds/62052#019521ea-9063-4e47-a25f-59cb5322c60f FLAKY 1/2
test results on build#62068
test_id test_kind job_url test_status passed
rptest.tests.archival_test.ArchivalTest.test_all_partitions_leadership_transfer.cloud_storage_type=CloudStorageType.ABS ducktape https://buildkite.com/redpanda/redpanda/builds/62068#019524ca-326e-4101-acf2-2837eb715b20 FLAKY 1/2
rptest.tests.compaction_recovery_test.CompactionRecoveryTest.test_index_recovery ducktape https://buildkite.com/redpanda/redpanda/builds/62068#019524cd-f886-4dc0-9fb8-fd7dcb50f53b FLAKY 1/7
rptest.tests.compaction_recovery_test.CompactionRecoveryUpgradeTest.test_index_recovery_after_upgrade ducktape https://buildkite.com/redpanda/redpanda/builds/62068#019524ca-3270-41ea-b4e3-fcb846f59b0f FLAKY 1/2
rptest.tests.e2e_shadow_indexing_test.ShadowIndexingWhileBusyTest.test_create_or_delete_topics_while_busy.short_retention=True.cloud_storage_type=CloudStorageType.ABS ducktape https://buildkite.com/redpanda/redpanda/builds/62068#019524cd-f885-48ec-99bc-cacb1c6b73b1 FLAKY 1/2
rptest.tests.full_disk_test.FullDiskReclaimTest.test_full_disk_triggers_gc ducktape https://buildkite.com/redpanda/redpanda/builds/62068#019524ca-3270-41ea-b4e3-fcb846f59b0f FLAKY 1/2
rptest.tests.log_compaction_test.LogCompactionTest.compaction_stress_test.cleanup_policy=compact.key_set_cardinality=1000.storage_compaction_key_map_memory_kb=3 ducktape https://buildkite.com/redpanda/redpanda/builds/62068#019524cd-f884-4f87-85c1-e01181e315da FLAKY 1/2
rptest.tests.partition_movement_test.SIPartitionMovementTest.test_shadow_indexing.num_to_upgrade=0.cloud_storage_type=CloudStorageType.ABS ducktape https://buildkite.com/redpanda/redpanda/builds/62068#019524cd-f884-4f87-85c1-e01181e315da FLAKY 1/2
rptest.tests.scaling_up_test.ScalingUpTest.test_scaling_up_with_recovered_topic ducktape https://buildkite.com/redpanda/redpanda/builds/62068#019524cd-f884-4f87-85c1-e01181e315da FLAKY 1/3
rptest.tests.write_caching_fi_e2e_test.WriteCachingFailureInjectionE2ETest.test_crash_all_with_consumer_group ducktape https://buildkite.com/redpanda/redpanda/builds/62068#019524ca-3270-41ea-b4e3-fcb846f59b0f FLAKY 1/2
test results on build#62129
test_id test_kind job_url test_status passed
rptest.tests.datalake.datalake_dlq_test.DatalakeDLQTest.test_dlq_table_for_mixed_records.cloud_storage_type=CloudStorageType.S3.query_engine=QueryEngineType.TRINO.catalog_type=CatalogType.REST_JDBC ducktape https://buildkite.com/redpanda/redpanda/builds/62129#01952ce8-52f5-4446-80fb-7bf316f100c0 FLAKY 1/2
rptest.tests.datalake.schema_evolution_test.SchemaEvolutionE2ETests.test_partition_spec_evo.cloud_storage_type=CloudStorageType.S3.query_engine=QueryEngineType.SPARK.use_partition_spec=False ducktape https://buildkite.com/redpanda/redpanda/builds/62129#01952ce8-52f5-4446-80fb-7bf316f100c0 FAIL 0/20
rptest.tests.datalake.schema_evolution_test.SchemaEvolutionE2ETests.test_partition_spec_evo.cloud_storage_type=CloudStorageType.S3.query_engine=QueryEngineType.SPARK.use_partition_spec=True ducktape https://buildkite.com/redpanda/redpanda/builds/62129#01952ce8-52f2-4457-93c8-391ae45a5c02 FAIL 0/20
rptest.tests.datalake.schema_evolution_test.SchemaEvolutionE2ETests.test_partition_spec_evo.cloud_storage_type=CloudStorageType.S3.query_engine=QueryEngineType.TRINO.use_partition_spec=False ducktape https://buildkite.com/redpanda/redpanda/builds/62129#01952ce8-52f3-4dc6-9bcc-e2c41c46417d FAIL 0/20
rptest.tests.datalake.schema_evolution_test.SchemaEvolutionE2ETests.test_partition_spec_evo.cloud_storage_type=CloudStorageType.S3.query_engine=QueryEngineType.TRINO.use_partition_spec=True ducktape https://buildkite.com/redpanda/redpanda/builds/62129#01952ce8-52f4-4c6c-a825-c9dec126b27e FAIL 0/20
rptest.tests.internal_topic_protection_test.InternalTopicProtectionLargeClusterTest.test_consumer_offset_topic ducktape https://buildkite.com/redpanda/redpanda/builds/62129#01952ce8-52f5-4446-80fb-7bf316f100c0 FLAKY 1/2
rptest.transactions.producers_api_test.ProducersAdminAPITest.test_producers_state_api_during_load ducktape https://buildkite.com/redpanda/redpanda/builds/62129#01952ce8-52f5-4446-80fb-7bf316f100c0 FLAKY 1/2
test results on build#62149
test_id test_kind job_url test_status passed
rptest.tests.e2e_shadow_indexing_test.ShadowIndexingWhileBusyTest.test_create_or_delete_topics_while_busy.short_retention=True.cloud_storage_type=CloudStorageType.ABS ducktape https://buildkite.com/redpanda/redpanda/builds/62149#0195311a-531b-4e89-8e2f-808798ce7916 FLAKY 1/2
rptest.tests.scaling_up_test.ScalingUpTest.test_scaling_up_with_recovered_topic ducktape https://buildkite.com/redpanda/redpanda/builds/62149#0195311a-531a-4c74-8fb1-c503c0b4f272 FLAKY 1/2
test results on build#62177
test_id test_kind job_url test_status passed
rptest.tests.e2e_shadow_indexing_test.ShadowIndexingWhileBusyTest.test_create_or_delete_topics_while_busy.short_retention=True.cloud_storage_type=CloudStorageType.S3 ducktape https://buildkite.com/redpanda/redpanda/builds/62177#0195368d-de5f-4685-a8c9-e2693ade4b48 FLAKY 1/2
rptest.tests.random_node_operations_test.RandomNodeOperationsTest.test_node_operations.enable_failures=False.mixed_versions=True.with_tiered_storage=True.with_iceberg=False.with_chunked_compaction=False.cloud_storage_type=CloudStorageType.S3 ducktape https://buildkite.com/redpanda/redpanda/builds/62177#0195368d-de59-4e56-8dba-d6fc18b90541 FLAKY 1/2

@oleiman oleiman force-pushed the dlib/core-8532/promotion-allowability branch from 01658cd to 202a811 Compare February 20, 2025 04:08
@oleiman oleiman marked this pull request as draft February 20, 2025 04:10
@oleiman oleiman force-pushed the dlib/core-8532/promotion-allowability branch from 202a811 to 5d22c2d Compare February 20, 2025 04:20
@oleiman oleiman marked this pull request as ready for review February 20, 2025 04:24
@oleiman oleiman force-pushed the dlib/core-8532/promotion-allowability branch 5 times, most recently from 472426e to 523305a Compare February 21, 2025 03:53
@oleiman
Member Author

oleiman commented Feb 21, 2025

/ci-repeat 1
skip-redpanda-build
skip-units

@oleiman
Member Author

oleiman commented Feb 21, 2025

CI Failures:

Contributor

@andrwng andrwng left a comment

Looking good! Mostly a bunch of nits and questions

@@ -540,6 +547,7 @@ static const std::vector<struct_evolution_test_case> valid_cases{
&& dst_nested.fields.empty()
&& removed(*src_nested.fields.back());
},
.pspec = "(nested)", // TODO(oren): seems wrong frankly, should this fail?
Contributor

The source columns, selected by ids, must be a primitive type and cannot be contained in a map or list, but may be nested in a struct.

Hmm yeah, this is interesting. Maybe not in this PR, but it feels like we should treat it like we would a missing field when resolving:

if (!source_field) {
return std::nullopt;
}
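
The suggestion above can be sketched in Python as follows. This is a hypothetical model, not the Redpanda resolution code: `Field`, `find_field`, and `resolve_source_field` are made-up names, and the set of primitive type names is abbreviated. It applies the quoted rule that a source column must be a primitive, may be nested in a struct, and may not sit inside a list or map, returning `None` in the same way as for a missing field.

```python
from dataclasses import dataclass, field
from typing import Optional

# Abbreviated, illustrative set of primitive type names.
PRIMITIVES = {"boolean", "int", "long", "float", "double", "date", "timestamp", "string"}


@dataclass
class Field:
    id: int
    name: str
    type: str  # a primitive type name, or "struct", "list", "map"
    children: list["Field"] = field(default_factory=list)


def find_field(fields, field_id, inside_list_or_map=False):
    """Depth-first search; returns (field, inside_list_or_map) or None."""
    for f in fields:
        if f.id == field_id:
            return f, inside_list_or_map
        nested = inside_list_or_map or f.type in ("list", "map")
        found = find_field(f.children, field_id, nested)
        if found is not None:
            return found
    return None


def resolve_source_field(schema, source_id) -> Optional[Field]:
    """Return the source field, or None if it is missing or not a valid source."""
    found = find_field(schema, source_id)
    if found is None:
        return None  # missing field
    f, in_list_or_map = found
    if f.type not in PRIMITIVES or in_list_or_map:
        return None  # treated exactly like a missing field
    return f


schema = [
    Field(1, "ts", "timestamp"),
    Field(2, "nested", "struct", [Field(3, "inner", "int")]),
    Field(4, "tags", "list", [Field(5, "element", "string")]),
]
for source_id in (1, 2, 3, 5, 99):
    resolved = resolve_source_field(schema, source_id)
    print(source_id, resolved.name if resolved else None)
```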

Member Author

ohh, i see. so a struct or map or whatever isn't even a valid source column? I don't think partition spec resolution is detecting that at all then. See L448 in this file e.g.... seems like a bug.

Member Author

split it off in case there's more to discuss. #25144 62c61f7

query_engine=QUERY_ENGINES,
use_partition_spec=[True, False],
)
def test_partition_spec_evo(self, cloud_storage_type, query_engine,
Contributor

Not related to this PR, but it's worth trying these out with a real catalog instead of just filesystem_catalog

Member Author

agree

@oleiman oleiman force-pushed the dlib/core-8532/promotion-allowability branch 2 times, most recently from 5961ed2 to fbb780d Compare February 22, 2025 07:52
@vbotbuildovich
Collaborator

vbotbuildovich commented Feb 22, 2025

Retry command for Build#62129

please wait until all jobs are finished before running the slash command



/ci-repeat 1
tests/rptest/tests/datalake/schema_evolution_test.py::SchemaEvolutionE2ETests.test_partition_spec_evo@{"cloud_storage_type":1,"query_engine":"spark","use_partition_spec":false}
tests/rptest/tests/datalake/schema_evolution_test.py::SchemaEvolutionE2ETests.test_partition_spec_evo@{"cloud_storage_type":1,"query_engine":"spark","use_partition_spec":true}
tests/rptest/tests/datalake/schema_evolution_test.py::SchemaEvolutionE2ETests.test_partition_spec_evo@{"cloud_storage_type":1,"query_engine":"trino","use_partition_spec":true}
tests/rptest/tests/datalake/schema_evolution_test.py::SchemaEvolutionE2ETests.test_partition_spec_evo@{"cloud_storage_type":1,"query_engine":"trino","use_partition_spec":false}

rockwotj
rockwotj previously approved these changes Feb 22, 2025
Contributor

@rockwotj rockwotj left a comment

Very nice

Logical DATE type should be bound to AVRO_INT only. Previously we used
AVRO_LONG, so the Avro lib would throw when exercising this code path.

Also adds a unit test for Iceberg types that are backed by Avro logicalTypes.

Note that timestamptz serialization is currently bugged, as it relies on an
extra non-standard field ('adjust-to-utc') in the Avro type description.
Technically this should appear in both micro precision timestamp types,
'false' for 'timestamp'; 'true' for 'timestamptz'. As written, both timestamp
types serialize to logical type timestamp-micros, with no way to distinguish
between the two in deser.

See https://iceberg.apache.org/spec/#avro for detail
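
A minimal Python sketch of the mapping this commit message describes, assuming the Avro rule that the `date` logical type annotates `int` and `timestamp-micros` annotates `long`, plus the `adjust-to-utc` field mentioned above. The helper names (`avro_schema_for`, `is_valid_logical`, `iceberg_type_from`) are hypothetical and do not correspond to functions in iceberg/schema_avro.

```python
import json

# Primitive type each logical type must annotate (per the Avro spec).
REQUIRED_PRIMITIVE = {
    "date": "int",
    "timestamp-micros": "long",
}


def avro_schema_for(iceberg_type: str) -> dict:
    """Hypothetical Iceberg -> Avro type description for logical types."""
    if iceberg_type == "date":
        return {"type": "int", "logicalType": "date"}
    if iceberg_type in ("timestamp", "timestamptz"):
        return {
            "type": "long",
            "logicalType": "timestamp-micros",
            # Without this field the two timestamp types are indistinguishable.
            "adjust-to-utc": iceberg_type == "timestamptz",
        }
    raise ValueError(f"no logical type mapping for {iceberg_type!r}")


def is_valid_logical(schema: dict) -> bool:
    """True if the logical type sits on the primitive it is restricted to."""
    return REQUIRED_PRIMITIVE.get(schema["logicalType"]) == schema["type"]


def iceberg_type_from(schema: dict) -> str:
    """Recover the Iceberg type on the deserialization side."""
    if schema["logicalType"] == "date":
        return "date"
    return "timestamptz" if schema.get("adjust-to-utc", False) else "timestamp"


for t in ("date", "timestamp", "timestamptz"):
    print(t, json.dumps(avro_schema_for(t)))

# The pre-fix shape: "date" on a long, which fails the check.
print(is_valid_logical({"type": "long", "logicalType": "date"}))
```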

Signed-off-by: Oren Leiman <[email protected]>

To support special promotions that affect partition transforms

type_promoted:
  - no
  - yes
  - unless_partition

Signed-off-by: Oren Leiman <[email protected]>

Specify either an annotate error or a validate error. This simplifies some of
the conditional logic around instantiating test suites. No functional changes.

Signed-off-by: Oren Leiman <[email protected]>

Though this restriction does not appear in the spec, dropping a data column
that also appears in the table's partition spec can cause validation errors
in downstream clients. It may be possible to avoid this by performing live
partition spec reconciliation inline with the schema update, but for now we
simply reject such an update out of hand.
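
The rejection rule can be sketched as a set intersection. This is a hypothetical Python illustration, assuming fields are identified by integer ids and the partition spec exposes its source ids; `blocked_removals` is not a function in the codebase.

```python
def blocked_removals(src_field_ids, dst_field_ids, partition_source_ids):
    """Ids removed by the schema update that the partition spec still references."""
    removed = set(src_field_ids) - set(dst_field_ids)
    return sorted(removed & set(partition_source_ids))


# Field 2 is dropped by the update and is also a partition source: reject.
print(blocked_removals([1, 2, 3], [1, 3], partition_source_ids=[2]))

# Field 2 is dropped but the spec only references field 1: allowed.
print(blocked_removals([1, 2, 3], [1, 3], partition_source_ids=[1]))
```

A non-empty result would correspond to the update being rejected.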

Also slightly refactors to consolidate error checks in validate_schema_transform.

Signed-off-by: Oren Leiman <[email protected]>
@oleiman oleiman force-pushed the dlib/core-8532/promotion-allowability branch from fbb780d to 035c4e3 Compare February 23, 2025 00:43
@oleiman
Member Author

oleiman commented Feb 23, 2025

force push typos

rockwotj
rockwotj previously approved these changes Feb 23, 2025
@oleiman
Member Author

oleiman commented Feb 23, 2025

/ci-repeat 1

@oleiman oleiman enabled auto-merge February 24, 2025 03:57
@oleiman
Member Author

oleiman commented Feb 24, 2025

/ci-repeat 1
release
skip-redpanda-build
skip-units

@oleiman oleiman merged commit ef8eab0 into redpanda-data:dev Feb 24, 2025
18 checks passed
4 participants