Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dl/translator: wip - port to the new scheduling API #25077

Draft
wants to merge 11 commits into
base: dev
Choose a base branch
from

Conversation

bharathv
Copy link
Contributor

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.3.x
  • v24.2.x
  • v24.1.x

Release Notes

  • none

@bharathv
Copy link
Contributor Author

/dt

@vbotbuildovich
Copy link
Collaborator

vbotbuildovich commented Feb 12, 2025

CI test results

test results on build#61812
test_id test_kind job_url test_status passed
datalake_cloud_rpfixture.datalake_cloud_rpfixture unit https://buildkite.com/redpanda/redpanda/builds/61812#0194f809-a850-4ae8-be27-dc244c7b7d7d FAIL 0/2
datalake_cloud_rpfixture.datalake_cloud_rpfixture unit https://buildkite.com/redpanda/redpanda/builds/61812#0194f809-a851-4d8d-994c-4b22503e0924 FAIL 0/2
rptest.tests.compaction_recovery_test.CompactionRecoveryTest.test_index_recovery ducktape https://buildkite.com/redpanda/redpanda/builds/61812#0194f853-0be3-430e-bf2d-8b43e45b00e0 FLAKY 1/3
rptest.tests.datalake.compaction_test.CompactionGapsTest.test_translation_no_gaps.cloud_storage_type=CloudStorageType.S3.catalog_type=CatalogType.REST_HADOOP ducktape https://buildkite.com/redpanda/redpanda/builds/61812#0194f853-0be3-430e-bf2d-8b43e45b00e0 FLAKY 1/2
rptest.tests.datalake.compaction_test.CompactionGapsTest.test_translation_no_gaps.cloud_storage_type=CloudStorageType.S3.catalog_type=CatalogType.REST_JDBC ducktape https://buildkite.com/redpanda/redpanda/builds/61812#0194f853-0be0-466f-9b2e-9f40164b3f99 FLAKY 1/4
rptest.tests.enterprise_features_license_test.EnterpriseFeaturesTest.test_enable_features.feature=Feature.oidc.install_license=False.disable_trial=False ducktape https://buildkite.com/redpanda/redpanda/builds/61812#0194f853-0be3-430e-bf2d-8b43e45b00e0 FLAKY 1/2
test results on build#61827
test_id test_kind job_url test_status passed
kafka_server_rpfixture.kafka_server_rpfixture unit https://buildkite.com/redpanda/redpanda/builds/61827#0194fc77-e54c-4094-ac06-db8d2e99455b FLAKY 1/2
rptest.tests.compaction_recovery_test.CompactionRecoveryTest.test_index_recovery ducktape https://buildkite.com/redpanda/redpanda/builds/61827#0194fcbc-1cb2-48b9-b97c-63a10f74c063 FLAKY 1/2
rptest.tests.log_compaction_test.LogCompactionTest.compaction_stress_test.cleanup_policy=compact.key_set_cardinality=1000.storage_compaction_key_map_memory_kb=10 ducktape https://buildkite.com/redpanda/redpanda/builds/61827#0194fcc1-9d0c-4df8-b00c-cf2f01da7d89 FLAKY 1/2
rptest.tests.log_compaction_test.LogCompactionTest.compaction_stress_test.cleanup_policy=compact.key_set_cardinality=1000.storage_compaction_key_map_memory_kb=3 ducktape https://buildkite.com/redpanda/redpanda/builds/61827#0194fcc1-9d0e-4eed-b13a-b0603a75e846 FLAKY 1/3
rptest.tests.partition_balancer_test.PartitionBalancerTest.test_recovery_mode_rebalance_finish ducktape https://buildkite.com/redpanda/redpanda/builds/61827#0194fcc1-9d0c-4df8-b00c-cf2f01da7d89 FLAKY 1/2
rptest.tests.write_caching_fi_e2e_test.WriteCachingFailureInjectionE2ETest.test_crash_all_with_consumer_group ducktape https://buildkite.com/redpanda/redpanda/builds/61827#0194fcbc-1cb2-48b9-b97c-63a10f74c063 FLAKY 1/2
test results on build#61939
test_id test_kind job_url test_status passed
rptest.tests.datalake.datalake_e2e_test.DatalakeMetricsTest.test_lag_metrics.cloud_storage_type=CloudStorageType.S3 ducktape https://buildkite.com/redpanda/redpanda/builds/61939#0195178a-e67f-4827-973c-6eec9ba57ca6 FAIL 0/20
rptest.tests.datalake.datalake_e2e_test.DatalakeMetricsTest.test_lag_metrics.cloud_storage_type=CloudStorageType.S3 ducktape https://buildkite.com/redpanda/redpanda/builds/61939#0195178d-f367-48e9-8a24-4b4d7a1af9a4 FAIL 0/20
rptest.tests.datalake.partition_movement_test.PartitionMovementTest.test_cross_core_movements.cloud_storage_type=CloudStorageType.S3 ducktape https://buildkite.com/redpanda/redpanda/builds/61939#0195178a-e67d-4156-9d8c-75c69b23c3f3 FAIL 0/20
rptest.tests.log_compaction_test.LogCompactionTest.compaction_stress_test.cleanup_policy=compact.delete.key_set_cardinality=1000.storage_compaction_key_map_memory_kb=3 ducktape https://buildkite.com/redpanda/redpanda/builds/61939#0195178d-f367-48e9-8a24-4b4d7a1af9a4 FLAKY 1/2
rptest.tests.random_node_operations_test.RandomNodeOperationsTest.test_node_operations.enable_failures=True.mixed_versions=False.with_tiered_storage=True.with_iceberg=True.with_chunked_compaction=False.cloud_storage_type=CloudStorageType.S3 ducktape https://buildkite.com/redpanda/redpanda/builds/61939#0195178d-f366-48ae-884a-9ac16dac8a9e FLAKY 1/5
rptest.tests.scaling_up_test.ScalingUpTest.test_scaling_up_with_recovered_topic ducktape https://buildkite.com/redpanda/redpanda/builds/61939#0195178d-f368-48d8-9005-6f325e3c32a9 FLAKY 1/3
rptest.tests.upgrade_test.UpgradeWithWorkloadTest.test_rolling_upgrade_with_rollback.upgrade_after_rollback=True ducktape https://buildkite.com/redpanda/redpanda/builds/61939#0195178a-e67e-477b-9fea-1cc8e60c85ca FLAKY 1/2
translator_test_rpfixture.translator_test_rpfixture unit https://buildkite.com/redpanda/redpanda/builds/61939#0195172e-9aef-4397-9301-7d37bd9672a6 FLAKY 1/2

@bharathv
Copy link
Contributor Author

/dt

This method loops through the column writers to check if any of them are
flush worthy, computes the memory usage in the same loop. Useful for a
latter commit that avoids this loop again and needs stats right after
append.
@bharathv
Copy link
Contributor Author

/dt

The interface implementations keep track of the current memory used by
the writer and related reservations.
Adds the following

- flush() - flushes all the buffered bytes to the output stream
- methods to fetch buffered/flushed bytes
.. instead of lazy_abort_source. To be used later, they are both
connected anyway.
Currently multiplexer is a one shot class with pattern as follows

mux = create_mux();
co_await reader.consume(mux...)

With the new changes, we want multiplexer to multiplex across scheduling
iterations and release resouces inbetween. This commit makes changes to
the API support this port. The new pattern would look something like
this..

mux = create_mux();
mux.multiplex(reader1...)
mux.flush_writers(); // optional
mux.multiplex(reader2..)
mux.flush_writers(); // optional
...
...

result = co_await std::move(mux).finish();

The ability to temporarily flush all the intermediate state and
multiplex across multiple readers enables porting to the new scheduler
API.
Make the task long running to support batching of data across multiple
iterations of scheduler
@bharathv bharathv changed the title dl/translator: preparatory change for translator porting dl/translator: wip - port to the new scheduling API Feb 18, 2025
@bharathv
Copy link
Contributor Author

/dt

@vbotbuildovich
Copy link
Collaborator

vbotbuildovich commented Feb 18, 2025

Retry command for Build#61939

please wait until all jobs are finished before running the slash command



/ci-repeat 1
tests/rptest/tests/datalake/datalake_e2e_test.py::DatalakeMetricsTest.test_lag_metrics@{"cloud_storage_type":1}
tests/rptest/tests/datalake/partition_movement_test.py::PartitionMovementTest.test_cross_core_movements@{"cloud_storage_type":1}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants