v24.2.1

@vbotbuildovich vbotbuildovich released this 31 Jul 18:44
· 1499 commits to dev since this release
08ae9f9

Deprecation notice

We recommend v24.2.2 or newer. For more context, see this Technical Service Bulletin.

New Features

  • Redpanda is now compliant with FIPS 140. To put Redpanda into FIPS mode, install the redpanda-fips package and set the fips_mode node config to true. For more information about Redpanda's security features and its FIPS compliance, please see our documentation. by @michael-redpanda in #17616
  • rpk: Add JSON Schema support, now available in rpk registry and rpk topic produce/consume. by @r-vasquez in #20820
  • PR #19796 [CORE-3180] json schema registry validate input schema by @andijcr
  • PR #19818 [CORE-4180] schema_registry/json: JSON Schema support by @BenPope
  • Support JavaScript and TypeScript as languages for Data Transforms. by @rockwotj in #18078
  • rpk: now you can manage client quotas using rpk cluster quotas. by @r-vasquez in #18711
  • rpk: introduce a command to import cluster client quotas: rpk cluster quotas import. by @r-vasquez in #21311
  • Adds new cluster config tls_min_version that allows users of Redpanda to specify the minimum version of TLS Redpanda will support. By default, the value is TLS v1.2. by @michael-redpanda in #21372
  • Schema Registry: Support /mode endpoints for READONLY by @BenPope in #17952
  • Schema Registry: Support subjectPrefix for GET /subjects by @BenPope in #20145
  • Schema Registry: Support for deleted=true query parameter on POST /subjects/<subject>. by @BenPope in #18391
  • #17418 rpk: ability to transfer partition leadership by @daisukebe in #18026
  • #2183 Adds path-style addressing to S3 client requests. This option is configurable through rpk cluster config edit. by @WillemKauf in #17806
  • Adds a cloud storage check as part of the cluster's self test. by @WillemKauf in #17586
  • Allow decreasing core count if node-local core assignment is enabled. by @ztlpn in #20312
  • If the node_local_core_assignment flag is enabled, Redpanda will try to maintain balanced distribution of partition replicas across cores. by @ztlpn in #19864
  • KIP-518 (ListGroups v4): adds redpanda and rpk support for printing consumer group states with rpk group list and filtering for specific group states with --states. by @pgellert in #21347
  • #7867 rpk container start: now starts a Redpanda Console container connected with the cluster. by @r-vasquez in #18066
  • rpk container now has a set of flags to specify ports for node to start on. by @r-vasquez in #17908
  • rpk: rpk cluster partitions list now supports filtering with broker IDs. by @daisukebe in #17945
  • Adds a read distribution histogram to the internal metrics that can be visualized using Prometheus and Grafana: by @WillemKauf in #18745
    • vectorized_kafka_fetch_read_distribution_bucket
    • vectorized_kafka_fetch_read_distribution_count
    • vectorized_kafka_fetch_read_distribution_sum
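
The _bucket, _count, and _sum series above follow the usual Prometheus histogram layout, so a quantile can be estimated from the cumulative bucket counts. The following is a minimal sketch of that estimate in plain Python, assuming the buckets have already been scraped into (upper bound, cumulative count) pairs. The bucket bounds and counts in the usage note are made up for illustration and are not taken from Redpanda.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    `buckets` is a list of (upper_bound, cumulative_count) pairs sorted by
    upper bound, ending with the +Inf bucket, as in a Prometheus histogram.
    Values inside a bucket are assumed to be spread uniformly.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    if not buckets or buckets[-1][1] == 0:
        return math.nan  # no observations recorded

    rank = q * buckets[-1][1]  # target position among all observations
    prev_bound, prev_count = 0.0, 0
    for upper, count in buckets:
        if count >= rank:
            if math.isinf(upper):
                # The quantile falls in the +Inf bucket: report the last
                # finite bound, since no upper limit is known.
                return prev_bound
            if count == prev_count:
                return upper
            # Linear interpolation inside the matching bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (upper - prev_bound) * fraction
        prev_bound, prev_count = upper, count
    return prev_bound
```

For example, with the hypothetical buckets `[(1, 10), (2, 30), (4, 40), (math.inf, 40)]`, `histogram_quantile(0.5, ...)` interpolates inside the second bucket. In practice the same estimate is usually computed by the monitoring stack rather than by hand.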

Improvements

  • Adds a new fetch_read_strategy option non_polling_with_debounce. This option will add a fixed delay specified by fetch_reads_debounce_timeout to the start of every fetch. by @ballard26 in #20744
  • Adds configuration options to trigger cache trim before the cache reaches its maximum size. by @jcipar in #18756
    • cloud_storage_cache_trim_threshold_size
    • cloud_storage_cache_trim_threshold_objects
      These mirror the options for controlling maximum size: cloud_storage_cache_size and cloud_storage_cache_max_objects
  • Re-adds the fetch_read_strategy cluster config property to select between polling and non-polling fetch implementations. Uses the non-polling fetch implementation by default. by @StephanDollberg in #18090
  • Split cache into buckets using cloud_storage_cache_num_buckets configuration parameter. by @Lazin in #18762
  • If cloud_storage_cache_trim_threshold_size and cloud_storage_cache_trim_threshold_objects are not set, the new default behavior is to trigger a trim when the cache is 100% full. by @jcipar in #18756
  • #2183 --cloud-backoff-ms uint, the backoff in milliseconds for a cloud storage request. by @WillemKauf in #17586
  • --cloud-timeout-ms uint, the timeout in milliseconds for a cloud storage request.
  • Redpanda will now self-configure its cloud_storage_url_style for use with the internal S3 client on start-up if it is left unspecified. by @WillemKauf in #18107
  • Add an optional TLS config field for CRL (Certificate Revocation List) by @oleiman in #18708
  • Add more cases to the rpk disk self-test to better probe write performance at various IO depths, and at 16K block sizes. Return more information about the specifics of the test in the output. by @travisdowns in #20590
  • Add output batch compression for Data Transforms (configurable per deployed transform) by @oleiman in #18514
  • Add schema registry support to experimental Data Transforms C++ SDK by @oleiman in #21292
  • Adds the ability to pause a deployed data transform without removing it from the system by @oleiman in #18236
  • Adds the ability to start transform processing from an arbitrary offset on the input topic. by @oleiman in #19975
  • Client group quotas now support dynamic configuration of the rate. by @BenPope in #18693
  • Data transforms better respect overall system memory usage and go faster when more memory is available. by @rockwotj in #17810
  • Enable controller log eviction - now the snapshotted part of the controller log will be deleted. by @ztlpn in #18836
  • Golang Transforms SDK (Serde): Adds the ability to specify subject name strategy for output topics. Transforms can also provide a function to be called after the subject name is derived - e.g. to ensure that the subject is created before emitting records. by @oleiman in #19823
  • HTTP Proxy: Avoid large allocations during JSON serde of requests and responses by @BenPope in #20827
  • Implemented node-local core assignment - nodes decide themselves on which core to put partitions instead of using global assignments. This feature is enabled by default for new clusters. by @ztlpn in #18581
  • Improve rpk cluster self-test status output for the cloud test. by @WillemKauf in #21353
  • Improve cloud storage cache to prevent readers from being blocked during cache eviction. by @Lazin in #18056
  • Improved logging around create_topics and alter_configs requests by @graphcareful in #18789
  • It is now possible to use rpk group seek to set commits on a group that does not yet exist by @twmb in #18667
  • Made electing a leader faster by @mmaslankaprv in #18394
  • Made leadership changes related with reconfiguration faster and less disruptive by @mmaslankaprv in #19966
  • Log the connection-limit-reached message at WARN instead of INFO level by @graphcareful in #18773
  • Run directory walking during cache trimming concurrently. On some deployments with a busy reactor, walking 600K objects was observed to take hours, during which fetch operations that need to cache data were blocked. by @nvartolomei in #18758
  • Schema Registry: Avoid large allocations during JSON serde of requests and responses by @BenPope in #20827
  • Schema Registry: Avoid large contiguous allocations whilst handling schema. by @BenPope in #21323
  • Schema Registry: Improve handling of Accept and Content-Type by @BenPope in #19866
  • Schema Registry: GET /subject/<subject>/versions/ now supports -1 as an alias for latest. by @BenPope in #20146
  • Schema Registry: remove the schema from memory when the last subject version referencing it is deleted. by @pgellert in #20847
  • Self-test results will now also include start and end timestamps of test runs by @graphcareful in #18837
  • The produce client quota (target_quota_byte_rate) is now disabled by default. Previously this was enabled at 2GB/shard/client.id. by @pgellert in #20142
  • #17150 Don't try to transfer leadership to just restarted nodes when balancing leaders. by @ztlpn in #18497
  • #17710 rpk: group describe supports --regex flag by @daisukebe in #19839
  • #18057 rpk debug bundle now falls back to the loaded profile's admin API URLs if cluster discovery fails during the collection steps. by @r-vasquez in #19473
  • #18152 rpk: topic describe supports --regex flag by @daisukebe in #18221
  • #18627 rpk now exits with code 1 when run with unknown commands by @r-vasquez in #18650
  • #9008 rpk container start: You can now select the subnet and gateway to create your 'redpanda' network. by @r-vasquez in #17934
  • allow interpreting 'retention_duration' = -1 in a topic_manifest.json file as infinite time retention by @andijcr in #17461
  • made fast partition movements easier to debug. by @mmaslankaprv in #18680
  • new metric providing more insight into recovery process by @mmaslankaprv in #18691
  • reduced the amount of data required to transfer over the network by @mmaslankaprv in #19820
  • refined vectorized_raft_leadership_changes_total metric by @mmaslankaprv in #19873
  • rpk container now starts the seed broker using the default listener ports. by @r-vasquez in #17908
  • rpk debug bundle now marks the generated bundle file with the node's advertised_rpc address. by @r-vasquez in #19843
  • rpk debug bundle now collects logs since yesterday by default; you can still change this with --logs-since. by @r-vasquez in #19843
  • rpk group describe now has a log-start-offset column, and lag calculations now properly account for non-zero LSOs by @twmb in #18843
  • rpk: --mechanism flag is now required to update users when using rpk security user update. by @r-vasquez in #21452
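
The rpk group describe change above says lag now accounts for a non-zero log start offset (LSO). One plausible reading, sketched here in Python and not taken from the rpk source, is that a commit that is missing or that points below the LSO is treated as if it sat at the LSO, because the records before it no longer exist. The function name and arguments are hypothetical.

```python
def partition_lag(committed, log_start, log_end):
    """Return consumer lag for one partition (illustrative only).

    committed: last committed offset, or None if the group has no commit.
    log_start: log start offset (LSO), the first offset still readable.
    log_end:   log end offset, one past the last record in the partition.
    """
    if committed is None or committed < log_start:
        # Records below the LSO are gone, so they cannot count as lag.
        effective = log_start
    else:
        effective = committed
    return max(log_end - effective, 0)
```

With a log spanning offsets 100 to 150, a commit at 10 yields a lag of 50 under this reading. Counting from the stale commit would give 140, which overstates the lag by the 90 records that were already removed.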

Bug Fixes

  • Avoid spurious step downs if the append_entries are blocked for longer than usual on the followers. by @bharathv in #19974
  • Don't mark partition rebalance complete if some partitions are not moveable (e.g. due to partial recovery mode) by @ztlpn in #18489
  • Enforce client quota throttling in a Kafka-compatible way, meaning we enforce the throttle delay on the next request if the client did not enforce it on its side. by @pgellert in #18218
  • Fix a bug validating WebAssembly when global constants are specific values that have the encoded byte 0x0B. by @rockwotj in #18077
  • Fix a bug where an invalid buffer passed into the WebAssembly host from the guest could cause Redpanda to abort. by @rockwotj in #18225
  • Fix a scenario where list_offset with a timestamp could return a lower offset than partition start after a trim-prefix command. This could lead to consumers being stuck with an out-of-range-offset exception if they began consuming from an offset below the one which was used in the trim-prefix command. by @nvartolomei in #18112
  • Fix an issue where transforms would miscalculate their initial start offset, leading to consuming the whole input topic. by @oleiman in #20127
  • Fix timequery failing when requested timestamp is in the future and local log is empty. by @nvartolomei in #21312
  • Fix timequery failing with exceptions when the queried partition is empty. by @nvartolomei in #19937
  • Fix transform JavaScript SDK tools on Darwin by @rockwotj in #21490
  • Fixed misnamed schema registry ABI stubs in the Data Transforms Rust SDK and improved tests. by @oleiman in #18555
  • Fixes a bug in the HTTP client where a crash may occur when certain TLS verification errors are observed by @graphcareful in #18304
  • Fixes a bug that allowed requests creating ACLs for topics with invalid Kafka topic names to complete by @graphcareful in #18701
  • Fixes a bug where remote::list_objects() requests that return truncated results get stuck in an infinite loop by @WillemKauf in #18193
  • Fixes a crash that could happen when reading from local storage with a large number of segments that all do not contain user data. by @andrwng in #18073
  • Fixes compaction related issues of transactional data in groups topic. This prevents a case where groups topic was growing unbounded due to ineffective compaction. by @bharathv in #19931
  • Fixes incorrect ordering of arguments in the cloud cache trim admin endpoint. by @andrwng in #18732
  • Fixes related bugs in which query parameters in remote::list_objects() requests were being improperly assigned to request headers by @WillemKauf in #18193
  • Schema Registry: Improve handling of deleted schema by @BenPope in #19944
  • This fixes a bug in kafka topic configs, where the cleanup.policy config was always set at the topic-level to the cluster default even when the topic creation request did not specify this. by @pgellert in #18284
  • Unbreak building sr-sys crate for wasm32-wasi target by @voutilad in #18307
  • #12920 Fix issuing timequeries to cloud storage if remote.read is not enabled. by @WillemKauf in #17533
  • #13451 Speeds up the shutdown time of partition replicas when compactions are happening. by @andrwng in #21366
  • #15312 Fix an edge case where a timequery returns no results if it races with tiered storage retention and garbage collection. This matters most for consumers that fall behind retention: they interpret such a response as meaning the partition is empty and jump to the HWM instead of resuming from the first available message. by @nvartolomei in #18097
  • #17739 Better mapping of REST error codes by @mmaslankaprv in #18094
  • #18213 Fixes a crash caused by a race between a client disconnect and a segment reader in tiered storage. by @andrwng in #18229
  • #18286 Fixed an assertion triggering in a full-disk scenario by @andijcr in #18303
  • #18415 Fixed a crash during raft snapshot application (prefix truncation of the log). A race condition between raft snapshot application and lagging state machine apply fibers caused the state machine offset to move backwards incorrectly. by @bharathv in #18576
  • #18602 rpk: fixes an error in rpk topic consume that prevented the usage of the --regex flag. by @r-vasquez in #18628
  • concurrent requests of set_log_level + expiration now work as expected by @andijcr in #18397
  • correctly convert URL strings to net::unresolved_address in azure_aks_refresh_impl by @andijcr in #18035
  • do not return aborted transaction ranges when reading with read_uncommitted isolation level. by @mmaslankaprv in #18852
  • fixed handling of delayed snapshot requests that might lead to an assertion by @mmaslankaprv in #19964
  • fixed overflow that may lead to unnecessary moves by @mmaslankaprv in #19794
  • fixes possible stall in raft::state_machine_manager by @mmaslankaprv in #18626
  • rpk cluster config get: does not round float numbers anymore. by @r-vasquez in #18841
  • rpk: fixed a bug that prevented --any-port from working with Redpanda Console when using a cluster with more than 1 node. by @r-vasquez in #19948
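
The client quota fix in the list above describes enforcing the throttle delay on the next request when the client did not wait on its own side. A minimal Python sketch of that idea follows. It assumes the broker remembers, per client, the time until which the client was asked to back off. The class and method names are hypothetical and do not reflect Redpanda's implementation.

```python
class ThrottleTracker:
    """Tracks how long each client was asked to back off (illustrative)."""

    def __init__(self):
        # client id -> time (in seconds) before which the client should be idle
        self._throttled_until = {}

    def record_throttle(self, client_id, now, throttle_ms):
        """Remember the throttle delay sent back to the client in a response."""
        self._throttled_until[client_id] = now + throttle_ms / 1000.0

    def delay_before_next(self, client_id, now):
        """Seconds the broker should hold the client's next request.

        Returns 0.0 if the client already waited out the delay itself.
        """
        remaining = self._throttled_until.get(client_id, 0.0) - now
        return max(remaining, 0.0)
```

A client told at t=10.0 to wait 500 ms that sends again at t=10.2 would be held for roughly the remaining 0.3 s, while one that returns at t=10.6 is served immediately.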

Full Changelog: v24.1.1...v24.2.1