Timeout in cargo test: mz-environmentd::sql test_utilization_hold #29299

def- · 2024-08-30T13:56:34Z

What version of Materialize are you using?

4de94aa

What is the issue?

2024-08-30T13:41:54.351864Z  INFO sql: since has not yet advanced to expected time, retrying now_millis=619388520000 since=616796520001 query="SELECT * FROM mz_internal.mz_cluster_replica_statuses"
2024-08-30T13:41:58.972781Z  INFO mz_environmentd::test_util: connection closed
2024-08-30T13:41:58.977524Z  INFO mz_environmentd::test_util: connection error: error communicating with the server: Connection reset by peer (os error 104)
2024-08-30T13:41:58.977562Z  INFO mz_environmentd::test_util: connection closed
test test_utilization_hold has been running for over 60 seconds
2024-08-30T13:42:29.525230Z  WARN mz_adapter::coord: coordinator stuck for 30s last_message_kind=group_commit_initiate last_message_sql=<none>

This looks interesting to me.
ci-regexp: TIMEOUT .* mz-environmentd::sql test_utilization_hold
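
For anyone checking whether a given log line would be attributed to this issue, here is a minimal sketch of how the pattern above behaves. It assumes the pattern is applied as an unanchored regular-expression search per line (an assumption, not confirmed in this thread). The sample lines and the helper name `matches_known_flake` are hypothetical, made up for illustration.

```python
import re

# Pattern copied verbatim from the ci-regexp line in this issue.
CI_REGEXP = re.compile(r"TIMEOUT .* mz-environmentd::sql test_utilization_hold")


def matches_known_flake(log_line: str) -> bool:
    """Return True if the pattern matches anywhere in the given line.

    Assumption: matching is an unanchored search over a single line.
    """
    return CI_REGEXP.search(log_line) is not None


# Hypothetical test-runner output lines, not taken from a real CI log.
timeout_line = "TIMEOUT [  60.0s] mz-environmentd::sql test_utilization_hold"
other_line = "PASS [   1.2s] mz-environmentd::sql test_utilization_hold"

if __name__ == "__main__":
    print(matches_known_flake(timeout_line))
    print(matches_known_flake(other_line))
```

Note that the pattern requires at least one space after `TIMEOUT` and another before `mz-environmentd`, so it only matches lines that carry some text (such as a duration) between the two.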


chaas · 2024-09-05T22:01:17Z

@def- was this a recurring flake? It seems something caused the coordinator to get stuck, and I'm wondering whether it was a transient issue or an issue with recent changes to the coordinator.

def- · 2024-09-06T06:53:43Z

Checking ci-failures, it first occurred on August 21 (in #27720) and then 10 times in other branches after that. Maybe #27720 is already the responsible change?

chaas · 2024-09-06T14:11:51Z

@ParkMyCar Is there anything in that SHOW commands PR that could cause coord stalls? It's especially strange since we're only seeing it in this retain history test case. @def- have you seen any other test failures due to timeouts or the "coordinator stuck" error?
The catalog object being tested (mz_cluster_replica_statuses) hasn't been modified recently, so I suspect the coord is taking a long time when querying retained history objects. I'm not sure what would have caused that, though; I'll have to ask around.

def- added C-bug Category: something is broken ci-flake labels Aug 30, 2024

def- assigned ParkMyCar Aug 30, 2024
