2024-08-30T13:41:54.351864Z INFO sql: since has not yet advanced to expected time, retrying now_millis=619388520000 since=616796520001 query="SELECT * FROM mz_internal.mz_cluster_replica_statuses"
2024-08-30T13:41:58.972781Z INFO mz_environmentd::test_util: connection closed
2024-08-30T13:41:58.977524Z INFO mz_environmentd::test_util: connection error: error communicating with the server: Connection reset by peer (os error 104)
2024-08-30T13:41:58.977562Z INFO mz_environmentd::test_util: connection closed
test test_utilization_hold has been running for over 60 seconds
2024-08-30T13:42:29.525230Z WARN mz_adapter::coord: coordinator stuck for 30s last_message_kind=group_commit_initiate last_message_sql=<none>
This looks interesting to me.
ci-regexp: TIMEOUT .* mz-environmentd::sql test_utilization_hold
@def- was this a recurring flake? It seems like something caused the coordinator to get stuck, and I'm wondering whether it was a transient issue or a result of recent changes to the coordinator.
Checking ci-failures, it first occurred on August 21 (in #27720) and then 10 times in other branches after that. Maybe that is already the responsible change?
@ParkMyCar Is there anything in that SHOW commands PR that could cause coord stalls? It's especially strange since we're only seeing it in this retain history test case. @def-, have you seen any other test failures due to timeouts or the "coordinator stuck" error?
The catalog object being tested (mz_cluster_replica_statuses) hasn't been modified recently, so I'm thinking perhaps it's an issue with the coord taking a long time when querying retained history objects. Not sure what would've caused that, though; I'll have to ask around.
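For anyone reading the retry line in the log, here is a rough sketch that pulls `now_millis` and `since` out of it and computes how far `since` trails `now_millis`. It assumes both fields are millisecond timestamps on the same clock, which the log does not confirm. The helper name `parse_fields` is made up for illustration and is not part of Materialize.

```python
import re

# The retry line from the Cargo test log, copied verbatim.
LOG_LINE = (
    "2024-08-30T13:41:54.351864Z INFO sql: since has not yet advanced to "
    "expected time, retrying now_millis=619388520000 since=616796520001 "
    'query="SELECT * FROM mz_internal.mz_cluster_replica_statuses"'
)

MS_PER_DAY = 24 * 60 * 60 * 1000


def parse_fields(line):
    """Hypothetical helper: extract now_millis and since as integers."""
    pairs = re.findall(r"\b(now_millis|since)=(\d+)", line)
    return {key: int(value) for key, value in pairs}


fields = parse_fields(LOG_LINE)

# Assumption: both values are milliseconds, so the difference is a duration in ms.
gap_ms = fields["now_millis"] - fields["since"]
gap_days = gap_ms / MS_PER_DAY

print(f"gap: {gap_ms} ms (about {gap_days:.6f} days)")
```

Under that assumption the gap comes out to 2591999999 ms, which is 1 ms short of 30 days. Whether that lines up with the retention window the test configures is something to check against the test itself.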
What version of Materialize are you using?
4de94aa
What is the issue?
Seen in Cargo test: