Conflict with pytest-django and pytest-xdist #277

raphaelm · 2024-10-25T19:39:43Z

We just spent a lot of time debugging a really weird issue, and we still haven't fully understood it. It only occurs with the combination of:

pytest-django
pytest-rerunfailures
pytest-xdist
GitHub actions

Two months ago, a flaky test made it into our codebase. Since then, a good 75% of our GitHub Actions runs have failed, but only on the test matrix entries that test against PostgreSQL. Each of these failed runs had dozens, if not hundreds, of error messages like this:

| __________________ ERROR at teardown of test_position_queries __________________
| [gw2] linux -- Python 3.9.20 /opt/hostedtoolcache/Python/3.9.20/x64/bin/python
| 
| self = <DatabaseWrapper vendor='postgresql' alias='default'>, name = None
| 
|     def _cursor(self, name=None):
|         self.close_if_health_check_failed()
|         self.ensure_connection()
|         with self.wrap_database_errors:
| >           return self._prepare_cursor(self.create_cursor(name))
| 
| /opt/hostedtoolcache/Python/3.9.20/x64/lib/python3.9/site-packages/django/db/backends/base/base.py:308: 
| _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
| /opt/hostedtoolcache/Python/3.9.20/x64/lib/python3.9/site-packages/django/utils/asyncio.py:26: in inner
|     return func(*args, **kwargs)
| _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
| 
| self = <DatabaseWrapper vendor='postgresql' alias='default'>, name = None
| 
|     @async_unsafe
|     def create_cursor(self, name=None):
|         if name:
|             # In autocommit mode, the cursor will be used outside of a
|             # transaction, hence use a holdable cursor.
|             cursor = self.connection.cursor(
|                 name, scrollable=False, withhold=self.connection.autocommit
|             )
|         else:
| >           cursor = self.connection.cursor()
| E           psycopg2.InterfaceError: connection already closed
...

A sample run with full log can be found e.g. here:
https://github.com/pretix/pretix/actions/runs/11322864056/job/31484383103

This is a run of the repository at this commit:
https://github.com/pretix/pretix/tree/40c8d014dfba6e97af0ad40d0b7f4abfd087082a

As you can see, the test run ends with this summary:

= 5365 passed, 17 skipped, 2 xfailed, 26 errors, 41 rerun in 772.74s (0:12:52) =

This is the environment of that run:

platform linux -- Python 3.9.20, pytest-8.3.3, pluggy-1.5.0
django: version: 4.2.16, settings: tests.settings (from ini)
rootdir: /home/runner/work/pretix/pretix/src
configfile: setup.cfg
plugins: django-4.9.0, asyncio-0.24.0, rerunfailures-14.0, xdist-3.6.1, cov-5.0.0, mock-3.14.0
asyncio: mode=strict, default_loop_scope=None
created: 3/3 workers
3 workers [5397 items]

In reality, however, only one test should be failing, and that faulty test is not even among the failures listed in the output, probably because it was retried successfully and therefore not reported as a failure.

All of the listed failures are from the same pytest-xdist worker. It looks like the faulty test somehow leaves the database connection in a broken state, so that all tests subsequently run on the same worker fail.
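For anyone who wants to check whether their own failures cluster on one worker, here is a minimal sketch that counts setup/teardown errors per pytest-xdist worker in a saved log. It assumes the log has the same shape as the excerpt above (an `ERROR at teardown of ...` header followed directly by a `[gwN]` line); the function name `errors_per_worker` is made up for this example and is not part of any plugin.

```python
import re
from collections import Counter

# Header pytest prints for a setup/teardown error, e.g.
# "____ ERROR at teardown of test_position_queries ____"
ERROR_HEADER = re.compile(r"_+ ERROR at (?:setup|teardown) of (.+?) _+")
# Line printed right after the header under xdist, e.g.
# "[gw2] linux -- Python 3.9.20 ..."
WORKER_LINE = re.compile(r"\[(gw\d+)\]")


def errors_per_worker(log_text):
    """Count setup/teardown errors per xdist worker id (hypothetical helper)."""
    counts = Counter()
    pending = False  # True when the previous line was an error header
    for line in log_text.splitlines():
        if ERROR_HEADER.search(line):
            pending = True
        elif pending:
            match = WORKER_LINE.search(line)
            if match:
                counts[match.group(1)] += 1
            pending = False
    return counts


sample = """\
__________________ ERROR at teardown of test_position_queries __________________
[gw2] linux -- Python 3.9.20 /opt/hostedtoolcache/Python/3.9.20/x64/bin/python
"""
print(errors_per_worker(sample))
```

If nearly all errors land on a single worker id, that matches the pattern described here: one broken connection on one worker rather than many independent failures.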

Now, after days of searching, we have found the faulty test, and after we fixed it, all of the other failures vanished as well.

We have since investigated further and discovered the following: if we roll back our fix and then set --reruns 0 or uninstall pytest-rerunfailures, only the faulty test fails and no other tests do, as it should be.

In conclusion, I believe that when pytest-rerunfailures causes a test to be retried, not all of the necessary setup/teardown logic is called. (This could of course just as well be a bug in pytest-django; I could not find a way to determine that.)

Has anyone experienced something similar before?

asottile-sentry · 2024-10-29T18:25:52Z

dupe here: #267 (comment)

raphaelm mentioned this issue Oct 25, 2024: Tests: Remove pytest-rerunfailures pretix/pretix#4572 (merged)

icemac closed this as not planned Oct 30, 2024

icemac added the invalid label Oct 30, 2024
