testFailingPodmanService fails very often with Firefox 135 #1999

Closed
martinpitt opened this issue Feb 11, 2025 · 7 comments · Fixed by cockpit-project/cockpit#21609

@martinpitt
Member

martinpitt commented Feb 11, 2025

Yesterday we got several failures from upstream podman PRs, e.g. containers/podman#25277 (comment), containers/podman#25280 (comment), or containers/podman#25281 (comment). The failure always looks the same:

Traceback (most recent call last):
  File "/source/test/check-application", line 1650, in testFailingPodmanService
    b.click("#app .pf-v5-c-empty-state button")
[...] TimeoutError

The screenshot shows that it's on the Services page, not on the podman page. This seems like a bug in the test itself, not a regression in rawhide. It also happened on Fedora 41.

This is urgent as it makes a lot of noise in podman.

I also just saw it in cockpit-project/cockpit#21606

@martinpitt
Member Author

This isn't visible on the weather report, which is completely sunny.

@martinpitt
Member Author

This smells like an issue with Firefox 135? The screenshot tells us that the click into the empty state clearly worked, as the screenshot shows the services page. But somehow that click timed out as it messed up the internal state.

To my shame, the PR that moved to the new container, #1998, already showed that issue, but only on centos-9/aarch64. I wrote that off too quickly as a weird rare flake, but it turns out it's a rather harmful flake.

As immediate mitigation I'll revert the tasks update.

martinpitt added a commit to martinpitt/cockpit-podman that referenced this issue Feb 11, 2025
This causes weird failures in testFailingPodmanService with Firefox 135,
see cockpit-project#1999. This breaks too many podman upstream PRs (which run
cockpit-podman reverse dependency tests).

This reverts commit 8f78a848de6c49612947559fbd46893c1d72141d.
@martinpitt martinpitt changed the title testFailingPodmanService fails very often testFailingPodmanService fails very often with Firefox 135 Feb 11, 2025
@martinpitt
Member Author

I reproduced this locally:

while TEST_OS=fedora-rawhide TEST_BROWSER=firefox test/check-application TestApplication.testFailingPodmanService -stv $RUNC; do : ; done

It failed 2 out of 5 runs, so easy enough to catch.
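
To put a number on a flake like this, rather than stopping at the first failure as the shell loop does, something along these lines could be used. This is only a sketch: count_failures is a hypothetical helper, not part of the test suite, and the command in the usage part is a stand-in that always succeeds, where the real one would be the test/check-application invocation above with TEST_OS and TEST_BROWSER set in the environment.

```python
import subprocess
import sys


def count_failures(cmd, runs, env=None):
    """Run cmd `runs` times and return how many runs exited non-zero."""
    failures = 0
    for _ in range(runs):
        # check=False: a failing run is data here, not an error
        result = subprocess.run(cmd, env=env, check=False)
        if result.returncode != 0:
            failures += 1
    return failures


if __name__ == "__main__":
    # Stand-in command (assumption); replace with the real test invocation.
    cmd = [sys.executable, "-c", "import sys; sys.exit(0)"]
    runs = 5
    print(f"{count_failures(cmd, runs)} of {runs} runs failed")
```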

The ph_find_scroll_into_view() for the empty state button still succeeded, then it does the click:

INFO:bidi.command:← input.performActions({'context': '12884901891', 'actions': [{'id': 'pointer-36', 'type': 'pointer', 'parameters': {'pointerType': 'mouse'}, 'actions': [{'type': 'pointerMove', 'x': 0, 'y': 0, 'origin': {'type': 'element', 'element': {'type': 'node', 'sharedId': '170bd294-9c78-4b0b-ac77-9c10ee028a79', 'value': {'nodeType': 1, 'localName': 'button', 'namespaceURI': 'http://www.w3.org/1999/xhtml', 'childNodeCount': 1, 'attributes': {'aria-disabled': 'false', 'class': 'pf-v5-c-button pf-m-primary', 'type': 'button', 'data-ouia-component-type': 'PF5/Button', 'data-ouia-safe': 'true', 'data-ouia-component-id': 'OUIA-Generated-Button-primary-1'}, 'shadowRoot': None}}}}, {'type': 'pointerDown', 'button': 0}, {'type': 'pointerUp', 'button': 0}]}]}) [id 36]

and that never gets a response:

  File "/var/home/martin/upstream/cockpit-podman/test/common/testlib.py", line 586, in mouse
    self.bidi("input.performActions", context=self.driver.context, actions=keys_pre + actions + keys_post)
    ~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/home/martin/upstream/cockpit-podman/test/common/testlib.py", line 337, in bidi
    return self.run_async(self.driver.bidi(method, **params))
           ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/home/martin/upstream/cockpit-podman/test/common/testlib.py", line 319, in run_async
    return asyncio.run_coroutine_threadsafe(coro, self.loop).result()
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/usr/lib64/python3.13/concurrent/futures/_base.py", line 456, in result
    return self.__get_result()
           ~~~~~~~~~~~~~~~~~^^
  File "/usr/lib64/python3.13/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/var/home/martin/upstream/cockpit-podman/test/common/webdriver_bidi.py", line 292, in bidi
    res = await asyncio.wait_for(future, timeout=timeout)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.13/asyncio/tasks.py", line 506, in wait_for
    async with timeouts.timeout(timeout):
               ~~~~~~~~~~~~~~~~^^^^^^^^^
  File "/usr/lib64/python3.13/asyncio/timeouts.py", line 116, in __aexit__
    raise TimeoutError from exc_val

There are also no BiDi events after this click. So this is very clearly a Firefox bug -- if the click changes the URL or frame (or whatever else makes it special), the command never gets a proper response from the BiDi driver.
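
The failure mode in the traceback can be illustrated in isolation: a command is sent, a future is parked until the matching reply arrives, and asyncio.wait_for() raises TimeoutError if nobody ever resolves it. The sketch below is not the webdriver_bidi.py code; the pending dict, the bidi() signature and the fake replies are assumptions made up for the illustration.

```python
import asyncio


async def bidi(pending, command_id, timeout):
    """Park a future for command_id and wait for someone to resolve it."""
    future = asyncio.get_running_loop().create_future()
    pending[command_id] = future
    try:
        return await asyncio.wait_for(future, timeout=timeout)
    finally:
        pending.pop(command_id, None)


async def demo():
    pending = {}
    loop = asyncio.get_running_loop()

    # Command 1: a fake "browser" answers after 10 ms.
    loop.call_later(0.01, lambda: pending[1].set_result({"id": 1, "type": "success"}))
    answered = await bidi(pending, 1, timeout=1)

    # Command 2: no reply ever arrives, like the click in this issue.
    try:
        await bidi(pending, 2, timeout=0.1)
        lost = "reply"
    except asyncio.TimeoutError:
        lost = "TimeoutError"
    return answered, lost


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

From the client side, a lost reply and a very slow reply look the same, which is why the absence of any further BiDi events is the more telling symptom.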

@martinpitt
Member Author

I tried with webdriver_bidi.log_proto debug logging. Like every good Heisenbug, this makes it much harder to catch. However, it doesn't show anything new: the only message from BiDi is this log message (which is expected as we just killed podman.service):

DEBUG:bidi.proto:ws TEXT → {'type': 'event', 'method': 'log.entryAdded', 'params': {'type': 'console', 'method': 'log', 'source': {'realm': '893383c8-a40a-4db4-af4b-c679a534d8f8', 'context': '12884901889'}, 'args': [{'type': 'object', 'value': [['problem', {'type': 'null'}], ['name', {'type': 'string', 'value': 'org.freedesktop.DBus.Error.UnknownInterface'}], ['message', {'type': 'string', 'value': "Unknown interface 'org.freedesktop.systemd1.Service'."}], ['toString', {'type': 'function'}]]}], 'level': 'info', 'text': '[object Object]', 'timestamp': 1739257737558}}
> info: [object Object]

So that doesn't help -- Firefox is broken 😢

@martinpitt
Member Author

I reported this to https://bugzilla.mozilla.org/show_bug.cgi?id=1947402

jelly pushed a commit that referenced this issue Feb 11, 2025
This causes weird failures in testFailingPodmanService with Firefox 135,
see #1999. This breaks too many podman upstream PRs (which run
cockpit-podman reverse dependency tests).

This reverts commit 8f78a848de6c49612947559fbd46893c1d72141d.
@martinpitt martinpitt moved this from urgent to detriment in Pilot tasks Feb 11, 2025
@martinpitt
Member Author

The revert in #2000 landed, so downgrading from urgent to detriment.

@martinpitt martinpitt moved this from detriment to minor in Pilot tasks Feb 11, 2025
@martinpitt martinpitt moved this from minor to detriment in Pilot tasks Feb 11, 2025
martinpitt added a commit to martinpitt/cockpit that referenced this issue Feb 11, 2025
Firefox 135 enabled async BiDi event dispatching [1]. This causes some
regression where sometimes input events never get a response [2], which
often breaks e.g. cockpit-podman's `testFailingPodmanService`.

Re-disable async events for the time being to stabilize tests, until
this gets debugged and fixed properly.

Fixes cockpit-project/cockpit-podman#1999

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1922077
[2] https://bugzilla.mozilla.org/show_bug.cgi?id=1947402
@martinpitt
Member Author

Worked around in cockpit-project/cockpit#21609. I'll follow up directly on the upstream bug to track down the problem with async events further.
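
For anyone who needs the same mitigation in their own harness, a rough sketch of disabling async events through the Firefox profile follows. The pref name remote.events.async.enabled is an assumption here (it is not quoted anywhere in this thread), and disable_async_events is a hypothetical helper; check the actual change in cockpit-project/cockpit#21609 before relying on either.

```python
import tempfile
from pathlib import Path

# ASSUMPTION: this pref name is not taken from the thread; verify it
# against the referenced cockpit PR or the Firefox remote agent docs.
ASYNC_EVENTS_PREF = "remote.events.async.enabled"


def disable_async_events(profile_dir):
    """Append a user.js line to the profile that turns the pref off."""
    user_js = Path(profile_dir) / "user.js"
    with user_js.open("a", encoding="utf-8") as f:
        f.write(f'user_pref("{ASYNC_EVENTS_PREF}", false);\n')
    return user_js


if __name__ == "__main__":
    # Demonstrate on a throwaway directory instead of a real profile.
    with tempfile.TemporaryDirectory() as profile:
        print(disable_async_events(profile).read_text(), end="")
```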
