protocol error from tungstenite on send is bug in ws_stream_tungstenite #7

Open
Thoralf-M opened this issue Oct 31, 2021 · 23 comments

Comments

@Thoralf-M

Just got this error

thread '<unnamed>' panicked at 'internal error: entered unreachable code: protocol error from tungstenite on send is bug in ws_stream_tungstenite, please report', /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/ws_stream_tungstenite-0.6.1/src/tung_websocket.rs:537:13

in a test from our workflow https://github.com/iotaledger/iota.rs/runs/4060223475?check_suite_focus=true

@najamelan
Owner

Thanks for reporting. I should have time to look into it tomorrow. ws_stream_tungstenite needs an update anyway, but I have to debug a stack overflow coming from tungstenite with the latest version.

@najamelan
Owner

najamelan commented Oct 31, 2021

Do I get it right that it's rumqttc that is invoking ws_stream_tungstenite? Could you point me to the code that is invoking rumqttc? Also, does this only happen with the new version (v0.10.0) of rumqttc?

@Thoralf-M
Author

Yes, it's with v0.10.0, the MQTT code is here https://github.com/iotaledger/iota.rs/blob/dev/iota-client/src/node/mqtt.rs and for the python binding here https://github.com/iotaledger/iota.rs/blob/dev/bindings/python/native/src/client/mqtt.rs
(the panic has only happened once so far; it worked after a retry)

@najamelan
Owner

Thanks for clarifying.

(the panic has only happened once so far; it worked after a retry)

That's unfortunate. Ideally we find a way to reproduce consistently. Is there a command which I can run on the repo offline to reproduce the failing test?

@Thoralf-M
Author

Thoralf-M commented Oct 31, 2021

Ideally we find a way to reproduce consistently. Is there a command which I can run on the repo offline to reproduce the failing test?

Agreed. Unfortunately, I haven't been able to reproduce it yet.
You can close the issue if you want. If it happens again and I can reproduce it, I will report it again.

@najamelan
Owner

Thanks for trying. We can leave it open for now. I will try to look into it anyway, because this should never happen. Maybe tungstenite has changed something since I wrote it.

@najamelan
Owner

I'm about to roll out a new version with an up-to-date tungstenite. I have not been able to find what provoked this bug. From what I can see in the tungstenite code, the only way to trigger a protocol error on send is by sending after closing the websocket. AFAICT we never do that. So a bug is hiding somewhere... @tekjar, do you have any idea how this can happen?

In any case, I have added a bit more information to the error variant in case this happens again. If someone can find a way to reproduce it, that would really help. If not, a backtrace would already help. I'll leave this open until we find it.
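
For readers following along, the "send after close" condition described above can be sketched with a small self-contained model. This is not tungstenite's or ws_stream_tungstenite's code: `ModelSocket`, `State` and `SendError` are hypothetical names invented for illustration, and only the error text is taken from the message tungstenite reports later in this thread.

```rust
use std::fmt;

/// Hypothetical stand-in for the protocol error discussed in this issue.
#[derive(Debug, PartialEq)]
enum SendError {
    SendAfterClosing,
}

impl fmt::Display for SendError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SendError::SendAfterClosing => {
                write!(f, "Sending after closing is not allowed")
            }
        }
    }
}

/// Hypothetical, simplified connection state (not tungstenite's real type).
#[derive(Debug, PartialEq, Clone, Copy)]
enum State {
    Open,
    Closing,
}

/// Toy socket that only records what was accepted for sending.
struct ModelSocket {
    state: State,
    sent: Vec<String>,
}

impl ModelSocket {
    fn new() -> Self {
        ModelSocket { state: State::Open, sent: Vec::new() }
    }

    /// Start the close handshake; later data messages must be rejected.
    fn close(&mut self) {
        self.state = State::Closing;
    }

    /// Accept a data message, refusing once a close has been initiated.
    fn send(&mut self, msg: &str) -> Result<(), SendError> {
        if self.state != State::Open {
            return Err(SendError::SendAfterClosing);
        }
        self.sent.push(msg.to_string());
        Ok(())
    }
}

fn main() {
    let mut ws = ModelSocket::new();
    assert!(ws.send("hello").is_ok());

    ws.close();
    match ws.send("too late") {
        Err(e) => println!("send failed: {}", e),
        Ok(()) => println!("unexpectedly sent"),
    }
    println!("messages accepted: {}", ws.sent.len());
}
```

In this model, a wrapper that wants to treat the error as unreachable has to guarantee that no `send` can follow a `close`, including a close initiated by the remote peer.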

@jyhi

jyhi commented Nov 18, 2021

Sadly, v0.6.2 causes a build failure with rumqttc, which specifies a ~0.6 constraint (https://github.com/bytebeamio/rumqtt/blob/master/rumqttc/Cargo.toml#L25), because the updated async_tungstenite v0.16.0 (9036d9b) has data structures that cannot be passed to the async_tungstenite v0.13.1 used by rumqttc (https://github.com/bytebeamio/rumqtt/blob/master/rumqttc/Cargo.toml#L24):

error[E0308]: mismatched types
--> /home/lmy441900/repositories/rumqtt/rumqttc/src/eventloop.rs:299:40
|
299 |             Network::new(WsStream::new(socket), options.max_incoming_packet_size)
|                                        ^^^^^^ expected struct `async_tungstenite::WebSocketStream`, found struct `WebSocketStream`
|
= note: expected struct `async_tungstenite::WebSocketStream<_>`
found struct `WebSocketStream<async_tungstenite::stream::Stream<TokioAdapter<tokio::net::TcpStream>, TokioAdapter<tokio_rustls::client::TlsStream<tokio::net::TcpStream>>>>`
= note: perhaps two different versions of crate `async_tungstenite` are being used?

error[E0308]: mismatched types
--> /home/lmy441900/repositories/rumqtt/rumqttc/src/eventloop.rs:316:40
|
316 |             Network::new(WsStream::new(socket), options.max_incoming_packet_size)
|                                        ^^^^^^ expected struct `async_tungstenite::WebSocketStream`, found struct `WebSocketStream`
|
= note: expected struct `async_tungstenite::WebSocketStream<_>`
found struct `WebSocketStream<async_tungstenite::stream::Stream<TokioAdapter<tokio::net::TcpStream>, TokioAdapter<tokio_rustls::client::TlsStream<tokio::net::TcpStream>>>>`
= note: perhaps two different versions of crate `async_tungstenite` are being used?

For more information about this error, try `rustc --explain E0308`.
error: could not compile `rumqttc` due to 2 previous errors

Thus, it's a breaking change and the minor version number should be incremented instead of the patch version number (i.e. v0.7.0).
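
To make the version argument concrete, here is a minimal sketch of how a tilde requirement such as `~0.6` selects versions: it accepts anything from 0.6.0 up to, but not including, 0.7.0. The helper names `parse_version` and `matches_tilde` are made up for this illustration and are not part of Cargo or of any crate mentioned here.

```rust
/// Parse "MAJOR.MINOR.PATCH" into a tuple; returns None on malformed input.
fn parse_version(s: &str) -> Option<(u64, u64, u64)> {
    let mut parts = s.split('.').map(|p| p.parse::<u64>().ok());
    let major = parts.next()??;
    let minor = parts.next()??;
    let patch = parts.next()??;
    if parts.next().is_some() {
        return None; // more than three components
    }
    Some((major, minor, patch))
}

/// Check a version against a tilde requirement of the form `~MAJOR.MINOR`,
/// i.e. >= MAJOR.MINOR.0 and < MAJOR.(MINOR + 1).0.
fn matches_tilde(req_major: u64, req_minor: u64, version: &str) -> bool {
    match parse_version(version) {
        Some((major, minor, _patch)) => major == req_major && minor == req_minor,
        None => false,
    }
}

fn main() {
    for v in ["0.6.1", "0.6.2", "0.7.0"] {
        println!("~0.6 accepts {}: {}", v, matches_tilde(0, 6, v));
    }
}
```

Because 0.6.2 still satisfies `~0.6`, a fresh dependency resolution can pick it up automatically, which is why a release that changes a public dependency's types needs a version outside that range.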

Also bytebeamio/rumqtt#324

@najamelan
Owner

@lmy441900 Oh. Sorry my bad. Updating async-tungstenite is a breaking change. I will fix that.

@najamelan
Owner

0.7.0 has been released and 0.6.2 has been yanked.

@najamelan
Owner

Hi, can you still reproduce this with the latest version? If not I would like to close the issue.

@Thoralf-M
Author

Thoralf-M commented Oct 7, 2023

rumqttc isn't updated to this version yet, but I also never saw this occur again, so I guess it's fine to close

@ondrowan

ondrowan commented Dec 7, 2023

The same error has just happened to me while using rumqttc 0.22.0. I've had this version deployed to multiple IoT devices and it has happened on just one of them after a couple of days. I don't really have any more details I could provide at this moment, but I'll try to monitor it more closely.

@najamelan
Owner

Ah, damn, I'm sorry to hear that. Can you update to the latest version? The code throwing this error has changed.

@swanandx
Contributor

swanandx commented Dec 7, 2023

rumqttc 0.22.0 uses ws_stream_tungstenite "0.10"; in rumqttc 0.23.0 (latest) we are using "0.11" (latest), so updating rumqttc might work, as @najamelan said. Also, there aren't any major breaking changes, so the update should be smooth.

@ondrowan

ondrowan commented Dec 7, 2023

Sorry, I mistyped: I'm already using rumqttc 0.23.0. However, it's a fork that adds support for native-tls when using the websocket transport (see bytebeamio/rumqtt#742).

@najamelan
Owner

Do you have the exact line number that panicked?

@swanandx
Contributor

swanandx commented Dec 7, 2023

OOH, that PR! I didn't realize it was yours, haha. I did work on that PR after our discussion ended, even last week, with and without tokio-native-tls in async_tungstenite as suggested by your desc or PR, but no luck. I couldn't get it working and got so overwhelmed that I decided to tackle it later 😅

will get back with more details :)

@swanandx
Contributor

swanandx commented Dec 7, 2023

Do you have the exact line number that panicked?

btw, I didn't use the exact fork (I had some of my own changes, as mentioned in the comments of the PR & here), but I didn't face such a panic.

@ondrowan

ondrowan commented Dec 7, 2023

Do you have the exact line number that panicked?

This is all the information I've found in logs:

thread 'main' panicked at 'internal error: entered unreachable code: protocol error from tungstenite on send is a bug in ws_stream_tungstenite, please report at http://github.com/najamelan/ws_stream_tungstenite/issues. The error from tungstenite is Sending after closing is not allowed', /home/pi/.cargo/registry/src/index.crates.io-1cd66030c949c28d/ws_stream_tungstenite-0.11.0/src/tung_websocket.rs:540:13

@najamelan
Owner

Thanks I'll have a look at that.

najamelan reopened this Dec 7, 2023

@ondrowan

It seems this isn't as random as I thought it'd be. It eventually happened on all 6 RPis I've deployed this code on, although it always took a couple of weeks to emerge. Unfortunately, I still don't have any more clues I could provide :/

@najamelan
Owner

Ah, sorry to hear that. I think there are two things we can do, given that this should mean ws_stream_tungstenite violates the WS protocol:

  • code review
  • stress/fuzz testing to make it reproducible.

I just need to find time to dedicate to it as life is a bit hectic ATM. Feel free to review the code, and see the tungstenite code for when they throw this error.
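
As a rough idea of what the stress/fuzz approach could look like, the sketch below drives a toy wrapper with seeded random sequences of operations and stops at the first sequence in which the inner socket reports a send after close. Everything here (`XorShift`, `Op`, `Inner`, `Wrapper`, `run_seed`) is hypothetical and only models the invariant; a real test would have to exercise the actual ws_stream_tungstenite types over a real or mocked connection.

```rust
/// Tiny deterministic PRNG (xorshift64) so a failing seed can be replayed.
struct XorShift(u64);

impl XorShift {
    fn next(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

#[derive(Debug, Clone, Copy)]
enum Op {
    Send,
    Close,
}

/// Hypothetical inner socket: errors if a send arrives after closing.
#[derive(Default)]
struct Inner {
    closing: bool,
}

impl Inner {
    fn send(&mut self) -> Result<(), &'static str> {
        if self.closing {
            Err("Sending after closing is not allowed")
        } else {
            Ok(())
        }
    }

    fn close(&mut self) {
        self.closing = true;
    }
}

/// Hypothetical wrapper under test: must never forward a send once closed.
#[derive(Default)]
struct Wrapper {
    inner: Inner,
    closed: bool,
}

impl Wrapper {
    fn apply(&mut self, op: Op) -> Result<(), &'static str> {
        match op {
            // Dropped by the wrapper; the inner socket is not touched.
            Op::Send if self.closed => Ok(()),
            Op::Send => self.inner.send(),
            Op::Close => {
                self.closed = true;
                self.inner.close();
                Ok(())
            }
        }
    }
}

/// Run one random sequence; on a violation, return the ops executed so far.
fn run_seed(seed: u64, len: usize) -> Result<(), Vec<Op>> {
    let mut rng = XorShift(seed);
    let mut wrapper = Wrapper::default();
    let mut trace = Vec::new();
    for _ in 0..len {
        // Roughly one operation in eight is a close.
        let op = if rng.next() % 8 == 0 { Op::Close } else { Op::Send };
        trace.push(op);
        if wrapper.apply(op).is_err() {
            return Err(trace);
        }
    }
    Ok(())
}

fn main() {
    // Seeds start at 1 because xorshift never leaves the all-zero state.
    for seed in 1..=1000u64 {
        if let Err(trace) = run_seed(seed, 50) {
            println!("seed {} violated the invariant: {:?}", seed, trace);
            return;
        }
    }
    println!("no violation found in 1000 seeds");
}
```

The point of the seeded generator is that any sequence that does trigger the error can be replayed exactly, which is the reproducibility this issue has been missing so far.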


5 participants