Skip to content

fix(pubsub): avoid UnicodeDecodeError on reconnect with binary channel names#3944

Open
skylarkoo7 wants to merge 2 commits intoredis:masterfrom
skylarkoo7:fix-3912-binary-channel-reconnect
Open

fix(pubsub): avoid UnicodeDecodeError on reconnect with binary channel names#3944
skylarkoo7 wants to merge 2 commits intoredis:masterfrom
skylarkoo7:fix-3912-binary-channel-reconnect

Conversation

@skylarkoo7
Copy link

Summary

When a binary channel name that is not valid UTF-8 is subscribed via positional arguments (without a callback handler), the PubSub.on_connect reconnection logic raises UnicodeDecodeError. This happens because on_connect force-decodes every stored channel key to pass it as a keyword argument to subscribe()/psubscribe().

The fix splits channels into two groups during reconnection:

  • With handlers (originally passed as kwargs): decoded and re-passed as kwargs — their names are guaranteed decodable since Python requires string keyword arguments
  • Without handlers (originally passed as positional args): passed as raw positional args — no decoding required, preserving arbitrary binary names

Changes

  • redis/client.py — sync PubSub.on_connect: split channels, patterns, and shard_channels by handler presence
  • redis/asyncio/client.py — async PubSub.on_connect: same split for channels and patterns
  • tests/test_pubsub.py — added test_resubscribe_binary_channel_on_reconnection and test_resubscribe_binary_pattern_on_reconnection
  • tests/test_asyncio/test_pubsub.py — async equivalents of the above tests

Test plan

  • New tests subscribe to deliberately invalid UTF-8 channel/pattern names, force a disconnect, then verify get_message() triggers reconnection and resubscription without raising UnicodeDecodeError
  • Existing reconnection tests (test_resubscribe_to_channels_on_reconnection, test_resubscribe_to_patterns_on_reconnection) continue to pass, confirming backward compatibility for string channel names with and without handlers

Fixes #3912

…l names

Channels subscribed as positional arguments (without a callback handler)
may carry binary names that are not valid in the connection's encoding
(e.g. arbitrary bytes that are not valid UTF-8).  The existing
`on_connect` method decoded every channel name via `force=True` to pass
them as keyword arguments to `subscribe`/`psubscribe`, which raised
`UnicodeDecodeError` for these channels.

Split the reconnection logic: channels with handlers are decoded and
passed as kwargs (they were originally subscribed as kwargs, so their
names are guaranteed decodable); channels without handlers are passed as
positional args, preserving the original bytes.

Applied the same fix to:
- async PubSub.on_connect (channels and patterns)
- sync PubSub.on_connect (channels, patterns, and shard_channels)

Added tests for binary channel and pattern reconnection in both sync
and async test suites.

Fixes redis#3912
@jit-ci
Copy link

jit-ci bot commented Feb 9, 2026

Hi, I’m Jit, a friendly security platform designed to help developers build secure applications from day zero with an MVS (Minimal viable security) mindset.

In case there are security findings, they will be communicated to you as a comment inside the PR.

Hope you’ll enjoy using Jit.

Questions? Comments? Want to learn more? Get in touch with us.

@petyaslavova
Copy link
Collaborator

Hi @skylarkoo7, thank you for your contribution! We will have a look at it.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes PubSub reconnection resubscribe logic to avoid UnicodeDecodeError when previously-subscribed channel/pattern names are arbitrary binary (non-UTF-8) and were originally subscribed without handlers (positional args).

Changes:

  • Update sync and asyncio PubSub.on_connect to split stored subscriptions into “with handlers” (replayed as decoded **kwargs) vs “without handlers” (replayed as raw positional args).
  • Add sync + asyncio tests covering reconnect/resubscribe for invalid UTF-8 binary channels and patterns.
  • Minor test assertion formatting adjustment.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File Description
redis/client.py Sync PubSub.on_connect now resubscribes binary/no-handler channels/patterns/shard channels as positional args to avoid forced decoding.
redis/asyncio/client.py Async PubSub.on_connect uses the same handler-aware split for channels and patterns.
tests/test_pubsub.py Adds regression tests for binary channel/pattern resubscription on reconnect (sync).
tests/test_asyncio/test_pubsub.py Adds async equivalents of the new binary reconnect/resubscribe regression tests.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +215 to +221
messages = []
for _ in range(1):
messages.append(wait_for_message(p))

assert len(messages) == 1
assert messages[0]["type"] == "subscribe"
assert messages[0]["channel"] == binary_channel
Copy link

Copilot AI Feb 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait_for_message(p) can return None on timeout; appending it and then indexing messages[0]["type"] will raise a TypeError and obscure the real failure. Consider asserting the returned message is not None (or directly asserting equality vs an expected subscribe/psubscribe message) before subscripting it.

Copilot uses AI. Check for mistakes.
Comment on lines +178 to +185
messages = []
for _ in range(1):
messages.append(await wait_for_message(p))

assert len(messages) == 1
assert messages[0]["type"] == "subscribe"
assert messages[0]["channel"] == binary_channel

Copy link

Copilot AI Feb 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait_for_message(p) can return None on timeout; appending it and then indexing messages[0]["type"] will raise a TypeError and obscure the underlying failure. Consider asserting the returned message is not None (or directly asserting equality vs an expected subscribe/psubscribe message) before subscripting it.

Copilot uses AI. Check for mistakes.
for _ in range(1):
messages.append(await wait_for_message(p))

assert len(messages) == 1
Copy link

Copilot AI Feb 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as the channel test: if wait_for_message(p) times out and returns None, subscripting messages[0] will raise TypeError rather than producing a clear assertion failure. Add an explicit is not None assertion (or compare to an expected psubscribe message) before accessing keys.

Suggested change
assert len(messages) == 1
assert len(messages) == 1
assert messages[0] is not None

Copilot uses AI. Check for mistakes.
Comment on lines +237 to +243
messages = []
for _ in range(1):
messages.append(wait_for_message(p))

assert len(messages) == 1
assert messages[0]["type"] == "psubscribe"
assert messages[0]["channel"] == binary_pattern
Copy link

Copilot AI Feb 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as the channel test: if wait_for_message(p) times out and returns None, subscripting messages[0] will raise TypeError and make failures harder to diagnose. Add an explicit is not None assertion (or compare to an expected psubscribe message) before accessing messages[0].

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Subscribing to a binary channel name triggers UnicodeDecodeError on reconnect

2 participants