Description
Observed behavior
Problem Description
When two different MQTT users (mapped via mTLS certificate CN) attempt pub/sub communication:
- Publisher sends QoS 1 message and receives PUBACK (message stored in JetStream)
- Subscriber's JetStream consumer receives the message (consumer sequence increments)
- However the subscriber's MQTT client never receives the message
- Messages accumulate as outstanding acks and get redelivered indefinitely
The internal $MQTT.sub.* delivery subject binding appears broken for cross-user scenarios.
Initial thought: a Wildcard Translation Bug
Fun fact - it's not a wildcard translation issue.
Initial investigation suggested MQTT # wildcards weren't being translated to NATS .>. This was incorrect. Further testing revealed:
- Consumer filters ARE correct - Both .> and exact-match consumers are created
- Messages ARE delivered to consumers - Consumer sequence increments
- But acknowledgments never come - Ack floor stays at 0
- The bug is in delivery subject binding - Messages go to $MQTT.sub.xxx but the client isn't receiving on that subject
Diagnostics
Consumer State Shows Delivery Without Acknowledgment
$ nats consumer info '$MQTT_msgs' '111SUQEl_UO4iyguny1Jz4clB1xwhjy'
Configuration:
Delivery Subject: $MQTT.sub.UO4iyguny1Jz4clB1xwhgO
Filter Subject: $MQTT.msgs.bridge.alex-garage-1.down.> # <-- Filter IS correct!
State:
Last Delivered Message: Consumer sequence: 9 Stream sequence: 289769
Acknowledgment floor: Consumer sequence: 0 Stream sequence: 0 # <-- NOTHING acknowledged!
Outstanding Acks: 3 out of maximum 1024 # <-- Messages stuck
Redelivered Messages: 3 # <-- Being redelivered
Unprocessed Messages: 0
This proves:
- The consumer filter is correct (includes .>)
- Messages ARE being delivered (consumer sequence = 9)
- But the client NEVER acknowledges (ack floor = 0)
- Messages accumulate and get redelivered indefinitely
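The same check can be scripted. The sketch below classifies a consumer from its state counters. It assumes the input is a dict shaped like the JetStream consumer info JSON (`delivered`, `ack_floor`, `num_ack_pending`, `num_redelivered`), for example as printed by `nats consumer info --json`. The function name `diagnose_consumer` and its return labels are made up for illustration; the sample values are the ones from the output above.

```python
def diagnose_consumer(info: dict) -> str:
    """Classify a push consumer from its state counters.

    Field names are assumed to match the JetStream consumer info JSON;
    the labels returned here are illustrative only.
    """
    delivered = info["delivered"]["consumer_seq"]
    ack_floor = info["ack_floor"]["consumer_seq"]
    ack_pending = info.get("num_ack_pending", 0)

    if delivered == 0:
        return "idle"      # nothing has been delivered yet
    if ack_floor == 0 and ack_pending > 0:
        return "stuck"     # deliveries happened, but nothing was ever acked
    if ack_floor < delivered:
        return "lagging"   # some acks arrived, some deliveries still pending
    return "healthy"


# Values copied from the consumer state shown above.
observed = {
    "delivered": {"consumer_seq": 9, "stream_seq": 289769},
    "ack_floor": {"consumer_seq": 0, "stream_seq": 0},
    "num_ack_pending": 3,
    "num_redelivered": 3,
    "num_pending": 0,
}

print(diagnose_consumer(observed))  # prints "stuck"
```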
Consumer Lifecycle is Correct
Testing confirmed that consumers are properly cleaned up on disconnect and recreated on connect:
# Before reconnect:
0EuYuueO_5epcT7g45uz49HkuBdAjBh (Delivery: $MQTT.sub.5epcT7g45uz49HkuBdAj8M)
# After reconnect - old consumer deleted, new one created:
0EuYuueO_UO4iyguny1Jz4clB1xwds4 (Delivery: $MQTT.sub.UO4iyguny1Jz4clB1xwdoU)
MQTT Wildcard Translation is Correct
NATS correctly creates two consumers for # wildcard subscriptions:
| Consumer | Filter Subject | Purpose |
|---|---|---|
| ...hjy | $MQTT.msgs.bridge.alex-garage-1.down.> | Matches subtopics |
| ...hyI | $MQTT.msgs.bridge.alex-garage-1.down | Matches exact topic |
This is correct behavior since MQTT # matches zero or more levels, but NATS .> matches one or more.
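To make the two-consumer mapping concrete, here is a simplified sketch of the translation. It is not the server's implementation: it only maps `/` to `.`, `+` to `*`, and `#` to `>`, and ignores edge cases such as empty levels or dots inside topic levels. The function name `mqtt_filter_to_nats_subjects` is hypothetical; the `$MQTT.msgs.` prefix is taken from the filter subjects above.

```python
MQTT_MSGS_PREFIX = "$MQTT.msgs."  # prefix seen in the consumer filter subjects


def mqtt_filter_to_nats_subjects(topic_filter: str) -> list[str]:
    """Translate an MQTT topic filter into the NATS filter subject(s) needed.

    Simplified: '/' -> '.', '+' -> '*', '#' -> '>'.
    A trailing '#' yields two subjects, because MQTT '#' also matches the
    parent level itself while NATS '>' requires at least one more token.
    """
    tokens = []
    for level in topic_filter.split("/"):
        if level == "+":
            tokens.append("*")
        elif level == "#":
            tokens.append(">")
        else:
            tokens.append(level)

    subjects = [MQTT_MSGS_PREFIX + ".".join(tokens)]
    if tokens[-1] == ">" and len(tokens) > 1:
        # Second subject covers the "zero levels" case of MQTT '#'.
        subjects.append(MQTT_MSGS_PREFIX + ".".join(tokens[:-1]))
    return subjects


for subject in mqtt_filter_to_nats_subjects("bridge/alex-garage-1/down/#"):
    print(subject)
# prints:
# $MQTT.msgs.bridge.alex-garage-1.down.>
# $MQTT.msgs.bridge.alex-garage-1.down
```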
NATS Debug Logs
With -DV flags enabled:
# Device subscribes
[TRC] "[email protected]" - <<- [SUBSCRIBE [bridge/alex-garage-1/down/# QoS=1] pi=53284]
[TRC] "[email protected]" - ->> [SUBACK pi=53284]
# Service publishes
[TRC] "[email protected]" - <<- [PUBLISH bridge/alex-garage-1/down/keys/response QoS=1 size=152 pi=4]
[TRC] "[email protected]" - ->> [PUBACK pi=4]
# NOTE: No message forwarded to device!
Theory
I'm not deeply familiar with NATS JetStream user/subject bindings, but something smells off in how NATS 2.12 binds the JetStream consumer's delivery subject to the MQTT client's session in cross-user scenarios.
- Device subscribes to bridge/alex-garage-1/down/#
- NATS creates JetStream consumer with delivery subject $MQTT.sub.xxx
- In same-user scenarios, the internal subscription on $MQTT.sub.xxx is properly connected
- In cross-user scenarios, the binding is broken - messages are delivered to $MQTT.sub.xxx but the MQTT session isn't receiving from that subject
- Messages accumulate as outstanding acks since the client can't acknowledge what it never received
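The theory can be illustrated with a toy model. This is not how nats-server is implemented. It only shows that a consumer publishing to a delivery subject with no listener bound produces exactly the counters seen above (delivered sequence grows, ack floor stays at 0). All class names and the two delivery subjects below are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set


@dataclass
class ToyConsumer:
    delivery_subject: str
    delivered_seq: int = 0
    ack_floor: int = 0
    outstanding: Set[int] = field(default_factory=set)


class ToyServer:
    """Toy stand-in: maps delivery subjects to session handlers."""

    def __init__(self) -> None:
        self.bindings: Dict[str, Callable[[bytes], None]] = {}

    def bind(self, subject: str, handler: Callable[[bytes], None]) -> None:
        self.bindings[subject] = handler

    def deliver(self, consumer: ToyConsumer, payload: bytes) -> bool:
        consumer.delivered_seq += 1
        seq = consumer.delivered_seq
        consumer.outstanding.add(seq)
        handler = self.bindings.get(consumer.delivery_subject)
        if handler is None:
            return False  # nobody is listening; the message stays unacked
        handler(payload)  # the session receives the message and acks it
        consumer.outstanding.discard(seq)
        # Ack floor is the highest sequence below which everything is acked.
        consumer.ack_floor = (
            min(consumer.outstanding) - 1 if consumer.outstanding else seq
        )
        return True


server = ToyServer()
received = []

bound = ToyConsumer("$MQTT.sub.same-user")      # hypothetical subject
server.bind(bound.delivery_subject, received.append)
unbound = ToyConsumer("$MQTT.sub.cross-user")   # hypothetical, never bound

for _ in range(3):
    server.deliver(bound, b"payload")
    server.deliver(unbound, b"payload")

print(bound.ack_floor, len(bound.outstanding))      # prints: 3 0
print(unbound.ack_floor, len(unbound.outstanding))  # prints: 0 3
```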
Additional Observations
- Cross-user delivery fails: Messages between different MQTT users (mapped via mTLS certs) are not delivered
- Same-user delivery works: When publisher and subscriber use the same certificate, delivery works perfectly
- Messages are stored: The $MQTT_msgs stream receives and stores the messages correctly
- Consumer filters are correct: The .> wildcard is properly added to filter subjects
- Consumer lifecycle is correct: Consumers are properly created/deleted on connect/disconnect
- Delivery attempts happen: Consumer sequence increments, showing NATS tries to deliver
- Acknowledgments never come: Ack floor stays at 0, messages redelivered indefinitely
Workaround
None known on 2.12.x. Downgrading to NATS 2.11.11 restores correct behavior.
Expected behavior
Test Scenario
Subscriber (Device):
- Connects via MQTT with mTLS (certificate CN: [email protected])
- Subscribes to: bridge/alex-garage-1/down/# with QoS 1
Publisher (Service):
- Connects via MQTT with mTLS (certificate CN: [email protected])
- Publishes to: bridge/alex-garage-1/down/keys/response with QoS 1
Result:
- Publisher receives PUBACK (message stored in JetStream)
- Subscriber receives the message
- Consumer shows messages delivered and acknowledged
Server and client version
Environment
- NATS Server Version: 2.12.3-alpine
- Previous Working Version: 2.11.x (issue appeared after upgrade)
- Protocol: MQTT over TLS (port 8883)
- Authentication: mTLS with verify_and_map: true
- JetStream: Enabled
- Accounts: Using multi-account setup (SYS + APP accounts)
Host environment
Running in a Docker container on an EC2 instance running Amazon Linux 2023.
Steps to reproduce
Reproduction Steps
1. NATS Configuration
# nats.conf
server_name: nats-mqtt
port: 4222
http: 8222
jetstream: {
  store_dir: "/data/jetstream"
}
include "/includes/users.inc"
mqtt {
  port: 1883
}
mqtt {
  host: 0.0.0.0
  port: 8883
  tls {
    cert_file: "/etc/nats/certs/server-cert.pem"
    key_file: "/etc/nats/certs/server-key.pem"
    ca_file: "/etc/nats/certs/root-ca.pem"
    verify_and_map: true
  }
}
2. Users Configuration (users.inc)
accounts {
  SYS {
    users = [
      { user: "admin", password: "...", permissions: { publish: [">"], subscribe: [">"] } }
    ]
  }
  APP {
    jetstream { max_file: 25Gb }
    users = [
      { user: "[email protected]", permissions: {"publish": [">"], "subscribe": [">", "$MQTT.sub.>"]}, allowed_connection_types: ["MQTT"] },
      { user: "[email protected]", permissions: {"publish": [">"], "subscribe": [">", "$MQTT.sub.>"]}, allowed_connection_types: ["MQTT"] }
    ]
  }
}
system_account: SYS
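Both APP users are granted subscribe on ">" and "$MQTT.sub.>". Assuming permission checks follow ordinary NATS token-based wildcard matching, either entry already covers the delivery subject seen in the diagnostics. The sketch below is a minimal subject matcher written for illustration, not the server's code; `subject_matches` is a hypothetical helper.

```python
def subject_matches(pattern: str, subject: str) -> bool:
    """Token-based NATS-style match: '*' matches exactly one token,
    '>' matches one or more trailing tokens and must be the last token."""
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, p in enumerate(p_tokens):
        if p == ">":
            # Valid only as the final token, and needs at least one token left.
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False  # pattern is longer than the subject
        if p != "*" and p != s_tokens[i]:
            return False  # literal token mismatch
    return len(p_tokens) == len(s_tokens)


delivery_subject = "$MQTT.sub.UO4iyguny1Jz4clB1xwhgO"  # from the diagnostics

print(subject_matches(">", delivery_subject))            # prints: True
print(subject_matches("$MQTT.sub.>", delivery_subject))  # prints: True
```

The same helper also shows why the .> filter alone would miss the exact topic: `subject_matches("$MQTT.msgs.bridge.alex-garage-1.down.>", "$MQTT.msgs.bridge.alex-garage-1.down")` is False.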
3. Test Scenario
Subscriber (Device):
- Connects via MQTT with mTLS (certificate CN: [email protected])
- Subscribes to: bridge/alex-garage-1/down/# with QoS 1
Publisher (Service):
- Connects via MQTT with mTLS (certificate CN: [email protected])
- Publishes to: bridge/alex-garage-1/down/keys/response with QoS 1
Result:
- Publisher receives PUBACK (message stored in JetStream)
- Subscriber NEVER receives the message
- Consumer shows messages delivered but never acknowledged
4. Same-User Test (Works)
When both publisher and subscriber use the same certificate/user, message delivery works correctly. This rules out permission issues and confirms the bug is specific to cross-user scenarios.