OSError: Connection Unstable #161

Open
sgbaird opened this issue Jan 8, 2025 · 1 comment
sgbaird commented Jan 8, 2025

Someone reported the following to me after running the example from https://ac-microcourses.readthedocs.io/en/latest/courses/hello-world/1.4-hardware-software-communication.html on a Raspberry Pi Pico W.

Obtaining CA Certificate
Checking WiFi integrity.
Traceback (most recent call last):
  File "", line 119, in
  File "asyncio/core.py", line 1, in run
  File "asyncio/core.py", line 1, in run_until_complete
  File "asyncio/core.py", line 1, in run_until_complete
  File "", line 102, in main
  File "mqtt_as.py", line 654, in connect
  File "mqtt_as.py", line 646, in wifi_connect
OSError: Connection Unstable

my_secrets.py

SSID = "Enter your SSID here"
PASSWORD = "Enter your WiFi password here"
HIVEMQ_USERNAME = "sgbaird"
HIVEMQ_PASSWORD = "D.Pq5gYtejYbU#L"
HIVEMQ_HOST = "248cc294c37642359297f75b7b023374.s2.eu.hivemq.cloud"

mqtt_led.py

from mqtt_as import MQTTClient, config
from machine import Pin, ADC
import asyncio
from netman import connectWiFi
import ssl
import ntptime
from time import time

from my_secrets import (
    HIVEMQ_HOST,
    HIVEMQ_PASSWORD,
    HIVEMQ_USERNAME,
    PASSWORD,
    SSID,
)

connectWiFi(SSID, PASSWORD, country="US")

# usually would be a device-specific ID, but using course ID for now
COURSE_ID = "<your_id_here>"  # UPDATE THIS TO YOUR ID

# To validate certificates, a valid time is required
ntptime.timeout = 30  # type: ignore
ntptime.host = "pool.ntp.org"
ntptime.settime()

print("Obtaining CA Certificate")
# generated via https://colab.research.google.com/github/sparks-baird/self-driving-lab-demo/blob/main/notebooks/7.2.1-hivemq-openssl-certificate.ipynb # noqa: E501
with open("hivemq-com-chain.der", "rb") as f:
    cacert = f.read()  # the with block closes the file on exit

# Local configuration
config.update(
    {
        "ssid": SSID,
        "wifi_pw": PASSWORD,
        "server": HIVEMQ_HOST,
        "user": HIVEMQ_USERNAME,
        "password": HIVEMQ_PASSWORD,
        "ssl": True,
        "ssl_params": {
            "server_side": False,
            "key": None,
            "cert": None,
            "cert_reqs": ssl.CERT_REQUIRED,
            "cadata": cacert,
            "server_hostname": HIVEMQ_HOST,
        },
        "keepalive": 3600,
    }
)

onboard_led = Pin("LED", Pin.OUT)  # Pico W is slightly different than Pico

command_topic = f"{COURSE_ID}/onboard_led"
sensor_data_topic = f"{COURSE_ID}/onboard_temp"

adcpin = 4
sensor = ADC(adcpin)


def ReadTemperature():
    adc_value = sensor.read_u16()
    volt = (3.3 / 65535) * adc_value
    temperature = 27 - (volt - 0.706) / 0.001721
    # internal temp sensor has low precision, so round to 1 decimal place
    return round(temperature, 1)


async def messages(client):  # Respond to incoming messages
    async for topic, msg, retained in client.queue:
        try:
            topic = topic.decode()
            msg = msg.decode()
            retained = str(retained)
            print((topic, msg, retained))

            if topic == command_topic:
                if msg == "on":
                    onboard_led.on()
                elif msg == "off":
                    onboard_led.off()
                elif msg == "toggle":
                    onboard_led.toggle()
                temperature = ReadTemperature()
                print(f"Publish {temperature} to {sensor_data_topic}")
                # If WiFi is down the following will pause for the duration.
                await client.publish(sensor_data_topic, f"{temperature}", qos=1)
        except Exception as e:
            print(e)


async def up(client):  # Respond to connectivity being (re)established
    while True:
        await client.up.wait()  # Wait on an Event
        client.up.clear()
        await client.subscribe(command_topic, 1)  # renew subscriptions


async def main(client):
    await client.connect()
    for coroutine in (up, messages):
        asyncio.create_task(coroutine(client))

    start_time = time()
    # must have the while True loop to keep the program running
    while True:
        await asyncio.sleep(5)
        elapsed_time = round(time() - start_time)
        print(f"Elapsed: {elapsed_time}s")


config["queue_len"] = 2  # Use event interface with specified queue length
MQTTClient.DEBUG = True  # Optional: print diagnostic messages
client = MQTTClient(config)
del cacert  # to free memory
try:
    asyncio.run(main(client))
finally:
    client.close()  # Prevent LmacRxBlk:1 errors

Expected output:

MPY: soft reboot
MAC address: <...>
connected
ip = <...>
Obtaining CA Certificate
Checking WiFi integrity.
Got reliable connection
Connecting to broker.
Connected to broker.
Elapsed: 10s
Elapsed: 20s
RAM free 119776 alloc 57504
Elapsed: 30s
...

They mentioned they also tried with another WiFi (phone hotspot).

Any ideas or hints here? I don't think I've seen this one before. Interestingly, umqtt.simple was working fine for them in a separate example (https://github.com/sparks-baird/self-driving-lab-demo/blob/cadf5cd8f5ccaff3394afa1791b9eac55fed94d0/src/public_mqtt_sdl_demo/main.py#L278).

The error seems to be raised here: https://github.com/peterhinch/micropython-mqtt/blob/94e4814aace5a1ea1fa4f500c71133fa03342ae2/mqtt_as/__init__.py#L779-788

@peterhinch
Owner

This is indicative of marginal signal strength. When mqtt_as initially connects to WiFi it repeatedly verifies connectivity for about five seconds. If connectivity is lost in this period, the reported error is thrown. If the test passes, the module goes on to establish a link with the broker. From that point reconnections are handled automatically.
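The check described above can be pictured roughly as follows. This is a minimal sketch for illustration, not the mqtt_as source: `check_stable`, `FakeStation`, and the parameters `checks` and `interval` are hypothetical names, and the only thing assumed of the station object is an `isconnected()` method like the one on MicroPython's `network.WLAN`.

```python
import asyncio


class FakeStation:
    """Hypothetical stand-in for a WLAN interface, for desktop testing only."""

    def __init__(self, states):
        # states: successive results of isconnected(); the last one repeats
        self._states = list(states)

    def isconnected(self):
        if len(self._states) > 1:
            return self._states.pop(0)
        return self._states[0]


async def check_stable(sta, checks=5, interval=1.0):
    """Poll the link `checks` times, `interval` seconds apart.

    Raises OSError if the link is found down at any poll.
    """
    for _ in range(checks):
        if not sta.isconnected():
            raise OSError("Connection Unstable")
        await asyncio.sleep(interval)


async def demo():
    # A link that stays up passes the check.
    await check_stable(FakeStation([True]), interval=0.01)
    # A link that drops on the third poll fails it.
    try:
        await check_stable(FakeStation([True, True, False]), interval=0.01)
    except OSError as e:
        return str(e)
    return "stable"


result = asyncio.run(demo())
print(result)
```

With a marginal signal, a single dropped poll inside the window is enough to raise the error, which would be consistent with the traceback stopping right after "Checking WiFi integrity."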

The main purpose of the test is to cater for clients which can move around. If a client moves out of range the link fails. If it moves back to a point where connectivity is marginal, the reconnection attempt is aborted and re-tried. The aim is to delay reconnection until WiFi is reasonably stable.

In the case of initial connection the policy is to abandon if any error occurs. This is because errors on initial connection (rather than reconnection) usually require user intervention (password errors and suchlike). It is, of course, possible for the application to trap the exception.
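One way an application might trap the exception is to wrap the initial connection in a bounded retry loop. The following is a sketch under assumptions, not a recommendation from the library: `connect_with_retry`, its `attempts` and `delay` parameters, and `FlakyClient` are hypothetical, and the only thing assumed of the client is an awaitable `connect()` that raises `OSError` on failure, as the traceback above shows. Whether a client instance can be reused after a failed connect should be checked against the mqtt_as docs.

```python
import asyncio


async def connect_with_retry(client, attempts=5, delay=10):
    """Call client.connect() until it succeeds or attempts run out.

    Returns the number of attempts used; re-raises the last OSError.
    """
    for attempt in range(1, attempts + 1):
        try:
            await client.connect()
            return attempt
        except OSError as e:
            print(f"Connect attempt {attempt} failed: {e}")
            if attempt == attempts:
                raise
            await asyncio.sleep(delay)


class FlakyClient:
    """Hypothetical stand-in for an MQTT client; fails a set number of times."""

    def __init__(self, failures):
        self.failures = failures

    async def connect(self):
        if self.failures > 0:
            self.failures -= 1
            raise OSError("Connection Unstable")


# Two failures followed by a success: the third attempt connects.
attempts_used = asyncio.run(connect_with_retry(FlakyClient(2), delay=0.01))
print(attempts_used)
```

In the script from the issue, a helper like this would stand in for the bare `await client.connect()` at the top of `main`. Retrying only helps with transient faults such as a marginal signal; a wrong password will fail on every attempt.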

Initialising on a marginal WiFi network is something of a special case. It is possible to skip the test (see docs), but that option exists to save time when the WiFi is known to be good. I don't recommend it in your situation.
