"SOCKET timeouts" causing lockups of an entire device when communicating with backend. #48
Comments
I have some of these NB boards too and have the same issue! Would love to know how to fix this. |
Hi, do you need your devices to be permanently connected to the platform? NB devices usually sleep most of the time, then wake up, connect to the internet, and transmit data, especially if they are powered by batteries. Building reliable NB-IOT solutions requires some more engineering tailored to the specific use case, and the general-purpose Arduino library for thinger.io is probably not the best approach here.
|
Hi @alvarolb Thanks for your reply. Yes, we do indeed need to be connected to the platform permanently. We realise the NB device is not a good system to use long term, but we chose it so that we could get our system up and running as fast as possible and iterate quickly from there. The reliability doesn't need to be perfect, but right now we depend on getting our current solution working as well as we can so that we can demo it for an investment round. For that reason we would love to find a viable workaround or solution. The requirements are:
Currently the problem is that the device loses connection so regularly, and needs to be reset by a watchdog timer so many times, that it is impractical and burns extra battery. Could you point me in a rough direction to try and fix this lockup? For example, how could I get it to try again if I get this socket fail error? I haven't tried to monitor the connection with AT+CEREG yet, but I will try that. Could I then easily trigger a reconnect if I detect it has been lost? We tested the library without peripherals a while ago, but will try to test in the same scenario in which we are getting these issues. I have a basic RTOS in place, yes. There are not many tasks, although the GPS task can take up to 100ms. What is the maximum time you would recommend between handle() runs? All of our NB SARA chips get the latest firmware version (at least I think so; it's L0.0.00.00.05.08,A.02.04) before we use them. Thanks in advance! |
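On the AT+CEREG idea: a minimal sketch of how the modem's answer could be interpreted, written as plain C++ so it can be tried off-device. It assumes the reply to the read command `AT+CEREG?` has the standard form `+CEREG: <n>,<stat>[,...]`, where a `<stat>` of 1 means registered on the home network and 5 means registered while roaming. The names `parseCeregStat` and `isRegistered` are hypothetical helpers, not part of the MKRNB or thinger.io libraries, and how the response string is obtained from the modem is deliberately left out.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Extract <stat> from a "+CEREG: <n>,<stat>[,...]" read response.
// Returns -1 if the text does not contain a parsable +CEREG line.
// Note: the unsolicited form "+CEREG: <stat>[,...]" has no <n> field
// and is NOT handled by this sketch.
int parseCeregStat(const std::string& response) {
    const std::string prefix = "+CEREG:";
    const std::size_t pos = response.find(prefix);
    if (pos == std::string::npos) return -1;

    // <stat> is the field right after the first comma.
    const std::size_t comma = response.find(',', pos + prefix.size());
    if (comma == std::string::npos) return -1;

    std::size_t i = comma + 1;
    while (i < response.size() && response[i] == ' ') ++i;
    if (i >= response.size() ||
        !std::isdigit(static_cast<unsigned char>(response[i]))) {
        return -1;
    }

    int stat = 0;
    while (i < response.size() &&
           std::isdigit(static_cast<unsigned char>(response[i]))) {
        stat = stat * 10 + (response[i] - '0');
        ++i;
    }
    return stat;
}

// 1 = registered (home network), 5 = registered (roaming).
bool isRegistered(int stat) {
    return stat == 1 || stat == 5;
}
```

A task could poll this periodically and start a reconnect when `isRegistered` turns false, rather than waiting for the watchdog to fire.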
Please review the firmware, as I think the latest is 05.12. I have read about many issues regarding the MKRNB1500 stability, especially when the modem hangs. In the meantime, I have released a new Arduino Library 2.26.0 to try to improve the connection stability. It has not been tested properly, so try it and let me know if it improves anything.
100ms will not be a problem. You can call it at practically any rate under a minute, but longer intervals will make the device less responsive to API requests; for example, if you call it every 5 seconds, you can expect a 5-second delay when calling a device function. |
Hi @alvarolb Thanks for the info. We have tried upgrading the firmware to 05.12 (it was a mission), but it does not solve the issue. The new version of which library exactly? How do I find it? Thanks! |
Hi, I released a new Arduino library for Thinger.io with version 2.26.0. Update it via Arduino IDE. |
Hi @alvarolb Just to update you, we have updated to the latest library version and we are still getting the same errors. Is there anything or anywhere you could point us to so that we could try to get to the bottom of this issue ourselves? Thanks in advance. Chris |
I have an MKRNB1500 here and will test it today. Just curious, what is your network provider? |
Just received an MKRNB1500 and have it connected with a basic sketch. I'll update you on its performance. Have you experimented with different SIM cards or antennas? On another note, I've come across some issues related to the MKRNB1500, with numerous customers reporting errors, firmware problems, and hangs. It's concerning that Arduino doesn't seem to maintain or support this hardware, and there are no responses on their forums. At thinger.io, we're using custom NB-IOT hardware based on ESP32 and Quectel BC660K for two different projects. Is there a specific reason you need to use the MKRNB1500? Perhaps we could explore alternative options. |
Hi @alvarolb Our network provider is KPN here in The Netherlands. We experimented with Tele2 but found KPN to be more reliable. We haven't experimented with antennas yet. Is there anything you would recommend? We are aware of the issues with the MKRNB: we also have problems with the device locking up, and we have built circuitry into our device to perform a hard reset of the SARA module when we detect this, which seems to fix it. For the problem outlined in this thread, though, I am reasonably certain that it is a software issue on the Arduino side (and I think in the thinger library), as it is fixed by just a software reset of the Arduino. We chose the MKRNB systems for their speed of development for the particular prototypes we are building. We need these to work until the end of October so we can get investment, and then we will look for more reliable alternatives, so we would be happy to discuss your solution then. I am curious about the results from your testing with the MKRNB. Chris |
I think it is not a problem with the Thinger.io Arduino library, but a bad implementation in the MKRNB libraries, the ones responsible for talking to the modem via AT commands. You can run your own tests: just create a simple sketch with another protocol, like MQTT, and check how it behaves. Looking at the number of issues on the forums with the MKRNB1500 (from people who are not using thinger.io), I am fairly sure the library is stuck somewhere else, waiting for a response from the modem or something similar. In my first attempt, the MKRNB1500 was connected for 8 hours, then disconnected. I will keep checking it over the next few days. |
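To illustrate the "stuck waiting for a response from the modem" failure mode: a minimal sketch of a wait loop that always gives up after a deadline instead of blocking forever. This is not code from the MKRNB or thinger.io libraries. `waitWithDeadline` and `WaitResult` are hypothetical names, and the clock is passed in as a callable so the same code can use Arduino's `millis()` on the device or a fake clock in a test.

```cpp
#include <cstdint>

enum class WaitResult { Ready, TimedOut };

// Polls ready() until it returns true or timeoutMs has elapsed.
// nowMs() must return a free-running millisecond counter (for example a
// wrapper around Arduino's millis()). Unsigned subtraction keeps the
// elapsed-time check correct across the 32-bit counter wrap-around.
template <typename ReadyFn, typename ClockFn>
WaitResult waitWithDeadline(ReadyFn ready, ClockFn nowMs,
                            std::uint32_t timeoutMs) {
    const std::uint32_t start = nowMs();
    while (!ready()) {
        const std::uint32_t elapsed =
            static_cast<std::uint32_t>(nowMs() - start);
        if (elapsed >= timeoutMs) {
            return WaitResult::TimedOut;  // caller decides how to recover
        }
    }
    return WaitResult::Ready;
}
```

On a device running an RTOS, the loop body would normally also yield to other tasks; that part is omitted here because it depends on the scheduler in use.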
I am currently using the thinger library on about 20 MKR NB devices that connect over LTE-M or NB-IOT, a number that is about to jump to 80 devices. For that reason I desperately need a solution to the problem I am having.
Basically, there seem to be three scenarios where the thinger library causes my devices to lock up, and the only way to recover them is to use a watchdog timer and reset the devices when a lockup is detected. This has been an OK solution until now; however, it is happening so frequently (once per hour per device on average) that it affects the battery life of my devices, as they need to go through the startup sequence every time.
The three different errors that I get in this scenario are "Writing bytes [FAIL]", "[_SOCKET] cannot read from socket!" and "[_SOCKET] Timeout!". All cause my device to lock up indefinitely. Screenshots are attached.
The SOCKET timeouts seem to happen more frequently on some afternoons. At those times there seem to be more people in our office building and potentially in the buildings around us (more devices connecting to the network?).
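One way to avoid going straight to a watchdog reset on every such error is to count consecutive failures and escalate in steps. The sketch below is only an illustration under that assumption: `FailureEscalator`, `Recovery` and the thresholds are hypothetical, and the code that actually reopens the socket or resets the modem is hardware-specific and not shown.

```cpp
// Decides what to do after each failed socket operation.
enum class Recovery { RetryLater, ReopenSocket, ResetModem };

class FailureEscalator {
public:
    // reopenAfter: consecutive failures before reopening the socket.
    // resetAfter:  consecutive failures before resetting the modem.
    FailureEscalator(unsigned reopenAfter, unsigned resetAfter)
        : reopenAfter_(reopenAfter), resetAfter_(resetAfter) {}

    // Call after any successful read or write.
    void onSuccess() { failures_ = 0; }

    // Call after a write failure, read failure or socket timeout.
    Recovery onFailure() {
        ++failures_;
        if (failures_ >= resetAfter_) {
            failures_ = 0;  // start counting again after the reset
            return Recovery::ResetModem;
        }
        if (failures_ >= reopenAfter_) {
            return Recovery::ReopenSocket;
        }
        return Recovery::RetryLater;
    }

    unsigned failures() const { return failures_; }

private:
    unsigned reopenAfter_;
    unsigned resetAfter_;
    unsigned failures_ = 0;
};
```

With thresholds of, say, 2 and 4, a single timeout only causes a retry, repeated timeouts reopen the socket, and the modem reset is kept as the last step, with the watchdog remaining as a final safety net.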
It has been a problem the whole time I have been using this library with this device but I solved it temporarily with a watchdog reset.
Someone else also seems to have had a similar issue when using the GSM version of the MKR: https://community.thinger.io/t/mkr-gsm-1400-losing-connection-to-thinger-io/2991
Can someone help with this issue ASAP as it is causing us a lot of downstream problems with our product.
Thanks!