Error: Core 1 panic'ed (Unhandled debug exception) #6010
update: Guru Meditation Error: Core 1 panic'ed (Unhandled debug exception) ELF file SHA256: 0000000000000000 Backtrace: 0x400d3254:0x3ffb0030 0x400d3821:0x3ffb0170 0x400d3821:0x3ffb02b0 0x400d3821:0x3ffb03f0 0x400d3821:0x3ffb0530 0x400d3821:0x3ffb0670 0x400d3821:0x3ffb07b0 0x400d3821:0x3ffb08f0 0x400d3821:0x3ffb0a30 0x400d3821:0x3ffb0b70 0x400d3821:0x3ffb0cb0 0x400d3821:0x3ffb0df0 0x400d3821:0x3ffb0f30 0x400d3821:0x3ffb1070 0x400d3821:0x3ffb11b0 0x400d3821:0x3ffb12f0 0x400d3821:0x3ffb1430 0x400d3821:0x3ffb1570 0x400d3821:0x3ffb16b0 0x400d3821:0x3ffb17f0 0x400d3821:0x3ffb1930 0x400d3821:0x3ffb1a70 0x400d3821:0x3ffb1bb0 0x400d4968:0x3ffb1cf0 0x400d6867:0x3ffb1e30 0x400d71d0:0x3ffb1ef0 0x400e3559:0x3ffb1fb0 0x4008a1fe:0x3ffb1fd0 Rebooting... |
@dsyleixa Some nice explanation about possible reasons for this error can be found at this link: Good Luck! |
This tool will help you in debugging this issue: You can decode the backtrace message and find out where the exception was thrown. |
I did this, but I don't understand what to c+p into the field and how to proceed 0x400d3254:0x3ffb0030 0x400d3821:0x3ffb0170 0x400d3821:0x3ffb02b0 0x400d3821:0x3ffb03f0 0x400d3821:0x3ffb0530 0x400d3821:0x3ffb0670 0x400d3821:0x3ffb07b0 0x400d3821:0x3ffb08f0 0x400d3821:0x3ffb0a30 0x400d3821:0x3ffb0b70 0x400d3821:0x3ffb0cb0 0x400d3821:0x3ffb0df0 0x400d3821:0x3ffb0f30 0x400d3821:0x3ffb1070 0x400d3821:0x3ffb11b0 0x400d3821:0x3ffb12f0 0x400d3821:0x3ffb1430 0x400d3821:0x3ffb1570 0x400d3821:0x3ffb16b0 0x400d3821:0x3ffb17f0 0x400d3821:0x3ffb1930 0x400d3821:0x3ffb1a70 0x400d3821:0x3ffb1bb0 0x400d4968:0x3ffb1cf0 0x400d6867:0x3ffb1e30 0x400d71d0:0x3ffb1ef0 0x400e3559:0x3ffb1fb0 0x4008a1fe:0x3ffb1fd0 then nothing happens. |
PS, just to mention, |
@dsyleixa This has nothing to do with the task scheduler, this is entirely on using more than 8kb of stack space in |
Hmmmh... Apart from Chess(), the entire program runs fine on the ESP32 through setup() and loop() and may also call other subprograms such as Paint() or Pong() without any issues. So all runs fine >>>>>>>> until Chess() is run. But the first couple of chess moves are always fine too, so Chess() does not violate the RAM size limit when starting either. Furthermore, for Chess() the error does not always happen and is not reproducible. OTOH, I have meanwhile tested the Chess subprogram on my Mega2560 too (because it has less RAM than the Due or Raspi), and over there it always runs fine as well. When compiling, the IDE says:
I am completely at a loss, tbh... |
IDE output of memory usage is not applicable to task stack sizing. It is only applicable to global variable allocations (ones that you don't create via new/malloc/etc) and to the overall size of the program with respect to the partition size. Comparing ESP32 to an AVR Mega2560 is not a good argument for "it works": they are entirely different architectures AND the Mega2560 does not use task stacks but instead allocates on heap directly, which is not applicable in an RTOS environment. Since you have not shared much in the way of code, nobody will be able to point out where your program is going awry other than general ideas like the ones @SuGlider and I have posted. |
I actually already shared the code above, in the TOP: |
My general guess is a stack overflow caused by potential Chess recursion. The main difference between ESP32 Arduino and Arduino on other chips is that on the ESP32 everything runs under FreeRTOS, thus, as @atanisoft said, For the other chips, Arduino is built as a pure bare-metal application and the stack can possibly reach higher limits in available RAM, depending on the way it was built and configured. So it could explain why you don't see any errors with other "Chip Arduinos". From the link I posted there is a general explanation:
int count(int i) {
  i--;
  if (i > 0) {
    Serial.println(count(i));  // recursive call: each level adds another stack frame
  }
  return i;
}
void loop() {
  count(8000);  // 8000 nested calls, deep enough to exhaust a small task stack
}
Each time a function recurses, its return address and its arguments and local variables are all stored on the stack. If it recurses too many times it will use more storage than is allocated to the stack. |
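The relationship between frame size and recursion depth can be put into numbers. The following is only a rough sketch: `maxRecursionDepth` is a hypothetical helper, not part of any Arduino or ESP-IDF API, and the numbers in the usage note (a 320-byte frame, a 1024-byte reserve for everything outside the recursion) are assumptions for illustration.

```cpp
#include <cstddef>

// Hypothetical helper (not part of any Arduino or ESP-IDF API):
// estimates how many nested calls fit into a task stack.
//   stackBytes    - total size of the task stack
//   reservedBytes - assumed share used by everything outside the recursion
//   frameBytes    - stack consumed by one level of recursion
constexpr std::size_t maxRecursionDepth(std::size_t stackBytes,
                                        std::size_t reservedBytes,
                                        std::size_t frameBytes) {
    return (frameBytes == 0 || stackBytes <= reservedBytes)
               ? 0
               : (stackBytes - reservedBytes) / frameBytes;
}
```

With these assumed numbers, `maxRecursionDepth(8192, 1024, 320)` gives 22 levels for an 8 KB stack and `maxRecursionDepth(16384, 1024, 320)` gives 48 levels for a 16 KB stack.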
well, as already stated, if it was a RAM issue then it would be expected to happen reproducibly, always at the same point, but it does not! sometimes the move generator crashes already at the 2nd or 5th recursive ply after ~50000 move computations or even less,
|
update:
Nonetheless, after pasting the backtrace into the Exception Decoder, still nothing happens at all... you may check a stripped-down standalone version here (no TFT hardware etc): BTW, is it possible to disable this eff*** "Guru"...? |
Suffice to say, recursion within tasks is not an easy problem to solve. It's not really a task that is designed to run on an embedded RTOS platform entirely. However, as noted previously, you can create a task with a larger stack size to run your recursion process.
This is coming from the pre-built ESP-IDF code with the default setting of |
I do not run a recursion in a task. But that eff*** Guru error happens at either program - nonetheless, never any runtime errors e.g. on my MEGA or my DUE. |
Both https://github.com/dsyleixa/Arduino/blob/master/Chess/chess0048e32/chess0048e32.ino#L9 describes using recursion as part of its algorithm. Recursion points:
Each level of recursion depth will use at least 175 bytes of stack plus any additional required for making function calls. At some point in the recursion depth it will fail, as you have found.
AVR doesn't use RTOS and doesn't have the same concept of task stack. It uses all free heap/SRAM for the recursion usage, very likely at a certain depth of recursion it will start randomly overwriting areas of SRAM or perhaps simply crash. |
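The per-level stack cost can be cross-checked against the backtrace posted in this issue. Each backtrace entry is a `PC:SP` pair, so the difference between consecutive SP values approximates the size of one stack frame. A minimal sketch, assuming the SP values are copied by hand into a vector, innermost frame first (`FrameStats` and `summarizeFrames` are hypothetical names):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Summary of the stack frames implied by a list of stack pointer values.
struct FrameStats {
    std::uint32_t totalBytes;    // span from the innermost to the outermost SP
    std::uint32_t largestFrame;  // biggest gap between two consecutive SPs
};

// sp must be ordered as printed in the backtrace: innermost frame first.
// The stack grows downwards, so the values are expected to be ascending.
FrameStats summarizeFrames(const std::vector<std::uint32_t>& sp) {
    FrameStats stats{0, 0};
    for (std::size_t i = 1; i < sp.size(); ++i) {
        assert(sp[i] >= sp[i - 1]);
        const std::uint32_t frame = sp[i] - sp[i - 1];
        stats.totalBytes += frame;
        if (frame > stats.largestFrame) {
            stats.largestFrame = frame;
        }
    }
    return stats;
}
```

In the backtrace from this issue, the repeated `0x400d3821` entries are 0x140 = 320 bytes apart, and the span from the first SP (`0x3ffb0030`) to the last (`0x3ffb1fd0`) is 0x1fa0 = 8096 bytes, which is nearly all of an 8 KB (8192-byte) stack.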
oh, I expected both setup() and loop() are just parts of main(), just like in Arduino, for RTOS then running in the same main() task, with access to the entire RAM :
anyway, recursion is mandatory, and as I do this |
https://github.com/espressif/arduino-esp32/blob/master/cores/esp32/main.cpp#L51 is the entrypoint from ESP-IDF (which boots up prior to the app starting). https://github.com/espressif/arduino-esp32/blob/master/cores/esp32/main.cpp#L67 shows where You can also call |
I would need to code the available memory size in my program, not by patching the ESP API. |
You can call the same APIs used in the links above from your code without altering the arduino-esp32 code. |
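As a rough illustration of calling those APIs from a sketch, the recursive work can be moved into its own FreeRTOS task created with a larger stack. This is a sketch under assumptions, not the project's recommended pattern: `runChess`, `chessTask`, `startChessTask` and the 32 KB figure are hypothetical, while `xTaskCreatePinnedToCore`, `uxTaskGetStackHighWaterMark` and `vTaskDelete` are the FreeRTOS calls shipped with ESP-IDF. The `ARDUINO_ARCH_ESP32` guard keeps the same file buildable for boards without FreeRTOS.

```cpp
#include <cstdint>

// Assumed stack budget for the recursive search, in bytes (illustrative value).
constexpr std::uint32_t kChessStackBytes = 32 * 1024;

// Placeholder for the sketch's recursive chess routine (hypothetical name).
void runChess() { /* recursive search goes here */ }

#if defined(ARDUINO_ARCH_ESP32)
#include <Arduino.h>

// Task body: runs the search on its own, larger stack.
static void chessTask(void* /*param*/) {
    runChess();
    // Smallest amount of free stack this task has had since it started.
    Serial.printf("chess task stack headroom: %u\n",
                  (unsigned)uxTaskGetStackHighWaterMark(NULL));
    vTaskDelete(NULL);  // a FreeRTOS task function must not return
}

void startChessTask() {
    // Priority 1 and core 1 mirror the loop task settings (assumption).
    xTaskCreatePinnedToCore(chessTask, "chess", kChessStackBytes,
                            NULL, 1, NULL, 1);
}
#else
// Boards without FreeRTOS: call the routine directly.
void startChessTask() { runChess(); }
#endif
```

On the ESP32, `startChessTask()` returns immediately and the search runs concurrently with `loop()`, so any shared state (display, board array) would need coordination between the two tasks.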
I have no clue how to do that, I am just used to programming by the common original Arduino API methods. updated demo code, chess() running also in setup() now: https://github.com/dsyleixa/Arduino/tree/master/Chess/chess0049e32 |
if I put at the end of setup(): or is it better to put all code into setup() and clear all in loop()? |
@me-no-dev @SuGlider It seems like overriding the default value of stack size for the main task (without having to fall back to arduino-as-IDF-component) could be useful. What do you think about adding a simple way for the user to adjust the main task stack size, something along these lines:
/* in arduino-esp32 main.cpp: */
__attribute__((weak)) size_t getArduinoLoopTaskStackSize(void) {
return ARDUINO_LOOP_STACK_SIZE;
}
/* later... */
xTaskCreateUniversal(loopTask, "loopTask", getArduinoLoopTaskStackSize(), NULL, 1, &loopTaskHandle, ARDUINO_RUNNING_CORE);
/* in Arduino.h */
#define ESP_LOOP_TASK_STACK_SIZE(sz) \
size_t getArduinoLoopTaskStackSize(void) { \
return sz; \
}
/* in sketch code */
#include <Arduino.h>
ESP_LOOP_TASK_STACK_SIZE(16384);
void setup() { }
void loop() { }
Edit: alternatively, as a more general solution, we could consider a user-provided "build options" header file |
@SuGlider let's align this with @pedrominatel and have it documented as well |
Thanks guys for your interest in this topic and for finding a fix. Perhaps allow me to propose a solution: I would tend to define the stack size within the threads myself at the beginning, similar to setting a thread priority, e.g. via We now may close it or keep it open until there is a fix, as you wish. |
Unfortunately there is no such function in FreeRTOS, the only time the stack size can be set is during creation. The solution which @igrr has proposed (weak function you can override) is likely the best option as it will work with both IDF+Arduino and Arduino (standalone). |
reworked my basic chess code, both by the identical algorithm, 2 UI versions: first observations: the Raspi Xterminal console output for the 1st move, after WHITE manual move d2d4,
whilst the Serial console output of the ESP32 is:
...that is really puzzling and IMO that might be a reason for massive unexpected RAM consumptions.... :?: |
## Summary
Arduino `setup()` and `loop()` run under a Task with a fixed Stack size of 8KB. Users may want to change this size. This PR adds this possibility by just adding a line of code, as for example:
```cpp
ESP_LOOP_TASK_STACK_SIZE(16384);
void setup() { }
void loop() { }
```
## Impact
None. It adds a new functionality to ESP32 Arduino. If `ESP_LOOP_TASK_STACK_SIZE(newSize);` is not declared/used, it will compile the sketch with the default stack size of 8KB.
## Related links
fix #6010
#6010 (comment)
Thanks @igrr for the suggestion!
I actually doubt that the TOP issue is only caused by too little STACK (edited). If it was, then the program wouldn't behave so extremely different from the same program running on a RaspberryPi. |
It's caused by stack exhaustion in the
It has nothing to do with free RAM and everything to do with stack. Comparing to a linux host (rPi) is not a fair comparison since they don't operate in the same fashion. |
I have to disagree as the programs (move generator, Negamax) are totally identical now: |
you are free to disagree but the backtrace does not agree with you. |
I don't have a backtrace anymore. |
@dsyleixa I'd recommend checking the code again, it seems that it relies on some undefined behaviors, so its execution is not very predictable. On Linux (compiled with (output)
Integer overflow is also reported on ESP32, if we add (output)
Besides, compiling this code on Linux with (compiler output)
In general, if you see that a certain non-platform-specific piece of code works differently on Linux and on a microcontroller, first try to make sure it compiles and works correctly on Linux with If you narrow the issue down to a small fragment of code (MCVE) which still works differently on Linux and ESP while passing compiler and sanitizer checks, please post that fragment of code here, we will try to help you figure out the issue. |
On Raspi and ESP32 and original Arduino it's always compiled by gcc, and operator precedence for C/C++ hasn't changed since C99 or even before. |
as to signed int overflow: I don't see any signed ints in my code, just int and int32_t (CMIIW) |
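For reference, `int` and `int32_t` are both signed types in C and C++, so overflowing them is undefined behavior. A small sketch of a checked addition, assuming a GCC or Clang toolchain: `__builtin_add_overflow` is a compiler-specific builtin, and `checkedAdd` is a hypothetical helper name that does not come from the chess code.

```cpp
#include <cstdint>
#include <type_traits>

// Both checks pass at compile time: plain int and int32_t are signed.
static_assert(std::is_signed<int>::value, "int is a signed type");
static_assert(std::is_signed<std::int32_t>::value, "int32_t is a signed type");

// Adds a and b and stores the sum in *out.
// Returns false when the mathematically correct sum does not fit into
// int32_t; in that case *out holds a wrapped value and should be ignored.
bool checkedAdd(std::int32_t a, std::int32_t b, std::int32_t* out) {
    return !__builtin_add_overflow(a, b, out);
}
```

A search score that is accumulated through `checkedAdd` can be clamped or rejected at the point where it would overflow, instead of silently producing a platform-dependent value.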
as the error happens in code with multiple recursions (which are always computed in an identical sequence, though) and with admittedly multiple recursive stack allocations (correct idiom?), this code unfortunately cannot be shrunk down. |
FWIW, your Raspberry Pi version of the code produces same result for me on the ESP32 as it does on Linux. I only had to replace the platform-dependent |
That is amazing, thank you very much for your contributions! These results have finally dispelled all of my concerns. I would never have considered that randomly initializing a hashtable for known positions could lead to these different results. |
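One way to keep such a table identical across platforms is to fill it from a small generator with a fixed seed, rather than from a platform-dependent random function. A minimal sketch, assuming the hash keys are plain 32-bit values: `makeHashKeys` and `kNumKeys` are hypothetical and not taken from the chess code in this issue, and the generator is the well-known 32-bit xorshift (shifts 13, 17, 5), chosen here because it needs only four bytes of state.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Number of keys in the table (illustrative value).
constexpr std::size_t kNumKeys = 64;

// Fills a table with pseudo-random keys that depend only on the seed,
// so every platform computes exactly the same table.
std::array<std::uint32_t, kNumKeys> makeHashKeys(std::uint32_t seed) {
    // The xorshift state must never be zero, so replace a zero seed.
    std::uint32_t x = (seed != 0) ? seed : 0x9E3779B9u;
    std::array<std::uint32_t, kNumKeys> keys{};
    for (auto& k : keys) {
        x ^= x << 13;
        x ^= x >> 17;
        x ^= x << 5;
        k = x;
    }
    return keys;
}
```

Calling `makeHashKeys` with the same seed on Linux and on the ESP32 yields the same keys, which removes one source of differing search results when comparing the two builds.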
Note that you can also specify a larger stack size for the main loop task for ESP32. If you're using the Arduino Espressif library (as opposed to native ESP32 code), before your setup() in main.cpp, you can add the following to double the stack size for the main loop task from 8KB to 16KB:
Within Arduino.h, there is a macro setup that will use the value you give it above:
|
Arduino IDE 1.8.9
ESP32 board 1.0.6 (edit; meanwhile updated to 2.0.1)
default settings
generally my program runs fine, but sometimes, unexpectedly, I get this error -
but why and what does that mean....?
Guru Meditation Error: Core 1 panic'ed (Unhandled debug exception)
Debug exception reason: Stack canary watchpoint triggered (loopTask)
Core 1 register dump:
PC : 0x400d3254 PS : 0x00060636 A0 : 0x800d3824 A1 : 0x3ffb0030
A2 : 0x0000000e A3 : 0x3ffcc4cc A4 : 0xfffffe77 A5 : 0x00000080
A6 : 0x00000000 A7 : 0x00000001 A8 : 0x3ffc1c24 A9 : 0x00000008
A10 : 0xfffffce1 A11 : 0x00000002 A12 : 0x00000002 A13 : 0x00000353
A14 : 0x0000000a A15 : 0x3ffb08d0 SAR : 0x00000011 EXCCAUSE: 0x00000001
EXCVADDR: 0x00000000 LBEG : 0x400014fd LEND : 0x4000150d LCOUNT : 0xfffffff8
ELF file SHA256: 0000000000000000
Backtrace: 0x400d3254:0x3ffb0030 0x400d3821:0x3ffb0170 0x400d3821:0x3ffb02b0 0x400d3821:0x3ffb03f0 0x400d3821:0x3ffb0530 0x400d3821:0x3ffb0670 0x400d3821:0x3ffb07b0 0x400d3821:0x3ffb08f0 0x400d3821:0x3ffb0a30 0x400d3821:0x3ffb0b70 0x400d3821:0x3ffb0cb0 0x400d3821:0x3ffb0df0 0x400d3821:0x3ffb0f30 0x400d3821:0x3ffb1070 0x400d3821:0x3ffb11b0 0x400d3821:0x3ffb12f0 0x400d3821:0x3ffb1430 0x400d3821:0x3ffb1570 0x400d3821:0x3ffb16b0 0x400d3821:0x3ffb17f0 0x400d3821:0x3ffb1930 0x400d3821:0x3ffb1a70 0x400d3821:0x3ffb1bb0 0x400d493c:0x3ffb1cf0 0x400d683b:0x3ffb1e30 0x400d71a4:0x3ffb1ef0 0x400e352d:0x3ffb1fb0 0x4008a1fe:0x3ffb1fd0
Rebooting...
the program is this one:
https://github.com/dsyleixa/Arduino/tree/master/ESP32_GBox/ESP32_Box023
the error happens sometimes when running the "chess" subroutine.
just to mention:
the chess program (i.e., the move generator) is the same as for my Arduino Due and for my Raspberry Pi, and there it works absolutely fine without any problem ever. So IMO the issue here on my ESP32 is probably not related to the chess algorithm itself as far as I can see.
PS, to clarify:
sometimes the move generator crashes already at the 2nd or 5th recursive ply after ~50000 move computations or even less,
sometimes it runs fine through the 7th recursive ply with more than 1 or 2 million move computations and returns a valid and smart move, e.g.: