[Bug]: M595: crashes for very large input length #939

Closed
2 of 25 tasks
timschneider opened this issue Jan 3, 2024 · 3 comments

@timschneider

Duet Forum Discussion Thread

https://forum.duet3d.com/topic/34266/bug-m595-crashes-for-very-large-input-length

Which Duet products are you using?

  • Duet2-Wifi
  • Duet2-Ethernet
  • Duet Expansion Breakout Board
  • Duex2
  • Duex5
  • Duet2-Maestro
  • Maestro Dual Driver Expansion
  • Duet3-6HC
  • Duet3-3HC
  • Duet3-1XD
  • Duet3-1LC
  • Duet3-Tool Distribution Board
  • Duet3-Mini5+
  • Duet3-Mini2+
  • Raspberry Pi or other SBC
  • SmartEffector
  • Magnetic Filament Sensor
  • Laser Filament Sensor
  • PT100 Daughterboard
  • Thermocouple Daughterboard
  • PanelDue
  • Other
  • None

Firmware Version

RepRapFirmware 3.5.0-rc.2

Duet Web Control Version

DWC 3.5.0-rc.2

Are you using a Single Board Computer (RaspberryPi) with your Duet?

  • Yes I use a SBC.
  • No I do not use a SBC.

Please upload the results of sending M122 in the gcode console.

M122
=== Diagnostics ===
RepRapFirmware for Duet 3 MB6HC version 3.5.0-rc.2 (2023-12-14 10:32:22) running on Duet 3 MB6HC v1.02 or later (SBC mode)
Board ID: 08DJM-9P63L-DJ3T8-6JKD4-3SJ6K-9A77A
Used output buffers: 1 of 40 (17 max)
=== RTOS ===
Static ram: 154844
Dynamic ram: 87544 of which 4776 recycled
Never used RAM 95972, free system stack 206 words
Tasks: SBC(2,ready,0.9%,427) HEAT(3,nWait,0.0%,332) Move(4,nWait,0.0%,339) CanReceiv(6,nWait,0.0%,942) CanSender(5,nWait,0.0%,334) CanClock(7,delaying,0.0%,343) TMC(4,nWait,7.7%,61) MAIN(2,running,89.9%,103) IDLE(0,ready,1.5%,30), total 100.0%
Owned mutexes: HTTP(MAIN)
=== Platform ===
Last reset 00:00:22 ago, cause: software
Last software reset at 2024-01-03 15:51, reason: OutOfMemory, Gcodes spinning, available RAM 428, slot 0
Software reset code 0x41c3 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x00435000 BFAR 0x00000000 SP 0x2041b9a0 Task MAIN Freestk 2054 ok
Stack: 2041808c 00419f61 00000103 2043229c 2043c740 0044659f 00000003 00000000 0000007c 01000003 ffffffff 00000000 00000253 2041bccc 2042bb20 204251b0 204251b0 00471301 00000000 00497caf 0000000a 00000000 00000000 00000000 00000000 00000000 004aea70
Error status: 0x00
MCU temperature: min 28.6, current 29.7, max 29.8
Supply voltage: min 24.1, current 24.1, max 24.2, under voltage events: 0, over voltage events: 0, power good: yes
12V rail voltage: min 11.8, current 12.3, max 12.7, under voltage events: 0
Heap OK, handles allocated/used 99/4, heap memory allocated/used/recyclable 2048/104/0, gc cycles 0
Events: 0 queued, 0 completed
Driver 0: standstill, SG min n/a, mspos 8, reads 57710, writes 22 timeouts 0
Driver 1: standstill, SG min n/a, mspos 8, reads 57710, writes 22 timeouts 0
Driver 2: standstill, SG min n/a, mspos 8, reads 57711, writes 21 timeouts 0
Driver 3: standstill, SG min n/a, mspos 8, reads 57711, writes 21 timeouts 0
Driver 4: standstill, SG min n/a, mspos 8, reads 57711, writes 21 timeouts 0
Driver 5: standstill, SG min n/a, mspos 4, reads 57713, writes 19 timeouts 0
Date/time: 2024-01-03 15:51:56
Slowest loop: 1.97ms; fastest: 0.07ms
=== Storage ===
Free file entries: 20
SD card 0 not detected, interface speed: 37.5MBytes/sec
SD card longest read time 0.0ms, write time 0.0ms, max retries 0
=== Move ===
DMs created 125, segments created 0, maxWait 0ms, bed compensation in use: none, height map offset 0.000, max steps late 0, ebfmin 0.00, ebfmax 0.00
no step interrupt scheduled
Moves shaped first try 0, on retry 0, too short 0, wrong shape 0, maybepossible 0
=== DDARing 0 ===
Scheduled moves 0, completed 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== DDARing 1 ===
Scheduled moves 0, completed 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== Heat ===
Bed heaters 0 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1, chamber heaters -1 -1 -1 -1, ordering errs 0
=== GCodes ===
Movement locks held by null, null
HTTP* is doing "M122" in state(s) 0
Telnet is idle in state(s) 0
File is idle in state(s) 0
USB is idle in state(s) 0
Aux is idle in state(s) 0
Trigger* is idle in state(s) 0
Queue is idle in state(s) 0
LCD is idle in state(s) 0
SBC is idle in state(s) 0
Daemon* is idle in state(s) 0 0, running macro
Aux2 is idle in state(s) 0
Autopause is idle in state(s) 0
File2 is idle in state(s) 0
Queue2 is idle in state(s) 0
Q0 segments left 0, axes/extruders owned 0x0000000
Code queue 0 is empty
Q1 segments left 0, axes/extruders owned 0x0000000
Code queue 1 is empty
=== Filament sensors ===
check 0 clear 195994
Extruder 0: pos 2160.00, errs: frame 0 parity 0 ovrun 0 pol 0 ovdue 0
=== CAN ===
Messages queued 196, received 0, lost 0, errs 104050, boc 0
Longest wait 0ms for reply type 0, peak Tx sync delay 0, free buffers 50 (min 50), ts 111/0/0
Tx timeouts 0,0,110,0,0,84 last cancelled message type 30 dest 127
=== SBC interface ===
Transfer state: 5, failed transfers: 0, checksum errors: 0
RX/TX seq numbers: 3276/893
SPI underruns 0, overruns 0
State: 5, disconnects: 0, timeouts: 0 total, 0 by SBC, IAP RAM available 0x258a4
Buffer RX/TX: 0/0-0, open files: 0
=== Duet Control Server ===
Duet Control Server version 3.5.0-rc.2 (2023-12-18 12:42:49)
Failed to deserialize the following properties:
- MoveSegmentation -> Int32 from 2.0
Daemon:
>> Doing macro daemon.g, started by system
Code buffer space: 4096
Configured SPI speed: 8000000Hz, TfrRdy pin glitches: 1
Full transfers per second: 0.01, max time between full transfers: 589.4ms, max pin wait times: 61.0ms/1.5ms
Codes per second: 0.00
Maximum length of RX/TX data transfers: 7244/892

Please upload the content of your config.g file.

Config.g

Please upload the content of any other relevant macro files.

No response

Details specific to your printer.

No response

Links to additional info.

No response

What happened?

Expected result
Setting the queue length to an invalid value should be refused and reported as an error.

Observed result
Warning: Lost connection to Duet (Timeout while waiting for transfer ready pin)

Steps to reproduce
M595 P4294967295

@timschneider timschneider added the bug Bug that has been reproduced label Jan 3, 2024
@dc42
Collaborator

dc42 commented Jan 5, 2024

The problem is that it's difficult to decide which queue lengths are sensible and which are not. A value that is sensible on one system might cause another to run out of memory because it has less free memory than the first. We could allocate additional queue entries until either the request has been fully satisfied or no more memory is available, but in the latter case the system would be likely to run out of memory the next time a configuration command that requires memory is used, or after a few more variables have been created.
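
To make the trade-off described above concrete, here is a minimal C++ sketch of a "grant what fits, but keep some headroom" policy. It is not taken from RepRapFirmware: `grantableEntries`, `kBytesPerEntry` and `kReserveBytes` are hypothetical names and figures chosen for illustration, and the reserve size is exactly the quantity that is hard to choose in general.

```cpp
#include <cstddef>

// Hypothetical figures, chosen only for illustration (not taken from the firmware).
constexpr std::size_t kBytesPerEntry = 400;    // assumed memory cost of one queue entry
constexpr std::size_t kReserveBytes  = 16384;  // assumed headroom kept for later allocations

// Returns how many of the requested entries could be granted
// while leaving the reserve untouched.
constexpr std::size_t grantableEntries(std::size_t requested, std::size_t freeBytes) {
    if (freeBytes <= kReserveBytes) {
        return 0;  // nothing can be granted without eating into the reserve
    }
    const std::size_t fit = (freeBytes - kReserveBytes) / kBytesPerEntry;
    return requested < fit ? requested : fit;
}
```

With the 95972 bytes of free RAM reported in the M122 output and these assumed figures, a request for 1000 entries would be cut down to 198. A smaller reserve grants more entries but leaves less room for later configuration commands or variables.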

@timschneider
Author

Thank you for the detailed reply. I was maybe a little vague about what I would expect from the firmware.

It looks like there is some code that checks whether enough RAM is available, but above a certain value this check no longer rejects the request and RRF just crashes.

M595 P1000000 <- will fail with an error
M595 P2000000000 <- will fail with an error
M595 P2019866849 <- will fail with an error
M595 P2019866850 <- will crash

So it looks like there is an integer overflow in the check.

M595 P2019866849 
Error: M595: insufficient RAM (available 95972, needed 2147483272)

P2019866850 will overflow the needed (int32) value, so the check will wrongly pass and RRF will try to allocate too much memory.
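
The failure mode described here can be sketched in a few lines of C++. This is an illustration only, not the firmware's actual code: the function names and the 400-byte entry size are hypothetical, so it does not reproduce the exact 2019866850 threshold. It shows how a "needed" value that wraps into a negative int32 slips past a comparison, and one way to write the check so that no multiplication can overflow.

```cpp
#include <cstdint>

// Hypothetical per-entry cost in bytes; the real figure depends on the firmware build.
constexpr std::uint32_t kBytesPerEntry = 400;

// Flawed check: the product wraps modulo 2^32 and is then read as a signed
// 32-bit value, so a large enough request appears to need a negative amount
// of memory and passes. (The unsigned-to-signed conversion is modular on
// two's complement targets and guaranteed to be so from C++20.)
constexpr bool flawedHasEnoughRam(std::uint32_t entries, std::int32_t availableBytes) {
    const std::uint32_t wrapped = entries * kBytesPerEntry;        // wraps modulo 2^32
    const std::int32_t needed = static_cast<std::int32_t>(wrapped);
    return needed <= availableBytes;
}

// Safer check: compare the request against a quotient instead of a product,
// so nothing can overflow regardless of the requested count.
constexpr bool safeHasEnoughRam(std::uint32_t entries, std::int32_t availableBytes) {
    if (availableBytes <= 0) {
        return false;
    }
    return entries <= static_cast<std::uint32_t>(availableBytes) / kBytesPerEntry;
}
```

Under these assumptions, `flawedHasEnoughRam(6000000, 95972)` returns true, because 2,400,000,000 bytes does not fit in an int32 and reads as a negative number, while `safeHasEnoughRam` rejects the same request. Both functions reject a request for 1000 entries and accept one for 100.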

@dc42
Collaborator

dc42 commented Jan 8, 2024

Thanks for the clarification. I had forgotten that M595 included a check for available memory. I have fixed this in the 3.5-dev source code.

@dc42 dc42 closed this as completed Jan 8, 2024