From 023fdc49fed2d8f4429e9c97dcf0bacacb80e702 Mon Sep 17 00:00:00 2001
From: Yanfeng Liu
Date: Fri, 31 May 2024 12:07:45 +0800
Subject: [PATCH] add leakless ostest story

---
 docs/README.md          |  2 +-
 docs/leakless-ostest.md | 75 ++++++++++++++++++++++++++++++++++++++
 leakless-ostest.md      | 81 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 157 insertions(+), 1 deletion(-)
 create mode 100644 docs/leakless-ostest.md
 create mode 100644 leakless-ostest.md

diff --git a/docs/README.md b/docs/README.md
index 162f9413aa0b3..8a8c8a069f187 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -1,4 +1,4 @@
 Some notes on NuttX
 
-- How [clean](leakless-ostest.md) is NuttX?
+How [clean](leakless-ostest) is NuttX?
 
diff --git a/docs/leakless-ostest.md b/docs/leakless-ostest.md
new file mode 100644
index 0000000000000..f5dcf4fd19f72
--- /dev/null
+++ b/docs/leakless-ostest.md
@@ -0,0 +1,75 @@
+# Leakless ostest
+
+This document presents findings on the kernel memory growth seen when running the `ostest` app.
+
+## The Problem
+
+The NSH command `free` shows memory information like this:
+
+```
+ABC
+nsh> free
+                 total       used       free    maxused    maxfree  nused  nfree
+      Umem:   33378172       7036   33371136       7020   33371120     23      2
+```
+
+Here the used kernel memory is `7036` bytes after a boot. It doesn't change after using some built-in commands or the simple `hello` app.
+
+However, after using the `getprime 4` app, the used memory grows:
+
+```
+nsh> free
+                 total       used       free    maxused    maxfree  nused  nfree
+      Umem:   33377308       7052   33370256      18324   33363472     23      4
+```
+
+It grows further after running the `ostest` app:
+
+```
+nsh> free
+                 total       used       free    maxused    maxfree  nused  nfree
+      Umem:   33377308       9140   33368168      46012   33363616     42      7
+```
+
+Does this imply that we are leaking memory?
+
+## Findings
+
+After some investigation, we have the findings below:
+
+- There is undelivered message leakage in timed mqueue that is fixed in [patch 12402](https://github.com/apache/nuttx/pull/12402). This is a true leak that is triggered by the `ostest` app.
+
+- There is a global list of free sigaction objects in NuttX which are allocated dynamically and never freed after use. Thus when the `ostest` app runs for the first time, used memory grows due to their allocation, but it doesn't grow again if `ostest` runs again. [Patch 12406](https://github.com/apache/nuttx/pull/12406) adds a pre-allocated sigaction list and promptly reclaims dynamically allocated objects, so we no longer see the initial growth.
+
+- There is a pid hash table which is a dynamic array of TCB pointers. The array has a very small initial length, thus when multi-threading apps like `getprime` or `ostest` are used, it grows quickly to accommodate more thread ids. The array currently never shrinks. The frequent initial growth feels like leakage, so [patch 12427](https://github.com/apache/nuttx/pull/12427) contains a solution to avoid frequent growth.
+
+- There are a few folders created by `ostest` when doing mqueue tests. They are not cleaned up promptly, so their inodes still use kernel memory.
+
+## Result
+
+After applying the above patches, with proper configuration and cleanup commands, system memory usage is stable after running `ostest`, as shown below:
+
+```
+nsh> free
+                 total       used       free    maxused    maxfree  nused  nfree
+      Umem:   33374748       7084   33367664       7068   33367632     21      2
+nsh> ostest >/dev/null >>/dev/null
+stdio_test: write fd=2
+stdio_test: Standard I/O Check: fprintf to stderr
+setvbuf_test: Using NO buffering
+setvbuf_test: Using default FULL buffering
+setvbuf_test: Using FULL buffering, buffer size 64
+setvbuf_test: Using FULL buffering, pre-allocated buffer
+setvbuf_test: Using LINE buffering, buffer size 64
+setvbuf_test: Using FULL buffering, pre-allocated buffer
+nsh> echo $?
+0
+nsh> rm -r /var
+nsh> free
+                 total       used       free    maxused    maxfree  nused  nfree
+      Umem:   33374748       7084   33367664      46108   33367632     21      2
+```
+
+So with a few improvements, our NuttX build can finish `ostest` cleanly.
+
+Though real-world app situations may vary, for simple apps like `ostest`, being clean makes people more confident about NuttX.
diff --git a/leakless-ostest.md b/leakless-ostest.md
new file mode 100644
index 0000000000000..a62dd04ed9cab
--- /dev/null
+++ b/leakless-ostest.md
@@ -0,0 +1,81 @@
+
+
+## Problem
+
+The NSH command `free` shows memory information like this:
+
+```
+nsh> free
+                 total       used       free    maxused    maxfree  nused  nfree
+      Umem:   33378172       7036   33371136       7020   33371120     23      2
+```
+
+Here the `used` memory in the kernel heap is 7036 bytes after a clean boot.
+
+How does this number change during the use of NuttX? After using some built-in commands or the simple `hello` app, it doesn't change.
+
+However, after using the `getprime 4` app, the used memory grows:
+
+```
+nsh> free
+                 total       used       free    maxused    maxfree  nused  nfree
+      Umem:   33377308       7052   33370256      18324   33363472     23      4
+```
+
+It grows further after running the `ostest` app once:
+
+```
+nsh> free
+                 total       used       free    maxused    maxfree  nused  nfree
+      Umem:   33377308       9140   33368168      46012   33363616     42      7
+```
+
+Does this imply that the kernel is leaking memory?
+
+## Findings
+
+After some investigation, we have a few findings:
+
+- There is a pid hash table which is actually a dynamic array of TCB pointers. The table has a very small initial length, thus when multi-threading apps like `getprime` or `ostest` are used, the table grows to accommodate more thread ids. The table currently doesn't shrink. This isn't a real leak, but it is a little misleading, and frequent growth may lead to more memory fragments. So patch 12427 adds `PIDHASH_INITIAL_LENGTH` to avoid unnecessary growth.
+
+- There is undelivered message leakage in the timed mqueue source, as resolved in patch 12402. This is a true leak, triggered by using the `ostest` app.
+
+- There is a global list of free sigaction objects which are allocated dynamically and never freed after use. Thus when the `ostest` app runs for the first time, used memory grows due to their allocation, but it doesn't grow again if `ostest` runs again. Patch 12406 adds a pre-allocated sigaction list and promptly reclaims dynamically allocated objects, so we won't see used memory growth due to this list.
+
+- There are a few folders created by `ostest` when using mqueue functions. They are not cleaned up promptly, so their inodes keep using kernel memory.
+
+## Result
+
+After applying the above patches and with proper cleanup commands, used memory is stable before and after using the `ostest` app:
+
+```
+nsh> free
+                 total       used       free    maxused    maxfree  nused  nfree
+      Umem:   33374748       7084   33367664       7068   33367632     21      2
+nsh> ostest >/dev/null >>/dev/null
+stdio_test: write fd=2
+stdio_test: Standard I/O Check: fprintf to stderr
+setvbuf_test: Using NO buffering
+setvbuf_test: Using default FULL buffering
+setvbuf_test: Using FULL buffering, buffer size 64
+setvbuf_test: Using FULL buffering, pre-allocated buffer
+setvbuf_test: Using LINE buffering, buffer size 64
+setvbuf_test: Using FULL buffering, pre-allocated buffer
+inode_alloc: 0x8002e6a8 var/mqueue/mqueue
+inode_alloc: 0x8002e6d0 mqueue/mqueue
+inode_alloc: 0x80030ff8 mqueue
+nxmq_alloc_msgq: 0x80031028
+inode_alloc: 0x80030ff8 timedmq
+nxmq_alloc_msgq: 0x80031028
+nxmq_alloc_msg: 0x80031090
+nxmq_alloc_msg: 0x800310c0
+nsh> echo $?
+0
+nsh> rm -r /var
+nsh> free
+                 total       used       free    maxused    maxfree  nused  nfree
+      Umem:   33374748       7084   33367664      46108   33367632     21      2
+```
+
+So with a few improvements, our build of NuttX not only finishes `ostest`, but also does it cleanly!
+