Support for mmap() writable mappings. #175

aversecat · 2024-06-11T17:46:39Z

Replaces #27, #39.

Contains ~~mostly original patches from andy, touched up for conflicts~~. Additional fixups and changes to avoid various deadlocks and debug kernel warnings for lock contention issues.

- ~~Does not~~ pass xfstests:generic/346 - hard lockup in _mkwrite when doing update_inode
- ~~Occasionally fails~~ offline-extent-waiting - when reverse staging, the first blocks of the file end up zeros, not the expected content
- Passes all other xfstests
- Added cross-node mmap consistency test. ~~doesn't work on el7~~
- sparse warnings about ret not returning vm_fault_t
- _walk_inodes page fault safe
- _get_allocated_inos page fault safe
- fsstress hard lockup in generic/013

versity-github · 2024-06-11T17:51:27Z

Build 169 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr/169/ git: c6a7fd9

versity-github · 2024-06-11T21:12:35Z

Build 144 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr-el8/144/ git: c6a7fd9

versity-github · 2024-06-12T00:06:26Z

Build 170 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr/170/ git: 0f611df

versity-github · 2024-06-12T03:15:04Z

Build 145 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr-el8/145/ git: 0f611df

versity-github · 2024-07-02T19:26:21Z

Build 178 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr/178/ git: d112efb

versity-github · 2024-07-02T20:04:41Z

Build 153 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr-el8/153/ git: d112efb

versity-github · 2024-07-11T22:46:37Z

Build 180 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr/180/ git: 43de817

versity-github · 2024-07-11T22:49:44Z

Build 155 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr-el8/155/ git: 43de817

versity-github · 2024-07-12T20:01:16Z

Build 156 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr-el8/156/ git: 939b7ec

versity-github · 2024-07-12T20:02:50Z

Build 182 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr/182/ git: 939b7ec

versity-github · 2024-07-15T19:26:36Z

Build 185 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr/185/ git: cb51a81

versity-github · 2024-07-15T19:28:41Z

Build 160 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr-el8/160/ git: cb51a81

zabbo

It's a WIP, we knew there'd be W to do :).

tests/tests/mmap.sh

kmod/src/data.c

tests/golden/xfstests

kmod/src/data.c

versity-github · 2024-07-22T22:25:02Z

Build 163 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr-el8/163/ git: 3db7fd0

versity-github · 2024-07-22T22:27:49Z

Build 188 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr/188/ git: 3db7fd0

versity-github · 2024-07-23T19:16:30Z

Build 189 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr/189/ git: 920070d

versity-github · 2024-07-23T19:19:07Z

Build 165 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr-el8/165/ git: 920070d

versity-github · 2024-08-05T18:46:28Z

Build 190 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr/190/ git: 789959f

versity-github · 2024-08-05T19:20:59Z

Build 166 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr-el8/166/ git: 789959f

versity-github · 2024-08-05T19:27:42Z

Build 191 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr/191/ git: d958c43

zabbo · 2024-12-03T22:30:11Z

the -debug- failures are:

Thanks for enumerating these -- yeah, so, the same repeat offenders :/.

aversecat · 2024-12-04T17:10:04Z

Hmmmm, not looking good in CI.

Both debug-el8 and debug-el94 appear(*) stuck on generic/346 which is holetest with mmap. Trying to reproduce myself now.

(*) they completed generic/343 but output is stuck there. No message indicating 346 actually started.

aversecat · 2024-12-09T19:34:50Z

-debug- failures are:

el8: hung task timeout in large-fragmented-free, and orphan-inodes failure
el94: hung task timeout in large-fragmented-free, and orphan-inodes failure
el95: hung task timeout in large-fragmented-free, and this one:

--- golden/archive-light-cycle	2024-12-06 20:05:22.572655167 +0000
+++ /root/scoutfs/tests/results/output/archive-light-cycle	2024-12-07 03:12:08.082647728 +0000
@@ -4,6 +4,8 @@
 == round 1: create
 == round 1: online
 == round 1: verify
+/mnt/test.3/test/archive-light-cycle/dir/3/2-2 /dev/fd/63 differ: char 277401601, line 67726
+script pid 165368 failed: rc 1
 == round 1: release
 == round 1: offline
 == round 1: stage
archive-light-cycle output differs

zabbo · 2025-01-16T21:26:53Z

retest

kmod/src/dir.c

Adds the required memory mapped ops struct and page fault handler for reads. Signed-off-by: Benjamin LaHaise <[email protected]> Signed-off-by: Auke Kok <[email protected]>

Add support for writable MAP_SHARED mmap()ings. Avoid issues with late writepage()s building transactions by doing the block_write_begin() work in scoutfs_data_page_mkwrite(). Ensure the page is marked dirty and prepared for write, then let the VM complete the write when the page is flushed or invalidated. Signed-off-by: Benjamin LaHaise <[email protected]> Signed-off-by: Auke Kok <[email protected]>

Two test programs are added. The run time is about 1min on my el7 instance. The test script finishes up with a read/write mmap test on offline extents to verify the data wait paths in those functions. One program will perform vfs read/write and mmap read/write calls on the same file from across 5 threads (mounts) repeatedly. The goal is to assure there are no locking issues between read/write paths. The second test program performs consistency checking on a file that is repeatedly written/read using memory maps and normal reads and writes, and the content is verified after every operation. Signed-off-by: Auke Kok <[email protected]>

Now that all of these should be passing, we enable all mmap() tests in xfstests, and update the golden output with the new tests. Signed-off-by: Auke Kok <[email protected]>

We merely trace exit values and position, and ignore length. Because vm_fault_t is __bitwise, sparse will loudly complain about a plain cast to u32, so we must __force (on el8). ret will be 512 in normal cases. Signed-off-by: Auke Kok <[email protected]>

These 2 sections of compat for readdir are wholly obsolete and can be hard dropped, which restores the method to look like current upstream code. This was added in ddd1a4e. Signed-off-by: Auke Kok <[email protected]>

Verify using xfs_io that readdir offsets match expected output. Signed-off-by: Auke Kok <[email protected]>

dir_emit() will copy_to_user, which can pagefault. If this happens while cluster locked, we could deadlock. We use a single page to stage dir_emit data, and iterate between fetching dirents while locked, and emitting them while not locked. Signed-off-by: Auke Kok <[email protected]>

Now that we support mmap writes, at any point in time we could pagefault and lock for writes. That means - just like readdir - we can no longer lock and copy_to_user, since it also may page fault and thus deadlock. We statically allocate 32 extent entries on the stack and use these to shuffle out fiemap entries at a time, locking and unlocking around collecting and fiemap_fill_extent_next. Signed-off-by: Auke Kok <[email protected]>

Similar to readdir and fiemap vfs methods, we can't copy to user while holding cluster locks. The previous comment about it being safe no longer applies, and this could deadlock. Rewrite the loop to iterate and store entries in a page, then flush the page contents while not holding a clusterlock. Signed-off-by: Auke Kok <[email protected]>

Similar to fiemap, readdir and walk_inodes, this method could have put_user during a page fault, causing potentially a deadlock. Signed-off-by: Auke Kok <[email protected]>

While debugging a double unlock error we hit this condition and debugging would have been a lot easier had we enforced this simple constraint that we can't decrement the lock users count if it's already 0. Signed-off-by: Auke Kok <[email protected]>

We need to assure we're emitting dents with the proper position and we already have them as part of our dent. The only caveat is to increment ctx->pos once beyond the list to make sure the caller doesn't call us once more. Signed-off-by: Auke Kok <[email protected]>

aversecat · 2025-01-27T19:53:12Z

@zabbo : Mixed push for easier "re-review":

Added e59a5f8 Which adds a readdir output test using xfs_io. The golden here was generated using main.

Added e9d1472 which fixes the output of dir_emit to properly emit using proper ctx->pos values for each entry. This then now passes the test.

We could merge as is, or we can let this run through tests once more and squash the fix into @aversecat
Avoid deadlock in _readdir() due to copy_to_user().

Either obviously is fine.

zabbo · 2025-02-06T20:16:54Z

There seem to be real failures in archive-light-cycle in the 9.5 debug run. Have we looked into this? Are we seeing this anywhere else?

== round 1: create
== round 1: online
== round 1: verify
+/mnt/test.1/test/archive-light-cycle/dir/1/2-1 /dev/fd/63 differ: char 238444545, line 58215
== round 1: release
== round 1: offline
== round 1: stage```

aversecat · 2025-02-14T19:38:10Z

@zabbo I found this in a test run in a branch(#202) that has no mmap patches:

--- golden/archive-light-cycle	2025-02-13 18:55:06.532816042 +0000
+++ /root/scoutfs/tests/results/output/archive-light-cycle	2025-02-14 02:04:50.421626043 +0000
@@ -15,6 +15,7 @@
 == round 2: create
 == round 2: online
 == round 2: verify
+/mnt/test.4/test/archive-light-cycle/dir/4/2-1 /dev/fd/63 differ: char 134172673, line 32758
 == round 2: release
 == round 2: offline
 == round 2: stage
archive-light-cycle output differs
--- golden/block-stale-reads	2025-02-13 18:55:06.541816111 +0000
+++ /root/scoutfs/tests/results/output/block-stale-reads	2025-02-14 02:04:53.510648267 +0000
@@ -1,2 +1,2 @@
 == Issue scoutfs df to force block reads to trigger stale invalidation/retry
-counter block_cache_remove_stale changed
+counter block_cache_remove_stale didn't change
block-stale-reads output differs

So it seems this is a potential issue in main.

Here's the fail.log for that run: http://jenkins.vpn.versity.com:8080/job/ng-scoutfs-test-debug-el95/16/artifact/artifacts/results/fail.log/*view*/

zabbo · 2025-02-18T18:20:51Z

So it seems this is a potential issue in main.

So it would seem, and so far only 9.5 debug runs I think.

aversecat · 2025-02-18T18:25:21Z

I checked 2 of the offsets, and they're suspiciously at 1byte beyond a 4k boundary:

$ printf "0x%08x\n" 238444545
0x0e366001
$ printf "0x%08x\n" 203247617
0x0c1d5001

aversecat added the enhancement New feature or request label Jun 11, 2024

aversecat force-pushed the auke/mmap branch from c6a7fd9 to 0f611df Compare June 12, 2024 00:02

aversecat force-pushed the auke/mmap branch from 0f611df to d112efb Compare July 2, 2024 19:22

aversecat force-pushed the auke/mmap branch from d112efb to 43de817 Compare July 11, 2024 22:08

aversecat force-pushed the auke/mmap branch from 939b7ec to cb51a81 Compare July 15, 2024 18:48

zabbo requested changes Jul 16, 2024

View reviewed changes

tests/tests/mmap.sh Outdated Show resolved Hide resolved

kmod/src/data.c Outdated Show resolved Hide resolved

tests/golden/xfstests Show resolved Hide resolved

kmod/src/data.c Outdated Show resolved Hide resolved

aversecat force-pushed the auke/mmap branch from cb51a81 to 3db7fd0 Compare July 22, 2024 21:30

aversecat force-pushed the auke/mmap branch from 3db7fd0 to 920070d Compare July 23, 2024 18:32

aversecat force-pushed the auke/mmap branch from 920070d to 789959f Compare August 5, 2024 18:36

aversecat force-pushed the auke/mmap branch from 789959f to d958c43 Compare August 5, 2024 18:48

aversecat changed the title mmap() tree **WIP** mmap() tree. Aug 5, 2024

aversecat force-pushed the auke/mmap branch 2 times, most recently from 8b5ba17 to 282fa84 Compare December 3, 2024 23:09

aversecat force-pushed the auke/mmap branch from 282fa84 to c6eec81 Compare December 6, 2024 17:57

zabbo requested changes Jan 21, 2025

View reviewed changes

kmod/src/dir.c Outdated Show resolved Hide resolved

bcrl and others added 14 commits January 23, 2025 14:28

Add support for read only mmap()

b7a3d03

Adds the required memory mapped ops struct and page fault handler for reads. Signed-off-by: Benjamin LaHaise <[email protected]> Signed-off-by: Auke Kok <[email protected]>

Enable all xfstests mmap() tests.

92f704d

Now that all of these should be passing, we enable all mmap() tests in xfstests, and update the golden output with the new tests. Signed-off-by: Auke Kok <[email protected]>

mmap() trace events.

519b47a

We merely trace exit values and position, and ignore length. Because vm_fault_t is __bitwise, sparse will loudly complain about a plain cast to u32, so we must __force (on el8). ret will be 512 in normal cases. Signed-off-by: Auke Kok <[email protected]>

remap_pages ops becomes obsolete.

b944f60

Drop readdir pre-.iterate() compat (el7.5ish).

1bcd1d4

These 2 sections of compat for readdir are wholly obsolete and can be hard dropped, which restores the method to look like current upstream code. This was added in ddd1a4e. Signed-off-by: Auke Kok <[email protected]>

Readdir w/offset validation.

e59a5f8

Verify using xfs_io that readdir offsets match expected output. Signed-off-by: Auke Kok <[email protected]>

Avoid cluster locking while put_user() in _allocated_inos.

8b76a53

Similar to fiemap, readdir and walk_inodes, this method could have put_user during a page fault, causing potentially a deadlock. Signed-off-by: Auke Kok <[email protected]>

aversecat force-pushed the auke/mmap branch from c6eec81 to e9d1472 Compare January 27, 2025 19:50

zabbo merged commit a4b5a25 into main Feb 20, 2025
4 of 6 checks passed

zabbo deleted the auke/mmap branch February 20, 2025 22:03

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support for mmap() writable mappings. #175

Support for mmap() writable mappings. #175

aversecat commented Jun 11, 2024 •

edited

Loading

versity-github commented Jun 11, 2024

versity-github commented Jun 11, 2024

versity-github commented Jun 12, 2024

versity-github commented Jun 12, 2024

versity-github commented Jul 2, 2024

versity-github commented Jul 2, 2024

versity-github commented Jul 11, 2024

versity-github commented Jul 11, 2024

versity-github commented Jul 12, 2024

versity-github commented Jul 12, 2024

versity-github commented Jul 15, 2024

versity-github commented Jul 15, 2024

zabbo left a comment

versity-github commented Jul 22, 2024

versity-github commented Jul 22, 2024

versity-github commented Jul 23, 2024

versity-github commented Jul 23, 2024

versity-github commented Aug 5, 2024

versity-github commented Aug 5, 2024

versity-github commented Aug 5, 2024

zabbo commented Dec 3, 2024

aversecat commented Dec 4, 2024

aversecat commented Dec 9, 2024 •

edited

Loading

zabbo commented Jan 16, 2025

aversecat commented Jan 27, 2025

zabbo commented Feb 6, 2025

aversecat commented Feb 14, 2025

zabbo commented Feb 18, 2025

aversecat commented Feb 18, 2025 •

edited

Loading

Support for mmap() writable mappings. #175

Support for mmap() writable mappings. #175

Conversation

aversecat commented Jun 11, 2024 • edited Loading

versity-github commented Jun 11, 2024

versity-github commented Jun 11, 2024

versity-github commented Jun 12, 2024

versity-github commented Jun 12, 2024

versity-github commented Jul 2, 2024

versity-github commented Jul 2, 2024

versity-github commented Jul 11, 2024

versity-github commented Jul 11, 2024

versity-github commented Jul 12, 2024

versity-github commented Jul 12, 2024

versity-github commented Jul 15, 2024

versity-github commented Jul 15, 2024

zabbo left a comment

Choose a reason for hiding this comment

versity-github commented Jul 22, 2024

versity-github commented Jul 22, 2024

versity-github commented Jul 23, 2024

versity-github commented Jul 23, 2024

versity-github commented Aug 5, 2024

versity-github commented Aug 5, 2024

versity-github commented Aug 5, 2024

zabbo commented Dec 3, 2024

aversecat commented Dec 4, 2024

aversecat commented Dec 9, 2024 • edited Loading

zabbo commented Jan 16, 2025

aversecat commented Jan 27, 2025

zabbo commented Feb 6, 2025

aversecat commented Feb 14, 2025

zabbo commented Feb 18, 2025

aversecat commented Feb 18, 2025 • edited Loading

aversecat commented Jun 11, 2024 •

edited

Loading

aversecat commented Dec 9, 2024 •

edited

Loading

aversecat commented Feb 18, 2025 •

edited

Loading