Support for mmap() writable mappings. #175
It's a WIP, we knew there'd be W to do :).
Thanks for enumerating these -- yeah, so, the same repeat offenders :/.
8b5ba17 to 282fa84
Hmmmm, not looking good in CI. Both debug-el8 and debug-el94 appear(*) stuck on (*) they completed
retest
Adds the required memory mapped ops struct and page fault handler for reads. Signed-off-by: Benjamin LaHaise <[email protected]> Signed-off-by: Auke Kok <[email protected]>
Add support for writable MAP_SHARED mmap()ings. Avoid issues with late writepage()s building transactions by doing the block_write_begin() work in scoutfs_data_page_mkwrite(). Ensure the page is marked dirty and prepared for write, then let the VM complete the write when the page is flushed or invalidated. Signed-off-by: Benjamin LaHaise <[email protected]> Signed-off-by: Auke Kok <[email protected]>
Two test programs are added. The run time is about 1 min on my el7 instance. One program performs vfs read/write and mmap read/write calls on the same file from across 5 threads (mounts) repeatedly. The goal is to ensure there are no locking issues between the read/write paths. The second test program performs consistency checking on a file that is repeatedly written/read using memory maps and normal reads and writes, and the content is verified after every operation. The test script finishes up with a read/write mmap test on offline extents to verify the data wait paths in those functions. Signed-off-by: Auke Kok <[email protected]>
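For readers who want a feel for what the consistency check exercises, here is a minimal user-space sketch. It is not the test program from this PR; the helper name `check_mmap_coherency()` and the fill patterns are made up for illustration. It stores through a writable `MAP_SHARED` mapping and verifies with `pread()`, then writes with `pwrite()` and verifies through the mapping.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the PR): returns 0 when the mmap view and
 * the read()/write() view of the same file agree, -1 on error or mismatch.
 */
int check_mmap_coherency(const char *path, size_t len)
{
	char *map = MAP_FAILED;
	char *buf = NULL;
	int ret = -1;
	int fd;

	assert(path != NULL);
	if (len == 0)
		return -1;

	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;
	if (ftruncate(fd, (off_t)len) < 0)
		goto out;

	buf = malloc(len);
	if (!buf)
		goto out;

	map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		goto out;

	/* Store through the mapping; the first store to each page write-faults. */
	memset(map, 0xa5, len);
	if (msync(map, len, MS_SYNC) < 0)
		goto out;

	/* The data written via the mapping must be visible to pread(). */
	if (pread(fd, buf, len, 0) != (ssize_t)len || memcmp(buf, map, len) != 0)
		goto out;

	/* The reverse direction: pwrite() must be visible through the mapping. */
	memset(buf, 0x5a, len);
	if (pwrite(fd, buf, len, 0) != (ssize_t)len || memcmp(buf, map, len) != 0)
		goto out;

	ret = 0;
out:
	if (map != MAP_FAILED)
		munmap(map, len);
	free(buf);
	close(fd);
	unlink(path);
	return ret;
}
```

Calling it with a path on the filesystem under test and a length of a few pages is enough to hit both the read fault and page_mkwrite paths once each; the real tests repeat this from several mounts.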
Now that all of these should be passing, we enable all mmap() tests in xfstests, and update the golden output with the new tests. Signed-off-by: Auke Kok <[email protected]>
We merely trace exit values and position, and ignore length. Because vm_fault_t is __bitwise, sparse will loudly complain about a plain cast to u32, so we must __force (on el8). ret will be 512 in normal cases. Signed-off-by: Auke Kok <[email protected]>
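A small user-space sketch of the sparse annotations involved, in case the `__force` looks odd. The macro definitions mirror how the kernel makes `__bitwise` and `__force` vanish when sparse (`__CHECKER__`) is not running. The helper name `fault_ret_for_trace()` is hypothetical, and the assumption that the "normal" return is `VM_FAULT_LOCKED` (0x200, i.e. 512) is mine, not stated in the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Under sparse these are real attributes; for the compiler they are no-ops. */
#ifdef __CHECKER__
#define __bitwise __attribute__((bitwise))
#define __force   __attribute__((force))
#else
#define __bitwise
#define __force
#endif

/* A __bitwise type may not be mixed with plain integers without __force. */
typedef unsigned int __bitwise vm_fault_t;

#define VM_FAULT_LOCKED ((__force vm_fault_t)0x000200)

/* Hypothetical helper: convert the fault return into a plain u32 for tracing. */
static inline uint32_t fault_ret_for_trace(vm_fault_t ret)
{
	/* Without __force, sparse warns about a cast from a restricted type. */
	return (__force uint32_t)ret;
}
```

With a regular compiler this is just an integer conversion; the annotation only matters to sparse.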
These 2 sections of compat for readdir are wholly obsolete and can be hard dropped, which restores the method to look like current upstream code. This was added in ddd1a4e. Signed-off-by: Auke Kok <[email protected]>
Verify using xfs_io that readdir offsets match expected output. Signed-off-by: Auke Kok <[email protected]>
dir_emit() will copy_to_user, which can pagefault. If this happens while cluster locked, we could deadlock. We use a single page to stage dir_emit data, and iterate between fetching dirents while locked, and emitting them while not locked. Signed-off-by: Auke Kok <[email protected]>
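The stage-then-emit loop can be sketched in user space roughly as follows. This is an illustration of the pattern only, not the scoutfs code: `struct fake_dir`, `struct sink`, `emit()` and `staged_readdir()` are all invented names, a boolean stands in for the cluster lock, and a 4-entry array stands in for the staging page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define STAGE_MAX 4	/* stand-in for "as many dirents as fit in one page" */

struct fake_dir {
	const int *ents;	/* directory entries, reduced to ints */
	size_t nr;
	bool locked;		/* stand-in for the cluster lock */
};

struct sink {
	const struct fake_dir *dir;
	int *out;		/* stand-in for the user buffer */
	size_t cap;
	size_t used;
};

/* Stand-in for dir_emit(): it may fault, so it must never run while locked. */
static bool emit(struct sink *s, int ent)
{
	assert(!s->dir->locked);
	if (s->used == s->cap)
		return false;
	s->out[s->used++] = ent;
	return true;
}

/* Returns the number of entries emitted; *pos advances past each emitted entry. */
size_t staged_readdir(struct fake_dir *dir, size_t *pos, struct sink *s)
{
	int stage[STAGE_MAX];
	size_t emitted = 0;

	for (;;) {
		size_t n = 0;
		size_t i;

		/* Phase 1: fetch a batch of entries while holding the lock. */
		dir->locked = true;
		while (n < STAGE_MAX && *pos + n < dir->nr) {
			stage[n] = dir->ents[*pos + n];
			n++;
		}
		dir->locked = false;

		if (n == 0)
			break;

		/* Phase 2: copy the batch out with the lock dropped. */
		for (i = 0; i < n; i++) {
			if (!emit(s, stage[i]))
				return emitted;
			(*pos)++;
			emitted++;
		}
	}
	return emitted;
}
```

The position only advances for entries that were actually emitted, so a caller whose buffer fills mid-batch resumes at the right entry on the next call.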
Now that we support mmap writes, at any point in time we could pagefault and lock for writes. That means - just like readdir - we can no longer lock and copy_to_user, since it also may page fault and thus deadlock. We statically allocate 32 extent entries on the stack and use these to shuffle out fiemap entries a batch at a time, locking while collecting them and unlocking before calling fiemap_fill_next_extent(). Signed-off-by: Auke Kok <[email protected]>
Similar to readdir and fiemap vfs methods, we can't copy to user while holding cluster locks. The previous comment about it being safe no longer applies, and this could deadlock. Rewrite the loop to iterate and store entries in a page, then flush the page contents while not holding a clusterlock. Signed-off-by: Auke Kok <[email protected]>
Similar to fiemap, readdir and walk_inodes, this method could page fault in put_user while holding a cluster lock, potentially causing a deadlock. Signed-off-by: Auke Kok <[email protected]>
While debugging a double unlock error we hit this condition and debugging would have been a lot easier had we enforced this simple constraint that we can't decrement the lock users count if it's already 0. Signed-off-by: Auke Kok <[email protected]>
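The constraint itself is tiny. A user-space sketch, with invented names (`struct lock_state`, `lock_get_user()`, `lock_put_user()`) and an error return where the kernel code would presumably trip an assertion instead:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the per-lock state; only the users count matters. */
struct lock_state {
	unsigned int users;
};

void lock_get_user(struct lock_state *lck)
{
	lck->users++;
}

/*
 * Refuse to decrement a users count that is already 0. A double unlock is
 * then reported at the offending call instead of silently wrapping the
 * unsigned counter.
 */
int lock_put_user(struct lock_state *lck)
{
	if (lck->users == 0)
		return -EINVAL;
	lck->users--;
	return 0;
}
```

A second `lock_put_user()` after a single `lock_get_user()` returns `-EINVAL` and leaves the count at 0, which is exactly the double unlock case described above.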
We need to ensure we're emitting dents with the proper position, and we already have them as part of our dent. The only caveat is to increment ctx->pos once beyond the list to make sure the caller doesn't call us once more. Signed-off-by: Auke Kok <[email protected]>
@zabbo : Mixed push for easier "re-review": Added e59a5f8 Which adds a readdir output test using xfs_io. The golden here was generated using Added e9d1472 which fixes the output of We could merge as is, or we can let this run through tests once more and squash the fix into @aversecat Either obviously is fine. |
There seem to be real failures in archive-light-cycle in the 9.5 debug run. Have we looked into this? Are we seeing this anywhere else?
@zabbo I found this in a test run in a branch(#202) that has no
So it seems this is a potential issue in Here's the |
So it would seem, and so far only 9.5 debug runs I think.
I checked 2 of the offsets, and they're suspiciously at 1byte beyond a 4k boundary:
Replaces #27, #39.
Contains mostly original patches from andy, touched up for conflicts. Additional fixups and changes to avoid various deadlocks and debug kernel warnings for lock contention issues.
- Does not pass xfstests: generic/346 - hard lockup in _mkwrite when doing update_inode
- Occasionally fails offline-extent-waiting - when reverse staging, the first blocks of the file end up zeros, not the expected content
- doesn't work on el7