Feature: repair permission feature after indexing #25

Open
dirkpetersen opened this issue Mar 30, 2024 · 2 comments

dirkpetersen commented Mar 30, 2024

Froster is primarily an end-user tool that does not run as root, so it needs at least read permission for all file system indexing and read+write permission to complete archiving. POSIX file systems tend to experience permission drift over time, and to keep collaboration among users seamless a systems administrator has to repair permissions occasionally. Because this is labor-intensive and time-consuming, Froster can prepare the commands that a systems administrator then reviews and executes as root. We need to address 3 use cases:

  1. The group ownership of a file or folder is incorrect, often because of issue 2 (setgid) or because users made an incorrect permission change.
  2. The setgid bit, which ensures that new folders and files inherit the group ownership of their parent directory, has been removed or overwritten, so new files are no longer created with the correct group. Instead they are owned by the primary group of the user who created them. This primary group may well belong to a different department, and other project members who are not in that group then run into file access problems.
  3. Files and folders have no read and/or write permissions for the owning group. This often occurs because of software errors or because software assumes it requires specific permissions.

For all 3 problems there is a solution that can be executed as root:

  1. Change all files and folders owned by group apples to oranges: find /my/dir -group apples -exec chgrp oranges {} +
  2. Set the setgid bit on all directories that don't have it: find /my/dir -type d ! -perm -2000 -exec chmod g+s {} +
  3. Ensure that all files and folders owned by group oranges have rw permissions for the group: find /my/dir -group oranges ! -perm -g=rw -exec chmod g+rw {} +
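The three checks can also be expressed per path in Python, which may be handy if Froster ends up doing the detection itself. This is only a sketch: the function name `permission_problems`, the problem labels, and the `expected_gid` parameter are made up for illustration and are not part of Froster.

```python
import os
import stat


def permission_problems(path, expected_gid):
    """Return labels for the use cases above that apply to path (hypothetical helper)."""
    st = os.lstat(path)  # lstat: do not follow symlinks
    problems = []
    # Use case 1: group ownership differs from the expected group.
    if st.st_gid != expected_gid:
        problems.append("wrong-group")
    # Use case 2: a directory without the setgid bit.
    if stat.S_ISDIR(st.st_mode) and not (st.st_mode & stat.S_ISGID):
        problems.append("missing-setgid")
    # Use case 3: the owning group lacks read and/or write permission.
    needed = stat.S_IRGRP | stat.S_IWGRP
    if (st.st_mode & needed) != needed:
        problems.append("missing-group-rw")
    return problems
```

A crawler could call this for every entry and append the path to one list file per label, which matches the text-file hand-over described in this issue.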

As crawling through the file system can take a very long time, we can use pwalk and duckdb to generate text files that list the files and folders whose permissions need to be adjusted. Permission changes can then be executed in parallel via xargs, for example with 256 parallel processes:

  1. Make sure to use the GID (e.g. 3901) and not the group name to avoid creating a fork bomb: xargs -a repair-grp-hpcusers.txt -P 256 -d '\n' chgrp 3901

  2. xargs -a repair-setgid.txt -P 256 -d '\n' chmod g+s

  3. xargs -a repair-read-write-lab.txt -P 256 -d '\n' chmod g+rw
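If Froster prepares commands for review rather than list files for xargs, the batching could look roughly like this. The function name `batched_commands` and the `batch_size` default are assumptions for illustration only.

```python
import shlex


def batched_commands(paths, base_cmd, batch_size=1000):
    """Yield shell-quoted command lines, each covering up to batch_size paths."""
    for start in range(0, len(paths), batch_size):
        batch = paths[start:start + batch_size]
        # "--" ends option parsing so paths that start with "-" are safe.
        yield shlex.join([*base_cmd, "--", *batch])
```

For example, `list(batched_commands(["/a b", "/c"], ["chgrp", "3901"], batch_size=1))` gives `["chgrp 3901 -- '/a b'", "chgrp 3901 -- /c"]`. Quoting via `shlex` also covers paths containing spaces, which the xargs variant handles with `-d '\n'`.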

Froster could generate the text files required to repair permissions, or repair the permissions directly, for example:

froster index --repair-permissions --chgrp oldgroup:newgroup,oldgroup2:newgroup2
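A sketch of how the proposed `--chgrp` value could be parsed with argparse. This is not Froster's actual argument parser; the helper `parse_chgrp` is hypothetical, and the flags are only the ones proposed in this issue.

```python
import argparse


def parse_chgrp(value):
    """Turn 'old:new,old2:new2' into {'old': 'new', 'old2': 'new2'}."""
    mapping = {}
    for pair in value.split(","):
        old, sep, new = (part.strip() for part in pair.partition(":"))
        if not sep or not old or not new:
            raise argparse.ArgumentTypeError(f"expected old:new, got {pair!r}")
        mapping[old] = new
    return mapping


# Stand-in parser for the proposed "froster index" options.
parser = argparse.ArgumentParser(prog="froster index")
parser.add_argument("--repair-permissions", action="store_true")
parser.add_argument("--chgrp", type=parse_chgrp, default={})
```

Given the earlier note about using GIDs rather than group names, the values in this mapping would presumably be resolved to numeric GIDs once before any commands are generated.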

There are 3 different usage patterns:

  • An end user runs froster index --repair-permissions and hands the text files over to a systems administrator, who runs the permission repair (this is the least accurate option, as the end user may not have read access to a significant number of folders).
  • A systems administrator runs froster index --repair-permissions and repairs permissions directly.
  • A user runs froster index --repair-permissions --pwalk-csv myfile.csv /my/folder with a pwalk CSV file previously generated by a systems administrator (this is an option if the systems administrator does not use Froster but prefers text files that list the files and folders requiring certain actions).

dirkpetersen commented Mar 31, 2024

To generate the text files with the appropriate file and folder lists, we run duckdb against a pwalk CSV file (or, better, against a Parquet file that the CSV file was converted to). These examples use /metadata/myfile.parquet and the numeric GIDs 3010 for oldgroup and 3901 for newgroup. These are the SQL queries that need to be executed, each listed under the xargs command that consumes its output:

  1. xargs -a repair-grp-hpcusers.txt -P 256 -d '\n' chgrp 3901
PRAGMA memory_limit='16384MB';PRAGMA threads=16;
COPY (
SELECT filename
FROM "/metadata/myfile.parquet"
WHERE GID IN (3009, 3010)     -- never use group name, always GIDs
) TO 'repair-grp-hpcusers.txt' (HEADER false, QUOTE '', ESCAPE '');
  2. xargs -a repair-setgid.txt -P 256 -d '\n' chmod g+s
PRAGMA memory_limit='16384MB';PRAGMA threads=16;
COPY (
SELECT filename
FROM "/metadata/myfile.parquet"
WHERE st_mode NOT LIKE '0042%' AND     -- Excludes dirs that already have setgid set
      SUBSTR(st_mode, 3, 1) = '4'      -- Ensures it's a dir
) TO 'repair-setgid.txt' (HEADER false, QUOTE '', ESCAPE '');
  3. xargs -a repair-read-write-lab.txt -P 256 -d '\n' chmod g+rw
PRAGMA memory_limit='16384MB';PRAGMA threads=16;
COPY (
SELECT filename
FROM "/metadata/myfile.parquet"
WHERE filename NOT LIKE 'id_%' AND
      fileExtension NOT IN ('pem', 'key') AND
      GID = 3901 AND
      SUBSTR(st_mode, 6, 1) < '6'     -- group digit is 6 (rw-) or 7 (rwx) when both read and write are set
) TO 'repair-read-write-lab.txt' (HEADER false, QUOTE '', ESCAPE '');
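The string comparisons on st_mode can be cross-checked with a bitwise version in Python. This sketch assumes that pwalk writes st_mode as a zero-padded octal string such as 0040755, which is what the SUBSTR positions in the queries imply; the function name `needs_repair` is hypothetical.

```python
import stat


def needs_repair(st_mode_octal):
    """Classify a pwalk-style octal st_mode string, e.g. '0040755'."""
    mode = int(st_mode_octal, 8)
    is_dir = stat.S_ISDIR(mode)
    return {
        # Directory that lacks the setgid bit (query 2).
        "setgid": is_dir and not (mode & stat.S_ISGID),
        # Group is missing read and/or write permission (query 3).
        "group_rw": (mode & 0o060) != 0o060,
    }
```

One difference worth noting: a directory with both setgid and the sticky bit (0043770) counts as fine here, while the prefix test `NOT LIKE '0042%'` would still list it for repair. Re-running chmod g+s on such a directory is harmless, so the SQL variant only produces a slightly longer list.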


dirkpetersen commented Jun 28, 2024

This issue is the lowest priority right now, but there could be a simplified implementation as a tool share-repair that crawls the file system and does the following while logging all changes to STDOUT and errors to STDERR:

  • Set the setgid bit on all folders where it is not set.
  • If a file or folder has a gidNumber that is identical to its uidNumber (private group), change group ownership to the group that owns the first folder up the tree whose gidNumber is not identical to the uidNumber owning that folder. Stop searching at the root folder where the crawl began and print an error if no group can be found. Only print one error per tree (e.g. when the root folder is owned by a private group).
  • Ensure that each file has at least group read permission (chmod g+r) and each folder has at least group read+execute permissions (chmod g+rx).
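The upward search for a usable group could be sketched as follows. The name `inherited_gid` and the injectable `stat_fn` parameter (there so the logic can be exercised without changing ownership on a real file system) are assumptions, not an existing share-repair API.

```python
import os


def inherited_gid(path, root, stat_fn=os.stat):
    """Return the gid of the nearest folder, from path's parent up to root,
    whose group is not a private group (gid == uid); None if there is none."""
    root = os.path.abspath(root)
    path = os.path.abspath(path)
    # The crawl root has no parent inside the tree, so check the root itself.
    current = path if path == root else os.path.dirname(path)
    while True:
        st = stat_fn(current)
        if st.st_gid != st.st_uid:
            return st.st_gid
        parent = os.path.dirname(current)
        # Stop at the crawl root (or at the file system root as a safety net).
        if current == root or parent == current:
            return None
        current = parent
```

A `None` result corresponds to the error case described above. The caller would be responsible for reporting it only once per tree.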
