You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Fix check-mode.sh display of leading-whitespace paths
In #1708, `check-mode.sh` deliberately parsed `git ls-files -sz`
records using default nonempty `IFS`, to conveniently read fields
into variables. But this would strip leading and trailing
whitespace from the paths. Trailing whitespace is not currently
present, since only files with `.sh` suffixes are processed, so the
only paths that would be displayed incorrectly were those with
leading whitespace.
(We do not intentionally have such files in the repository, but
they can arise accidentally. They might also, in principle, be
added at some point to support tests. They would not likely be
kept, because they are incompatible with Windowws and `git` will
not check them out there, but they could plausibly exist in some
commits on a feature branch, even with no mistakes.)
This was *sort of* okay, because the script never accesses the
repository or filesystem using the paths, but only displays them to
the user. But this was still bad for a few reasons:
- The paths were at least in principle ambiguous.
- A path with accidental leading whitespace would not be noticed.
- The `path` variable looks like it would be usable, so it could be
easily misused in future changes, if not fixed.
- Having to think through whether or not this parsing bug is
excessively serious, when reasoning about the code, may make the
code more cumbersome to understand and maintain.
This is to say that I should never have done it in that way.
Therefore, this replaces the default-`IFS` logic with empty-`IFS`
calls to `read` and regular-expression processing, using `bash`'s
`[[` keyword, which supports a `=~` operator that matches against a
regular expression and populates the `BASH_REMATCH` array variable
with captures.
(This approach to parsing `git ls-file -sz` is based on code I wrote
in https://github.com/EliahKagan/gitscripts/blob/master/git-lsx.)
As discussed in #1708, this script may be converted to Rust. This
change is not intended as a substitute for that. It should actually
make that slightly easier, in that a regex approach has clearer
semantics, whether it is translated into code that uses regex or
not. (If regex are used, it will probably still need adjustment,
as not all dialects reocognize POSIX classes like `[[:digit:]]`.)
This commit includes some files to test the fix in the script. They
are at the top level of the repository, since the bug this fixes
only occurs with whitespace at the beginning of the first path
component (relative to the root of the working tree). These files
will be removed shortly. The messages for them are:
mode +x but no shebang: $'\tb.sh'
mode -x but has shebang: \ a.sh
Before the fix, they would have been shown as `a.sh` instead of
`\ a.sh`, and `b.sh` instead of `$'\tb.sh'`. Note that the approach
to quoting has not itself changed: the `%q` format specifier for
the `printf` builtin in `bash` formats them this, when the paths
are parsed in such a way that the characters that would need
quoting if used in the shell are not lost.
0 commit comments