Skip to content

Commit 11ac30e

Browse files
committed
Fix check-mode.sh display of leading-whitespace paths
In #1708, `check-mode.sh` deliberately parsed `git ls-files -sz` records using default nonempty `IFS`, to conveniently read fields into variables. But this would strip leading and trailing whitespace from the paths. Trailing whitespace is not currently present, since only files with `.sh` suffixes are processed, so the only paths that would be displayed incorrectly were those with leading whitespace. (We do not intentionally have such files in the repository, but they can arise accidentally. They might also, in principle, be added at some point to support tests. They would not likely be kept, because they are incompatible with Windowws and `git` will not check them out there, but they could plausibly exist in some commits on a feature branch, even with no mistakes.) This was *sort of* okay, because the script never accesses the repository or filesystem using the paths, but only displays them to the user. But this was still bad for a few reasons: - The paths were at least in principle ambiguous. - A path with accidental leading whitespace would not be noticed. - The `path` variable looks like it would be usable, so it could be easily misused in future changes, if not fixed. - Having to think through whether or not this parsing bug is excessively serious, when reasoning about the code, may make the code more cumbersome to understand and maintain. This is to say that I should never have done it in that way. Therefore, this replaces the default-`IFS` logic with empty-`IFS` calls to `read` and regular-expression processing, using `bash`'s `[[` keyword, which supports a `=~` operator that matches against a regular expression and populates the `BASH_REMATCH` array variable with captures. (This approach to parsing `git ls-file -sz` is based on code I wrote in https://github.com/EliahKagan/gitscripts/blob/master/git-lsx.) As discussed in #1708, this script may be converted to Rust. This change is not intended as a substitute for that. It should actually make that slightly easier, in that a regex approach has clearer semantics, whether it is translated into code that uses regex or not. (If regex are used, it will probably still need adjustment, as not all dialects reocognize POSIX classes like `[[:digit:]]`.) This commit includes some files to test the fix in the script. They are at the top level of the repository, since the bug this fixes only occurs with whitespace at the beginning of the first path component (relative to the root of the working tree). These files will be removed shortly. The messages for them are: mode +x but no shebang: $'\tb.sh' mode -x but has shebang: \ a.sh Before the fix, they would have been shown as `a.sh` instead of `\ a.sh`, and `b.sh` instead of `$'\tb.sh'`. Note that the approach to quoting has not itself changed: the `%q` format specifier for the `printf` builtin in `bash` formats them this, when the paths are parsed in such a way that the characters that would need quoting if used in the shell are not lost.
1 parent 34efe03 commit 11ac30e

File tree

3 files changed

+11
-3
lines changed

3 files changed

+11
-3
lines changed

b.sh

Whitespace-only changes.

a.sh

+1
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
#!/usr/bin/env bash

etc/check-mode.sh

+10-3
Original file line numberDiff line numberDiff line change
@@ -10,7 +10,7 @@ cd -- "$root"
1010
symbolic_shebang="$(printf '#!' | od -An -ta)"
1111
status=0
1212

13-
function check () {
13+
function check_item () {
1414
local mode="$1" oid="$2" path="$3" symbolic_magic
1515

1616
# Extract the first two bytes (or less if shorter) and put in symbolic form.
@@ -28,11 +28,18 @@ function check () {
2828
status=1
2929
}
3030

31+
readonly record_pattern='^([0-7]+) ([[:xdigit:]]+) [[:digit:]]+'$'\t''(.+)$'
32+
3133
# Check regular files named with a `.sh` suffix.
32-
while read -rd '' mode oid _stage_number path; do
34+
while IFS= read -rd '' record; do
35+
[[ $record =~ $record_pattern ]]
36+
mode="${BASH_REMATCH[1]}"
37+
oid="${BASH_REMATCH[2]}"
38+
path="${BASH_REMATCH[3]}"
39+
3340
case "$mode" in
3441
100644 | 100755)
35-
check "$mode" "$oid" "$path"
42+
check_item "${BASH_REMATCH[@]:1:3}"
3643
;;
3744
esac
3845
done < <(git ls-files -sz -- '*.sh')

0 commit comments

Comments
 (0)