fix: ensure filenames with spaces are excluded from targets #2748
"**/__pycache__/*.pyc.*",
"**/__pycache__/*.pyo.*",
# File names with spaces should also be ignored.
"**/* *",
] + glob_excludes.version_dependent_exclusions() + extra_files_glob_exclude,
I am thinking that it would be nice to have `glob_excludes.pyc_files()`, `glob_excludes.pyo_files()`, and `glob_excludes.files_with_spaces()`. Then the explanation for why each exclusion is needed can live next to its definition.
I would also love to exclude `.pyc` and `.pyc.*` in the hermetic toolchain definition, so that the exclude is the same regardless of whether we are `chmod`ing the dir to be read-only or not.
What do you think?
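A minimal sketch of what those helpers could look like. The three function names come from the comment above; the bodies, the `all_excludes` wrapper, and the choice to cover only the temp-file patterns visible in the diff are assumptions, not the project's actual code. It sticks to syntax that should be accepted by both Python and Starlark.

```python
# Hypothetical helpers: names are from the review comment, bodies are assumed.

def pyc_files():
    # Temp files that can be left behind while a .pyc is being written,
    # e.g. __pycache__/foo.cpython-311.pyc.1234
    return ["**/__pycache__/*.pyc.*"]

def pyo_files():
    # Same idea for .pyo temp files.
    return ["**/__pycache__/*.pyo.*"]

def files_with_spaces():
    # File names with spaces can make a runfiles manifest malformed.
    return ["**/* *"]

def all_excludes(extra_files_glob_exclude):
    # Hypothetical convenience wrapper that combines every exclusion list.
    return pyc_files() + pyo_files() + files_with_spaces() + list(extra_files_glob_exclude)
```

Keeping each pattern behind its own function gives the explanatory comment a single home, which is the point the reviewer raises.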
Yeah can add those methods.
Re: pyc, I think we'd only want the temp files excluded here? I'd originally excluded them in a different PR in a different part of the code (removed in this PR in favor of here). This change keeps the pyc exclusion in a single place.
If the pyc files are stable, then generally it would be preferable to keep them, no?
Hmmm. Yeah, if they are stable it is fine and we are already setting the vars to make them stable, so SGTM.
If memory serves, excluding pyc was what finally got rid of the Windows jobs getting "can't delete open file" errors. My theory was that two processes both went to import a module without a pyc. Both would start writing the pyc, but one would manage to finish writing and open the pyc, then the other process would try to overwrite it. But it couldn't, because the file was open.
The secondary issue is, as pycs are created, they show as additional files added to the target, thus invalidating it, which means anything downstream has to re-run. Eventually things will settle, but they'll only stay settled as long as the repo sticks around. A similar issue can happen with the timestamps: two processes might race and end up creating slightly different timestamped pycs, thus making it look like the file changed.
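For context on where names like `*.pyc.NNN` come from: CPython writes bytecode to a temporary sibling file and then renames it over the final path. The sketch below imitates that pattern in plain Python. The helper name `write_atomic` and the use of the process ID as the suffix are assumptions for illustration, not CPython's exact code.

```python
import os
import tempfile

def write_atomic(path, data):
    """Write data to a temp sibling of path, then rename it into place."""
    # Temp name looks like "mod.pyc.12345"; the suffix here is the PID (assumed).
    tmp_path = "{}.{}".format(path, os.getpid())
    with open(tmp_path, "wb") as f:
        f.write(data)
    # If the process is interrupted before this rename, tmp_path stays on disk.
    os.replace(tmp_path, path)

with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "mod.pyc")
    write_atomic(target, b"bytecode")
    # After a successful write only the final file remains.
    leftovers = sorted(os.listdir(d))
```

On Windows the final rename may fail when another process holds the destination open, which would be consistent with the theory above, though that link is not verified here.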
This is only true for the `pyc` generation happening at `repository_rule` execution time. I added `-B` a while ago.
When the packages are used in the regular `py_binary` and `py_test` rules, I expect the `pyc` files to be created in the sandbox and not in the `repository_rule` output dirs, but my claim should be checked.
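The `-B` flag mentioned above tells the interpreter not to write `.pyc` files on import (the `PYTHONDONTWRITEBYTECODE` environment variable has the same effect). One way to confirm that a subprocess launched with it has bytecode writing disabled is to read `sys.dont_write_bytecode`. The snippet is a standalone check, not code from the repository rule.

```python
import subprocess
import sys

# Launch a child interpreter with -B and ask it whether bytecode writing is off.
result = subprocess.run(
    [sys.executable, "-B", "-c", "import sys; print(sys.dont_write_bytecode)"],
    capture_output=True,
    text=True,
    check=True,
)
flag = result.stdout.strip()
print(flag)  # prints "True"
```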
Oh good. Yeah, that should prevent that issue, then. SGTM.
glob,
exclude = [
# File names with spaces should be excluded.
"**/* *",
I thought that our supported `bazel` versions support files with spaces, so why do we need to exclude them?
However, it seems that someone tried it and it did not work?
Yeah, we also hit the issue with setuptools in runfiles, but with the Go runfiles library. Setuptools seems to contain files with spaces, so even if Bazel itself can handle them now, the runfiles libraries can't.
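To see why a space is a problem, consider a simplified manifest in which each line is a runfiles path, a single space, then the real path. The parser below is a deliberately naive stand-in, not the actual Go or Python runfiles library code, and the paths are made up.

```python
def parse_manifest_line(line):
    """Naively split a manifest line at the first space into (key, real_path)."""
    key, _, value = line.partition(" ")
    return key, value

# A path without spaces round-trips cleanly.
ok = parse_manifest_line("pkg/data.txt /abs/pkg/data.txt")

# A path with a space is split in the wrong place, so the lookup key is wrong.
bad = parse_manifest_line("pkg/my file.txt /abs/pkg/my file.txt")

print(ok)   # ('pkg/data.txt', '/abs/pkg/data.txt')
print(bad)  # ('pkg/my', 'file.txt /abs/pkg/my file.txt')
```

Any consumer that splits on the first space will look up `pkg/my` instead of `pkg/my file.txt`, which is the kind of malformed entry the exclusion avoids.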
Mostly LGTM. The only comment I have is to replace the "files with spaces should be ignored" with text that tells why they should be ignored. I suggested an edit to that effect in one spot. I'm OK with copy/pasting that same comment, or factoring out a common function with the comment there instead.
"**/__pycache__/*.pyc.*",
"**/__pycache__/*.pyo.*",
- # File names with spaces should also be ignored.
+ # Ignore files with spaces because, while Bazel supports them,
+ # the runfiles manifest format doesn't yet
Some dependencies contain files with spaces in the name. These should be excluded as they are generally unsupported, and when placed in a runfiles manifest file, they cause it to be malformed.
This change omits files with spaces in their names from glob patterns.
It also changes the `.pyo.NNN` temp file exclusion added in #2743, as it seems it was slightly misplaced and missing from 3p dependency targets.
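As a rough illustration of which paths the `**/* *` exclusion is meant to catch, the check below treats the pattern as "the final path segment contains a space". This only approximates Bazel's glob semantics in plain Python and is an assumption; the helper name and sample paths are invented.

```python
import posixpath

def has_space_in_name(path):
    """Approximate the "**/* *" pattern: a space in the last path segment."""
    return " " in posixpath.basename(path)

paths = [
    "pkg/__init__.py",
    "pkg/data/file.txt",
    "pkg/data/my file.txt",
    "top level.txt",
]

# Files that would remain in the target after the exclusion is applied.
kept = [p for p in paths if not has_space_in_name(p)]
print(kept)  # ['pkg/__init__.py', 'pkg/data/file.txt']
```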