fix: ensure filenames with spaces are excluded from targets #2748
@@ -103,7 +103,14 @@ def whl_library_targets(
     for filegroup_name, glob in filegroups.items():
         native.filegroup(
             name = filegroup_name,
-            srcs = native.glob(glob, allow_empty = True),
+            srcs = native.glob(
+                glob,
+                exclude = [
+                    # File names with spaces should be excluded.
+                    "**/* *",

I thought that our supported

However, it seems that someone tried it and it did not work?

Yeah, we also hit the issue with setuptools in runfiles, but with the Go runfiles library. Setuptools seems to contain files with spaces, so even if Bazel itself can handle them now, the runfiles libraries can't.

+                ],
+                allow_empty = True,
+            ),
             visibility = ["//visibility:public"],
         )

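
As a rough illustration of what the `"**/* *"` exclude is meant to catch, the sketch below approximates the pattern in plain Python by checking whether the last path segment contains a space. This is not Bazel's glob implementation. It assumes `*` does not cross `/`, and the helper name `basename_has_space` and the sample paths are made up for illustration.

```python
def basename_has_space(path):
    # Approximation of the "**/* *" pattern: the final path segment contains a space.
    return " " in path.rsplit("/", 1)[-1]

# Illustrative wheel contents only; not taken from this PR.
paths = [
    "site-packages/pkg/module.py",
    "site-packages/setuptools/script (dev).tmpl",
    "site-packages/pkg dir/module.py",
]

# Paths that would survive the exclude under this approximation.
kept = [p for p in paths if not basename_has_space(p)]
```

Under this approximation, a space that appears only in a parent directory name does not match, so the third path is kept. Whether that case matters for real wheels is not something this sketch settles.
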
@@ -229,10 +236,13 @@ def whl_library_targets(
             "**/*.py",
             "**/*.pyc",
             "**/*.pyc.*",  # During pyc creation, temp files named *.pyc.NNNN are created
+            "**/*.pyo.*",  # During pyo creation, temp files named *.pyo.NNNN are created
             # RECORD is known to contain sha256 checksums of files which might include the checksums
             # of generated files produced when wheels are installed. The file is ignored to avoid
             # Bazel caching issues.
             "**/*.dist-info/RECORD",
+            # File names with spaces should be excluded.
+            "**/* *",
         ] + glob_excludes.version_dependent_exclusions()
         for item in data_exclude:
             if item not in _data_exclude:
@@ -242,7 +252,10 @@ def whl_library_targets(
             name = py_library_label,
             srcs = native.glob(
                 ["site-packages/**/*.py"],
-                exclude = srcs_exclude,
+                exclude = srcs_exclude + [
+                    # File names with spaces should be excluded.
+                    "**/* *",
+                ],
                 # Empty sources are allowed to support wheels that don't have any
                 # pure-Python code, e.g. pymssql, which is written in Cython.
                 allow_empty = True,

I am thinking that it would be nice to have `glob_excludes.pyc_files()`, `glob_excludes.pyo_files()`, and `glob_excludes.files_with_spaces()`. Then the explanation for why each exclusion is needed can live next to its definition.
I would also love to exclude `.pyc` and `.pyc.*` in the hermetic toolchain definition, so that the exclude is the same regardless of whether we are `chmod`ing the dir to be read-only or not.
What do you think?
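
A minimal sketch of what the suggested helpers might look like. Only the three function names come from the comment above. The bodies, the choice of patterns in each, and the final `data_exclude` list are assumptions. The functions are written as plain Python that should also be valid Starlark; in a `.bzl` file they would presumably be exposed on the existing `glob_excludes` object, which is not shown here.

```python
def pyc_files():
    # Bytecode files, plus temp files named *.pyc.NNNN created during pyc creation.
    return ["**/*.pyc", "**/*.pyc.*"]

def pyo_files():
    # Temp files named *.pyo.NNNN created during pyo creation.
    return ["**/*.pyo.*"]

def files_with_spaces():
    # File names with spaces should be excluded.
    return ["**/* *"]

# Hypothetical call site: compose the exclude list from the documented helpers.
data_exclude = ["**/*.py"] + pyc_files() + pyo_files() + files_with_spaces()
```

Keeping each pattern behind a named function puts the rationale in one place, which is the point the comment makes.
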

Yeah, I can add those methods.
Re: pyc, I think we'd only want the temp files excluded here? I'd originally excluded them in a different PR in a different part of the code (removed in this PR in favor of here). This change keeps the pyc exclusion in a single place.
If the pyc files are stable, then generally it would be preferable to keep them, no?

Hmmm. Yeah, if they are stable it is fine, and we are already setting the vars to make them stable, so SGTM.

If memory serves, excluding pyc was what finally got rid of the Windows jobs getting "can't delete open file" errors. My theory was that two processes both went to import a module without a pyc. Both would start the pyc process, but one would manage to finish writing and open the pyc, then the other process would try to overwrite it. But it couldn't, because the file was open.
The secondary issue is that, as pycs are created, they show up as additional files added to the target, thus invalidating it, which means anything downstream has to re-run. Eventually things will settle, but they'll only stay settled as long as the repo sticks around. A similar issue can happen with the timestamps: two processes might race and end up creating slightly different timestamped pycs, thus making it look like the file changed.

This is only true for the `pyc` generation happening at `repository_rule` execution time. I added `-B` a while ago.
When the packages are used in the regular `py_binary` and `py_test` rules, I expect the `pyc` files to be created in the sandbox and not in the `repository_rule` output dirs, but my claim should be checked.
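
For reference, a small sketch of what `-B` does to a Python interpreter: it sets `sys.dont_write_bytecode`, so imports in that process do not write `.pyc` files. The helper name `writes_bytecode` is made up for illustration, and this only shows the interpreter flag itself, not how the repository rule invokes Python.

```python
import subprocess
import sys

def writes_bytecode(extra_flags):
    # Start a child interpreter with the given flags and ask whether it
    # would write .pyc files on import.
    result = subprocess.run(
        [sys.executable, *extra_flags, "-c", "import sys; print(sys.dont_write_bytecode)"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip() == "False"

# With -B, the child interpreter reports that bytecode writing is disabled.
with_b = writes_bytecode(["-B"])
```

The result without `-B` depends on the environment (for example `PYTHONDONTWRITEBYTECODE`), so only the `-B` case is evaluated here.
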

Oh good. Yeah, that should prevent that issue, then. SGTM.