Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement wildcard pattern for input #59

Merged
merged 7 commits into from
May 16, 2024

Conversation

yucongalicechen
Copy link
Collaborator

Copy link
Contributor

@sbillinge sbillinge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very nicely done. Just one small typo and one suggestion in the input.

I also see some copy-pasted code which makes me wonder if it might be better to simply look for the wildcard and expand it before passing into this code block. Kind of like a prefilter that does something like

for input_name in args.input:
     if '*" in name 
             do the globbing and find the list of files
             add the glob list to args.inputs and remove the "*" name

then again,

for input_name in args.inputs:
    code unchanged from before

which would be easier to maintain.

"file_list.txt that can be found in the folder ./data).",
"file_list.txt that can be found in the folder ./data). "
"Wildcard character (*) is accepted. Examples include './*chi'"
" (load all files with .chi extension) and 'data/test*' (load "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe change to, ./*.chi to be closer to that description.

What happens if the wild-card expands to files and directories? We should tell the user what happens then.

@yucongalicechen
Copy link
Collaborator Author

  • I added a function expand_wildcard_file to take care of wildcard patterns and return the expanded list of inputs. Shall we put the two expand functions into set_input_lists for convenience? So that in the labpdfproc.py and tests, we only need to call for set_input_lists.
  • I wrote an error message for wildcard patterns that cannot match any files. But if the user specifies an invalid wildcard pattern like "**", this error message might not be that helpful as the original error message from Python (as it would tell user that it is an invalid wildcard pattern). I currently cannot think of a way to distinguish between these two situations, unless checking that if wildcard pattern is one of the invalid cases, but there are many invalid cases to check..

Copy link
Contributor

@sbillinge sbillinge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please see inline comments.

src/diffpy/labpdfproc/labpdfprocapp.py Outdated Show resolved Hide resolved
src/diffpy/labpdfproc/tools.py Outdated Show resolved Hide resolved
@sbillinge
Copy link
Contributor

re your comment, I agree that it might be good to put the two pre-filters into the expand_user_input function. Perhaps we shoud also make them private functions (add an underscore at the beginning) which makes the code more readable. It is nice that we have independent tests for them, no need to undo that.

Copy link
Contributor

@sbillinge sbillinge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we may need a test for a user putting a wildcard in the file-list file. I think your code may currently fail this test, but let's make the test and see.

Almost there, see a few last cleaning edits, and also I think my request to not support wildcards on directories may have gotten lost in those long comments on the last review, bu tplease remove support for wildcards on directories.

src/diffpy/labpdfproc/labpdfprocapp.py Outdated Show resolved Hide resolved
src/diffpy/labpdfproc/tests/test_tools.py Outdated Show resolved Hide resolved
src/diffpy/labpdfproc/tests/test_tools.py Outdated Show resolved Hide resolved
src/diffpy/labpdfproc/tools.py Outdated Show resolved Hide resolved
src/diffpy/labpdfproc/tools.py Show resolved Hide resolved
Copy link
Contributor

@sbillinge sbillinge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice work. I could merge it like this, but there are a couple of final comments.

@@ -48,6 +48,11 @@ def expand_list_file(args):
file_inputs = [input_name.strip() for input_name in f.readlines()]
args.input.extend(file_inputs)
args.input.remove(file_list_input)
wildcard_inputs = [input_name for input_name in args.input if "*" in input_name]
for wildcard_input in wildcard_inputs:
input_files = [str(file) for file in Path(".").glob(wildcard_input) if "file_list" not in file.name]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we may be able to remove if file_list not in file.name because file_list files have been removed already.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is for files in the glob directory (not in args.input), so if we have a file list in the same directory as the wildcard, then it'll be loaded if we don't skip it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, yes better to have it to be on the safe side. It may be better to make it stricter so if file.name == file_list.txt but it is probably ok as it is.

src/diffpy/labpdfproc/tests/conftest.py Show resolved Hide resolved
@sbillinge sbillinge merged commit 5c3eed7 into diffpy:main May 16, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants