[JOSS] Comments on the functionality #175
@rcannood first off, thank you so much for your careful review and very helpful comments. I'll try to address some of your points below. TL;DR: Please let us know what you think would be best in terms of making the documentation more accessible: (a) split the "comprehensive guide" into separate parts; (b) create new, self-contained vignettes for the following 3 topics: caching, checkpointing, and debugging; (c) something else. I lean toward (a) as part of this JOSS review with (b) as a longer-term goal.
I've added a couple of issues related to this part of the API: #176 and #177. I agree this is a bit unclear as it stands, but I think a few updates to the docs and a small amount of API refactoring could go a long way toward addressing your concerns. One point of confusion is the […]
We've definitely found it challenging to document […]. To clarify, do you think it would be better to create separate, self-contained vignettes for each concept? Another option is to split the comprehensive guide into smaller chunks (e.g., Part 1, Part 2, etc.) focused on specific parts of the API. Would that achieve the same goal?
I think this is a great idea. I've opened #178.
We print the total execution time at the end of […]. As I mentioned above, debugging capabilities are documented here. When an error occurs, we capture the error object itself, the process ID, and output from […]
I don't blame you if you missed that example in the "comprehensive guide" vignette, given that the vignette is a monolith; or maybe the example just isn't clear enough. If the problem is the former, then my inclination is to split that vignette into separate parts so that users can quickly get to the information they need via the docs site header navigation. Please let us know if you think a different approach would be better or if the existing example needs to be reworked.
This is possible using the appropriate […]
I see! 👍 Sounds good!
Indeed, I found the "simChef: a comprehensive guide" vignette quite long, which makes it hard to find information regarding a specific aspect, e.g. caching or debugging. However, please consider this a non-blocking issue, as the information is definitely there. For all of the other items I see you've created issues. I will also consider them optional improvements to the software tool, so that they are no longer blocking publication of the manuscript on my part :)
This issue pertains to my review of simChef's functionality as part of my review of the JOSS submission of this tool at openjournals/joss-reviews#6156.
I apologize for being quite critical about certain statements in the paper (discussed below), but I do think that explicitly showcasing each of the topics below would be of immense value to an inexperienced user wishing to perform the kinds of studies allowed by simChef.
Functionality
The paper states:
The documentation also states:
There is too little information on how to enable distributed computation, reproducibility, caching, checkpointing and debugging when using simChef.
I found some information in Setting Up Your Simulation Study that suggests you can use `hpc = TRUE` and `init_renv = TRUE` to enable distributed computations and reproducibility. However, at this stage I find it hard to argue that the package seamlessly integrates with any of these concepts, since they are not discussed enough in the documentation. Would it be possible to create separate articles in the documentation showcasing how to set up distributed computation, how to set up reproducibility, how to use caching and checkpointing, and how to debug runs? Or would you have an alternative solution?
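For context, simChef advertises parallelization via the {future} framework, so distributed execution is typically a matter of choosing a `plan()` before running the experiment. The sketch below is illustrative only — the `run_experiment()` call and its arguments are assumptions to be checked against simChef's documentation, not a verified API:

```r
# Hypothetical sketch: enabling parallel/distributed execution through the
# {future} framework, which simChef is documented to build on.
library(future)

# Local parallelism across 4 worker processes:
plan(multisession, workers = 4)

# Or, on an HPC cluster, via the {future.batchtools} backend (e.g. Slurm):
# library(future.batchtools)
# plan(batchtools_slurm, template = "slurm.tmpl")

# The experiment is then run as usual; replicates are dispatched according
# to the active plan. (Function name and arguments are assumptions.)
# results <- run_experiment(experiment, n_reps = 100)
```

An article demonstrating exactly this pattern — which `plan()` to pick for a laptop vs. a cluster — would likely address most of the distributed-computation concern.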
I'm certain that you as developers know exactly how to do this with simChef, but currently the documentation does little for novice simChef users to learn how to do any of these things from scratch.
In the context of reproducibility I would expect the rendered report to contain information on software versions of the used environments.
W.r.t. caching and debugging tools, I'd expect to be able to see the execution time, CPU usage, memory usage, and error messages when using an HPC as the backend. What happens when one of the executions fails? How can I debug what went wrong during a failed run?
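To make the expectation concrete, here is a generic sketch (base R only — not simChef's actual internals) of the kind of metadata one might want captured around a failing replicate: the error condition itself, the worker's process ID, and wall-clock time.

```r
# Hypothetical wrapper around a single replicate: capture errors instead of
# aborting, and record basic provenance for post-hoc debugging.
run_one_rep <- function(fn) {
  start <- Sys.time()
  result <- tryCatch(fn(), error = function(e) e)
  failed <- inherits(result, "error")
  list(
    result  = result,
    failed  = failed,
    message = if (failed) conditionMessage(result) else NA_character_,
    pid     = Sys.getpid(),
    elapsed = as.numeric(difftime(Sys.time(), start, units = "secs"))
  )
}

# A replicate that errors is recorded rather than crashing the whole run:
out <- run_one_rep(function() stop("bad seed"))
out$failed   # TRUE
out$message  # "bad seed"
```

Documenting where simChef stores equivalent information after a failed run would answer both questions above.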
Performance
This ties in with the previous comment. The paper mentions efficient usage of computational resources, but some simulation studies will require using an HPC to be able to run the analysis in a decent time frame.