DCN July 2020 Hackathon updates to Jupyter Notebook Primer #19
kekoziar wants to merge 9 commits into DataCurationNetwork:main
Update Jupyter Notebook version number in the format overview table
The guidance is provided by the Software Sustainability Institute (1), and funded by Jisc (2).
Clarified the relation between a kernel and a programming language for curators unfamiliar with computer science terminology. Elaborated on cell order and on the expectations of users (those who download a notebook).
Expand the dependencies section to include other types of dependency files. Annotate CITATION.cff. Clarify that a container metafile is appropriate to request if one is used.
Added annotations and clarifications.
Add a clarifying question to help curators unfamiliar with code. Add examples of .ipynb files archived in data repositories. Add and renumber the associated endnotes.
Add title and alt text for decision tree images.
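The kernel and cell-order points in the commit notes above can be checked directly from a notebook file, since an .ipynb file is plain JSON. The following is a minimal sketch, assuming the nbformat 4 layout (`metadata.kernelspec`, `metadata.language_info`, and an `execution_count` on each code cell). The function name `summarize_notebook` and the example notebook are hypothetical and are not part of the primer.

```python
import json


def summarize_notebook(nb: dict) -> dict:
    """Report the kernel, the language, and whether code cells ran top to bottom.

    Assumes `nb` is the parsed JSON of an nbformat 4 notebook.
    """
    meta = nb.get("metadata", {})
    kernel = meta.get("kernelspec", {}).get("name")
    language = meta.get("language_info", {}).get("name")

    # Collect execution counts of code cells in the order they appear in the file.
    counts = [
        cell.get("execution_count")
        for cell in nb.get("cells", [])
        if cell.get("cell_type") == "code"
    ]
    executed = [c for c in counts if c is not None]

    return {
        "kernel": kernel,
        "language": language,
        "unexecuted_cells": len(counts) - len(executed),
        # True only if the saved counts increase from top to bottom.
        "ran_in_order": executed == sorted(executed),
    }


# Hypothetical notebook content; a real file would be loaded with
# json.load(open("analysis.ipynb", encoding="utf-8")).
example = json.loads("""
{
  "nbformat": 4,
  "metadata": {
    "kernelspec": {"name": "python3", "display_name": "Python 3"},
    "language_info": {"name": "python"}
  },
  "cells": [
    {"cell_type": "markdown", "source": ["# Analysis"]},
    {"cell_type": "code", "execution_count": 1, "source": ["import math"]},
    {"cell_type": "code", "execution_count": 3, "source": ["y = x + 1"]},
    {"cell_type": "code", "execution_count": 2, "source": ["x = 1"]},
    {"cell_type": "code", "execution_count": null, "source": ["print(y)"]}
  ]
}
""")

summary = summarize_notebook(example)
print(summary)
```

A result with `ran_in_order` set to false does not prove the notebook is broken, but it is a reasonable prompt to ask the depositor to restart the kernel and run all cells before depositing.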
| - reST export of the Jupyter Notebook (export from Jupyter web application) | ||
| - CodeMeta.json | ||
| - CITATION.cff | ||
| - CodeMeta.json, requirements.txt, or environment.yml (dependencies) |
I would recommend at least listing CodeMeta.json as preferred, since it provides the ability to define more extensive structured metadata using a controlled vocabulary.
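For curators who have not seen one, a CodeMeta file is a small JSON-LD document. The sketch below builds a minimal one in Python, assuming the CodeMeta 2.0 context; the property names are CodeMeta terms, but every value (title, author, requirements) is a made-up placeholder rather than anything from the primer.

```python
import json

# Minimal CodeMeta 2.0 record for a notebook deposit.
# All values are hypothetical placeholders.
codemeta = {
    "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
    "@type": "SoftwareSourceCode",
    "name": "Example analysis notebook",
    "description": "Jupyter Notebook that reproduces the figures in an example study.",
    "programmingLanguage": "Python",
    "author": [
        {"@type": "Person", "givenName": "Ada", "familyName": "Example"}
    ],
    # Dependencies can also be listed here, alongside requirements.txt
    # or environment.yml.
    "softwareRequirements": ["numpy", "pandas"],
}

codemeta_text = json.dumps(codemeta, indent=2)
print(codemeta_text)  # save this text as codemeta.json next to the notebook
```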
| - CodeMeta.json | ||
| - CITATION.cff | ||
| - CodeMeta.json, requirements.txt, or environment.yml (dependencies) | ||
| - CITATION.cff (a software citation file appropriate if not depositing in a repository) |
Adding a citation file is always appropriate; many repositories do not have the fields necessary to automatically generate a proper software citation.
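As an illustration of how little a citation file needs, here is a minimal sketch that writes the text of a CITATION.cff. It assumes the Citation File Format 1.2.0 keys (`cff-version`, `message`, `title`, `authors`); the helper `build_citation_cff` and the example title and author are hypothetical.

```python
import json


def build_citation_cff(title, authors):
    """Return minimal CITATION.cff text.

    `authors` is a list of (family_name, given_name) tuples.
    json.dumps is used for quoting because a JSON string is also a
    valid YAML double-quoted scalar.
    """
    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this notebook, please cite it as below."',
        f"title: {json.dumps(title)}",
        "type: software",
        "authors:",
    ]
    for family, given in authors:
        lines.append(f"  - family-names: {json.dumps(family)}")
        lines.append(f"    given-names: {json.dumps(given)}")
    return "\n".join(lines) + "\n"


# Hypothetical values for illustration only.
cff_text = build_citation_cff("Example analysis notebook", [("Example", "Ada")])
print(cff_text)  # save this text as CITATION.cff next to the notebook
```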
| - Documents what the Jupyter Notebook is for | ||
| - Request that this file include citation(s) to third-party algorithms and analyses | ||
| - Recommend code comments within the Notebook file itself in addition to the README file | ||
| - Documents what the Jupyter Notebook is for (but recommendation is that the Notebook utilize code comments) |
Code comments should not be seen as a replacement or alternative to providing a README file. The code comments are used to describe what specific sets of cells do, but the notebook itself can have a much broader description and context.
| - Recommend additional machine actionable dependency documentation (e.g. requirements.txt or environment.yml) | ||
| - CITATION.cff for the Notebook | ||
| - Preferred citation; should enable native software citation | ||
| - Relevant if the Notebook is not being submitted to a repository |
@dbouquin IIRC, we're not saying to omit the dependency list or the citation information; the concern was about recommending very specific file types (CITATION.cff and CodeMeta.json) without adequate explanation of them or help creating them. I think it would help new curators who aren't familiar with Python notebooks and these files if we linked to an example dataset that includes them. Can you link one?

Do you think something like this would work? Not sure what you mean by dataset here. https://doi.org/10.5281/zenodo.3953146 (This is code that generates CodeMeta files for R packages; a codemeta.json file is included.)

While "dataset" may be used broadly, I mean a dataset specific to this primer: a Python notebook that is an example of the recommended curation level.

Got it. I know these are "data curation primers"; it's just that I would never
refer to a Jupyter notebook as data, so it confused me for a sec.
After a very quick search, how about this:
https://doi.org/10.5281/zenodo.3569768
And here's a random example on GitHub:
https://github.com/donomii/throff-jupyter
There's also a nice example on this project from Jupyter:
https://github.com/jupyter/nbgrader
(In reply to K.E. Koziar's comment of Wed, Sep 16, 2020, 11:53 AM.)

Separate commits detail proposed changes.
To summarize the proposed changes:
PR made on behalf of our team:
@kozlowwe
@gjanee
@srerickson
@cincyamyK
@kekoziar
@gdntmoon