Fix: Update scope document and also fix a few sphinx bugs #162
Changes from 25 commits
ba272d9
b839139
f70c92f
eb5733b
e2c0066
c3ec5c6
59c2bd5
35bfc62
f2b3c9d
07d219c
1248d93
bbcf516
c78faac
e80bc22
57ccb6e
272d5ba
665bfe1
3d7a6f8
4c5a12e
75dd711
c9c90c2
1fb11a6
4fe7f02
cc7e9ce
31f4a03
06ada72
2eef270
9efc539
6de8c52
46570dd
152c2f3
f5b50d9
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -18,21 +18,257 @@ Currently, the packages that pyOpenSci reviews also need to fall into the | |
technical and applied scope of our organization. This scope may expand over time | ||
as the organization grows. | ||
|
||
|
||
|
||
## Is Your Package in Scope For pyOpenSci Review? | ||
|
||
pyOpenSci only reviews packages that fall within our specified domain and | ||
technical scope listed below. | ||
pyOpenSci reviews packages that fall within a list of specified categories and | ||
domains. Packages must also meet our technical scope requirements. | ||
|
||
If you are unsure whether your package is in scope for review, please | ||
open a [pre-submission inquiry using a GitHub Issue](https://github.com/pyOpenSci/software-review/issues/new?assignees=&labels=0%2Fpresubmission&template=presubmission-inquiry.md&title=) to get feedback from | ||
one of our editors. We are happy to look at your package and help you understand | ||
whether it is in scope or not. | ||
|
||
```{include} /appendices/scope.md | ||
### About the types of packages that we review | ||
|
||
pyOpenSci reviews packages that support open reproducible science, | ||
data processing, and the various stages of managing the | |
|
||
data lifecycle. Packages submitted to pyOpenSci should fit into one or | ||
more of the categories below and should be within our technical scope. | ||
|
||
```{admonition} Package Use Metrics Are Not a Requirement for Review | ||
Review comment: This suggests "we don't need you to go and measure your package use (but we may consider how much it is used)". Rather: "Widespread package usage is not required for review", "We review independent of popularity", or similar?
Review comment: many thanks @stefanv fixing. @NickleDave i think had similar feedback here and i haven't been able to get it right!
Review comment: is this any better : Your Package Does Not Need to Have Widespread Use to be Reviewed We review packages across a spectrum of small to large user bases. The popularity of your package is not a consideration in our review process!
Review comment: Sounds good to me! nit: would maybe say it like "Your Package Does Not Need to Be In Widespread Use Yet to be Reviewed" or "Your Package Does Not Need to Be Widely Used to be Reviewed"? ("in use" not "have use")
Review comment: this was merged a few hours ago... would you be open to creating a PR for this - i'm fine with cleaner wording.
Review comment: Yup, sorry, was catching up with replies |
||
:class: important | ||
|
||
When we evaluate whether your package is within our scope, we only consider: | ||
|
||
1. how the package is developed and | ||
2. how the package relates to and supports the broader scientific ecosystem | ||
|
||
|
||
pyOpenSci does not evaluate how much community use your package has. | ||
As such, the number of GitHub stars or PyPI / Conda | ||
downloads is NOT considered as a part of our scope evaluation. | ||
Review comment: This is very explicit, but unsure it adds much content.
Review comment: If you're worried about users being confused at this point already, perhaps it's worth clarifying WHY you are reviewing in the first place. It's not to establish how useful a package is, but how well it is put together. This is a subtle change of emphasis on the "we only consider" clause above.
Review comment: ok and now i'm here - We review packages with the goal of improving package quality and usability for scientists. When we evaluate whether you package is within our scope, we only consider:
We welcome young packages that are just entering the scientific Python |
||
|
||
We welcome young packages that are just entering the scientific Python | ||
ecosystem to apply for review if they are relevant to the science community and | ||
fit into at least one scope category below. We also welcome mature packages with | ||
a growing or established community! | ||
``` | ||
|
||
If you are unsure whether your package fits into one of the general or | ||
statistical categories, please open an issue as a [pre-submission inquiry](https://github.com/pyOpenSci/software-submission/issues/new?assignees=&labels=0%2Fpresubmission&template=presubmission-inquiry.md&title=). | ||
|
||
```{note} | ||
This is a living document. The categories below may change through time. | ||
As a result, some previously peer review-accepted packages | ||
may not be in-scope today. We strive for consistency in our peer review process. However, we also evaluate packages on a case-by-case basis. | ||
In some cases, exceptions are made. | ||
``` | ||
|
||
Review comment: NOTE: i think we should add a policy around telemetry in packages. a current review has it. i think we should discourage against it but if it's required make it opt in rather than opt out preferred? We should also evaluate whether we want to review general data manipulation and coding tools. ROS does not review these tools. but defining what that means is tricky.
Review comment: Perhaps start with a narrower scope? It's easy to broaden it later, but not the other way around.
Review comment: good point! is this a specific comment related to our scope as defined now or just to that statement above? I added that given Ropensci has similar language and it just allows us to adjust in the future in case we realize we are headed down a rabbit hole that doesn't work for us! (kind of like a legal disclaimer :) in case we need to modify scope) We've had the same scope for the past 4 years BUT of course we are being more specific now about each category given all of the questions we've gotten :) |
||
## Package categories | ||
|
||
The following categories are the current domain areas that fall into the | ||
|
||
pyOpenSci domain scope. Note that your package should have some level of | ||
|
||
demonstrated scientific application. This could be a use case that you can | ||
link to or a tutorial that demonstrates its potential application for science. | ||
|
||
Below we provide examples of packages from the pyOpenSci ecosystem. Because our | |
community of packages is still growing, in some cases we link to R packages | ||
within the rOpenSci community that match the category scope for reference. | ||
|
||
We will update this page as our review process evolves. | ||
|
||
```{note} | ||
Many of the example packages below perform tasks that might fit in multiple | ||
categories. These examples are there to provide you with a flavor of the types | ||
of packages that would fall into that category. | ||
``` | ||
|
||
### Data retrieval | ||
Packages for accessing and downloading data from online sources. This category | ||
includes wrappers for accessing APIs. | ||
|
||
Our definition of scientific applications is broad, including data storage | ||
services, journals, and other remote servers, as many data sources may be of | ||
interest to scientists. However, retrieval packages should be focused on data | ||
sources / topics, rather than services. For example, a general client for Amazon | ||
Web Services data storage would not be in-scope. | ||
|
||
* Examples: [OpenOmics](https://github.com/pyOpenSci/software-submission/issues/31) | ||
|
||
|
||
|
||
|
||
### Data extraction | ||
|
||
These packages aid in retrieving data from unstructured sources such as text, | ||
images, and PDFs. They might also parse scientific data types and outputs from | ||
scientific equipment. | ||
|
||
* Examples: [devicely](https://github.com/pyOpenSci/software-submission/issues/37), [jointly](https://github.com/pyOpenSci/software-submission/issues/45) | ||
|
||
### Data processing & munging | ||
|
||
Data munging tools are tools that support processing data discussed above. This | ||
Review comment: Is it worth giving a definition of munging here, or linking to one? (like https://en.wikipedia.org/wiki/Data_wrangling) Something like "transforming data in a way that makes further analysis possible" (e.g. Physcraper updating local phylogenies with publicly available data)
Review comment: absolutely - want to fix inline please?
|
||
category focuses on tools for handling data in specific formats that scientists | ||
may be interested in working with. These data may also be generated from | ||
scientific workflows or exported from instruments and wearables. | ||
|
||
* Examples: [physcraper](https://github.com/pyOpenSci/software-submission/issues/26) | ||
|
||
|
||
|
||
### Data deposition | ||
|
||
Tools for depositing data into scientific research repositories. | ||
|
||
* Examples: [This is an example from rOpenSci - eml](https://github.com/ropensci/software-review/issues/80) | ||
|
||
### Data validation and testing | ||
|
||
Tools that enable automated validation and checking of data quality and | ||
completeness. These tools should be able to support scientific workflows. | ||
|
||
* Example: [pandera](https://github.com/pyOpenSci/software-submission/issues/12) | ||
|
||
### Scientific software wrappers | ||
|
||
Scientific software wrappers refer to packages that provide a Python interface | ||
for existing scientific packages written in other languages. | ||
|
||
These packages should have a clear scientific application. Wrappers must provide | ||
significant added value to the scientific ecosystem, be it in data handling or | ||
improved installation processes for Python users. | ||
|
||
We strongly encourage submissions that wrap tools that are open-source with | ||
an OSI-approved license. Exceptions will be evaluated case-by-case, | ||
considering whether open-source options exist. | ||
|
||
<!-- TODO: need an example for this category --> | ||
* Examples: We don't have a package in this category yet - *Could be your package?* | ||
Review comment: Isn't pyGMT an example? Because it wraps GMT.
Review comment: i think so. pyGMT actually fits into several of our categories i think i just placed it in one. but if we implement your suggestion above then we demonstrate that a package can be in several categories. i'm fine with that!
|
||
|
||
### Workflow automation & versioning | ||
Tools that automate and link together workflows and as such support | ||
reproducible workflows. These | ||
tools may include build systems and tools to manage continuous integration. | ||
This also includes tools that support version control. | ||
|
||
<!-- TODO: martin's git package is a good example but it needs to be sunsetted as it's no longer maintained. hamilton is just starting review but would fit | |
well here - the link below for it is a presubmission not a review --> | ||
* Examples: [Hamilton - currently under review](https://github.com/pyOpenSci/software-submission/issues/74), martin's git package (that is no longer maintained.) | ||
|
||
|
||
### Citation management and bibliometrics | ||
|
||
Tools that facilitate managing references, such as for writing manuscripts, | ||
creating CVs or otherwise attributing scientific contributions, or accessing, | ||
manipulating or otherwise working with bibliometric data. (Example: [Example from rOpenSci - RefManageR](https://github.com/ropensci/software-review/issues/119)) | ||
|
||
### Data visualization & analysis | ||
These are packages that enhance a scientist's experience visualizing and | ||
analyzing data. | ||
|
||
* Examples: [PyGMT - (also spatial and data munging)](https://github.com/pyOpenSci/software-submission/issues/43) | |
|
||
### Database software bindings | ||
|
||
|
||
Bindings and wrappers for database APIs. | ||
|
||
* Example: [Example from rOpenSci - rrlite](https://github.com/ropensci/software-review/issues/6) | ||
|
||
|
||
## Domain areas | ||
|
||
In addition, our scope includes focused domain areas. These areas are based on | ||
Review comment: The "computational library" class, described here, seems to be missing from the previous section, when it's arguably the first category included?
Review comment: Say a little bit more about what you mean?
Review comment: ugh... i missed a few comments here! i will make sure these are added in a separate PR. i am not sure how i missed this |
||
partnerships that we form with communities and also expertise that we hold | ||
within our organization. As we develop [new community partnerships](/partners/scientific-communities) and grow, | ||
we will expand this list. | ||
|
||
### Geospatial | ||
|
||
Packages focused on the retrieval, manipulation, and analysis of spatial data. | ||
|
||
* Examples: [PyGMT](https://github.com/pyOpenSci/software-submission/issues/43), | |
[Moving Pandas](https://github.com/pyOpenSci/software-submission/issues/18) | ||
|
||
### Pangeo | ||
|
||
We have a [partnership with Pangeo](../partners/pangeo). Often, packages submitted as a part of that partnership are also in the geospatial domain. | |
|
||
* Examples: [xclim - under review now](https://github.com/pyOpenSci/software-submission/issues/73) | ||
|
||
### Education | ||
|
||
Packages to aid with instruction. | ||
<!--TODO - Earthpy fit in this category but it also needs to be sunsetted --> | ||
|
||
|
||
## Package technical scope | ||
|
||
To be in technical scope for a pyOpenSci review, your package: | ||
|
||
* Should have maintenance workflows documented. | ||
* Should declare dependencies using standard approaches rather than vendoring them, that is, including code from other packages within your repository. | ||
* Should not have an exceedingly complex structure. Others should be able to contribute and/or take over maintenance if needed. | ||
|
||
```{admonition} pyOpenSci's goal is to support long(er) term maintenance | ||
pyOpenSci has a goal of supporting long term maintenance of open source | ||
Python tools. It is thus important for us to know that if you need to step down as a maintainer, the package infrastructure and documentation are | ||
in place to support us finding a new maintainer who can take over your | ||
package's maintenance. | ||
``` | ||
|
||
### Telemetry & user-informed consent | ||
Review comment: I am going to pull this into it's own PR so we can link it explicitly to the open issues and discussion. |
||
|
||
Your package should avoid collecting usage analytics. With | ||
that in mind, we understand that package-use data can be invaluable for the | ||
development process. If the package does collect such data, it should do so | ||
by prioritizing user-informed consent. This means that before any data are | |
Review comment: from @yuvipanda - user-informed-consent and use "usage analytics" to define what is being collected. |
||
collected, the user understands: | ||
|
||
1. What data are collected | ||
2. How the data are collected | ||
3. What you plan to do with the data | ||
4. How and where the data are stored | ||
Review comment: from @stefanv possibly allow the user to view what is being sent (locally - good enough) or at it's endpoint storage (best) - i'm interpreting following good-better-best |
||
|
||
Once the user is informed of what will be collected and how that data will be handled, stored and used, you can implement `opt-in` consent. `opt-in` means that the user agrees to usage-data collection prior to it being collected (rather than having to opt-out when using your package). | ||
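
As a rough illustration of the `opt-in` idea described above, here is a minimal Python sketch. It is not a pyOpenSci requirement or an existing API: the function names, the `usage_analytics_opt_in` key, and the JSON consent file are all hypothetical choices made for this example. The point it tries to show is that the default is "no collection", and that nothing is sent unless the user has explicitly recorded consent.

```python
import json
from pathlib import Path
from typing import Callable


def load_consent(config_path: Path) -> bool:
    """Return True only if the user has explicitly opted in.

    A missing, unreadable, or malformed file is treated as "no consent".
    (The file format and key name are assumptions for this sketch.)
    """
    if not config_path.exists():
        return False
    try:
        data = json.loads(config_path.read_text())
    except (json.JSONDecodeError, OSError):
        return False
    return isinstance(data, dict) and data.get("usage_analytics_opt_in") is True


def record_consent(config_path: Path, opted_in: bool) -> None:
    """Store the user's decision after they have read what is collected,
    how it is collected, how it is used, and where it is stored."""
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps({"usage_analytics_opt_in": bool(opted_in)}))


def maybe_send_usage_event(
    config_path: Path, event: dict, send: Callable[[dict], None]
) -> bool:
    """Send the event only when consent was recorded; report whether it was sent."""
    if not load_consent(config_path):
        return False
    send(event)
    return True
```

In this sketch a package would call `record_consent` only after showing the user the four points listed above, and every collection path would go through `maybe_send_usage_event`, so a user who never answers is never tracked.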
|
||
We will evaluate usage data collected by packages on a case-by-case basis | ||
and reserve the right not to review a package if the data collection is overly | ||
invasive. | ||
|
||
### Notes on scope categories | ||
|
||
- pyOpenSci is still developing as a community. If your scientific Python | ||
package does not fit into one of the categories or if you have any other | ||
questions, we'd encourage you to open a pre-submission inquiry. We're happy to help. | ||
- Data visualization packages come in many varieties, ranging from small | ||
hyper-specific methods for one type of data to general, do-it-all packages | ||
(e.g. matplotlib). pyOpenSci accepts packages that are somewhere in between the | ||
two. If you're interested in submitting your data visualization package, please | ||
open a pre-submission inquiry first. | ||
|
||
|
||
## Examples of packages that might be out of technical scope | ||
|
||
pyOpenSci may continue to update its criteria for technical scope | ||
review as more packages with varying structural approaches are reviewed. | ||
Your package **may not be in technical scope** for us to review at this time if | ||
it fits any of the out-of-technical-scope criteria listed below. | ||
|
||
|
||
### Example 1: Your package is an out of sync fork of another package repository that is being actively maintained. | ||
|
||
We understand that a package maintainer may sometimes need to step down. In | |
that case, we strongly suggest that the original package owner transfer the | ||
package repository to a new organization along with PyPI credentials. A new | ||
organization would allow transfer of ownership of package maintenance rather | ||
than several forks existing. | ||
|
||
If your package is a divergent fork of a maintained repository we will encourage you | ||
to work with the original maintainers to merge efforts. | ||
|
||
However, if there is a case where a forked repository is warranted, please | ||
consider submitting a pre-submission inquiry first and explain why the package is a | ||
fork rather than an independent parent repository. | ||
|
||
### Example 2: Vendored dependencies | ||
|
||
If your package wraps another tool, we prefer that | |
the tool be declared as a dependency of your package rather than vendored. This allows | ||
maintenance of the original code base to be independent from your package's | ||
maintenance. | ||
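
One way a wrapper might rely on a declared dependency instead of a vendored copy is sketched below in Python. This is only an illustration under assumptions: `require_dependency` and the install hint are hypothetical names invented for this example, and the wrapped tool is assumed to be installable as a normal Python distribution. The wrapper imports the separately installed package at runtime and fails with an actionable message if it is missing.

```python
import importlib
import importlib.util
from types import ModuleType


def require_dependency(module_name: str, install_hint: str) -> ModuleType:
    """Import a separately installed dependency (top-level module name).

    Raises ImportError with an install hint instead of silently falling
    back to a copy of the code bundled inside this repository.
    """
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"'{module_name}' is required but not installed. "
            f"Try: {install_hint}"
        )
    return importlib.import_module(module_name)
```

Because the wrapped code lives in its own distribution, its maintainers can release fixes independently and users pick them up through a normal upgrade, which is the separation of maintenance described above.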
|
||
(package-overlap)= | ||
## Package Overlap | ||
pyOpenSci encourages competition among packages, forking and re-implementation | ||
|
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,6 @@ | ||
# Community Code of Conduct | ||
# pyOpenSci Code of Conduct | ||
|
||
We keep our Code of Conduct in our governance documentation. [Click here to | ||
go there now.](https://www.pyopensci.org/governance/CODE_OF_CONDUCT.html) | ||
All individuals participating in any pyOpenSci program, such as our peer review process, need to abide by our Code of Conduct. | ||
|
||
[Click here to | ||
read our full code of conduct now.](https://www.pyopensci.org/governance/CODE_OF_CONDUCT.html) |
Review comment: @eriknw @NickleDave @arianesasso @cmarmo @sneakers-the-rat @Batalex i'd LOVE your input on these changes. this page is really the only one you have to look at in this review. the rest of the changes are just link fixes and cleanup.
Many thanks in advance for this!