* feat: add more to our gen ai policy
* Apply suggestion from @lwasser
* Minor text edit
Co-authored-by: Kylen Solvik <kysolvik@gmail.com>
* Minor text edit
Co-authored-by: Kylen Solvik <kysolvik@gmail.com>
* Apply suggestion from @MicahGale
Co-authored-by: Micah Gale <mgale@fastmail.com>
* Apply suggestion from @kysolvik
Co-authored-by: Kylen Solvik <kysolvik@gmail.com>
---------
Co-authored-by: Inessa Pawson <inessapawson@gmail.com>
Co-authored-by: Kylen Solvik <kysolvik@gmail.com>
Co-authored-by: Micah Gale <mgale@fastmail.com>
File: about/package-scope.md (11 additions & 24 deletions)
@@ -140,7 +140,7 @@ Tools for depositing data into scientific research repositories.
- Examples: [This is an example from rOpenSci - eml](https://github.com/ropensci/software-review/issues/80)
-### Data validation and testing:
+### Data validation and testing
Tools that enable automated validation and checking of data quality and
completeness. These tools should be able to support scientific workflows.
@@ -169,9 +169,9 @@ reproducible workflows. These
tools may include build systems and tools to manage continuous integration.
This also includes tools that support version control.
-- Examples: Neither of these tools has been reviewed by pyOpenSci yet, but both are examples of tools that might be in scope for this category - [snakemake](https://snakemake.readthedocs.io/en/stable/), [pyGitHub](https://github.com/PyGithub/PyGithub)
+- Examples: Neither of these tools has been reviewed by pyOpenSci yet, but both are examples of tools that might be in scope for this category - [snakemake](https://snakemake.readthedocs.io/en/stable/), [pyGitHub](https://github.com/PyGithub/PyGithub)
-### Citation management and bibliometrics:
+### Citation management and bibliometrics
Tools that facilitate managing references, such as for writing manuscripts,
creating CVs or otherwise attributing scientific contributions, or accessing,
@@ -207,7 +207,11 @@ The review for this package:
- requires at least 1 domain specialist
- will never vet the analytical method itself.
+
+2. We cannot review a package that introduces a new or novel analytic approach unless it has already been **vetted or accepted by a scientific journal**. We also cannot review projects that serve as proof-of-concept demonstrations of a model or analytical approach that might accompany a paper. If your package falls under either of these cases, please submit it to a scientific journal for peer review before requesting a review here.
+
3. If your package implements a novel approach that **has** been peer-reviewed and accepted by a credible scientific journal, it may be eligible for our [publication fast-track review](publication-fast-track). Fast-track review is a streamlined process focused on software quality and packaging standards. A fast-track review is performed by one reviewer rather than two, focusing solely on packaging rather than the scientific methods applied. Since the domain/scientific component has already been vetted by the journal, the pyOpenSci fast-track reviewer is not expected to re-evaluate the underlying scientific method. To apply for the fast-track route, please first submit a pre-submission inquiry and include the publication details.
@@ -230,7 +234,7 @@ we will expand this list.
Packages focused on the retrieval, manipulation, and analysis of spatial data.
@@ -300,31 +304,14 @@ that may be outside JOSS scope while maintaining our partnership for
packages that meet both organizations' criteria.
:::
-### Telemetry & user-informed consent
-
-Your package should not collect usage analytics without first informing your users about what data are being collected and what is being done with that data. With
-that in mind, we understand that package-use data can be invaluable for the
-development process. If the package does collect such data, it should do so
-by prioritizing user-informed consent. This means that before any data are
-collected, the user understands:
-
-1. What data are collected
-2. How the data are collected
-3. What you plan to do with the data
-4. How and where the data are stored
-
-Once the user is informed of what will be collected and how that data will be handled, stored, and used, you can implement `opt-in` consent. `opt-in` means that the user agrees to usage-data collection before it is collected (rather than having to opt out when using your package).
-
-We will evaluate usage data collected by packages on a case-by-case basis
-and reserve the right not to review a package if the data collection is overly
-invasive.
-
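The opt-in consent flow described in the telemetry section above could be sketched in a few lines. This is a hypothetical illustration only: the `Telemetry` class and all names here are invented for this sketch, not a real or pyOpenSci-endorsed package API.

```python
# Hypothetical sketch of opt-in telemetry consent.
# No data are collected unless the user has explicitly agreed
# after seeing the disclosure (opt-in, never opt-out).

CONSENT_DISCLOSURE = """\
This package can collect anonymous usage analytics.
1. What is collected: command names and error types (no file contents).
2. How: events are batched locally, then uploaded over HTTPS.
3. Why: to prioritize bug fixes and new features.
4. Where: stored for 90 days on the project's server.
"""


class Telemetry:
    def __init__(self):
        self.consented = False  # opt-in: collection is OFF by default
        self.events = []

    def opt_in(self, user_agreed: bool) -> None:
        """Show the disclosure, then record the user's decision."""
        print(CONSENT_DISCLOSURE)
        self.consented = bool(user_agreed)

    def record(self, event: str) -> None:
        # Silently drop events unless the user has opted in.
        if self.consented:
            self.events.append(event)


telemetry = Telemetry()
telemetry.record("import")         # dropped: no consent yet
telemetry.opt_in(user_agreed=True)
telemetry.record("run_analysis")   # recorded: user opted in
print(telemetry.events)            # ['run_analysis']
```

The key design point is that the default state is non-collecting, and the disclosure is shown before the consent decision is stored.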
To be in technical scope for a pyOpenSci review, your package:
- Should have maintenance workflows documented.
- Should declare vendor dependencies using standard approaches rather than including code from other packages within your repository.
- Should not have an exceedingly complex structure. Others should be able to contribute and/or take over maintenance if needed.
+
+See our [policy for use of generative AI / LLMs](../our-process/policies.md#generative-ai-and-open-source-development) for additional expectations regarding AI-generated code and documentation.
+
:::{admonition} pyOpenSci's goal is to support long(er) term maintenance
pyOpenSci has a goal of supporting long term maintenance of open source
Python tools. It is thus important for us to know that, if you need to step down as a maintainer, the package infrastructure and documentation is
@@ -363,7 +350,7 @@ Your package might be out of technical scope if it is:
A few examples of packages that may be too technically challenging for us to
find a new maintainer for in the future are below.
-### Example 1: Your package is an out of sync fork of another package repository that is being actively maintained.
+### Example 1: Your package is an out of sync fork of another package repository that is being actively maintained
We understand that sometimes a package maintainer may need to step down. In
that case, we strongly suggest that the original package owner transfer the
File: appendices/gen-ai-checklist.md (10 additions & 3 deletions)
@@ -1,11 +1,18 @@
```markdown
-- [ ] Generative AI was used to produce some of the material in this submission.
-- [ ] If generative AI was used in this project, the authors affirm that all generated material has been reviewed and edited for clarity, correctness, and completeness. The authors are responsible for the content of their work and affirm that it is in a state where reviewers will not be responsible for primary editing and review of machine-generated material.
+- [ ] This package has a public development history spanning 3-6 months, with commits distributed over time that reflect **iterative, thoughtful development.**
+- [ ] All code in this package has been **carefully reviewed by a human**. Its implementation is also understood by the authors submitting the package.
+- [ ] All communication on this issue will be written by a human (someone on your maintainer team). We embrace the use of LLMs for translation and grammar correction. We prefer honest interactions over ones that prioritize perfect language and grammar. Use as little aid from an LLM as possible.
+- [ ] **Generative AI tools were used to develop and maintain this package.**
-If you checked the first box above, please describe how generative AI was used, including:
+**Please list the tools and frameworks that you used below.**
+If you checked the last box above, please describe the tools and frameworks that you used, including:
- **Which parts** of the submission were generated (e.g., documentation, tests, code). In addition to a general description, please specifically indicate any substantial portions of code (classes, modules, subpackages) that were wholly or primarily generated by AI.
- **The approximate scale** of the generated portions (e.g., "all of the tests were generated and then checked by a human," "small routines were generated and copied into the code").
- **How the generative AI was used** (e.g., line completion, help with translation, queried separately and integrated, agentic workflow).
If you have a policy around generative AI use in your project, please provide a link to it below:
The Generative AI policy below was co-developed by the pyOpenSci community. Its goals are:
+
+* **Acknowledge the widespread use of Generative AI tools** (LLMs) and promote transparency and responsible use that ensures better software outputs and supports sound open source development practices.
+* **Ensure an equitable balance of effort in peer review**: authors are responsible for human review of AI-generated content before submission; our volunteer reviewers are not responsible for identifying and/or correcting machine-generated errors or issues.
+* **Protect volunteer reviewers** from being the first line of review for generated code.
+* Give reviewers and editors the information they need to make informed decisions about what they choose to review.
+* **Support and promote packages that follow sustainable software practices** that enable future discovery and uphold the foundational principles of scientific open source.
+* Raise awareness of the broader challenges Generative AI presents to the scientific open source community.
+* Promote transparency and privacy in user data.
+
+[Please see this GitHub issue for a discussion of the topic.](https://github.com/pyOpenSci/software-peer-review/issues/331)
+
+In generating our Generative AI policy, we acknowledge some of the other policies in the open source ecosystem that inspired our work here, including:
+
+* [Melissa Mendonça’s Collection of GenAI Policies](https://github.com/melissawm/open-source-ai-contribution-policies)
+
+::::
+
+## Generative AI and open source development
+
+We understand and support your use of Generative AI tools to improve your software development workflows and make them more efficient. We want you to use these tools thoughtfully and effectively, and in ways that improve both the open source ecosystem and your development trajectory.
+
+We expect that all code and documentation submitted to our peer review process should have meaningful human review, intervention, judgment, and context. We understand that the use of current Generative AI tools is often tightly woven into development workflows, making disclosure challenging. But **we still require disclosure** to support transparency and to allow reviewers and editors to understand what they are reviewing.
+
+The policies below support adherence to thoughtful open source development best practices. A pyOpenSci package submission should demonstrate both need and sustained value to the research community. **Short-lived, single-use codebases are out of scope for pyOpenSci.**
+
+## Communication in review issues
+
+* We prefer that all communication in our software review issues is written by a human. We embrace the use of LLMs for translation and grammar correction. We prefer honest interactions over ones that prioritize perfect language and grammar. Use as little aid from an LLM as possible.
+* We will block accounts that spam our repositories or burden our volunteers with repeated, automated comments that aren't directly related to and in support of productive conversations in a review.
+
+## Package development and design approach
+
+* **Development History Timeline:** Projects should have at least **3-6 months of public development history**, with evidence of releases, public issues, and pull requests that reflect **iterative, thoughtful development** rather than rapid and recent code generation.
+* If the human effort put into the package is less than the effort required to review it, please don't submit the package.
+* Software should be developed openly, rather than developed in private and then moved to a public repository with an OSI-approved license to meet minimal open source requirements.
+* **Development History Approach:** We encourage thoughtful development history and patterns, including tightly scoped commits with clear commit messages that follow iterative development best practices, rather than large commits that address multiple issues and touch large volumes of files throughout the package. These workflows signal careful design and development, and produce changes to a codebase that can be reviewed by a human.
+* Projects with very short, rapid development timelines (weeks to a few months) will face higher scrutiny from our review teams than those with a significant development history (more than 6 months).
+* **Package Scope & Design:** We value packages with a thoughtful, well-scoped design. When submitting, we will ask you to describe the key design decisions behind your package: the tradeoffs you considered and why you built it the way you did.
+* We place greater value on packages that have been adopted or used by a wide user base, since this demonstrates that the package has design and performance characteristics that meet multiple use cases.
+* Be sure to situate your package within the broader Python ecosystem: identify related tools, explain how your package differs from them, and explain how it complements, extends, or builds upon them.
+* We particularly value **work that builds upon or extends existing tools rather than reinventing solutions** where quality alternatives already exist.
+
+Below is the checklist that you will need to respond to in our submission form: