Skip to content

Add pipeline AGENTS.md#2

Open
itrujnara wants to merge 17 commits into
mainfrom
add-agents-md
Open

Add pipeline AGENTS.md#2
itrujnara wants to merge 17 commits into
mainfrom
add-agents-md

Conversation

@itrujnara

Copy link
Copy Markdown
Collaborator

This is the initial draft of the central nf-core AGENTS.md. This file will be referenced by the template AGENTS.md. Feel free to leave any comments and suggestions, since this file should be based on community experience.

@itrujnara itrujnara requested a review from a team June 16, 2026 12:44
Comment thread AGENTS.md Outdated
Comment thread AGENTS.md Outdated
| ├── script.py // all scripts must start with a shebang
| └── other_script.R
├── CHANGELOG.md // changelog, should be updated after every substantial change
├── CITATIONS.md // list of tool citations, should be updated when new tools are added

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think this update should be made via nf-core tools, cf nf-core/tools#4328

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, it's deterministic

Comment thread AGENTS.md Outdated

@pinin4fjords pinin4fjords left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My AI-assisted review :-).

But it's derived from my own AI-dev practices.


Really solid first draft. The directory treemap and the terms / pipeline-structure / configuration sections are accurate and exactly the grounding an agent needs, and it is good to see modules.config already described as a single process block with withName selectors.

Most of my comments are about places where the behaviour the file prescribes would, in practice, make an agent burn CI or fight the branch model, plus a couple of module conventions reviewers reliably enforce. Themes, with specifics inline:

  • The per-commit routine asks for a full nf-test run (and release lint) before every atomic commit. For an agent making several commits on a branch that is N full pipeline test runs where one before pushing would do. Better to tie the heavy checks to push / pre-PR and keep only the cheap static checks per-commit.
  • The fork branch policy contradicts the feature-branch rule one sentence earlier and would let an agent pile unrelated work onto its fork's dev, which then cannot produce single-feature PRs.
  • Nothing about git worktrees. For autonomous agents that may run several sessions against one checkout, a one-line "use a worktree per task" near the branch policy would prevent a lot of clobbering.
  • Snapshot guidance should warn that file hashes are architecture-sensitive, or an agent will regenerate them locally and break CI on a different arch.
  • ext.args / ext.prefix are the heart of module configuration but never appear; that is the convention reviewers most often ask for ("should args be exposed?").

Agree that the modules repo will need its own version (it is not template-generated and has different conventions), and that AGENTS.md updates are best applied through nf-core tools rather than hand edits.

Comment thread AGENTS.md Outdated
Comment thread AGENTS.md Outdated
Comment thread AGENTS.md Outdated
Comment thread AGENTS.md Outdated
Comment thread AGENTS.md Outdated
Comment thread AGENTS.md Outdated
Comment on lines +162 to +167
Before each commit, perform all of the following:
1. Run `nextflow lint .` to lint all Nextflow scripts in the repository. Resolve all errors and all possible warnings. Repeat until there are no solvable outstanding issues.
2. Run `nf-core pipelines lint`, resolve all errors and all possible warnings. Repeat until there are no solvable outstanding issues. If you are preparing a release (PR to main), use `nf-core pipelines lint --release` instead.
3. Run `nf-test test tests/`. If the pipeline fails, resolve the underlying issues. If the test fails due to mismatching snapshots, update them with `nf-test test tests/ --update-snapshot` only if you expect the specific change in the output. Otherwise, fix the issue that caused the unexpected change.
4. Run `prek` and stage all changes it generates.
After completing these steps, you are free to commit your changes.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Running the full pipeline nf-test suite (and release lint) before every atomic commit is very expensive and slow, and CI re-runs it on the PR anyway. For an agent that makes several commits per branch this multiplies cost for little signal. Suggest splitting: keep the cheap, fast static checks at per-commit (prek and nextflow lint ., which prek may already cover), and move nf-test test tests/ and nf-core pipelines lint --release to a pre-push / pre-PR step. That keeps commits cheap while still guaranteeing green before review.

Relatedly, since nf-core CI runs the full test matrix on every push, consider telling the agent to append [skip ci] to work-in-progress commit messages while iterating, and to drop it (so CI runs) on the final commit before opening or updating the PR for review. That avoids a full CI run per intermediate commit while still guaranteeing a green run before merge.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is deterministic and should be in agent hooks in the various harnesses

@itrujnara itrujnara Jun 16, 2026

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, although we are not planning to create agent hooks within this project (even though it would be ideal if people used them)

Comment thread AGENTS.md Outdated
Comment thread AGENTS.md Outdated
Each PR requires reviews: 1 for dev, 2 for main. Advise the user to ask for reviews in the nf-core Slack, in `#request-review` (for dev PRs) or `#release-review-trading` (for main PRs). There is also a CI pipeline executed on each PR. All checks must pass before the PR can be merged.

## Agent self-disclosure
As an AI agent, you are required to acknowledge your activity in nf-core. If you generated a majority of the code in a commit, add "This commit was generated by {your name}" at the of the commit message body. If you open a PR autonomously, add "This pull request was created by {your name}" at the end of the PR message (above the checklist).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just read somewhere that some models/harnesses may not be good at identifying themselves. https://www.reddit.com/r/ollama/comments/1u6b5o9/i_dont_trust_ollama_cloud_is_it_possible_that_its/

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name of the model is not recoverable because it is not part of the context, and the model may change between sessions. What we want to include here is the name of the agent, which should be injected into the context (at least it is for Copilot, but should also be for the others). Knowing the underlying model at commit time would be nice, but it may not be too reliable, as the link suggests.

Comment thread AGENTS.md Outdated
All comments and documentation must be written in English with British spelling. Documentation files should additionally follow the style guide at https://nf-co.re/docs/developing/documentation/style-guide.

## Nextflow pitfalls
TBC

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One suggestion is that publishing should be handled by one method, either publishDir in the modules.config, or the output block in the workflow file. I'm not sure who's migrated to the output block though.

nf-core tools sometimes fails. Ask the user how to proceed. Don't auto-generate files that should be generated by an nf-core tools command.

There's the question about DSL1 syntax ( is everything migrated now? )

Ensure all variables are scoped (def).

itrujnara and others added 4 commits June 16, 2026 16:20
Co-authored-by: Maxime U Garcia <max.u.garcia@gmail.com>
Co-authored-by: Jonathan Manning <pininforthefjords@gmail.com>
@itrujnara

Copy link
Copy Markdown
Collaborator Author

Pushed some updates, hopefully they address the feedback so far

Copy link
Copy Markdown

Really nice draft @itrujnara — the content here is genuinely strong, and most of it is exactly what an agent needs. Before we lock in where this lives, I wanted to float a bigger-picture question, since I think how we distribute it matters about as much as the content itself.

A few things rolled together:

1. A root AGENTS.md tends to describe the repo it sits in. Agents read the one nearest their working dir, from the local checkout. So someone cloning nf-core/agents to work on tooling would read this and get pipeline rules (nf-core pipelines lint, the branch policy, changelog updates) that don't really apply here. I think what we've authored is really pipeline guidance — great content that's just slightly mis-homed as this repo's own config.

2. The "referenced by the template AGENTS.md" bit is the part I'd love to talk through. In practice agents don't reliably fetch a remote central file at runtime — they read what's on disk. So a runtime reference risks not getting loaded, and it overlaps with the template ownership tools already has. We sort of pointed at this already in the "updates should flow through nf-core tools because it's deterministic" thread (@maxulysse / nf-core/tools#4328) — if tools owns the sync, it feels natural for the content to live in the template rather than in a repo that pipelines point at.

3. Different contexts probably want their own file. Roughly:

  • Pipeline guidance → baked into the template in nf-core/tools, synced/linted by tools (single source when we author, concrete local file when the agent runs).
  • Modules → its own in nf-core/modules (matches what we already said about it being special / not template-generated).
  • tools → its own for the Python codebase.

4. Where I think this repo really shines. The piece with no other obvious home is the harness glue — a Claude Code plugin (hooks that run lint/nf-test deterministically, inject the agent name for self-disclosure), Cursor rules, a Codex config, a shared skill/command library wrapping nf-core tools. That lines up nicely with the "this is deterministic, it belongs in hooks not prose" point from the per-commit-routine and self-disclosure threads, and feels like a natural charter for nf-core/agents.

One caveat in the other direction: if we end up with several real consumers of the same prose (template + modules + docs site), a content-source repo that tools vendors in at release (not referenced at runtime) would make sense. I'd just hold off on that until the duplication actually bites.

None of this is meant to slow down the great work on the content — mostly want to make sure we're happy with the distribution model before it spreads across pipelines. Keen to hear what everyone thinks!


Generated by Claude Code

@itrujnara

Copy link
Copy Markdown
Collaborator Author

Hi @edmundmiller, thanks for your feedback. It is true that each AGENTS.md should live in the related repo, but here the decision to split was deliberate. The full discussion can be found in nf-core/proposals#141 and nf-core/proposals#143.
Tl;dr: tools release cycle is slow. The idea here was to decouple the AGENTS file from the tools version and make it editable with a simple PR. Note that this AGENTS file applies exclusively to repos built with the template (i.e., pipelines). Non-template repos like modules, website, and tools will get their own AGENTS files, shipped directly to the revelant repos (note that I will only work on the one in modules, since I don't work on infrastructure enough to describe the good practices there).
There might be a better approach, but I feel it might be a bit late now, since the RFC has now been approved and the implementation document has been greenlit for execution. Unfortunately, from what I see, you have not participated in either discussion.

Comment thread AGENTS.md Outdated

@vagkaratzas vagkaratzas left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • good sections coverage (might even be too much)
  • way too verbose overall IMO
  • unsure about git commits, should add to ask user who wants to make the commits
  • rewrite with a bot, where less text is more (tokens to spend later)

Might have gone a bit out of scope. I thought we would just have a basic folders description and code guiderails, and then links to nf-core online docs. But now it seems this has everything plus more.

Comment thread AGENTS.md Outdated
@@ -0,0 +1,188 @@
# nf-core: agents

This is the main AI context file for nf-core pipelines. All AI agents and coding assistants must read and strictly follow the rules contained in this document.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When setting bot guiderails, bold/caps can help, because models process Markdown and textual emphasis. Rule-like wording examples such as this:

- Agents **MUST** run `nf-core pipelines lint` before submitting changes.
- Agents **MUST NOT** modify generated files manually.
- Agents **SHOULD** prefer existing pipeline patterns over introducing new conventions.

Do this throughout.

Suggested change
This is the main AI context file for nf-core pipelines. All AI agents and coding assistants must read and strictly follow the rules contained in this document.
This is the main AI context file for nf-core pipelines. All AI agents and coding assistants **MUST** follow the rules contained in this document.

Comment thread AGENTS.md Outdated
This is the main AI context file for nf-core pipelines. All AI agents and coding assistants must read and strictly follow the rules contained in this document.

## Nextflow language
Unless otherwise stated, all code in the repository is written in the Nextflow programming language. The documentation can be found at https://docs.seqera.io/nextflow/.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not true for bin/ scripts, .tml files, ++ I guess. Maybe say for workflows / subworkflows or skip, because this is self-explenatory to the bots without needing to reading this line.

Comment thread AGENTS.md Outdated
All comments and documentation must be written in English with British spelling. Documentation files should additionally follow the style guide at https://nf-co.re/docs/developing/documentation/style-guide.

## Nextflow pitfalls
- Nextflow supports 2 ways to publish files to the output directory: workflow outputs (modern) and `publishDir` configuration directives (legacy). If the pipeline uses legacy outputs, use them consistently for new modules unless directly prompted otherwise. If the pipeline uses workflow outputs, use them consistently and never revert to legacy outputs.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should you link the modules.config here, where guides this? Some pipeline might use both, so this might be more confusing to the bot than intended

Comment thread AGENTS.md Outdated
- Nextflow supports 2 ways to publish files to the output directory: workflow outputs (modern) and `publishDir` configuration directives (legacy). If the pipeline uses legacy outputs, use them consistently for new modules unless directly prompted otherwise. If the pipeline uses workflow outputs, use them consistently and never revert to legacy outputs.
- nf-core tools commands may fail. If that happens, ask the user for help. Never generate any file that is supposed to be generated by nf-core tools.
- Certain very old pipelines might be using Nextflow DSL1 syntax (with the entire workflow in a single file and channel from/to keywords). This syntax is now deprecated. Do not attempt to work on those pipelines and advise the user to refactor to DSL2.
- All variables must be defined using the `def` keyword (example: `def x = 5`).

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is too much detail for here. The bots should just use nextflow lint * and the language server to bypass these problems.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's also just not true in the slightest and in fact will break regularly. For example if you define def prefix = 'blahblahblah' in your script section and then try to use path("${prefix}.tsv") in output, then your code will fail.

Comment thread AGENTS.md Outdated
│ ├── samplesheet.csv // example valid samplesheet
│ └── schema_input.json // JSON schema describing the samplesheet format
├── bin // scripts for local modules
| ├── script.py // all scripts must start with a shebang and carry a licence/author header

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and be executable

Comment thread AGENTS.md Outdated

If you work on multiple features in parallel, use a separate worktree for each task to prevent clobber.

If you only want to fix a bug in a released version of a pipeline, you should instead create a branch called `patch` from `main`, work in it, and open a PR to nf-core main once done.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remember correctly the branch should be named patch: ++ ? Make sure else the PR will be auto rejected I think

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't mention this option for this general use case at all. this feature is more for "emergency" patches

Comment thread AGENTS.md Outdated
## Commit rules and routine
Each commit should be as atomic as possible, that is, only contain one logical change. There is no limit on the number of files in a commit. There is no mandated commit message format, but the commit title should be concise and written in imperative mood. If the commit consists only of installing or updating an nf-core module or subworkflow, limit the commit title to `Install/update nf-core module/subworkflow {name}`.

Before each commit, run `prek` and stage all changes it generates. Resolve all errors and all possible warnings. Repeat until there are no solvable outstanding issues. After that, you are free to commit your changes.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Before each commit, run `prek` and stage all changes it generates. Resolve all errors and all possible warnings. Repeat until there are no solvable outstanding issues. After that, you are free to commit your changes.
Before each commit, stage changes and then run `prek`. Resolve all errors and all possible warnings. Repeat until there are no solvable outstanding issues. After that, you are free to commit your changes.

Comment thread AGENTS.md Outdated
## Push routine
You can push changes to GitHub as often as required, especially during PR review, but you should only push after implementing some meaningful changes. Only push if the code is working.

Before pushing, ensure nf-core linting is passing. Run `nf-core pipelines lint`, resolve all errors and all possible warnings. Repeat until there are no solvable outstanding issues. If you are preparing a release (PR to main), use `nf-core pipelines lint --release` instead.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dont forget the strict syntax linting nextflow lint *

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mentioned in an earlier comment, prek runs this command automatically

Comment thread AGENTS.md Outdated
GitHub Actions will run CI for every push. If you know the code will cause issues or you intend to push more changes, add `[skip ci]` at the end of the commit title. Omit this tag if the changes are final, especially right before a PR or when you want the code to be reviewed.

## PR procedure
Changes to nf-core `dev` and `main` branches must be made through GitHub pull request. A PR should generally contain a single feature. The PR must use and follow the nf-core PR template, including the checklist. The PR message should start with a brief explanation of the changes made and the motivation.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems duplicate from before?

Comment thread AGENTS.md Outdated
## PR procedure
Changes to nf-core `dev` and `main` branches must be made through GitHub pull request. A PR should generally contain a single feature. The PR must use and follow the nf-core PR template, including the checklist. The PR message should start with a brief explanation of the changes made and the motivation.

Each PR requires reviews: 1 for dev, 2 for main. Advise the user to ask for reviews in the nf-core Slack, in `#request-review` (for dev PRs) or `#release-review-trading` (for main PRs). There is also a CI pipeline executed on each PR. All checks must pass before the PR can be merged.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure we want a guiderails markdown file, also making the bot provide advice to the user. To discuss more on this

@itrujnara

Copy link
Copy Markdown
Collaborator Author

Hi @vagkaratzas, thanks a lot for your feedback. I have applied the most straightforward suggestions. I will need to run through the file again tomorrow to decrease verbosity/redundancy and highlight key instructions.
I would appreciate more guidance on the topic of user interaction. The only relevant section that remains is about Slack and PR review. It feels reasonable to me to have the AI tell the user "PR is open, you need a review, you can ask for it on Slack". That is, unless we are okay with the AI posting to Slack on its own, which is a much deeper discussion.

@vagkaratzas vagkaratzas left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a couple of comments/suggestions but I think it's in the right direction now. The only way to evaluate this is by actually using it. And people reporting what works and whatnot so we can keep improving

Comment thread AGENTS.md Outdated
Comment thread resources/pipeline/AGENTS.md
Comment thread AGENTS.md Outdated

@ewels ewels left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work. Let's ship it and iterate 👍🏻

Comment thread resources/pipeline/AGENTS.md
Comment thread AGENTS.md Outdated
Co-authored-by: Evangelos Karatzas <32259775+vagkaratzas@users.noreply.github.com>
Co-authored-by: Phil Ewels <phil.ewels@seqera.io>
Comment thread AGENTS.md Outdated
Comment thread AGENTS.md Outdated
Comment thread AGENTS.md Outdated
│ └── nf-core // nf-core modules (see section below)
│ ├── fastqc
│ | ├── main.nf // Nextflow script, may be edited if necessary
| | └── ... // do not edit other files in nf-core modules

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why? how are they different from editing main.nf?

Comment thread AGENTS.md Outdated
Comment thread AGENTS.md Outdated
Comment thread AGENTS.md Outdated
Co-authored-by: Phil Ewels <phil.ewels@seqera.io>
Comment thread AGENTS.md Outdated

Several files have been skipped from the treemap. If a file is not in the treemap, you **SHOULD NOT** edit it unless explicitly prompted.

## Key nf-core terms

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as for the module repo - as a human I'd prefer these definitions frontloaded, I don't know if that applies in this context though.

@itrujnara itrujnara changed the title Add AGENTS.md Add pipeline AGENTS.md Jun 26, 2026
@itrujnara

Copy link
Copy Markdown
Collaborator Author

Changed file path to resources/pipeline/AGENTS.md to get out of the way of @edmundmiller's pending changes, content remains unchanged

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants