
Conversation

@manuel-g-castro
Contributor

@manuel-g-castro manuel-g-castro commented Nov 5, 2025

Hi, @kinow @dbeltrankyl @VindeeR

This merge request solves a critical issue that breaks every single workflow we have.

The problem is that the task's bash script is now passed as a heredoc whose delimiter is not quoted. As a result, the outer bash script that runs the Autosubmit code (header, tailer, etc.) performs parameter and variable substitution on the task body before the task itself runs.

As far as I could tell, this bug was introduced by

bash -e <<__AS_CMD__
set -xuve
{body}
__AS_CMD__
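
To illustrate the difference that quoting the heredoc delimiter makes, here is a minimal sketch. It is not Autosubmit's code: the variable GREETING and the captured outputs are hypothetical names used only for this example, and only the __AS_CMD__ delimiter is taken from the snippet above.

```shell
#!/bin/bash
# GREETING is a hypothetical variable used only for this illustration.
GREETING="outer"

# Unquoted delimiter: the outer shell expands ${GREETING} in the heredoc
# body before the inner bash ever reads it.
unquoted=$(bash <<__AS_CMD__
GREETING="inner"
echo "${GREETING}"
__AS_CMD__
)

# Quoted delimiter: the body reaches the inner bash verbatim, so the
# inner shell performs the expansion itself.
quoted=$(bash <<'__AS_CMD__'
GREETING="inner"
echo "${GREETING}"
__AS_CMD__
)

echo "unquoted heredoc printed: ${unquoted}"
echo "quoted heredoc printed:   ${quoted}"
```

With the unquoted delimiter the inner shell receives a body in which the value from the outer shell has already been substituted, which is the behaviour reported in this PR.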

I am marking this PR as draft because I would like for someone (@VindeeR ?) to write a test to check if the variables are being substituted correctly, so that the CI/CD is able to detect similar errors.

Check List

  • I have read CONTRIBUTING.md.
  • Contains logically grouped changes (else tidy your branch by rebase).
  • Does not contain off-topic changes (use other PRs for other changes).
  • Applied any dependency changes to pyproject.toml.
  • Tests are included (or explain why tests are not needed).
  • Changelog entry included in CHANGELOG.md if this is a change that can affect users.
  • Documentation updated.
  • If this is a bug fix, PR should include a link to the issue (e.g. Closes #1234).

@manuel-g-castro manuel-g-castro added the critical Use this label to report a critical failure that must be worked ASAP. label Nov 5, 2025
@manuel-g-castro manuel-g-castro marked this pull request as draft November 5, 2025 17:10
@codecov-commenter

codecov-commenter commented Nov 5, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 67.51%. Comparing base (4b47f90) to head (2f086ff).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2694      +/-   ##
==========================================
- Coverage   67.69%   67.51%   -0.19%     
==========================================
  Files          86       86              
  Lines       19786    19786              
  Branches     3840     3840              
==========================================
- Hits        13395    13358      -37     
- Misses       5458     5501      +43     
+ Partials      933      927       -6     
Flag Coverage Δ
fast-tests 67.51% <ø> (-0.19%) ⬇️


@dbeltrankyl
Collaborator

dbeltrankyl commented Nov 5, 2025

@manuel-g-castro
Contributor Author

Hi, @dbeltrankyl !

Thank you for the comment.

In my opinion, your changes are better than the ones that I made.

I actually do not agree with launching the task's bash using bash -e << because any error you have on the task would point to this line in the final .cmd since you are redirecting stdin.

What you have in your branch is like what we used to have in AS3 and AS4.15. If an error would happen, the .out would point to the line of the .cmd where the fault happened.

So I wonder why did you, @kinow and @VindeeR , decided to change to this redirection of the script to stdin.

I am happy to close this PR and just open an issue to move the discussion. Also, as I said, I think that a test needs to be created to account for this unbound variable error.

@dbeltrankyl
Collaborator

dbeltrankyl commented Nov 6, 2025

Hello @manuel-g-castro

I actually do not agree with launching the task's bash using bash -e << because any error you have on the task would point to this line in the final .cmd since you are redirecting stdin.

I did not review that to be honest so I don't have a strong opinion. We can change to whatever is best

I wouldn't close this PR as we need to cherry-pick the fix done on v4.2 for v4.1.16 release and add the test then!

@dbeltrankyl dbeltrankyl reopened this Nov 6, 2025
@kinow
Member

kinow commented Nov 7, 2025

We can change to whatever is best

+1

and add the test then!

Exactly my thoughts too. If it's not broken in CICD, we first need to make sure it's broken with a test, and then add the fix so it doesn't happen again.

I actually do not agree with launching the task's bash using bash -e << because any error you have on the task would point to this line in the final .cmd since you are redirecting stdin.

What you have in your branch is like what we used to have in AS3 and AS4.15. If an error would happen, the .out would point to the line of the .cmd where the fault happened.

So I wonder why did you, @kinow and @VindeeR , decided to change to this redirection of the script to stdin.

That's completely on me.

I couldn't find another way without displacing the error lines a bit. Now, unfortunately, users must look at the lines in the final script, not in their original scripts. We can try to match the previous behaviour, but I couldn't find a way to make everything work that way and also have the _STAT updated correctly.

But I was also changing other parts that were necessary to have the code updated. Maybe focusing just on this part, the stdin redirect, could be simpler. So let's leave this open and try to fix it for 4.1.16 👍

Thanks Manuel!

@manuel-g-castro
Contributor Author

Hi, @kinow ! Thanks for the comment.

Now, unfortunately, users must look at the lines in the final script, not in their original scripts.

I do not think this was ever the case. For me, at least, every time that I need to debug a script of mine, I need to check the .cmd script generated by Autosubmit. Otherwise the lines will be mismatched with the ones from my template script.

I think I am not being clear about my point of view, so let me walk through an example.

This is my script; because of the set -xuve, it fails on an unbound variable:

#!/bin/bash

set -xuve
echo ${UNBOUNDED_VARIABLE}

If I execute the workflow with this script, I get the following error in the .err:

/app/autosubmit/experiments/a000/tmp/LOG_a000/a000_LOCAL_SETUP.cmd: line 45: UNBOUNDED_VARIABLE: unbound variable

which correctly points to the faulty line, line 45 of the .cmd. Below is the snippet of the script (with line numbers).

    38	###################
    39	# Autosubmit job
    40	###################
    41	
    42	#!/bin/bash
    43	
    44	set -xuve
    45	echo ${UNBOUNDED_VARIABLE}
    46	
    47	###################
    48	# Autosubmit tailer
    49	###################

Now, if I do the same with the changes in master (c80654), the error points to the line that invokes bash (46), not to the line where the error actually happens in the script (51):

/app/autosubmit/experiments/a000/tmp/LOG_a000/a000_LOCAL_SETUP.cmd: line 46: UNBOUNDED_VARIABLE: unbound variable

Here is the snippet of the final .cmd script (with line numbers).

    46	bash -e <<__AS_CMD__
    47	set -xuve
    48	#!/bin/bash
    49	
    50	set -xuve
    51	echo ${UNBOUNDED_VARIABLE}
    52	
    53	__AS_CMD__
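
The behaviour above can be reproduced outside Autosubmit with a small stand-in for the generated .cmd. This is only a sketch under assumptions: the stand-in file contents, the set -u in the outer script, and the temporary file are illustrative and are not what Autosubmit actually generates.

```shell
#!/bin/bash
# Write a tiny stand-in for the generated .cmd to a temporary file.
# The quoted OUTER delimiter keeps this shell from touching the body.
cmd_file=$(mktemp)
cat > "$cmd_file" <<'OUTER'
set -u
echo "header"
bash -e <<__AS_CMD__
set -xuve
echo "first task line"
echo ${UNBOUNDED_VARIABLE}
__AS_CMD__
OUTER

# Run the stand-in and keep only its stderr. Assumption: the outer script
# has set -u active, so the failure happens while the outer shell expands
# the unquoted heredoc, before the inner bash starts.
err=$(LC_ALL=C bash "$cmd_file" 2>&1 >/dev/null) || true
rm -f "$cmd_file"

# The message names the stand-in file and a line of the outer script,
# not a line inside the task body.
echo "$err"
```

Because the expansion fails in the outer shell, the inner bash never runs, which is why the reported line belongs to the command that opens the heredoc rather than to the task's own echo.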

@dbeltrankyl dbeltrankyl self-assigned this Nov 11, 2025
@dbeltrankyl
Collaborator

dbeltrankyl commented Nov 11, 2025

In my opinion, your changes are better than the ones that I made.

I've just added the 4.2 changes but didn't add any test

There are some test changes in #2700. If that is merged, we can modify some of the integration run tests to include some variables and:

  • Search the log of the script that ran for UNBOUNDED_VARIABLE: unbound variable
    and/or
  • Does shellcheck warn about this? If so, we can add a test for inspect and use shellcheck to check the .cmd files
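
As a rough sketch of the first option, the log check could look like the following. The helper name log_has_unbound_variable and the temporary log file are hypothetical stand-ins; a real test would point at the job's .err log produced by the integration run instead.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when the given log contains bash's
# unbound-variable message.
log_has_unbound_variable() {
    grep -q 'unbound variable' "$1"
}

# Stand-in for a job's .err log: run a failing one-liner and capture
# its stderr. LC_ALL=C keeps the message untranslated.
log_file=$(mktemp)
LC_ALL=C bash -c 'set -u; echo ${UNBOUNDED_VARIABLE}' 2> "$log_file" || true

if log_has_unbound_variable "$log_file"; then
    result="unbound variable detected"
else
    result="clean"
fi
rm -f "$log_file"
echo "$result"
```

A test built this way would fail the run whenever the message shows up in the log of a task that is expected to succeed.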

@dbeltrankyl dbeltrankyl removed their assignment Nov 11, 2025
@kinow
Member

kinow commented Nov 11, 2025

Thanks @dbeltrankyl , @manuel-g-castro ! I can review these changes and add more tests if needed. Thanks a lot!!! 🙇

manuel-g-castro and others added 2 commits November 11, 2025 10:38
…ng parameter substitution.

This fixes bug introduced by commit 761c20, "[Enhancemente] Delete wrapper code (#2550)"
@kinow kinow force-pushed the fix-variables-being-interpreted-before-the-script-runs branch from afcf6f7 to 2f086ff Compare November 11, 2025 09:38

Labels

critical Use this label to report a critical failure that must be worked ASAP.


5 participants