Starting HELICS-Example with helics run does not work, separate calling does #104

Open
carwegka opened this issue Sep 4, 2024 · 34 comments
Assignees
Labels
bug Something isn't working

Comments

@carwegka

carwegka commented Sep 4, 2024

When trying to run the osmses example via the helics run command, the execution cannot find the helics package:
[image]

This happens even though helics is clearly installed with pip:
[image]

But when running the example with the three separate launches, as in
$ python battery_cosim_complete.py & (or launch in its own shell)
$ python charger_cosim_complete.py & (or launch in its own shell)
$ helics_broker -f 2 (or launch in its own shell)
it seems to work:
[image]
[image]
Nonetheless, the broker prints out a warning at the end:
[image]

The "Debugging Command" prints out the following:
[image]

If you need additional information, let me know!

@carwegka carwegka added the bug Something isn't working label Sep 4, 2024
@trevorhardy
Contributor

This issue came up during a tutorial I gave earlier this week and the fact that the co-simulation will run when the federates are launched individually but fail when using the runner makes me think something might be up with the runner re-work the two of you have been doing (but not completed?). Note that I haven't had this problem on macOS (ARM-based). This bug was manifesting when starting from a clean virtual environment.

@trevorhardy
Contributor

trevorhardy commented Sep 4, 2024

@carwegka, the error you're seeing at the end of the co-simulation is somewhat expected (depending on variation in the OS scheduler). It happens on the very last time step when one of the federates ends early enough that it exits the co-simulation before the other federate has published its final value. When this happens, HELICS throws this warning to indicate that there is nobody to receive the sent publication (which is generally not good). Since it is the last time step, there will be no new calculations even if the federate was there to receive the publication.

I usually add a function that I call on the last step to advance the federates to infinite time (well, the maximum time HELICS can track). Doing so outside the main simulation loop means all the federates will be granted that time and receive any pending publications prior to exiting the co-simulation. I've made this change and pushed it up into the "complete" version of the example.
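A minimal sketch of that final step, not the exact code pushed to the example. It assumes the pyhelics names helicsFederateRequestTime, helicsFederateDisconnect, helicsFederateFree, and the constant HELICS_TIME_MAXTIME. The helper name finish_cosimulation is hypothetical, and the helics module is passed in as an argument only so the sketch can be exercised without a running broker:

```python
def finish_cosimulation(h, fed):
    """Advance to the maximum time, then leave the federation.

    h   -- the imported helics module (passing it in is an assumption
           of this sketch; normally you would `import helics as h`)
    fed -- a federate handle that has finished its main time loop
    """
    # Called once, after the main simulation loop: request the largest
    # time HELICS can track so pending publications can still arrive.
    granted = h.helicsFederateRequestTime(fed, h.HELICS_TIME_MAXTIME)

    # Only now leave the co-simulation and release the federate.
    h.helicsFederateDisconnect(fed)
    h.helicsFederateFree(fed)
    return granted
```

Each federate would call this once after its loop ends, for example `finish_cosimulation(h, fed)` with `import helics as h`.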

@trevorhardy
Contributor

@nightlark and @josephmckinsey, the example I was using can be found at HELICS-Examples/unmaintained/python/osmses_2024_battery_charger using the "osmses_2024_runner.json" which runs the "complete" version of the example.

@nightlark
Member

nightlark commented Sep 4, 2024

> This issue came up during a tutorial I gave earlier this week and the fact that the co-simulation will run when the federates are launched individually but fail when using the runner makes me think something might be up with the runner re-work the two of you have been doing (but not completed?). Note that I haven't had this problem on macOS (ARM-based). This bug was manifesting when starting from a clean virtual environment.

The pull requests for the re-work of the web UI/CLI haven’t been merged to main yet (or pushed to PyPI). Unless the pyhelics you used is from one of those branches, this is a pre-existing bug and wasn’t introduced by those changes.

@nightlark
Member

Could the runner be launching the scripts outside of the virtual environment?

@trevorhardy
Contributor

> The pull requests for the re-work of the web UI/CLI haven’t been merged to main yet (or pushed to PyPI). Unless the pyhelics you used is from one of those branches, this is a pre-existing bug and wasn’t introduced by those changes.

They were just pulling from PyPI, so whatever the latest publicly released version is.

@carwegka
Author

carwegka commented Sep 4, 2024

@nightlark We (a colleague also had the same problem) tried it with the global Python interpreter (outside of the virtual environment) and it also didn't work.

@nightlark
Member

nightlark commented Sep 4, 2024

Yea... the runner code hasn't changed in over 2 years, so it's likely a very long standing bug (@kdheepak is the most familiar with this area) or maybe a change in recent Python versions on some operating systems.

For a minimal reproduction, the runner appears to just be spawning subprocesses: https://github.com/GMLC-TDC/pyhelics/blob/main/helics/cli.py#L260-L266

You could try writing a Python script that just hardcodes those subprocess calls, replacing f["exec"] with a string consisting of the command to run, and env=env with env=dict(os.environ)
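A minimal sketch of such a reproduction script. It does not copy the code in helics/cli.py; it only spawns a child process with a copy of the parent's environment and reports which interpreter the child actually ran. The function name child_interpreter is hypothetical:

```python
import os
import subprocess
import sys


def child_interpreter(command="python"):
    """Run `command` as a subprocess and return the child's sys.executable."""
    probe = "import sys; print(sys.executable)"
    result = subprocess.run(
        [command, "-c", probe],
        env=dict(os.environ),  # copy of the parent's environment
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Running `print(sys.executable, child_interpreter("python"))` from inside the activated virtual environment shows whether the child resolves "python" to the same interpreter as the parent.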

@nightlark
Member

I've also been unable to reproduce the error using fresh virtual environments on macOS (M2, Python 3.9 and 3.12), and RHEL 8 Linux (x64, Python 3.9) -- this may be a Windows-only issue, but I don't have a Windows system to test on.

@kdheepak
Contributor

kdheepak commented Sep 4, 2024

I think the problem here is that the runner spawns subprocesses that don't have the same environment as the parent process, and helics doesn't seem to be importable there? Which is weird. Printing out the environment with print(os.environ) in the individual federates before any import helics might give a clue as to what is going on.

Add the following for good measure at the top of each python file and share the output here again:

import sys
import os

print("Environment:", os.environ)

print("Python Version:", sys.version)

print("Python Interpreter Location:", sys.executable)

@carwegka
Author

carwegka commented Sep 5, 2024

For the battery file:

Environment: environ({'ALLUSERSPROFILE': 'C:\ProgramData', 'APPDATA': 'C:\Users\wegkamp\AppData\Roaming', 'CHROME_CRASHPAD_PIPE_NAME':
'\\.\pipe\crashpad_1548_BGSIWXYCAKBRIEMZ', 'COMMONPROGRAMFILES': 'C:\Program Files\Common Files', 'COMMONPROGRAMFILES(X86)': 'C:\Program Files (x86)\Common Files', 'COMMONPROGRAMW6432': 'C:\Program Files\Common Files', 'COMPUTERNAME': 'ELENIANB59', 'COMSPEC': 'C:\WINDOWS\system32\cmd.exe', 'DRIVERDATA': 'C:\Windows\System32\Drivers\DriverData', 'FPS_BROWSER_APP_PROFILE_STRING': 'Internet Explorer', 'FPS_BROWSER_USER_PROFILE_STRING': 'Default', 'GUROBI_HOME': 'C:\gurobi1001\win64', 'HOMEDRIVE': 'U:', 'HOMEPATH': '\', 'HOMESHARE': '\\134.169.52.10\_Users$\wegkamp', 'LOCALAPPDATA': 'C:\Users\wegkamp\AppData\Local', 'LOGONSERVER': '\\ELENIAAD02V', 'NUMBER_OF_PROCESSORS': '4', 'ONEDRIVE': 'C:\Users\wegkamp\OneDrive', 'ORIGINAL_XDG_CURRENT_DESKTOP': 'undefined', 'OS': 'Windows_NT', 'PATH': 'c:\Users\wegkamp\.vscode\extensions\ms-python.python-2024.12.3-win32-x64\python_files\deactivate\powershell;C:\Users\wegkamp\Dokumente_lokal\helics_example\.venv\Scripts;c:\Users\wegkamp\.vscode\extensions\ms-python.python-2024.12.3-win32-x64\python_files\deactivate\powershell;C:\Users\wegkamp\Dokumente_lokal\helics_example\.venv\Scripts;C:\gurobi1001\win64\bin;C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;C:\WINDOWS\System32\WindowsPowerShell\v1.0\;C:\Users\wegkamp\AppData\Local\Programs\Microsoft VS Code\bin;C:\Users\wegkamp\AppData\Local\Programs\Git\cmd', 'PATHEXT': '.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.MSC;.CPL', 'PROCESSOR_ARCHITECTURE': 'AMD64', 'PROCESSOR_IDENTIFIER': 'Intel64 Family 6 Model 158 Stepping 9, GenuineIntel', 'PROCESSOR_LEVEL': '6', 'PROCESSOR_REVISION': '9e09', 'PROGRAMDATA': 'C:\ProgramData', 'PROGRAMFILES': 'C:\Program Files', 'PROGRAMFILES(X86)': 'C:\Program Files (x86)', 'PROGRAMW6432': 'C:\Program Files', 'PSMODULEPATH': '\\134.169.52.10\_Users$\wegkamp\Eigene Dateien\WindowsPowerShell\Modules;C:\Program Files\WindowsPowerShell\Modules;C:\WINDOWS\system32\WindowsPowerShell\v1.0\Modules', 'PUBLIC':
'C:\Users\Public', 'SESSIONNAME': 'Console', 'SYSTEMDRIVE': 'C:', 'SYSTEMROOT': 'C:\WINDOWS', 'TEMP': 'C:\Users\wegkamp\AppData\Local\Temp', 'TMP': 'C:\Users\wegkamp\AppData\Local\Temp', 'WINDIR': 'C:\WINDOWS', 'ZES_ENABLE_SYSMAN': '1', 'PYTHONSTARTUP': 'c:\Users\wegkamp\.vscode\extensions\ms-python.python-2024.12.3-win32-x64\python_files\pythonrc.py', 'TERM_PROGRAM': 'vscode', 'TERM_PROGRAM_VERSION': '1.92.2', 'LANG': 'en_US.UTF-8', 'COLORTERM': 'truecolor', 'GIT_ASKPASS': 'c:\Users\wegkamp\AppData\Local\Programs\Microsoft VS Code\resources\app\extensions\git\dist\askpass.sh', 'VSCODE_GIT_ASKPASS_NODE': 'C:\Users\wegkamp\AppData\Local\Programs\Microsoft VS Code\Code.exe', 'VSCODE_GIT_ASKPASS_EXTRA_ARGS': '', 'VSCODE_GIT_ASKPASS_MAIN': 'c:\Users\wegkamp\AppData\Local\Programs\Microsoft VS Code\resources\app\extensions\git\dist\askpass-main.js', 'VSCODE_GIT_IPC_HANDLE': '\\.\pipe\vscode-git-a83ddf38b9-sock', 'VIRTUAL_ENV': 'C:\Users\wegkamp\Dokumente_lokal\helics_example\.venv', 'VIRTUAL_ENV_PROMPT': '.venv', 'VSCODE_INJECTION': '1'})

Python Version: 3.10.4 (tags/v3.10.4:9d38120, Mar 23 2022, 23:13:41) [MSC v.1929 64 bit (AMD64)]

Python Interpreter Location: C:\Users\wegkamp\Dokumente_lokal\helics_example.venv\Scripts\python.exe

For the charger file:

Environment: environ({'ALLUSERSPROFILE': 'C:\ProgramData', 'APPDATA': 'C:\Users\wegkamp\AppData\Roaming', 'CHROME_CRASHPAD_PIPE_NAME': '\\.\pi
pe\crashpad_1548_BGSIWXYCAKBRIEMZ', 'COMMONPROGRAMFILES': 'C:\Program Files\Common Files', 'COMMONPROGRAMFILES(X86)': 'C:\Program Files (x86)\Common Files', 'COMMONPROGRAMW6432': 'C:\Program Files\Common Files', 'COMPUTERNAME': 'ELENIANB59', 'COMSPEC': 'C:\WINDOWS\system32\cmd.exe', 'DRIVERDATA': 'C:\Windows\System32\Drivers\DriverData', 'FPS_BROWSER_APP_PROFILE_STRING': 'Internet Explorer', 'FPS_BROWSER_USER_PROFILE_STRING': 'Default', 'GUROBI_HOME': 'C:\gurobi1001\win64', 'HOMEDRIVE': 'U:', 'HOMEPATH': '\', 'HOMESHARE': '\\134.169.52.10\_Users$\wegkamp', 'LOCALAPPDATA':
'C:\Users\wegkamp\AppData\Local', 'LOGONSERVER': '\\ELENIAAD02V', 'NUMBER_OF_PROCESSORS': '4', 'ONEDRIVE': 'C:\Users\wegkamp\OneDrive', 'ORIGINAL_XDG_CURRENT_DESKTOP': 'undefined', 'OS': 'Windows_NT', 'PATH': 'c:\Users\wegkamp\.vscode\extensions\ms-python.python-2024.12.3-win32-x64\python_files\deactivate\powershell;C:\Users\wegkamp\Dokumente_lokal\helics_example\.venv\Scripts;c:\Users\wegkamp\.vscode\extensions\ms-python.python-2024.12.3-win32-x64\python_files\deactivate\powershell;C:\Users\wegkamp\Dokumente_lokal\helics_example\.venv\Scripts;C:\gurobi1001\win64\bin;C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;C:\WINDOWS\System32\WindowsPowerShell\v1.0\;C:\WINDOWS\System32\OpenSSH\;C:\Users\wegkamp\AppData\Local\Programs\Microsoft VS Code\bin;C:\Users\wegkamp\AppData\Local\Programs\Git\cmd', 'PATHEXT': '.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.MSC;.CPL', 'PROCESSOR_ARCHITECTURE': 'AMD64', 'PROCESSOR_IDENTIFIER': 'Intel64 Family 6 Model 158 Stepping 9, GenuineIntel', 'PROCESSOR_LEVEL': '6', 'PROCESSOR_REVISION': '9e09', 'PROGRAMDATA': 'C:\ProgramData', 'PROGRAMFILES': 'C:\Program Files', 'PROGRAMFILES(X86)': 'C:\Program Files (x86)', 'PROGRAMW6432': 'C:\Program Files', 'PSMODULEPATH': '\\134.169.52.10\_Users$\wegkamp\Eigene Dateien\WindowsPowerShell\Modules;C:\Program Files\WindowsPowerShell\Modules;C:\WINDOWS\system32\WindowsPowerShell\v1.0\Modules', 'PUBLIC': 'C:\Users\Public', 'SESSIONNAME': 'Console', 'SYSTEMDRIVE': 'C:', 'SYSTEMROOT': 'C:\WINDOWS', 'TEMP': 'C:\Users\wegkamp\AppData\Local\Temp', 'TMP': 'C:\Users\wegkamp\AppData\Local\Temp', 'WINDIR': 'C:\WINDOWS', 'ZES_ENABLE_SYSMAN': '1', 'PYTHONSTARTUP': 'c:\Users\wegkamp\.vscode\extensions\ms-python.python-2024.12.3-win32-x64\python_files\pythonrc.py', 'TERM_PROGRAM': 'vscode', 'TERM_PROGRAM_VERSION': '1.92.2', 'LANG': 'en_US.UTF-8', 'COLORTERM': 'truecolor', 'GIT_ASKPASS': 'c:\Users\wegkamp\AppData\Local\Programs\Microsoft VS Code\resources\app\extensions\git\dist\askpass.sh', 
'VSCODE_GIT_ASKPASS_NODE': 'C:\Users\wegkamp\AppData\Local\Programs\Microsoft VS Code\Code.exe', 'VSCODE_GIT_ASKPASS_EXTRA_ARGS': '', 'VSCODE_GIT_ASKPASS_MAIN': 'c:\Users\wegkamp\AppData\Local\Programs\Microsoft VS Code\resources\app\extensions\git\dist\askpass-main.js', 'VSCODE_GIT_IPC_HANDLE': '\\.\pipe\vscode-git-a83ddf38b9-sock', 'VIRTUAL_ENV': 'C:\Users\wegkamp\Dokumente_lokal\helics_example\.venv', 'VIRTUAL_ENV_PROMPT': '.venv', 'VSCODE_INJECTION': '1'})

Python Version: 3.10.4 (tags/v3.10.4:9d38120, Mar 23 2022, 23:13:41) [MSC v.1929 64 bit (AMD64)]

Python Interpreter Location: C:\Users\wegkamp\Dokumente_lokal\helics_example.venv\Scripts\python.exe

PS: I deleted part of the domain information from the environment output because I don't want it published. I hope it is not important.

@kdheepak
Contributor

kdheepak commented Sep 5, 2024

It seems to be using the correct interpreter, i.e. C:\Users\wegkamp\Dokumente_lokal\helics_example.venv\Scripts\python.exe.

Can you open a new command prompt, and without activating your .venv, run C:\Users\wegkamp\Dokumente_lokal\helics_example.venv\Scripts\python.exe -c "import helics" and see if that works?

@carwegka
Author

carwegka commented Sep 5, 2024

So if I do that and afterwards run helics run --path=HELICS-Examples\unmaintained\python\osmses_2024_battery_charger\osmses_2024_runner.json, this then also gives a different interpreter for the two files:
Python Version: 3.10.4 (tags/v3.10.4:9d38120, Mar 23 2022, 23:13:41) [MSC v.1929 64 bit (AMD64)]
Python Interpreter Location: C:\Users\wegkamp\AppData\Local\Programs\Python\Python310\python.exe

Nonetheless, the error still occurs.

@kdheepak
Contributor

kdheepak commented Sep 5, 2024

Without helics run, can you run C:\path\to\helics_example.venv\Scripts\python.exe -c "import helics"?

@carwegka
Author

carwegka commented Sep 5, 2024

I can, yes. And it doesn't throw back any error/warning.

@kdheepak
Contributor

kdheepak commented Sep 5, 2024

So just to clarify, when you run helics run you get this interpreter:

C:\Users\wegkamp\AppData\Local\Programs\Python\Python310\python.exe

but when you run python battery_cosim_complete.py you get this interpreter:

C:\Users\wegkamp\Dokumente_lokal\helics_example.venv\Scripts\python.exe

Is that correct?

@carwegka
Author

carwegka commented Sep 5, 2024

If I am in the venv, I get
Python Interpreter Location: C:\Users\wegkamp\AppData\Local\Programs\Python\Python310\python.exe
for helics run and
Python Interpreter Location: C:\Users\wegkamp\Dokumente_lokal\helics_example\.venv\scripts\python.exe
for python HELICS-Examples\unmaintained\python\osmses_2024_battery_charger\battery_cosim_complete.py.

@nightlark
Member

Do you have both C:\Users\wegkamp\Dokumente_lokal\helics_example.venv and
C:\Users\wegkamp\Dokumente_lokal\helics_example\.venv directories? I
noticed that your interpreter is located in the former, but the VIRTUAL_ENV
variable is pointing to the latter (along with an entry in PATH). If both
exist, I wonder if both have helics installed.

@kdheepak
Contributor

kdheepak commented Sep 5, 2024

> If I am in the venv, I get Python Interpreter Location: C:\Users\wegkamp\AppData\Local\Programs\Python\Python310\python.exe for helics run

This is the reason you are getting the import error.

How have you installed Python? How did you create the .venv virtual environment to begin with? Can you uninstall this version of Python?

@nightlark
Member

C:\Users\<account>\AppData\Local\Programs is one of the default install locations for Python on Windows. I'm wondering if the underlying issue was something like the helics_example\.venv folder getting renamed to helics_example.venv, since the "activate" scripts created by venv have some memory of their original location (which would no longer exist, leading to the default Python executable running). It might be interesting to see a copy of whichever script was used to activate the virtual environment.
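One way to check for that kind of mismatch from inside a federate is to compare the VIRTUAL_ENV variable set by the activate script with the prefix of the interpreter that is actually running. This is only a sketch; the helper name venv_matches_interpreter is hypothetical:

```python
import os
import sys


def _normalize(path):
    """Normalize case and separators so equivalent paths compare equal."""
    return os.path.normcase(os.path.normpath(path))


def venv_matches_interpreter(virtual_env, prefix):
    """Return True when VIRTUAL_ENV and the interpreter prefix name the same directory."""
    if not virtual_env:
        return False
    return _normalize(virtual_env) == _normalize(prefix)


if __name__ == "__main__":
    activated = os.environ.get("VIRTUAL_ENV")
    print("VIRTUAL_ENV:", activated)
    print("sys.prefix :", sys.prefix)
    print("match      :", venv_matches_interpreter(activated, sys.prefix))
```

A result of False while a virtual environment is activated would suggest that the process was started with a different interpreter than the one the activate script points to.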

@trevorhardy
Contributor

@carwegka, is this issue closed out as far as you're concerned or do we need to keep digging into it?

@afisher1
Member

Hi @trevorhardy and @nightlark. I recently had another academic user report a similar issue where the helics run command does not use the virtual environment's Python executable on Windows. I can also confirm this is still an issue. A workaround seems to be to explicitly specify the Python executable for your virtual env in the "exec" string in your runner configuration file. For example:

{
  "name": "gldHelicsTestRunner",
  "broker": true,
  "federates": [
    {
      "directory": "C:\Users\username\workspace\gldHelicsTesting",
      "exec": "C:\Users\username\workspace\gldHelicsTestings\.venv\Scripts\python.exe helicsFederate.py python_config.json 10",
      "host": "localhost",
      "name": "python_federate"
    }
  ]
}

When doing this I can confirm it's using the correct Python environment. However, I am running into another issue. If I run this little test Python federate and a helics_broker explicitly, it runs successfully. If I try to run using helics run --path=, I get the following error in both Windows PowerShell and a cmd prompt:
helics run --path=gldHelicsTestRunner.json
[warn] helics-cli's web interface is not installed. You may want to run pip install "helics[cli]".
[warn] helics-cli's observer functionality is not installed. You may want to run pip install "helics[cli]".
[info] Running federation: gldHelicsTestRunner
[info] Adding auto broker (i.e. helics_broker -f1) to helics-cli subprocesses.
[info] Running federate python_federate as a background process
Error: FileNotFoundError: [WinError 2] The system cannot find the file specified

@nightlark
Member

nightlark commented Nov 11, 2024

@afisher1 are you escaping the backslashes correctly? You might need up to quadruple backslashes in some cases based on how many times the string is getting decoded/processed by things that handle backslash escaped characters...

Usually using Unix style path separators even on Windows is far easier.
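To illustrate the escaping problem in general terms, here is a small sketch using Python's json module with hypothetical paths. Whether the runner decodes the config file in exactly this way is an assumption:

```python
import json

# Hypothetical paths, written as the raw text that would sit in a JSON file.
escaped = r'{"path": "C:\\venv\\Scripts\\python.exe"}'   # backslashes doubled
forward = '{"path": "C:/venv/Scripts/python.exe"}'       # Unix-style separators
silently_wrong = r'{"path": "C:\temp\new"}'              # \t and \n are valid JSON escapes
invalid = r'{"path": "C:\venv\Scripts\python.exe"}'      # \v is not a JSON escape


def load_path(text):
    """Decode the JSON text and return its "path" value, or the error message."""
    try:
        return json.loads(text)["path"]
    except json.JSONDecodeError as err:
        return f"JSONDecodeError: {err.msg}"


if __name__ == "__main__":
    for label, text in [("escaped", escaped), ("forward", forward),
                        ("silently_wrong", silently_wrong), ("invalid", invalid)]:
        print(f"{label:15} -> {load_path(text)!r}")
```

The third case is the dangerous one: it decodes without an error, but the path now contains a tab and a newline instead of backslashes. Forward slashes avoid the whole class of problem.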

@afisher1
Member

@nightlark yes I did. sorry that was odd. Must've been an odd copy and paste issue there. I've tried with \ and \\ and \\ in the exec string.

@afisher1
Member

Ok GitHub keeps executing the escape lol. I've tried with 2, 3, and 4 backslashes all with the same result.

@nightlark
Member

Did you also do the "directory" string?

@nightlark
Member

To rule out crazy escaping issues, I'd highly recommend using / even on Windows.

@nightlark
Member

You can also add a call to breakpoint() in the helics/cli.py file you have installed just before the call to spawn a subprocess, to inspect what the actual values loaded from the config file look like.

@afisher1
Member

afisher1 commented Nov 11, 2024

OK, I must've missed an escaping backslash somewhere. My subsequent error was due to an unescaped backslash. And yes, it also works correctly with /. So there are no additional issues besides the helics runner not using the virtual env it was executed under when starting the federate processes on Windows. The current workaround on Windows with the helics runner is to use the full path to the virtual env Python executable in the "exec" key.
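A sketch of how a runner config with this workaround could be generated instead of typed by hand. The keys mirror the config shown earlier in this thread. The helper name runner_config and the choice to default to sys.executable are assumptions of this sketch, and it would not handle an interpreter path that contains spaces:

```python
import json
import sys


def runner_config(name, federate_name, directory, script_args,
                  interpreter=sys.executable):
    """Build a runner config whose "exec" uses a full interpreter path."""
    # Forward slashes work on Windows too and need no JSON escaping.
    exec_path = interpreter.replace("\\", "/")
    return {
        "name": name,
        "broker": True,
        "federates": [
            {
                "directory": directory.replace("\\", "/"),
                "exec": f"{exec_path} {script_args}",
                "host": "localhost",
                "name": federate_name,
            }
        ],
    }


if __name__ == "__main__":
    config = runner_config(
        "gldHelicsTestRunner",
        "python_federate",
        ".",
        "helicsFederate.py python_config.json 10",
    )
    print(json.dumps(config, indent=2))
```

Run from inside the activated virtual environment, this writes that environment's interpreter into the "exec" string. The resulting file is still tied to one machine, so it is a local convenience rather than a shareable config.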

@carwegka
Author

Sorry for the delayed answer, I completely lost track of this thread!

So to me it seems that the venv contains a different Python than the global installation, and the two ways of launching the helics example (1: via helics run, 2: via separate calls) might use different ones (maybe 2 does not use the venv Python?). So one of the two fails because it uses the "wrong" Python and thus cannot find helics in the global environment?

But why does the second version (separate calling in different terminals) not fail no matter if I activate the environment before or not?

And to answer at least some of your questions: the environment was created using python -m venv .venv and then activated with .venv\scripts\activate.
The .venv folder was probably never renamed; maybe this was a copying issue.

@afisher1
Member

afisher1 commented Nov 11, 2024

@nightlark, after looking into this more, this appears to be a known issue with Python's subprocess.run on Windows. The "official" fix is to use the value of sys.executable rather than 'python' in the command passed to subprocess. Please note that you should not put sys.executable in the runner configuration file; it needs to replace 'python' in the exec string internally, before the subprocess.run() call. See https://stackoverflow.com/questions/65283987/venv-not-sticking-across-subprocess-run-on-python-windows. @carwegka, this is the underlying issue causing it to use the system-installed Python executable rather than the one from your virtual environment, regardless of whether you called helics run under the virtual environment or not. So giving the full file path to your desired virtual env Python executable in the exec key should fix your issues. Even with the proper fix implemented on our end, you would still need to make sure you run helics run in the activated virtual env.
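A rough sketch of what that substitution could look like. This is not the code in pyhelics; the function name with_current_interpreter is hypothetical, and the POSIX-style shlex.split used here would mangle exec strings that contain Windows backslash paths:

```python
import shlex
import sys


def with_current_interpreter(exec_string):
    """Split an exec string and swap a leading 'python' for sys.executable."""
    parts = shlex.split(exec_string)
    if parts and parts[0] == "python":
        # Use the interpreter running this process rather than whatever
        # 'python' happens to resolve to in the child.
        parts[0] = sys.executable
    return parts


if __name__ == "__main__":
    print(with_current_interpreter("python helicsFederate.py python_config.json 10"))
    print(with_current_interpreter("helics_broker -f 2"))
```

The returned list could be passed to subprocess.run directly. Commands that do not start with exactly "python" are left unchanged.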

@nightlark
Member

Right now the HELICS CLI passes the exec string verbatim to subprocess.run with no further processing. I think adding special treatment for occurrences of python in the "exec" field would be baking in an assumption that the venv/executable used for the helics runner is the right one for the federate to use, which at minimum breaks when using helics CLI installed by pipx. Using shutil.which() to resolve the full path to the executable based on PATH seems a bit better, since most commands and not just python could be run through it as a preprocessing step. (The Python binary is also not guaranteed to just be "python" -- I've seen "python3", "python3.9" etc. depending on how it was installed.)
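A sketch of that preprocessing idea, not the change that ended up in pyhelics. The function name resolve_command is hypothetical, and the non-POSIX split used on Windows keeps any quote characters inside the tokens, which a real implementation would need to handle:

```python
import os
import shlex
import shutil


def resolve_command(exec_string, path=None):
    """Split an exec string and resolve its first token to a full path.

    path -- optional search path; None means use the PATH of this process.
    """
    # Non-POSIX splitting on Windows keeps backslashes in paths intact.
    parts = shlex.split(exec_string, posix=(os.name != "nt"))
    if not parts:
        raise ValueError("empty exec string")

    resolved = shutil.which(parts[0], path=path)
    if resolved is None:
        raise FileNotFoundError(f"{parts[0]!r} was not found on PATH")
    return [resolved] + parts[1:]
```

Because the lookup uses the PATH of the process running the helics CLI, an activated virtual environment's python would be found first, and the same step works for any other executable named in "exec".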

I think this gets at an underlying issue of managing what environment a command should run in, with the ability to support multiple Python federates that are installed in their own venvs to avoid dependency conflicts with each other.

I guess we sort of already support that right now in the form of specifying the full path to the python interpreter -- though that's not really portable if you're sharing a config file with others. I'm not sure we have a way to make a fully portable config file, since the location/creation of a venv would be dependent on a user creating it in the right place for the example/demo they are trying to run...

@carwegka
Author

Thanks for the help and explanations; this clarifies the general problem for me (although I'm probably not that deep into the topic). Many thanks!

@nightlark
Member

A change that should help address this issue will be in the 3.6 release of pyhelics.


6 participants