-
Notifications
You must be signed in to change notification settings - Fork 225
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MPICH support #478
MPICH support #478
Conversation
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
@@ -115,6 +115,7 @@ const ( | |||
|
|||
openMPISlotsEnv = "OMPI_MCA_orte_set_default_slots" | |||
intelMPISlotsEnv = "I_MPI_PERHOST" | |||
mpichSlotsEnv = "TODO" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does anyone know how to figure out what should be provided here?
Interestingly, the pi example works as is, i.e. with mpichSlotsEnv == TODO
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be the name of the environment variable for configuring the number of workers per host.
It works in the sense that it runs, because this would just set an environment variable called TODO. But it's not doing the intended outcome.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's -ppn
command line option (both for Intel and MPICH), but I don't think it can be controlled via an environemnt variable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, it's hard to overwrite the command line, because users might be using wrappers around mpiexec
. That's why environment variables is better. Another option that openMPI and Intel MPI support is to set them in the hostfile.
Actually, it might be useful to use that in general because of this #445
Thanks, this PR is still in my radar, but I don't think I'll have a chance to look at it in the next 2 weeks. |
Thanks. Are there any actions on me at this point? |
Did you figure a solution for this?
|
Not yet. I think I misunderstood the purpose od |
Thanks for the discussion in the other thread. So, with my limit understanding, the next step for me would be to remove the part with sets up |
That is correct. It's up to the user to set |
And if you have a chance, maybe we could be consistent across the MPI implementations and always use the hostfile instead of an environment variable. That would fix #445 too |
@sheevy are you still working on this? |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
can you rebase? |
oopsie, sorry for noise, this was meant to to be a local change in my fork. I forgot the PR is still open. I started reappling the change from scratch onto master as rebasing this branch was just too much work. Too much has changed in the time this PR was dormant. But I now have time to finalyl close it. As I understand, the recent addtion of slots to host file is exactly what was missing. So now it's just a case of porting the change from this PR onto master ,as everything else is now in place. |
/close |
@alculquicondor: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
This is my initial attempt to have MPICH working on mpi-operator. Bear in mind I'm a noob when it comes to MPI and Kubernetes.
This is to close #477 which @alculquicondor offered to help with.
I've demonstrated pi example works when using MPICH.
There's a single TODO in the commit, because I'm not sure what a value of one of the constant should be for MPICH.
I've changed some of the test files, but not all. I'd appreciate a hint how to progress on that front.
Also, the autogenerated files were in fact edited by me manually...