Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ac kubernetes #182

Merged
merged 22 commits into from
Jan 9, 2025
Merged

Ac kubernetes #182

merged 22 commits into from
Jan 9, 2025

Conversation

Acribbs
Copy link
Contributor

@Acribbs Acribbs commented Nov 10, 2024

No description provided.

@Acribbs
Copy link
Contributor Author

Acribbs commented Nov 12, 2024

@snsansom @IanSudbery @nickilott @david-cgat @jscaber (and all). This is a big change. I have refactored the code to introduce a new base class called the Executor that is inherited by Slurm, Torque, SGE, Kubernetes and local (it replaces make_runner()). The main aim was to implement kubernetes functonality which I couldnt do with the current framework. I introduced a new class function in base_executor.py that is inherited in executor.py (There is also a kubernets.py which I will move into executor.py before merging). I have tested it locally and on slurm on my end and it works, but if others could give it a go that would be great. It also defaults to LocalExecutor if you dont specify it to run locally, but now I think about it, I think im going to make it raise an error as that keeps it consistent with the current behaviour when it complains if DRMMA API isnt found if its ran for cluster config but is ran locally. Thinking of implementing a lot more functionalities in my free time (when i travel on trains and planes) so would be happy to accept suggestions on what to implement next.

@nickilott
Copy link

Thanks Adam for keeping going with cgat-core! I'm happy to test it out on existing pipelines and will let you know here how it goes

@Acribbs
Copy link
Contributor Author

Acribbs commented Dec 1, 2024

Planning on merging this soon

@nickilott
Copy link

Hey Adam, I just pulled the branch and tried to run a pipeline and got the following error:

File /gpfs3/well/powrie/users/utu691/devel/venv/Python-3.10.8-GCCcore-12.2.0/skylake/lib/python3.10/site-packages/cgatcore/pipeline/executors.py, line 69, in run
self.monitor_job_completion(job_id)
AttributeError: 'SlurmExecutor' object has no attribute monitor_job_completion

I haven'[t had a chance to look into the source of the error in any detail at all but wanted to flag before the merge...

@Acribbs
Copy link
Contributor Author

Acribbs commented Dec 8, 2024

Thanks for this! I have no idea why the tests are not picking this up. Will investigate further. May need some more tests to capure this

@Acribbs
Copy link
Contributor Author

Acribbs commented Jan 1, 2025

@nickilott This was my fault, I had placeholder comments to include code for creating the monitor_job_completion function but I never did it, not sure how this wasn't picked up in testing, but have explicitly wrote tests to now check this. I tested the code on my end though and still dont get why i didnt get the same error as you. Anyway, its now fixed and should be fine. If you have time to check this out, it would be much appreciated.

@nickilott
Copy link

Hi Adam,
I have pulled these changes and will test in the next couple of days if that's ok?

@Acribbs
Copy link
Contributor Author

Acribbs commented Jan 8, 2025

yeh no problem, thanks!

@nickilott
Copy link

Hi Adam,

I ran a pipeline with this code and it seemed to work fine for me :)

@Acribbs
Copy link
Contributor Author

Acribbs commented Jan 9, 2025

Awesome, thanks! I'm going to merge given i now have tests in place for capturing the issue you raised earlier and it works my side.

@Acribbs Acribbs merged commit 8f753d4 into master Jan 9, 2025
14 checks passed
@Acribbs Acribbs deleted the AC-kubernetes branch January 9, 2025 09:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants