Current development model is not sustainable and should be consolidated? #94

MaHaWo · 2024-10-30T09:24:40Z

What's the question at hand
This arose in a recent discussion with @anschaible
So, I saw that there are several PRs open, oftentimes for months at a time, which are interdependent and very big (O(1000) lines of code). The long lifetime and the interdependencies between them are concerning to me, because this suggests that the branches have diverged significantly from main not just in code, but also conceptually. The whole point of PRs is that they should be short-lived, self-contained sets of changes, they are not a good model to manage divergent versions of a software or abandone them half-way and continue on another branch then.
Hence, the currently open PRs should be consolidated and merged before going on with adding new features.

Details
My fear here is that when the time comes for results to be produced, we run into big issues with merging code together again and have to rely on a version in some branch that may or may not be fully reliable. We would have to spend time with bug-fixes and code adjustment under possibly high pressure, which is not conducive to quality and reproducibility. At any rate, it is hard to keep track of where results where derived from when it comes to producing paper results that need to be reproducible if the development process stays the way it is now. Furthermore, it is not sustainable at all as soon as the software is properly released and you get external devs and users.

Therefore, the idea that came up in the last meeting with @anschaible to have a new version that integrates a list of functionality that you guys currently are working on. This we can treat as a 'pre-release' (without actually releasing anything...), i.e., an as far as possible stable system that we can rely on. Once we converged on a list of things to have here, we can set up a milestone that lists the tasks to do. In this way, the development process can be streamlined, and it will be easier to determine what the next steps are and how to work towards them. Granularity is key here.

Secondarily, perhaps it needs to be discussed if the github development model is still a good fit for this project, or if we need perhaps a development or experimental branch that can serve as test bed. The project is in principle small enough for the former to be more than appropriate, but it seems to me that people have a desire to have some kind of sandbox where they can through shovels and buckets around in without hitting their neighbor in the head.

That being said, the git flow development model is only really sensible if we have a production system that must not break (derived from the main branch) and we don´t have this atm. So it might be too early to switch model completely. Moreover, it must be assured that the experimental/dev branch and the main branch do not diverge too far, and I see a risk of this happening.

Suggestions

come up with a list of features that would go into a consolidated version of rubix that can get a 0.2 version number (git tag) or something. Should be derived from issues and current PRs.
make a milestone with a list of what we decided should go in. Determine a date for the milestone to be due. Everybody needs to be responsible for this date to be held. Depending on everyone's schedule, this could be like January 6th, 2025 or so for example.
Improve development discipline:
- PRs should not live forever. if a PR is stale after 2 months, it gets closed or at least flagged.
- PRs should not be too large. It's ok if a PR only implements something partially, so there's no need to put big complex things all in one PR. Several 1000 lines of changes make review hard and thus decrease quality. Hence, if a PR is too big or implements disparate things, reviewers should feel free to reject it and request it to be split or changed. I would propose a PR should not go far beyond 1000 lines of functional code changes (i.e., not counting config files etc) as an upper limit.
- Make use of milestones and meta issues that group larger tasks and long term plans.

Sorry if this comes across as harsh. The concerns I bring up here are derived from experience with my own phd thesis (10 -15k of C++/Julia code where I was the only dev), where I made all these mistakes as well with quite deleterious effects.

@TobiBu @ufuk-cakir @anschaible

TobiBu · 2024-10-31T10:36:42Z

I completely agree. Your suggestions are really sensible.

Partly the open pull requests are my fault. I was volunteering to close them but the problem you described made it really hard. They were open for too long and are interdependent which makes merging them difficult.

anschaible · 2024-10-31T17:46:49Z

@MaHaWo thanks for writing all these points. It is good to talk about and change things early in the project. This includes also for me that I should split the problems in smaller parts and do not create branches with over 1000 new lines of code.

I like the points listed under improve developement discipline and from my side I agree that we should follow them.

Coming to the versioning, here an idea from my side:
Currently we have six open pull requests.

I would suggest that we merge four of them (excluding gas loading and gas emission) to the main and state that this is v0.1, so this is the version to get a stellar IFU cube. As deadline I would suggest 01. December, as this is the NeuRIPS deadline and then @ufuk-cakir (as I understood) wants to publish then the code.

For v0.2, I would say it is then the final load gas particles and the emission cube with cue. Deadline end of december/beginning of January?

Further things, e.g. dust will belong to v0.3

@TobiBu, @ufuk-cakir what are your opinions on this structure?

@TobiBu we could also spent a morning/afternoon together trying to merge the open pull requests to the main, may be complicated as they diverged from the main a lot. I think all open pull requests can be merged except the cue emission cube.

TobiBu · 2024-10-31T20:28:54Z

sounds all good to me! I like the layout of the versioning.

about merging, let's see. I will get to parts of it right now...

ufuk-cakir · 2024-11-03T21:21:29Z

thanks @MaHaWo for the suggestions!

I agree that we should close the PRs.

As deadline I would suggest 01. December, as this is the NeuRIPS deadline and then @ufuk-cakir (as I understood) wants to publish then the code

@anschaible I would say the code should be public by the date of the NeurIPS workshop (15.Dezember) since then people who want to use this code should be able to. We should plan to have the documentation page set up by then.

MaHaWo added enhancement New feature or request help wanted Extra attention is needed question Further information is requested future Feature that may be implemented in the future, left as a reference. labels Oct 30, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Current development model is not sustainable and should be consolidated? #94

Current development model is not sustainable and should be consolidated? #94

MaHaWo commented Oct 30, 2024 •

edited

Loading

TobiBu commented Oct 31, 2024

anschaible commented Oct 31, 2024

TobiBu commented Oct 31, 2024

ufuk-cakir commented Nov 3, 2024

Current development model is not sustainable and should be consolidated? #94

Current development model is not sustainable and should be consolidated? #94

Comments

MaHaWo commented Oct 30, 2024 • edited Loading

TobiBu commented Oct 31, 2024

anschaible commented Oct 31, 2024

TobiBu commented Oct 31, 2024

ufuk-cakir commented Nov 3, 2024

MaHaWo commented Oct 30, 2024 •

edited

Loading