Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Next release #147

Closed
3 tasks done
eboileau opened this issue Jan 30, 2023 · 43 comments
Closed
3 tasks done

Next release #147

eboileau opened this issue Jan 30, 2023 · 43 comments
Assignees

Comments

@eboileau
Copy link
Contributor

eboileau commented Jan 30, 2023

I would mostly be ready for the next release, but before

After that, what remains to be done? i.e. what are the next steps now?

I think we also need to change requires-python = ">=3.6,<3.11" to requires-python = ">=3.7,<3.11". Do we need to update environment.yml and dependencies in pyproject.toml, etc.

I'd also like to get/update some badges, at least for version or release, python, docs, build (or CI?), codecov, and pre-commit.

@eboileau eboileau self-assigned this Jan 30, 2023
@lkeegan
Copy link
Contributor

lkeegan commented Jan 31, 2023

next steps:

  • new pbiotools release (if necessary)
  • add pypi CI to rp-bp
  • release new rp-bp version (on github, triggering pypi release)
  • add bioconda rp-bp recipe using github release

@eboileau
Copy link
Contributor Author

Thanks, I just merged dieterich-lab/pbiotools#17. I will now merge to master and push.
I know this is not consistent with versioning, but I think this can just be "merged" into v3, without a new release.

@eboileau
Copy link
Contributor Author

@lkeegan what about dieterich-lab/pbiotools#13
I can just change the versions before pushing to master, but I don't want things to break... or we can just ignore it for now?

@lkeegan
Copy link
Contributor

lkeegan commented Jan 31, 2023

you can ignore it or merge it (probably easiest to first push your changes) - this PR isn't important and won't break anything, it's just updating the pre-commit version hooks

@eboileau
Copy link
Contributor Author

No, actually, this doesn't make sense to not release a new version, as the PyPI release needs to change accordingly. If we remove Python 3.6, in particular, then I guess we have to go for v4.

@eboileau
Copy link
Contributor Author

Ok, pbiotools is now to v4 release.

I will try to first sort the docs, in particular #146 , then add a pypi CI to rp-bp (copied from pbiotools), and update the pbiotools v4 requirement, etc.

@eboileau
Copy link
Contributor Author

eboileau commented Feb 6, 2023

@lkeegan , for the PyPI CI (pypi.yml), could you please set-up a template? I thought I'd use the pbiotools one, but in fact we would need to run the conda install, and not the pip install, or I am wrong?

@lkeegan
Copy link
Contributor

lkeegan commented Feb 6, 2023

@eboileau you should be able to use the pbiotools one - this is for PyPI so pip install with pypi dependencies is what we want - this is completely independent of conda/bioconda (although I guess you'll need to also install star etc in the CI if you want to run the tests)

@eboileau
Copy link
Contributor Author

eboileau commented Feb 6, 2023

Would that work to create a conda env to install bowtie2, fastqc, flexbar, samtools, star, then pip install .[test] --verbose , then run tests, etc. ? Then, the Stan models should compile when running the tests.

@lkeegan
Copy link
Contributor

lkeegan commented Feb 6, 2023

I would suggest to keep it simple, either

  • skip pip install & tests step in the pypi CI job, it just does python -m build and uploads the wheel to pypi
  • or duplicate the conda env stuff from the other CI job to allow you to run the tests

In either case, python -m build creates the wheel in an isolated environment so they should both generate the same wheel.
And I think skipping the tests would be ok since this is duplicating tests done in the other CI job

@eboileau
Copy link
Contributor Author

eboileau commented Feb 6, 2023

Yes, I also think skipping the tests is fine, so this would look like:

name: PyPI

on:
  push:
    tags:
      - "*.*.*"
jobs:
  pypi:
    name: Upload to PyPI
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: "3.10"
      - run: pip install --upgrade build
      - run: python -m build
      - uses: pypa/gh-action-pypi-publish@release/v1
        with:
          user: __token__
          password: ${{ secrets.pypi_token }}

@lkeegan
Copy link
Contributor

lkeegan commented Feb 6, 2023

yes, looks good to me

@eboileau
Copy link
Contributor Author

eboileau commented Feb 6, 2023

Thanks. I will then add this, update reqs to pbiotools>=4.0.0, etc. merge with master, and make a new release.
You said you already have a bioconda rp-bp recipe ready, right?

@lkeegan
Copy link
Contributor

lkeegan commented Feb 6, 2023

Sounds good! I have a wip bioconda recipe here that I'll update once there is a new rp-bp github release: lkeegan/bioconda-recipes@1f19d50

@eboileau
Copy link
Contributor Author

eboileau commented Feb 7, 2023

Hi @lkeegan
New release is there!

@lkeegan
Copy link
Contributor

lkeegan commented Feb 8, 2023

@eboileau great, I've updated the recipe & made a PR

If you download & extract the LinuxArtifacts file here you should be able to both install the conda package & try using the docker image:

bioconda/bioconda-recipes#39210 (comment)

Let me know if they work for you and I'll ask for the PR to be merged

@eboileau
Copy link
Contributor Author

eboileau commented Feb 8, 2023

Great, I'll test the package and the container usage, and let you know as soon as possible.

@lkeegan
Copy link
Contributor

lkeegan commented Feb 8, 2023

fyi after following the docker instructions there I could then run the docker container with:

docker run -it quay.io/biocontainers/rp-bp:3.0.0--py310h9f5acd7_0 /bin/bash

Both for conda and docker running compile-rpbp-models returned straight away for me, so looks like the pre-compiled models are being shipped, but I didn't try actually running anything

@eboileau
Copy link
Contributor Author

eboileau commented Feb 8, 2023

Thanks. I will try running the pipeline. I also want to try passing the config and writing results to disk, e.g. with --volume, and test the apps (not sure here whether we need something special?). I also want to test Singularity.

@eboileau
Copy link
Contributor Author

eboileau commented Feb 8, 2023

I have not been able to test the full pipeline. There are two immediate problems:


  1. run-all-rpbp-instances fails at the first sampling iteration
INFO     rpbp.orf_profile_construction.estimate_metagene_profile_bayes_factors 2023-02-08 11:44:42,328 : Estimating Bayes factors for lengths: 16,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35
^M  0%|          | 0/19 [00:00<?, ?it/s]^M 42%|~V~H~V~H~V~H~V~H~V~O     | 8/19 [00:03<00:05,  2.19it/s]^M100%|~V~H~V~H~V~H~V~H~V~H~V~H~V~H~V~H~V~H~V~H| 19/199
 [00:03<00:00,  5.12it/s]
joblib.externals.loky.process_executor._RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/joblib/externals/loky/process_executor.py", line 428, in _process_worker
    r = call_item()
  File "/usr/local/lib/python3.10/site-packages/joblib/externals/loky/process_executor.py", line 275, in __call__
    return self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.10/site-packages/joblib/_parallel_backends.py", line 620, in __call__
    return self.func(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/joblib/parallel.py", line 288, in __call__
    return [func(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/joblib/parallel.py", line 288, in <listcomp>
    return [func(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/rpbp/orf_profile_construction/estimate_metagene_profile_bayes_factors.py", line 148, in estimate_profile_bayes_factors
    (bft_periodic, bft_nonperiodic) = estimate_marginal_likelihoods(
  File "/usr/local/lib/python3.10/site-packages/rpbp/orf_profile_construction/estimate_metagene_profile_bayes_factors.py", line 64, in estimate_marginal_likelihoods
    bft_periodic = [
  File "/usr/local/lib/python3.10/site-packages/rpbp/orf_profile_construction/estimate_metagene_profile_bayes_factors.py", line 65, in <listcomp>
    pm.sample(
  File "/usr/local/lib/python3.10/site-packages/cmdstanpy/model.py", line 1201, in sample
    raise RuntimeError(msg)
RuntimeError: Error during sampling:

Command and output files:
RunSet: chains=2, chain_ids=[1, 2], num_processes=2
 cmd (chain 1):
        ['/usr/local/lib/python3.10/site-packages/rpbp/models/periodic/start-high-low-low', 'id=1', 'random', 'seed=8675309', 'data', 'file=/tmp/tmpfqz442ci/oiorukwp.json', 'output', 'file=/tmp/tmpfqz442ci/start-high-low-lowyr_l4ofx/start-high-low-low-20230208114446_1.csv', 'method=sample', 'num_samples=100', 'num_warmup=100', 'algorithm=hmc', 'adapt', 'engaged=1'] 
 retcodes=[127, 127]
 per-chain output files (showing chain 1 only):
 csv_file:
        /tmp/tmpfqz442ci/start-high-low-lowyr_l4ofx/start-high-low-low-20230208114446_1.csv
 console_msgs (if any):
        /tmp/tmpfqz442ci/start-high-low-lowyr_l4ofx/start-high-low-low-20230208114446_0-stdout.txt
Consider re-running with show_console=True if the above output is unclear!
"""

...

This is due to

/usr/local/lib/python3.10/site-packages/rpbp/models/periodic/start-high-low-low: error while loading shared libraries: libtbb.so.12: cannot open shared object file: No such file or directory

This is happening with Docker, Singularity, and also with the conda install.


  1. Rp-Bp uses symlinks, but these are broken with --bind (Singularity) or --volume (Docker).

This is not a critical issue, but depending on 1 above, we might have to do something anyway...

With singularity shell, $HOME is mounted by default. If I run the pipeline under $HOME, this is fine, e.g.

$ ls -la ~/data/c-elegans-chrI-example/without-rrna-mapping/
lrwxrwxrwx  1 eboileau dieterich    108 Feb  8 12:09 c-elegans-rep-1.bam -> /home/eboileau/data/c-elegans-chrI-example/without-rrna-mapping/c-elegans-rep-1Aligned.sortedByCoord.out.bam

but using --bind (or --volume with Docker) results in broken symlinks. In theory, this should be fine even to run the pipeline in multiple times, e.g. prepare-rpbp-genome, then run-all-rpbp-instances, etc., as long as the argument to --bind remains the same (the symlinks become "live" again), but this is a bit of a hack, and does not allow to actually use intermediate files outside the container if needed (although we generally only care about the final output, which is not symlinked).

Do you know of any workaround in Singularity/Docker? Otherwise, we need to modify the code (symlinks should in fact just be relative).

@eboileau
Copy link
Contributor Author

eboileau commented Feb 8, 2023

This seems related.
If this is the case, pre-compiling wouldn't work. I will test deleting the exec files, and re-running the pipeline.

@lkeegan
Copy link
Contributor

lkeegan commented Feb 8, 2023

1

The pre-compiled models link to libtbb.so.12

root@b1cbe6861ae8:/# ldd /usr/local/lib/python3.10/site-packages/rpbp/models/translated/periodic-gaussian-mixture
        linux-vdso.so.1 (0x00007ffd441f4000)
        libtbb.so.12 => not found
        libstdc++.so.6 => /usr/local/lib/python3.10/site-packages/rpbp/models/translated/../../../../../libstdc++.so.6 (0x00007f66c83cf000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f66c824c000)
        libgcc_s.so.1 => /usr/local/lib/python3.10/site-packages/rpbp/models/translated/../../../../../libgcc_s.so.1 (0x00007f66c8233000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f66c8072000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f66c869f000)

But the conda environment instead provides libtbb.so.2:

root@b1cbe6861ae8:/# ls /usr/local/lib/libtbb.so*
/usr/local/lib/libtbb.so    /usr/local/lib/libtbb.so.2

I'll look into this, we need to modify the recipe somehow to ensure the same tbb library version is being used for compiling and for the user installation.

2

I don't know of a good workaround for the symlinks with docker issue - but if you could modify the code to use relative paths to some base path instead of symlinks that would (imo) be a big improvement in usability/portability (not just for Docker)

@lkeegan
Copy link
Contributor

lkeegan commented Feb 9, 2023

It seems the libtbb.so culprit is flexbar - @eboileau before I dig any deeper just confirming this is a required dependency of rp-bp?

  • flexbar builds are currently blacklisted on bioconda due to Please upgrade to onetbb seqan/flexbar#38
  • I think currently flexbar may be in a partially broken state as it was compiled against the previous tbb API, and doesn't compile against the current version of tbb in conda-forge, so it might crash if it tries to use some no-longer supported tbb parallel functionality
  • some partial patches from debian maintainers here: https://salsa.debian.org/med-team/flexbar/-/blob/master/debian/patches/onetbb.patch
  • ideal fix would be to finish above job of porting flexbar to new oneTBB api & then make a new release for it on bioconda

Not sure what the best solution is in the meantime, but maybe a pragmatic option would be to skip the pre-compiled models in the first bioconda release and add this in a later bioconda release?

@eboileau
Copy link
Contributor Author

eboileau commented Feb 9, 2023

Hi @lkeegan yes, Flexbar is definitely required.
If skipping compilation, they should be compiled by default on the first run. I think we can modify the recipe as you suggest.


Re 2 above, I don't want to re-write the complete I/O. The easiest solution for now is to remove symlinks by actually renaming files if needed. I'm working on this.

@eboileau
Copy link
Contributor Author

eboileau commented Feb 9, 2023

This case was not set-up in the regression tests as we don't have data for it... this is unfortunate...

In the rpbp recipe as well as in toml and environment, we specify pbiotools >=4.0.0. I'll try to address this tomorrow and release 4.0.1. I assume rpbp would anyway use the latest release available, but it would be better to actually use pbiotools >=4.0.1. I will also make a patch release of rpbp to address this and above.

@lkeegan can you please bear with me before updating the recipe? Thanks.

@lkeegan
Copy link
Contributor

lkeegan commented Feb 9, 2023

@eboileau yes no problem, I'll try to make some progress on the flexbar issue in the meantime

@eboileau
Copy link
Contributor Author

I'll be ready to push and make a patch release of Rp-Bp, as soon as pbiotools 4.0.1 is available on bioconda.

P.S. I'm off for 2 weeks from 14 Feb.

@lkeegan
Copy link
Contributor

lkeegan commented Feb 13, 2023

@eboileau pbiotools 4.0.1 is now on bioconda (and I didn't do anything, the bot made a PR from your github release & a maintainer merged it)

@eboileau
Copy link
Contributor Author

Thanks, indeed saw it yesterday. I'll try to sort #150, then release.

@eboileau
Copy link
Contributor Author

Oh no... For macOS, this doesn't work anymore...

Command ['make', 'STAN_THREADS=TRUE', '/usr/local/miniconda/envs/rpbp/lib/python3.7/site-packages/rpbp/models/nonperiodic/no-periodicity']
	error during processing No such file or directory

and a lot of warnings... see https://github.com/dieterich-lab/rp-bp/actions/runs/4163364277/jobs/7203670504
I don't think we did anything that should have affected the model compilation since https://github.com/dieterich-lab/rp-bp/actions/runs/4105315063

@lkeegan
Copy link
Contributor

lkeegan commented Feb 13, 2023

This looks like a mismatch between compiler & linker. Looking at the conda environment there is a mix of defaults & conda-forge packages which is likely the cause of this - certain packages (like the linker) are not compatible between defaults and conda-forge.

The good news is I think this is just an issue with the conda environment on the CI, the first thing I would try is modifying this line:

channel-priority: true

          channel-priority: strict

Then hopefully the conda-forge packages will take precedence over the defaults.

The bad news here is that looking through the logs I noticed that stan is enabling a bunch of cpu-specific optimizations in the c++ compile step, which means the pre-compiled models may crash with "illegal instruction" errors if the cpu used when building the bioconda recipe supports some instruction that the users cpu doesn't.
Two options here:

  1. Give up on shipping pre-compiled models, then it is fine that stan enables these optimizations since the compiling happens on the same cpu that will be used to run
  2. Disable these optimizations in the pre-compiled models (I found cpp_options but not clear if you can disable these optimizations using this or if you need to modify the makefile). Also the resulting binary may be slower than if it was compiled with optimizations enabled (but I'd be surprised if there was a significant difference)

Maybe option 1 is the safer/simpler way to go, especially as compiling the models doesn't take very long?

@eboileau
Copy link
Contributor Author

@lkeegan Thanks for your quick feedback.

Yes, I was looking into this in more details, I also think setting the channel priority might work.


As for the cpp_options , actually I don't think we need it, since I understand it is mostly used for high-level parallelization via multi-threading, which we did not use in the end. The problem may remain, unless all the cpu-optimization compile steps are only due to this flag, as you point out.
I think option 1 is safer, and also circumvent the above Flexbar problem. I find it annoying not to be able to ship pre-compiled models, but in the end compilation would really just happen on the first run.

@eboileau
Copy link
Contributor Author

I will update the CI config, and remove the Stan cpp option from defaults, and check if tests are passing on GA.

@eboileau
Copy link
Contributor Author

Damn it! Looks like it's not going to be solved before I leave for vacation...

Now it's not resolving Dash + werkzeug versions correctly...
In fact, even for Ubuntu there are conflicts when setting up the environment... and eventually samtools seem to be failing...

@lkeegan
Copy link
Contributor

lkeegan commented Feb 13, 2023

As this looks like only a problem with the CI setup, maybe it would make sense for you to make a release ignoring the broken CI?

I can then do the bioconda packaging & fix the CI, and if there are any issues with the actual release they can be fixed in a patch release when you're back from vacation?

Re. shipping pre-compiled models on bioconda after some more research I actually think this should be ok:

@eboileau
Copy link
Contributor Author

Wow!

But shall I leave channel-priority: strict? i.e make a release as is?

@lkeegan
Copy link
Contributor

lkeegan commented Feb 13, 2023

I think it's ok - this only affects the conda environment in the CI jobs, not what you release

@eboileau
Copy link
Contributor Author

v3.0.1 is there now.

@lkeegan
Copy link
Contributor

lkeegan commented Feb 15, 2023

This should fix the conda issues in the CI: #151

@lkeegan
Copy link
Contributor

lkeegan commented Feb 15, 2023

Test suite now runs successfully for all linux conda packages built here as well as one of the docker containers (all with pre-compiled models): bioconda/bioconda-recipes#39210

@eboileau
Copy link
Contributor Author

eboileau commented Mar 3, 2023

Thank you for this @lkeegan.
I just merged #151, but I'm not sure if I need to make a new release now i.e. if the bioconda package needs to be updated or if all is done? Before installing, I'm trying to use the docker containers and see the rpbp/tags, but I get Your user session has expired. Please sign in to continue., which is strange... (looking here bioconda/bioconda-recipes#39210, I'm not sure in what order tags are given, i.e. which one to use).
Thanks for a short update.

@lkeegan
Copy link
Contributor

lkeegan commented Mar 6, 2023

@eboileau

@eboileau
Copy link
Contributor Author

eboileau commented Mar 6, 2023

@lkeegan

Thanks for the update.

Re rpbp docker, that's right, I can access the containers now, and all seem to work fine, and also singularity, except one little thing, see #150. I'm trying to see what is the problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants