Problems configuring marian #44

Open
bhaddow opened this issue Aug 9, 2024 · 7 comments
Labels
bug (Something isn't working), need investigation (Unknown scope)

Comments

@bhaddow
Contributor

bhaddow commented Aug 9, 2024

It looks like marian needs to be built in a specific way, different from the instructions in https://marian-nmt.github.io/docs/

  1. OpusPocus automatically appends bin to the path you give for marian, so that path must contain a bin directory.
  2. OpusPocus expects to find spm_train in this bin directory, so you need to make sure spm is built.

Maybe this should be fixed in the documentation?
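
The two requirements listed above can be checked before starting a run. The following is only a minimal sketch, not OpusPocus code: the function name `missing_marian_binaries` and the `REQUIRED_BINARIES` tuple are hypothetical, and only `spm_train` is named in this thread as a binary OpusPocus looks for (the `marian` entry is an assumption).

```python
import os
from pathlib import Path

# Assumption: only spm_train is named in this thread; "marian" is added
# here for illustration and may not match what OpusPocus actually needs.
REQUIRED_BINARIES = ("marian", "spm_train")


def missing_marian_binaries(marian_dir, required=REQUIRED_BINARIES):
    """Return the names in `required` that are not executable files
    inside <marian_dir>/bin (the directory OpusPocus appends itself)."""
    bin_dir = Path(marian_dir) / "bin"
    missing = []
    for name in required:
        candidate = bin_dir / name
        if not (candidate.is_file() and os.access(candidate, os.X_OK)):
            missing.append(name)
    return missing
```

An empty list means the directory layout matches what is described above; a non-empty list names the executables that still need to be built or copied into bin.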

@rggdmonk added the bug (Something isn't working) and need investigation (Unknown scope) labels on Aug 12, 2024
@varisd
Contributor

varisd commented Aug 12, 2024

This is a result of the local paths I used on LUMI and needs to be fixed. We should use the build/ directory (and ideally provide a way of installing marian).

@varisd
Contributor

varisd commented Aug 13, 2024

Please check PR #49 for the proposed changes to the codebase and let me know whether they look like a feasible improvement on the current situation.

@bhaddow
Contributor Author

bhaddow commented Aug 13, 2024

I would have suggested a documentation update instead, but this also works. Is the sample pipeline config compatible with the marian installer?

@varisd
Contributor

varisd commented Aug 14, 2024

Right now, I am suggesting that marian be installed outside of the pipeline run, as part of the OpusPocus installation steps.

Technically, we could also update GenerateVocabStep, TrainModelStep and TranslateStep to detect during execution whether a Marian installation is present and, if not, run the installation scripts themselves. However, I am worried that a user might get confused if the Marian installation failed during step execution.

Another approach would be to implement separate "3rd party software installation" steps that would run at the beginning of the pipeline (or in a separate pipeline). This would be similar to the software installation targets in Mozilla's Snakemake pipeline (which also took care of software installation).
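
The "installation step first" idea can be sketched as a small dependency graph. This is an illustration only, not the OpusPocus step API: `InstallMarianStep`, `execution_order` and `run_pipeline` are hypothetical names, and the dependencies between the three existing steps are assumed for the sake of the example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Step name -> names of the steps it depends on.
# "InstallMarianStep" is hypothetical; the other three step names come
# from the discussion, but their mutual dependencies are assumed here.
DEPENDENCIES = {
    "InstallMarianStep": set(),
    "GenerateVocabStep": {"InstallMarianStep"},
    "TrainModelStep": {"InstallMarianStep", "GenerateVocabStep"},
    "TranslateStep": {"InstallMarianStep", "TrainModelStep"},
}


def execution_order(dependencies):
    """Return the step names ordered so that dependencies come first."""
    return list(TopologicalSorter(dependencies).static_order())


def run_pipeline(dependencies, run_step):
    """Run steps in dependency order and stop at the first failure.

    `run_step(name)` must return True on success. The result is a pair
    (completed step names, name of the failed step or None).
    """
    completed = []
    for name in execution_order(dependencies):
        if not run_step(name):
            return completed, name
        completed.append(name)
    return completed, None
```

Because every Marian-dependent step depends on the installation step, a failed installation stops the run before vocabulary generation or training starts, so the error is reported by the step that caused it.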

@bhaddow
Contributor Author

bhaddow commented Aug 17, 2024

To get this script to work on our server, I have to disable CUDNN (setting -DUSE_CUDNN=OFF).

And if I don't set the cudnn version, I cannot set the number of threads to all.
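
For reference, a rough sketch of how an install helper might make cuDNN and GPU support optional when assembling the CMake call. This is not the script from the PR: `marian_cmake_args` is a hypothetical helper, and apart from `-DUSE_CUDNN`, which is quoted above, the flags (`COMPILE_CUDA`, `COMPILE_CPU`, `USE_SENTENCEPIECE`) are Marian CMake options as I understand them and should be checked against Marian's CMakeLists.txt.

```python
def marian_cmake_args(use_gpu=True, use_cudnn=False):
    """Build the argument list for configuring Marian with CMake.

    Hypothetical helper: flag names other than USE_CUDNN are assumed
    to match Marian's CMake options and are not taken from this thread.
    """
    if use_cudnn and not use_gpu:
        raise ValueError("cuDNN only makes sense for a GPU (CUDA) build")

    def on_off(flag):
        return "ON" if flag else "OFF"

    return [
        "cmake",
        "..",
        "-DCMAKE_BUILD_TYPE=Release",
        f"-DCOMPILE_CUDA={on_off(use_gpu)}",
        "-DCOMPILE_CPU=ON",
        f"-DUSE_CUDNN={on_off(use_cudnn)}",
        # SentencePiece must be built so that spm_train is available.
        "-DUSE_SENTENCEPIECE=ON",
    ]
```

The resulting list could be passed to `subprocess.run(args, cwd=build_dir, check=True)`, where `build_dir` is the Marian build directory. Calling `marian_cmake_args(use_gpu=False)` gives a CPU-only configuration.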

@bhaddow
Contributor Author

bhaddow commented Aug 17, 2024

Why do we need a script for cpu install? Would anyone train on cpu?

@varisd
Contributor

varisd commented Aug 19, 2024

> Why do we need a script for cpu install? Would anyone train on cpu?

I think it is not a bad idea to have the option there, for example if someone just wants to try out OpusPocus locally. It could also be useful for translation (there, the CPU version should be usable).


3 participants