Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Product shape #101

Merged
merged 51 commits into from
Apr 19, 2017
Merged

Product shape #101

merged 51 commits into from
Apr 19, 2017

Conversation

axch
Copy link
Collaborator

@axch axch commented Apr 19, 2017

Thorough rewrite of the Aronnax Python package, carrying out the packaging plan outlined in Issue #95 sufficiently that the remaining items are decoupled and incrementalizable.

Effects on issues:

axch added 30 commits April 17, 2017 15:41
Also suppress ConfigParser's default behavior of case-normalizing the
options.
Turns out the driver was ignoring the botDrag parameter.
@edoddridge
Copy link
Owner

You weren't kidding - this is a serious rewrite!

I think I like the structure, but I'm struggling to get it running. After running python setup.py install the python files get copied over to a new directory under Anaconda, but they don't take the fortran core or the makefile with them. This means that compile_core tries to run in a directory without these files, and hence it fails. From some online sluething, it seems that we can use a MANIFEST.in file to make sure these files get copied across when the package is installed.

Unless I've missed something obvious?

@axch
Copy link
Collaborator Author

axch commented Apr 19, 2017

Travis seems to agree with you. This is actually an old bug, but I think I know why it is manifesting now and didn't manifest earlier. To wit, before, the .f90 file and the Makefile were, in effect, being searched for relative to the test suite rather than relative to the installed package.

I also know why I didn't experience this problem during development: I had install Aronnax with pip install -e ., which creates symbolic links in the installation directory that point back at the source files. This arrangement is convenient for iterating, because I don't have to reinstall every time I edit the source, but masks the problem that not enough files are, in fact, installed. If that is a sufficient workaround for you to be able to continue the review, I'd like to think a bit on the best way to resolve this issue more permanently.

@edoddridge
Copy link
Owner

Thanks - using pip install -e . means that the test suite runs, and will let me continue the review.

@edoddridge
Copy link
Owner

Comments:

  • I like the implementation of an input configuration file, that then gets altered and written to a new file to become the version that is used for the simulation and the basis of the parameters.in file for the fortran core.
  • As we discussed, this PR introduces lots of undocumented features and code, but that documentation can be incrementally included as the code and interface are refined. I don't think there is any point delaying this PR while we make the documentation more complete. As we use this new interface I expect we'll find friction points and refine the code, thus necessitating changes to the documentation.

Things that I've changed:

  • rename the beta_plane Coriolis functions to include an f in the name. Otherwise it seems like they refer to the velocities, rather than the grid locations.
  • likewise, rename the f_plane Coriolis functions.
  • make the .travis.yml file use the pip install -e . command so that it can run the test suite. Once this PR is merged, one of us should file a ticket to fix that issue.
  • split the Coriolis functions and the wetmask function into separate sections in the documentation
  • edit the docstring of driver.simulate() to indicate that aronnax-merged.confandparameters.in` are generated automatically.
  • added a section to the documentation called "Running Aronnax"

Questions:

  • There is a function named default_configuration, but it doesn't seem to actually create a configuration file with the defaults, apart from setting the compile prefixes to false. Am I missing something?
    • My conclusion from this is that it still requires the aronnax.conf file to exist and contain the defaults, which then get transferred to aronnax-merged.conf unless they are explicitly overwritten by the call to driver.simulate()`. I'm happy with that choice, but want to confirm that I've understood what is happening.
  • Would it be polite to have the driver notice if the output directories already exist, and write to new ones with an incremented number appended? This might be a pain to implement in the fortran core, but could certainly be done with the python generated NetCDF output. A fairly major downside of this is that it would break the correspondence between the config files and the output - it would be possible to edit the config, rerun the simulation and end up with output and config files that came from different simulations.
  • The aronnax-merged.conf files contain memory addresses for inputs that were generated from functions specified in aronnax.conf. Is that likely to be an issue for restarting simulations?

@axch
Copy link
Collaborator Author

axch commented Apr 19, 2017

Your changes look good to me. To your questions:

  • Yes, the default_configuration function is not a complete configuration, and aronnax.conf is required. Its purpose is to set default values for parameters required by the driver program, so that the driver does not crash if aronnax.conf is missing some fields that have reasonable defaults. It would be reasonable to extend this to provide sensible defaults for more parameters, especially "administrative" ones like DumpWind, but I actually think that forcing the user to look at the physics variables (e.g., by copying and modifying an example aronnax.conf) is better than trying to set defaults for them. You also correctly noticed that default_configuration does not save a default configuration file; it just operates in memory.
  • I don't think auto-incrementing names of directories is polite. Instead, it would be reasonable to do any subset of
    • status quo, and warn users in the documentation to save expensive run outputs; or
    • abort before starting the core if the output directory exists and is not empty, expecting users to move old runs out of the way; or
    • teach the core to abort instead of overwriting a file that already exists; or
    • offer the user runtime control of the output directory, e.g. by adding one more configuration parameter for it, perhaps with some combination of the above; or
    • actually, offering the user runtime control of the input directory accomplishes the same effect, because then they become free to control the output directory by controlling the working directory of the process; or
    • offering control of both directories; or
    • finally, as an additional option above and beyond setting the output directory, we can add an auto-increment feature; or possibly some other sort of autogeneration of the name, like a content hash of the configuration file and/or the inputs.
  • If one provided an in-memory function to driver.simulate, one should not expect to be able to automatically rerun it from the merged configuration file alone. (Actually, it is possible to try to provide that capability by saving that function in a Python pickle file, or some such, but that's a pain to think about.) However, specifically restarting, specifically on the same system, can presumably be coded (as an additional feature) to skip the input specification and reuse the generated .bin input files. Then it doesn't matter that the custom function cannot be directly recovered. The same code can also cover the "tweak a parameter and rerun" use case.

Which of the above items are actionable enough to pull out as their own tickets, or as additions to tickets 102-106?

@axch
Copy link
Collaborator Author

axch commented Apr 19, 2017

Speaking of which, do you find that tickets 102-106 cover the gaps you see in this PR? You specifically mentioned documentation, which is #103.

@edoddridge
Copy link
Owner

Cool. Thanks for the explanations.

I agree that forcing the user to think, at least briefly, about the physical parameters is a good idea. I was just a little thrown that a function called default_configuration doesn't actually create a default configuration. Expanding this function to cover the housekeeping parameters is probably worth its own ticket.

Honestly, I think the status quo with a warning in the documentation is probably the best option. So let's leave it as is and I'll open a ticket to add a warning to the docs.

I should have mentioned that 102-106 look good. Thanks for opening them.

@edoddridge edoddridge merged commit 2e839e0 into edoddridge:master Apr 19, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants