Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Setting image width and hight via a grammar? #8

Open
2 tasks
isedwards opened this issue Oct 22, 2020 · 9 comments
Open
2 tasks

Setting image width and hight via a grammar? #8

isedwards opened this issue Oct 22, 2020 · 9 comments
Assignees

Comments

@isedwards
Copy link
Collaborator

Currently the width, height and resolution of the images generated by R are hardcoded in a section of code that needs to be refactored for use with all R process methods.

  • What interface should be exposed for users to change these image parameters and does this form part of a grammar?
  • Where should default settings be stored if the user does not make a choice (e.g. a yaml config file that can be set up by a server administrator, or a database settings table that allows default settings to be modified by administrators through a user interface in the future)?
@isedwards
Copy link
Collaborator Author

isedwards commented Oct 22, 2020

I can take on the second of these. I think it's clear that OpenCDMS will have it's own database backend for extensive configuration (in addition to the climate record). In development there should be the option for this to be an SQLite database that has sensible defaults.

In testing and production, the main database management system can be utilised instead of SQLite.

@dannyparsons
Copy link
Contributor

On the first point, as an example, in R, generally saving a graph is a separate function to the functions that create graphs. For example, graphs produced with the ggplot2 package can be saved with the function ggsave(). The main relevant parameters of ggsave() appear to be width, height and units (in, cm or mm) and possibly dpi.

The image_graph() function from the magick package is another example that is currently used in the windrose function you linked to above has similar width, height (in pixels) and res parameters. (This could also be used to save ggplot graphs.)

If we are thinking about a grammar then it certainly makes sense to separate out the definition and production of a graph from saving a graph as an image. This is achievable for graphs produced with the ggplot2 package in R, for example, because graphs are saved as a ggplot object in the R environment, which can then be further manipulated, printed to a graphics device, or saved separately e.g.

library(ggplot2)
p <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
ggsave(filename = "scatter_plot.png", plot = p)

However, I am not sure if this same structure will be possible for the OpenCDMS processes in general since plots will be produced from many different systems. I imagine that for some of the systems that will be used to general plots, they may only return an image file and it may not be possible to retrieve an intermediate object?

Possibly we may want to first define the parameters we want users to be able to specify when generating a graph and then determine if we can apply these parameters in consistent ways for the current known systems that will be used to produce graphs. From there we could then see how it may be possible or not to separate the image saving in these cases. Does this sound sensible?

@isedwards
Copy link
Collaborator Author

We made quite a lot of progress in the last technical meeting discussing the grammar.

@dannyparsons / @lloyddewit - could you focus on demonstrating the processes (including boxplots) in R? I can then incorporate these into Python and also refactor everything so that the implementation is tidier.

  • We need to jointly agree the input test data
  • Demonstrate the process working in R (Danny/Stephen)
  • Create inline documentation for process in R (Danny/Stephen)
  • Incorporate the process and documentation into Python and add tests (Ian)

Does this sound okay? We may be able to get through the highest priority processes quite quickly with this approach.

@dannyparsons
Copy link
Contributor

Sounds good. I will start with a documented R script for climatic boxplots and share that with you so you can start the incorporation process.

@dannyparsons
Copy link
Contributor

@iedwards attached is an R file containing a documented climatic_boxplots() function with examples using dummy data. I have added inline documentation of the code. (I have also documented the function using the roxygen2 standard used for generating R package documentation though this may not be necessary.)

Is this what you need to incorporate into the processes?

Let me know if you would prefer I upload this somewhere in the repo instead.

climatic_boxplots.zip

@isedwards
Copy link
Collaborator Author

Yes - this looks good.

In the long term, I not expecting any of the R-Instat or Climatol code to live in the OpenCDMS repository (we don't want to maintain it here).

Instead, I'm expecting that we should be able to do something like:

library(r_instat)
library(climatol)

Then we would only need the examples of how to call the process in order to write the Python wrapper and tests

@dannyparsons
Copy link
Contributor

That sounds sensible in general. For many processes there is an easily extractable R function which can be packaged and then used by both R-Instat and OpenCDMS. This is something we would like to do anyway.

There's some interesting cases as well for the R-Instat team to consider. For example, in R-Instat, climatic boxplots is a dialog but we do not use the function climatic_boxplots() to generate the plot. Instead, the dialog directly produces the R code for which climatic_boxplots() is a wrapper for e.g. for the equivalent of:

climatic_boxplots(data = df, x = "month", y = "tmin", facet_row_var = "station")

the R-Instat dialog would produce:

ggplot(df, aes(x = month, y = tmin)) +
  geom_boxplot() +
  facet_grid(rows = vars(station)) 

so I wrote the function climatic_boxplots() to replicate what the R-Instat dialog does essentially. We can still maintain R functions like climatic_boxplots() in the package which will have the same functionality as R-Instat, but may not be called in R-Instat for the same process, but we can ensure have consistent functionality. There may be other uses for these functions outside R-Instat too (e.g. for R users) so I think there is a good case for us maintaining them, even if not used directly in R-Instat.

In short, this approach is sensible and R-Instat processes can fit into this.

@dannyparsons
Copy link
Contributor

@iedwards are you happy with the format of supplying the R script for climatic boxplots? If so, I will continue with the other priority processes.

@isedwards
Copy link
Collaborator Author

Yes - although, instead of attaching sample codes to an issue as a zip file could you create a sub-directory in the repository and resume regular pull requests?

We can accept any submissions that are in the sub-directory and then I will refactor out to other places.

I've just merged #9. I will go through the conversation again and create issues for any of the points that still need solutions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants