Skip to content

Should ModelChain and PVSystemLocalized both have a future in pvlib? #492

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
markcampanelli opened this issue Jun 24, 2018 · 10 comments
Closed
Labels

Comments

@markcampanelli
Copy link
Contributor

Is your feature request related to a problem? Please describe.

It is not clear to me what value ModelChain has in comparison to the PVSystemLocalized class.

Describe the solution you'd like

Let's discuss if the codebase can be simplified by removing one or the other of these classes, which appear (at least on the surface) to overlap significantly in purpose, e.g., computing performance for a PVSystem at a given Location.

Describe alternatives you've considered

I'm leaning towards getting rid of ModelChain and making PVSystemLocalized the workhorse, but I really don't know, because I don't really know what workflow might be more popular among users.

Additional context

See the architecture discussion in #487.

@jkfm
Copy link
Contributor

jkfm commented Jun 24, 2018

In terms of user workflow, I like the ModelChain approach. Although I usually estimate output for existing plants, I often do calculations for different systems at specific locations. To do that I find it helpful to keep PVSystemand Location-objects separate.

@markcampanelli
Copy link
Contributor Author

@jkfm Thanks for the feedback. What I do not understand is why we do not initialize a ModelChain object with a PVSystemLocalized object instead of separate PVSystem and Location objects, i.e., the system and location arguments to the ModelChain constructor. Do you happen to have any insight here?

@wholmgren
Copy link
Member

@thunderfish24 I appreciate your efforts to question and improve the API. Please see the following issues for the discussion behind the current design:

The modeling paradigms section of the documentation attempts to summarize the philosophy, though I can't say that it fully captures my thoughts or the community's.

With a few years of use, development, and maintenance I can add the following.

  1. I continue to mostly like the design philosophy of PVSystem, and Location. Intrinsic attributes of a system or location are stored as object attributes. Extrinsic things that influence the system or location are passed into the method arguments.
  2. ModelChain works surprisingly well. It avoids inheritance issues. It supports user extensions that do not require modifying the source code.
  3. It's difficult to add new models to ModelChain, even when the model API fits well with existing ModelChain code. This is in part understandable because I do think tying together all of the modeling steps is just a hard problem. The inference and getter/setter code adds another level of complication for developers, but they do make it easier for end users to work with the class.
  4. The ModelChain and even the Location/PVSystem API is a poor fit for some use cases. That's ok.
  5. Because of the immediately above point, it's critical that pvlib continues to maintain a strong, well-tested "base layer" of functions that implement the models. These functions will never go out of style.
  6. New frameworks, such as dask, have been developed in the last few years that provide powerful alternatives to ModelChain. pvlib should not build those frameworks into its code. pvlib should add examples of how to use its functions with those frameworks.
  7. The PVSystem class needs to live in its own module. This would make it easier to work with the pure functions.
  8. Location only implements get_* and from_* wrappers. We should consider the same approach for PVSystem.
  9. LocalizedPVSystem is a disaster. Multiple inheritance is extremely hard to do well. In my view, we should either go all in on a single class that only implements high level get_* style wrapper methods, or we should delete LocalizedPVSystem entirely. I don't want to do the former, and I think a few people would push back against the latter, so the likely outcome is that nothing changes in the foreseeable future.

@cwhanse
Copy link
Member

cwhanse commented Jun 24, 2018

The TL:DR is I am not in favor of doing away with, or making major changes to, ModelChain. I'm sure it can be better implemented, and the documentation and tutorials could be greatly improved.

I can't comment on ModelChain vs. LocalizedPVSystem, having used the former some, and the latter not at all. But the underlying user story for ModelChain is still valid: As a PV system modeler I want to run a sequence of models, specifying a model choice at each step, to get predicted PV system output. Before the ModelChain class we asked users to create scripts with the sequence of their chosen models. ModelChain provides some assistance with that process, and makes the scripting easier (I think) for new users.

One goal of PVLib (besides providing reference implementations of individual models) is to guide modelers through the modeling steps, from irradiance to AC power, and to educate about the choices available at each step. I think the current ModelChain meets this goal although in a somewhat less-user-friendly manner than I'd like.

@markcampanelli
Copy link
Contributor Author

@wholmgren and @cwhanse Thanks for the additional background, and again thanks for you patience as I come up to speed on PV systems and "out of the weeds" of single diode modeling and calibration. I will do my best to read up on those issues as quickly as I can find time for. My reaction so far to ModelChain w.r.t. PVSystem is that I think certain functionality would be better delegated to PVSystem (and merely orchestrated by ModelChain for integrated performance simulations). For example, the scaling of the DC power based on the system configuration seems to belong in PVSystem.

Echoing Cliff's statement about one important goal of PVLib, I think we really should strive to make ModelChain bring clarity and "orchestration" to the high level modeling steps/process, and banish the implementation details to classes like PVSystem and Location. Right now I don't think that this is sufficiently well done in ModelChain. For example, I now have some code where PVSystem defines a model_dc property that is assigned a callable (e.g., a function, including user-defined) that is exposed to ModelChain via the method calc_model_dc:

class PVSystem(object):
...
    def calc_model_dc(self, irrad, temp):
        """
        Calculate the DC model.

        Use the model specified in model_dc, which may be a user-defined
        function (more precisely, any callable obeying the API shown here).

        TODO Details, inc. irrad and temp being "generic"
        """

        return self.scale_voltage_current_power(
            self.model_dc(irrad, temp, **self.module_parameters))
...

Also, I have struggled a bit to accept the seemingly "non-DRY" wrappers of low level functions with PVSystem methods, but now I think I can see the flexibility that this affords some users' workflows. I think making consistent DC API's among pvwatts, sapm, and sdm (i.e., all functions that can be assigned to the PVSystem.model_dc property) will do a lot to lower the "development pain" here.

I should have some new code implementing these ideas coming soon in #478.

@cwhanse
Copy link
Member

cwhanse commented Jun 26, 2018

@thunderfish24 judging solely on the above code snipping, to me it looks like PVSystem.calc_model_dc is overlapping with ModelChain.dc_model. I think using a kwarg on a ModelChain object to select the dc model is better. I am now thinking that the keywords should be sapm, pvwatts, desoto, CEC, pvsyst, Campanelli etc.. We should give up singlediode as a key value and instead regard singlediode as a set of DC models. The ModelChain method singlediode should use the kwarg to select the appropriate calcparams function.

@markcampanelli
Copy link
Contributor Author

Re: #194 : I like the approach overall, but it seems like the implementation of ModelChain mixes up a whole bunch of concerns that should be kept separate. For example, the inclusion in the code below of dc_model=pvusa_mc_wrapper, ac_model=pvusa_ac_mc_wrapper in the ModelChain constructor seems out of place. Why isn't this in the PVSystem constructor?

# user defined parameters and functions
module_parameters = {'a': 3, 'b': 0.001, 'c': 1, 'd': -0.05}
system = PVSystem(module_parameters=module_parameters)
# pvusa_mc_wrapper is a user-defined function. see gist for details
mc = ModelChain(system, location, dc_model=pvusa_mc_wrapper, ac_model=pvusa_ac_mc_wrapper)
mc.run_model(times)

I would keep the high-level ModelChain constructor super simple, i.e., it takes only a PVSystem and Location. Then, we make one/more model_chain_factory "factory functions" that construct the PVSystem and Location objects from lots of different "parts" (inferred, if necessary) and pass them to the simple ModelChain constructor.

I haven't had time to dive into why subclasses are a bad design choice for the dc or ac models and PVSystem itself. Also, keeping all the data structure type handling straight through all the layers (e.g., pandas Series vs. pandas Dataframe vs. numpy array vs. (ordered) dict) adds significant mental and development overhead. I made what I thought was a "like-for-like" change in a refactor, and I'm now getting this error from within pandas: "ValueError: Unable to coerce to Series, length must be 2: given 1".

@cwhanse
Copy link
Member

cwhanse commented Jun 26, 2018

From my perspective, keywords like dc_model that select the model to be used at each step belong in ModelChain, not in PVSystem.

@wholmgren
Copy link
Member

At this risk of this coming off as yet another screed...

@thunderfish24 we share the same goals regarding the purpose of ModelChain: "ModelChain bring clarity and "orchestration" to the high level modeling steps/process".

Quick point of clarification:

banish the implementation details to classes like PVSystem and Location.

The implementation details should mostly live in the functions. This goes to my "functions will never go out of style" comment above.

Cliff's proposal would work for me.

A PVSystem.model_dc attribute would add a "self-awareness" to the PVSystem object. I think this makes the object more complicated, rather than less complicated. A PVSystem should continue to be a container for metadata and wrappers that automatically use that metadata if available.

For example, the inclusion in the code below of dc_model=pvusa_mc_wrapper, ac_model=pvusa_ac_mc_wrapper in the ModelChain constructor seems out of place. Why isn't this in the PVSystem constructor?

If PVSystem had a model_dc attribute as you propose, then it might make sense to put the pure functions (not the wrappers) in the PVSystem constructor. As it stands, it makes sense (at least to me) to put the wrappers in the ModelChain constructor.

@thunderfish24 what do you think about adding get_aoi_loss, get_dc_power, get_ac_power, etc. methods to PVSystem? That is what I was getting at with my comment 8 above. This would offload a lot of the model wrapping code from ModelChain onto PVSystem.

I would keep the high-level ModelChain constructor super simple, i.e., it takes only a PVSystem and Location

It is that simple, provided that you supply a PVSystem with sufficient metadata that the inference methods can fully determine your intent. All of the model choices are defined at one level of API: the ModelChain. This is simple (to me), though I admit the implementation, particularly for developers, is not.

Re data structures: this is already too long, so let's discuss that elsewhere!

@cwhanse
Copy link
Member

cwhanse commented Jan 20, 2021

@wholmgren closed via #1136?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants