New features #45

Closed
7 of 12 tasks
peterdudfield opened this issue Mar 22, 2023 · 8 comments

Comments

@peterdudfield
Contributor

peterdudfield commented Mar 22, 2023

Would be interested to hear what people think I should do first.
@JackKelly @jacobbieker @dantravers

@jacobbieker
Member

I would probably start with removing the sde from training, and then probably more lag features? I think XGBoost models don't need the data to be normalized, so not sure that's necessary, although I guess if the units are different between CEDA and live MetOffice it probably makes sense to do that first.

@peterdudfield
Contributor Author

Bonus one is to add mcc and hcc to nwp variables

@JackKelly
Member

JackKelly commented Mar 22, 2023

> if the units are different between CEDA and live MetOffice it probably makes sense to do that first

Yeah, it's pretty essential that the data the model sees at inference time is exactly the same as the data it sees at training time 🙂 so I agree that sounds like the priority!
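One cheap way to catch a unit mismatch between training and live data is to compare per-variable summary statistics from the two sources. The snippet below is only a sketch: the function name `flag_unit_mismatches`, the variable names, the tolerance, and all the numbers are made up for illustration and are not taken from the CEDA or live Met Office feeds.

```python
from statistics import mean


def flag_unit_mismatches(train, live, rel_tol=0.5):
    """Return variables whose live mean is far from the training mean.

    `train` and `live` map a variable name to a list of sample values.
    A large relative gap between the two means hints at a unit change
    (for example Kelvin versus Celsius) rather than ordinary weather.
    """
    flagged = []
    for name in sorted(train.keys() & live.keys()):
        train_mean, live_mean = mean(train[name]), mean(live[name])
        scale = max(abs(train_mean), abs(live_mean), 1e-9)
        if abs(train_mean - live_mean) / scale > rel_tol:
            flagged.append(name)
    return flagged


# Hypothetical samples: temperature is in Kelvin in one source, Celsius in the other.
train_nwp = {"temperature": [281.2, 285.9, 290.4], "cloud_cover": [10.0, 55.0, 90.0]}
live_nwp = {"temperature": [8.1, 12.6, 17.0], "cloud_cover": [12.0, 50.0, 88.0]}

mismatches = flag_unit_mismatches(train_nwp, live_nwp)
print(mismatches)
```

A check like this only flags suspicious variables; the actual conversion still has to be confirmed against each data source's documentation.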

And I agree with @jacobbieker that I don't think XGBoost models require the data to be normalised (because it chops real-valued inputs up into bins).
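To illustrate why rescaling should not matter for tree splits, here is a toy example. It is a single hand-written decision stump, not XGBoost itself, and the data are made up. The point is that a split only depends on the ordering of the feature values, so min-max normalising the feature leaves the chosen partition of rows unchanged.

```python
def sum_sq(y, idx):
    """Sum of squared deviations of y[idx] from its own mean."""
    m = sum(y[i] for i in idx) / len(idx)
    return sum((y[i] - m) ** 2 for i in idx)


def best_split_partition(x, y):
    """Return the row indices sent left by the best squared-error split on x."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    best_cost, best_left = float("inf"), frozenset()
    for k in range(1, len(order)):
        # A threshold can only sit between two distinct feature values.
        if x[order[k - 1]] == x[order[k]]:
            continue
        left, right = order[:k], order[k:]
        cost = sum_sq(y, left) + sum_sq(y, right)
        if cost < best_cost:
            best_cost, best_left = cost, frozenset(left)
    return best_left


# Made-up feature (think irradiance) and target (think generation).
irradiance = [0.0, 120.0, 340.0, 560.0, 800.0, 910.0]
generation = [0.0, 0.1, 0.9, 1.1, 2.0, 2.1]

# Min-max normalise the feature to [0, 1].
lo, hi = min(irradiance), max(irradiance)
normalised = [(v - lo) / (hi - lo) for v in irradiance]

raw_partition = best_split_partition(irradiance, generation)
norm_partition = best_split_partition(normalised, generation)
same_partition = raw_partition == norm_partition
print(same_partition)
```

This says nothing about unit mismatches between training and inference data, which still break the model because the learned thresholds are in the training units.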

Does the model also get historical NWP data? If not, I think that might help a bit: i.e. if the model gets lagged GSP data for n timesteps in the past, then it might be useful to give the model NWP data for those same timesteps so the model can see the difference between the expected forecast (given the NWP) and what actually happened in the recent past. But maybe the model is already doing that?
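A rough sketch of what that could look like: for each timestep, the feature row carries the lagged GSP values together with the NWP values for those same past timesteps, plus the NWP value for the timestep being predicted. The function name `build_lagged_rows`, the column names, and the numbers are all hypothetical, and this is not how the repo currently builds its features.

```python
def build_lagged_rows(gsp, nwp, n_lags):
    """Build one feature row per timestep t.

    Each row holds the GSP and NWP values for t-1 .. t-n_lags, the NWP
    value at t, and the GSP value at t as the training target.
    """
    rows = []
    for t in range(n_lags, len(gsp)):
        row = {"nwp_now": nwp[t], "target": gsp[t]}
        for lag in range(1, n_lags + 1):
            row[f"gsp_lag_{lag}"] = gsp[t - lag]
            row[f"nwp_lag_{lag}"] = nwp[t - lag]
        rows.append(row)
    return rows


# Made-up, time-aligned series of equal length.
gsp_mw = [0.0, 5.0, 12.0, 20.0, 18.0]
nwp_irradiance = [0.0, 100.0, 250.0, 400.0, 380.0]

rows = build_lagged_rows(gsp_mw, nwp_irradiance, n_lags=2)
print(rows[0])
```

With both `gsp_lag_k` and `nwp_lag_k` in the same row, the model can in principle learn how far recent generation departed from what the NWP implied.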

@peterdudfield
Contributor Author

Thanks @JackKelly and @jacobbieker, I re-ordered the list above. Do you think that order is about right?

@JackKelly
Member

Lgtm!

@jacobbieker
Member

Looks great!

@peterdudfield
Contributor Author

Thanks. @dantravers, are you happy with this?

@dantravers

Looks reasonable to me! I'd be curious to see if this one does well, so it could be higher?
> Use historic NWP data, not just forecasts

But it seems sensible. Thanks for asking the open question!

4 participants