
Conversation

@phoeenniixx (Member) commented Aug 7, 2025

A proposal for the predict functionality of pytorch-forecasting v2
(Copied from the hackmd: https://hackmd.io/@Pm5-sJBvSfeR6I59oCaLOA/BJIqEgYDlg/edit)

@phoeenniixx (Member Author):

FYI @fkiraly @agobbifbk @PranavBhatP

@fkiraly (Contributor) left a comment:

Thanks, great summary!

May I request some basic elements be added to this STEP:

  • IMPORTANT: ensure you also have usage vignettes for the new design. Put these at the top of the "new design" sections
  • IMPORTANT: also at the top, discuss the requirements and design principles you are using. Do not start with the solution (this is the wrong place to start, both in writing and in thinking); start by describing the aim and the problems.
  • add an introduction, motivation, and a high-level summary of what this does and how
  • in the "code snippets" sections, make clear whether they are designs or status quo, and whether they are vignettes or internal code

@fkiraly (Contributor) commented Aug 7, 2025

Some design comments about the content:

  • I think predict has too many arguments. Can we reduce their number?
  • I think predict should be on the level of D1. So that users will never have to deal with the particular architecture in predict.

Regarding STEPs, should this not be in a single STEP together, with the scope being the ptf v2 API?

@phoeenniixx (Member Author) commented Aug 7, 2025

  • I think predict should be on the level of D1. So that users will never have to deal with the particular architecture in predict.

Can you please elaborate on what you are thinking? I am not quite sure how to move forward with this.

  • I think predict has too many arguments. Can we reduce their number?

Yeah, we can use a list (like the dicts in __getitem__ of D1) to group similar args - e.g. a returns param which is a list containing the args you want returned: index, x, etc.
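
Roughly something like this (just an illustration, the names are placeholders):

# v1-style: one boolean flag per extra output
out = model.predict(data, return_index=True, return_x=True, return_y=False)

# v2 idea: a single list collecting everything the user wants returned
out = model.predict(data, return_info=["index", "x"])  # "y", "decoder_lengths", ... also possible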

Regarding STEPs, should this not be in a single STEP together, with the scope being the ptf v2 API?

Well, I did not have edit access to #39, so I raised a new one :)

@fkiraly (Contributor) commented Aug 7, 2025

Can you please elaborate on what you are thinking? I am not quite sure how to move forward with this.

I am thinking: much closer to the sktime interface, so that the v2 interface is completely independent of the model architecture, except through __init__ of the model or package.

In particular, it means that predict can have args that relate to where to forecast, or to taking exogenous data, but must not relate to model specifics such as decoder/encoder length.

I think predict has too many arguments. Can we reduce their number?
Yeah we can use list

Good idea, or dict.

Well, I did not have edit access to #39, so I raised a new one :)

I see, @pranavvp16, can you give edit access? We can also leave different parts in different PRs to work on them, but ultimately imo we want to have a single doc.

@fkiraly (Contributor) commented Aug 7, 2025

I noticed #39 was me. I have now given you and @PranavBhatP write access so that you can directly edit. I am also happy to use this PR instead and copy stuff over from #39, as you prefer. Perhaps for the start it is even better to keep the two PRs separate?

@agobbifbk (Collaborator):

Probably here we need a distinction between two things. First, the predict method of the model class (usually the forward loop, though in some cases it can be model specific); here I imagine there are not so many parameters (for example, if I trained the model using a distribution loss, I can ask for a single sample or for multiple samples, returning e.g. the point mean and the standard deviation). Second, the predict of the D1/D2 layer, which processes the tensors returned by the model predict, using also the time and groups from the dataloader, and gives the user a more usable prediction output (csv, pandas dataframe, xarray, ...).

There is something I faced when we developed DSIPTS, related to real-time prediction. In this case we don't have the future target nor some of the known variables (e.g. the hour or the month). If we reuse the D2 as it is now, the sample generation process will probably NOT produce the samples we need (we are discarding targets with NaNs, I suppose). Somehow we need to think about putting this logic into the sample generation: what we do in DSIPTS is extend the input data with additional timestamps, with NaNs on the target (or generally on the unknown variables), before extracting the temporal features and creating the sample(s). It is not relevant at this point of the discussion, but we need to remember it :-)
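
Just for reference, a rough pandas sketch of the kind of extension we do in DSIPTS (column names and freq are only illustrative):

import pandas as pd

def extend_for_realtime(df, group_col, time_col, target_col, horizon, freq="h"):
    """Append `horizon` future timestamps per group, with NaN targets.

    Known calendar features (hour, month, ...) can still be derived from the new
    timestamps; the unknown target stays NaN and marks the slots to be predicted.
    """
    parts = []
    for group, g in df.groupby(group_col):
        future_idx = pd.date_range(g[time_col].max(), periods=horizon + 1, freq=freq)[1:]
        future = pd.DataFrame(
            {time_col: future_idx, group_col: group, target_col: float("nan")}
        )
        parts.append(pd.concat([g, future], ignore_index=True))
    return pd.concat(parts, ignore_index=True)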

@phoeenniixx phoeenniixx requested a review from fkiraly August 10, 2025 09:24
@phoeenniixx (Member Author):

Hi @fkiraly, @agobbifbk, I've made some additions to the design doc, please review.
Some points:

  • .predict() accepts only the D2 layer (and not the D1 layer or a dataframe). As @agobbifbk feared, accepting the others could lead to coupling, since we would be creating an instance of the D2 layer (in the D1 case) or of both D1/D2 layers (in the raw-dataframe case) inside the BaseModel (where .predict() will reside), and I agree with him.
  • Maybe we can move to_dataframe from an independent util into the PredictCallBack itself, because I think the data from return_info should be enough to create a dataframe? return_info will return index, x, y etc., and I think this should suffice (rough sketch below). Although I still need to try it out locally, so I am not sure if I am thinking right here.
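
A rough sketch of what I mean by the second point (everything here is hypothetical, just to illustrate the idea):

import pandas as pd

class PredictCallBack:  # hypothetical sketch, not the actual implementation
    def _to_dataframe(self, output, index):
        # `index` is a DataFrame carrying time_idx and group ids per prediction window,
        # `output["prediction"]` is a tensor of shape (N, prediction_length)
        preds = output["prediction"].detach().cpu().numpy()
        df = index.copy()
        for step in range(preds.shape[1]):
            df[f"pred_{step}"] = preds[:, step]
        return df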

@jgyasu jgyasu moved this from PR in progress to PR under review in May - Sep 2025 mentee projects Aug 11, 2025
@fkiraly (Contributor) left a comment:

Great start! I have two big comments.

  1. I think it is important to design this with at least two different examples for D2 in mind. Suggestion for how to proceed:
  • find at least one model whose D2 will not be EncoderDecoderDataModule
  • write down the vignette
  • given both vignettes, compare and write one "generalized" vignette
  2. It should be possible to access predict without D2, using only D1 and specification syntax
  • have a look at the M layer design: sktime/pytorch-forecasting#1870 - this would also work for predict
  • the interesting question is, should this be additional to, or instead of, the direct use of the D2 and T layers?

@fkiraly fkiraly moved this from PR under review to PR in progress in May - Sep 2025 mentee projects Aug 14, 2025
@phoeenniixx phoeenniixx requested a review from fkiraly August 15, 2025 12:33
@jgyasu jgyasu moved this from PR in progress to PR under review in May - Sep 2025 mentee projects Aug 15, 2025
@agobbifbk (Collaborator):

It should be possible to access predict without D2, using only D1 and specification syntax
The D2 layer produces the dataloaders for the training procedure. We can think of using only D1 plus some information for rebuilding the dataloader correctly, but we need to store information such as context/prediction length and scalers. If we force the user to pass through the D2 layer, we are sure all the information is in the correct place. I understand this is an overkill procedure; do you have any ideas to make it lighter?

@phoeenniixx (Member Author):

It should be possible to access predict without D2, using only D1 and specification syntax

Yeah, if you look at the recent changes, I've introduced a layered approach where the D1 layer object creates a D2 layer object inside the _pkg class (and not in the actual model class), and the D2 layer is then passed to the model class by the pkg class (actually just the dataloaders: the D2 layer creates the dataloaders inside the _pkg class).
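
In pseudocode, the flow is roughly this (method names are only indicative):

class Model_pkg:  # package / wrapper layer
    def predict(self, d1_dataset, **kwargs):
        # D1 -> D2 happens here, inside the _pkg class, never inside the model class
        d2 = EncoderDecoderDataModule(d1_dataset, **self.datamodule_cfg)
        dataloader = d2.predict_dataloader()
        # the model layer only ever sees plain dataloaders
        return self.model.predict(dataloader, **kwargs)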

@phoeenniixx (Member Author):

If you want, we can have a discussion on the new approach, where I could explain exactly the idea I am thinking of (based on the suggestion made by @fkiraly).

@phoeenniixx phoeenniixx moved this from PR under review to PR in progress in May - Sep 2025 mentee projects Aug 19, 2025
@phoeenniixx phoeenniixx moved this from PR in progress to PR under review in May - Sep 2025 mentee projects Aug 21, 2025
@phoeenniixx (Member Author):

Hi @fkiraly, @agobbifbk, I have added the docstrings to the model package class idea, including the "side effects" of using ckpt_path etc. Please have a look at it :)

@agobbifbk (Collaborator):

Can you point out the new lines? It will be easier for me, thx!

@phoeenniixx (Member Author) commented Aug 21, 2025

The main docstrings lie in this class:


and the proposal starts from here

@phoeenniixx (Member Author):

I would really appreciate some comments on the docstrings of the package class (starting from line 666)

@agobbifbk (Collaborator):

Ok thx, I was looking at the HackMD file :-(

@agobbifbk (Collaborator):

You wrote "The .predict() method signature for all models in v2":

def predict(
    data,
    mode: str = "prediction",  # "prediction", "quantiles", "raw"
    return_info: list[str] | None = None,  # e.g. ["index", "x", "y", "decoder_lengths"]
    write_interval,  # when to write to the disk?
    output_dir: str | None = None,  # if provided, stream to disk
    **trainer_kwargs,
) -> dict[str, torch.Tensor]:

Do you really mean model, or are you referring to the base class or a model wrapper class? It is hard to think that each model implements such a function, right?

pkg = DeepAR_pkg(trainer_cfg=trainer_cfg, ckpt_path="checkpoints/last.ckpt")
prediction_output = pkg.predict(
    data_module,  
    mode="quantiles",
    return_info=["index"]  # return index to get time_idx and groups
)

The DataModule prepares the dataloaders; what happens if you pass data_module to predict? Probably here you want to get the prediction from the test set, right? So you need the test_dataloader? I see you do it later; my question is: do we want the predict method to also accept a DataModule object?

Here I see a critical point (you already mentioned it :-) ). If a checkpoint is passed, the datamodule_cfg must be loaded from the correct place:

  • advantages: less burden for the user, we are sure we load the correct stuff
  • disadvantages: none
    - If ``ckpt_pth`` is NOT None:
        - The ``datamodule_cfg`` can either be ``dict`` or ``path``.
            - If ``dict``, the datamodule is directly configured using the dict, but this
              is dangerous as the configurations should be exactly the same, otherwise the
              model pipeline will not behave as intended

Why optionally save checkpoints?

 Provide ``model_cfg`` + ``trainer_cfg`` + ``datamodule_cfg`` (as dict). 
      Call ``pkg.fit(dataset, ...)`` to train and optionally save checkpoints.

What do you think about this: I trained my model for 1000 epochs over 2 days, and I accidentally re-run the same script. I would like to be stopped if there is already a trained model for that configuration. Do you think a simple boolean overwrite parameter could fit in the design?

Another point here:

        preds = self.model.predict(dataloader, mode, return_info, write_interval,
                                   output_dir, trainer_kwargs,
                                   **anyother_param_and_kwargs)
        return preds

This can be critical: suppose we have a D1 layer that reads from a large collection of csv files, meaning that the preds cannot be stored in memory. As I understand from the rest of the document, self.model.predict takes care of eventually saving the result to disk, BUT I don't see how the model can invert the values when, for example, a scaler has been applied to the target. The scalers are probably saved in the D2 object (are we sure that all D2 have a scaler?).
What about something like:

        for batch in dataloader:
            res = self.model.predict(batch)
            # other logic for saving the results here, so that we have access to all the D1/D2 info
            res_manipulated = d2.process_output(res, **some_params)
            ?? = d1.save_prediction(res_manipulated, **some_other_params)
              

Hope this helps and does not confuse you :-) Thx for the enormous work done so far!

@phoeenniixx (Member Author):

Do you really mean model, or are you referring to the base class or a model wrapper class? It is hard to think that each model implements such a function, right?

I mean the predict of the BaseClass (see here)
Each model doesn't have to implement this, as the basic logic will lie inside the base class and, like in the current API, most models won't even require a .predict() implementation of their own. However, to pass these params to the BaseClass, we'd need the user to pass them to the wrapper as well, so a similar signature will be in the wrapper class too.

The DataModule prepares the dataloaders; what happens if you pass data_module to predict? Probably here you want to get the prediction from the test set, right? So you need the test_dataloader? I see you do it later; my question is: do we want the predict method to also accept a DataModule object?

Well, the user will pass the data_module to the predict of the wrapper, which will create the dataloaders and then pass them to the model layer. We could have predict accept the data_module, but ONLY if we want to give the user a way to not use this wrapper at all and instead follow the same flow as the wrapper manually, passing the data_module to the model layer themselves. (Although I think that if the user is going to do everything manually, they'd want the loading process to be done manually as well, and if they want the data loading done manually, then they can pass those manually created dataloaders directly to the model layer..?)

Why optionally save checkpoints?

Well, I thought that if someone is just trying out different models and following the wrapper flow, every time they call fit it would always save the checkpoints. This could be frustrating if they were only trying out things and didn't want to save those ckpts; if the user actually wants to save the ckpt, they can set save_ckpt=True. What do you think?

What do you think about this: I trained my model for 1000 epochs over 2 days, and I accidentally re-run the same script. I would like to be stopped if there is already a trained model for that configuration. Do you think a simple boolean overwrite parameter could fit in the design?

Are you saying for the saved checkpoints? Yes, we can have it. Or an even better idea could be to always save the checkpoints in a new folder inside the checkpoints folder, named like ckpt_date_time. That way, even if they run multiple times, they could easily go back to the correct folder (much like GH commits)?
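
Something along these lines (just a sketch):

from datetime import datetime
from pathlib import Path

def new_ckpt_dir(root="checkpoints"):
    # e.g. checkpoints/ckpt_2025-08-21_14-30-05: every run gets its own folder,
    # so an accidental re-run never overwrites an earlier training
    path = Path(root) / datetime.now().strftime("ckpt_%Y-%m-%d_%H-%M-%S")
    path.mkdir(parents=True, exist_ok=False)
    return path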

This can be critical: suppose we have a D1 layer that reads from a large collection of csv files, meaning that the preds cannot be stored in memory. As I understand from the rest of the document, self.model.predict takes care of eventually saving the result to disk, BUT I don't see how the model can invert the values when, for example, a scaler has been applied to the target. The scalers are probably saved in the D2 object (are we sure that all D2 have a scaler?).

Sorry, I am not able to follow; can you please elaborate, or we could discuss it in the meet today. But for the last question: from my current understanding, the D2 layer should always have scalers. Why would a D2 layer not have scalers?

@jgyasu jgyasu moved this from PR under review to PR in progress in May - Sep 2025 mentee projects Aug 26, 2025
## Aim


The current beta version of `ptf-v2` does not have any functionality to do predictions, and this design document aims to provide some possible ideas to implement the prediction pipeline
Contributor:

I would recommend being clearer - e.g., "pipeline" is not exactly what happens here and might lead to confusion with sklearn pipelines.

Contributor:

pipeline -> vignette?

"does not have dedicated predict mode"

#### Output Types:
The output is a `Prediction` class type object, which has different keys depending upon the `mode` and other params (like `return_x` etc).
Here `N` is the size of the validation data
* "prediction" -> tensor of shape `(N, prediction_length)`
Contributor:

lack of information: where in the Prediction class are these?

Member Author:

Sure, I'll add more info - exactly explaining what the class looks like


In v2, we should try to design `.predict()` to be more general, composable, and predictable while retaining ease of use.

### Requirements
@fkiraly (Contributor), Sep 19, 2025:

very nice and well thought out

* **Memory safety:** Large predictions can be streamed to disk without exhausting RAM.

### High-Level Summary
The proposed `.predict()` system for v2:
Contributor:

I think there are some open questions:

  • we have two layers, so which of the two layers (pkg or model) has predict-like functions?
  • considering alternatives, e.g., having predict, predict_quantiles, and predict_raw, as opposed to a mode argument. This is how sktime is doing it, and we did actually consider having a mode-like arg (without taking ptf as an inspiration) and actively decided against it. The reason was that the function would just have ended up as large if/else blocks, and dispatching to each other would be as unpleasant as the methods to_quantiles etc. currently are
    • not saying that this is how we need to do this, and the two layers can even handle it differently - only that this alternative should be discussed. Why is it worse for D2 if it looked better for sktime? This should be a conscious decision.

Member Author:

  • we have two layers, so which of the two layers (pkg or model) have predict-like functions?

if we go by naming, both classes would have a function named predict, but they would work a little differently:

  • For pkg: it would be a wrapper, calling predict() of the model layer. See here for a basic idea of how it would look.
  • For the model layer: it would be the actual predict() that wraps trainer.predict() and the callbacks.

Now, looking back at the high-level summary in the EP, I think this is not clear enough; I will add clearer pointers :)

  • considering alternatives, e.g., having predict, predict_quantiles, and predict_raw, as opposed to a mode argument.

Hmm, that is a good suggestion, and I agree it would be a mess of if/else blocks. From here I think we could merge both ideas? For the wrapper, we would keep the modes, but for each mode we call a different predict inside the pkg layer - this would make the code cleaner at the model layer, and at the pkg layer it would just be an if/else block where we call a different type of predict for each mode.
Something like this:

Vignette would remain the same

prediction_output = pkg.predict(
    data_module, 
    mode="quantiles",
    return_info=["index"]  # return index to get time_idx and groups
)

pkg layer predict

def predict(self, dataset, mode, return_info, write_interval, output_dir, to_dataframe,
            trainer_kwargs, **anyother_param_and_kwargs):
    predict_dm = self._build_datamodule(dataset)
    dataloader = self._create_dataloaders(predict_dm)
    if mode == "prediction":
        return self.model.predict(...)
    elif mode == "raw":
        return self.model.predict_raw(...)
    elif mode == "quantiles":
        return self.model.predict_quantiles(...)
    else:
        raise ValueError(f"unknown mode: {mode}")

I think this would be useful for the user: just change the mode, and the rest is handled by the pkg layer. If there were different predicts at the pkg layer, the user would have to change the whole function call (and maybe even the signature, based on the requirements of the func), which is otherwise handled by the wrapper.

What do you think?

return_info: list[str] | None = None, # e.g. ["index", "x", "y", "decoder_lengths"]
write_interval, # when to write to the disk?
output_dir: str | None = None, # if provided, stream to disk
**trainer_kwargs,
Contributor:

question/idea: in D2, should trainer_kwargs move to __init__?

@phoeenniixx (Member Author), Sep 19, 2025:

These trainer_kwargs are used for the trainer initialization (to add some customizations), and as this trainer is used only during predict, I am not sure we should move them to __init__. I think a user may want to run fit with a different kind of trainer (e.g. on CPU, not GPU) than predict (run on GPU); this provides more flexibility? Also, it may be that we are using a pre-trained model, in which case trainer_kwargs would be used only by predict and nowhere else (at least I can't think of any other place).

That's why they are not kept in __init__. We keep trainer_cfg for fit in __init__, but if the user wants a slightly different predict, they can pass trainer_kwargs.
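
So the usage I have in mind is roughly this (illustrative only):

# trainer used for fit is fixed at construction time
pkg = DeepAR_pkg(model_cfg, trainer_cfg={"accelerator": "cpu", "max_epochs": 10},
                 datamodule_cfg=datamodule_cfg)
pkg.fit(dataset)

# predict can use a different trainer, e.g. on GPU, without touching __init__
preds = pkg.predict(data_module, mode="quantiles", accelerator="gpu", devices=1)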

What are your thoughts on this?

..., # other params like target_normalizer, num_workers etc
)
# init package
pkg = model_pkg(model_cfg, trainer_cfg=trainer_cfg, datamodule_cfg=datamodule_cfg)
Contributor:

interesting. Question regarding alternatives: why not **trainer_cfg, **datamodule_cfg? Have you explicitly considered both options?

Member Author:

Well, model_pkg(model_cfg, trainer_cfg=trainer_cfg, datamodule_cfg=datamodule_cfg) keeps the cfgs separate; I am not sure how **trainer_cfg, **datamodule_cfg would be used..
Are you saying something like this?
model_pkg(model_cfg, **trainer_cfg, **datamodule_cfg) - I think this would be harder to parse?
From my understanding, something like this:

pkg = model_pkg(model_cfg, **trainer_cfg, **datamodule_cfg)

would unpack trainer_cfg and datamodule_cfg and pass their keys as args to the class. So the class would need to have all the parameters of the trainer and datamodule, which would make __init__ very complex, with most of its params only used to initialise the datamodule and trainer. So why don't we keep them separate as dicts?
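
For comparison, the two options side by side (sketch, parameter names only illustrative):

# proposed: configs stay grouped, __init__ stays small
pkg = model_pkg(
    model_cfg,
    trainer_cfg={"max_epochs": 100, "accelerator": "gpu"},
    datamodule_cfg={"batch_size": 64, "num_workers": 4},
)

# alternative with unpacking: every trainer/datamodule parameter would have to
# become an explicit __init__ argument of the package class
pkg = model_pkg(model_cfg, max_epochs=100, accelerator="gpu",
                batch_size=64, num_workers=4)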

@fkiraly (Contributor) left a comment:

Really nice design, I think - some questions above.
