Add time-series-forecasting task schema #2126
Open
pjhul wants to merge 2 commits into huggingface:main
Scaffolds the new `time-series-forecasting` task with input/output JSON schemas, generated TypeScript types, task metadata, and registry wiring. `pnpm check`, `test`, and `lint:check` on the tasks package all pass locally.

### Design decisions worth flagging

- `inputs` is a list of structured series objects, not a string or binary payload. This diverges from existing HF tasks but aligns closely with mature time-series libraries (Darts, GluonTS, AWS SageMaker Chronos). Parallel-array encodings don't survive contact with ragged lengths or per-series optional fields (timestamps, covariates, metadata).
- `target` is always 2D `[num_timesteps][num_channels]`. Univariate series use a length-1 channel axis. This avoids `oneOf` 1D/2D ambiguity at length-1 inputs.
- Missing observations are `null` inline inside `target`, not a separate `observed_mask`. This matches the pandas, R, SQL, and Arrow conventions. HF's TimeSeriesTransformer uses `past_observed_mask` because PyTorch tensors can't hold null, but for API consumers, handling a separate mask is much more cumbersome than inline nulls.
- Input uses `start` + `parameters.frequency` rather than an explicit timestamps array. This matches the Darts `TimeSeries` model.
- `quantile_predictions` is an array of `{level, values}` objects rather than a dict keyed by string-float. This matches HF's pattern for enumerated scored outputs (`text-generation` logprobs, classification scores, fill-mask tokens). AWS Chronos uses a dict keyed by string; we diverge intentionally for HF-ecosystem consistency.
- Three output channels: `mean` (required), `quantile_predictions`, and `samples`. The latter two are optional, and all three may coexist. There are no parametric distribution parameters; `samples` is the universal fallback for any distributional output.

### Validation

- `pnpm --filter tasks-gen inference-codegen` regenerates a clean `inference.ts` from the JSON schemas
- `pnpm --filter @huggingface/tasks check` (tsc)
- `pnpm --filter @huggingface/tasks test`
- `pnpm --filter @huggingface/tasks lint:check`
- `packages/tasks/src/tasks/index.ts`

Note
Low Risk
Primarily additive schemas/types and task metadata with minimal impact on existing tasks; risk is limited to potential downstream type/API expectations for the new task registration.
Overview
Adds a new `time-series-forecasting` task, including input/output JSON schemas, generated TypeScript inference types, task documentation (`about.md`), and curated task metadata (`data.ts`).

Wires the task into `packages/tasks/src/tasks/index.ts` by exporting the new inference types and enabling the task page/registry entry (switching it from `undefined` to `getData(...)`).

Reviewed by Cursor Bugbot for commit 5da52ea. Bugbot is set up for automated code reviews on this repo.
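
For reviewers who want a concrete picture of the shapes described under "Design decisions worth flagging", here is a minimal TypeScript sketch. It is an illustration only: the interface and helper names (`SeriesInput`, `ForecastRequest`, `QuantilePrediction`, `SeriesForecast`, `countMissing`, `quantileAt`), the ISO 8601 form of `start`, the `"1h"` frequency string, the 2D layout of quantile `values`, and the `[num_samples][num_timesteps][num_channels]` layout of `samples` are all assumptions, not taken from the generated `inference.ts`.

```typescript
// Hypothetical types mirroring the design decisions in this PR description.
// Names and exact field formats are assumptions; the generated types may differ.

interface SeriesInput {
  // Always 2D: [num_timesteps][num_channels]; null marks a missing observation.
  target: (number | null)[][];
  // Timestamp of the first observation (ISO 8601 is an assumption).
  start?: string;
}

interface ForecastRequest {
  inputs: SeriesInput[];
  // Frequency string format is an assumption.
  parameters?: { frequency?: string };
}

interface QuantilePrediction {
  level: number;
  // Assumed to share the 2D layout of `target`.
  values: number[][];
}

interface SeriesForecast {
  mean: number[][];
  quantile_predictions?: QuantilePrediction[];
  // Assumed layout: [num_samples][num_timesteps][num_channels].
  samples?: number[][][];
}

// Two series with ragged lengths and different channel counts,
// which a parallel-array encoding could not represent directly.
const request: ForecastRequest = {
  inputs: [
    { start: "2024-01-01T00:00:00Z", target: [[1.0], [null], [3.0]] },
    { target: [[10, 0.5], [11, 0.6]] },
  ],
  parameters: { frequency: "1h" },
};

// Count inline nulls; no separate observed mask is needed.
function countMissing(series: SeriesInput): number {
  let missing = 0;
  for (const step of series.target) {
    for (const value of step) {
      if (value === null) missing += 1;
    }
  }
  return missing;
}

// Look up a quantile by numeric level (exact match) instead of a string key.
function quantileAt(
  forecast: SeriesForecast,
  level: number
): number[][] | undefined {
  return forecast.quantile_predictions?.find((q) => q.level === level)?.values;
}
```

With the array-of-objects encoding, a consumer compares numeric `level` values directly rather than parsing string keys such as `"0.9"`; a client that needs tolerance for floating-point levels could swap the exact match for an epsilon comparison.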