Add time-series-forecasting task schema #2126

Open
pjhul wants to merge 2 commits into huggingface:main from pjhul:task/time-series-forecasting

pjhul commented Apr 25, 2026

Scaffolds the new time-series-forecasting task with input/output JSON schemas, generated TypeScript types, task metadata, and registry wiring. pnpm check, test, and lint:check on the tasks package all pass locally.

Design decisions worth flagging

  1. inputs is a list of structured series objects, not a string or binary payload. This diverges from existing HF tasks but aligns closely with mature time-series libraries (Darts, GluonTS, AWS SageMaker Chronos). Parallel-array encodings don't survive contact with ragged lengths or per-series optional fields (timestamps, covariates, metadata).

  2. target is always 2D [num_timesteps][num_channels]. Univariate series use a length-1 channel axis. This avoids the ambiguity a oneOf over 1D/2D would create for length-1 inputs.

  3. Missing observations are null inline inside target, not a separate observed_mask. Matches the pandas, R, SQL, and Arrow conventions. HF's TimeSeriesTransformer uses past_observed_mask because PyTorch tensors can't hold null, but for API consumers, dealing with a separate mask is much more cumbersome than inline nulls.

  4. Input uses start + parameters.frequency rather than an explicit timestamps array. Matches the Darts TimeSeries model.

  5. quantile_predictions is an array of {level, values} objects rather than a dict keyed by string-float. Matches HF's pattern for enumerated scored outputs (text-generation logprobs, classification scores, fill-mask tokens). AWS Chronos uses a dict-keyed-by-string; we diverge intentionally for HF-ecosystem consistency.

  6. Three uncertainty channels on output: mean (required), quantile_predictions, and samples. The latter two are optional and may coexist. No parametric distribution parameters. samples is the universal fallback for any distributional output.
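
To make the decisions above concrete, here is a rough TypeScript sketch of the shapes they imply. It is not the generated inference.ts from this PR: the interface names, the helper functions, the ISO 8601 type of start, and the array layouts of mean, values, and samples are assumptions made for illustration. Only the field names inputs, target, start, mean, quantile_predictions, level, values, and samples come from the description.

```typescript
// Hypothetical shapes inferred from the PR description; the generated
// inference.ts in the PR is authoritative and may differ.
interface SeriesInput {
  // Always 2D: [num_timesteps][num_channels]; null marks a missing observation.
  target: (number | null)[][];
  // Timestamp of the first observation (assumed here to be an ISO 8601 string).
  start?: string;
}

interface QuantilePrediction {
  level: number;
  // Assumed layout: [horizon][num_channels].
  values: number[][];
}

interface ForecastOutput {
  // Assumed layout: [horizon][num_channels].
  mean: number[][];
  quantile_predictions?: QuantilePrediction[];
  // Assumed layout: [num_samples][horizon][num_channels].
  samples?: number[][][];
}

// Lift a univariate series into the 2D layout with a length-1 channel axis.
function toTarget2D(values: (number | null)[]): (number | null)[][] {
  return values.map((v) => [v]);
}

// Derive a separate observed mask from inline nulls, for backends that need one.
function observedMask(target: (number | null)[][]): boolean[][] {
  return target.map((row) => row.map((v) => v !== null));
}

// Convert a dict keyed by string-float into the array-of-objects form,
// sorted by ascending quantile level.
function quantilesFromDict(dict: Record<string, number[][]>): QuantilePrediction[] {
  return Object.keys(dict)
    .map((key) => ({ level: Number(key), values: dict[key] }))
    .sort((a, b) => a.level - b.level);
}

// Example values: one univariate series with a gap, and a two-step forecast.
const exampleInput: SeriesInput = {
  target: toTarget2D([1.0, null, 3.0]),
  start: "2026-01-01T00:00:00Z",
};

const exampleOutput: ForecastOutput = {
  mean: [[3.5], [4.0]],
  quantile_predictions: quantilesFromDict({
    "0.9": [[4.2], [5.1]],
    "0.1": [[2.8], [2.9]],
  }),
};
```

The mask helper shows why inline nulls lose nothing: a backend that needs past_observed_mask can rebuild it in one pass, while API consumers never have to keep two arrays aligned.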

Validation

  • pnpm --filter tasks-gen inference-codegen regenerates clean inference.ts from the JSON schemas
  • pnpm --filter @huggingface/tasks check (tsc)
  • pnpm --filter @huggingface/tasks test
  • pnpm --filter @huggingface/tasks lint:check
  • Registered in packages/tasks/src/tasks/index.ts

Note

Low Risk
Primarily additive schemas/types and task metadata with minimal impact on existing tasks; risk is limited to potential downstream type/API expectations for the new task registration.

Overview
Adds a new time-series-forecasting task, including input/output JSON schemas, generated TypeScript inference types, task documentation (about.md), and curated task metadata (data.ts).

Wires the task into packages/tasks/src/tasks/index.ts by exporting the new inference types and enabling the task page/registry entry (switching it from undefined to getData(...)).

Reviewed by Cursor Bugbot for commit 5da52ea. Bugbot is set up for automated code reviews on this repo.

pjhul marked this pull request as ready for review April 25, 2026 19:58