This is an official repository for "Harnessing Vision Models for Time Series Analysis: A Survey".

D2I-Group/awesome-vision-time-series

Awesome Vision Models for Time Series Analysis

This repository tracks the latest papers on Vision Models for Time Series Analysis and serves as the official repository for Harnessing Vision Models for Time Series Analysis: A Survey. It is actively maintained by D2I Group@UH. We will update the repository and the survey regularly.

🏆 Contribution | 📌 Taxonomy | ⚙️ Package | 🔗 Citation


Contribution

Time series analysis has progressed from traditional autoregressive models and deep learning models to recent Transformers and Large Language Models (LLMs). Efforts to leverage vision models for time series analysis have been made along the way, but they are less visible to the community because research in this domain has predominantly focused on sequence modeling. However, the discrepancy between continuous time series and the discrete token space of LLMs, together with the difficulty of explicitly modeling correlations among variates in multivariate time series, has shifted some research attention to the equally successful Large Vision Models (LVMs) and Vision Language Models (VLMs). To fill this gap in the existing literature, this survey discusses the advantages of vision models over LLMs in time series analysis and provides a comprehensive, in-depth overview of existing methods.

Figure 1: The general process of leveraging vision models for time series analysis
Figure 2: Image Transformation of Time Series
Figure 3: Illustration of different modeling strategies on imaged time series

The overall structure of our survey follows the general process of applying vision models to time series analysis, as delineated in Figure 1. Based on the proposed dual-view taxonomy, the survey reviews the primary methods for imaging time series (Figure 2) and the solutions for modeling imaged time series (Figure 3), followed by a discussion of the pre- and post-processing involved in this framework and of future directions in this promising field.


Package

This package provides common visualization (imaging) methods for time series, including Line Plot, Heatmap, Spectrogram (STFT, Wavelet Transform, Filterbank), Gramian Angular Field (GAF) and Recurrence Plot (RP). The package is available on PyPI; run the following command to install it.

pip install time2img

Our code is compatible with all common benchmarks found in Google Drive. You can run the example to reproduce our illustration of different time series imaging methods (Figure 2 in the paper).
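
For readers who want to see what two of the imaging methods listed above compute, here is a minimal NumPy sketch of GAF and RP. It does not use or reproduce the time2img API. The function names (`min_max_scale`, `gramian_angular_field`, `recurrence_plot`) and the threshold `eps` are assumptions made for illustration, and the recurrence plot is the simplest variant, with no phase-space embedding.

```python
import numpy as np


def min_max_scale(x, lo=-1.0, hi=1.0):
    """Rescale a 1-D series to [lo, hi]; a constant series maps to the midpoint."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    if span == 0:
        return np.full_like(x, (lo + hi) / 2.0)
    return lo + (hi - lo) * (x - x.min()) / span


def gramian_angular_field(x, method="summation"):
    """Encode a univariate series of length T as a T x T GAF image.

    The series is scaled to [-1, 1], mapped to angles phi = arccos(x),
    and each pixel (i, j) is cos(phi_i + phi_j) for the summation field
    or sin(phi_i - phi_j) for the difference field.
    """
    scaled = np.clip(min_max_scale(x), -1.0, 1.0)
    phi = np.arccos(scaled)
    if method == "summation":
        return np.cos(phi[:, None] + phi[None, :])
    if method == "difference":
        return np.sin(phi[:, None] - phi[None, :])
    raise ValueError("method must be 'summation' or 'difference'")


def recurrence_plot(x, eps=0.1):
    """Binary T x T image: pixel (i, j) is 1 when |x_i - x_j| <= eps.

    Simplification (assumption): each time step is treated as a 1-D state,
    so no time-delay embedding is applied.
    """
    x = np.asarray(x, dtype=float)
    dist = np.abs(x[:, None] - x[None, :])
    return (dist <= eps).astype(np.uint8)


# Usage: image a short sine wave with both methods.
series = np.sin(np.linspace(0.0, 2.0 * np.pi, 32))
gasf = gramian_angular_field(series, method="summation")
gadf = gramian_angular_field(series, method="difference")
rp = recurrence_plot(series, eps=0.1)
```

Both methods turn a series of length T into a T x T image, so long series are usually downsampled or windowed before imaging.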


Taxonomy

The taxonomy is proposed as a dual view of Time Series to Image Transformation and Imaged Time Series Modeling. For the former, the primary methods for imaging univariate time series (UTS) or multivariate time series (MTS) are described, with remarks on their pros and cons. For the latter, existing methods are classified into conventional vision models, Large Vision Models (LVMs) and Large Multimodal Models (LMMs).

  • [2025] (General) Time-VLM: Exploring Multimodal Vision-Language Models for Augmented Time Series Forecasting [paper]
  • [2024] (General) VisionTS: Visual masked autoencoders are free-lunch zero-shot time series forecasters [paper][code]
  • [2024] (General) CAFO: Feature-centric explanation on time series classification [paper][code]
  • [2024] (Sensing) Multi-sensor data fusion and time series to image encoding for hardness recognition [paper]
  • [2024] (General) Utilizing image transforms and diffusion models for generative modeling of short and long time series [paper][code]
  • [2024] (General) Hierarchical context representation and self-adaptive thresholding for multivariate anomaly detection [paper]
  • [2024] (General) Fusion of image representations for time series classification with deep learning [paper][code]
  • [2024] (Audio) Vision language models are few-shot audio spectrogram classifiers [paper]
  • [2024] (General) Training-free time-series anomaly detection: Leveraging image foundation models [paper]
  • [2024] (General) On the feasibility of vision-language models for time-series classification [paper][code]
  • [2024] (General) See it, think it, sorted: Large multimodal models are few-shot time series anomaly analyzers [paper]
  • [2024] (General) Plots unlock time-series understanding in multimodal models [paper]
  • [2024] (General) ViTime: A visual intelligence-based foundation model for time series forecasting [paper][code]
  • [2024] (Health) TimEHR: Image-based time series generation for electronic health records [paper][code]
  • [2023] (General) TimesNet: Temporal 2d-variation modeling for general time series analysis [paper][code]
  • [2023] (General) Insight miner: A time series analysis dataset for cross-domain alignment with natural language [paper]
  • [2023] (Finance) Leveraging vision-language models for granular market change prediction [paper]
  • [2023] (General) Time series as images: Vision transformer for irregularly sampled time series [paper][code]
  • [2023] (General) Your time series is worth a binary image: machine vision assisted deep framework for time series forecasting [paper][code]
  • [2023] (General) Image-based time series forecasting: A deep convolutional neural network approach [paper]
  • [2023] (Physics) Classification of time series as images using deep convolutional neural networks: application to glitches in gravitational wave data [paper]
  • [2023] (Finance) From pixels to predictions: Spectrogram and vision transformer for better time series forecasting [paper]
  • [2023] (Audio) AST-SED: An effective sound event detection method based on audio spectrogram transformer [paper]
  • [2022] (Health) TTS-GAN: A transformer-based time-series generative adversarial network [paper][code]
  • [2022] (Audio) MAE-AST: Masked autoencoding audio spectrogram transformer [paper][code]
  • [2022] (Audio) SSAST: Self-supervised audio spectrogram transformer [paper][code]
  • [2021] (Audio) AST: Audio spectrogram transformer [paper][code]
  • [2021] (Finance) Visual time series forecasting: an image-driven approach [paper]
  • [2021] (Finance) Deep video prediction for time series forecasting [paper]
  • [2020] (General) Forecasting with Time Series Imaging [paper][code]
  • [2020] (Finance) Deep learning and time series-to-image encoding for financial forecasting [paper]
  • [2020] (Finance) Trading via image classification [paper]
  • [2019] (General) A Deep Neural Network for unsupervised anomaly detection and diagnosis in multivariate time series data [paper][code]
  • [2019] (General) Multivariate time series classification using dilated convolutional neural network [paper][code]
  • [2018] (General) Classification of Time-series Images using Deep Convolutional Neural Networks [paper]
  • [2017] (Traffic) Learning Traffic as Images: A Deep Convolutional Neural Network for Large-scale Transportation Network Speed Prediction [paper]
  • [2015] (General) Imaging Time-series to Improve Classification and Imputation [paper]
  • [2015] (General) Encoding Time Series as Images for Visual Inspection and Classification Using Tiled Convolutional Neural Networks [paper]
  • [2014] (General) Extracting Texture Features for Time Series Classification [paper]
  • [2013] (General) Time Series Classification Using Compression Distance of Recurrence [paper]
  • [2005] (General) Time-series Bitmaps: a Practical Visualization Tool for Working with Large Time Series Databases [paper]

TS-Recover denotes recovering time series from predicted images. $*$: the method has been used to model the individual UTSs of an MTS. $^{\natural}$: a new pre-trained model was proposed in the work. $^{\flat}$: when pre-trained models were unused, Fine-tune refers to training a task-specific model from scratch.

Method TS-Type Imaging Multimodal Model Pre-trained Fine-tune Prompt TS-Recover Task Domain Code
Kumar et al., 2005 UTS TS-Bitmap Multiple Multiple General
Silva et al., 2013 UTS RP K-NN Classification General
Souza et al., 2014 UTS RP SVM $✔^\flat$ Classification General
Wang and Oates, 2015a UTS GAF CNN $✔^\flat$ $✔$ Classification General
Wang and Oates, 2015b UTS GAF CNN $✔^\flat$ $✔$ Classification & Imputation General
Ma et al., 2017 MTS Heatmap CNN $✔^\flat$ $✔$ Forecasting Traffic
Hatami et al., 2018 UTS RP CNN $✔^\flat$ Classification General
Yazdanbakhsh and Dick, 2019 MTS Heatmap CNN $✔^\flat$ Classification General
MSCRED MTS Other ConvLSTM $✔^\flat$ Anomaly General
Li et al., 2020 UTS RP CNN $✔$ $✔$ Forecasting General
Cohen et al., 2020 UTS LinePlot Ensemble $✔^\flat$ Classification Finance
Barra et al., 2020 UTS GAF CNN $✔^\flat$ Classification Finance
VisualAE UTS LinePlot CNN $✔^\flat$ $✔$ Forecasting Finance
Zeng et al., 2021 MTS Heatmap CNN, LSTM $✔^\flat$ $✔$ Forecasting Finance
AST UTS Spectrogram DeiT $✔$ $✔$ Classification Audio
TTS-GAN MTS Heatmap ViT $✔^\flat$ $✔$ Ts-Generation Health
SSAST UTS Spectrogram ViT $✔^\natural$ $✔$ Classification Audio
MAE-AST UTS Spectrogram MAE $✔^\natural$ $✔$ Classification Audio
AST-SED UTS Spectrogram SSAST, GRU $✔$ $✔$ EventDetection Audio
Jin et al., 2023 UTS LinePlot CNN $✔$ $✔$ Classification Physics
ForCNN UTS LinePlot CNN $✔^\flat$ Forecasting General
Vit-num-spec UTS Spectrogram ViT $✔^\flat$ Forecasting Finance
ViTST MTS LinePlot Swin $✔$ $✔$ Classification General
MV-DTSA UTS* LinePlot CNN $✔^\flat$ $✔$ Forecasting General
TimesNet MTS Heatmap CNN $✔^\flat$ $✔$ Multiple General
ITF-TAD UTS Spectrogram CNN $✔$ Anomaly General
Kaewrakmuk et al., 2024 UTS GAF CNN $✔$ $✔$ Classification Sensing
HCR-AdaAD MTS RP CNN, GNN $✔^\flat$ Anomaly General
FIRTS UTS Other CNN $✔^\flat$ Classification General
CAFO MTS RP CNN, ViT $✔^\flat$ Explanation General
ViTime UTS* LinePlot ViT $✔^\natural$ $✔$ $✔$ Forecasting General
ImagenTime MTS Other CNN $✔^\flat$ $✔$ Ts-Generation General
TimEHR MTS Heatmap CNN $✔^\flat$ $✔$ Ts-Generation Health
VisionTS UTS* Heatmap MAE $✔$ $✔$ $✔$ Forecasting General
InsightMiner UTS LinePlot $✔$ LLaVA $✔$ $✔$ $✔$ Txt-Generation General
Wimmer and Rekabsaz, 2023 MTS LinePlot $✔$ CLIP, LSTM $✔$ $✔$ Classification Finance
Dixit et al., 2024 UTS Spectrogram $✔$ GPT4o, Gemini & Claude3 $✔$ $✔$ Classification Audio
Daswani et al., 2024 MTS LinePlot $✔$ GPT4o, Gemini $✔$ $✔$ Multiple General
TAMA UTS LinePlot $✔$ GPT4o $✔$ $✔$ Anomaly General
Prithyani et al., 2024 MTS LinePlot $✔$ LLaVA $✔$ $✔$ $✔$ Classification General
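
To make the TS-Recover column in the table above concrete, here is a small sketch of one way a series can be read back from an image. It is not taken from any of the listed methods. It assumes the series was scaled to [0, 1] before a summation GAF was built, so the main diagonal holds cos(2·phi_i) = 2·s_i² − 1 and each scaled value s_i can be recovered as sqrt((d_i + 1) / 2). The function names and the choice to return the original minimum and maximum alongside the image are assumptions for illustration.

```python
import numpy as np


def gasf_unit_interval(x):
    """Build a summation GAF after scaling the series to [0, 1].

    Returns the image and the (min, max) needed to undo the scaling.
    """
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    if x_max == x_min:
        raise ValueError("a constant series cannot be min-max scaled")
    scaled = (x - x_min) / (x_max - x_min)
    phi = np.arccos(np.clip(scaled, 0.0, 1.0))
    return np.cos(phi[:, None] + phi[None, :]), (x_min, x_max)


def recover_from_gasf(image, x_min, x_max):
    """Recover the series from the GAF main diagonal.

    The diagonal equals 2 * s**2 - 1 for scaled values s in [0, 1],
    so s = sqrt((diag + 1) / 2) is unique and non-negative.
    """
    diag = np.clip(np.diag(image), -1.0, 1.0)
    scaled = np.sqrt((diag + 1.0) / 2.0)
    return x_min + scaled * (x_max - x_min)


# Usage: a round trip from series to image and back.
original = np.array([3.0, 1.0, 4.0, 1.5, 5.0, 9.0, 2.0])
image, (lo, hi) = gasf_unit_interval(original)
recovered = recover_from_gasf(image, lo, hi)
```

The round trip works only because scaling to [0, 1] keeps every angle in [0, π/2]. With scaling to [-1, 1], the diagonal alone loses the sign of each value.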

Citation

@article{ni2025harnessing,
  title={Harnessing Vision Models for Time Series Analysis: A Survey},
  author={Ni, Jingchao and Zhao, Ziming and Shen, ChengAo and Tong, Hanghang and Song, Dongjin and Cheng, Wei and Luo, Dongsheng and Chen, Haifeng},
  journal={arXiv preprint arXiv:2502.08869},
  year={2025}
}