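An exploration of time series representation and classification using the Synthetic Control Chart Time Series data set from the University of California, Irvine (UCI) Machine Learning Repository [3]. The project creates two reduced representations of the data, each of which needs fewer values to represent a time series than the original. The first uses Piecewise Aggregate Approximation (PAA) and the second uses Symbolic Aggregate approXimation (SAX) [1].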
The data set contains 600 examples of control charts, synthetically generated by the process described in Alcock and Manolopoulos (1999). There are six classes of control chart:
- Normal
- Cyclic
- Increasing trend
- Decreasing trend
- Upward shift
- Downward shift
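As a rough illustration of how the six classes map onto the raw file, here is a minimal Python sketch. It is not part of this MATLAB project. It assumes the layout of the UCI distribution: 600 whitespace-separated rows stored as six consecutive blocks of 100, in the order listed above. The names `class_of_row` and `parse_series` are hypothetical helpers introduced only for this example.

```python
# Class names in the order they are listed above.
CLASS_NAMES = [
    "Normal",
    "Cyclic",
    "Increasing trend",
    "Decreasing trend",
    "Upward shift",
    "Downward shift",
]

# Assumption: the file stores six consecutive blocks of 100 rows, one per class.
ROWS_PER_CLASS = 100


def class_of_row(row_index):
    """Return the class name for a 0-based row index (hypothetical helper)."""
    total_rows = ROWS_PER_CLASS * len(CLASS_NAMES)
    if not 0 <= row_index < total_rows:
        raise IndexError(f"row_index must be in [0, {total_rows})")
    return CLASS_NAMES[row_index // ROWS_PER_CLASS]


def parse_series(text):
    """Parse whitespace-separated rows of numbers into a list of float lists."""
    return [
        [float(value) for value in line.split()]
        for line in text.splitlines()
        if line.strip()  # skip blank lines
    ]
```

Under that assumption, rows 0 to 99 are labeled Normal, rows 100 to 199 Cyclic, and so on through rows 500 to 599 for Downward shift.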
Download and open the project with MATLAB R2021b or MATLAB Online.
Run the file main.m.
First, a figure will appear. This figure shows the original data set, divided into its corresponding classes.
Press Enter. The figure will reset. The second figure shows the PAA representation of the data set.
Press Enter. The figure will reset. The third figure shows the SAX representation of the data set.
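For readers unfamiliar with the two representations shown in the second and third figures, the following is a minimal Python sketch of PAA and SAX as commonly described in the literature. It is an illustration only and is not the MATLAB code in main.m. The function names, the use of the population standard deviation for normalization, and the requirement that the series length be divisible by the number of segments are all assumptions made to keep the example short.

```python
from bisect import bisect_right
from statistics import NormalDist, fmean, pstdev


def znormalize(series):
    """Rescale a series to zero mean and unit standard deviation."""
    mu = fmean(series)
    sigma = pstdev(series)  # assumption: population standard deviation
    if sigma == 0:
        # A constant series has no shape; map it to all zeros.
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def paa(series, n_segments):
    """Piecewise Aggregate Approximation: the mean of each equal-width segment."""
    n = len(series)
    if n_segments <= 0 or n % n_segments != 0:
        # Simplification: only handle lengths divisible by n_segments.
        raise ValueError("len(series) must be a positive multiple of n_segments")
    width = n // n_segments
    return [fmean(series[i:i + width]) for i in range(0, n, width)]


def sax(series, n_segments, alphabet_size):
    """SAX word: z-normalize, reduce with PAA, then map each mean to a letter."""
    if not 2 <= alphabet_size <= 26:
        raise ValueError("alphabet_size must be between 2 and 26")
    # Breakpoints split the standard normal curve into equiprobable regions.
    standard_normal = NormalDist()
    breakpoints = [
        standard_normal.inv_cdf(k / alphabet_size)
        for k in range(1, alphabet_size)
    ]
    reduced = paa(znormalize(series), n_segments)
    # bisect_right counts the breakpoints at or below the value,
    # which is the index of the letter for that segment.
    return "".join(chr(ord("a") + bisect_right(breakpoints, v)) for v in reduced)
```

For example, `paa([1, 2, 3, 4, 5, 6], 3)` returns `[1.5, 3.5, 5.5]`, and `sax([1, 2, 3, 4, 5, 6, 7, 8], 4, 4)` returns `"abcd"`. The number of segments and the alphabet size are free parameters here; the values this project uses are not stated above.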
[1] Lin, J., Keogh, E., Lonardi, S. & Chiu, B. "A Symbolic Representation of Time Series, with Implications for Streaming Algorithms." In Proceedings of the 8th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery. San Diego, CA. June 13, 2003.
[2] Lin, J., Keogh, E., Patel, P. & Lonardi, S. "Finding Motifs in Time Series." In Proceedings of the 2nd Workshop on Temporal Data Mining, at the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Edmonton, Alberta, Canada. July 23-26, 2002.
[3] Dua, D. & Graff, C. "Synthetic Control Chart Time Series Data Set." UCI Machine Learning Repository [https://archive.ics.uci.edu/ml/datasets/Synthetic+Control+Chart+Time+Series]. Irvine, CA: University of California, School of Information and Computer Science. 2019.