Code material for Topological Data Analysis in Consonant Recognition. The data that support the findings of this study are openly available in SpeechBox (ALLSSTAR Corpus, L1-ENG division) at the Home Page of SpeechBox.
We use the Montreal Forced Aligner (MFA) to align each speech recording into phonetic segments. Detailed guidance on MFA can be found on the Installation Page, and the Montreal Forced Aligner Tutorial gives further explanation. The following steps align each recording into phonetic segments.
- Download the acoustic model and dictionary.
mfa model download acoustic english_us_arpa
mfa model download dictionary english_us_arpa
- Convert the sampling rate to 16 kHz with wav_modification (a minimal resampling sketch is given after this list).
- Align the speech recordings; the output files are in .TextGrid format.
mfa align ~/mfa_data/my_corpus english_us_arpa english_us_arpa ~/mfa_data/my_corpus_aligned
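The resampling step is handled by wav_modification in this repository. Purely for illustration, a minimal sketch of the same operation, assuming the librosa and soundfile packages and placeholder file paths (not the repository's actual code), could look like this:

```python
# Illustrative sketch only; the actual logic lives in wav_modification.
# Assumes librosa and soundfile are installed; file paths are placeholders.
import librosa
import soundfile as sf

def resample_to_16k(in_path, out_path, target_sr=16000):
    """Load a wav file at its native rate and rewrite it at 16 kHz for MFA."""
    audio, sr = librosa.load(in_path, sr=None)  # keep the original sampling rate
    audio_16k = librosa.resample(audio, orig_sr=sr, target_sr=target_sr)
    sf.write(out_path, audio_16k, target_sr)

# Hypothetical usage:
# resample_to_16k("my_corpus/speaker01/utt01.wav", "my_corpus_16k/speaker01/utt01.wav")
```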
Before constructing TopCap, a preliminary experiment measures the performance of topological methods on time series. fre_amp_av shows how topological methods distinguish different vibration patterns in time series; the results are collected in observation_result_refined.
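As a hedged illustration of the underlying idea (not the fre_amp_av script itself), the following sketch, assuming numpy and ripser, compares two synthetic vibration patterns through the longest lifetime in their H1 persistence diagrams. The signal frequencies, amplitudes, and embedding parameters are made up for illustration.

```python
# Sketch of the preliminary experiment's idea: different vibration patterns give
# different maximal H1 lifetimes after time delay embedding.
import numpy as np
from ripser import ripser

def delay_embed(signal, dim=3, delay=5):
    """Sliding-window (time delay) embedding of a 1-D signal into R^dim."""
    n = len(signal) - (dim - 1) * delay
    return np.stack([signal[i:i + n] for i in range(0, dim * delay, delay)], axis=1)

def max_h1_lifetime(signal, dim=3, delay=5):
    """Longest lifetime in the H1 persistence diagram of the embedded signal."""
    dgm = ripser(delay_embed(signal, dim, delay), maxdim=1)["dgms"][1]
    return 0.0 if len(dgm) == 0 else float(np.max(dgm[:, 1] - dgm[:, 0]))

t = np.linspace(0, 1, 2000)
slow = np.sin(2 * np.pi * 10 * t)        # low-frequency, high-amplitude vibration
fast = 0.3 * np.sin(2 * np.pi * 80 * t)  # faster, lower-amplitude vibration
print(max_h1_lifetime(slow), max_h1_lifetime(fast))
```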
TopCap is implemented in csv_writer_consonant, which captures the most significant topological features of the segmented phonetic time series. The output is a .csv file containing the birthtime and lifetime of the point with the longest lifetime in the persistence diagram.
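For a concrete picture before opening csv_writer_consonant, the following hedged sketch shows the general shape of such a computation: delay-embed a phonetic segment, take its H1 persistence diagram, and record the (birthtime, lifetime) of the most persistent point as one CSV row. The function names, embedding parameters, and random dummy segments are placeholders, not the repository's code.

```python
# Sketch of the TopCap output step: one (birthtime, lifetime) pair per phonetic segment.
import csv
import numpy as np
from ripser import ripser

def topcap_feature(segment, dim=3, delay=5):
    """Return (birthtime, lifetime) of the most persistent H1 point of a delay-embedded segment."""
    n = len(segment) - (dim - 1) * delay
    cloud = np.stack([segment[i:i + n] for i in range(0, dim * delay, delay)], axis=1)
    dgm = ripser(cloud, maxdim=1)["dgms"][1]
    if len(dgm) == 0:
        return 0.0, 0.0
    lifetimes = dgm[:, 1] - dgm[:, 0]
    k = int(np.argmax(lifetimes))
    return float(dgm[k, 0]), float(lifetimes[k])

# Hypothetical usage with dummy segments standing in for real consonant recordings.
with open("topcap_features.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["label", "birthtime", "lifetime"])
    for label, seg in [("b", np.random.randn(1500)), ("s", np.random.randn(1500))]:
        writer.writerow([label, *topcap_feature(seg)])
```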
Further discussion of TopCap can be found in:
- observation_dimension illustrates how the embedding dimension influences time delay embedding and persistence diagrams.
- observation_dimension_plot includes the parameters and graphs used in the discussion section.
- observation_skip illustrates how the skip parameter influences computation time.
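As a rough companion to these files (parameter meanings assumed here: dim is the time delay embedding dimension, skip keeps every skip-th point of the embedded cloud), a sketch like the following makes the trade-off between point cloud size, computation time, and the most persistent feature visible:

```python
# Hedged sketch of the dimension / skip trade-off; parameters are illustrative only.
import time
import numpy as np
from ripser import ripser

def embed(signal, dim, delay=5, skip=1):
    """Delay-embed a 1-D signal, then subsample the point cloud by keeping every skip-th point."""
    n = len(signal) - (dim - 1) * delay
    cloud = np.stack([signal[i:i + n] for i in range(0, dim * delay, delay)], axis=1)
    return cloud[::skip]

t = np.linspace(0, 1, 2000)
signal = np.sin(2 * np.pi * 25 * t)

for dim in (2, 3, 5):
    for skip in (1, 4, 16):
        cloud = embed(signal, dim=dim, skip=skip)
        start = time.perf_counter()
        dgm = ripser(cloud, maxdim=1)["dgms"][1]
        elapsed = time.perf_counter() - start
        top = 0.0 if len(dgm) == 0 else float(np.max(dgm[:, 1] - dgm[:, 0]))
        print(f"dim={dim} skip={skip} points={len(cloud)} "
              f"max_lifetime={top:.3f} time={elapsed:.2f}s")
```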
Classification is carried out with the MATLAB (R2022b) Classification Learner app, using 5-fold cross-validation and setting aside 30% of the records as test data. The following automatic built-in algorithms are used: Optimizable Tree, Optimizable Discriminant, Efficient Logistic Regression, Optimizable Naive Bayes, Optimizable SVM, Optimizable KNN, Kernel, and Optimizable Ensemble.
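The Classification Learner workflow above is GUI based. For readers without MATLAB, a rough Python analogue of the same protocol (30% hold-out test set, 5-fold cross-validation on the training split) can be written with scikit-learn. In this sketch an SVM stands in for the optimizable models, and the feature matrix is random placeholder data rather than the TopCap features; it is not the authors' pipeline.

```python
# Rough scikit-learn analogue of the MATLAB Classification Learner protocol described above.
import numpy as np
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Placeholder feature matrix: one (birthtime, lifetime) pair per consonant segment.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
y = rng.integers(0, 2, size=500)  # e.g. voiced vs. voiceless

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=0)
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
cv_scores = cross_val_score(model, X_train, y_train, cv=5)  # 5-fold cross-validation
model.fit(X_train, y_train)
print("CV accuracy:", cv_scores.mean(), "test accuracy:", model.score(X_test, y_test))
```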
We built other state-of-the-art models to compare with TopCap and evaluate its performance comprehensively. The MFCC-GRU classification model is provided in MFCC_GRU_classification_model, the MFCC-Transformer classification model in MFCC_Transformer_classification_model, and both the STFT-CNN and STFT-CNN^+ classification models in STFT_CNN_classification_model.
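The exact architectures are in the linked model files. As an illustration only, a minimal MFCC-GRU classifier of the kind named above, assuming PyTorch and torchaudio and using arbitrary layer sizes and class counts, might look like this:

```python
# Minimal sketch of an MFCC + GRU classifier (not the authors' exact model).
import torch
import torch.nn as nn
import torchaudio

class MFCCGRUClassifier(nn.Module):
    def __init__(self, n_mfcc=13, hidden=64, n_classes=2, sample_rate=16000):
        super().__init__()
        self.mfcc = torchaudio.transforms.MFCC(sample_rate=sample_rate, n_mfcc=n_mfcc)
        self.gru = nn.GRU(input_size=n_mfcc, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, waveform):       # waveform: (batch, samples)
        feats = self.mfcc(waveform)    # (batch, n_mfcc, frames)
        feats = feats.transpose(1, 2)  # (batch, frames, n_mfcc)
        _, h = self.gru(feats)         # h: (num_layers, batch, hidden)
        return self.head(h[-1])        # class logits

logits = MFCCGRUClassifier()(torch.randn(4, 16000))  # four one-second dummy segments
print(logits.shape)                                  # torch.Size([4, 2])
```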
The comparison experiments cover the LJSpeech, TIMIT, and LibriSpeech corpora, along with four additional corpora from ALLSSTAR that do not appear in our main experiments. The data preprocessing files can be found in the folder dataset preprocessing.
The folder supplements includes supplementary files for this project.
- The results folder contains the ROC curves and AUC values for the machine learning models, as well as the birthtime and lifetime of the consonants.
- The consonants_waveforms folder contains waveforms of pulmonic consonants. The audio for these consonants comes from the Wiki-List of consonants; the waveforms give readers a concrete picture of each consonant.