This model performs automatic speech recognition (ASR) to convert audio signals to text, then feeds that text to a transformer that summarizes it. It is an end-to-end pipeline covering data ingestion (both text and audio), validation, transformation, training, evaluation, and deployment.
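The two-stage flow (audio to transcript, transcript to summary) can be sketched as below. This is a minimal illustration, not the project's actual code: `run_pipeline`, `PipelineResult`, `fake_transcribe`, and `first_sentence_summary` are hypothetical names, and the two stand-in functions only mimic the ASR model and the summarization transformer so the example runs without downloading any weights.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PipelineResult:
    """Holds the output of both stages (hypothetical container)."""
    transcript: str
    summary: str


def run_pipeline(
    audio: List[float],
    transcribe: Callable[[List[float]], str],
    summarize: Callable[[str], str],
) -> PipelineResult:
    """Run ASR first, then summarize the resulting transcript."""
    transcript = transcribe(audio)
    # Simple validation step: refuse to summarize an empty transcript.
    if not transcript.strip():
        raise ValueError("ASR stage produced an empty transcript")
    return PipelineResult(transcript=transcript, summary=summarize(transcript))


def fake_transcribe(audio: List[float]) -> str:
    """Stand-in for the ASR model (assumption): returns a fixed transcript."""
    return "the meeting covered budget planning. the team agreed to ship in march."


def first_sentence_summary(text: str) -> str:
    """Stand-in for the summarization transformer (assumption): keeps the first sentence."""
    return text.split(". ")[0].rstrip(".") + "."


result = run_pipeline([0.0, 0.1, -0.1], fake_transcribe, first_sentence_summary)
print(result.summary)
```

Passing the stages in as callables keeps the orchestration independent of any particular model, so the stand-ins could be swapped for the trained ASR and summarization models without changing `run_pipeline`.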