Implementation of “End-to-end speaker segmentation for overlap-aware resegmentation” with modifications for speaker change detection. Learn more in the presentation.
This project provides a machine learning architecture that takes an audio file as input and outputs the timestamps at which the active speaker changes. It can also identify when people are speaking and distinguish between speakers even when they talk simultaneously. It contains functions to prepare data, train models, and perform inference for both tasks, which are formally known as speaker change detection (finding the change timestamps) and speaker segmentation (finding who is speaking when, including overlaps).
Project link: https://github.com/HHousen/speaker-change-detection
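
To make the relationship between the two tasks described above concrete, here is a minimal sketch of how change timestamps could be derived from frame-level speaker activity scores. This is an illustration only, not the project's implementation: the function name `change_points`, the `frame_duration` and `threshold` parameters, and the toy scores are all assumptions made for this example. It also treats any change in the set of active speakers (including the start of silence) as a change point, which may differ from how the project defines one.

```python
from typing import List, Sequence


def change_points(
    activations: Sequence[Sequence[float]],
    frame_duration: float,
    threshold: float = 0.5,
) -> List[float]:
    """Return timestamps (in seconds) where the set of active speakers changes.

    `activations` is a hypothetical frames-by-speakers grid of scores in [0, 1],
    such as a segmentation model might produce. It is not the project's format.
    """
    timestamps: List[float] = []
    previous = None
    for index, frame in enumerate(activations):
        # Binarize each speaker's score to decide who is active in this frame.
        active = frozenset(i for i, score in enumerate(frame) if score >= threshold)
        # A change point is any frame whose active set differs from the previous one.
        if previous is not None and active != previous:
            timestamps.append(index * frame_duration)
        previous = active
    return timestamps


# Toy input (assumed values): 2 speakers, 6 frames of 0.5 s each.
scores = [
    [0.9, 0.1],  # speaker 0 only
    [0.8, 0.2],  # speaker 0 only
    [0.7, 0.6],  # both speakers overlap
    [0.2, 0.9],  # speaker 1 only
    [0.1, 0.8],  # speaker 1 only
    [0.1, 0.1],  # silence
]

print(change_points(scores, frame_duration=0.5))  # [1.0, 1.5, 2.5]
```

Because each speaker is scored independently per frame, overlapping speech shows up as more than one active speaker in the same frame, which is what allows simultaneous speakers to be told apart.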