Abstract: Handcrafted audio descriptors and learned deep representations each bring distinct strengths and inherent limitations to speech emotion recognition (SER). Traditional handcrafted features ...
Abstract: Noise robustness is critical when applying automatic speech recognition (ASR) in real-world scenarios. One solution involves using speech enhancement (SE) models as the front end of ASR.