We present audio samples for the causal CleanUNet 2 model proposed in
CleanUNet 2: A Hybrid Speech Denoising Model in Time and Time-Frequency Domain.
We compare CleanUNet 2 with state-of-the-art models, including FAIR-denoiser, FullSubNet, and CleanUNet.
Spectrogram-based methods such as FullSubNet leak noise at high noise levels because the phase they extract from the noisy speech is inaccurate.
Waveform-based methods exhibit less noise leakage, even at high noise levels, because they denoise the waveform directly.
However, prior waveform-based methods such as FAIR-denoiser produce less natural-sounding speech.
The proposed CleanUNet 2 is a hybrid denoiser that operates in both the time and time-frequency domains, and it has small noise leakage while retaining more natural sound.
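
The phase argument above can be illustrated with a toy sketch. This is not code from the paper or from any of the compared models. It assumes a single synthetic frame, a naive DFT in place of a real STFT, and hypothetical helper names (`dft`, `idft`, `phase_leakage`). It pairs the true clean magnitude with the phase of the noisy frame and measures how much error remains, so any residual is due to the phase alone.

```python
import cmath
import math
import random


def dft(x):
    """Naive discrete Fourier transform of a sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spec):
    """Naive inverse DFT; keeps the real part of each sample."""
    n = len(spec)
    return [
        (sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def phase_leakage(noise_std, n=64, seed=0):
    """Residual energy, relative to the clean energy, left when the exact
    clean magnitude is combined with the phase of the noisy frame.
    Hypothetical helper for illustration only."""
    rng = random.Random(seed)
    # Synthetic "clean" frame: two sinusoids (an assumption, not real speech).
    clean = [
        math.sin(2 * math.pi * 5 * t / n) + 0.5 * math.sin(2 * math.pi * 12 * t / n)
        for t in range(n)
    ]
    # Additive white Gaussian noise at the requested level.
    noisy = [c + rng.gauss(0.0, noise_std) for c in clean]

    clean_spec = dft(clean)
    noisy_spec = dft(noisy)

    # Oracle magnitude from the clean frame, phase taken from the noisy frame.
    hybrid = [
        cmath.rect(abs(c), cmath.phase(z)) for c, z in zip(clean_spec, noisy_spec)
    ]
    estimate = idft(hybrid)

    err = sum((e - c) ** 2 for e, c in zip(estimate, clean))
    ref = sum(c ** 2 for c in clean)
    return err / ref


if __name__ == "__main__":
    for std in (0.0, 0.1, 1.0):
        print(std, phase_leakage(std))
```

Even with a perfect magnitude estimate, the residual is nonzero whenever noise is present, and in this sketch it grows with the noise level. The numbers describe this toy setup only and say nothing about the actual performance of FullSubNet or any other model.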