State-of-the-art single-channel speech enhancement algorithms face two fundamental problems. First, the bias and variance of the spectral estimator can introduce both musical noise and audible speech distortion. Second, existing noise estimators cannot track non-stationary noise quickly enough, and the resulting underestimate of the noise leaves a large amount of residual noise. We have verified that a cepstrum-based postprocessing scheme can suppress some non-stationary noise components without introducing audible speech distortion. Building on that result, this project will study single-channel speech enhancement based on non-linear speech spectrum analysis. Compared with conventional methods based on linear spectral estimation, the proposed method, which uses both cepstral analysis and the reassigned spectrogram, has at least two advantages. First, cepstral analysis can separate the noise components from the speech components. Second, the reassigned spectrogram makes it possible to fully exploit both the temporal correlation between successive speech frames and the frequency correlation between adjacent bands. The project will study the theoretical and statistical properties of the cepstral coefficients of speech. The results will be used both to improve noise tracking and, in postprocessing, to suppress residual non-stationary noise. The project will also study the reassigned spectrogram theoretically, using the inter-frame and inter-band correlation of speech to further suppress non-stationary noise components. Through this work we intend to improve the performance of single-channel speech enhancement in real environments and to make it more applicable in practice.
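To make the cepstral-analysis idea concrete, the following is a minimal sketch of the textbook real cepstrum, not the project's algorithm. It illustrates why voiced speech is easy to pick out in the cepstral domain: the harmonic structure of a voiced frame collapses into a single peak at the pitch period. The function names, the 60 to 400 Hz search range, and the synthetic 200 Hz test frame are all assumptions introduced for illustration.

```python
import numpy as np

def real_cepstrum(frame):
    """Real cepstrum: inverse FFT of the log magnitude spectrum."""
    frame = np.asarray(frame, dtype=float)
    spectrum = np.fft.fft(frame)
    # A small floor avoids log(0) in empty bins.
    log_magnitude = np.log(np.abs(spectrum) + 1e-12)
    return np.fft.ifft(log_magnitude).real

def cepstral_pitch(frame, fs, f_min=60.0, f_max=400.0):
    """Estimate the pitch (Hz) from the largest cepstral peak
    inside an assumed plausible pitch range."""
    windowed = frame * np.hanning(len(frame))
    cepstrum = real_cepstrum(windowed)
    q_lo = int(fs / f_max)   # shortest pitch period, in samples
    q_hi = int(fs / f_min)   # longest pitch period, in samples
    peak = q_lo + int(np.argmax(cepstrum[q_lo:q_hi + 1]))
    return fs / peak

# Synthetic "voiced" frame (hypothetical test signal): harmonics of 200 Hz.
fs = 16000
t = np.arange(1024) / fs
voiced_frame = sum(np.cos(2 * np.pi * k * 200.0 * t) / k for k in range(1, 40))
f0_estimate = cepstral_pitch(voiced_frame, fs)
```

In this sketch the speech harmonics concentrate near one quefrency, whereas broadband noise has no such peak. That is one intuition behind treating speech and noise differently in the cepstral domain.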
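Single-channel speech enhancement faces two fundamental problems. First, the variance of the spectral estimate can produce "musical noise" and can also cause speech distortion. Second, current noise estimation algorithms have difficulty tracking non-stationary noise, and underestimating the noise leaves a large amount of residual noise. To address these two problems, this project carried out research on single-channel speech enhancement based on non-linear speech spectrum analysis. Compared with conventional methods based on linear spectral estimation, non-linear speech spectrum estimation based on cepstral analysis and the reassigned spectrogram has the following advantages. First, the reassigned spectrogram makes full use of the inter-frame and inter-frequency correlation of speech. Second, cepstral analysis can separate the noise from the speech signal. The project studied the reassigned spectrogram theoretically, using the channelized instantaneous frequency (CIF) and the local group delay (LGD), which reflect the inter-frame and inter-frequency characteristics of the speech signal. Tests under a variety of background noises show that the algorithm based on reassigned-spectrogram analysis improves the estimation performance of a priori SNR algorithms. Further, combining cepstral analysis with the reassigned spectrogram improves the handling of non-stationary noise: it better suppresses the harmonic components of voiced speech and reduces the cases in which relatively strong noise components are misjudged as speech, avoiding both overestimation and underestimation of the noise power spectrum. In addition, a priori SNR based single-channel speech enhancement algorithms severely distort the higher-order speech harmonics at low SNR. To address this, the project proposed an a priori SNR estimation method based on second-order spectrum (二次谱) harmonic reconstruction. The enhanced signal first undergoes second-order spectrum processing to strengthen the periodicity of the speech signal, and harmonic reconstruction is then applied to boost the harmonic components. Experiments show that at low SNR the algorithm effectively enhances the speech harmonics and produces less speech distortion than conventional a priori SNR estimation algorithms. By using non-linear methods to address the inherent problems of single-channel speech enhancement, the project improved the intelligibility of the enhanced speech and moved speech enhancement research in a more practical direction. The results can be applied to the vast majority of current speech communication systems, reducing interference from environmental noise on the one hand and improving the intelligibility and naturalness of speech on the other.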
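As an illustration of the channelized instantaneous frequency mentioned above, the sketch below uses a common finite-difference approximation: the phase advance of each STFT channel between two frames taken one sample apart. It is not the project's implementation; the function name, the Hann window, the 512-point frame, and the 1010 Hz test tone are assumptions chosen for the example. The local group delay is the analogous phase derivative along the frequency axis and is not shown.

```python
import numpy as np

def channelized_instantaneous_frequency(x, fs, start, n_fft=512):
    """Approximate the CIF (Hz) of every STFT channel at one time position
    from the phase advance between two frames one sample apart."""
    window = np.hanning(n_fft)
    frame_a = np.fft.rfft(x[start:start + n_fft] * window)
    frame_b = np.fft.rfft(x[start + 1:start + 1 + n_fft] * window)
    # Phase advance per sample in each channel, wrapped to (-pi, pi].
    phase_advance = np.angle(frame_b * np.conj(frame_a))
    cif_hz = phase_advance * fs / (2 * np.pi)
    return cif_hz, np.abs(frame_a)

# Hypothetical test signal: a pure tone that falls between two FFT bins.
fs = 16000
n = np.arange(2048)
tone = np.cos(2 * np.pi * 1010.0 * n / fs)

cif_hz, magnitude = channelized_instantaneous_frequency(tone, fs, start=256)
peak_bin = int(np.argmax(magnitude))
bin_centre_hz = peak_bin * fs / 512   # frequency the plain spectrogram would report
cif_at_peak = cif_hz[peak_bin]        # frequency the reassignment step would report
```

In this example the plain spectrogram can only place the tone at the centre of its strongest bin, while the CIF of that bin recovers a value close to the true tone frequency. Relocating energy to such estimates is what sharpens a reassigned spectrogram.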
Data last updated: 2023-05-31
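Research on deep-learning-based single-channel speech dereverberation
Research on single-channel speech separation based on a probabilistic acoustic tube model
Research on robust speech recognition methods based on speech enhancement
Research on single-channel speech separation based on null space pursuit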