This project aims to expand our scientific knowledge of the human mechanisms of voice and speech production, so as to facilitate future technological development and medical applications. The principal means of the project is magnetic resonance imaging (MRI), used to visualize the anatomical structures and the physical and physiological mechanisms involved in voice and speech production. High-quality static and motion images will be obtained for this purpose by overcoming the technical difficulties of MRI known to date. New analysis techniques will be established to visualize smaller structures, explore unknown mechanisms, and elucidate acoustic phenomena. The structures and functions of speech articulation are recorded with both static and motion imaging techniques. The static images are collected and analyzed so as to distinguish muscles and cartilages from the surrounding structures. The motion images are acquired for the entire set of articulatory organs and analyzed with tissue-contact detection and marker tracking, which will describe the spatiotemporal patterns of speech articulation numerically. Real-time MRI during sentence production is used for cross-linguistic studies in this project, which will contribute to studies of language learning and to clinical examination. Exploring the individualized mechanisms of voice and speech production is another topic of this project. Three-dimensional shapes of the vocal tract are visualized with MRI and analyzed to uncover hitherto unexplained features of speech signals, such as the causal process by which individual vocal characteristics arise, or the articulatory normalization of vowels across male and female speakers. The new knowledge obtained from these MRI-based analyses is applied to refine a physiological articulatory model: anatomical findings are used to revise the muscle geometry of the model, and kinematic data are used to evaluate its performance in speech articulation and synthesis. The project team is formed by the principal investigator, who has 20 years of experience in this research field, and his colleagues at Tianjin University and the Chinese Academy of Social Sciences. His international colleagues in China, Japan, the USA, and France will also support the project as volunteers or consultants. The MRI systems used for this project are modern, powerful systems with a 3-Tesla static magnetic field. The researchers' experience together with these advanced research systems will promote rapid advancement of the related fields in China and further stimulate speech science worldwide.
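To illustrate how an MRI-derived vocal-tract shape can be related to acoustic features such as formants, the sketch below computes the resonances of a hypothetical area function with a lossless concatenated-tube (chain-matrix) model. This is a minimal Python illustration, not the project's actual analysis pipeline; the area values, section lengths, and physical constants are assumed for the example.

```python
# A minimal sketch (not the project's analysis pipeline): estimating vocal-tract
# resonances from an MRI-derived area function with a lossless concatenated-tube
# (chain-matrix) model. The area function and physical constants are assumed.
import numpy as np

RHO = 1.2    # air density [kg/m^3]
C   = 350.0  # speed of sound in warm, humid air [m/s]

def tube_chain_matrix(area, length, omega):
    """Chain matrix of one lossless cylindrical section (pressure / volume velocity)."""
    k = omega / C
    z0 = RHO * C / area                       # characteristic acoustic impedance
    return np.array([[np.cos(k * length), 1j * z0 * np.sin(k * length)],
                     [1j * np.sin(k * length) / z0, np.cos(k * length)]])

def vocal_tract_resonances(areas, lengths, freqs):
    """Formants are the zeros of the chain-matrix D element (transfer function
    H = 1/D, assuming zero radiation impedance so that lip pressure is zero)."""
    d = np.empty(len(freqs))
    for i, f in enumerate(freqs):
        m = np.eye(2, dtype=complex)
        for a, l in zip(areas, lengths):      # multiply sections from glottis to lips
            m = m @ tube_chain_matrix(a, l, 2 * np.pi * f)
        d[i] = m[1, 1].real                   # D is purely real for lossless sections
    crossings = np.where(np.diff(np.sign(d)) != 0)[0]
    return freqs[crossings]

# Hypothetical 17 cm vocal tract sampled as ten 1.7 cm sections (areas in m^2).
areas   = np.array([2.0, 1.5, 1.0, 0.8, 1.2, 2.5, 4.0, 5.0, 4.5, 3.0]) * 1e-4
lengths = np.full(10, 0.017)
freqs   = np.arange(50.0, 5000.0, 10.0)
print("Estimated formants (Hz):", vocal_tract_resonances(areas, lengths, freqs)[:4])
```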
This project aims to reveal the mechanisms of human speech production, and in particular the physiological basis of the individualized, speaker-specific characteristics expressed in articulatory movement. Using MRI as the observation method, the advantages of cine-MRI and tagged-MRI are combined to observe the complete set of articulatory organs and to carry out marker-tracking studies. MRI-based observation of the geometry of the nasal cavity, the hypopharyngeal cavity, and the other articulatory organs, together with acoustic simulation, is used to reveal the physiological mechanisms behind individualized articulation characteristics. Morphological analysis of the hypopharyngeal cavity of female speakers and the corresponding acoustic simulations will clarify the influence of the female hypopharyngeal cavity on the speech spectrum. Solid mechanical articulation models will be constructed to study, quantitatively, the correspondence between the articulatory organs and the acoustic characteristics of isolated vowels, and a digital acoustic model will then be used to simulate dynamic articulation processes. The team's existing physiological articulatory model will be used to simulate individualized articulatory movements. By combining the MRI observations with acoustic computational models, the project will elucidate the mechanisms of individualized speech production in depth, from the physiological level through the articulatory-movement level up to the acoustic features.
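As a rough illustration of the kind of marker tracking mentioned above, the following Python sketch locates a tag point in the next frame by exhaustive block matching and converts the displacement into a movement speed. It is a toy example under assumed pixel spacing and frame interval, not the cine-/tagged-MRI processing actually used in the project.

```python
# A toy sketch, not the project's cine-/tagged-MRI pipeline: tracking one tag marker
# between two consecutive frames by exhaustive block matching (sum of squared
# differences). Frame data, pixel spacing, and frame interval are hypothetical.
import numpy as np

def track_marker(prev_frame, next_frame, pos, patch=5, search=8):
    """Return the position in next_frame whose patch best matches the patch at `pos`."""
    y, x = pos
    template = prev_frame[y - patch:y + patch + 1, x - patch:x + patch + 1]
    best, best_pos = np.inf, pos
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            yy, xx = y + dy, x + dx
            if yy - patch < 0 or xx - patch < 0:
                continue                              # candidate window off the image
            cand = next_frame[yy - patch:yy + patch + 1, xx - patch:xx + patch + 1]
            if cand.shape != template.shape:
                continue
            ssd = np.sum((cand - template) ** 2)      # dissimilarity of the two patches
            if ssd < best:
                best, best_pos = ssd, (yy, xx)
    return best_pos

# Synthetic check: the second "frame" is the first one shifted by (2, -3) pixels,
# so the tracker should report exactly that displacement.
rng = np.random.default_rng(0)
frame0 = rng.normal(size=(64, 64))                    # stand-in for a real MRI frame
frame1 = np.roll(frame0, shift=(2, -3), axis=(0, 1))
pos = track_marker(frame0, frame1, (32, 32))
pixel_mm, frame_dt = 1.5, 0.05                        # assumed resolution and frame interval
speed = np.hypot(pos[0] - 32, pos[1] - 32) * pixel_mm / frame_dt
print("Tracked position:", pos, "speed (mm/s):", round(speed, 1))
```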
The structure of a speaker's vocal tract shapes the acoustic characteristics of his or her speech, and this project studies the physiological mechanisms underlying the production of individualized, speaker-specific speech. Using magnetic resonance imaging (MRI) together with fast-scanning techniques, we built a high-resolution three-dimensional database of Mandarin Chinese articulation covering the oral, laryngeal, pharyngeal, and nasal cavities, containing both static and partially dynamic articulator data and making the speech production process visible. For the static speaker characteristics, working mainly with the female vowel data in the database, we constructed solid vocal-tract models and carried out acoustic experiments, and we further built an acoustic computational model with the finite-difference time-domain (FDTD) method and compared the results with earlier studies of male speakers. The results reveal the contribution of the hypopharyngeal structures to individualized speech characteristics during speech production and show clear spectral differences between male and female speakers. For the dynamic speaker characteristics, we used the static data in the MRI database to define the relative tongue size (RTS), a physiological structural measure that is not under the speaker's control, to characterize speaker individuality, and we used dynamic MRI images to measure the tongue movement speed in vowel-to-vowel transitions. The results show that relative tongue size changes the rate of formant transitions by affecting the speaker's tongue movement speed. The results of this project extend the ways in which speaker individuality can be represented and further refine the basic theory of speech production.
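As a simplified stand-in for the FDTD acoustic modeling mentioned above, the sketch below runs a one-dimensional FDTD simulation of a uniform 17 cm closed-open tube and reads off its lowest resonance. The tube geometry, excitation, and grid parameters are assumed for illustration and do not reproduce the project's three-dimensional vocal-tract simulations.

```python
# A minimal sketch, assuming a uniform one-dimensional tube: an FDTD simulation of
# acoustic wave propagation used as a toy stand-in for the 3-D FDTD vocal-tract
# model described above. Tube geometry, excitation, and grid sizes are hypothetical.
import numpy as np

C, RHO = 350.0, 1.2                 # sound speed [m/s], air density [kg/m^3]
L, N = 0.17, 170                    # tube length [m], number of pressure cells
dx = L / N
dt = 0.5 * dx / C                   # time step satisfying the CFL condition
steps = 40000

p = np.zeros(N)                     # pressure at cell centres
u = np.zeros(N + 1)                 # particle velocity on the staggered grid
record = np.zeros(steps)

t = np.arange(steps) * dt
source = np.exp(-((t - 0.5e-3) / 1e-4) ** 2)   # short Gaussian pulse at the "glottis"

for n in range(steps):
    # velocity update: driven closed end at i = 0, open end modelled as zero outside pressure
    u[1:N] -= (dt / (RHO * dx)) * (p[1:] - p[:-1])
    u[0] = source[n]
    u[N] -= (dt / (RHO * dx)) * (0.0 - p[-1])
    # pressure update from the divergence of the velocity field
    p -= (RHO * C ** 2 * dt / dx) * (u[1:] - u[:-1])
    record[n] = u[-1]               # observe the volume-velocity antinode at the open (lip) end

spec = np.abs(np.fft.rfft(record * np.hanning(steps)))
freqs = np.fft.rfftfreq(steps, dt)
band = (freqs > 200) & (freqs < 1000)
f1 = freqs[band][np.argmax(spec[band])]
print(f"Lowest resonance: {f1:.0f} Hz (quarter-wave theory C/4L: {C / (4 * L):.0f} Hz)")
```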