A question answering (QA) system is one in which a machine understands natural-language questions and answers them. A robust QA system should answer with high accuracy across different data distributions and different domains. Building such a system helps move QA from laboratory prototype research to a practical information portal that serves emerging industries, which gives the work considerable application value and research significance. Existing work, however, focuses on the accuracy of QA systems under a single data distribution. To improve robustness, this project addresses two problems: model robustness and knowledge ubiquity. For model robustness, we consider model adaptation under both supervised and unsupervised settings in the target domain. For the supervised setting, the project proposes fine-grained transfer learning over question word sequences based on semantic collocation. For the unsupervised setting, it proposes robust representation learning that discards superficial information. For knowledge ubiquity, it proposes a data-driven method for domain extraction. The goal of the project is to improve system robustness from both the model and the data perspectives. Its key scientific significance lies in addressing (1) a fine-grained transfer learning mechanism for cross-domain QA; (2) a mechanism for filtering domain-related information without domain guidance; and (3) a distant supervision mechanism for determining domain knowledge boundaries.
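A question answering system is a system in which a machine understands natural-language questions and responds to them. A robust question answering system should be able to answer with high accuracy across different data distributions and different domains. Developing highly robust question answering systems helps turn them from laboratory prototypes into practical information portals that empower emerging industries, and so has major application value and research significance. Existing work, however, concentrates on the accuracy of question answering systems under a single data distribution. To improve robustness, this project sets out to address two problems separately: the robustness of the question answering model and the ubiquity of knowledge. For model robustness, it proposes model adaptation for the cases where the target domain is supervised and unsupervised, using, respectively, fine-grained transfer learning over question word sequences based on semantic collocation, and a method for deeper question understanding based on discarding superficial textual information. For knowledge ubiquity, it proposes a data-driven method for automated domain extraction. The goal of the project is to improve system robustness from both the model and the data perspectives. Its key scientific significance lies in addressing (1) a fine-grained semantic transfer mechanism for cross-domain question answering; (2) a mechanism for filtering domain-related information without domain guidance; and (3) a distant supervision mechanism for determining domain knowledge boundaries.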
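This project aims to build a highly usable knowledge-based question answering system, with a focus on improving the model's interpretability, cross-domain transferability, and generalization. The system can give users accurate, reliable answers and explain the reasoning behind them, helping users obtain the information they need and understand the logic and knowledge basis of each answer. Working from three angles (question text representation, knowledge representation, and the fusion of text and knowledge), the project proposes new methods and techniques for the key scientific problems in building a controllable, transferable knowledge question answering system with stronger generalization, and validates and evaluates them on multiple real-world tasks. Six papers related to the project were written and published at top artificial intelligence and natural language processing conferences, including NeurIPS, ACL, and IJCAI. The project supervised 3 doctoral students, 5 master's students, and 2 undergraduates, of whom 1 master's student and 2 undergraduates have graduated. The results improve the performance and usability of question answering systems and broaden their applications and scenarios, and have both theoretical significance and practical value.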
Data last updated: 2023-05-31