Emerging storage devices, with their high performance and sophisticated internal structure, inevitably challenge the design principles and architecture of the conventional storage stack, which was built around rotating disks. That conventional stack severely limits the performance that key-value stores, a representative data-intensive application, can achieve on new hardware. By analyzing the characteristics of SSDs and the design principles they call for, we found that mainstream key-value stores based on multi-level tree structures incur significant write amplification, which raises the write cost on SSDs while barely exploiting their high IOPS. This project proposes key-value direct storage (KVDS for short) and designs a novel architecture and the corresponding mechanisms for it. KVDS tightly integrates the data processing of key-value stores with the characteristics of SSDs. The main contributions are: 1) a key-value store based on a multi-level forest structure suited to SSDs, which dramatically reduces write amplification and compensates for the read degradation inherent in that structure by issuing parallel reads that exploit the high IOPS of SSDs; 2) a direct storage architecture for key-value stores, consisting of application, direct storage, and media management layers, together with the corresponding interfaces, data layout, and processing flow; 3) key implementation techniques for the new architecture, including physical storage mapping, scheduling of channels inside and outside the SSD, compaction optimization, and near-data processing. KVDS is designed to maximize the overall performance of key-value stores across a wide range of workload patterns. Its architecture and techniques extend naturally to other big-data applications, and the project thereby enriches the theory and key technologies of computer systems.
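New storage devices offer high performance and have complex internal structures, which inevitably undermines the design principles and architecture of the traditional disk-based storage stack. The existing storage stack as a whole constrains the performance that data-intensive applications, typified by key-value stores, can achieve on new hardware. Analysis shows that mainstream key-value stores based on multi-level tree structures suffer from inherent and significant write amplification, which both raises the write cost on solid-state drives (SSDs) and fails to exploit their performance advantages. This project proposes a key-value direct storage architecture and mechanisms that deeply integrate the key-value storage structure, the system's storage process, and the characteristics of new storage media. Its innovations are: 1) a multi-level forest structure for key-value pairs adapted to SSDs, which reduces write amplification, together with a parallel read mechanism that offsets the read performance degradation the structure may cause; 2) a direct storage architecture for key-value stores, built from three layers (application, direct storage, and media management), with the corresponding interfaces, data layout, and processing flow; 3) key implementation techniques for the new architecture, including physical storage mapping, scheduling of channels inside and outside the media, compaction optimization, and near-data processing. Direct storage can comprehensively improve key-value store performance under a variety of workload patterns and can be extended to storage for other big-data applications, thereby enriching computer system theory and technology.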
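New storage devices offer high performance and have complex internal structures, which inevitably undermines the design principles and architecture of the traditional disk-based storage stack. The existing storage stack as a whole constrains the performance that data-intensive applications, typified by key-value stores, can achieve on new hardware. Analysis shows that mainstream key-value stores suffer from inherent and significant write amplification, which both raises the write cost on solid-state drives (SSDs) and fails to exploit their performance advantages. This project proposed a key-value direct storage architecture and mechanisms that deeply integrate the key-value storage structure, the system's storage process, and the characteristics of new storage media. Its innovations are: 1) a key-value direct storage architecture that lets key-value applications manage block storage space directly, avoiding the I/O cost introduced by the file system; 2) a multi-level forest structure for key-value pairs adapted to SSDs, which reduces write amplification, together with a parallel read mechanism that offsets the read performance degradation the structure may cause; 3) a direct storage architecture for key-value stores, with the corresponding interfaces, data layout, and processing flow; 4) an analysis of the several kinds of heterogeneity found in existing storage systems and inside SSDs, and I/O scheduling mechanisms designed to match. The project published 17 papers (13 in CCF-A/B venues) and filed 3 invention patent applications. The research on key-value direct storage has advanced the study and application of new key-value stores, and its technical ideas can be extended to other big-data applications. The project's results enrich the design theory and implementation techniques of new storage systems.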
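The forest structure and parallel read mechanism named above can be illustrated with a small in-memory sketch. This is not the project's implementation. The names `SortedRun` and `ForestLevel` are hypothetical, and a thread pool merely stands in for the parallel I/O an SSD can serve. The sketch assumes that a level may hold several overlapping sorted runs, so a flush appends a run instead of rewriting existing data (less write amplification), and that a lookup probes all runs concurrently and keeps the newest hit (the read cost that parallelism is meant to hide).

```python
from bisect import bisect_left
from concurrent.futures import ThreadPoolExecutor


class SortedRun:
    """Immutable sorted run of key-value pairs; stands in for one tree in the forest."""

    def __init__(self, seq, items):
        self.seq = seq  # larger sequence number means newer data
        pairs = sorted(items.items())
        self.keys = [k for k, _ in pairs]
        self.values = [v for _, v in pairs]

    def get(self, key):
        """Binary-search this run; return (seq, value) on a hit, else None."""
        i = bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return self.seq, self.values[i]
        return None


class ForestLevel:
    """One level that keeps several overlapping runs instead of a single merged one."""

    def __init__(self):
        self.runs = []
        self.next_seq = 0

    def add_run(self, items):
        # Appending a run rewrites nothing that is already stored in this level.
        self.runs.append(SortedRun(self.next_seq, items))
        self.next_seq += 1

    def get(self, key, pool):
        # Probe every run concurrently, then keep the hit with the highest seq.
        hits = [h for h in pool.map(lambda run: run.get(key), self.runs) if h is not None]
        if not hits:
            return None
        return max(hits, key=lambda hit: hit[0])[1]


if __name__ == "__main__":
    level = ForestLevel()
    level.add_run({"a": 1, "b": 2})
    level.add_run({"b": 20, "c": 3})  # overlaps the first run on key "b"
    with ThreadPoolExecutor(max_workers=4) as pool:
        print(level.get("b", pool))
```

The sketch leaves out deletions (tombstones), compaction between levels, and any real device I/O. It only shows why a forest trades extra read probes for fewer rewrites, and why issuing those probes in parallel matters.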
Data last updated: 2023-05-31
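Research on Architecture Design and Performance Optimization of Key-Value Storage Systems
Research on Replay Techniques for Message-Passing Programs Based on Distributed Key-Value Network Storage
Research on Query Optimization Techniques for Key-Value Storage Systems in Cloud Computing Environments
Construction and Performance Optimization of Erasure-Coded Heterogeneous Distributed In-Memory Key-Value Storage Systems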