Stream data compression is considered very promising for data transmission, data de-noising, and query approximation in the database domain. However, conventional approaches are either not efficient enough to support online stream processing or unable to guarantee the approximation error on individual elements. These drawbacks severely impede further application to real-time stream processing and quality-guaranteed querying.
To improve efficiency and effectiveness across the whole stream-processing life cycle (from transmission to query), this project will construct efficient quality-guaranteed compression algorithms and querying algorithms that work directly on compressed data.
Initially proposed by the investigators and published in a prestigious international journal, the shift technique is compact, scalable, and well suited to online stream data processing. Traditional wavelet-based approaches to synopsis construction are mostly built on a dynamic-programming framework and rely heavily on examining summations of wavelet coefficients; the shift technique is instead based on shift transformations of proven theoretical and practical significance.
Extensive experiments indicate that this technique is highly practical and surpasses related ones in both efficiency and effectiveness.
In this project, we shall use the shift technique to pursue new research outcomes. The specific research tasks comprise three interlocking parts: (i) extending the existing maximum-error-bound compression algorithms and developing new techniques that significantly speed up processing of large stream data; (ii) constructing maximum-error-bound query algorithms on the compressed data of (i) and analyzing the effects on error bounds; and (iii) studying privacy-preserving/encrypted compression algorithms.
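Data stream compression has been applied to data transmission, indexing, noise handling, and (approximate) access to and querying of massive databases. The currently popular fixed-size compression, mean-squared-error compression, and maximum-error compression algorithms are either unsuited to compressing data streams or suffer from efficiency problems, and fall far short of practical needs. Targeting the efficiency and quality problems in real-time compression and query processing of massive data streams, this project further studies quality-guaranteed compression algorithms and algorithms that answer queries without decompression. The main tasks are:
1. Further extend the existing algorithms and construct new compression algorithms suited to a wider range of applications;
2. Construct integrated compression-query algorithms, so that compressed data can be used directly, without decompression, to answer user queries with guaranteed quality;
3. Extend the above results to multiple and multi-dimensional data, including images and video; develop and apply new compression mechanisms according to the characteristics of the data; and examine the adaptability of the algorithms in a variety of application environments, so as to meet practical application needs.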
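To make the idea of answering a query on compressed data with a guaranteed error concrete, here is a minimal sketch. It is not the project's shift technique: it assumes a simple piecewise-constant summary in which every element lies within `eps` of its segment value, and the names `compress_constant` and `range_sum` are hypothetical. Because each element's error is at most `eps`, a range sum over `b - a` elements computed from the segments alone is off by at most `(b - a) * eps`.

```python
from typing import Iterable, List, Tuple

Segment = Tuple[int, int, float]  # (start, end_exclusive, representative value)


def compress_constant(stream: Iterable[float], eps: float) -> List[Segment]:
    """Greedy one-pass piecewise-constant summary with max error <= eps.

    A segment grows while (max - min) <= 2 * eps, so the midrange value
    is within eps of every element it covers.
    """
    segments: List[Segment] = []
    start, lo, hi, n = 0, None, None, 0
    for i, x in enumerate(stream):
        if lo is None:
            start, lo, hi = i, x, x
        else:
            new_lo, new_hi = min(lo, x), max(hi, x)
            if new_hi - new_lo > 2 * eps:
                # x does not fit: close the current segment, start a new one
                segments.append((start, i, (lo + hi) / 2))
                start, lo, hi = i, x, x
            else:
                lo, hi = new_lo, new_hi
        n = i + 1
    if lo is not None:
        segments.append((start, n, (lo + hi) / 2))
    return segments


def range_sum(segments: List[Segment], a: int, b: int, eps: float) -> Tuple[float, float]:
    """Approximate sum over indices [a, b) using only the segments.

    Returns (estimate, error_bound) with error_bound = (b - a) * eps.
    """
    total = 0.0
    for s, e, v in segments:
        overlap = min(e, b) - max(s, a)
        if overlap > 0:
            total += overlap * v
    return total, (b - a) * eps


data = [1.0, 1.2, 0.9, 5.0, 5.3, 5.1, 2.0]
segs = compress_constant(data, eps=0.25)
estimate, bound = range_sum(segs, 1, 6, eps=0.25)
```

The linear scan in `range_sum` keeps the sketch short; with sorted segment boundaries, a binary search plus prefix sums over segments would answer the same query faster without changing the bound.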
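Traditional fixed-size compression, mean-squared-error compression, and maximum-error compression algorithms are either unsuited to compressing data streams or inefficient. To address this, we built several new compression algorithms that are based on the maximum error and have linear time complexity. Specifically, building on piecewise linear fitting, we developed semi-continuous segmentation and mixed segmentation compression methods that produce optimal storage; extending the idea of one-dimensional transformation compression, we developed a new method for image compression and designed a parallelization strategy. Theoretical and experimental results show that these algorithms offer higher execution efficiency and practicality than the current state-of-the-art algorithms of the same kind.
For analysis and querying performed directly on compressed synopses, we verified that, across many problems, the results obtained from maximum-error compressed synopses retain guaranteed quality. Specifically, analysis accuracy was ensured for EEG feature extraction and pattern recognition, database queries, toothed-whale echolocation, image scaling, image enhancement, and image retrieval.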
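The following sketch illustrates maximum-error piecewise linear fitting in a single linear-time pass. It is a plain greedy scheme, assumed here for illustration only: it is not the semi-continuous or mixed segmentation method described above and makes no optimality claim. Each segment is anchored at its first value, and the set of slopes that keep every later point within `eps` is narrowed point by point; the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

# (start_index, end_index_inclusive, start_value, slope)
LinearSegment = Tuple[int, int, float, float]


def _pick_slope(slo: float, shi: float) -> float:
    # A single-point segment leaves the slope unconstrained; use 0.0 there.
    if slo == float("-inf") or shi == float("inf"):
        return 0.0
    return (slo + shi) / 2


def compress_linear(values: Sequence[float], eps: float) -> List[LinearSegment]:
    """Greedy piecewise linear summary; every point is within eps of its line."""
    segments: List[LinearSegment] = []
    if not values:
        return segments
    start, y0 = 0, values[0]
    slo, shi = float("-inf"), float("inf")  # feasible slope interval
    for i in range(1, len(values)):
        dt = i - start
        lo = (values[i] - eps - y0) / dt
        hi = (values[i] + eps - y0) / dt
        new_lo, new_hi = max(slo, lo), min(shi, hi)
        if new_lo > new_hi:
            # No slope fits the new point: close the segment at i - 1.
            segments.append((start, i - 1, y0, _pick_slope(slo, shi)))
            start, y0 = i, values[i]
            slo, shi = float("-inf"), float("inf")
        else:
            slo, shi = new_lo, new_hi
    segments.append((start, len(values) - 1, y0, _pick_slope(slo, shi)))
    return segments


def reconstruct(segments: List[LinearSegment]) -> List[float]:
    """Rebuild the approximate series from the segments."""
    out: List[float] = []
    for start, end, y0, slope in segments:
        out.extend(y0 + slope * (i - start) for i in range(start, end + 1))
    return out


values = [0.0, 1.0, 2.1, 2.9, 4.0, 4.0, 4.1, 3.9, 4.0]
segments = compress_linear(values, eps=0.2)
approx = reconstruct(segments)
max_error = max(abs(a - v) for a, v in zip(approx, values))
```

Each point costs constant work (two divisions and an interval intersection), which is what gives the pass its linear running time. Anchoring at the first value is a simplification; letting the starting value float within `eps` generally yields fewer segments at the cost of more bookkeeping.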
Data last updated: 2023-05-31
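On the Impact of the Big Data Environment on the Development of Information Science
High-Resolution 3D Imaging for Wideband MIMO Radar Based on Kronecker Compressed Sensing
An Empirical Analysis of the Impact of Industrial Restructuring on Water Use Efficiency in Resource-Based Regions: Evidence from 10 Resource-Based Provinces in China
Classified Prediction of Bus Passenger Flow with a Multi-Source Data-Driven CNN-GRU Model
Research Progress on Efficient, High-Precision Separation Methods for Blended-Acquisition Seismic Data
Research on State Recognition Algorithms for Sensor Network Data Streams in the Compressed Domain
Research on Distributed Parallel Skyline Query Processing Models and Algorithms for Massive Uncertain Data Streams
Research on the Stress and Biodegradation Compression Characteristics and Compression Mechanisms of Municipal Solid Waste
Research on Symmetric Encryption Algorithms Supporting Range Queries and Multiple Query Types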