Acta Scientiarum Naturalium Universitatis Pekinensis

XTrim: An XML Compressor Based on XML Schema and Tiny Data Block Optimization

QIU Ruiheng, TANG Zhi, HU Wei, GAO Liangcai

  1. Institute of Computer Science and Technology, Peking University, Beijing 100871
  • Received: 2009-08-21 Online: 2010-09-20 Published: 2010-09-20

Abstract: The authors propose XTrim, an XML compression method based on XML Schema and tiny data block optimization. XTrim first optimizes the XML Schema information and then applies a structure-minimization method, which uses the optimized schema information to compress the structure of an XML document. It also refines the grouped storage strategy for data to improve the compression ratio. In addition, XTrim optimizes the storage of tiny data blocks, which further improves compression. Experimental results show that XTrim achieves better compression than several existing methods.

Key words: XML, XML Schema, tiny data block optimization, compression
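
The abstract mentions grouped storage of data and special handling of tiny data blocks without giving details. The Python sketch below is not XTrim's algorithm; it only illustrates, under stated assumptions, why tiny blocks are worth optimizing. It groups text values by element path into containers and compares compressing every container on its own against merging the small ones first. The function names, the 64-byte threshold, the separator bytes, and the use of zlib are all hypothetical choices made for this illustration, and the sketch does not use XML Schema at all.

```python
import xml.etree.ElementTree as ET
import zlib
from collections import defaultdict

# Hypothetical cutoff (in bytes) below which a container counts as "tiny".
TINY_THRESHOLD = 64


def group_values_by_path(xml_text):
    """Collect element text values into containers keyed by element path."""
    containers = defaultdict(list)

    def walk(elem, path):
        here = path + "/" + elem.tag
        if elem.text and elem.text.strip():
            containers[here].append(elem.text.strip())
        for child in elem:
            walk(child, here)

    walk(ET.fromstring(xml_text), "")
    # Join the values of each container with a NUL separator (an assumption).
    return {p: "\x00".join(v).encode("utf-8") for p, v in containers.items()}


def compress_separately(blocks):
    """Total size when every container becomes its own zlib stream."""
    return sum(len(zlib.compress(b)) for b in blocks.values())


def compress_with_tiny_merge(blocks, threshold=TINY_THRESHOLD):
    """Total size when containers below the threshold share one zlib stream."""
    large = [b for b in blocks.values() if len(b) >= threshold]
    tiny = [b for b in blocks.values() if len(b) < threshold]
    total = sum(len(zlib.compress(b)) for b in large)
    if tiny:
        # One shared stream pays the per-stream overhead only once.
        total += len(zlib.compress(b"\x01".join(tiny)))
    return total
```

Each compressed stream carries a fixed header and checksum, so a container holding only a few bytes can cost more in overhead than in payload. Merging such containers pays that overhead once, which is the general effect a tiny-data-block optimization aims for; the way XTrim itself stores these blocks may differ.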
