Compressed XML Database and Query Evaluation over XML Databases
2012; Volume: 1; Issue: 4; Language: English
ISSN: 2277-9043
Authors: Vijay S. Gulhane, Md. Sujan Ali
Topic(s): Advanced Database Systems and Queries
Abstract: Extensible Markup Language (XML) [XML 1.0 (Second Edition), W3C Recommendation, October 2000] was proposed as a standardized data format for specifying and exchanging data on the Web. With the proliferation of mobile devices, such as palmtop computers, as a means of communication in recent years, it is reasonable to expect that a massive amount of XML data will be generated and exchanged between applications in order to perform dynamic computations over the Web. However, XML is verbose by nature, since terseness in XML markup was not considered a pressing issue from the design perspective [Smith S. Nair et al.]. In practice, XML documents are usually large because they often contain much redundant data. This size problem hinders the adoption of XML, since it substantially increases the costs of data processing, data storage, and data exchange over the Web. Because common generic text compressors, such as Gzip, Bzip2, WinZip, PKZIP, or MPEG-7 (BiM), cannot produce compressed XML data that is directly usable, many XML-specific compression technologies have recently been proposed. The essential idea of these technologies is to exploit the structure information exposed in the input XML document during compression in order to pursue two goals at the same time. First, they aim to achieve a good compression ratio and compression time compared with the generic text compressors mentioned above. Second, they aim to generate a compressed XML document that supports efficient evaluation of queries over the data.

The aim of this paper is to introduce a system that compresses an XML document and retrieves the required information from the compressed version, decompressing only as much as each query requires. The system first compresses the XML document using the proposed algorithm. The compressed file is then divided into different relational databases, so there is no need to decompress the complete file to retrieve the results of a query. Only the required information is decompressed and returned to the user. The average compression ratio of the designed compressor is considered competitive with that of other queryable XML compressors. In several experiments, the query processor was able to answer different kinds of queries that require retrieving information from several compressed XML documents.
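
The idea of decompressing only what a query needs can be illustrated with a minimal sketch. It is not the authors' algorithm, which the abstract does not specify. It assumes, purely for illustration, that text values are grouped by element path, that each group is compressed on its own with zlib, and that the groups are stored in a single SQLite table. The names `build_store`, `query_path`, and `containers`, as well as the sample document, are hypothetical.

```python
import sqlite3
import zlib
import xml.etree.ElementTree as ET
from collections import defaultdict

SEPARATOR = "\x00"  # NUL cannot occur in XML text, so it is a safe delimiter


def build_store(xml_text: str) -> sqlite3.Connection:
    """Group text values by element path and compress each group separately."""
    root = ET.fromstring(xml_text)
    groups = defaultdict(list)

    def walk(node, parent_path):
        path = f"{parent_path}/{node.tag}"
        text = (node.text or "").strip()
        if text:
            groups[path].append(text)
        for child in node:
            walk(child, path)

    walk(root, "")

    # Illustrative layout: one row per element path, holding a compressed blob.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE containers (path TEXT PRIMARY KEY, payload BLOB)")
    for path, values in groups.items():
        payload = zlib.compress(SEPARATOR.join(values).encode("utf-8"))
        conn.execute("INSERT INTO containers VALUES (?, ?)", (path, payload))
    conn.commit()
    return conn


def query_path(conn: sqlite3.Connection, path: str) -> list:
    """Decompress only the container that matches the requested path."""
    row = conn.execute(
        "SELECT payload FROM containers WHERE path = ?", (path,)
    ).fetchone()
    if row is None:
        return []
    return zlib.decompress(row[0]).decode("utf-8").split(SEPARATOR)


SAMPLE = (
    "<library>"
    "<book><title>XML Basics</title><year>2001</year></book>"
    "<book><title>Query Engines</title><year>2005</year></book>"
    "</library>"
)

store = build_store(SAMPLE)
titles = query_path(store, "/library/book/title")
print(titles)
```

In this sketch a query for `/library/book/title` touches only the title container, and the year values stay compressed. A real queryable compressor would also need to preserve document structure and the association between sibling values, which this example omits.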