Fast index for XML data

Ping Yi, Yafei Hou, Yue Wu, Jianhua Li

Research output: Contribution to journal (Article)

Abstract

With the advent of XML as a standard for data representation and exchange on the Internet, storing and querying XML data is becoming increasingly important. This poses a new challenge for indexing and searching XML data, because conventional approaches based on the relational model may not meet the processing requirements of XML data. In this paper, a path index based on Patricia tries, named the PT index, is proposed. The PT index structure offers several novel features. First, its Patricia-trie-based structure supports fast search. Second, the path indexes are compressed so that the PT index can be stored in memory. Third, because the PT index includes both the structure and the text of the XML data, results can be obtained from the PT index without reading the original XML data from disk. The time and space complexity of the PT index are further analyzed. Experimental results from a prototype implementation show that the PT index can outperform some representative index approaches, including DataGuide, the B+tree index, and Index Fabric.
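
As a rough illustration of the general idea only, the following Python sketch indexes root-to-leaf label paths of a small XML document in a compressed (Patricia-style) trie and stores the text values in the trie itself, so a path lookup does not need to re-read the document. It is not the authors' PT index: the abstract does not specify the node layout, path encoding, or compression scheme, so the names `Node`, `insert`, `lookup`, and `build_index`, and the choice of `/`-joined tag names as keys, are all assumptions made for this example.

```python
from dataclasses import dataclass, field
import xml.etree.ElementTree as ET


@dataclass
class Node:
    # Each edge label is a whole substring rather than one character;
    # this is the path compression that characterizes a Patricia-style trie.
    children: dict = field(default_factory=dict)  # edge label -> Node
    values: list = field(default_factory=list)    # text stored at this path


def common_prefix_len(a: str, b: str) -> int:
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return n


def insert(root: Node, key: str, value: str) -> None:
    node = root
    while True:
        for label, child in list(node.children.items()):
            k = common_prefix_len(label, key)
            if k == 0:
                continue
            if k < len(label):
                # Key diverges inside this edge: split the edge at position k.
                mid = Node(children={label[k:]: child})
                del node.children[label]
                node.children[label[:k]] = mid
                child = mid
            node, key = child, key[k:]
            break
        else:
            # No edge shares a prefix with the rest of the key: add a leaf.
            if key:
                leaf = Node()
                node.children[key] = leaf
                node = leaf
            node.values.append(value)
            return
        if not key:
            node.values.append(value)
            return


def lookup(root: Node, key: str) -> list:
    node = root
    while key:
        for label, child in node.children.items():
            if key.startswith(label):
                node, key = child, key[len(label):]
                break
        else:
            return []
    return node.values


def build_index(xml_text: str) -> Node:
    # Hypothetical key scheme: "/"-joined tag names from the root element.
    root = Node()

    def walk(elem, path: str) -> None:
        path = f"{path}/{elem.tag}"
        text = (elem.text or "").strip()
        if text:
            insert(root, path, text)
        for child in elem:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return root


doc = ("<lib><book><title>A</title><author>X</author></book>"
       "<book><title>B</title></book></lib>")
index = build_index(doc)
print(lookup(index, "/lib/book/title"))  # prints ['A', 'B']
```

In this sketch the shared prefix `/lib/book/` is stored once as a single edge, which hints at why compressing common path prefixes can keep such an index small; the space savings and search costs of the actual PT index are those analyzed in the paper, not anything this toy measures.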

Original language: English
Pages (from-to): 97-107
Number of pages: 11
Journal: Journal of Computational Information Systems
Volume: 3
Issue number: 1
Publication status: Published - Feb 2007
Externally published: Yes

Fingerprint

XML
Internet
Data storage equipment
Processing

Keywords

  • Index
  • Patricia-tries
  • XML

ASJC Scopus subject areas

  • Information Systems
  • Computer Science Applications

Cite this

Yi, P., Hou, Y., Wu, Y., & Li, J. (2007). Fast index for XML data. Journal of Computational Information Systems, 3(1), 97-107.

@article{ce10a283faf945a881dc962a7b4a7fc5,
title = "Fast index for XML data",
abstract = "With the advent of XML as a standard for data representation and exchange on the Internet, storing and querying XML data is becoming increasingly important. This poses a new challenge for indexing and searching XML data, because conventional approaches based on the relational model may not meet the processing requirements of XML data. In this paper, a path index based on Patricia tries, named the PT index, is proposed. The PT index structure offers several novel features. First, its Patricia-trie-based structure supports fast search. Second, the path indexes are compressed so that the PT index can be stored in memory. Third, because the PT index includes both the structure and the text of the XML data, results can be obtained from the PT index without reading the original XML data from disk. The time and space complexity of the PT index are further analyzed. Experimental results from a prototype implementation show that the PT index can outperform some representative index approaches, including DataGuide, the B+tree index, and Index Fabric.",
keywords = "Index, Patricia-tries, XML",
author = "Ping Yi and Yafei Hou and Yue Wu and Jianhua Li",
year = "2007",
month = "2",
language = "English",
volume = "3",
pages = "97--107",
journal = "Journal of Computational Information Systems",
issn = "1553-9105",
publisher = "Binary Information Press",
number = "1",

}

TY - JOUR

T1 - Fast index for XML data

AU - Yi, Ping

AU - Hou, Yafei

AU - Wu, Yue

AU - Li, Jianhua

PY - 2007/2

Y1 - 2007/2

AB - With the advent of XML as a standard for data representation and exchange on the Internet, storing and querying XML data is becoming increasingly important. This poses a new challenge for indexing and searching XML data, because conventional approaches based on the relational model may not meet the processing requirements of XML data. In this paper, a path index based on Patricia tries, named the PT index, is proposed. The PT index structure offers several novel features. First, its Patricia-trie-based structure supports fast search. Second, the path indexes are compressed so that the PT index can be stored in memory. Third, because the PT index includes both the structure and the text of the XML data, results can be obtained from the PT index without reading the original XML data from disk. The time and space complexity of the PT index are further analyzed. Experimental results from a prototype implementation show that the PT index can outperform some representative index approaches, including DataGuide, the B+tree index, and Index Fabric.

KW - Index

KW - Patricia-tries

KW - XML

UR - http://www.scopus.com/inward/record.url?scp=34248630553&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=34248630553&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:34248630553

VL - 3

SP - 97

EP - 107

JO - Journal of Computational Information Systems

JF - Journal of Computational Information Systems

SN - 1553-9105

IS - 1

ER -