On mining XML structures based on statistics

Hiroshi Ishikawa, Shohei Yokoyama, Manabu Ohta, Kaoru Katayama

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

We propose an approach to dynamically generating database schemas for well-formed XML data. The approach uses statistics of the XML data to control how many tables the data is divided into, so that the total cost of query processing is reduced. To reduce the number of tables, we devise schemas suited to complex data such as text formatting and child elements with a small maximum number of occurrences. To this end, we define three functions, NULL expectation, Large Leaf Fields, and Large Child Fields, that control how tables are divided. We evaluated typical XML queries over both the generated schemas and normalized schemas, and measured and compared their costs. The results validated our approach.

Original language: English
Title of host publication: Knowledge-Based Intelligent Information and Engineering Systems - 9th International Conference, KES 2005, Proceedings
Publisher: Springer Verlag
Pages: 379-390
Number of pages: 12
ISBN (Print): 3540288945, 9783540288947
Publication status: Published - Jan 1 2005
Externally published: Yes
Event: 9th International Conference on Knowledge-Based Intelligent Information and Engineering Systems, KES 2005 - Melbourne, Australia
Duration: Sep 14 2005 → Sep 16 2005

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3681 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 9th International Conference on Knowledge-Based Intelligent Information and Engineering Systems, KES 2005
Country: Australia
City: Melbourne
Period: 9/14/05 → 9/16/05

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
