Abstract
We propose an approach to dynamically generating database schemas for well-formed XML data. Our approach uses statistics of the XML data to control how many tables the data is divided into, so that the total cost of processing queries is reduced. To reduce the number of tables, we devise schemas suited to complex data such as text formatting and child elements with a small maximum number of occurrences. To this end, we define three functions, NULL expectation, Large Leaf Fields, and Large Child Fields, that control how the tables are divided. We evaluated typical XML queries over both the generated schemas and normalized schemas, and measured and compared their costs. This comparison validated our approach.
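The abstract does not define the three functions, so the following is only a minimal sketch of the general idea of statistics-driven table division, not the authors' method. It assumes two hypothetical proxies: the maximum number of times a child element occurs under one parent, and the fraction of parent instances that lack the child (a stand-in for the columns that would be NULL if the child were stored inline). The function names `collect_stats` and `plan_schema` and both thresholds are illustrative assumptions.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def collect_stats(xml_text):
    """Return {(parent_tag, child_tag): (max_occurs, null_ratio)}.

    max_occurs: largest number of such children under a single parent.
    null_ratio: share of parent instances with no such child
                (a hypothetical proxy, not the paper's NULL expectation).
    """
    root = ET.fromstring(xml_text.strip())
    parent_count = defaultdict(int)  # instances seen per parent tag
    max_occurs = defaultdict(int)    # per (parent, child) pair
    present = defaultdict(int)       # parents having at least one such child

    for parent in root.iter():
        parent_count[parent.tag] += 1
        per_child = defaultdict(int)
        for child in parent:
            per_child[child.tag] += 1
        for tag, n in per_child.items():
            key = (parent.tag, tag)
            max_occurs[key] = max(max_occurs[key], n)
            present[key] += 1

    return {
        key: (max_occurs[key], 1.0 - present[key] / parent_count[key[0]])
        for key in max_occurs
    }


def plan_schema(stats, max_inline=2, max_null_ratio=0.5):
    """Decide per child whether to inline it as parent columns.

    Both thresholds are assumed values chosen for illustration.
    """
    plan = {}
    for key, (occurs, null_ratio) in stats.items():
        inline = occurs <= max_inline and null_ratio <= max_null_ratio
        plan[key] = "inline" if inline else "separate table"
    return plan


SAMPLE = """
<library>
  <book><title>A</title><author>X</author><author>Y</author></book>
  <book><title>B</title><author>Z</author><note>n</note></book>
  <book><title>C</title><author>P</author><author>Q</author><author>R</author></book>
</library>
"""

if __name__ == "__main__":
    for pair, decision in plan_schema(collect_stats(SAMPLE)).items():
        print(pair, decision)
```

In this sample, `title` occurs exactly once in every `book` and is inlined, while `author` (up to three occurrences) and `note` (absent from two of three books) are placed in separate tables. Lowering the number of separate tables reduces joins at query time, at the price of more NULL-valued columns, which is the trade-off the abstract describes.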
Original language | English |
---|---|
Title of host publication | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
Pages | 379-390 |
Number of pages | 12 |
Volume | 3681 LNAI |
Publication status | Published - 2005 |
Externally published | Yes |
Event | 9th International Conference on Knowledge-Based Intelligent Information and Engineering Systems, KES 2005 - Melbourne, Australia Duration: Sep 14 2005 → Sep 16 2005 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 3681 LNAI |
ISSN (Print) | 03029743 |
ISSN (Electronic) | 16113349 |
Other
Other | 9th International Conference on Knowledge-Based Intelligent Information and Engineering Systems, KES 2005 |
---|---|
Country | Australia |
City | Melbourne |
Period | 9/14/05 → 9/16/05 |
ASJC Scopus subject areas
- Computer Science(all)
- Biochemistry, Genetics and Molecular Biology(all)
- Theoretical Computer Science
Cite this
On mining XML structures based on statistics. / Ishikawa, Hiroshi; Yokoyama, Shohei; Ohta, Manabu; Katayama, Kaoru.
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 3681 LNAI 2005. p. 379-390 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 3681 LNAI).
Research output: Chapter in Book/Report/Conference proceeding › Conference contribution
TY - GEN
T1 - On mining XML structures based on statistics
AU - Ishikawa, Hiroshi
AU - Yokoyama, Shohei
AU - Ohta, Manabu
AU - Katayama, Kaoru
PY - 2005
Y1 - 2005
UR - http://www.scopus.com/inward/record.url?scp=33745324045&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33745324045&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:33745324045
SN - 3540288945
SN - 9783540288947
VL - 3681 LNAI
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 379
EP - 390
BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ER -