Extracting Schema from Semistructured Data.
Svetlozar Nestorov, Serge Abiteboul, Rajeev Motwani:
Extracting Schema from Semistructured Data.
SIGMOD Conference 1998: 295-306@inproceedings{DBLP:conf/sigmod/NestorovAM98,
author = {Svetlozar Nestorov and
Serge Abiteboul and
Rajeev Motwani},
editor = {Laura M. Haas and
Ashutosh Tiwary},
title = {Extracting Schema from Semistructured Data},
booktitle = {SIGMOD 1998, Proceedings ACM SIGMOD International Conference
on Management of Data, June 2-4, 1998, Seattle, Washington, USA},
publisher = {ACM Press},
year = {1998},
isbn = {0-89791-995-5},
pages = {295-306},
ee = {http://doi.acm.org/10.1145/276304.276331, db/conf/sigmod/NestorovAM98.html},
crossref = {DBLP:conf/sigmod/98},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Abstract
Semistructured data is characterized by the lack of any fixed and
rigid schema, although typically the data has some implicit
structure. While the lack of fixed schema makes extracting
semistructured data fairly easy and an attractive goal, presenting and
querying such data is greatly impaired. Thus, a critical problem is
the discovery of the structure implicit in semistructured data and,
subsequently, the recasting of the raw data in terms of this
structure. In this paper, we consider a very general form of
semistructured data based on labeled, directed graphs. We show that
such data can be typed using the greatest fixpoint semantics of
monadic datalog programs. We present an algorithm for approximate
typing of semistructured data. We establish that the general problem
of finding an optimal such typing is NP-hard, but present some
heuristics and techniques based on clustering that allow efficient and
near-optimal treatment of the problem. We also present some
preliminary experimental results.
Copyright © 1998 by the ACM,
Inc., used by permission. Permission to make
digital or hard copies is granted provided that
copies are not made or distributed for profit or
direct commercial advantage, and that copies show
this notice on the first page or initial screen of
a display along with the full citation.
CDROM Version: Load the CDROM "DiSC, Volume 1 Number 1" and ...
Online Version (ACM WWW Account required): Full Text in PDF Format
DVD Version: Load ACM SIGMOD Anthology DVD 1" and ...
Printed Edition
Laura M. Haas, Ashutosh Tiwary (Eds.):
SIGMOD 1998, Proceedings ACM SIGMOD International Conference on Management of Data, June 2-4, 1998, Seattle, Washington, USA.
ACM Press 1998, ISBN 0-89791-995-5
,
SIGMOD Record 27(2),
June 1998
Contents
[Abstract]
[Full Text (Postscript)]
Long Version
http://www-db.stanford.edu/pub/papers/extract-schema.ps
References
- [1]
- Serge Abiteboul:
Querying Semi-Structured Data.
ICDT 1997: 1-18

- [2]
- Serge Abiteboul, Richard Hull, Victor Vianu:
Foundations of Databases.
Addison-Wesley 1995, ISBN 0-201-53771-0
Contents

- [3]
- Antonio Albano, Roberto Bergamini, Giorgio Ghelli, Renzo Orsini:
An Object Data Model with Roles.
VLDB 1993: 39-51

- [4]
- ...
- [5]
- ...
- [6]
- Peter Buneman:
Semistructured Data.
PODS 1997: 117-121

- [7]
- Peter Buneman, Susan B. Davidson, Mary F. Fernandez, Dan Suciu:
Adding Structure to Unstructured Data.
ICDT 1997: 336-350

- [8]
- Peter Buneman, Susan B. Davidson, Gerd G. Hillebrand, Dan Suciu:
A Query Language and Optimization Techniques for Unstructured Data.
SIGMOD Conference 1996: 505-516

- [9]
- R. G. G. Cattell:
The Object Database Standard: ODMG-93 (Release 1.1).
Morgan Kaufmann 1994

- [10]
- Roy Goldman, Jennifer Widom:
DataGuides: Enabling Query Formulation and Optimization in Semistructured Databases.
VLDB 1997: 436-445

- [11]
- ...
- [12]
- ...
- [13]
- ...
- [14]
- ...
- [15]
- Svetlozar Nestorov, Jeffrey D. Ullman, Janet L. Wiener, Sudarshan S. Chawathe:
Representative Objects: Concise Representations of Semistructured, Hierarchial Data.
ICDE 1997: 79-90

- [16]
- Dallan Quass, Anand Rajaraman, Yehoshua Sagiv, Jeffrey D. Ullman, Jennifer Widom:
Querying Semistructured Heterogeneous Information.
DOOD 1995: 319-344

- [17]
- Dan Suciu:
Query Decomposition and View Maintenance for Query Languages for Unstructured Data.
VLDB 1996: 227-238

- [18]
- Jeffrey D. Ullman:
Principles of Database and Knowledge-Base Systems, Volume I.
Computer Science Press 1988, ISBN 0-7167-8158-1
Contents

- [18-2]
- Jeffrey D. Ullman:
Principles of Database and Knowledge-Base Systems, Volume II.
Computer Science Press 1989, ISBN 0-7167-8162-X
Contents

- [19]
- Moshé M. Zloof:
Query-by-Example: A Data Base Language.
IBM Systems Journal 16(4): 324-343(1977)

Copyright © Mon Dec 21 21:56:53 2009
by Michael Ley (ley@uni-trier.de)