dblp.uni-trier.de www.dagstuhl.de www.uni-trier.de

Unstructured Data Bases or Very Efficient Text Searching.

Gaston H. Gonnet: Unstructured Data Bases or Very Efficient Text Searching. PODS 1983: 117-124
@inproceedings{DBLP:conf/pods/Gonnet83,
  author    = {Gaston H. Gonnet},
  editor    = {Ronald Fagin and
               Philip A. Bernstein},
  title     = {Unstructured Data Bases or Very Efficient Text Searching},
  booktitle = {Proceedings of the Second ACM SIGACT-SIGMOD Symposium on Principles
               of Database Systems, March 21-23, 1983, Colony Square Hotel,
               Atlanta, Georgia, USA},
  publisher = {ACM},
  year      = {1983},
  isbn      = {0-89791-097-4},
  pages     = {117-124},
  ee        = {http://doi.acm.org/10.1145/588058.588074, db/conf/pods/Gonnet83.html},
  crossref  = {DBLP:conf/pods/83},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

We present several algorithms to search data bases that consist of text. The algorithms apply mostly to very large data bases that are difficult to structure.

We describe algorithms which search the original data base without transformation and hence could be used as general text searching algorithms. We also describe algorithms requiring preprocessing, the best of them achieving a logarithmic behaviour. These efficient algorithms solve the "plagiarism" problem among n papers

The problem of misspellings, ambiguous spellings, simple errors, endings, positional information, etc is nicely treated using signature functions.

Copyright © 1983 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.


Load The ACM SIGMOD Anthology, CDROM Edition, Volume 1-3, PODS '82-'98. and ... Load The ACM SIGMOD Anthology, Silver Edition, DVD 1, Proceedings. and ...

Printed Edition

Ronald Fagin, Philip A. Bernstein (Eds.): Proceedings of the Second ACM SIGACT-SIGMOD Symposium on Principles of Database Systems, March 21-23, 1983, Colony Square Hotel, Atlanta, Georgia, USA. ACM 1983, ISBN 0-89791-097-4
Contents CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML

Online Edition: ACM Digital Library


References

[1]
Alfred V. Aho, Margaret J. Corasick: Efficient String Matching: An Aid to Bibliographic Search. Commun. ACM 18(6): 333-340(1975) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[2]
Burton H. Bloom: Space/Time Trade-offs in Hash Coding with Allowable Errors. Commun. ACM 13(7): 422-426(1970) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[3]
Robert S. Boyer, J. Strother Moore: A Fast String Searching Algorithm. Commun. ACM 20(10): 762-772(1977) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[4]
Larry Carter, Robert W. Floyd, John Gill, George Markowsky, Mark N. Wegman: Exact and Approximate Membership Testers. STOC 1978: 59-65 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[5]
Zvi Galil: On Improving the Worse Case Running Time of the Boyer-Moore String Matching Algorithm. Commun. ACM 22(9): 505-508(1979) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[6]
...
[7]
R. Nigel Horspool: Practical Fast Searching in Strings. Softw., Pract. Exper. 10(6): 501-506(1980) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[8]
Donald E. Knuth: The Art of Computer Programming, Volume III: Sorting and Searching. Addison-Wesley 1973, ISBN 0-201-03803-X
CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[9]
Donald E. Knuth, James H. Morris Jr., Vaughan R. Pratt: Fast Pattern Matching in Strings. SIAM J. Comput. 6(2): 323-350(1977) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[10]
John L. Pfaltz, William J. Berman, Edgar M. Cagley: Partial-Match Retrieval Using Indexed Descriptor Files. Commun. ACM 23(9): 522-528(1980) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[11]
Ronald L. Rivest: Partial-Match Retrieval Algorithms. SIAM J. Comput. 5(1): 19-50(1976) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[12]
Charles S. Roberts: Partial-Match Via the Method of Superimposed Codes. Proceedings of the IEEE 67(12): 1624-1642(1979) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[13]
Alan L. Tharp, Kuo-Chung Tai: The Practicality of Text Signatures for Accelerating String Searching. Softw., Pract. Exper. 12(1): 35-44(1982) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML

Last update Fri Sep 14 17:28:23 2012 CET by the DBLP TeamThis material is Open Data Data released under the ODC-BY 1.0 license — See also our legal information page