An algebra for structured text

Srinivasulu Erva, University of Nevada, Las Vegas

Abstract

The Standard Generalized Markup Language (SGML) is commonly used to mark up the logical structure of a document. An information retrieval (IR) system can use the structure information obtained from SGML documents to perform structure-level retrieval. In this thesis, we present a formal model and a modified version of Abiteboul and Beeri's complex object algebra for manipulating the content and structure of SGML documents. We also provide an extensive list of queries and their formulations to show the algebra's expressive power for manipulating textual objects.