
Generalized Logical Markup

 

The principle of logical markup consists in marking the structure of a document, and its definition has two different phases:

  1. the definition of a set of ``tags'' identifying all elements of a document, and of formal ``rules'' expressing the relations between the elements and the structure of the document (this is the role of the DTD);
  2. entering the markup into the source of the document according to the rules laid out in the DTD.

Several document instances can belong to the same document ``class'': they are described by the same DTD, in other words they have the same logical structure. As an example, let us consider two source texts of an article (see the figure below), where the specific structures look different, but the logical structure is built according to the same pattern: a title, followed by one or more sections, each one subdivided into zero or more subsections, and a bibliography at the end. We can say that both document instances belong to the document class ``article''.

  
Figure: Two instances of the same document class ``article''
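
The idea that two differently shaped articles share one logical pattern can be sketched in a few lines of Python. This is only an illustration, not how an SGML parser works: the single-letter codes, the function name and the two sample outlines are assumptions made for the sketch, and the pattern is simply the one described in the text (a title, one or more sections each with zero or more subsections, then a bibliography).

```python
import re

# Single-letter codes for the component kinds (an assumption of this sketch).
CODES = {"title": "T", "section": "S", "subsection": "s", "bibliography": "B"}

# Title, then one or more sections (each followed by zero or more
# subsections), then a bibliography at the end.
ARTICLE_PATTERN = re.compile(r"T(Ss*)+B")


def belongs_to_article_class(components):
    """Return True if the flat list of component kinds fits the pattern."""
    try:
        encoded = "".join(CODES[kind] for kind in components)
    except KeyError:
        return False  # a component kind the pattern does not know about
    return ARTICLE_PATTERN.fullmatch(encoded) is not None


# Two hypothetical instances whose specific structures differ.
article_a = ["title", "section", "subsection", "subsection",
             "section", "bibliography"]
article_b = ["title", "section", "section", "subsection",
             "section", "bibliography"]

# A counter-example: the bibliography does not come last.
not_an_article = ["title", "bibliography", "section"]

print(belongs_to_article_class(article_a))
print(belongs_to_article_class(article_b))
print(belongs_to_article_class(not_an_article))
```

Both sample outlines match the pattern although their sections and subsections are arranged differently, which is what belonging to the same document class means here.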

To describe the formal structure of all documents of type ``article'' one has to construct the Document Type Definition (or DTD) of the document class ``article''. A DTD is expressed in a language defined by the SGML Standard and identifies all the elements that are allowed in a document belonging to the document class being defined (sections, subsections, etc.). The DTD assigns a name to each such structural element, often an abbreviation conveying the function of the element in question (for example, ``sec'' for a section). If needed, the DTD also associates one or more descriptive attributes with each element, and describes the relations between elements (for example, the bibliography always comes at the end of the document, while sections can, but need not, contain subsections). Note that the relations between elements do not always have to be hierarchical: for instance, the relation between a section title and a cross-reference to that title three sections further down is not hierarchical. In general, DTDs use element attributes to express this kind of cross-link.
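
The cross-link mechanism can be illustrated with a short Python sketch. It assumes that a target element carries an identifying attribute and that a cross-reference element names that identifier in an attribute of its own. The element name ``ref'', the attribute names ``id'' and ``refid'', and the sample data are all hypothetical and chosen only for this example; a real DTD declares its own names.

```python
def unresolved_references(elements):
    """Return the refid values that do not match any declared id.

    `elements` is a flat list of (element_name, attributes) pairs in
    document order; nesting is irrelevant for this kind of link.
    """
    declared = {attrs["id"] for _, attrs in elements if "id" in attrs}
    return [attrs["refid"] for _, attrs in elements
            if "refid" in attrs and attrs["refid"] not in declared]


# Hypothetical document content, reduced to names and attributes.
elements = [
    ("sec", {"id": "principles"}),
    ("sec", {}),
    ("ref", {"refid": "principles"}),  # points back at the first section
    ("ref", {"refid": "missing"}),     # no element carries this id
]

print(unresolved_references(elements))  # ['missing']
```

The check never looks at where an element sits in the hierarchy, only at its attributes, which is why this kind of relation can connect parts of a document that are far apart.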

Having defined the DTD, one can then start marking up the document source itself (article A or article B), using the ``short'' names defined for each document element. For instance, with ``sec'' one forms the tag <sec> to mark the start of a section and </sec> to mark its end; similarly, one has <ssec> and </ssec> for a subsection, and so on.

<article>
<tit>An introduction to SGML</tit>
<sec>SGML: the basic principles</sec>
<P>  ...
<ssec>Generalized logical markup</ssec>
<P>  ...
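
The way a start tag and an end tag are formed from an element's short name can be written down as a small Python helper. This is a sketch for illustration only; the function name `mark_up` is an assumption, and the short names and texts are taken from the example above.

```python
def mark_up(name, text):
    """Wrap text in the start tag <name> and the end tag </name>."""
    return f"<{name}>{text}</{name}>"


print(mark_up("tit", "An introduction to SGML"))
print(mark_up("sec", "SGML: the basic principles"))
print(mark_up("ssec", "Generalized logical markup"))
```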






Janne Saarela
Tue Jun 20 12:14:59 METDST 1995