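The principle of logical markup consists in marking the structure of a document. Its definition has two distinct phases: first, constructing the Document Type Definition (DTD) that describes the structure of a class of documents; second, marking up the document source according to that DTD.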
Several document instances can belong to the same document ``class'':
they are described by the same DTD; in other words, they have
the same logical structure. As an example, let us consider two source
texts of an article (see the figure below), where the
specific structures look different, but the logical structure is
built according to the same pattern: a title, followed by one or more
sections, each one subdivided into zero or more subsections, and a
bibliography at the end. We can say that the document instances
belong to the document class ``article''.
Figure: Two instances of the same document class ``article''
To describe the formal structure of all documents of type ``article'', one has to construct the Document Type Definition (DTD) of the document class ``article''. A DTD is expressed in a language defined by the SGML Standard and identifies all the elements that are allowed in a document belonging to the document class being defined (sections, subsections, ...). The DTD assigns a name to each such structural element, often an abbreviation conveying the function of the element in question (for example, ``sec'' for a section). If needed, the DTD also associates one or more descriptive attributes with each element, and describes the relations between elements (for example, the bibliography always comes at the end of the document, while sections can, but need not, contain subsections).
Note that the relations between elements do not always have to be hierarchical. For instance, the relation between a section title and a cross-reference to that title three sections further down is not hierarchical. In general, DTDs use element attributes to express this kind of cross-link.
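As an illustration, a DTD for the class ``article'' might look roughly like the sketch below. It is not the DTD of any actual system: only the names ``article'', ``tit'', ``sec'', ``ssec'' and ``P'' come from the text, while ``bib'', ``ref'', ``id'' and ``refid'' are hypothetical names chosen for this example. The sketch assumes a flat style in which <sec> and <ssec> enclose only the heading text, and it assumes the default SGML declaration, where names are case-insensitive and tag omission is allowed.

```sgml
<!-- Sketch of a DTD for the document class "article".
     bib, ref, id and refid are hypothetical names. -->

<!-- A title, one or more sections (each with zero or more
     subsections), and a bibliography at the end. -->
<!ELEMENT article    - - (tit, (sec, p*, (ssec, p*)*)+, bib)>
<!ELEMENT tit        - - (#PCDATA)>
<!ELEMENT (sec|ssec) - - (#PCDATA)>

<!-- "- O" means the start tag is required, the end tag may be omitted. -->
<!ELEMENT p          - O (#PCDATA | ref)*>
<!ELEMENT bib        - - (#PCDATA)>

<!-- Non-hierarchical relation: a cross-reference expressed
     through attributes rather than through nesting. -->
<!ELEMENT ref        - O EMPTY>
<!ATTLIST sec        id    ID    #IMPLIED>
<!ATTLIST ref        refid IDREF #REQUIRED>
```

The content model of ``article'' encodes the ordering rules described above, while the ID/IDREF attribute pair lets a parser check that every cross-reference points at an existing section.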
Having defined the DTD, one can then start marking up the document source itself (article A or article B), using the ``short'' names defined for each document element. For instance, with ``sec'' one forms the tag <sec> to mark the start of a section and </sec> to mark its end; similarly, one has <ssec> and </ssec> for a subsection, and so on.
<article>
<tit>An introduction to SGML</tit>
<sec>SGML: the basic principles</sec>
<P> ...
<ssec>Generalized logical markup</ssec>
<P> ...
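A cross-link expressed through attributes could be marked up along the following lines. This is only a sketch: it assumes a DTD that declares an ``id'' attribute of type ID on ``sec'' and an empty ``ref'' element carrying an IDREF attribute ``refid''. Those names, the identifier ``basics'' and the second section heading are all hypothetical.

```sgml
<article>
<tit>An introduction to SGML</tit>
<!-- The id attribute gives this section a unique label. -->
<sec id="basics">SGML: the basic principles</sec>
<P> ...
<!-- Hypothetical later section that refers back to the first one. -->
<sec>A later section</sec>
<P> As explained in <ref refid="basics"> ...
```

The link between <ref> and the first <sec> is carried entirely by the matching attribute values, not by the nesting of the elements.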