XML, XSL, two of a family of extensible languages

Understanding the XML Specification

The Extended Backus-Naur Form (EBNF)

The XML specification defines the syntax of XML documents with a grammar written in a simple Extended Backus-Naur Form (EBNF) notation. Each rule of the grammar defines one symbol and has the form:
 symbol ::= expression

symbol
written with an initial capital letter if it is the start symbol of a regular language, and with an initial lowercase letter otherwise.
expression
the right-hand side of the rule, built from the constructs below, each of which matches a string of one or more characters.
#xN
where N is a hexadecimal integer; matches the character whose code point in ISO/IEC 10646 (Unicode) is N. Leading zeros in N are insignificant.
[a-zA-Z], [#xN-#xN]
matches any character with a value in the range(s) indicated (inclusive).
[^a-zA-Z], [^#xN-#xN]
matches any character with a value outside the range(s) indicated.
[^abc], [^#xN#xN#xN]
matches any character with a value not among the characters given.
'text' or "text"
matches the literal string given inside the single (or double) quotes.
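
The character-level constructs above map closely onto regular-expression character classes. The following is a minimal sketch in Python, not part of the specification: the helper names char_ref and char_class are hypothetical, and the sketch assumes the EBNF class contains no characters (such as a backslash) that Python's re module would treat specially inside brackets.

```python
import re

def char_ref(ref: str) -> str:
    """Return the character denoted by an EBNF reference such as '#x41'."""
    if not ref.startswith("#x"):
        raise ValueError(f"not a #xN reference: {ref!r}")
    # int() ignores leading zeros, so '#x41' and '#x0041' give the same character.
    return chr(int(ref[2:], 16))

def char_class(expr: str) -> re.Pattern:
    """Compile an EBNF class such as '[^#x30-#x39]' into a Python regex.

    Only the #xN references are rewritten; the bracket, range and negation
    syntax is reused as-is because it coincides with Python's.
    """
    body = re.sub(
        r"#x([0-9A-Fa-f]+)",
        lambda m: re.escape(chr(int(m.group(1), 16))),
        expr,
    )
    return re.compile(body)

digits = char_class("[#x30-#x39]")       # same set as [0-9]
not_digits = char_class("[^#x30-#x39]")
print(char_ref("#x41"), bool(digits.fullmatch("7")), bool(not_digits.fullmatch("7")))
```

A quoted literal such as 'text' needs no translation beyond escaping, for example re.escape("text").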

These symbols may be combined to match more complex patterns as follows, where A and B represent simple expressions:

(expression)
expression is treated as a unit and may be combined as described in this list.
A?
matches A or nothing; that is, A is optional.
A B
matches A followed by B.
A|B
matches A or B but not both.
A-B
matches any string that matches A but does not match B.
A+
matches one or more occurrences of A.
A*
matches zero or more occurrences of A.
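
To see the combining operators at work, three productions from the XML 1.0 Recommendation (S, Eq and PITarget) can be transcribed into Python regular expressions. This is only an illustrative sketch: the constant names are hypothetical, and NAME is a deliberately simplified ASCII-only stand-in for the real Name production, which admits many more characters.

```python
import re

# S ::= (#x20 | #x9 | #xD | #xA)+        grouping, alternation, one-or-more
S = r"(?:\x20|\x09|\x0D|\x0A)+"

# Eq ::= S? '=' S?                        optional parts and sequence
EQ = rf"(?:{S})?=(?:{S})?"

# Assumption: simplified ASCII-only approximation of Name.
NAME = r"[A-Za-z_:][A-Za-z0-9._:-]*"

# PITarget ::= Name - (('X' | 'x') ('M' | 'm') ('L' | 'l'))
RESERVED = r"(?:X|x)(?:M|m)(?:L|l)"

def is_pi_target(s: str) -> bool:
    """A - B: s must match A (Name) and must not match B (any casing of 'xml')."""
    return re.fullmatch(NAME, s) is not None and re.fullmatch(RESERVED, s) is None

print(bool(re.fullmatch(EQ, " = ")))      # white space around '=' is optional
print(is_pi_target("xml-stylesheet"))     # a Name that is not exactly 'xml'
print(is_pi_target("XmL"))                # excluded by the difference operator
```

Python's re module has no direct difference operator, so A-B is expressed here as two separate full-string tests rather than as a single pattern.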

Other notations used in the productions are:

/* ... */
comment.
[ wfc: ... ]
well-formedness constraint; this identifies by name a constraint on well-formed documents associated with a production.
[ vc: ... ]
validity constraint; this identifies by name a constraint on valid documents associated with a production.
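
When productions are processed mechanically, the comments and constraint annotations have to be separated from the expression itself. The sketch below shows one possible way to do this in Python; the function name split_production is hypothetical, and the code assumes that no quoted literal in the production contains "/*" or text resembling a "[ wfc: ... ]" annotation.

```python
import re

# Matches "[ wfc: Name ]" or "[ vc: Name ]", in either letter case.
ANNOTATION = re.compile(r"\[\s*(wfc|vc)\s*:\s*([^\]]+?)\s*\]", re.IGNORECASE)
# Matches "/* ... */" comments.
COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)

def split_production(line: str):
    """Return (symbol, expression, constraints) for one production line.

    constraints is a list of (kind, name) pairs, where kind is 'wfc' or 'vc'.
    """
    line = COMMENT.sub("", line)
    constraints = [(kind.lower(), name) for kind, name in ANNOTATION.findall(line)]
    line = ANNOTATION.sub("", line)
    symbol, expression = (part.strip() for part in line.split("::=", 1))
    return symbol, expression, constraints

print(split_production(
    "STag ::= '<' Name (S Attribute)* S? '>' [ wfc: Unique Att Spec ]"
))
```

Character classes such as [#x20-#xD7FF] are left untouched, because the annotation pattern only matches brackets that open with wfc or vc followed by a colon.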

Last updated: September 10th 1999