SGML2PL has four modes for handling white-space. The initial mode can
be switched using the space(SpaceMode) option to load_structure/3 and
set_sgml_parser/2.
In XML mode, the mode is further controlled by the xml:space
attribute, which may be specified both in the DTD and in the document.
The defined modes are:
space(preserve)
    Note that \r\n is still translated to \n. To preserve white-space
    exactly, use space(strict) (see below).
space(default)
    In addition to sgml space-mode, all consecutive white-space is
    reduced to a single space character. This mode canonicalises all
    white-space.
space(remove)
    In addition to default, all leading and trailing white-space is
    removed from CDATA objects. If, as a result, the CDATA becomes
    empty, nothing is passed to the application. This mode is especially
    handy for processing "data-oriented" documents, such as RDF. It is
    not suitable for normal text documents. Consider the HTML fragment
    below. When processed in this mode, the spaces between the three
    modified words are lost. This mode is not part of any standard;
    XML 1.0 allows only default and preserve.
Consider adjacent <b>bold</b> <ul>and</ul> <it>italic</it> words.
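
The effect of the space option on this fragment can be tried with a
small sketch like the one below. It is an illustration only, under some
assumptions: the fragment is wrapped in a <p> root element so that the
input is a single element, the XML dialect is selected so that no DTD is
needed, and the predicate names fragment/1, parse_fragment/2 and
parse_fragment_with_parser/2 are made up for this example. The first
predicate passes space(SpaceMode) to load_structure/3; the second sets
the same option on an explicit parser with set_sgml_parser/2.

```prolog
:- use_module(library(sgml)).

% The fragment from the text. The surrounding <p> element is an
% assumption added for this example; it is not part of the original.
fragment("<p>Consider adjacent <b>bold</b> <ul>and</ul> <it>italic</it> words.</p>").

% parse_fragment(+SpaceMode, -DOM)
% Parse the fragment with load_structure/3, selecting the initial
% white-space mode through the space(SpaceMode) option.
parse_fragment(SpaceMode, DOM) :-
    fragment(Text),
    setup_call_cleanup(
        open_string(Text, In),
        load_structure(stream(In), DOM,
                       [ dialect(xml), space(SpaceMode) ]),
        close(In)).

% parse_fragment_with_parser(+SpaceMode, -DOM)
% Same input, but the mode is set on an explicit parser object using
% set_sgml_parser/2 before parsing.
parse_fragment_with_parser(SpaceMode, DOM) :-
    fragment(Text),
    setup_call_cleanup(
        new_sgml_parser(Parser, []),
        (   set_sgml_parser(Parser, dialect(xml)),
            set_sgml_parser(Parser, space(SpaceMode)),
            setup_call_cleanup(
                open_string(Text, In),
                sgml_parse(Parser, [ source(In), document(DOM) ]),
                close(In))
        ),
        free_sgml_parser(Parser)).
```

Running ?- parse_fragment(preserve, DOM). and
?- parse_fragment(remove, DOM). side by side should show the difference
described above: with preserve the white-space between the <b>, <ul> and
<it> elements is kept as CDATA, while with remove it is dropped.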