The ISO standard specifies the Prolog syntax in ASCII characters. As SWI-Prolog supports Unicode in source files, we must extend the syntax. This section describes the implications for source files, while writing international source files is described in section 3.1.3.
The SWI-Prolog Unicode character classification is currently based on version 14.0.0 of the Unicode standard. Please note that char_type/2 and friends, intended to be used with all text except Prolog source code, are based on the C library locale-based classification routines.
The escape sequences \uXXXX and \UXXXXXXXX (see section 2.15.1.3) were introduced to specify Unicode code points in ASCII files.
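As a minimal sketch of the two escape forms, the clause below checks that the 4-digit and the 8-digit spelling denote the same character while the source text itself stays pure ASCII. The predicate name escape_demo/0 is a hypothetical helper introduced only for this illustration.

```prolog
% Assumption: escape_demo/0 is a hypothetical helper name used only here.
% Both quoted atoms denote the single character U+03B1 (Greek small alpha),
% written with ASCII characters only.
escape_demo :-
    atom_codes('\u03B1', Short),       % 4 hex digits
    atom_codes('\U000003B1', Long),    % 8 hex digits
    Short == [0x3B1],
    Long == Short.
```

Calling escape_demo should succeed, because both atoms consist of the single code point 0x3B1.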
An identifier is an ID_Start code followed by a sequence of ID_Continue codes. Such sequences are handled as a single token in SWI-Prolog. The token is a variable iff it starts with an uppercase character or an underscore (_). Otherwise it is an atom. Note that many languages do not have the notion of character case. In such languages variables must be written as _name.
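The variable/atom rule above can be probed with a small sketch that parses a piece of text with the standard reader and reports what kind of token it produced. The predicate name token_kind/2 is a hypothetical helper, and the test strings are written with \u escapes so the example does not depend on the encoding of the source file.

```prolog
% Assumption: token_kind/2 is a hypothetical helper name used only here.
% Parse Text as a Prolog term and classify the result.
token_kind(Text, Kind) :-
    term_to_atom(Term, Text),
    (   var(Term)  -> Kind = variable
    ;   atom(Term) -> Kind = atom
    ;   Kind = other
    ).
```

For example:

```prolog
?- token_kind('\u0391\u03B2', K).     % Greek capital Alpha, small beta
K = variable.

?- token_kind('\u03B1\u03B2', K).     % Greek small alpha, small beta
K = atom.

?- token_kind('\u65E5\u672C', K).     % two CJK characters, no case
K = atom.

?- token_kind('_\u65E5\u672C', K).    % same, prefixed with an underscore
K = variable.
```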
Decimal number characters (Nd) are accepted to form numbers, regardless of the Unicode block in which they appear. Currently this is supported for integers, rational numbers (see section 2.15.1.6) and floating point numbers. In any number, all digits must come from the same block, i.e., if the numerator of a rational uses Indian script, so must the denominator. All special characters such as the sign, rational separator, floating point dot, and floating point exponent must use their usual ASCII character.
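A hedged sketch of how this could be tried out: the helper below hands a piece of text to the standard reader and succeeds only if it reads as a number. The name read_number/2 is hypothetical, and the behaviour for non-ASCII digits assumes a SWI-Prolog version that implements the rule described above; older versions may reject such input.

```prolog
% Assumption: read_number/2 is a hypothetical helper name used only here.
% Parse Text with the Prolog reader; fail on syntax errors or if the
% resulting term is not a number.
read_number(Text, Number) :-
    catch(term_to_atom(Number, Text),
          error(syntax_error(_), _),
          fail),
    number(Number).
```

On a system that accepts Nd digits, the query read_number('\u0967\u0968\u0969', N), which spells one, two, three in Devanagari digits, is expected to bind N to the integer 123, just as read_number('123', N) does.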
Symbol characters act as glue, just like ==: an unquoted sequence of symbol characters is combined into an atom.
Other characters (this is mainly No: a numeric character of other type) are currently handled as ‘solo’.