Prolog Terms

The generic name for all forms of Prolog data is "term". The data your program works with is all terms of one form or another. The program itself is made up of terms. Prolog execution is simply the repetitive matching of patterns in these terms. This section describes the various forms of terms: numbers, atoms, strings, variables, structures, lists, character lists, character constants and database references.

Numbers

Numbers in Amzi! Prolog may be either 32-bit integers or 32-bit floating point numbers. Thus integers are in the range -2,147,483,648 to 2,147,483,647 and floating point numbers are in the approximate range 3.4E-38 to 3.4E38 with approximately seven digits of accuracy.

Integers

Integers can be read in normal decimal notation, or in hexadecimal. Hexadecimal numbers must begin with 0x.

Examples
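As a small sketch of the two notations, the facts below pair a decimal integer with the same value written in hexadecimal. The predicate name same_integer/2 and the values are illustrative assumptions, not part of the system.

```prolog
% Decimal and hexadecimal notation denote the same kind of integer.
% same_integer/2 is a hypothetical predicate used only for illustration.
same_integer(255, 0xff).      % 0xff is hexadecimal for 255
same_integer(16,  0x10).      % 0x10 is hexadecimal for 16

% ?- same_integer(X, Y), X =:= Y.    % succeeds for both facts
% ?- X is 0x1f + 1.                  % X = 32
```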

In negative numbers there can be no space between the minus sign and the first digit. For example, -456 is a number, while - 456 (with a space after the minus sign) is the same as the structure '-'(456) (see below).

A character preceded by a back quote (`) is translated into the integer ASCII value for that character. For example, `a is read as the integer 97.

Also see Character Constants below.

Floating Point Numbers

Floating point numbers must have a decimal point and at least one digit after the decimal point. They can also optionally include an exponent field. Numbers less than 1 must begin with a 0. Internally, floats are stored as double precision floating point numbers.

Examples:
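A few floats written according to the rules above. The values and the predicate name float_example/1 are arbitrary illustrations.

```prolog
% Each float has a decimal point with at least one digit after it,
% and values below 1 begin with 0.
% float_example/1 is a hypothetical predicate used only for illustration.
float_example(3.14).
float_example(0.5).
float_example(-2.0).
float_example(6.02e23).     % with an exponent field
float_example(1.0e-3).      % with a negative exponent

% ?- float_example(X), float(X).    % succeeds for every fact
```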

Atoms

Atoms are fundamental data elements. An atom is simply a named object. Unlike names in most other programming languages, an atom has no value associated with it; rather, it is itself a value manipulated by the program. (This is why Prolog is classified as a symbolic language: it manipulates symbols, and the basic symbol is an atom.)

The name of an atom is composed of letters, digits, the underscore character "_" and the dollar sign "$". An atom must begin with a lower case letter. Names such as mary, x25 and employee_name are atoms.

Atoms may also have names which do not follow the rule above. In this case the name, which may be composed of any characters, must be enclosed in single quotation marks, e.g.:
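Four illustrative quoted names follow. They are invented for this sketch and arranged so that the remarks below about the first two and the fourth apply to them. The predicate name quoted_atom/1 is hypothetical.

```prolog
% quoted_atom/1 is a hypothetical predicate used only for illustration.
quoted_atom('hello world').       % one space between the words
quoted_atom('hello  world').      % two spaces: a different atom
quoted_atom('Starts With Upper'). % would read as a variable if unquoted
quoted_atom('garçon').            % contains an international character

% ?- 'hello world' == 'hello  world'.    % fails: the names differ
```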

Notice in this case that the first two names stand for different atoms (differing number of spaces).

The fourth example shows the use of international characters which can be entered using the [Alt] key. The extended character codes for international characters, 128-167, are treated the same as lower-case letters. This means they can be used in atom names without quoting, and can be used as the first character of atom names.

Internally all atoms are stored as Unicode (wide) character strings. Further, Prolog source text can be stored in Unicode format, enabling the inclusion of Unicode characters directly in Prolog programs.

Failing to quote names that begin with uppercase letters is a very common Prolog error. It can also be tricky to catch, because a sequence of characters beginning with an uppercase letter is the way to represent Prolog variables, so an unquoted name usually makes perfect sense in the program.

For example:
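A minimal sketch of the mistake, using hypothetical predicates capital_wrong/2 and capital/2. In the first fact the unquoted names are variables, so the fact matches any two arguments. Some Prolog systems print a warning about variables that appear only once in a clause, but the clause is still accepted.

```prolog
% Intended meaning: Paris is the capital of France. Because the names are
% unquoted, France and Paris are variables and this fact matches anything.
capital_wrong(France, Paris).

% Quoted, the names are atoms and the fact says what was intended.
capital('France', 'Paris').

% ?- capital_wrong(germany, rome).   % succeeds, which is surely not intended
% ?- capital(germany, rome).         % fails
% ?- capital('France', City).        % City = 'Paris'
```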

In addition, a name made up entirely of the following symbols @#$%^&></+*-=\~`:?; need not be quoted. This is to allow you to build your own operators with their own significance to your applications.

For example, you might want to build a custom inference runtime and use the symbol => to indicate implies.
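A minimal sketch of such an operator, using op/3 (described later in this section). The precedence 800, the type xfx, and the predicates rule/1 and implies/2 are assumptions chosen for illustration.

```prolog
% Declare => as an infix operator. The precedence and type are assumptions.
:- op(800, xfx, =>).

% Hypothetical rules stored as ordinary facts whose argument is a => term.
rule(raining => wet_ground).
rule(wet_ground => slippery).

% Q follows from P in one step if there is a rule P => Q.
implies(P, Q) :- rule(P => Q).

% ?- implies(raining, Q).     % Q = wet_ground
```

Because => is only a symbol, the term raining => wet_ground is the same structure as '=>'(raining, wet_ground); the operator declaration changes how it is written, not what it is.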

It is useful to keep this example in mind, because it is the rationale behind the Prolog choice of symbols representing greater than or equal and less than or equal (see the section on arithmetic). Prolog constructs these symbols so they do NOT look like arrows. This is so you can use arrows in your programs. Do not, however, use the arrow -->; it is used by the DCG translator.

Certain special characters may be embedded inside a quoted atom by using a backslash followed by a token. See the section on Escape Characters for details.

Built-in Atoms

There are a number of built-in atoms, which have predetermined values that can be used in arithmetic expressions.

e         The value "e" (2.718282...)
pi        The value "pi" (3.14159...)
random    A random floating point number >= 0.0 and < 1.0.
cputime   A floating point number with the number of CPU seconds expired. It is useful for timing functions.
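A short sketch of these atoms inside is/2 expressions. The predicates circle_area/2 and timed/2 are hypothetical helpers written for this example.

```prolog
% Area of a circle, using the built-in atom pi.
circle_area(Radius, Area) :-
    Area is pi * Radius * Radius.

% Run Goal once and report the CPU seconds it took, using cputime.
timed(Goal, Seconds) :-
    T0 is cputime,
    call(Goal),
    T1 is cputime,
    Seconds is T1 - T0.

% ?- circle_area(2.0, A).                 % A is roughly 12.57
% ?- timed(circle_area(2.0, _), S).       % S is a small non-negative float
```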

Escape Characters

Certain special characters may be embedded inside quoted atoms, character lists or strings by use of the escape character (backslash) and a token. The handling of escape characters has been enhanced to be more in keeping with the emerging ISO standard, which is close to the C standard specification.

Enabling and Disabling Escape Processing

The use of a backslash (\) as an escape character can be irritating, especially in applications that use PC path names, as all pathnames must use double backslashes in place of a single backslash when you open, consult or load.

A mode setting called string_esc allows you to enable or disable processing of escape characters. To turn it on:

To turn it off:

It can also be set from the .cfg file:

The default setting is 'on'.

Escape Codes

When escape processing is enabled, the backslash (\) is the escape character and does not become part of the string. The character following the backslash is interpreted as follows:

a      alert (bell) character
b      backspace
f      formfeed
n      newline
r      carriage return
t      horizontal tab
v      vertical tab
ooo    up to three octal digits representing a character
xhh    up to two hex digits representing a character
\      a single backslash character

Any other character following a backslash is just the character.

This is a change from the way escape processing was documented and worked in previous versions. If you have strings containing backslashes that are intended to be literal backslashes, they must be changed to double backslashes.

Using Escape Codes

When are backslashes interpreted as escapes? Any time the Prolog term reader is invoked. This includes responses to the read/1 predicate, query terms entered at the listener prompt, query terms built using the string functions in the API, and code in a file that is either interpreted or compiled. The escape causes the Prolog reader to convert the escape sequence into the correct ASCII character(s) in the input string, atom, or character list.

When are they not considered escapes? Once the string has been read it stays converted. Other I/O predicates, such as read_string/1, do not use the Prolog term reader and process backslashes as plain backslashes.

Examples

\n represents a newline character. For instance, an atom such as 'first\nsecond' is written out with first and second on separate lines.

\t represents the "tab" character, a preset number of spaces (which depends on the type of terminal being used). For instance, an atom such as 'name\tvalue' is written with a tab between name and value.

To override this mechanism, that is, to have \n print out as the two characters \ and n, it is necessary to use an extra \, as in 'first\\nsecond'.

A quote within a quote can be represented in two ways. One is with a preceding \ and the other is by doubling the quote. So if you want an atom named it's, it can be entered as either 'it\'s' or 'it''s'.
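One way to see what the reader produced is to count characters. This sketch assumes escape processing is on and that the standard predicate atom_length/2 is available. The predicate name escape_demo/2 is hypothetical.

```prolog
% Each fact pairs a quoted atom with the number of characters it contains
% after escape processing.
escape_demo('a\nb',  3).    % a, newline, b
escape_demo('a\tb',  3).    % a, tab, b
escape_demo('a\\nb', 4).    % a, backslash, n, b
escape_demo('it\'s', 4).    % i, t, quote, s
escape_demo('it''s', 4).    % the same atom, with the quote doubled

% ?- escape_demo(Atom, N), atom_length(Atom, N).   % succeeds for each fact
% ?- 'it\'s' == 'it''s'.                           % succeeds
```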

Strings

A string is an efficient way of representing text. It is denoted by text enclosed in matching dollar signs, $. Strings may also have embedded formatting characters exactly like atoms (as described in the section on Escape Characters). For example, $This is a string$ is a string.

To represent the dollar sign within a string use two dollar signs, as in $The price is $$10$.

Strings are primarily used to represent text which is being printed out or read in to Prolog. Strings occupy less memory in the computer than do atoms. Also the Prolog system has a fixed number of atoms it can hold, whereas the number of strings is limited entirely by memory size.

Internally all strings are stored as Unicode (wide) character strings. This means the full Unicode character set can be used when reading and displaying information, and in Prolog source code.

Strings do not occupy space in the atom table and the space they occupy is automatically collected and reused by the system once the string is no longer needed. Consequently, strings can be a very memory-efficient way of dealing with text.

There exist mechanisms for converting between atoms and strings and doing elementary processing of strings.

Variables

Variables in Prolog have a unique feature. They are "stand-ins" for Prolog terms which may later be "filled in" with another Prolog term. We will discuss this in detail below.

A variable is represented by a series of letters, digits and the underscore character. It must begin either with an uppercase character or the underscore character. Names such as X, Result, _temp and _42 are valid variable names.
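The naming rule can be checked with the standard predicate var/1, which succeeds only for an unbound variable. The helper is_fresh/1 is a hypothetical name used for this sketch.

```prolog
% Succeeds when its argument is an unbound variable.
is_fresh(V) :- var(V).

% ?- is_fresh(X).         % succeeds: X begins with an uppercase letter
% ?- is_fresh(Result).    % succeeds
% ?- is_fresh(_temp).     % succeeds: begins with an underscore
% ?- is_fresh(x).         % fails: x is an atom, not a variable
```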

Two Prolog variables with the same name represent the same variable if they are in the same clause. Otherwise they are different variables (which just happen to have the same name). There is one exception to this rule:

Unlike every other variable which is uniquely determined by its name in the clause in which it appears, the variable _ is unique every time it occurs. It is called the anonymous variable.

Since it can never be referenced by naming it, the anonymous variable is used to stand in place for a value about which we care very little; we will never try to inspect the value of the variable.
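A brief sketch of both points, with hypothetical family facts. is_parent/1 ignores which child is found, and the two queries in the comments show that each _ is a separate variable while a repeated named variable is not.

```prolog
% Hypothetical facts for illustration.
parent(tom, bob).
parent(tom, liz).
parent(ann, bob).

% P is a parent if P has some child; we do not care which one.
is_parent(P) :- parent(P, _).

% ?- is_parent(tom).               % succeeds
% ?- pair(1, 2) = pair(_, _).      % succeeds: each _ is a different variable
% ?- pair(1, 2) = pair(X, X).      % fails: both X must be the same term
```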

Structures

Structures are the fundamental data types of Prolog. A structure is determined by its name (sometimes called the principal functor) and its arguments. The functor is an atom and the arguments may be any Prolog terms, including other structures. A structure is written as name(arg1, arg2, ..., argn).

There must be no space between the name and the opening parenthesis "(". The number of arguments in a structure is called the arity.

An atom is really a degenerate structure of arity 0.

The maximum arity of a structure is 255.

Structures are used to represent data. Following are some examples of a structure whose functor is "likes" and whose arity is 2.

Here are some more complex nested structures.
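The structures below are illustrative and use invented names. The standard predicate functor/3 reports the name and arity of a structure.

```prolog
% Structures with functor likes and arity 2, stored as facts.
likes(mary, wine).
likes(john, mary).

% A nested structure: the arguments of book/3 are themselves structures.
book(title('Moby Dick'), author(name(herman, melville)), year(1851)).

% ?- functor(likes(mary, wine), Name, Arity).   % Name = likes, Arity = 2
% ?- book(_, author(name(First, _)), _).        % First = herman
```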

Structures are also the heads of Prolog clauses, and the goals of the bodies of those clauses. For example:

All Prolog really does is match up structures with each other.

Lists

Lists are used to represent ordered collections of Prolog terms. Lists are indicated by square brackets "[" and "]". There are two kinds of lists: definite and indefinite, sometimes also called closed and open.

A definite list has a known number of elements which are written down, separated by commas within the brackets. The elements can be arbitrary Prolog terms, including other lists.

For example, [1, 2, 3] has three elements, the numbers 1, 2 and 3. The list [] is a special list called the empty list; it has no elements. A list such as [f(a), [b, c], X] also has three elements: the first is a structure of arity 1, the second a sub-list with two elements, and the third a variable called X.
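Three lists of the kinds just described, with their element counts checked. The structure f(a) and the atoms in the third list are placeholders, example_list/1 is a hypothetical predicate, and the sketch assumes the common predicate length/2 is available.

```prolog
% example_list/1 is a hypothetical predicate used only for illustration.
example_list([1, 2, 3]).
example_list([]).
example_list([f(a), [b, c], _X]).   % structure, sub-list, variable

% ?- example_list(L), length(L, N).
%    N is 3, then 0, then 3 on backtracking
```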

The second kind of list has a given initial number of known elements, and then from some point on the list is unknown. We represent this by using the vertical bar "|" to mean:

" .. and then if we remove the elements to the left of | from the given list, the list left over is everything to the right of |"

For example, [a | [b, c, d]] is exactly the same as the list [a, b, c, d], because if the first element a is removed from [a, b, c, d], then the list [b, c, d] is left. So this is really a definite list masquerading as an indefinite list. However, the list [1, 2 | X] is a list whose first element is 1, second element is 2 and whose remaining elements are as yet unknown (remaining to be filled in when the variable X is filled in and made into a list). Thus in some sense, [1, 2 | X] is the most general possible list which begins with the sequence 1, 2, .... This is a very important aspect of Prolog which is exploited often.
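A sketch of how an open list is used in practice: unifying it with a closed list fills in the tail. The predicate starts_with_1_2/1 is a hypothetical helper.

```prolog
% Succeeds for any list whose first two elements are 1 and 2.
starts_with_1_2([1, 2 | _]).

% ?- [1, 2 | X] = [1, 2, 3, 4].          % X = [3, 4]
% ?- [a | [b, c, d]] == [a, b, c, d].    % succeeds
% ?- starts_with_1_2([1, 2]).            % succeeds: the tail is []
% ?- starts_with_1_2([1, 3, 2]).         % fails
```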

Note that lists are really just a nice way of writing what is a conventional Prolog data structure, nested to arbitrary depth. You could implement your own lists, with exactly the same behavior, by using a structure called list with two arguments, a head and a tail. The tail can be another list, and so on and so forth. For example, the list [a, b, c] would be represented:
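One possible rendering, as a sketch: the structure name list comes from the text, while the atom nil for the empty list and the helper my_length/2 are assumptions made here.

```prolog
% [a, b, c] written with a user-defined structure list(Head, Tail);
% nil is an assumed marker for the empty list.
abc_list(list(a, list(b, list(c, nil)))).

% Length of a user-defined list, by the usual recursion on the tail.
my_length(nil, 0).
my_length(list(_, Tail), N) :-
    my_length(Tail, M),
    N is M + 1.

% ?- abc_list(L), my_length(L, N).    % N = 3
```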

If, in fact, you use the display predicate to write a list structure, you will see this true nature of lists revealed, except that the functor is the dot, written as a period, rather than list.

The nature of an open list is that it has a variable as the tail of its innermost structure.

[1, 2 | X] is the same as .(1, .(2, X))

While the normal list notation is easier to read and write, sometimes it is useful to think of the structure notation of lists when trying to understand predicates that manipulate lists.

Character Lists

Lists whose elements are ASCII codes for printable characters are often used in Prolog. Prolog recognizes some special syntax to make this use more convenient.

A string of characters of the form "<text string>" is converted into the list of ASCII character codes of the text string. For example, "abc" is read in as [97, 98, 99].

It is important to realize that once in the Prolog interpreter, Prolog has no way of knowing whether a list was originally a character string or not.
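Because a character list is an ordinary list of integers, it can be taken apart like any other list. The sketch below uses atom_codes/2 (mentioned under Character Constants) and a hypothetical helper starts_with_a/1.

```prolog
% Succeeds when the name of Atom begins with the letter a (code 97).
starts_with_a(Atom) :-
    atom_codes(Atom, [97 | _]).

% ?- atom_codes(abc, Codes).     % Codes = [97, 98, 99]
% ?- starts_with_a(abc).         % succeeds
% ?- starts_with_a(xyz).         % fails
```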

Individual ASCII values can be represented by a back quote and the character. So, assuming the predicate member/2 finds members of a list, a goal such as member(`a, "abc") is reasonable.

It is exactly the same as saying member(97, [97, 98, 99]).

To represent a " character within a character list, use it twice. For example, "say ""hi""" is the character list for the text say "hi".

Character Constants

Because Unicode characters are unsigned ints and are often referred to by their hexadecimal value, a new kind of integer constant, the character constant, has been added to Amzi! Prolog. Internally each one is an unsigned 2-byte integer. Character constants are entered using a syntax similar to hex numbers, only using a 'w' where hex uses an 'x', and they are always displayed in that syntax.

atom_codes/2, string_list/2 and the back quote character notation (`c) all use the character constants. For example, to create the atom duck:
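As a sketch of the same idea, the code below builds the atom from ordinary hexadecimal integers rather than character constants, relying on the statement that character constants unify with integers. The predicate name duck_atom/1 is hypothetical.

```prolog
% 0x64, 0x75, 0x63, 0x6b are the codes for d, u, c, k.
duck_atom(Atom) :-
    atom_codes(Atom, [0x64, 0x75, 0x63, 0x6b]).

% ?- duck_atom(A).    % A = duck
```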

To create an atom with Japanese characters:

Character constants can be used in arithmetic and will unify with integers.

Database References

A database reference is a term which may be thought of as the "address" of a rule or fact in memory. A database reference prints out as the symbol "@" followed by a number, e.g. @1782673.

The main purpose of a database reference is to allow the rapid retrieval of a rule in memory, avoiding Prolog's built-in search mechanism. There are a number of built-in predicates that provide you this information and let you use it.

Since the location in memory depends on the particular environment, database references can only be generated by the Prolog system; there is no way for you to input them.

Comments

Comments may appear anywhere in the source code. They are preceded by a % sign. All text following the percent up to the end of the line is considered part of the comment.

Also, although it is non-standard, Amzi! Prolog allows multi-line comments encased in C-style delimiters, /* and */.
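Both comment styles side by side, around two illustrative facts:

```prolog
% A single-line comment runs to the end of the line.
likes(mary, wine).      % a comment may also follow code on the same line

/* A multi-line comment
   can span several lines. */
likes(john, beer).
```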

Operators

Recall that a structure is written as name(arg1, arg2, ...argn).

As we will see later, there are some special structures whose names are familiar, such as: +, /, *, -. When the arity of a structure is two, we might like to write the structure as:

arg1 name arg2

so we can write "3 + 4", rather than "+(3, 4)". Similarly when the arity is 1, we might like to write:

name arg1, or arg1 name 

rather than name(arg1).

In order to do this we have to inform Prolog via an operator declaration that a certain name may optionally be used before, after, or in between its arguments; we speak of name as being an operator. Even if name is declared to be an operator, it can still be used in the usual structure notation.

We emphasize that declaring operators only serves to alter the way structures may look on input or output. Once inside the Prolog system, all structures are kept in a totally different internal form.

If an operator is declared to be used between its two arguments, we say it is an infix operator. If it is to be used before its single argument then it is a prefix operator; if it is to be used after its argument it is a postfix operator. Operators may be declared to be both infix and either prefix or postfix; in this case they are called mixed operators.

Just declaring the "fix" of an operator is not enough, however, since this can lead to ambiguities. For example, suppose that + and - have been declared to be infix operators. Consider:

a + b - c

What is the second argument of +? It might be b, in which case the term is

'-'( '+'(a, b), c)

or it might be the whole term b - c, in which case the term is

'+'(a, '-'(b, c))

These are very different terms so which should Prolog choose?

One way to force an interpretation is to use parentheses. So if we wanted the first interpretation we would write:

(a + b) - c

If we wanted the second we should use:

a + (b - c)

exactly as in high school algebra. However we still wish to agree on a consistent interpretation in the absence of overriding parentheses.

Prolog solves this problem by requiring two extra pieces of information about each operator at the time of its declaration: precedence and associativity.

Precedence

The first piece of information required for each operator (whether pre, in or post -fix) is a number between 1 and 1200 called the precedence of the operator.

When combining different operators together, the principal functor of a term represented by a series of operators is the operator with highest precedence.

For example, suppose + is defined to have precedence 500 and * is defined to have precedence 400. Consider:

a + b * c

We start reading from the left. + has higher precedence, so it must be the principal functor of the constructed term. Therefore the term must be:

'+'(a, '*'(b, c))

This corresponds naturally to the high school algebra rule "do multiplications first".
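The reading above can be checked with functor/3, which reports the principal functor of a term. The helper top_functor/2 is a hypothetical name, and the sketch relies on the predefined precedences of 500 for + and 400 for *.

```prolog
% Name is the principal functor of Term.
top_functor(Term, Name) :-
    functor(Term, Name, _).

% ?- top_functor(a + b * c, F).            % F = +
% ?- a + b * c == '+'(a, '*'(b, c)).       % succeeds
% ?- (a + b) * c == '*'('+'(a, b), c).     % succeeds: parentheses override
```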

Associativity

The other piece of information required is the operator's associativity. Not only does this specify the "fix" of the operator but it also handles the ambiguity remaining in operator usage, namely how to handle consecutive operators of the same precedence.

The associativity of an operator can be one of the following atoms:

xfx             yfx             fx              xf
xfy             yfy             fy              yf

where x and y stand for the arguments and f stands for the operator. Thus:

?f?    is an infix operator
?f     is a postfix operator
f?     is a prefix operator

The meaning of x versus y is a little more subtle. x means that the precedence of the argument (i.e., the precedence of the principal functor of the argument) must be less than the precedence of f. y means that the precedence of the corresponding argument may be less than or equal to the precedence of f.

op(Precedence, Type, Oper)

op/3 is used to define an operator's precedence (Precedence), position and associativity (using Type). Precedence must be bound to an integer between 0 and 1200. Type must be bound to one of the atoms fx, fy, xf, yf, xfx, xfy, yfx and Oper to either the atom which is to be made an operator or a list of such atoms (in which case all the atoms are given the same specified associativity/precedence).

For example:

?- op(500, yfx, +).
so now: 
a + b + c

must be the same as:

(a + b) + c 
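The difference between yfx and xfy can be seen by declaring two otherwise identical operators. The operator names lft and rgt and the checking predicates are hypothetical, chosen only for this sketch.

```prolog
% Two infix operators with the same precedence but different associativity.
:- op(500, yfx, lft).    % left argument may have equal precedence
:- op(500, xfy, rgt).    % right argument may have equal precedence

% a lft b lft c groups to the left, like (a lft b) lft c.
groups_left :-
    (a lft b lft c) == lft(lft(a, b), c).

% a rgt b rgt c groups to the right, like a rgt (b rgt c).
groups_right :-
    (a rgt b rgt c) == rgt(a, rgt(b, c)).

% ?- groups_left.     % succeeds
% ?- groups_right.    % succeeds
```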

Operators can have at most one infix declaration and one declaration of either pre- or post-fix in force at any time. Subsequent operator declarations supersede earlier ones. For example:

?- op(500, xfy, +).     % + is an infix operator
?- op(1200, fx, +).     % + is now both infix and prefix.
?- op(1200, xf, +).     % .. but is now infix and postfix

The final argument in op may be either a single atom or a list of atoms. In the latter case all the atoms are given the same specified associativity and precedence.

Term1 : Term2

The colon is a meaningless operator that can be used to associate two terms. For example, attr uses it for foreground and background colors, "white:blue".

Predefined Prolog Operators

The following Prolog operators are declared at initialization time. They can be subsequently redefined by using the op predicate (but it's not a good idea because they are used by the Prolog system). (Note: In release 4.1 the precedence of the -> operator was changed to be less than that of ';', to conform to the ISO standard. This could cause a change in behavior of programs with complex if-then-else (-> ;) statements.)


:- (op(1200, xfx, [:-, -->])).
:- (op(1200, fx, [?-, :-])).
:- (op(1100, fx, [import, export, dynamic, 
        multifile, discontiguous])).
:- (op(1100, xfy, ';')).
:- (op(1050, xfy, ->)).
:- (op(1000, xfy, ',')).
:- (op( 900, fy, [\+, not, once])).
:- (op( 700, xfx, [=, \=, is, =.., ==, \==, =:=, =\=, 
        <, >, =<, >=,
        @<, @>, @=<, @>=])).
:- (op( 600, xfy, :)).
:- (op( 500, yfx, [+, -, /\, \/, xor])).
:- (op( 400, yfx, [rem, mod, divs, mods, divu, modu])).
:- (op( 400, yfx, [/, //, *, >>, <<])).
:- (op( 200, xfx, **)).
:- (op( 200, xfy, ^)).
:- (op( 200, fy, [+, -, \])).

Copyright ©1987-2000 Amzi! inc. All Rights Reserved.