A string in ATS is represented in the same manner as in C: It is
sequence of adjacently stored non-null characters followed by the null
character, and its length is the number of non-null characters in the
sequence. Conventionally, such strings are often referred to as C-style
strings, which are notoriously difficult to be processed safely (as
indicated by so many bugs and breaches due to misusing such strings). As a
matter of fact, ATS is the first practical programming language that I know
can fully support safe processing of C-style strings. In ATS,
string is a type constructor of the sort (int) ->
type . Given a static integer n, string(n) is the
type for strings of the length n. Note that string also refers
to a non-dependent type for strings of unspecified length, which is
basically equivalent to the type String defined as follows:
The following two functions are commonly used for traversing a given string:
Obviously, either one of them can be implemented based on the other. As an
example, the following code implements a function that computes the length
of a string:
Note that the function
loop in the body of
string_length is defined tail-recursively. Although this
implementation of
string_length looks fairly plain now, it was
actually an exciting achievement in the pursuit of supporting practical
programming with dependent types.
The following two functions are for accessing and updating characters stored
in strings:
The type constructor
char is of the sort
(char) ->
t@ype, which takes a static character c to form a singleton type
char(c) for the only character equal to c. Thus, the type
c1har is for all non-null characters. The following defined
function
string_find traverses a string from left to right to
see if a given character occurs in the string:
If the character
c0 occurs in the string
str,
then a value of the form
option0_some(i) is returned, when i
refers to the position of the first occurrence of
c0 (counting
from left to right). Otherwise, the value
option0_none() is
returned.
There is some inherent inefficiency in the implementation of
string_find: A given position i is first checked
to see if it is strictly less than the length of the string
str by calling string_isnot_at_end, and, if it
is, the character stored at the position in the string is fetched by
calling string_get_char_at. These two function calls are
merged into one function call in the following implementation:
The interface for the function
string_test_char_at
is given as follows:
By checking the return value of a call to
string_test_char_at,
we can readily tell whether the position
i is at the end of
the string
str.
Handling strings safely and efficiently is a complicated matter in
programming language design, and a great deal of information about a
programming language can often be revealed by simply studying its treatment
of strings. In ATS, properly processing C-style strings also makes
essential use of linear types, which I will cover in another part of this
book.