A string in ATS is represented in the same manner as in C: It is a sequence of adjacently stored non-null characters followed by the null character, and its length is the number of non-null characters in the sequence. Conventionally, such strings are often referred to as C-style strings, which are notoriously difficult to be processed safely (as is clearly indicated by so many bugs and breaches due to misusing such strings). As a matter of fact, ATS is the first practical programming language that I know can fully support safe processing of C-style strings. In ATS, string is a type constructor of the sort (int) -> type. Given a static integer n, string(n) is the type for strings of length n. Note that string also refers to a non-dependent type for strings of unspecified length, which is basically equivalent to the type String defined as follows:
The following two functions are commonly used for traversing a given string:// fun string_is_atend {n:int}{i:nat | i <= n} (str: string(n), i: size_t(i)): bool(i==n) // end of [string_is_atend] // fun string_isnot_atend {n:int}{i:nat | i <= n} (str: string(n), i: size_t(i)): bool(i < n) // end of [string_isnot_atend] //
fun string_length {n:nat} ( str: string(n) ) : size_t(n) = let fun loop{i:nat | i <= n} .<n-i>. (str: string(n), i: size_t(i)): size_t(n) = if string_isnot_atend(str, i) then loop(str, succ(i)) else i // end of [loop] in loop(str, i2sz(0)) end // end of [string_length]
The following two functions are for accessing and updating characters stored in strings:
// typedef charNZ = [c:int | c != 0] char(c) // fun string_get_at {n:int}{i:nat | i < n}(str: string n, i: size_t i): charNZ overload [] with string_get_at // fun string_set_at {n:int}{i:nat | i < n}(str: string n, i: size_t i, c: charNZ): void overload [] with string_set_at //
// typedef sizeLt(n:int) = [i:nat | i < n] size_t(i) // fun string_find{n:nat} ( str: string n, c0: char ) : Option(sizeLt n) = let typedef res = sizeLt(n) fun loop{i:nat | i <= n} ( str: string n, c0: char, i: size_t i ) : Option(res) = let val isnot = string_isnot_atend(str, i) in if isnot then if (c0 = str[i]) then Some{res}(i) else loop(str, c0, succ(i)) // end of [if] else None() // end of [if] end (* end of [loop] *) in loop(str, c0, i2sz(0)) end // end of [string_find] //
There is some inherent inefficiency in the implementation of string_find: A given position i is first checked to see if it is strictly less than the length of the string str by calling string_isnot_atend, and, if it is, the character stored at the position in the string is fetched by calling string_get_at. These two function calls are merged into one in the following implementation:
// // This implementation does the same as [string_find] // but should be more efficient. // fun string_find2{n:nat} ( str: string n, c0: char ) : Option(sizeLt n) = let // fun loop {i:nat | i <= n} ( str: string(n) , c0: char, i: size_t(i) ) : Option(sizeLt n) = let typedef res = sizeLt(n) val c = string_test_at(str, i) in if c != '\000' then ( if (c0 = c) then Some{res}(i) else loop(str, c0, succ(i)) // end of [if] ) else None((*void*)) // end of [if] end // end of [loop] // in loop(str, c0, i2sz(0)) end // end of [string_find2]
fun string_test_at {n:int}{i:nat | i <= n} ( str: string(n), i: size_t(i) ) : [c:char | (c != NUL && i < n) || (c == NUL && i >= n)] char(c) // end of [string_test_at]
Handling strings safely and efficiently is a complicated matter in programming language design, and a great deal of information about a programming language can often be revealed by simply studying its treatment of strings. In ATS, properly processing C-style strings also makes essential use of linear types, which I will cover in another part of this book.