The datatype mechanism in ATS is adopted from ML directly, and it is really a signatory feature in functional programming. However, the datatypes we have seen so far are not very precise when employed to classify values. For instance, given a type T, the type list0(T) is for values representing both empty and non-empty lists consisting of elements of the type T. Therefore, empty and non-empty lists cannot be distinguished at the level of types. This limitation severely diminishes the effectiveness of datatypes of ML-style in capturing program invariants. In ATS, datatypes of ML-style can be refined into dependent datatypes of DML-style, where DML refers to the programming language Dependent ML, the immediate precursor of ATS. With such refinement, datatypes can classify values with greatly enhanced precision.
The code employed for illustration in this chapter plus some additional code for testing is available on-line.
The syntax for declaring dependent datatypes is mostly similar to the syntax for declaring non-dependent datatypes: For instance, the dependent datatype list in ATS is declared as follows:
datatype list(t@ype+, int) = | {a:t@ype} list_nil(a, 0) of () // [of ()] is optional | {a:t@ype}{n:nat} list_cons(a, n+1) of (a, list(a, n))
There two data (or value) constructors list_nil and list_cons associated with list, which are declared to be of the following types:
Given a type T and a static integer I, the type list(T, I) is for values representing lists of length I in which each element is of the type T. Hence, the types of list_nil and list_cons mean that list_nil forms a list of length 0 and list_cons forms a list of length n+1 if applied to an element and a list of length n. Note that it is also possible to declare list as follows in a more concise style:datatype list(a:t@ype+, int) = | list_nil(a, 0) of () // [of ()] is optional | {n:nat} list_cons(a, n+1) of (a, list(a, n))
As an example of programming with dependent datatypes, the following code implements a function template for computing the length of a given list:
fun{ a:t@ype } list_length {n:nat} .<n>. (xs: list(a, n)): int(n) = case+ xs of | list_nil() => 0 | list_cons(_, xs1) => 1 + list_length<a>(xs1) // end of [list_length]
Given that the two clauses typecheck properly, the case-expression containing them can be assigned the type int(n). Therefore, the definition of list_length typechecks.
As the recursive call in the body of the above defined function list_length is not a tail-call, the function may not be able to handle a long list (e.g., one that contains 1 million elements). The following code gives another implementation for computing the length of a given list:
fun {a:t@ype} list_length{n:nat} .<>. (xs: list(a, n)): int(n) = let // // loop: {i,j:nat} (list(a, i), int(j)) -> int(i+j) // fun loop{i,j:nat} .<i>. // .<i>. is a termination metric (xs: list(a, i), j: int(j)): int(i+j) = case+ xs of | list_cons(_, xs1) => loop(xs1, j+1) | list_nil() => j // end of [loop] // in loop(xs, 0) end // end of [list_length]
This time, list_length is based on a tail-recursive function loop and thus can handle lists of any length in constant stack space. Note that the type assigned to loop indicates that loop takes a list of length i and an integer of value j for some natural numbers i and j and returns an integer of value i+j. Also, loop is verified to be terminating.
There is also a dependent datatype option in ATS for forming optional values:
As an example, the following function template list_last tries to find the last element in a given list:
fun{ a:t@ype } list_last {n:nat} .<>. ( xs: list(a, n) ) : option(a, n > 0) = let // fun loop {n:pos} .<n>. ( xs: list(a, n) ) : a = let val+ list_cons(_, xs1) = xs in case+ xs1 of | list_nil() => let val+list_cons(x, _) = xs in x end // end of [list_nil] | list_cons(_, _) => loop(xs1) end // end of [loop] // in case+ xs of | list_nil() => None() | list_cons _ => Some(loop(xs)) end // end of [list_last]
The inner function loop is evidently tail-recursive and it is verified to be terminating.
After a programmer becomes familar with list and option, there is little incentive for him or her to use list0 and option0 anymore. Internally, values of list and list0 have exactly the same representation and there are cast functions of zero run-time cost in ATS for translating between values of list and list0. The same applies to values of option and option0 as well.