ATS is a rich language with a highly expressive type system, and the syntax of ATS can be a daunting obstacle for beginners trying to understand ATS code. In order to alleviate this problem, we may employ colors to differentiate various syntatical components in a program written in ATS. The convention for coloring ATS syntax is summarized as follows:
ATS consists of a static component (statics), where types are formed and reasoned about, and a dynamic component (dynamics), where programs are constructed and evaluated.
There are also some primitive constants c in the statics, each of which is assigned a constant sort (or c-sort, for short) of the following form:
Intuitively, given static terms s_1,...,s_n of sorts sigma_1,...,sigma_n, respectively, c(s_1,...,s_n) is a static term of sort sigma if c is assigned the c-sort (sigma_1,...,sigma_n) => sigma. For instance, each boolean value is given the sort () => bool and each integer is given the c-sort () => int. In the following table, we list some of the commonly used constants in the statics of ATS:
~ | : | (int) => int | ||
+ | : | (int, int) => int | ||
- | : | (int, int) => int | ||
* | : | (int, int) => int | ||
/ | : | (int, int) => int | ||
< | : | (int, int) => bool | ||
<= | : | (int, int) => bool | ||
> | : | (int, int) => bool | ||
>= | : | (int, int) => bool | ||
== | : | (int, int) => bool | ||
<> | : | (int, int) => bool | ||
~ | : | (bool) => bool | ||
&& | : | (bool, bool) => bool | ||
|| | : | (bool, bool) => bool | ||
bool | : | () => type | ||
bool | : | (bool) => type | ||
int | : | () => type | ||
int | : | (int) => type |
Note that many symbols are overloaded in ATS. For instance, ~ is for negation on integers as well as on booleans; bool and int are both sorts and static constants. In ATS, a type refers to a static term s of sort type. For instance, bool and bool(true) are types, and int and int(2+3) are types, too.
sortdef nat = {a:int | a >= 0}It is important to note that a subset sort is not regarded as a (regular) sort. The sole purpose of introducing subset sorts is to provide a form of syntactic sugar (to be used together with quantifiers), which is to be explained shortly. Following are some commonly used subset sorts:
sortdef two = {a:nat | a < 2} sortdef sgn = {a:int | -1 <= a && a <= 1} sortdef sgn = {a:int | a == -1 || a == 0 || a == 1} // another definition sortdef pos = {a:int | a > 0} sortdef neg = {a:int | a < 0}where && and || stand for conjunction and disjunction, respectively. Note that we may also use semicolon ; for conjunction. For instance, the subset sort sgn can be defined as follows:
sortdef sgn = {a:int | -1 <= a; a <= 1}
The syntax for literal characters, literal integers, literal floats and literal strings is the same as that specified in the ANSI C.
The use of dependent types in ATS is ubiquitous. Given a boolean b, we can form a type bool(b) and the only value that can be assigned this type is the boolean value b. In other words, bool(b) is a singleton type. Similarly, we can form a type int(i) for each integer i and the only value that can be assigned this type is the integer value i. The dependent types Bool and Int, which are for boolean values and integer values, respectively, can be defined as follows in ATS:
typedef Bool = [a:bool] bool(a) typedef Int = [a:int] int(a)where we use [...] for existential quantification. Given an integer i, we can form a type string(i) for strings of length i. The type String for strings is defined as follows:
typedef String = [a:nat] string (a)
#if 0 #then // [#then] is optional (all the code here is commented out) #endifThe only requirement is that the code that is commented out must represent a list of syntactically correct declarations in ATS.
The code used for illustration is available here.
In ATS, we use the filename extensions ".sats" and ".dats" to indicate static and dynamic files, respectively. These two extensions have some special meaning attached to them and thus should not be replaced arbitrarily.
A static file may contain sort definitions, datasort declarations, static definitions, abstract type declarations, exception declarations, datatype declarations, macro definition, interfaces for dynamic values and functions, etc. In terms of functionality, a static file in ATS is similar to a header file (with the extension ".h") in C or an interface file (with the extension ".mli") in Objective Caml.
A dynamic file may contain everything in a static file. In addition, it may also contain defintions for dynamic values and functions.
In general, the syntax for constructing code in a static file can also be used for constructing code in a dynamic file. The only exception involves declaring interfaces for dynamic values and functions. For instance, in a static file, we can use the following syntax to declare interfaces (or types) for a value named pi and a function named area_of_circle.
val pi : double fun area_of_circle (radius: double): doubleWhen doing the same thing in a dynamic file, we need to use the following slightly different syntax:
extern val pi : double extern fun area_of_circle (radius: double): doublewhere extern is a keyword in ATS.
As a convention, we often use the filename extension ".cats" to indicate that a file contains some C code that is supposed to be combined with ATS code in certain manner. However, the use of this filename extension is not mandatory.
As is in C, file inclusion in ATS can be done by using the directive #include. For instance, the following line indicates that a file named foobar is included, that is, this line is to be replaced with the content of the file foobar:
#include "foobar.hats"Note that the included file is parsed according to the syntax for statics or dynamics depending whether the file is included in a static or dynamic file. As a convention, the name of an included file often ends with the extension ".hats".
A common use of file inclusion is to keep some constants, flags or parameters being defined consistently across a set of files. For instance, the file prelude/params.hats serves such a purpose.
File inclusion can also be used to emulate (in a limited but rather useful manner) the feature of functors (a module-level language construct) as is supported in languages such as SML and Objective Caml. Examples illustrating such a use of file inclusion can be found elsewhere.
The command for compiling ATS programs is atscc, which is stored in the directory "$ATSHOME/bin/" after ATS/Anairiats is installed. The ATS compiler translates ATS programs into C code, which is then compiled by the GNU C compiler gcc.
atscc accepts all the flags gcc recognizes and passes them to gcc directly. The following is a list of flags that are special to atscc:
As an example, the following command only typechecks the ATS code in "foo.dats" and "bar.sats":
atscc -tc foo.dats bar.satswhile the following command compiles "foo.dats" and "bar.sats" into "foo_dats.c" and "bar_sats.c", respectively, if typechecking "foo.dats" and bar.sats" succeeds:
atscc -cc foo.dats bar.satsTypically, atscc is used as follows:
atscc -o foobar -O3 foo.dats bar.satsWhen this command is issued, atscc first generates "foo_dats.c" and "bar_sats.c", and then invokes gcc to compile "foo_dats.c" and "bar_sats.c" to generate an executable file named "foobar". The flag "-O3" requires that gcc perform all optimizations classified at level 3 or below. We may also issue the following command to produce some debugging information for tools like gdb and valgrind:
atscc -o foobar -g foo.dats bar.satsATS programs can run with or without (systematic) garbage collection. If garbage collection is needed, then the following command can be issued to generate the executable "foobar":
atscc -o foobar -O3 foo.dats bar.sats -D_ATS_GCATS
For your information, the command atscc is not implemented in a scripting language. It is one of the very first programs implemented in ATS/Geizella (and then ported to ATS/Anairiats). The code for atscc (and several other commands) can be found at utils/scripts/.
The main function in ATS is declared as follows:
fun main {n:int | n > 0} (argc: int n, argv: &(@[string][n])): void = "mainats"So this function takes an integer argc greater than 0 and a string array argv of size argc and returns no value. The syntax argv: &(@[string][n]) indicates that argv is a call-by-reference argument. If we followed C++ kind of syntax, then this would be written as something like &argv: @[string][n].
The name "mainats", which is global, can be used in C code to refer to this function. When a program in ATS is compiled that implements the main function in ATS, the following implementation of the main function in C is automatically included in the C code generated by the ATS compiler:
main (int argc, char *argv[]) { // some initialization code is included here mainats (argc, argv) ; return 0 ; }As an example, the following ATS program prints out the command line on the standard output:
implement main (argc, argv) = let fun loop {n,i:nat | i <= n} // [loop] is tail-recursive (i: int i, argc: int n, argv: &(@[string][n])): void = if i < argc then begin if i > 0 then print (' '); print argv.[i]; loop (i+1, argc, argv) end // end of [if] // end of [loop] in loop (0, argc, argv); print_newline () end // end of [main]There are also situations where the function mainats may need to be implemented in C. If this happens, the function main_dummy needs to be implemented as follows:
implement main_dummy () = ()This allows the compiler to generate proper code for the main function in C.
As an example, we present as follows a typical scenario in GTK+ programming, where the function gtk_init needs to be called to modify the arguments passed from a command line:
// some function implemented in ATS extern fun main_work {n:pos} (argc: int n, argv: &(@[string][n])): void = "main_work" implement main_work (argc, argv) = ( (* some ATS code that does the main work *) ) implement main_dummy () = () // indicating [mainats] being implemented in C %{$ ats_void_type mainats (ats_int_type argc, ats_ptr_type argv) { gtk_init ((int*)&argc, (char ***)&argv) ; main_work (argc, argv) ; return ; } /* end of [mainats] */ %}
The code used for illustration is available here.
Given a function f of arity n, the standard syntax for applying f to n arguments v_1, ..., v_n is f(v_1, ..., v_n). It is, however, allowed in ATS to use infix notation for a binary function application, and prefix or postifix notation for a unary function application.
In ATS, each identifier can be assigned one of the following fixities: prefix, infix and postfix. The fixity declarations for many commonly used identifiers can be found in prelude/fixity.ats. We often use the name operator to refer to an identifier that is assigned a fixity.
For instance, we use the following syntax to declare that + and - are infix operators of precedence value 50:
infixl 50 + -After this declaration, we can write an expression like 1 + 2 - 3, which is parsed into -(+(1, 2), 3) in terms of the standard syntax for function application. The keyword infixl indicates that the declared infix operators are left-associative. For right-associative and non-associative infix operators, please use the keywords infixr and infix, respectively. If the precedence value is omitted in a fixity declaration, it is assumed to equal 0.
We can also use the following syntax to declare that iadd, fadd, padd and uadd are left-associative infix operators with a precedence value equal to that of the operator +:
infixl (+) iadd fadd padd uaddThis is useful as it is difficult in practice to remember the precedence values of (a large collection of) declared operators. Sometimes, we may need to specify that the precedence value of one operator in relation to that of another one. For instance, the following syntax declares that opr2 is a left-associative infix operator and its precedence value is that of opr1 plus 10:
infixl (opr1 + 10) opr2If the plus sign (+) is changed to the minus sign (-), then the precedence value of opr2 is that of opr1 minus 10.
We can also turn an identifier opr into a non-associative infix operator (of precedence value 0) by putting the backslash symbol \ in front of it. For instance, the expression $exp_1 \opr $exp_2 stands for opr ($exp_1, $exp_2), where $exp_1 and $exp_2 refer to some expressions, either static or dynamic.
The syntax for declaring (unary) prefix and postfix operators are similar. For instance, the following syntax declares that ~ and ? are prefix and postfix operators of precedence values 61 and 69, respectively:
prefix 61 ~ postfix 69 ?Please find an example involving a postfix operator here.
For a given occurrence of an operator, we can deprive it of its assigned fixity by simply putting the keyword op in front of it. For instance 1 + 2 - 3 can be writen as op- (op+ (1, 2), 3). It is also possible to (permanently) deprive an operator of its assigned fixity. For instance, the following syntax does so to the operators iadd, fadd, padd and uadd:
nonfix iadd fadd padd uaddLastly, please note that fixity declarations are lexically scoped, and each fixity declaration is only effective within its legal scope.
A symbol in ATS can be overloaded with multiple functions. The following syntax introduces a symbol with the name foo for overloading.
symintr foo // symbol introduction for overloadingSuppose that foo1, foo2 and foo3 are names of three functions in ATS. Then we can overload foo with these three functions as follows:
overload foo with foo1 overload foo with foo2 overload foo with foo3An overloaded function symbol is resolved according to the number of arguments it takes and, if needed, the types of these arguments.
The code used for illustration is available here.
There are two kinds of macros in ATS. One kind is C-like and the other kind is LISP-like, though they are much simpler as well as weaker than their counterparts in C and LISP, respectively.
The following two declarations bind the identifiers N1 and N2 to the abstract syntax trees (not strings) representing 1024 and N1 + N1, respectively:
#define N1 1024 #define N2 N1 + N1Suppose we have the following value declaration appearing in the scope of the above macro delarations:
val x = N1 * N2Then N1 * N2 first expands into 1024 * (N1 + N1), which further expands into 1024 * (1024 + 1024). Note that if this example is done in C, then N1 * N2 expands into 1024 * 1024 + 1024, which is different from what we have here. Also note that it makes no difference if we reverse the order of the previous macro definitions:
#define N2 N1 + N1 #define N1 1024If we declare a marco as follows:
#define LOOP (LOOP + 1)then an infinite loop is entered (or more precisely, some macro expansion depth is to be reached) when the identifier LOOP is expanded.
Macros in Long Form As a macro in short form can simply be considered a special kind of macro in long form, we first give some explanantion on the latter. A macro definition in long form is introduced via the use of the keyword macrodef. For instance, the following syntax introduces a macro name one that refers to some code, that is, abstract syntax tree (AST) representing the integer number 1.
macrodef one = `(1)The special syntax `(...), where no space is allowed between the backquote "`" and the left parenthsis "(", means to form an abstract syntax tree representing what is written inside the parentheses. This is often referred to as backquote-notation. Intuitively, one may think that a backquote-notation exerts an effect that "freezes" everything inside it. Let us now define another macro as follows:
macrodef one_plus_one = `(1 + 1)The defined macro name one_plus_one refers to some code (i.e., AST) representing 1 + 1. At this point, it is important to stress that the code representing 1 + 1 is different from the code representing 2. The macro name one_plus_one can also be defined as follows:
macrodef one_plus_one = `(,(one) + ,(one))The syntax ,(...), where no space is allowed between the comma "," and the left parenthesis "(", indicates the need to expand (or evaluate) whatever is written inside the parentheses. This is often referred to as comma-notation. A comma-notation is only allowed inside a backquote-notation. Intuitively, a comma-notation cancels out the "freezing" effect of the enclosing backquote-notation.
In addition to macro names, we can also define macro functions. For instance, the following syntax introduces a macro function square_mac:
macrodef square_mac (x) = `(,(x) * ,(x)) // [x] should refer to some codeHere are some examples that make use of square_mac:
fun square_fun (i: int): int = ,(square_mac `(i)) fun area_of_circle_fun (r: double): doubld = 3.1416 * ,(square_mac `(r))Macros in Short Form The previous macro function square_mac can also be defined as follows:
macdef square_mac (x) = ,(x) * ,(x) // [x] should refer to some codeThe keyword macdef introduces a macro definition in short form. The previous examples that make use of square_mac can now be written as follows:
fun square_fun (i: int): int = square_mac (i) fun area_of_circle_fun (r: double): doubld = 3.1416 * square_mac (r)In terms of syntax, a macro function in short form is just like an ordinary function. In general, if a unary macro function fmac in short form is defined as as follows:
macdef fmac (x) = $exp_defthen one may essentially think that a macro definition in long form is defined as follows:
macrodef fmac_long (x) = `($exp_def) // please note the backquoteand each occurrence of fmac($exp_arg) is automatically rewritten into ,(fmac_long(`($exp_arg))). Note that macro functions in short form with multiple arguments are handled in precisely the same fashion. The primary purpose for introducing macros in short form is to provide a form of syntax that seems more accessible. While macros in long form can be defined recursively (as is to be explained later), macros in short form cannot.
Recursive Macro Definitions (to be written later)
The code used for illustration is available here.
It is possible in ATS to differentiate at the level of types functions without environment from functions with environment. This is an indispensable feature for interfacing functions in C directly inside ATS as these functions all have no environment. A function with an environment is often referred to as a closure.
fn add (x: int):<cloref> int -<cloref> int = begin let fn add_x (y: int):<cloref> int = x + y in add_x end end // end of [add]The syntax <cloref> indicates that the defined functions add and add_x are both (persistant) closure references. Applying add to a given integer i, we obtain a unary function that adds i to its argument. This function is represented as a pair (add_x_env, [x -> i]), where we use [x->i] for something often referred to as an environment that binds x to i, and add_x_env for the (toplevel) function defined in the following (pseudo) code:
fun add_x_env (env, y) = env.x + yNote that the (pseudo) syntax env.x stands for the selection of the value to which x is bound in the environment env.
We use the name closure to refer to a pair like (add_x_env, [x -> i]). Given that functions may occur as arguments (of other functions) in ATS, it is necessary that all functions be represented uniformly as closures (if functions without environments cannot be differentiated from functions with environments at compile-time). For instance, the previously defined (toplevel) function add needs to be represented as a pair (add_env, []), where [] stands for the empty environment and add_env is the function defined in the following (pseudo) code:
fun add_env (env, x) = (add_x_env, [x -> x])In functional languages like ML and Haskell, all functions are represented as closures. Unfortunately, this requirement for representing all functions as closures, can cause a serious difficulty when we try to use in ATS a higher-order function implemented in C. Let us see a concrete example.
The function qsort is declared in <stdlib.h> with the following type:
void qsort(void *base, size_t nmemb, size_t size, int(*compar)(const void *, const void *));Clearly, qsort demands that its fourth argument be a function (not a closure). In order to use qsort in ATS directly, we need a way to construct functions represented as code pointers (instead of closures).
val foo : (T_1, ...., T_n) -<cloref> T_0
val bar : (T_1, ...., T_n) -<fun> T_0 fun bar (x_1: T_1, ...., x_n: T_n): T_0The previously mentioned function qsort can be given the following type in ATS:
fun qsort {a:viewt@ype} {n:nat} (base: & @[a][n], nmemb: size_t n, size: sizeof_t a, compar: (&a, &a) -<fun> int): voidThis type indicates that qsort itself is a function without environment and its fourth argument is also a function without environment. As an example, the following code implements a simple test on qsort:
fn test_qsort () = let fun pr_loop {n,i:nat | i <= n} .<n-i>. (A: &(@[int][n]), n: size_t n, i: size_t i): void = if i < n then begin if i > 0 then print ", "; print A.[i]; pr_loop (A, n, i+1) end // end of [if] // end of [pr_loop] // creating a linear array of size 10 val (pf_gc, pf_arr | p_arr, asz) = $arrsz {int} (1, 9, 2, 8, 3, 7, 4, 6, 5, 0) val () = (print "before quicksort:\n") val () = pr_loop (!p_arr, asz, 0) val () = print_newline () val () = begin qsort {int} (!p_arr, asz, sizeof<int>, lam (x, y) => compare (x, y)) end // end of [val] val () = (print "after quicksort: ") val () = pr_loop (!p_arr, asz, 0) val () = print_newline () in array_ptr_free {int} (pf_gc, pf_arr | p_arr) end // end of [test_qsort]
fn add0 (x: int):<fun> int -<cloref> int = begin let fn add0_x (y: int):<cloref> int = x + y in add0_x end end // end of [add0]The syntax :<fun> is an annotation indicating that the type ascribed to add0 is int -<fun> (int -<cloref> int). If we change int -<cloref> int into int -<fun> int, then an error is reported at compile-time as the function add0_x does require a nonempty environment that binds the variable x to some value. If add0 is needed in a place where a closure is expected, we can simply write lam x => add0 (x) instead.
When a function is declared via the keyword fun or fn, it is assumed by default that the function is without environment. For instance, the following code is equivalent to the previous implementation of add0:
fn add0 (x: int): int -<cloref> int = begin let fn add0_x (y: int):<cloref> int = x + y in add0_x end end // end of [add0]In contrast, the following code, which is used at the beginning of this tutorial, implements add as a closure:
fn add (x: int):<cloref> int -<cloref> int = begin let fn add_x (y: int):<cloref> int = x + y in add_x end end // end of [add]
The code used for illustration is available here.
As in C, a funciton in ATS may also take an indefinite number of arguments. Let us use the function printf in ATS, which corresponds to the function of the same name in C, as an example to explain this feature.
The type of printf is given as follows:
fun printf : {ts:types} (printf_c ts, ts) -> voidWe use printf_c for a type constructor that forms types for format strings (in C) when applied to lists of types. For instance, printf_c(char, double, int) is a type for format strings that require a character, a double, and an integer to be supplied. Given a character c, a double d and an integer i, @(c, d, i) is an argument of types (char, double, int), and the following expression is well-typed in ATS:
printf ("c = %c and d = %f and i = %i", @(c, d, i))
The type of the format string
"c = %c and d = %f and i = %i" is computed to be
printf_c (char, double, int)
and then @(c, d, i) is checked to be of the type
(char, double, int). Note that a format string must be a constant
in order for its type to be computed during typechecking.
As an example, we present as follows a program that prints out a multiplication table for single digits:
#define N 9
implement main (argc, argv) = loop1 (0) where {
// [loop1] and [loop2] are verified to be terminating based on the supplied metrics
// [.< N-i, 0 >.] is a termination metric
// Ignore it if you have not learned this feature yet
fun loop1 {i:nat | i <= N} .< N-i, 0 >. (i: int i): void =
if i < N then loop2 (i+1, 0) else ()
// end of [loop1]
// [.< N-i, N+1-j >.] is a termination metric
// Ignore it if you have notlearned this feature yet
and loop2 {i,j:nat | i <= N; j <= i} .< N-i, i-j+1 >. (i: int i, j: int j): void =
if j < i then begin
if (j > 0) then print '\t';
printf ("%1d*%1d=%2.2d", @(j+1, i, (j+1) * i));
loop2 (i, j+1)
end else begin
print_newline (); loop1 (i)
end // end of [if]
// end of [loop2]
} // end of [main]
The following text is the output of the program:
1*1=01 1*2=02 2*2=04 1*3=03 2*3=06 3*3=09 1*4=04 2*4=08 3*4=12 4*4=16 1*5=05 2*5=10 3*5=15 4*5=20 5*5=25 1*6=06 2*6=12 3*6=18 4*6=24 5*6=30 6*6=36 1*7=07 2*7=14 3*7=21 4*7=28 5*7=35 6*7=42 7*7=49 1*8=08 2*8=16 3*8=24 4*8=32 5*8=40 6*8=48 7*8=56 8*8=64 1*9=09 2*9=18 3*9=27 4*9=36 5*9=45 6*9=54 7*9=63 8*9=72 9*9=81
The code used for illustration is available here.
Probably the single most important optimization performed by the ATS/Anairiats compiler is the translation of tail-recursive function calls into direct (local) jumps.
// [sum1] is recursive but not tail-recursive
fun sum1 (n: int): int = if n > 0 then n + sum1 (n-1) else 0
This function is recursive but not tail-recursive. The stack space it
consumes is proportional to the value of its argument. Essentially,
the ATS compiler translates the definition of
sum1 into the following C code:
int sum1 (int n) { if (n > 1) return n + sum1 (n-1) ; else return 1 ; }When applied to an integer n, the following defined function sum2 also sums up integers from 1 to n.
fn sum2 (n: int): int = let // sum2 is non-recursive // [loop] is tail-recursive fun loop (n: int, res: int): int = if n > 0 then loop (n-1, res+n) else res // end of [loop] in loop (n, 0) end // end of [sum2]The inner function loop in the definition of sum2 is tail-recursive. The stack space consumed by loop is a constant independent of th value of the argument of sum2. Essentially, the ATS compiler translates the definition of sum2 into the following C code:
int sum2_loop (int n, int res) { loop: if (n > 1) { res = res + n ; n = n - 1; goto loop; } else { // do nothing } return res ; } int sum2 (int n) { return sum2_loop (n, 0) ; }
// [fn*] indicates the need to combine two or more functions
// so as to translate tail-recursive calls into direct jumps
fn* even (n: int): bool = if n > 0 then odd (n-1) else true
and odd (n: int): bool = if n > 0 then even (n-1) else false
The keyword fn* is used to indicate to the ATS compiler
that the functions even and
odd need to be combined together so as to turn (mutually)
tail-recursive function calls into direct jumps. Essentially, the ATS
compiler emits the following C code after compiling this example:
bool even_odd (int tag, int n) { bool res ; switch (tag) { 0: goto even ; 1: goto odd ; default : exit (1) ; } even: if (n > 0) { n = n - 1; goto odd; } else { res = true; goto done; } odd: if (n > 0) { n = n - 1; goto even; } else { res = false; goto done; } done: return res ; } /* end of [even_odd] */ bool even (int n) { return even_odd (0, n) ; } bool odd (int n) { return even_odd (1, n) ; }Note that mutually recursive functions can be combined in such a manner only if they all have the same return type. In the above case, both even and odd have the same return type bool.
When translating C code involving embedded loops, we often encounter mutual tail-recursion. For instance, the following C code prints out ordered pairs of digits:
int main (int argc, char *argv[]) { int i, j ; for (i = 0; i <= 9; i += 1) { for (j = i; j <= 9; j += 1) { if (i < j) printf (", ") ; printf ("(%i, %i)", i, j) ; } /* for */ printf ("\n") ; } /* for */ return 0 ; }A straightforward translation of the C code into ATS (in functional style) is given as follows:
implement main (argc, argv) = let
fn* loop1 {i:nat} (i: int i): void =
if i <= 9 then loop2 (i, i) else ()
// end of [loop1]
and loop2 {i,j:nat} (i: int i, j: int j): void =
if j <= 9 then begin
if i < j then begin
print ", "; printf ("(%i, %i)", @(i, j)); loop2 (i, j+1)
end // end of [if]
end else begin
print_newline (); loop1 (i+1)
end // end of [if]
// end of [loop2]
in
loop1 0
end // end of [main]
where the mutually tail-recursive funtions loop1 and loop2
correspond to the outer and inner loops in the C code, respectively.
The code used for illustration is available here.
In ATS, the programmer is allowed to supply termination metrics for verifing the termination of recursively defined functions. This is really an indispensable feature for supporting programming with theorem proving as proof functions, namely, functions representing proofs, must be proven to be pure and terminating.
A termination metric is a tuple (M1, ..., Mn) of natural numbers, where n >= 0 . We use the standard well-founded lexicographical ordering on natural numbers to order such tuples.
// [fact] implements the factorial function
fun fact {n:nat} .< n >. (n: int n): Int = if n > 0 then n * fact (n-1) else 1
The syntax .< n >. indicates that the metric supplied for verifying
the termination of the defined function is a singleton tuple (n). In
the definition of fact, the metric for the recursive call to
fact is (n-1), which is strictly less than (n). So the
function fact is terminating.
// [gcd] computes the greates common divisors of two positive integers
fun gcd {m,n:int | m > 0; n > 0} .< m+n >.
(m: int m, n: int n): [r:nat | 1 <= r; r <= min(m, n)] int r =
if m > n then gcd (m - n, n)
else if m < n then gcd (m, n - m)
else m
The syntax .< m+n >. indicates that the termination metric
(m+n) should be used to verify that the defined function gcd
is terminating. In the definition of gcd, the termination metric for
the first recursive call to gcd is (m-n)+n=m, which is
strictly less than the original termination metri m+n (as n
is positive); the termination metric for the second recursive call to
gcd is m+(n-m)=n, which is also strictly less than the
original termination metric m+n (as m is positive). Thus,
gcd is a terminating function.
As another example, we implement as follows the Ackermann's function:
// [ack] implements the Ackermann's function
fun ack {m,n:nat} .< m, n >.
(m: int m, n: int n): Nat =
if m > 0 then
if n > 0 then ack (m-1, ack (m, n-1)) else ack (m-1, 1)
else n+1
The syntax .< m, n >. indicates that the termination metric is
a pair of natural numbers: (m, n). We use the lexicographical
ordering on natural numbers to compare such metrics.
To verify that ack is terminating, we need to solve the
following constraints:
// mutually recursive functions
fun isEven {n:nat} .< 2*n+2 >. (n: int n): bool =
if n > 0 then ~(isOdd n) else true
and isOdd {n:nat} .< 2*n+1 >. (n: int n): bool =
if n > 0 then isEven (n-1) else false
Clearly, we may also verify the termination of these two functions by using
the metrics
.< n, 1 >.
and
.< n, 0 >.
for isEven and isOdd, respectively.
fun foo ():<> void and bar ():<> voidMoreover, suppose that the following implementation of foo is given in a file named foo.dats:
implement foo () = $Bar.bar ()while the following implementation of bar is given in another file named bar.dats:
implement bar () = $Foo.foo ()Clearly, neither foo nor bar is terminating. In practice, it is difficult to resolve this issue of calling cycles among terminating functions by solely relying on termination metrics. Instead, atscc can generate run-time code for detecting calling cycles among terminating functions if the flag -D_ATS_TERMINATION_CHECK is present. For instance, if foo.dats and bar.dats are compiled as follows:
atscc -D_ATS_TERMINATION_CHECK foo.dats and bar.datsthen a run-time error is to be reported to indicate a calling cycle when either foo.dats or bar.dats is loaded dynamically.
The code used for illustration is available here.
The type system of ATS can currently track the following list of effects that a program may incur during its execution:
The code used for illustration is available here.
Parametric polymorphism (or polymorphism for short) offers a flexible and effective approach to supporting code reuse. For instance, given a pair (v1, v2) where v1 is a a boolean and v2 a character, the function swap_bool_char defined below returns a pair (v2, v1):
fun swap_bool_char (xy: @(bool, char)): @(char, bool) = (xy.1, xy.0)Now suppose that a pair of integers need to be swapped, and this results in the implementation of the following function swap_int_int:
fun swap_int_int (xy: @(int, int)): @(int, int) = (xy.1, xy.0)The code duplication between swap_bool_char and swap_int_int is obvious, and it can be easily avoided by implementing a function template as follows:
fun{a,b:t@ype} swap (xy: @(a, b)): @(b, a) = (xy.1, xy.0)Now the functions swap_bool_char and swap_int_int can simply be replaced with swap<bool,char> and swap<int,int>, respectively. The function template swap cannot be compiled into executable binary code directly as the sizes of type variables a and b are unknown: The special sort t@ype is for classifying types whose sizes are unspecified. If swap<T1,T2> is used for some types T1 and T2 of known sizes, then an instantiation of swap is created where type variables a and b are replaced with T1 and T2, respectively, and then compiled into executable binary code. For those who know the feature of templates in C++, this should sound rather familiar. In contrast to swap, swap_type_type is defined below as a polymorphic function (rather than a function template):
fun swap_type_type {a,b:type} (xy: @(a, b)): @(b, a) = (xy.1, xy.0)This function can be compiled into executable binary code as the sizes of type variables a and b are known: The special sort type is for classifying types whose sizes equal exactly one word, that is, the size of a pointer. For example, the size of a string is one word, and the size of any declared datatype is also one word. Given strings s1 and s2, an application of swap_type_type to @(s1, s2) can be written as follows:
swap_type_type {string,string} @(s1, s2)where the expression {string,string} is often referred to as a static argument. As in this case, most static arguments do not have to be provided explicitly since they can be automatically inferred. However, such static arguments, if provided, can often enhance the quality and precision of the error messages reported in case of typechecking failure. This is a topic to be explored elsewhere in great depth.
extern fun{a,b:t@ype} swap (xy: @(a, b)): @(b, a)Just like a declared function interface, a declared template interface can be implemented. For instance, the following code implements the interface declared for the swap function template:
implement{a,b} swap (xy) = (xy.1, xy.0)This form of template implementation is often referred to as generic template implementation in contrast to specialized template implementation presented as follows. It is also allowed to implement specialized templates in ATS. For instance, the following code implements the above swap function template that is specialized with the type variables a and b being set to int and int, respectively:
implement swap<int,int> (xy) = let val x = xy.0 and y = xy.1; val s = x + y in (s - x, s - y) end // end of [swap<int,int>]
The code used for illustration is available here.
Lists are by far the most commonly used data structure in functional programming. We say that a data structure is persistent if it is heap-allocated and can only be freed by the GC. In contrast, a data structure is said to be linear if it is either stack-allocated or heap-allocated and can be freed by the user as well as by the GC.
The datatype for persistent lists in ATS is declared as follows:
datatype list (a:t@ype+, int) = // t@ype+: covariant
| list_nil (a, 0)
| {n:int | n >= 0} list_cons (a, n+1) of (a, list (a, n))
Given a type T and an integer I, the type list(T, I)
is for lists of length I in which each element is of type T.
Let us first introduce some abbreviations for the list constructors:
#define nil list_nil #define cons list_cons #define :: list_consThe following syntax creates a list consisting of 1, 2 and 3:
cons (1, cons (2, cons (3, nil ()))) // [nil ()] can be replaced with [nil]This kind of syntax is a bit unwieldy if longer lists need to be handled, and some alternatives are given as follows:
'[1, 2, 3] // the first character is quote (') $lst (1, 2, 3) // this is equivalent to '[1, 2, 3] $lst {Nat} (1, 2, 3) // [Nat] is given as the type for the list elementsThe interfaces for various functions on lists can be found in the file prelude/SATS/list.sats.
We now present some simple programs involving lists. The following code implements a function template that computes the length of a given list:
// This implementation is not tail-recursive
fun{a:t@ype} length {n:nat} (xs: list (a, n)): int n =
case+ xs of _ :: xs => 1 + length xs | nil () => 0
This is a rather poor implementation as it is not tail-recursive.
A better one, which is tail-recursive, is given as follows:
// This implementation is tail-recursive
fn{a:t@ype} length {n:nat} (xs: list (a, n)): int n = let
fun loop {i,j:int} (xs: list (a, i), j: int j): int (i+j) =
case+ xs of _ :: xs => loop (xs, j+1) | nil () => j
in
loop (xs, 0)
end // end of [length]
The code used for illustration is available here.
The keywords for introducing value identifiers and variable identifiers are val and var, respectiveily. The essential difference between a value identifier and a variable identifier is that the value referred to by the former cannot be changed during the course of evaluation while the value referred to by the latter can. This difference is clearly reflected in the following two styles of implementation of the factorial function:
// functional style fn fact_val (x: int): int = loop (x, 1) where { fun loop (x: int, res: int): int = if x > 0 then loop (x-1, x * res) else res // end of [loop] } // end of [fact_val] // imperative style fn fact_var (x: int): int = let var x: int = x; var res: int = 1 val () = while (x > 0) (res := x * res; x := x - 1) in res end // end of [fact_var]In a functional language such as ML, where variable identifiers are not available, an imperative style of implementation of the factorial function may have to be written as follows:
// imperative style based on persistent references, which looks // awkward and runs inefficiently (in terms of both time and memory) fn fact_ref (x: int): int = let val x = ref<int> (x); val res = ref<int> (1) val () = while (!x > 0) (!res := !x * !res; !x := !x - 1) in !res end // end of [fact_ref]An implementation as such is often written by a beginner in functional programming who has previously programmed in imperative languages (e.g., C, C++, Java). The function fact_ref is clearly inefficient as each call to it needs to allocate two references on heap (corresponding to x and res), which can only be reclaimed by GC later.
We give another implementation of the factorial function as follows that involves passing the addresses of local variables as function parameters:
fun loop {x,res:addr} (pf_x: !int @ x, pf_res: !int @ res | p_x: ptr x, p_res: ptr res): void = if !p_x > 0 then begin !p_res := !p_x * !p_res; !p_x := !p_x - 1; loop (pf_x, pf_res | p_x, p_res) end // end of [loop] // end of [loop] fn fact_var2 (x: int): int = let var x: int = x; var res: int = 1 in loop (view@ x, view@ res | &x, &res); res end // end of [fact_var2]Each variable identifier is assoicated with two pieces of properties: its address L, which is referred to by the same identifier, and a proof of VT@L, where VT is the viewtype of the content stored at L. For instance, in the implementation of fact_var2, we use view@ x for the proof associated with the variable x and &x for the address of x.
For each variable that is declared of viewtype VT, it is required that a proof of VT?@L is available at the end of the legal scope of the variable, where L is the address of the variable. This requirement guarantees that a local variable, while its address can be taken out of its scope, can never be accessed out of its scope due to the unavailability of a proof needed for accessing the address.
The code used for illustration is available here.
The feature of call-by-reference in ATS is similar to the corresponding one in C++. What is special in ATS is the way in which this feature is handled by the type system. In general, if f is given a type of the following form for some viewtypes VT1 and VT2:
(..., &VT1 >> VT2, ...) -> ...then a function call f(..., x, ...) on some variable x of the viewtype VT1 is to change the viewtype of x into VT2 upon its return. In the case where VT1 and VT2 are the same, &VT1 >> VT2 can simply be written as &VT1. Note that the variable x may be replaced with other forms of left-values.
As an example, an implementation of the factorial function is given as follows that makes use of call-by-reference:
fun fact (x: int): int = let fun loop {l:addr} (x: int, res: &int): void = if x > 0 then (res := res * x; loop (x-1, res)) else () var res: int = 1 in loop (x, res); res end // end of [fact]Note that if the line for introducing the variable res in the implementation is replaced with the following one:
val res: int = 1 // [res] is now a value, not a variable!then a type error should occur as res is no longer a left-value when it is passed as an argument to loop.
The code used for illustration is available here.
A significant achievement of ATS lies in its support for safe and flexible use of pointers. This is done in a programming paradigm that is often referred to as programming with theorem proving.
Generally speaking, stateful views are linear propositions for describing memory layouts. For instance, given a type T and a memory location L, T@L is a primitive (stateful) view stating that a value of type T is stored at L. We can also form compound views in terms of primitive views. For instance, given types T_1 and T_2 and an address L, we can form a view (T_1@L, T_2@L+sizeof(T_1)) to mean that a value of type T_1 and another value of type T_2 are stored at addresses L and L+sizeof(T_1), respectively, where sizeof(T_1) is the size of a value of type T. Given a term of some view V, we often say that the term proves the view V and thus refer to the term as a proof (of V).
There are two built-in functions ptr_get_t and ptr_set_t that are given the following types:
fun{a:t@ype} ptr_get_t : {l:addr} (a @ l >> a @ l | ptr l) -<> a fun{a:t@ype} ptr_set_t : {l:addr} (a? @ l >> a @ l | ptr l, a) -<> voidThese two functions are used to read from and write to a given pointer. Clearly, the type of ptr_get_t indicates that ptr_get_t requires a proof of view T@L for some type T when reading from a pointer L. This requirement disallows reading from a dangling pointer as such a proof cannot be found for any dangling pointers. Similarly, the type of ptr_set_t means that writing to a dangling pointer is also disallowed. When reading from a pointer L with a proof of view T@L, ptr_get_t consumes the proof and then generates a proof of the same view (and stores it in the same variable where the orginal proof was stored). On the other hand, when writing a value of type T to a pointer with a proof of view T?@L, where the type T? is for possibly uninitialized values of type T, ptr_set_t consumes the proof and then generates a proof of the view T@L, which simply attests to the fact that a value of type T is stored at L after the writing is done.
As an example, we implement a (template) function swap1 as follows that swaps the contents stored at two memory locations:
fn{a:viewt@ype} swap1 {l1,l2:addr} (pf1: !a @ l1, pf2: !a @ l2 | p1: ptr l1, p2: ptr l2): void = let val tmp = ptr_get_vt<a> (pf1 | p1) in ptr_set_vt<a> (pf1 | p1, ptr_get_vt (pf2 | p2)); ptr_set_vt<a> (pf2 | p2, tmp) endNote that the (linear) proofs are manipulated explicitly in the implementation of swap1. This can be burdensome in practice. In ATS/Anairiats, (certain) proof manipulation can also be done implicitly by the typechecker. For instance, the following code implements a (template) function swap2 that is equivalent to swap1:
fn{a:viewt@ype} swap2 {l1,l2:addr} (pf1: !a @ l1, pf2: !a @ l2 | p1: ptr l1, p2: ptr l2): void = let val tmp = !p1 in !p1 := !p2; !p2 := tmp endThere is no explicit manipulation of (linear) proofs in the implementation of swap2.
The code used for illustration is available here.
A reference is essentially a persistent array of size 1. Given a viewtype VT, the type for references to values of viewtype VT is ref(VT). The type constructor ref is abstract in ATS, though it can be defined as follows:
typedef ref (a: viewt@ype) = [l:addr] (vbox (a@l) | ptr l)The interfaces for various functions on references can be found in the file prelude/SATS/reference.sats.
Reference Creation There is a function template ref_make_elt of the following type:
fun{a:viewt@ype} ref_make_elt : a -<> ref aGiven a value v of some viewtype VT, ref_make_elt<VT> (v) creates a reference of type ref(VT). For instance, the following code creates some references:
val r_int = ref_make_elt<int> (0) val r_double = ref_make_elt<double> (0.0) val r_fun = ref_make_elt<int->int> (lam x => x+1)There is also a function ref_make_view_ptr of the following type for turning pointers into references:
fun ref_make_view_ptr : {a:viewt@ype} {l:addr} (vbox(a @ l) | ptr l) -> ref aNote that ref_make_view_ptr is a polymorphic function; it is not a function template.
Reference Read and Write Given a reference r, the syntax for reading from r is !r and the syntax for writing (a value v) to r is !r := v. Note that in both SML and Objective Caml, the syntax for writing (a value v) to a reference r is r := v (instead of !r := v).
There are also two function templates ref_get_elt and ref_set_elt of the following types:
fun{a:t@type} ref_get_elt : ref a -<!ref> a fun{a:t@type} ref_set_elt : (ref a, a) -<!ref> voidwhich can used to read from and write to a given reference, respectively. As an example, we give a simple implementation of counters as follows:
// [int64] is the type for 64-bit integers in ATS typedef counter = '{ // the type for counter objects get= () -<cloref1> int64 // get the value of the counter , set= int64 -<cloref1> void // set the value of the counter , inc= () -<cloref1> void // increase the value of the counter , dec= () -<cloref1> void // decrease the value of the counter } fn counter_make (): counter = let // [int64_of_int] coerce an integer into a 64-bit integer val r = ref_make_elt<int64> (int64_of_int 0) in '{ // record creation get= lam () => !r // read from [r] , set= lam (x) => !r := x // write to [r] , inc= lam () => !r := succ !r , dec= lam () => !r := pred !r } end // end of [counter_make]
The code used for illustration is available here.
Though arrays are widely used in practice, there are many difficult issues involving arrays that seem to have not been adequately addressed in type theory. In particular, programming can become rather tricky when arrays are used to store resources (e.g., pointers to allocated memory).
In ATS, there are two forms of arrays: linear arrays and persistent arrays. On one hand, a linear array cannot be shared but a persistent array can. On the other hand, a linear array can be explicitly freed but a persistent one cannot. However, a persistent array may be freed by GC. Although persistent arrays are built in terms of linear arrays, it seems easier to explain persistent arrays as such arrays are available in probably any programming language that may be considered practical.
fun{a:viewt@ype} array_size {n:nat} (A: array (a, n)): int nIf the size of such an array is needed, it may be stored somewhere explicitly by the programmer.
Array Creation There are several approaches to creating persistent arrays in ATS. For instance, the following code in ATS creates an array of integers:
val digits = array $arrsz{int}(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)The type assigned to the variable digits is array(int, 10). We can certainly be more precise by stating that digits is indeed an array of digits:
typedef digit = [i:nat | i < 10] int (i) // digits: array (digit, 10) val digits = array $arrsz{digit}(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
Array Subscripting The following two function templates array_get_elt_at and array_set_elt_at are for accessing and updating a persistent array, respectively:
fun{a:t@ype} array_get_elt_at {n:nat} (A: array (a, n), i: sizeLt n):<!ref> a fun{a:t@ype} array_set_elt_at {n:nat} (A: array (a, n), i: sizeLt n, x: a):<!ref> voidThe symbol [] is overloaded with both array_get_elt_at and array_set_elt_at:
overload [] with array_get_elt_at overload [] with array_set_elt_atAs an example, we implement as follows a function that squares each element in a given array of doubles:
fn array_square {n:nat} (A: array (double, n), sz: int n): void = loop (0) where { fun loop {i:nat | i <= n} .<n-i>. (i: int i):<cloptr1> void = if i < sz then let val x = A[i] in A[i] := x * x; loop (i+1) end else () } // end of [array_square]In this code, A[i] and A[i] := x * x are shorthands for array_get_elt_at(A, i) and array_set_elt_at(A, i, x * x), respectively.
The interfaces for various functions on matrices can be found in the file prelude/SATS/matrix.sats . Given a viewtype VT and an integer I and another integer J, the type matrix(VT, I, J) is for persistent matrices of dimensions I (row) and J (column) in which each element is of type VT. Internally, a value of the type matrix(VT, I, J) is just a pointer to a piece of consecutive memory holding I*J elements of the type VT, where the elements are placed in the row-major format.
Matrix Creation The approaches to creating matrices resemble those to creating arrays. For instance, the following code creates a square matrix of dimension 10 by 10:
val mat_10_10 = matrix (10, 10) $arrsz {Int} ( 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 , 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 , 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 , 30, 31, 32, 33, 34, 35, 36, 37, 38, 39 , 40, 41, 42, 43, 44, 45, 46, 47, 48, 49 , 50, 51, 52, 53, 54, 55, 56, 57, 58, 59 , 60, 61, 62, 63, 64, 65, 66, 67, 68, 69 , 70, 71, 72, 73, 74, 75, 76, 77, 78, 79 , 80, 81, 82, 83, 84, 85, 86, 87, 88, 89 , 90, 91, 92, 93, 94, 99, 96, 97, 98, 99 ) // end of [val]Matrix Subscripting The following two function templates matrix_get_elt_at and matrix_set_elt_at are for accessing and updating a persistent matrix, respectively:
fun{a:t@ype} matrix_get_elt_at {m,n:nat} (A: matrix (a,m,n), i: sizeLt m, n: int n, j: sizeLt n): <!ref> a fun{a:t@ype} matrix_set_elt_at {m,n:nat} (A: matrix (a,m,n), i: sizeLt m, n: int n, j: sizeLt n, x: a): <!ref> voidNote that the number of columns in a matrix as well as the row and column indices are needed in order to perform matrix subscripting. As an example, the following code implements a function template that transposes a given square matrix in-situ:
fn{a:t@ype} matrix_transpose {n:nat} (A: matrix (a, n, n), n: int n): void = let fn get {i,j:nat | i < n; j < n} (A: matrix (a, n, n), i: int i, j: int j):<cloref1> a = matrix_get_elt_at__intsz (A, i, n, j) fn set {i,j:nat | i < n; j < n} (A: matrix (a, n, n), i: int i, j: int j, x: a):<cloref1> void = matrix_set_elt_at__intsz (A, i, n, j, x) overload [] with get; overload [] with set // [fn*] indicates a request for tail-call optimization fn* loop1 {i:nat | i <= n} .< n-i+1, 0 >. (i: int i):<cloptr1> void = begin if i < n then loop2 (i, 0) else () end and loop2 {i,j:nat | i < n; j <= i} .< n-i, i-j+1 >. (i: int i, j: int j):<cloptr1> void = begin if j < i then let val x = A[i,j] in A[i,j] := A[j,i]; A[j,i] := x; loop2 (i, j+1) end else begin loop1 (i+1) end end // end of [loop2] in loop1 0 end // end of [matrix_transpose] fn{a:t@ype} matrix_transpose {n:nat} (M: matrix (a, n, n), n: size_t n): void = let prval pf = unit_v () val () = matrix_iforeach_clo<a> {unit_v} (pf | M, !p_f, n, n) where { var !p_f = @lam (pf: !unit_v | i: sizeLt n, j: sizeLt n, _: a): void =<clo,!ref> if i > j then let val x = M[i, n, j] in M[i, n, j] := M[j, n, i]; M[j, n, i] := x end // end of [if] // end of [var] } // end of [val] prval unit_v () = pf in // empty end // end of [matrix_transpose]
The code used for illustration is available here.
Given a viewtype VT and integer I, the special static term @[VT][I] is a viewtype for an array of elements of the type VT (and thus the size of @[VT][I] is I times the size of VT). If VT happens to be a type (instead of viewtype), then @[VT][I] is also a type. If a linear proof of the view (@[VT][I]) @ L is available, then we say that a linear array of size I is stored at the location L in which each element is of the type VT. A view of the form @[VT][I] @ L is often referred to as an array view.
The interfaces for various functions on linear arrays can be found in the file prelude/SATS/array.sats. In particular, we have the following functions for manipulating array views:
viewdef array_v (a:viewt@ype, n:int, l: addr) = @[a][n] @ l praxi array_v_nil : {a:viewt@ype} {l:addr} () -<prf> array_v (a, 0, l) praxi array_v_unnil : {a:viewt@ype} {l:addr} array_v (a, 0, l) -<prf> void praxi array_v_cons : {a:viewt@ype} {n:nat} {l:addr} (a @ l, array_v (a, n, l+sizeof a)) -<prf> array_v (a, n+1, l) praxi array_v_uncons : {a:viewt@ype} {n:int | n > 0} {l:addr} array_v (a, n, l) -<prf> (a @ l, array_v (a, n-1, l+sizeof a))
fun{a:viewt@ype} array_ptr_alloc {n:nat} (asz: int n):<> [l:addr | l <> null] (free_gc_v (a, n, l), array_v (a?, n, l) | ptr l)The view constructor free_gc_v is abstract, and a proof of the view free_gc_v (a, n, l) can be thought of as a certificate that must be provided when the allocated array is to be freed. The array view array_v (a?, n, l) is for an uninitialized array of size n in which each element should be of the type a if initialized.
fn{a:t@ype} array_ptr_initialize_list {n:nat} (A: &(@[a?][n]) >> @[a][n], xs: list (a, n)) :<> void = loop (view@ A | &A, xs) where { fun loop {n: nat} {l:addr} .<n>. ( // [loop] is tail-recursive! pf_arr: !array_v (a?, n, l) >> array_v (a, n, l) | p_arr: ptr l, xs: list (a, n) ) :<> void = case+ xs of | list_cons (x, xs) => let prval (pf1_at, pf2_arr) = array_v_uncons {a?} (pf_arr) val () = !p_arr := x val () = loop (pf2_arr | p_arr + sizeof<a>, xs) in pf_arr := array_v_cons {a} (pf1_at, pf2_arr) end // end of [list_cons] | list_nil () => let prval () = array_v_unnil {a?} (pf_arr) in pf_arr := array_v_nil {a} () end // end of [list_nil] // end of [loop] } // end of [array_ptr_initialize_list]Note that the first argument of the function array_ptr_initialize_list is passed as a reference (call-by-reference). We use the expression view@ A for some particular left-value in which a proof of the view associated with the variable A is stored, and the expression &A for the address of the variable A.
fn{a:t@ype} array_ptr_swap {n:nat} {i,j:nat | i < n; j < n} (A: &(@[a][n]), i: int i, j: int j):<> void = begin let val tmp = A.[i] in A.[i] := A.[j]; A.[j] := tmp end end // endof [array_ptr_swap]The following two function templates are also available:
fun{a:t@ype} array_ptr_get_elt_at {n:nat} (A: &(@[a][n]), i: natLt n):<> a fun{a:t@ype} array_ptr_set_elt_at {n:nat} (A: &(@[a][n]), i: natLt n, x:a):<> void overload [] with array_ptr_get_elt_at overload [] with array_ptr_set_elt_atBecause of the overloading of [] with both array_ptr_get_elt_at and array_ptr_set_elt_at, the function array_ptr_swap can also be implemented as follows:
fn{a:t@ype} array_ptr_swap {n:nat} {i,j:nat | i < n; j < n} (A: &(@[a][n]), i: int i, j: int j):<> void = begin let val tmp = A[i] in A[i] := A[j]; A[j] := tmp end end // endof [array_ptr_swap]The use of [i] for .[i] possibly makes the code look more conventional.
For accessing an array of linear elements, the following function template array_ptr_takeout can be used:
fun{a:viewt@ype} array_ptr_takeout {n,i:nat | i < n} {l0:addr} ( pf: array_v (a, n, l0) | base: ptr l0, offset: int i ) :<> [l:addr] ( a @ l , a @ l -<lin,prf> array_v (a, n, l0) | ptr l ) // end of [array_ptr_takeout]
fun array_ptr_free {a:viewt@ype} {n:int} {l:addr} (pf_gc: free_gc_v (a, n, l), pf_arr: array_v (a?, n, l) | _: ptr l):<> voidNote that in order to free an array of I elements allocated at an address L, a proof of the view free_gc_v (VT, I, L) must be provided, where the viewtype VT is employed for specifying the size of each element in the array.
The code used for illustration is available here.
The representation of a string in ATS is exactly the same as in C, allowing most functions declared in the header file <string.h> to be imported into ATS directly. A variety of string functions are declared in prelude/SATS/string.sats.
Given an integer I, a string of length I is a pointer to some memory location where a seqence of bytes are stored: the first I bytes in this sequence are non-null bytes, and the next one is the null byte, and then the rest of bytes are unspecified. We use string both as a type and a type constructor of the sort int -> type. The type string is for strings of unspecified length while the type string (I) is for strings whose length equals I, and the following casting function is relating the former to the latter:
castfn string1_of_string (s: string): [n:nat] string n
fun string_is_at_end {n,i:nat | i <= n} (s: string n, i: int i):<> bool (i == n) = "atspre_string_is_at_end"Given a string of length n and a natural number i bounded by n, the function string_is_at_end tests whether i equals n. Essentially, it checks whether the ith character in the string, where counting starts from 0, is the null byte.
As an example, the following implementation of the length function for strings makes use of the function string_is_at_end:
fn string_length {n:nat} (s: string n):<> size_t n = loop (s, 0) where { fun loop {i:nat | i <= n} .<n-i>. (s: string n, i: size_t i):<> size_t n = if string_is_at_end (s, i) then i else loop (s, i+1) // end of [loop] } // end of [string_length]The following two functions string_get_char_at and string_set_char_at are for accessing and updating charaters stored in a string:
#define NUL '\000' fun string_get_char_at {n:nat} (s: string n, i: natLt n):<> [c:char | c <> NUL] char c fun string_set_char_at {n:nat} {c:char | c <> NUL} (s: string n, i: natLt n, c: char c):<!ref> void overload [] with string_get_char_at overload [] with string_set_char_atNote that only non-null characters can be stored in a string.
fun strbuf_is_at_end {m,n,i:nat | i <= n} (sb: &strbuf (m, n), i: int i):<> bool (i == n) fun strbuf_get_char_at {m,n:nat} (sb: &strbuf (m, n), i: natLt n):<> [c:char | c <> NUL] char c fun strbuf_set_char_at {m,n:nat} {c: char | c <> NUL} (sb: &strbuf (m, n), i: natLt n, c: char c):<> voidAs an example, the following program implements a function that turns each lowercase letter in a string buffer into the corresponding uppercase one:
fn strbuf_toupper {m,n:nat} (s: &strbuf (m, n)):<> void = loop (s, 0) where { extern fun toupper (c: c1har):<> c1har = "atspre_char_toupper" fun loop {i:nat | i <= n} .<n-i>. (s: &strbuf (m, n), i: size_t i):<> void = if strbuf_is_at_end (s, i) then () else let val () = s[i] := toupper (s[i]) in loop (s, i+1) end // end of [if] } // end of [strbuf_toupper]Note that c1har is just a shorthand for [c:char | c <> NUL] char c, which is a type for all non-null characters. Let us see another example. In C, the length of a string can be computed as follows:
int strlen (char *s) { char *p = s ; while (*p != '\000') ++p ; return (p - s) ; }We give as follows an implementation for computing the length of a string buffer that corresponds to the above C code:
extern fun strlen {m,n:nat} (s: &strbuf (m, n)):<> size_t n implement strlen {m,n} (s) = let stadef bsz = sizeof(byte) macdef bsz = sizeof<byte> fun loop {m,n:nat} {l:addr} {ofs:int} .<m>. ( pf: !strbuf (m, n) @ l , pf_mul: MUL (n, bsz, ofs) | p: ptr l ) :<> ptr (l + ofs) = let prval (pf1, pf2) = strbuf_v_uncons (pf) val c = !p in if (c = NUL) then let prval strbufopt_v_none pf2 = pf2 prval MULbas () = pf_mul in pf := strbuf_v_null (pf1, pf2); p end else let prval () = eqsize_byte_char () prval strbufopt_v_some pf2 = pf2 prval pf1_mul = mul_add_const {~1} (pf_mul) val p_end = loop (pf2, pf1_mul | p+bsz) in pf := strbuf_v_cons (pf1, pf2); p_end end // end of [if] end // end of [loop] val p_beg = &s prval pf_mul = mul_istot {n,bsz} () val p_end = loop (view@ s, pf_mul | p_beg) prval () = eqsize_byte_one () where { extern praxi eqsize_byte_one (): [bsz == 1] void } // end of [val] prval () = mul_elim {n,1} (pf_mul) in size1_of_ptrdiff1 (p_end - p_beg) end // end of [strlen]Please see prelude/SATS/string.sats for various dataview constructors and proof functions in this implementation. In general, it can be quite involved in ATS to handle strings in a style that is often employed in C code.
fun string_make_substring {n:int} {st,ln:nat | st + ln <= n} (s: string n, st: int st, ln: int ln):<> string lnIn order to implement a string creation function, we often need a function such as the following one for initializing a string buffer:
fun strbuf_initialize_cloptr {m,n:nat | m > n} ( pf_buf: !b0ytes m @ l >> strbuf (m, n) @ l | p: ptr l, n: int n, f: !natLt n -An implementation of strbuf_initialize_cloptr is given below:c1har ) :<> void
implement strbuf_initialize_cloptr {m,n} (pf_buf | p_buf, n, f) = loop (pf_buf | p_buf, f, 0) where { fun loop {i:nat | i <= n} {l:addr} .<n-i>. ( pf: !b0ytes (m-i) @ l >> strbuf (m-i, n-i) @ l | p: ptr l, f: !sizeLt n -<cloptr> c1har, i: size_t i) :<cloref> void = let prval () = eqsize_byte_char () prval (pf1, pf2) = array_v_uncons {byte?} (pf) prval pf1 = char_v_of_b0yte_v (pf1) in if i < n then let val () = !p := f (i) val () = loop (pf2 | p + sizeof<byte>, f, i + 1) prval () = pf := strbuf_v_cons (pf1, pf2) in // empty end else let val () = !p := NUL prval () = pf := strbuf_v_null (pf1, pf2) in // empty end // end of [if] end // end of [loop] } // end of [val]With strbuf_initialize_cloptr, the string creation function string_make_substring can be readily implemented as follows:
implement string_make_substring (str, st, ln) = let val (pf_gc, pf_buf | p_buf) = malloc_gc (ln + 1) prval () = free_gc_elim (pf_gc) // give the certificate to GC val () = strbuf_initialize_cloptr (pf_buf | p_buf, ln, lam (i) => str[st + i]) in string1_of_strbuf (pf_buf | p_buf) end // end of [string_make_substring]Note that a linear closure (cloptr) representing lam (i) => str[st + i] is formed for each call to strbuf_initialize_cloptr and this closure is freed once the call returns. The string created by a call to string_make_substring is persistent, and it can only be freed by GC.
The code used for illustration is available here.
The feature of datatypes in ATS is directly adopted from ML. In addition to what is available in ML, we can also form in ATS dependent datatypes and guarded recursive datatypes (GRDTs), also known as generalized algebraic datatypes (GADTs). There is also a form of linear datatypes (dataviewtypes) available in ATS, which is to be explained elsewhere.
datatype intlst0 = // simple datatype | INTLST0nil // the first bar (|) is optional | INTLST0cons of (int, intlst0) fun length_intlst0 (xs: intlst0): int = case+ xs of // [case+] demands exhaustive pattern matching // the bar (|) in the first clause is optional | INTLST0cons (_, xs) => 1 + length_intlst0 (xs) | INTLST0nil () => 0 // end of [length_intlst0]Please note that any valid identifiers for variables can also be used as identifiers for constructors. The pattern for a constructor C with no arguments is C( ). If we write C instead of C( ), then C is assumed to be a variable pattern, which matches any value.
datatype intlst1 (int) = // dependent datatype | INTLST1nil (0) // the first bar (|) is optional | {n:nat} INTLST1cons (n+1) of (int, intlst1 n) // end of [intlst1]We may see intlst1 as a refinement of intlst0. Given an integer n, intlst1(n) is a type for integer lists of length n. The syntax for declaring intlst1 introduces two constructors of the following types:
INTLST1nil : intlst1 0 INTLST1cons : {n:nat} (int, intlst1 n) -> intlst (n+1)So INTLST1nil is a list of length 0 and INTLST1cons constructs a list of length n+1 if given an integer and an integer list of length n.
The function for computing the length of a given integer list can now be implemented as follows:
fun length_intlst1 {n:nat} (xs: intlst1 n): int n = case+ xs of // the bar (|) in the first clause is optional // the bar (|) in the first clause is optional | INTLST1cons (_, xs) => 1 + length_intlst1 (xs) | INTLST1nil () => 0 // end of [length_intlst1]Note that length_intlst1 is assigned the following type:
{n:nat} intlst1 n -> int nwhich clearly indicates that the returned value of a call to length_intlst1 is the length of the argument of the call.
As another example, the following code implements the list subscripting function:
// the index is a natural number less than the size of the indexed length fun nth_intlst1 {n,i:int | 0 <= i; i < n} (xs: intlst1 n, i: int i): int = // [val+] demands exhaustive pattern matching let val+ INTLST1cons (x, xs) = xs in if i > 0 then nth_intlst1 (xs, i-1) else x end // end of [let] // end of [nth_intlst1]Note that the pattern matching involved in the value declaration val+ INTLST1cons (x, xs) = xs is guaranteed to be exhaustive as n is positive. As a consequence, there is no tag checking in the C code generated from the compilation of nth_intlst. This simple example demonstrates that safety can actually enhance efficiency.
// [t@ype] is a sort for types of unrestricted size datatype list (a:t@ype+, int) = // polymorphic datatype | nil (a, 0) | {n:nat} cons (a, n+1) of (a, list (a, n))The syntax used for declaring list introduces two constructors of the following types:
nil : {a:t@ype} list (a, 0) cons : {a:t@ype} {n:nat} (a, list (a, n)) -> list (a, n+1)We use a:t@ype+ (instead of a:t@ype) to indicate that the type constructor list is covariant at this argument, that is, list (T1, n) is considered to be a subtype of list (T2, n) whenever T1 is a subtype of T2.
The function for appending two polymorphic lists can be implemented as follows:
fun{a:t@ype} append_list {m,n:nat} (xs: list (a, m), ys: list (a, n)): list (a, m+n) = case+ xs of | cons (x, xs) => cons (x, append_list (xs, ys)) | nil () => ys // end of [append_list]This function definition is a template. It can be typechecked but cannot be compiled until the type variable a is instantiated with a type of known size. The feature of templates in ATS is to be explained elsewhere.
As another example, the following code implements a (template) function that zips together two given lists of the same length:
fun{a1,a2:t@ype} zip_list {n:nat} (xs1: list (a1, n), xs2: list (a2, n)): list ('(a1, a2), n) = case+ (xs1, xs2) of | (cons (x1, xs1), cons (x2, xs2)) => cons ('(x1, x2), zip_list (xs1, xs2)) | (nil (), nil ()) => nil () | (_, _) =/=>> () // none of these cases can occur // end of [zip_list]Note that the last clause in the definition of zip_list is presented only for the purpose of illustration, and it can simply be omitted.
The code used for illustration is available here.
A prop is similar to a type. If a prop is assigned to a dynamic term, then the term is guaranteed to be pure and total, that is, the evaluation of the term generates no effects and always terminates. We refer to dynamic terms classified by props as proof terms, or simply proofs. Dataprops are often declared for encoding recursively defined relations. For instance, multiplication on integers can be defined as follows in terms of integer addition:
0 * n | = | 0 | ||
(m+1) * n | = | m * n + n | ; if m >= 0 | |
m * n | = | -(-m * n) | ; if m < 0 | |
Let MUL(m, n, p) be a relation on integers such that MUL(m, n, p) holds if and only if m * n = p. Then this relation can be encoded by the followng dataprop:
dataprop MUL (int, int, int) = | {n:int} MULbas (0, n, 0) | {m,n,p:int | m >= 0} MULind (m+1, n, p+n) of MUL (m, n, p) | {m,n,p:int | m > 0} MULneg (~m, n, ~p) of MUL (m, n, p) // end of [MUL]Proposition (Totality) For each pair of integers m and n, there exists an integer p such that m * n = p holds.
This proposition can be encoded as the following prop in ATS:
{m,n:int} () -< prf > [p:int] MUL (m, n, p)which is assigned to the function MULprop_total defined below:
prfun MULprop_total {m,n:int} .< max(2*m,~2*m+1) >. (): [p:int] MUL (m, n, p) = sif m > 0 then MULind (MULprop_total {m-1,n} ()) else sif m < 0 then MULneg (MULprop_total {~m,n} ()) else MULbas () // end of [MULprop_total]Note that we use sif for constructing a static conditional in which the condition is a static proposition, that is, a static term of sort bool. The conditions for the two occurrences of sif are m > 0 and m < 0, which are both static propositions.
Given that the prop {m,n:int} () -< prf > [p:int] MUL (m, n, p) is inhabited, we know that the proposition it encodes must hold.
Proposition (Uniqueness) Given integers m,n,p1,p2 such that m * n = p1 and m * n = p2, then p1 = p2 holds.
This proposition can be encoded as the following prop in ATS:
{m,n,p1,p2:int} (MUL (m, n, p1), MUL (m, n, p2)) -< prf > [p1 == p2] voidwhich is assigned to the function MULprop_unique defined below:
prfun MULprop_unique {m,n,p1,p2:int} .< max(2*m, ~2*m+1) >. (pf1: MUL (m, n, p1), pf2: MUL (m, n, p2)): [p1 == p2] void = case+ (pf1, pf2) of | (MULbas (), MULbas ()) => () | (MULind pf1, MULind pf2) => MULprop_unique (pf1, pf2) | (MULneg pf1, MULneg pf2) => MULprop_unique (pf1, pf2) // end of [MULprop_unique]Therefore, the prop is inhabited and thus the proposition it encodes must hold.
The distributivity (over addition), commutativity and associativity of multiplication can all be esstablished similarly.
The code used for illustration is available here.
A view is a linear prop, where the meaning of the word linear is in the sense of linear logic. Dataviews are often declared to encode recursively defined relations on linear resources. As an example, we declare a dataview as follows to describe and then reason about the probably most common data structure in use: arrays.
dataview array_v (a: viewt@ype+, int, addr) = | {l:addr} array_v_none (a, 0, l) | {n:nat} {l:addr} array_v_some (a, n+1, l) of (a @ l, array_v (a, n, l+sizeof a))The syntax introduces a view constructor array_v and two proof constructors array_v_none and array_v_some associated with array_v, which are assigned the following props:
array_v_none : {a:viewt@ype} {l:addr} () -> array_v (a, 0, l) array_v_some : {a:viewt@ype} {n:nat} {l:addr} (a @ l, array_v (a, n, l+sizeof a)) -> array_v_some (a, n+1, l)Given a viewtype VT, an integer I and an address L, the view array_v(VT, I, L) indicates that an array of size I is located at the address L and each element in the array is of viewtype VT.
The prop assigned to array_v_none indicates that we can assume an array of size 0 located at any given address, while the prop assigned to array_v_some states that an array of I+1 elements of viewtype VT is located at some address L if an element of viewtype VT is located at L and also an array of I elements of viewtype VT is located at L+sizeof(VT), where sizeof(VT) is the size of VT, or more precisely, the size of an element of viewtype VT.
Let us establish some propoerties on arrays. The following proof function array_v_split shows that an array of size N can be decomposed into two adjacently located arrays of sizes I and N - I, respectively, where I can be any natural number satisfying I <= N.
(* * array_v_split * {a:viewt@ype} {n,i:nat | i <= n} {l:addr} {off: int} .<i>. * (pf_arr: array_v (a, n, l), pf_mul: MUL (i, sizeof a, off)) * : (array_v (a, i, l), array_v (a, n-i, l+off)) *) prfun array_v_split {a:viewt@ype} {n,i:nat | i <= n} {l:addr} {off: int} .<i>. (pf_arr: array_v (a, n, l), pf_mul: MUL (i, sizeof a, off)) : (array_v (a, i, l), array_v (a, n-i, l+off)) = sif i > 0 then let prval array_v_some (pf_at, pf_arr) = pf_arr prval MULind pf_mul = pf_mul prval (pf_arr1, pf_arr2) = array_v_split (pf_arr, pf_mul) in (array_v_some (pf_at, pf_arr1), pf_arr2) end else let prval MULbas () = pf_mul in (array_v_none (), pf_arr) endThe next proof function array_v_unsplit is the inverse of array_v_split: It states that two adjacently located arrays of sizes N_1 and N_2 can be combined into a single array of sizes N_1+N_2.
(* * array_v_unsplit * {a:viewt@ype} {n1,n2:nat} {l:addr} {off: int} .<n1>. * (pf_arr1: array_v (a, n1, l), * pf_arr2: array_v (a, n2, l+off), * pf_mul: MUL (n1, sizeof a, off)) * : array_v (a, n1+n2, l) *) prfun array_v_unsplit {a:viewt@ype} {n1,n2:nat} {l:addr} {off: int} .<n1>. (pf_arr1: array_v (a, n1, l), pf_arr2: array_v (a, n2, l+off), pf_mul: MUL (n1, sizeof a, off)) : array_v (a, n1+n2, l) = sif n1 > 0 then let prval array_v_some (pf_at, pf_arr1) = pf_arr1 prval MULind pf_mul = pf_mul prval pf_arr = array_v_unsplit (pf_arr1, pf_arr2, pf_mul) in array_v_some (pf_at, pf_arr) end else let prval array_v_none () = pf_arr1 prval MULbas () = pf_mul in pf_arr2 endThe following proof function states that given an array of size N and a natural number I satisfying I < N, the element indexed by I can be taken out of the array, leaving the array with a hole inside.
(* * array_v_takeout * {n,i:nat | i < n} {l:addr} {off: int} * (pf_arr: array_v (a, n, l), pf_mul: MUL (i, sizeof a, off)) * : (a @ (l+off), a @ (l+off) -<lin> array_v (a, n, l)) *) prfun array_v_takeout {a:viewt@ype} {n,i:nat | i < n} {l:addr} {off: int} .<i>. (pf_arr: array_v (a, n, l), pf_mul: MUL (i, sizeof a, off)) : (a @ (l+off), a @ (l+off) -<lin> array_v (a, n, l)) = sif i > 0 then let prval array_v_some (pf_at, pf_arr) = pf_arr prval MULind pf_mul = pf_mul prval (pf_out, pf_rst) = array_v_takeout (pf_arr, pf_mul) in (pf_out, llam pf_out => array_v_some (pf_at, pf_rst pf_out)) end else let prval MULbas () = pf_mul prval array_v_some (pf_at, pf_arr) = pf_arr in (pf_at, llam pf_at => array_v_some (pf_at, pf_arr)) endThe following code shows how the proof function array_v_takeout can be used to implement functions for accessing and updating arrays.
extern // a template function for read through a pointer fun{a:t@ype} ptr_get {l:addr} (pf: a @ l | p: ptr l): (a @ l | a) extern // a template function for write through a pointer fun{a:t@ype} ptr_set {l:addr} (pf: a @ l | p: ptr l, x: a): (a @ l | void) fn{a:t@ype} // a template function for array read array_ptr_get_at {n,i:nat | i < n} {l:addr} (pf_arr: array_v (a, n, l) | p: ptr l, i: size_t i) : (array_v (a, n, l) | a) = let val (pf_mul | off) = mul2_size1_size1 (i, sizeof<a>) prval (pf_elt, pf_rst) = array_v_takeout {a} (pf_arr, pf_mul) val (pf_elt | x) = ptr_get<a> (pf_elt | p + off) in (pf_rst pf_elt | x) end fn{a:t@ype} // a template function for array write array_ptr_set_at {n,i:nat | i < n} {l:addr} (pf_arr: array_v (a, n, l) | p: ptr l, i: size_t i, x: a) : (array_v (a, n, l) | void) = let val (pf_mul | off) = mul2_size1_size1 (i, sizeof<a>) prval (pf_elt, pf_rst) = array_v_takeout {a} (pf_arr, pf_mul) val (pf_elt | ()) = ptr_set<a> (pf_elt | p + off, x) in (pf_rst pf_elt | ()) endNote that the code here is primarily for the purpose of illustration. In practice, array access and update functions in ATS/Anairiats are implemented as primtives for the sake of efficiency.
The code used for illustration is available here.
The feature of dataviewtype in ATS is an advanced one.
A viewtype is just a linear type. The name viewtype is chosen as a viewtype often consists of a view and a type. A dataviewtype is similar to a datatype, but it is linear. With a dataviewtype, the programmer is allowed to explicitly free (or deallocate) in a safe manner the memory used for storing constructors associated with the dataviewtype. This is particularly important in a situation where garbage collection needs to be significantly reduced or even completely avoided.
An an example, the following dataviewtype list_vt is declared to model singly-linked lists in ATS:
// [list_vt] is declared to model singly-linked lists dataviewtype list_vt (a:viewt@ype+, int) = | list_vt_nil (a, 0) | {n:nat} list_vt_cons (a, n+1) of (a, list_vt (a, n))Given a viewtype VT and an integer I, the viewtype list_vt(VT, I) is for singly-linked lists of length I in which each element is of viewtype VT. Let us define List_vt as follows:
viewtypedef List_vt (a:viewt@ype) = [n:nat] list_vt (a, n)Then roughly speaking, the viewtype List_vt(VT) corresponds to the following struct type in C:
typedef struct sllist_struct { VT head ; sllist_struct *tail ; } *sllist ;where VT refers to some type in C.
We now assume that nil and cons have been introduced as shorthands for list_vt_nil and list_vt_cons, respectively.
// An implementation of the length function on singly-linked lists fn{a:viewt@ype} list_vt_length {n:nat} (xs0: !list_vt (a, n)): int n = let fun loop {i,j:nat} .< i >. (xs: !list_vt (a, i), j: int j): int (i+j) = case+ xs of | cons (_, !p_xs1) => begin let val n = loop (!p_xs1, j+1) in fold@ xs; n end end | nil () => (fold@ xs; j) in loop (xs0, 0) end // end of [list_vt_length]The function list_vt_length approximately corresponds to the following function sllist_length implemented in C:
int sllist_length (sllist xs) { int i = 0 ; while (xs != NULL) { i += 1 ; xs = xs->tail ; } return i ; }
// An implementation of the append function on singly-linked lists fun{a:viewt@ype} list_vt_append {m,n:nat} (xs0: &list_vt (a, m) >> list_vt (a, m+n), ys: list_vt (a, n)): void = case+ : (xs0: list_vt (a, m+n)) => xs0 of | cons (_, !p_xs) => (list_vt_append (!p_xs, ys); fold@ xs0) | ~nil () => (xs0 := ys) // end of [list_vt_append]The function list_vt_append approximately corresponds to the following function sllist_append implemented in C:
void sllist_append (sllist *p_xs, sllist ys) { sllist xs ; xs = *p_xs ; while (xs != NULL) { p_xs = &xs->tail ; xs = *p_xs ; } *p_xs = ys ; return ; } /* end of [sllist_append] */
// An implementation of the reverse function on singly-linked lists fn{a:viewt@ype} list_vt_reverse {n:nat} (xs: &list_vt (a, n)): void = let fun revapp {m,n:nat} .< m >. (xs: list_vt (a, m), ys: list_vt (a, n)): list_vt (a, m+n) = case+ xs of | cons (_, !p_xs1) => begin let val xs1 = !p_xs1 in !p_xs1 := ys; fold@ xs; revapp (xs1, xs) end end | ~nil () => ys in xs := revapp (xs, nil ()) end // end of [list_vt_reverse]The function list_vt_reverse approximately corresponds to the following function sllist_reverse implemented in C:
void sllist_reverse (sllist *p_xs) { sllist xs, ys, *p_xs1, xs1 ; xs = !p_xs ; ys = NULL ; while (xs != NULL) { p_xs1 = &xs->tail ; xs1 = *p_xs1 ; *p_xs1 = ys ; ys = xs ; xs = xs1 } *p_xs = xs ; return ; } /* end of [sllist_reverse] */
// An implementation of quicksort on singly-linked lists fun{a:viewt@ype} list_vt_qsort {n:nat} .< n, 0 >. (lte: (!a, !a) -> bool, xs: list_vt (a, n)): list_vt (a, n) = case+ xs of | cons (!p_x1, !p_xs1) => let val xs1 = !p_xs1 in list_vt_par<a> ( view@ (!p_x1), view@ (!p_xs1) | lte, xs, p_xs1, p_x1, xs1, nil (), nil () ) end | nil () => (fold@ xs; xs) // end of [list_vt_qsort] and // [list_vt_par] for partition list_vt_par {l0,l1:addr} {p,q,r:nat} .< p+q+r, p+1 >. (pf0: a @ l0, pf1: List_vt a? @ l1 | lte: (!a, !a) -> bool, node: list_vt_cons_unfold (l0, l1), node1: ptr l1, p_x0: ptr l0, xs: list_vt (a, p), l: list_vt (a, q), r: list_vt (a, r)) : list_vt (a, p+q+r+1) = case+ xs of | cons (!p_x1, !p_xs1) => let val xs1 = !p_xs1 in if lte (!p_x1, !p_x0) then begin !p_xs1 := l; fold@ xs; list_vt_par<a> (pf0, pf1 | lte, node, node1, p_x0, xs1, xs, r) end else begin !p_xs1 := r; fold@ xs; list_vt_par<a> (pf0, pf1 | lte, node, node1, p_x0, xs1, l, xs) end // end of [if] end // end of [cons] | ~nil () => let var l = list_vt_qsort<a> (lte, l) and r = list_vt_qsort<a> (lte, r) in !node1 := r; fold@ node; r := node; list_vt_append<a> (l, r); l end // end of [nil] // end of [list_vt_par]
The code used for illustration is available here.
The feature of pattern matching in ATS is adopted from ML. However, this feature becomes much more elaborated in the presence of dependent types as well as linear types. We present as follows several examples to illustrate some typical uses of pattern matching in ATS.
Let us first declare a polymorphic datatype list0 as follows for representing lists:
datatype list0 (a:t@ype) = nil0 (a) | cons0 (a) of (a, list0 a)Notice that this is not a dependent datatype. Suppose we need to implement a function zip0 that should only zip together two lists of the same length. This may be done as follows:
fun{a1,a2:t@ype} // [zip0] is a template zip0 (xs: list0 a1, ys: list0 a2): list0 '(a1, a2) = case+ (xs, ys) of // [case+] indicates the exhaustiveness of pattern matching | (cons0 (x, xs), cons0 (y, ys)) => cons0 ('(x, y), zip0 (xs, ys)) | (nil0 (), nil0 ()) => nil0 ()The problem with this implementation is that a pattern matching failure occurs if the function zip0 is mistakenly applied to two lists of unequal length, resulting in a run-time exception. In practice, a special exception is often declared as follows to handle the case of pattern matching failure, enabling more accurate run-time information to be reported.
exception ZipException fun{a1,a2:t@ype} // [zip0] is a template zip0 (xs: list0 a1, ys: list0 a2): list0 '(a1, a2) = case+ (xs, ys) of // [case+] indicates the exhaustiveness of pattern matching | (cons0 (x, xs), cons0 (y, ys)) => cons0 ('(x, y), zip0 (xs, ys)) | (nil0 (), nil0 ()) => nil0 () | (_, _) => $raise ZipExceptionWe can also declare a polymorphic dependent datatype list1 as follows for representing lists:
datatype list1 (a:t@ype, int) = | nil1 (a, 0) | {n:nat} cons1 (a, n+1) of (a, list1 (a, n))Given a type T and an integer I, the type list1(T, I) is for lists of length I in which each element is of type T. The function for zipping two lists of the same length can now be implemented as follows:
fun{a1,a2:t@ype} // [zip1] is a template zip1 {n:nat} (xs: list1 (a1, n), ys: list1 (a2, n)): list1 ('(a1, a2), n) = case+ (xs, ys) of // [case+] indicates the exhaustiveness of pattern matching | (cons1 (x, xs), cons1 (y, ys)) => cons1 ('(x, y), zip1 (xs, ys)) | (nil1 (), nil1 ()) => nil1 ()Note that the typechecker can verify that pattern matching in this implementation is exhaustive.
Given that pattern matching in ATS is done sequentially, that is, from left to right in a given row and from top to bottom for all the clauses in a given case-expression, it is natural to expect that the following code gives an equivalent implementation of zip1:
fun{a1,a2:t@ype} zip1 {n:nat} (xs: list1 (a1, n), ys: list1 (a2, n)): list1 ('(a1, a2), n) = case+ (xs, ys) of // [case+] indicates the exhaustiveness of pattern matching | (cons1 (x, xs), cons1 (y, ys)) => cons1 ('(x, y), zip1 (xs, ys)) | (_, _) => nil1 ()This, however, is not the case as typechecking in ATS does not assume the sequentiality of pattern matching. In this example, the second pattern matching clause is typechecked indepenent of the first one, resulting in a type error. In order to demand that the second clause be typechecked under the assumption that the first clause is not chosen at run-time, the following special syntax is needed:
| (_, _) =>> nil1 ()The special arrow =>> indicates that a clause formed with this arrow needs to be typechecked under the assumption that all of the previous clauses (in the same case-expression) are not chosen at run-time.
In the C code generated from compiling the function zip1, there are two tag checkes: they check whether the first and the second arguments of zip1 are empty. Clearly, only one tag check is necessary as both arguments are of the same length. We can modify the implementation of zip1 as follows so as to use only one check:
fun{a1,a2:t@ype} // only one tag check zip1 {n:nat} (xs: list1 (a1, n), ys: list1 (a2, n)) : list1 ('(a1, a2), n) = begin case+ xs of | cons1 (x, xs) => begin let val+ cons1 (y, ys) = ys in cons1 ('(x, y), zip1 (xs, ys)) end end | _ =>> nil1 () end // end of [zip1]Note that the keyword val+ indicates that the patthern matching following it is exhaustive and thus needs no tag check. This simple example demonstrates in concrete terms that safety can enhance efficiency.
The code used for illustration is available here.
Exceptions can provide a convenient approach to handling abnormal or erroneous cases. This convenience, however, is not without a cost as programs making use of exceptions can often become rather difficult to reason about.
In ATS, the type for exceptions is denoted by exn, which actually is a viewtype of the sort viewtype. So the size of exn equals the size of a pointer. The type exn may be thought of as a dataviewtype (i.e., linear datatype) with which an extensible set of constructors are associated. The syntax for declaring an exception is rather similar to that for declaring a constructor (associated with a datatype). For instance, three exceptions (or more precisely, exception constructors) are declared as follows:
exception Fail // Fail: exn exception Fail_msg of string // Fail_msg: string -> exn // Fail_msgs : {n:nat} (int n, list (string n)) -> exn exception {n:nat} Fail_msgs of (int n, list (string, n))All exceptions in ATS are static and there is no issue of exceptions being generative as is in Standard ML. It is allowed that an exception be declared in a closed scope so that the declared exception becomes inaccessible outside the closed scope. This is analogous to declaring a static variable inside the body of a function in C.
We present as follows an example that involves an exception being raised and then captured. A binary tree is a Braun tree if it is empty or it satisfies the property that its left and right children are Braun trees and the size of the left child minus the size of the right child equals 0 or 1. The following code implements a function that checks whether a binary tree is a Braun tree:
datatype bt = E | B of (bt, bt) // datatype for binary trees fn isBraunTree (bt0: bt): bool = let exception NotBraunTree fun aux (bt: bt): int = case+ bt of | B (bt1, bt2) => let val ls = aux bt1 and rs = aux bt2 in if (ls >= rs && rs+1 >= ls) then ls+rs+1 else $raise NotBraunTree() end | E () => 0 in try let val s = aux bt0 in true end with ~NotBraunTree() => false endThe type of an exception constructor cannot contain free static variables. In other words, each exception constructor can be lifted to the toplevel. For instance, the following code is illegal: the exception constructor Foo cannot be lifted to the toplevel as its type contains a free type variable.
fun{a:t@ype} f (x: a) = let exception Foo of a // Foo: a -> exn in // ... endSometimes, we may want to raise an exception carrying a value of some variable type. This is shown, for instance, in the following illegal code that tries to implement the list subscripting function:
extern fun{a:t@ype} list_iforeach (xs: List a, f: (Nat, a) -> void): void fn{a:t@ype} nth (xs: List a, n: Nat): a = let exception Found of a fn f (i: Nat, x: a): void = if i = n then $raise (Found x) in try let val () = list_iforeach (xs, f) in $raise ListSubscriptException () end with ~Found x => x // end of [try] end // end of [nth]A method to work around the issue is shown in the following code:
fn{a:t@ype} nth (xs: List a, n: Nat): a = let exception Found of () val ans = ref_make_elt<Option a> (None) fn f (i: Nat, x: a): void = if i = n then (!ans := Some x; $raise Found ()) val () = (try let val () = list_iforeach (xs, f) in $raise ListSubscriptException () end with ~Found () => () ) : void // end of [try] in case !ans of | Some x => x | None () => $raise ListSubscriptException () end // end of [nth]
The code used for illustration is available here.
The core of ATS is a functional language in which functions are first-class values. A higher-order function is a function whose arguments also include functions.
Let us use BT to range over base types such as char, double, int, string, etc. A simple type T is formed according to the following inductive definition:
As an example, we implement as follows a 2nd-order function find_root that takes a function f from integers to integers as its argument and searches for a root of f by enumeration:
fn find_root (f: int -<cloref1> int): int = let fun aux (f: int -<cloref1> int, n: int): int = if f (n) = 0 then n else begin if n <= 0 then aux (f, ~n + 1) else aux (f, ~n) end in aux (f, 0) end // end of [fint_root]The function find_root computes f(0), f(1), f(-1), f(2), f(-2), ... until it finds the first integer i satisfying f(i) = 0.
As another example, we implement the Newton-Raphson's method for finding roots of functions on reals:
val epsilon = 1E-6 (* precision *) // Newton-Raphson's method for finding roots // [f1] is a derivative of [f] fn newton_raphson ( f: double -<cloref1> double , f1: double -<cloref1> double , x0: double ) : double = let fun loop ( f: double -<cloref1> double , f1: double -<cloref1> double , x0: double ): double = let val y0 = f x0 in if abs (y0 / x0) < epsilon then x0 else begin let val y1 = f1 x0 in loop (f, f1, x0 - y0 / y1) end end end // end of [loop] in loop (f, f1, x0) end // end of [newton_raphson] // square root function fn sqrt (c: double): double = newton_raphson (lam x => x * x - c, lam x => 2.0 * x, 1.0) // cubic root function fn cbrt (c: double): double = newton_raphson (lam x => x * x * x - c, lam x => 3.0 * x * x, 1.0)
The code used for illustration is available here.
Though ATS is a language based on eager call-by-value evaluation, it also allows the programmer to perform lazy call-by-need evaluation. In ATS, there is a special language construct $delay that can be used to delay or suspend the evaluation of an expression and a special function lazy_force that can be called to resume a suspended computation.
There is a special type constructor lazy of the sort (t@ype) => type in ATS, which forms a (boxed) type when applied to a type. On one hand, given an expression exp of type T, $delay (exp) forms a value of type lazy(T) that represents the suspended evaluation of exp. On the other hand, given a value v of type lazy(T) for some type T, lazy_force(v) resumes the suspended evaluation represented by v and returns a result of type T (if the resumed evaluation terminates). The interface for the (template) function lazy_force is:
fun{a:t@ype} lazy_force : lazy a -> aThe special operatior ! in ATS is overloaded with the function lazy_force.
In the file prelude/SATS/lazy.sats, the following datatype stream_con and type stream are declared for representing (lazy) streams:
datatype stream_con (a:t@ype+) = | stream_nil (a) | stream_cons (a) of (a, stream a) where stream (a:t@ype) = lazy (stream_con a)Also, a number of common functions on streams are implemented in prelude/DATS/lazy.dats.
To simplify the presentation, we use the following syntax to create shorthands for the stream constructors stream_nil and stream_cons:
#define nil stream_nil #define :: stream_consThe following code implements a stream representing the sequence of natural numbers starting from 2:
typedef N2 = [n:int | n >= 2] int n val N2s: stream N2 = from 2 where { fun from (n: N2): stream N2 = $delay (n :: from (n+1)) }More interestingly, the stream of prime numbers can be readily computed as follows:
fun sieve (ns: stream N2):<1,~ref> stream N2 = // [val-] means no warning message from the compiler let val- n :: ns = !ns in $delay (n :: sieve (stream_filter<N2> (ns, lam x => x nmod n > 0))) end val primes: stream N2 = sieve N2s // find the nth prime fn nprime {n: pos} (n: int n): N2 = stream_nth (primes, n-1)Note that the keyword val- indicates to the typechecker that no message warning of nonexhaustive pattern matching should be reported.
As another example, the following code based on lazy evaluation computes Fibonacci numbers:
val // the following values are defined mutually recursively rec fibs_1: stream int64 = $delay (one :: fibs_2) // fib1, fib2, ... and fibs_2: stream int64 = $delay (one :: fibs_3) // fib2, fib3, ... and fibs_3: stream int64 = // fib3, fib4, ... stream_map2<int64,int64,int64> (fibs_1, fibs_2, lam (x, y) => x + y) // find the nth Fibonacci number (counting starts from 1) fn nfib {n:pos} (n: int n): int64 = stream_nth (fibs_1, n-1)
Please find extensive use of lazy evaluation in the implementation of a package for parsing combinators: contrib/parcomb.
The code used for illustration is available here.
The feature of linear lazy evaluation, which is considerably advanced, addresses the issue of freeing up resources held by a lazy value (that is, a thunk representing a delayed computation). Let us first see a concrete example involving the issue. In prelude/SATS/file.sats the following function char_stream_make_file is declared:
// making a lazy char stream out of a file handle fun char_stream_make_file (fil: FILEref):<1,~ref> stream (char)Note that the syntax <1,~ref> indicates that the function may have all kinds of effects except ref. As suggested by its type, the function char_stream_make_file turns a file handle into a (lazy) stream of characters. We give an implementation of this function as follows:
implement char_stream_make_file (fil) = let val c = fgetc0_err (fil) in if c <> EOF then let val c = char_of_int (c) in $delay (stream_cons (c, char_stream_make_file fil)) end else begin let val () = fclose0_exn (fil) in $delay (stream_nil ()) end end // end of [if] end // end of [char_stream_make_file]Clearly, there is an opened file handle inside each character stream formed by calling char_stream_make_file until all the characters in the file are put into the stream. If such a character stream is discarded during evaluation, then the file handle inside it may be left unclosed indefinitely. This can cause a serious problem in a situation where a large number of character streams need to be formed by calling char_stream_make_file. In general, the inability (or the lack of the ability) to directly free the resources held by lazy values often makes it rather difficult or even infeasible to employ lazy evaluation in a setting that requires great precision in resource management.
// making a linear lazy char stream out of a file handle fun char_stream_vt_make_file {m:file_mode} {l:addr} (pf_mod: file_mode_lte (m, r), pf_fil: FILE m @ l | p_fil: ptr l) :<1,~ref> stream_vt (char)As suggested by its type, the function char_stream_vt_make_file turns a file handle into a linear stream of characters. An implementation of this function is given as follows:
implement char_stream_vt_make_file (pf_mod, pf_fil | p_fil) = let val c = fgetc1_err (pf_mod | !p_fil) in if c >= 0 then let // c <> EOF val c = char_of_int (c) in $ldelay ( stream_vt_cons (c, char_stream_vt_make_file (pf_mod, pf_fil | p_fil)) , fclose1_exn (pf_fil | p_fil) ) // end of [$ldelay] end else let val () = fclose1_exn (pf_fil | p_fil) in $ldelay (stream_vt_nil ()) end // end of [if] end // end of [char_stream_vt_make_file]The keyword $ldelay is used to form a linear lazy value. Given two dynamic expressions exp1 and exp2, the linear lazy value $ldelay (exp1, exp2) essentially contains two thunks whose bodies are exp1 and exp2, repectively. The first thunk represents a suspended computation while the second thunk represents a finalizer that can be called to free up the resources held in the first thunk. In the case where exp2 is missing, exp1 is assumed to hold no resources.
The code used for illustration is available here.
The I/O is done in ATS/Anairiats by providing interfaces for various I/O functions in C. In particular, most functions declared in <stdio.h> have their counterparts declared in libc/SATS/stdio.sats.
There is a datasort file_mode and an abstract type constructor FILE in ATS that are declared as follows:
datasort file_mode = | file_mode_r | file_mode_w | file_mode_rw absviewt@ype FILE (file_mode) = $extype "ats_FILE_type"where the external type ats_FILE_type is an alias of the type FILE in C. We now use r, w and rw for file_mode_r, file_mode_w and file_mode_rw in the following presentation. If a file is assigned the type FILE(r), then it is read-only; if it is assigned the type FILE(w), then it is write-only; it is assigned the type FILE(rw), then it can be read from and also written to.
There is also an abstract type constructorfile_mode declared as follows:
abst@ype file_mode (file_mode)Note that file_mode is overloaded: it refers to a type constructor as well as a datasort. The following values are available that represent various file modes:
val file_mode_r : file_mode (r) // = "r" val file_mode_rr : file_mode (rw) // = "r+" val file_mode_w : file_mode (w) // = "w" val file_mode_ww : file_mode (rw) // = "w+" val file_mode_a : file_mode (w) // = "a" val file_mode_aa : file_mode (rw) // = "a+"
fun fopen_exn {m:file_mode} (path: string, m: file_mode m): [l:addr] (FILE m @ l | ptr l)Note that we use _exn in the name of fopen_exn as a convention to indicate that an exception is raised if a call to this function results in an error. There is another function fopen_err in ATS that is assigned the following type:
fun fopen_err {m:file_mode} (path: string, m: file_mode m): [l:addr] (FILE_opt_v (m, l) | ptr l)where FILE_opt_v (m, l) is defined as follows:
viewdef FILE_opt_v (m:file_mode, l:addr) = option_v (FILE m @ l, l <> null)The function fopen_err exactly corresponds to fopen in C: It returns a null pointer in case of error.
fun fclose_exn {m:file_mode} {l:addr} (pf: FILE m @ l | p: ptr l): voidThe function fclose_exn raises an exception if a call to it results in an error. There is another function fclose_err that is assigned the following type:
fun fclose_exn {m:file_mode} {l:addr} (pf: FILE m @ l | p: ptr l): intThe function fclose_err exactly corresponds to fclose in C: It returns 0 in case of success and EOF in case of error.
fun fgetc_err {m:file_mode} (pf: file_mode_lte (m, r) | f: &FILE m): [i:int | i <= UCHAR_MAX] int iwhere UCHAR_MAX is defined to be 255. This function exactly corresponds to fgetc in C: It returns the obtained character (cast into unsigned char) if successful, or it returns EOF in case of failure. The file argument of the function fgets_err, i.e., its first non-proof argument, is passed as a reference (call-by-reference).
Note that file_mode_lte is a prop constructor. Given two file modes m1 and m2, if a proof of the prop file_mode_lte (m1, m2) can be formed, then m1 is less than or equal to m2. The following proofs are available:
prval file_mode_lte_r_r : file_mode_lte (r, r) prval file_mode_lte_rw_r : file_mode_lte (rw, r) prval file_mode_lte_w_w : file_mode_lte (w, w) prval file_mode_lte_rw_w : file_mode_lte (rw, w) prval file_mode_lte_rw_rw : file_mode_lte (rw, rw)In order to read from a file of the type FILE (m), it is necessary to prove that m is less than or equal to r.
There is currently no function fget_exn, which would be given the following type if defined:
fun fgetc_exn {m:file_mode} (pf: file_mode_lte (m, r) | f: &FILE m): [i:nat | i <= UCHAR_MAX] int iThe problem with fgetc_exn is that it is not a function of much use. If implemented, this function should probably be called on a file after it is made certain (e.g., by calling feof) that the file has not reached its end.
fun fputc_err {m:file_mode} (pf: file_mode_lte (m, w) | c: char, f: &FILE m): [i:int | i <= UCHAR_MAX] int iThis function exactly corresponds to fputc in C: It returns the charater being written in case of success, or EOF in case of failure. There is another function fputc_exn that is assigned the following type:
fun fputc_exn {m:file_mode} (pf: file_mode_lte (m, w) | c: char, f: &FILE m): voidIn case of failure, fputs_exn raises an exception.
#define i2c char_of_int1
fn filecopy_exn {m1,m2:file_mode} (
pf1: file_mode_lte (m1, r)
, pf2: file_mode_lte (m2, w)
| fil_s: &FILE m1, fil_d: &FILE m2
) : void = let
fun loop
(fil_s: &FILE m1, fil_d: &FILE m2): void = let
val c = fgetc_err (pf1 | fil_s)
in
if c >= 0 then begin
fputc_exn (pf2 | i2c c, fil_d); loop (fil_s, fil_d)
end // end of [if]
end // end of [loop]
in
loop (fil_s, fil_d)
end // end of [filecopy_exn]
In case of error, filecopy_exn raises an exception.
The following function filecopy_err is a variant of
filecopy_exn. Instead of raising an exception in case of error,
filecopy_err returns EOF.
fn filecopy_err {m1,m2:file_mode} (
pf1: file_mode_lte (m1, r)
, pf2: file_mode_lte (m2, w)
| fil_s: &FILE m1, fil_d: &FILE m2
) : int = let
fun loop
(fil_s: &FILE m1, fil_d: &FILE m2): void = let
val c = fgetc_err (pf1 | fil_s)
in
if c >= 0 then let
val err = fputc_err (pf2 | i2c c, fil_d)
in
loop (fil_s, fil_d)
end // end of [if]
end // end of [loop]
val () = loop (fil_s, fil_d)
in
if ferror (fil_d) = 0 then ~1 else 0
end // end of [filecopy_err]
We now use filecopy_err to implement as follows the functionality of
concatenating a list of files into a single one:
// concatenation of a list of files
implement main {n} (argc, argv) = let
val () = case+ argc of
| 1 => let
val (pf_stdin | p_stdin) = stdin_get ()
val (pf_stdout | p_stdout) = stdout_get ()
val _(*err*) = filecopy_err
(file_mode_lte_r_r, file_mode_lte_w_w | !p_stdin, !p_stdout)
val () = stdout_view_set (pf_stdout | (*none*))
val () = stdin_view_set (pf_stdin | (*none*))
in
// empty
end // end of [1]
| _ (*argc >= 2*) => loop (argc, argv, 1) where {
fun loop {i:nat | i <= n}
(argc: int n, argv: &(@[string][n]), i: int i): void =
if i < argc then let
val name = argv.[i]
val (pfopt | p_ifp) = fopen_err (name, file_mode_r)
in
if p_ifp <> null then let
prval Some_v (pf) = pfopt
val (pf_stdout | p_stdout) = stdout_get ()
val _(*err*) = filecopy_err
(file_mode_lte_r_r, file_mode_lte_w_w | !p_ifp, !p_stdout)
val () = stdout_view_set (pf_stdout | (*none*))
val () = fclose_exn (pf | p_ifp)
in
loop (argc, argv, i+1)
end else let
prval None_v () = pfopt
val () = prerrf ("%s: can't open [%s]\n", @(argv.[0], name))
in
exit {void} (1)
end // end of [if]
end // end of [if]
} // end of [_]
// end of [val]
val (pf_stdout | p_stdout) = stdout_get ()
val err = ferror (!p_stdout)
val () = stdout_view_set (pf_stdout | (*none*))
in
if (err <> 0) then begin
prerrf ("%s: error writing stdout\n", @(argv.[0])); exit {void} (2)
end else begin
exit {void} (0) // exit normally
end // end of [if]
end // end of [main]
The code used for illustration is available here.
When programming in ATS, a programmer may need to write C code occassionally. This is particularly common when the programmer does systems programming. Given that both ATS and C use the same data representation, it is largely straightforward for ATS and C to interact with each other. In particular, there is no need to write "wrappers" that are used in languages like Objective Caml or Haskell. In ATS, the syntax for enclosing C code is
%{ (some C code) %}The symbol %{ can be replaced with either %{^ or %{$. The former and the latter mean that the enclosed C code needs to be put at the top and the bottom of the generated C code, respectively.
For a function to be called in C, we need to assign a global name to this function. For instance, the following syntax declares a function foo that takes a pair of integers and returns a boolean. This function is given a global name "ats_foo", which can be used in C code to refer to foo.
fun foo : (int, int) -> bool = "ats_foo"We may implement foo either in ATS as follows:
implement foo (x, y) = ...or in C as follows:
%{ ats_bool_type ats_foo (ats_int_type x, ats_int_type y) { ... } %}Note that the C types ats_int_type and ats_bool_type are defined in the file ccomp/runtime/ats_types.h
The following example involves a call in ATS to a function that is implemented in C:
// This function computes Fibonacci numbers extern fun fibonacci (n: Nat): Nat = "fibonacci" %{ ats_int_type fibonacci (ats_int_type n) { int res1, res2, tmp ; if (n < 1) return 0 ; res1 = 0 ; res2 = 1 ; while (n > 1) { --n ; tmp = res2 ; res2 += res1 ; res1 = tmp ; } return res2 ; } %} fn fibonacci_usage (cmd: string): void = prerrf ("Usage: %s [integer]\n", @(cmd)) // print an error message implement main (argc, argv) = if argc >= 2 then let val n = int1_of argv.[1] // turning string into integer val () = assert_errmsg (n >= 0, "The integer argument needs to be nonnegative.\n") val res = fibonacci n in printf ("fibonacci (%i) = %i\n", @(n, res)) end else begin fibonacci_usage argv.[0]; exit {void} (1) endIt is also possible to assign names to types and values in ATS, and such names can be used in extenal C code. As an example, the type Tree in the following code is given a name exTree, which can be used in C code to refer to the type Tree.
dataviewtype tree (int) = Nil(0) | {n1,n2:two} Node(1) of (tree n1, int, tree n2) viewtypedef Tree = [n:two] tree n // assigning an external name to [Tree] // ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ extern typedef "exTree" = Tree extern fun ref_tree : Tree -> ref Tree = "ref_tree" %{ ats_ref_type ref_tree (ats_ptr_type t) { exTree* r ; r = ats_malloc_gc (sizeof(exTree)) ; *r = (exTree)t ; return r ; } %}The syntax for assigning a name to a value in ATS is given as follow:
extern val $name = $expwhere $name is a string literal (representing a valid identifier in C) and $exp ranges over dynamic expressions in ATS. When used in external C code, the string $name refers to the value of the expression $exp.
In the other direction, we can use $extype "SomeType" in ATS to refer to a type of the name SomeType in C. Similarly, we can use $extval ($typ, "SomeValue") for a value of the name SomeValue in C, where $typ is the supposed type of this value in ATS. For instance, stdout in ATS is defined as follows:
#define stdout $exval ($extype "ats_ptr_type", "stdout")where ats_ptr_type is an alias for the type void* in C.
The code used for illustration is available here.
The paradigm of programming with theorem proving is rich and broad, and it is probably the most innovative feature in ATS. We give an introduction to this paradigm by presenting a few examples, explaining some motivations behind programming with theorem proving as well as demonstrating a means to achieve it in ATS.
The following code implements a function that computes factorials:
// non-tail-recursive fun fact {n:nat} (n: int n): Nat = if n > 0 then n nmul fact (n-1) else 1The type assigned to fact is
{n:nat} int (n) -> NatThat is, the function fact takes a nature number as its argument and, if it terminates, returns a natural number as its result.
Now suppose we want to be more accurate. We want to verify that the code indeed implements the factorial functions. One possibility is to specify the factorial function in the statics of ATS, and then try to assign the following type to fact:
{n:nat} int (n) -> int (fact(n))Note that the name fact is overloaded here to also refer to the factorial function in the statics of ATS. There is, however, a serious problem with this approach: If functions as complex as the factorial function could be specified in the statics of ATS, then it would no longer be possible to support fully automatic constraint-solving. We take a different approach instead.
With standard mathematical notation, the factorial function can be specified as follows:
fact (0) = 1 ; fact (n) = n * fact (n-1) ; if n > 0In ATS, we can declare a dataprop to encode this specification:
// [FACT (n, x)] is inhabited if and only if [fact (n) = x] dataprop FACT (int, int) = | FACTbas (0, 1) | {n:int | n > 0} {r,r1:int} FACTind (n, r) of (FACT (n-1, r1), MUL (n, r1, r))The prop constructor MUL is already defined: Given integers p, q, and pq, MUL(p, q, pq) is inhabited if and only if the product of p and q is pq. The constructors FACTbas (base case) and FACTind (inductive case) correspond to the first and the second clauses in the above specification, respectively. Given integers n and r, the prop FACT(n, r) is inhabited if and only if the factorial of n equals r. Clearly, if a terminating function can be assgined the following type:
{n:nat} int n -> [r:int] (FACT (n, r) | int r)then this function must compute factorials. The code below implements such a function fact1:
// [fact1] implements the factorial function in a non-tail-recursive manner fun fact1 {n:nat} .< n >. (n: int n): [r:int] (FACT (n, r) | int r) = if n > 0 then let val (pf_fac_n1_r1 | r1) = fact1 (n-1) val (pf_mul_n_r1_r | r) = n imul2 r1 in (FACTind (pf_fac_n1_r1, pf_mul_n_r1_r) | r) end else (FACTbas () | 1)When applied to a natural number n, fact1 is guaranteed to return a pair (pf, r), where pf is a proof stating that the factorial of n is r. A crucial property of ATS is the irrelevance of proofs to program evaluation: The dynamic semantics of a program cannot be altered by erasing the proofs in it. So proofs are all erased after compilation and thus are not constructed at run-time.
This example clearly gives some impression of code duplication as the specification (FACT) and the implementation (fact1) are of great similarity to each other. We now present another example, where the specification and the implementation are far different from each other.
The following code gives another implementation of the factorial function, where only tail-recursion is involved:
// tail-recursive fn fact {n:nat} (n: int n): Nat = let fun loop {n:nat} (n: int n, res: Nat): Nat = if n > 0 then loop (n-1, n nmul res) else res in loop (n, 1) endBase on this tail-recursive implementation, we implement as follows a function fact2 that is verified to compute factorials.
// Some lemmas on multiplication are stated (without proofs) extern prval MULlem00 : {x,y:int} () -<prf> [xy:int] MUL (x, y, xy) // 1 * x = x extern prval MULlem10 : {x,y:int} MUL (1, x, y) -<prf> [x==y] void // x * 1 = x extern prval MULlem11 : {x,y:int} MUL (x, 1, y) -<prf> [x==y] void // multiplication is associative: (xy)z = x(yz) extern prval MULlem20 : {x,y,z,xy,yz,xyz:int} (MUL (x, y, xy), MUL (y, z, yz), MUL (xy, z, xyz)) -<prf> MUL (x, yz, xyz) // [fact2] implements the factorial function in a tail-recursive style fn fact2 {n:nat} (n: int n): [r0: int] (FACT (n, r0) | int r0) = let // [loop] is tail-recusive fun loop {n:nat; x:int} .< n >. (n: int n, x: int x) : [r,r0:int] (FACT (n, r), MUL (x, r, r0) | int r0) = if n > 0 then let val (pf_mul_x_n_xn | xn) = x imul2 n val (pf_fac_n1_r1, pf_mul_xn_r1_r0 | res) = loop (n-1, xn) prval pf_mul_n_r1_nr1 = MULlem00 () prval pf_mul_x_nr1_r0 = MULlem20 (pf_mul_x_n_xn, pf_mul_n_r1_nr1, pf_mul_xn_r1_r0) prval pf_fac_n_nr1 = FACTind (pf_fac_n1_r1, pf_mul_n_r1_nr1) in (pf_fac_n_nr1, pf_mul_x_nr1_r0 | res) end else let prval pf_mul_x_1_y = MULlem00 () // x * 1 = y prval () = MULlem11 (pf_mul_x_1_y) // x = y in (FACTbas (), pf_mul_x_1_y | x) end val (pf_fac, pf_mul | res) = loop (n, 1) prval () = MULlem10 (pf_mul) in (pf_fac | res) endIn practice, programming with theorem proving is commonly employed to capture program invariants that would otherwise not be possible in ATS. For instance, aided by programming with theorem proving, we can program a complex data structure like doubly-linked list or doubly-linked tree while having no fear of misusing pointers. This is considered a great strength of ATS.
The code used for illustration is available here.
A casting function in ATS is equivalent to the identify function in terms of dynamic semantics. A call to such a function is evaluated at compile-time, and its argument is returned. For instance, we have the following commonly used casting functions:
castfn int1_of_int (x: int):<> [n:nat] int n castfn string1_of_string (x: string):<> [n:nat] string nThe keyword castfn is for introducing casting functions.
We now present a typical use of casting functions. The following declared function list_concat is intended for concatenating a list of lists:
fun{a:t@ype} list_concat (xss: List (List a)): List aLet us say that we would like to verify that the concatenation of a list of lists yields a list whose length equals the sum of the lengths of the lists in the given list of lists. This, for instance, can be done as follows by introducting a datatype constructor lstlst.
datatype lstlst (a:t@ype+, int, int) =
| {m,t:nat} {n:nat}
lstlst_cons (a, m+1, t+n) of (list (a, n), lstlst (a, m, t))
| lstlst_nil (a, 0, 0) of ()
fun{a:t@ype} _concat {m,t:nat} .<m>.
(xss: lstlst (a, m, t)): list (a, t) = case+ xss of
| lstlst_cons (xs, xss) => list_append (xs, _concat<a> xss)
| lstlst_nil () => list_nil ()
// end of [_concat]
Given a type T and integers I and J, the type
lstlst (T, I, J) is for a list of lists such that the length of the
list is I and each element in the list is a list of values of the
type T and the sum of the lengths of these elements equals J.
The function list_concat is the same as the function _concat in terms of dynamic semantics, and it can be implemented as follows:
implement{a} list_concat (xss) =
_concat<a> (lstlst_of_listlist xss) where {
castfn lstlst_of_listlist
{m:nat} .<m>. (xss: list (List a, m))
:<> [t:nat] lstlst (a, m, t) = case+ xss of
| list_cons (xs, xss) => lstlst_cons (xs, lstlst_of_listlist xss)
| list_nil () => lstlst_nil ()
} // end of [list_concat]
Given that lstlst_of_listlist being implemented as a casting
function, the implementation of list_concat is equivalent to
the following one in terms of dynamic semantics:
implement{a} list_concat (xss) = _concat (xss) // this one does not typecheck
The code used for illustration is available here.
ATS supports memory allocation in the stack frame of a calling function, and it is guaranteed by the type system of ATS that memory thus allocated cannot be accessed once the calling function returns.
fn name_of_month_1 {i:int | 1 <= i; i <= 12} (i: int i): string = let
var !p_arr with pf_arr = @[string](
"Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
)
in
p_arr->[i-1]
end // end of [name_of_month_1]
The following syntax indicates that the starting address of the allocated
array is stored in p_arr while the view of the array is stored in
pf_arr:
var !p_arr with pf_arr = @[string]( "Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec" )This allocated array is initialized with the strings representing the names of the 12 months: "Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec".
A variant of the function name_of_month_1 is implemeneted as follows:
fn name_of_month_2 {i:int | 1 <= i; i < 12} (i: int i): string = let
var !p_arr with pf_arr = @[string][12]("")
val () = p_arr->[0] := "Jan"
val () = p_arr->[1] := "Feb"
val () = p_arr->[2] := "Mar"
val () = p_arr->[3] := "Apr"
val () = p_arr->[4] := "May"
val () = p_arr->[5] := "Jun"
val () = p_arr->[6] := "Jul"
val () = p_arr->[7] := "Aug"
val () = p_arr->[8] := "Sep"
val () = p_arr->[9] := "Oct"
val () = p_arr->[10] := "Nov"
val () = p_arr->[11] := "Dec"
in
p_arr->[i-1]
end // end of [name_of_month_2]
The following syntax means that the function name_of_month_2
allocates a string array of size 12 in its stack frame and initializes the
array with the empty string:
var !p_arr with pf_arr = @[string][12]("")The starting address and the view of the allocated array are stored in p_arr and pf_arr, respectively. If the following syntax is used:
var !p_arr with pf_arr = @[string][12]()then the allocated array is uninitialized, that is, the view of the proof pf_arr is [string?][12] @ p_arr (instead of [string][12] @ p_arr).
In the following example, the implemented function print_month_name forms a closure in its stack frame, which is then passed to a higher-order function iforeach_array_ptr_clo:
fn print_month_names () = let
var !p_arr with pf_arr = @[string](
"Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
)
var !p_clo with pf_clo = @lam // this closure is allocated in the stack frame
(pf: !unit_v | i: sizeLt 12, x: &string): void =<clo1> (if i > 0 then print ", "; print x)
// end of [var]
prval pf = unit_v ()
val () = iforeach_array_ptr_clo_tsz {string} {unit_v} (pf | !p_clo, !p_arr, 12, sizeof<string>)
prval unit_v () = pf
val () = print_newline ()
in
// empty
end // end of [print_month_names]
Note that the keyword @lam (instead of lam) is used here to
indicate that the closure is constructed in the stack frame of the function
print_month_names.
The code used for illustration is available here.
A state type is a mapping that associates variables with viewtypes. Intuitively, if a program point can be assigned a state type ST, then each variable in the domain of ST must have the type with which ST associates the variable when program execution reaches that point. In ATS, a state type is represented as follows:
(x_1 : VT_1, ..., x_n : VT_n)where for 1 <= i <= n, each variable x_i is assiciated with the viewtype VT_i.
The code used for illustration is available here.
In ATS, there are also constructs for forming for-loops and while-loops directly. While it is tempting for someone familiar with imperative programming to implement loops with these constructs (instead of using tail-recursion), it should be emphasized that the amount of effort involved in constructing loop invariants (if needed) often makes this style of programming difficult for beginners and, sometimes, for experts as well.
fn fact (n: int): int = let
var x: int = n
var res: int = 1 // initialized
val () = while (x > 0) (res := res * x; x := x - 1)
in
res // res = n!
end // end of [fact]
The syntax for a while-expression is given as follows:
while $test $bodywhere $test is an expression of the type bool and $body is an expression of the type void. Usually, $body consists of a sequence of expressions of the type void.
In the following, another implementation of the factorial function is given, where a for-loop is constructed:
// a simple for-loop
fn fact (n: int): int = let
var x: int
var res: int = 1 // initialized
val () = for (x := 1 ; x <= n ; x := x+1) res := res * x
in
res // res = n!
end // end of [fact]
The syntax for a for-expression is given as follows:
for ($init ; $test ; $post) $bodywhere $init is an expression of the type void, $test is an expression of the type bool, $post is an expression of the type void, and $body is an expression of the type void. Usually, $body consists of a sequence of expressions of the type void. Note that each of $init, $test, and $post can be omitted. The meaning of each omission should be evident. The following code gives yet another implemenation of the factorial function:
// a simple for-loop with omissions involving [break]
fn fact (n: int): int = let
var x: int = 1 // initialized
var res: int = 1 // initialized
val () = for ( ; ; ) // infinite loop
if x <= n then (res := res * x; x := x+1) else break
// end of [val]
in
res // res = n!
end // end of [fact]
Note that break in ATS is equivalent to its counterpart in C. Also,
continue is supported in ATS as shown in the following
implementation of the factorial function:
// a simple for-loop with omissions involving [break] and [continue]
fn fact (n: int): int = let
var x: int = 1 // initialized
var res: int = 1 // initialized
val () = for ( ; ; x := x+1) // no loop test
(*
** note that [continue] means to loop again *after* post increment!
*)
if x <= n then (res := res * x; continue) else break
// end of [val]
in
res // res = n!
end // end of [fact]
fn bsearch {n:nat} (
A: &(@[double][n]), n: int n, key: double
) :<> int = let
var l: int = 0 and u: int = n-1; var res: int = ~1
val () = while*
{i,j:int | 0 <= i; i <= j+1; j < n} .<j+1-i>. (l: int i, u: int j) =>
(l <= u) let
val m = l + (u-l) / 2
val sgn = compare (key, A.[m])
in
case+ 0 of
| _ when sgn < 0 => (u := m-1; continue)
| _ when sgn > 0 => (l := m+1; continue)
| _ => (res := m; break)
end // end of [val]
in
res // 0 <= res < n if [key] is found; or res = ~1 if not
end // end of [bsearch]
The annotated invariant for the loop follows the keyword while*, and
it is separated from the rest of the loop by the symbol =>. The
invariant states that there are integers i and j satisfying
0 <= i,
i <= j+1 and
j < n
such that the variables l and u are of
the types int(i) and int(j), respectively, at the entry
point of the loop, that is, the point immediately before the loop test.
A termination metric .<j+1-i>. is provided to verify that
the loop terminates: the metric must decrease whenever the loop goes back
to its entry point.
As another example, an implementation of the factorial function is presented as follows that involves the use of an annotated for-loop:
fn fact {n:nat} (n: int n): int = let
var x: int
var res: int = 1
(*
// the loop invariant indicates that
// the value of [x] is [n+1] at the point where the loop exits
*)
val () = for* {i:nat | i <= n+1} .<n+1-i>.
(x: int i): (x: int (n+1)) => (x := 0; x <= n ; x := x+1) res := res * x
// end of [val]
in
res
end // end of [fact]
The annotated invariant for the for-loop follows the keyword for*,
and it is separated from the rest of the loop by the symbol =>.
This invariant consists of two parts, separated by the colon symbol
(:). The part before the colon symbol essentially means that there is a
natural number i satisfying i <= n+1 such that the variable
x is of the type int(i) at the entry point of the for-loop,
that is, at the point immediately after the loop initialization and before
the loop test. There is also a termination metric in this part whose
meaning should be evident. The part after the colon symbol essentially
means that the variable x is of the type int(n+1) at the exit
point of the loop.
Annotating loops with invariants correctly can be difficult, sometimes. If complex loop invariants are needed, we recommend that the programmer avoid constructing loops directly. Instead, the programmer can implement tail-recursive functions in place of loops. Note that each tail-recursive function in ATS is guaranteed to be compiled into a loop.
The code used for illustration is available here.