Function or Closure?


It is possible in ATS to differentiate at the level of types functions without environment from functions with environment. This is an indispensable feature for interfacing functions in C directly inside ATS as these functions all have no environment. A function with an environment is often referred to as a closure.

A problem with closures

In the programming language C, each function is at the toplevel, and it is naturally represented as a pointer to the code heap where the code for the function is stored. However, inner functions are supported in ATS, and they may appear in the return value of a function call. As an example, the following code in ATS involves an inner function add_x appearing in the return value of an outer function add:
fn add (x: int):<cloref> int -<cloref> int = begin
  let fn add_x (y: int):<cloref> int = x + y in add_x end
end // end of [add]
The syntax <cloref> indicates that the defined functions add and add_x are both (persistant) closure references. Applying add to a given integer i, we obtain a unary function that adds i to its argument. This function is represented as a pair (add_x_env, [x -> i]), where we use [x->i] for something often referred to as an environment that binds x to i, and add_x_env for the (toplevel) function defined in the following (pseudo) code:
fun add_x_env (env, y) = env.x + y
Note that the (pseudo) syntax env.x stands for the selection of the value to which x is bound in the environment env.

We use the name closure to refer to a pair like (add_x_env, [x -> i]). Given that functions may occur as arguments (of other functions) in ATS, it is necessary that all functions be represented uniformly as closures (if functions without environments cannot be differentiated from functions with environments at compile-time). For instance, the previously defined (toplevel) function add needs to be represented as a pair (add_env, []), where [] stands for the empty environment and add_env is the function defined in the following (pseudo) code:

fun add_env (env, x) = (add_x_env, [x -> x])
In functional languages like ML and Haskell, all functions are represented as closures. Unfortunately, this requirement for representing all functions as closures, can cause a serious difficulty when we try to use in ATS a higher-order function implemented in C. Let us see a concrete example.

The function qsort is declared in <stdlib.h> with the following type:

void qsort(void *base, size_t nmemb, size_t size, int(*compar)(const void *, const void *));
Clearly, qsort demands that its fourth argument be a function (not a closure). In order to use qsort in ATS directly, we need a way to construct functions represented as code pointers (instead of closures).

Types for functions with environment

In ATS, a type of the form (T_1, ..., T_n) -<cloref> T_0 is for a closure reference, that is, a reference to a function paired with an environment that takes n arguments of types T_1, ..., T_n and returns a value of type T_0. The following declaration states that foo is a closure reference of the type (T_1, ..., T_n) -<cloref> T_0:
val foo : (T_1, ...., T_n) -<cloref> T_0

Types for functions without environment

In ATS, a type of the form (T_1, ..., T_n) -<fun> T_0, where fun can be omitted, is for a function without environment that takes n arguments of types T_1, ..., T_n and returns a value of type T_0. The following two equivalent declarations both state that bar is a function of the type (T_1, ..., T_n) -<fun> T_0:
val bar : (T_1, ...., T_n) -<fun> T_0
fun bar (x_1: T_1, ...., x_n: T_n): T_0
The previously mentioned function qsort can be given the following type in ATS:
fun qsort {a:viewt@ype} {n:nat}
  (base: & @[a][n], nmemb: size_t n, size: sizeof_t a, compar: (&a, &a) -<fun> int): void
This type indicates that qsort itself is a function without environment and its fourth argument is also a function without environment. As an example, the following code implements a simple test on qsort:
fn test_qsort () = let
  fun pr_loop {n,i:nat | i <= n} .<n-i>.
    (A: &(@[int][n]), n: size_t n, i: size_t i): void =
    if i < n then begin
      if i > 0 then print ", "; print A.[i]; pr_loop (A, n, i+1)
    end // end of [if]
  // end of [pr_loop]

  // creating a linear array of size 10
  val (pf_gc, pf_arr | p_arr, asz) = $arrsz {int} (1, 9, 2, 8, 3, 7, 4, 6, 5, 0)

  val () = (print "before quicksort:\n")
  val () = pr_loop (!p_arr, asz, 0)
  val () = print_newline ()

  val () = begin
    qsort {int} (!p_arr, asz, sizeof<int>, lam (x, y) => compare (x, y))
  end // end of [val]

  val () = (print "after quicksort:\n")
  val () = pr_loop (!p_arr, asz, 0)
  val () = print_newline ()
in
  array_ptr_free {int} (pf_gc, pf_arr | p_arr)
end // end of [test_qsort]

Implementing a function without enviroment

The following code implements add as a function without enviroment:
fn add0 (x: int):<fun> int -<cloref> int = begin
  let fn add0_x (y: int):<cloref> int = x + y in add0_x end
end // end of [add0]
The syntax :<fun> is an annotation indicating that the type ascribed to add0 is int -<fun> (int -<cloref> int). If we change int -<cloref> int into int -<fun> int, then an error is reported at compile-time as the function add0_x does require a nonempty environment that binds the variable x to some value. If add0 is needed in a place where a closure is expected, we can simply write lam x => add0 (x) instead.

When a function is declared via the keyword fun or fn, it is assumed by default that the function is without environment. For instance, the following code is equivalent to the previous implementation of add0:

fn add0 (x: int): int -<cloref> int = begin
  let fn add0_x (y: int):<cloref> int = x + y in add0_x end
end // end of [add0]
In contrast, the following code, which is used at the beginning of this tutorial, implements add as a closure:
fn add (x: int):<cloref> int -<cloref> int = begin
  let fn add_x (y: int):<cloref> int = x + y in add_x end
end // end of [add]

Linear Closures

As a closure is an aggregated value, memory allocation is required in order to form closures. In ATS, linear closures are supported. As memory allocated for linear closures can be freed explicitly by the programmer, such closures are particularly useful in a situation where automatic garbage collection is not allowed or its use needs to be significantly reduced. We are to present detailed explanation on linear closures elsewhere.


The code used for illustration is available here.