Chapter 13. Persistent Arrays and References

Let us start with references. A reference is precisely an (initialized) singleton array, that is, an array of size 1, and its typical use is for implementing a global variable that can be updated.

The following function is for creating a reference:

// fun {a:t@ype} ref_make_elt(x0: a): ref(a) //

where ref is a type constructor in ATS that takes a type T to form a reference type ref(T) for any reference holding a value of the type T. The functions ref_get_elt and ref_set_elt are for fetching and updating the value stored in a reference, respectively:

// fun {a:t@ype} ref_get_elt(r: ref(a)): a fun {a:t@ype} ref_set_elt(r: ref(a), x: a): void //

A shorthand for ref_make_elt is ref. Also, both ! and [] are overloaded with ref_get_elt and ref_set_elt. For instance, the following code prints out 0, 1, and 3 twice:

// val r0 = ref<int>(0) val () = println! (!r0) val () = (!r0 := !r0 + 1) val () = println! (!r0) val () = (!r0 := !r0 + 2) val () = println! (!r0) // val r1 = ref<int>(0) val () = println! (r1[]) val () = (r1[] := r1[] + 1) val () = println! (r1[]) val () = (r1[] := r1[] + 2) val () = println! (r1[]) //

Programmers often misuse references. This is especially true for those with a background in imperative programming. For instance, the following example shows a typical poor use of references that is often resulted from someone learning functional programming by "translating" code written in imperative style:

// fun fact_ref (n: int): int = let // val i = ref<int>(0) val r = ref<int>(1) // fun loop(): void = if !i < n then (!i := !i+1; !r := !r * !i; loop()) // in let val () = loop() in !r end end (* end of [fact_ref] *) //

There are some obvious drawbacks in this implementation of fact_ref. For evaluating each call to fact_ref, two references are created, which become garbage after the call returns. As a reference is allocated on heap, it is not mapped to a register when compiled. Accessing a reference involves memory traffic, which is much more expensive than accessing a register. Compared to a tail-recursive implementation of the factorial function in functional style, fact_ref is of great inefficiency both time-wise and memory-wise.

There is a type constructor array0 in ATS that takes a type T to form the array type array0(T) for an array storing elements of the type T. We refer to such an array as an array0-value, which is essentially a pair containing a pointer (to the memory location where elements are stored) and an integer (indicating the capacity of the array). For instance, the following function can be called to create an array0-value:

// fun {a:t@ype} array0_make_elt(asz: int, x0: a): array0(a) //

Given a non-negative integer asz and an element x0, array0_make_elt returns an array0-value in which the array is of size asz and each of its cells is initialized with x0.

Another commonly used function for creating an array0-value is array0_tabulate of the following interface:

// fun {a:t@ype} array0_tabulate(asz: int, fopr: cfun(int, a)): array0(a) //

Given a non-negative integer asz and a closure-function fopr, array0_tabulate returns an array0-value in which the array is of size asz and the array cells are initialized with the values of fopr at the valid indices.

The functions array0_get_at and array0_set_at are for fetching and updating the value stored in an array0-value at a given index:

// fun {a:t@ype} array0_get_at(A: array0(a), i: int): a fun {a:t@ype} array0_set_at(A: array0(a), i: int, x: a): void //

Note that array0_get_at raises an exception (ArraySubscriptExn()) if the given index is invalid, that is, not between 0 and the array size minus 1, inclusive. And the same happens with respect to array0_set_at. Also please note that the brackets [] is overloaded with both array0_get_at and array0_set_at. For instance, evaluating the following code prints out two lines: the first consisting of the text 000 and the second consisting of the text 123:

// val A = array0_make_elt<int>(3, 0) // val () = println! (A[0], A[1], A[2]) // val () = A[0] := 1 val () = A[1] := A[0] + 1 val () = A[2] := A[1] + 1 // val () = println! (A[0], A[1], A[2]) //

Like list0-values, there are many commonly used functions for processing array0-values. For instance, the following function array0_foreach corresponds to the previouly presented list0_foreach:

// extern fun {a:t@ype} array0_foreach (A: array0(a), fwork: cfun(a, void)): void // implement {a}(*tmp*) array0_foreach(A, fwork) = ( int_foreach<>(sz2i(A.size()), lam(i) => fwork(A[i])) ) (* end of [array0_foreach] *) //

Given an array0-value A, the expression A.size() is written in dot-notation, which returns the size of the array contained in A. The name sz2i refers to a cast function from the type size_t to the type int.

The following function array0_foldleft corresponds to the previouly presented list0_foldleft:

// extern fun {r:t@ype} {a:t@ype} array0_foldleft (A: array0(a), r0: r, fopr: cfun(r, a, r)): r // implement {r}{a} array0_foldleft (A, r0, fopr) = ( // int_foldleft<r> (sz2i(A.size()), r0, lam(r, i) => fopr(r, A[i])) // ) (* end of [array0_foldleft] *) //

The following function array0_foldright corresponds to the previouly presented list0_foldright:

// extern fun {r:t@ype} {a:t@ype} array0_foldright (A: array0(a), fopr: cfun(a, r, r), r0: r): r // implement {r}{a} array0_foldright (A, fopr, r0) = let val asz = sz2i(A.size()) in int_foldleft<r>(asz, r0, lam(r, i) => fopr(A[asz-i-1], r)) end (* end of [array0_foldright] *) //

Unlike list0_foldright, array0_foldright is tail-recursive.

In practice, matrices (that is, two dimensional arrays) are also a very common data structure. In ATS, there is a type constructor matrix0, which takes a type T to form the type matrix0(T) for a matrix storing elements of the type T. We refer to such a matrix as a matrix0-value, which is essentially a tuple containing a pointer (to the memory location where elements are stored) and two integers (indicating the number of rows and the number of columns of the matrix). For instance, the following function can be called to create a matrix0-value:

// fun {a:t@ype} matrix0_make_elt(nrow: int, ncol: int, x0: a): matrix0(a) //

Given two non-negative integers m and n and an element x0, matrix0_make_elt returns a matrix0-value in which the matrix is of dimension m by n and each of its cells is initialized with x0.

Please note that the so-called row-major representation is chosen for the matrix contained in each created matrix0-value. As the elements in each row are stored adjacently in row-major representation, it can be (much) more efficient to process the elements in the row-by-row fashion (compared to the column-by-column fashion) due to the memory-cache effect.

Like array0_tabulate, the following function matrix0_tabulate is often called to create a matrix0-value:

// fun {a:t@ype} matrix0_tabulate (nrow: int, ncol: int, fopr: cfun(int, int, a)): matrix0(a) //

Given two non-negative integers m and n and a closure-function fopr, matrix0_tabulate returns a matrix0-value in which the matrix is of dimension m by n and the matrix cells are initialized with the values of fopr at the valid indices.

The functions matrix0_get_at and matrix0_set_at are for fetching and updating the value stored in a matrix0-value at a given position (represented by a row index and a column index):

// fun{a:t0p} matrix0_get_at (M: matrix0(a), i: int, j: int): a fun{a:t0p} matrix0_set_at (M: matrix0(a), i: int, j: int, x: a): void //

Note that matrix0_get_at raises an exception (MatrixSubscriptExn()) if one of the given indices is invalid. And the same happens with respect to matrix0_set_at. Also please note that the brackets [] is overloaded with both matrix0_get_at and matrix0_set_at.

Like array0_foreach, the following function matrix0_foreach applies to each element stored in the matrix contained in M a given closure-function fwork:

// extern fun {a:t@ype} matrix0_foreach (M: matrix0(a), fwork: cfun(a, void)): void // implement {a}(*tmp*) matrix0_foreach (M, fwork) = let val nrow = sz2i(M.nrow()) val ncol = sz2i(M.ncol()) in int_cross_foreach<>(nrow, ncol, lam(i, j) => fwork(M[i,j])) end (* end of [matrix0_foreach] *) //

The expression M.nrow() is written in dot-notation, which returns the number of rows in the matrix contained in M. And M.ncol() is for the number of columns in the matrix.

Please find on-line the entirety of the code used in this chapter.