Linear Arrays

Unlike in most programming languages, arrays are not a primitive data structure in ATS. More specifically, persistent arrays can be implemented based on linear arrays, which allow for being freed safely by the programmer, and linear arrays can be implemented based on at-views. Intuitively, the view for an array containing N elements of type T consists of N at-views: T@L0, T@L1, ..., and T@LN-1, where L0 is the starting address of the array and each subsequent L equals the previous one plus the size of T, that is, the number of bytes needed to store a value of the type T. The following declared dataview array_v precisely captures the intuituion:

dataview array_v
  (a:t@ype+, int(*size*), addr(*beg*)) =
  | {n:nat} {l:addr}
    array_v_cons (a, n+1, l) of (a @ l, array_v (a, n, l+sizeof(a)))
  | {l:addr} array_v_nil (a, 0, l)
// end of [array_v]

Given a type T, an integer N and an address L, array_v(T, N, L) is a view for the array starting at L that contains N elements of the type T. As can be readily expected, the function templates for array accessing and array updating are given the following interfaces:

fun{a:t@ype}
arrget {n,i:nat | i < n} {l:addr}
  (pf: !array_v (a, n, l) | p: ptr l, i: int i): a
// end of [arrget]

fun{a:t@ype}
arrset {n,i:nat | i < n} {l:addr}
  (pf: !array_v (a, n, l) | p: ptr l, i: int i, x: a): void
// end of [arrset]

Before implementing arrget and arrset, I present as follows some code that implements a function template to access the first element of a nonempty array:

fun{a:t@ype}
arrgetfst {n:pos} {l:addr} (
  pf: !array_v (a, n, l) | p: ptr l
) : a = x where {
  prval array_v_cons (pf1, pf2) = pf
  // pf1: a @ l; pf2: array_v (a, n-1, l+sizeof(a))
  val x = !p
  prval () = pf := array_v_cons (pf1, pf2)
} // end of [arrgetfst]

Obviously, the function template arrget can be implemented based on arrgetfst:

implement{a}
arrget (pf | p, i) =
  if i > 0 then let
    prval array_v_cons (pf1, pf2) = pf
    val x = arrget (pf2 | p+sizeof<a>, i-1)
    prval () = pf := array_v_cons (pf1, pf2)
  in
    x
  end else
    arrgetfst (pf | p)
  // end of [if]

This implementation is of time-complexity O(n), and it is tail-recursive (after the proofs are erased). However, the very point of having arrays is to support O(1)-time accessing and updating operations. My initial effort spent on implementing such operations immediately dawned on me the need to construct proof functions for supporting view changes (of no run-time cost).

Clearly, an array starting at L that contains N elements of type T can also be thought of as two arrays: one starting at L that contains I elements while the other starting at L+I*sizeof(T) that contains N-I elements, where I is an natural number less that or equal to N. Formally, this statement can be encoded in the type of the proof function array_v_split:

prfun array_v_split
  {a:t@ype} {n,i:nat | i <= n} {l:addr} {ofs:int} (
  pfmul: MUL (i, sizeof(a), ofs), pfarr: array_v (a, n, l)
) : (array_v (a, i, l), array_v (a, n-i, l+ofs))

The other direction of the above statement can be encoded in the type of the proof function array_v_unsplit:

prfun array_v_unsplit
  {a:t@ype} {n1,n2:nat} {l:addr} {ofs:int} (
  pfmul: MUL (n1, sizeof(a), ofs)
, pf1arr: array_v (a, n1, l), pf2arr: array_v (a, n2, l+ofs)
) : array_v (a, n1+n2, l)

With array_v_split and array_v_unsplit, we can readily give implementations of arrget and arrset that are O(1)-time:

implement{a}
arrget (pf | p, i) = x where {
  val tsz = int1_of_size1 (sizeof<a>)
  val (pfmul | ofs) = i imul2 tsz
  prval (pf1, pf2) = array_v_split {a} (pfmul, pf)
  prval array_v_cons (pf21, pf22) = pf2
  val x = ptr_get1<a> (pf21 | p+ofs)
  prval pf2 = array_v_cons (pf21, pf22)
  prval () = pf := array_v_unsplit {a} (pfmul, pf1, pf2)
} // end of [arrget]

implement{a}
arrset (pf | p, i, x) = () where {
  val tsz = int1_of_size1 (sizeof<a>)
  val (pfmul | ofs) = i imul2 tsz
  prval (pf1, pf2) = array_v_split {a} (pfmul, pf)
  prval array_v_cons (pf21, pf22) = pf2
  val () = ptr_set1<a> (pf21 | p+ofs, x)
  prval pf2 = array_v_cons (pf21, pf22)
  prval () = pf := array_v_unsplit {a} (pfmul, pf1, pf2)
} // end of [arrset]

Note that the function int1_of_size1 is called to turn a size (i.e., an integer of the type size_t) into an int (i.e., an integer of the type int). Of course, the proof functions array_v_split and array_v_split are still to be implemented, which I will do when covering the topic of view change.

Given a type T and a natural number N, a proof of the view array_v(T?, N, L) for some address L can be obtained and released by calling the functions malloc and free, respectively, which are to be explained in details elsewhere. I now give as follows an implemention of a function template for array intialization:

typedef natLt (n:int) = [i:nat | i < n] int (i)

fun{a:t@ype}
array_ptr_tabulate
  {n:nat} {l:addr} (
  pf: !array_v (a?,n,l) >> array_v (a,n,l)
| p: ptr (l), n: int (n), f: natLt(n) -<cloref1> a
) : void = let
  fun loop {i:nat | i <= n} {l:addr} .<n-i>. (
    pf: !array_v (a?,n-i,l) >> array_v (a,n-i,l)
  | p: ptr l, n: int n, f: natLt(n) -<cloref1> a, i: int i
  ) : void =
    if i < n then let
      prval array_v_cons (pf1, pf2) = pf
      val () = !p := f (i)
      val () = loop (pf2 | p+sizeof<a>, n, f, i+1)
    in
      pf := array_v_cons (pf1, pf2)
    end else let
      prval array_v_nil () = pf in pf := array_v_nil {a} ()
    end // end of [if]
  // end of [loop]
in
  loop (pf | p, n, f, 0)
end // end of [array_ptr_tabulate]

Given a natural number n, the type natLt(n) is for all natural numbers less than n. Given a type T, the function array_ptr_tabulate<T> takes a pointer to an uninitialized array, the size of the array and a function f that maps each natural number less than n to a value of the type T, and it initializes the array with the sequence of values of f(0), f(1), ..., and f(n-1). In other words, the array contains a tabulation of the given function f after the initialization is over.

Given a type T and an integer N, @[T][N] is a built-in type in ATS for N consecutive values of the type T. Therefore, the at-view @[T][N]@L for any given address L is equivalent to the array-view array_v(T, N, L). By making use of the feature of call-by-reference, we can also assign the following interfaces to the previously presented functions arrget and arrset:

fun{a:t@ype}
arrget {n,i:nat | i < n} (A: &(@[a][n]), i: int i): a

fun{a:t@ype}
arrset {n,i:nat | i < n} (A: &(@[a][n]), i: int i, x: a): void

These interfaces are more concise as they obviate the need to mention explicitly where the array argument is located.

Please find the entirety of the above presented code on-line.