Example: Functional Random-Access Lists

The data structure I implement in this section is based on one presented in the book titled Pure Funtional Data Structures by Chris Okasaki, where more elaboration on data structures of this kind can be found.

Let us first declare a datatype as follows:

typedef pt (a:t@ype) = '(a, a)

datatype
ralist (a:t@ype+, n:int) =
  | RAnil (a, 0)
  | {n:pos} RAevn (a, 2*n) of ralist (pt a, n)
  | {n:nat} RAodd (a, 2*n+1) of (a, ralist (pt a, n))
// end of [ralist]

Given an unboxed type T and a static integer, the type ralist(T, I) is a boxed type for values representing lists of length I. The meaning of the three constructors RAnil, RAevn and RAodd can be briefly explained as follows:

For example, the list of the first 5 positive integers is represented by the following value:

RAodd(1, RAevn(RAodd( '( '(2, 3), '(4, 5) ), RAnil())))

Values constructed by the constructors RAnil, RAevn and RAodd represent lists that support operations of logrithmic time for accessing and updating list elements, and such lists are often referred to as random-access lists.

Note that the datatype ralist is not supported in ML even if the index representing list length is erased. This kind of datatypes are often referred to as nested datatypes, which are also supported in Haskell.

The following defined function ralist_length computes the length of a random-access list:

fun{a:t@ype}
ralist_length // O(log n)
  {n:nat} .<n>.
  (xs: ralist (a, n)): int n =
  case+ xs of
  | RAnil () => 0
  | RAevn xxs => 2 * ralist_length<pt(a)> (xxs)
  | RAodd (_, xxs) => 2 * ralist_length<pt(a)> (xxs) + 1
// end of [ralist_length]

While the implementation of ralist_length is clearly not tail-recursive, this is hardly of any practical concern as the time-complexity of ralist_length is O(log(n)).

Consing means to form a new list with an element and a list such that the element and the list are the head and tail of the new list. The following defined function ralist_cons implements consing for random-access lists:

fun{a:t@ype}
ralist_cons // O(1), amortized
  {n:nat} .<n>.
  (x0: a, xs: ralist (a, n)): ralist (a, n+1) =
  case+ xs of
  | RAnil () => RAodd (x0, RAnil)
  | RAevn (xxs) => RAodd (x0, xxs)
  | RAodd (x1, xxs) =>
      RAevn (ralist_cons<pt(a)> ( '(x0, x1), xxs ))
    // end of [RAodd]
// end of [ralist_cons]

In the worst case (where the length of xs is a power of 2 minus 1), ralist_cons takes O(log(n))-time to finish. However, it can be proven that the amortized time-complexity of consing on random-access lists is O(1).

Unconsing does the opposite of consing: It returns a pair consisting of the head and tail of a given non-empty list. The following defined function ralist_uncons implements unconsing for random-access lists:

fun{a:t@ype}
ralist_uncons // O(1), amortized
  {n:pos} .<n>.
  (xs: ralist (a, n)): (a, ralist (a, n-1)) =
  case+ xs of
  | RAevn (xxs) => let
      val (xx, xxs) = ralist_uncons<pt(a)> (xxs)
    in
      (xx.0, RAodd (xx.1, xxs))
    end // end of [RAevn]
  | RAodd (x, xxs) => (case+ xxs of
(*
// Note: [=>>] is needed for enforcing sequentiality
// during typechecking:
*)
      RAnil () => (x, RAnil) | _ =>> (x, RAevn (xxs))
    ) // end of [RAodd]
// end of [ralist_uncons]

Like ralist_cons, ralist_uncons takes O(log(n))-time to finish when the length of xs is a power of 2. However, its amortized time-complexity is also O(1). It is highly probable for a programmer to implement the second matching clause in the body of ralist_uncons as follows:

  | RAodd (x, xxs) => (x, RAevn (xxs))

For instance, I myself did this. This is a program error as the invariant can potentially be broken that states RAevn being only applied to a value representing a non-empty list. The error is readily caught by the typechecker of ATS but it is most likely to go unnoticed in a setting where the invariant on the constructor RAevn can not be captured at compile-time.

Given a random-access list xs of length n, the elements in it are numbered from 0 to n-1, inclusively. We can find element i in the list xs, where i is assumed to be a natural number less than n, by implementing the following algorithm:

The following function ralist_lookup is implemented precisely according to the given algorithm:

fun{a:t@ype}
ralist_lookup // O(log(n))-time
  {n:int} {i:nat | i < n} .<n>.
  (xs: ralist (a, n), i: int i): a =
  case+ xs of
  | RAevn xxs => let
      val i2 = i / 2
      val lr = i - 2 * i2
      val xx = ralist_lookup<pt(a)> (xxs, i2)
    in
      if lr = 0 then xx.0 else xx.1
    end // end of [RAevn]
  | RAodd (x, xxs) =>
      if i > 0 then let
        val i1 = i - 1
        val i2 = i1 / 2
        val lr = i1 - 2 * i2
        val xx = ralist_lookup<pt(a)> (xxs, i2)
      in
        if lr = 0 then xx.0 else xx.1
      end else x
    // end of [RAodd]
// end of [ralist_lookup]

Clearly, the time complexity of ralist_lookup is O(log(n)).

Given a random-access list xs of length n, an index that is a natural number less than n and an element x0, the following defined function ralist_update returns another random-access that is the same as xs except element i in it being x0:

fun{a:t@ype}
ralist_update // O(log(n))-time
  {n:int} {i:nat | i < n} .<n>. (
    xs: ralist (a, n), i: int i, x0: a
  ) : ralist (a, n) = let
//
  fun{a:t@ype} fupdate
    {n:int} {i:nat | i < n} .<n,1>. (
      xs: ralist (a, n), i: int i, f: a -<cloref1> a
    ) : ralist (a, n) =
    case+ xs of
    | RAevn xxs =>
        RAevn (fupdate2 (xxs, i, f))
    | RAodd (x, xxs) =>
        if i > 0 then
          RAodd (x, fupdate2 (xxs, i-1, f))
        else RAodd (f(x), xxs)
   (* end of [fupdate] *)
//
   and fupdate2
     {n2:int} {i:nat | i < n2+n2} .<2*n2,0>. (
       xxs: ralist (pt(a), n2), i: int i, f: a -<cloref1> a
     ) : ralist (pt(a), n2) = let
     val i2 = i / 2
     val lr = i - 2 * i2
     val f2 = (
       if lr = 0 then
         lam xx => '(f(xx.0), xx.1)
       else
         lam xx => '(xx.0, f(xx.1))
     ) : pt(a) -<cloref1> pt(a)
   in
     fupdate<pt(a)> (xxs, i2, f2)
   end // end of [fupdate2]
//
in
  fupdate (xs, i, lam _ => x0)
end // end of [ralist_update]

Note that the functions fupdate and fupdate2 are higher-order. Given a random-access list xs of length n, an index i that is a natural number less than n and a (closure) function f, fupdate returns another random-access list that is the same as xs except element i in it being f(x), where x is element i in xs. It is straightforward to see that the time-complexity of ralist_update is O(log(n)). I leave the reader to figure out further details on this interesting implementation.

The code employed for illustration in this section plus some additional code for testing is available on-line.