Chapter 7. Functional List-Processing (1)

Lists are by far the most commonly used data structure in functional programming: The functional programming language LISP is even named after LISt Processor. For someone with a background in imperative programming, it is important to note that a functional list is essentially an immutable linked-list: Such a list can never be changed after its creation. In particular, functional list-processing cannot modify any lists being processed.

In ATS, there is a datatype list0 declared as follows:

// datatype list0(a:t@ype) = list0_nil of () | list0_cons of (a, list0(a)) //

Note that t@ype is a sort for static terms representing types of dynamic values of unspecified size. There is also a sort type in ATS, which is for types of boxed dynamic values (of the size of exactly one machine word). Given any type T, list0(T) is for lists consisting of elements of the type T.

Every datatype is associated with a set of data-constructors, which are called for constructing (boxed) values of the datatype. The declaration of list0 indicates that list0_nil and list0_cons are the two constructors assocated with list0; list0_nil is nullary; list0_cons is binary, which takes a given element and a given list to form a new list such that the given element and list are the head and tail of the newly formed list, respectively. For instance, the following code binds xs3 to a list of type list0(int) that contains three elements 3, 2, and 1:

val xs0 = list0_nil{int}() // xs = () val xs1 = list0_cons(1, xs0) // xs = (1) val xs2 = list0_cons(2, xs1) // xs = (2, 1) val xs3 = list0_cons(3, xs2) // xs = (3, 2, 1)

If list0_nil() is used instead, then the typechecker of ATS can only infer that the type of xs3 is list0(T) for some type T such that 3, 2, and 1 are values of T. Note that the type system of ATS is highly expressive and it is beyond the focus of this book to cover advanced use of types in ATS.

In ATS, nil0 and cons0 are declared aliases for list0_nil and list0_cons, respectively. Following the convention of ML, we can use :: for list0_cons as well:

#define :: list0_cons val xs123 = 1 :: 2 :: 3 :: nil0{int}() // xs123 = (1, 2, 3)

In addition, another notation for constructing list0-values is used in the following sample code:

val xs123 = g0ofg1($list{int}(1, 2, 3)) // xs123 = (1, 2, 3) val ys123 = g0ofg1($list{string}("1", "2", "3")) // ys123 = ("1", "2", "3")

Note that $list{T}(...) is for constructing a list-value in which each element is of type T and g0ofg1 is a cast function (of zero run-time cost) that casts a list-value into the corresponding list0-value.

The function for computing the length of a given list0-value can be implemented as follows:

fun {a:t@ype} list0_length (xs0: list0(a)): int = ( case xs0 of | list0_nil() => 0 | list0_cons(x0, xs1) => 1 + list0_length<a>(xs1) ) (* end of [list0_length] *)

Strictly speacking, list0_length is a function template (rather than a function). In the case-expression, there are two pattern matching clauses; each clause consists of a guard, which is pattern, and a body, which is an expression. When the value xs0 matches the guard of a clause, the body of the clause is chosen for subsequent evaluation. If xs0 is empty, then it is constructed with list0_nil and thus matches the pattern list0_nil(). If xs0 is non-empty, then then it is constructed with list0_cons and thus matches the pattern list0_cons(x0, xs1), resulting in the names x0 and xs1 bound to the head and tail of xs0, respectively.

Though the given implementation of list0_length is not tail-recursive, it should be clear that a tail-recursive implementation can be readily done. Note that the function list0_length is of O(n)-time for n being the length of its argument. Often I see someone writing the code list0_length(xs) > 0 for testing whether a given list xs is empty. This practice is terribly inefficient as checking whether a list is empty can be easily done in O(1)-time.

The function for concatenating two given list0-values can be implemented as follows:

fun {a:t@ype} list0_append (xs: list0(a), ys: list0(a)): list0(a) = ( case+ xs of | list0_nil() => ys | list0_cons(x, xs) => list0_cons(x, list0_append<a>(xs, ys)) ) (* end of [list0_append] *)

Given xs and ys, list0_append returns a list that represents the concatenation of xs and ys. Clearly, list0_append is O(n)-time for n being the length of xs. The implementation of list0_append is functional in the sense that it does not alter either xs or ys for the construction of the concatenation. For clarification, I point out that there is a function of the name list_vt_append in ATS that consumes two given linear lists to construct their concatenation. When a call to list_vt_append returns, the two linear lists passed to the call are no longer available for subsequent use.

The function for constructing the reverse of a given list0-value can be implemented as follows:

fun {a:t@ype} list0_reverse (xs: list0(a)): list0(a) = list0_revappend<a>(xs, list0_nil()) and list0_revappend (xs: list0(a), ys: list0(a)): list0(a) = ( case+ xs of | list0_nil() => ys | list0_cons(x, xs) => list0_revappend<a>(xs, list0_cons(x, ys)) ) (* end of [list0_revappend] *)

Given a list0-value xs, list0_reverse returns a newly contructed list0-value that represents the reverse of xs. Given two list0-values xs and ys, list0_revappend returns a newly contructed list0-value that represents the concatenation of the reverse of xs and ys. Clearly, both list0_reverse and list0_revappend are O(n)-time functions, where n is the length of xs.

A commonly used list-combinator is the list0_foldleft function implemented as follows:

// extern fun {r:t@ype} {a:t@ype} list0_foldleft ( xs: list0(a) , r0: r, fopr: cfun(r, a, r)): r // implement {r}{a} list0_foldleft ( xs , r0, fopr) = loop(xs, r0) where { fun loop ( xs: list0(a), r0: r ) : r = ( case+ xs of | list0_nil() => r0 | list0_cons(x, xs) => loop(xs, fopr(r0, x)) ) (* end of [loop] *) } //

where the elements in the given list xs are processed from left to right. Clearly, list0_foldleft is tail-recursive.

As an example, the function for computing the length of a given list0-value can be implemented with a direct call to list0_foldleft:

// fun {a:t@ype} list0_length (xs: list0(a)): int = list0_foldleft<int><a>(xs, 0, lam(r, _) => r + 1) //

As another example, the reverse-append function that returns the link of the reverse of a given list0-value with another given list0-value can also be implemented with a direct call to list0_foldleft:

fun {a:t@ype} list0_revappend ( xs: list0(a), ys: list0(a) ) : list0(a) = list0_foldleft<list0(a)><a>(xs, ys, lam(ys, x) => list0_cons(x, ys))

In contrast to list0_foldleft, the following function list0_foldright processes the elements in a given list from right to left:

// extern fun {a:t@ype} {r:t@ype} list0_foldright ( xs: list0(a) , fopr: cfun(a, r, r), r0: r): r // implement {a}{r} list0_foldright ( xs , fopr, r0) = auxlst(xs) where { fun auxlst (xs: list0(a)): r = ( case+ xs of | list0_nil() => r0 | list0_cons(x, xs) => fopr(x, auxlst(xs)) ) (* end of [auxlst] *) } //

Please note that list0_foldright is not tail-recursive. As an example, the append function for concatenating two given list0-values can be implmented with a direct call to list0_foldright:

// fun {a:t@ype} list0_append (xs: list0(a), ys: list0(a)): list0(a) = list0_foldright<a><list0(a)> (xs, lam(x, ys) => list0_cons(x, ys), ys) //

As another example, the list-concatenate function for concatenating a list of list0-values can also be implemented with a direct call to list0_foldright:

// fun {a:t@ype} list0_concat (xss: list0(list0(a))): list0(a) = list0_foldright<list0(a)><list0(a)> (xss, lam(xs, res) => list0_append<a>(xs, res), list0_nil()) //

In a case where a function (for example, list0_length) can be implemented with a call to either list0_foldleft or list0_foldright, it is clear (unless there is a very special reason) that the former should be chosen as it is tail-recursive but the latter is not.

Often I see a beginner of functional programming giving the following implementation of list0_reverse:

fun {a:t@ype} list0_reverse (xs: list0(a)): list0(a) = ( case+ xs of | list0_nil() => list0_nil() | list0_cons(x, xs) => list0_append<a>(list0_reverse<a>(xs), list0_cons(x, list0_nil())) )

While this implementation is functionally correctly, it is of O(n2)-time complexity and thus can take a prohibitively long time for an input that is not so short (e.g., one containing 10K elements).

As the last example of this chapter, list0_insertion_sort is implemented as follows that applies insertion sort to a given list to return a sorted version of the list:

// extern fun {a:t@ype} list0_insertion_sort ( xs: list0(a), cmp: cfun(a, a, int) ) : list0(a) // implement {a}(*tmp*) list0_insertion_sort (xs, cmp) = let // fun insord ( x0: a , xs: list0(a)): list0(a) = ( case+ xs of | list0_nil() => list0_sing(x0) | list0_cons(x1, xs1) => ( if cmp(x0, x1) < 0 then list0_cons(x0, xs) else list0_cons(x1, insord(x0, xs1)) // end of [if] ) // end of [list0_cons] ) (* end of [insord] *) // in // list0_foldleft<list0(a)><a> (xs, list0_nil(), lam(res, x) => insord(x, res)) // end // end of [list0_insertion_sort] //

Note that the argument cmp is a comparion function that is expected to return -1, 1, or 0 when its first argument is less than, greater than, or equal to its second argument, respectively.

Please find on-line the entirety of the code used in this chapter.