abstype myseq
Naturally, we assign the following interface to [mergesort]:
fun mergesort (xs: myseq): myseq
A straightforward implementation of [mergesort] can be given as follows:
implement mergesort (xs) = let val n = myseq_length (xs) in // if n >= 2 then let val (xs1, xs2) = myseq_split (xs) in myseq_merge (mergesort (xs1), mergesort (xs2)) end else (xs) // end of [if] // end // end of [mergesort]where the functions [myseq_length], [myseq_split] and [myseq_merge] are assigned the following types:
fun myseq_length (xs: myseq): int fun myseq_split (xs: myseq): (myseq, myseq) fun myseq_merge (xs1: myseq, xs2: myseq): myseqand their functionalities are briefly explained below:
In ATS, we can refine [myseq] as follows by making it a type constructor carrying one integer parameter:
abstype myseq(n:int)
Given a static integer N, [myseq(N)] is for myseq-values of length N.
The type assigned to [mergesort] can be refined as follows to capture
the invariant that a given myseq-value and its sorted version have the
same length:
fun mergesort{n:nat} (xs: myseq(n)): myseq(n)
As for the functions [myseq_length], [myseq_split]
and [myseq_merge], they can be assigned the following more informative types:
// fun myseq_length{n:int} (xs: myseq(n)): int(n) // fun myseq_split{n:int | n >= 2} (xs: myseq(n)): [n1,n2:pos | n1+n2==n] (myseq(n1), myseq(n2)) // fun myseq_merge{n1,n2:nat} (xs1: myseq(n1), xs2: myseq(n2)): myseq(n1+n2) //Note that the type assigned to [myseq_split] implies that this function can only be applied to a given myseq-value containing at least 2 items and the two myseq-values it returns are strictly shorter than the given myseq-value.
The above implementation of [mergesort] can be slightly modified into the following version:
implement mergesort (xs) = let // fun sort {n:nat} .<n>. ( xs: myseq(n) ) : myseq(n) = let val n = myseq_length (xs) in if n >= 2 then let val (xs1, xs2) = myseq_split (xs) in myseq_merge (sort (xs1), sort (xs2)) end else (xs) // end of [if] end // end of [sort] // in sort (xs) end // end of [mergesort]Note that [sort] is verified to be terminating based on the termination metric <n> (supplied by the programmer).
When trying to implement [myseq_split], we should be able to quickly realize that the following interface is much more suitable for it:
fun myseq_split{n:int | n >= 2}
(xs: myseq(n), n: int n): (myseq(n/2), myseq(n-n/2))
The interface states that [myseq_split] returns a pair of myseq-values of
length n/2 and n-n/2 when applied to a myseq-value of length n and an
integer of value n. The implementation of [mergesort] can now be slightly
modified as follows:
implement mergesort (xs) = let // fun sort {n:nat} .<n>. ( xs: myseq(n), n: int n ) : myseq(n) = let in if n >= 2 then let val n2 = half (n) val (xs1, xs2) = myseq_split (xs, n) in myseq_merge (sort (xs1, n2), sort (xs2, n-n2)) end else (xs) // end of [if] end // end of [sort] // in sort (xs, myseq_length(xs)) end // end of [mergesort]We now have a specification-like implementation of mergesort that typechecks, which can be thought of as some form of blueprint intended for implementing mergesort on concrete types such as lists and arrays.
fun{a:t0p}
mergesort{n:nat} (xs: list (a, n)): list (a, n)
This interface indicates that [mergesort] is a function template
parameterized over the type of the elements in a list given as its
argument. Let us now focus on [myseq_merge], which is given the
following interface:
fun{a:t0p} myseq_merge {n1,n2:nat} (xs1: list(a, n1), xs2: list(a, n2)): list(a, n1+n2) // end of [myseq_merge]Following is a straightforward implementation of [myseq_merge] on lists:
implement {a}(*tmp*) myseq_merge (xs10, xs20) = let in // case+ xs10 of | cons (x1, xs11) => ( case+ xs20 of | cons (x2, xs21) => let val sgn = gcompare_val<a> (x1, x2) in if sgn <= 0 then cons{a}(x1, myseq_merge<a> (xs11, xs20)) else cons{a}(x2, myseq_merge<a> (xs10, xs21)) // end of [if] end (* end of [cons] *) | nil ((*void*)) => xs10 ) | nil ((*void*)) => xs20 // end // end of [myseq_merge]Note that [gcompare_val] is a generic function template for comparing values.
Please find the entirety of the implementation of mergesort on lists plus some testing code in mergesort_list.dats.
First, as an array in ATS is of C-style, there is no size information attached to the array. So the interface for [mergesort] needs to be modified as follows:
fun{a:t0p}
mergesort
{n:nat}
(xs: arrayref(a, n), n: int n): arrayref(a, n)
This interface means that [mergesort] takes both an array and the size of
the array as its two arguments.
Following is the interface for [myseq_merge] on arrays:
fun{a:t0p} myseq_merge {n1,n2:nat} ( xs1: arrayref(a, n1), xs2: arrayref(a, n2), n1: int(n1), n2: int(n2) ) : arrayref(a, n1+n2) // end of [myseq_merge]There is extensive use of unsafe programming features in my implementation of [myseq_merge] on arrays. Writing code in this way is primarily for the purpose of saving some time as the given implementation of mergesort on arrays is not meant for practical use; it is only meant to make a point that the specification-like implementation of [mergesort] given above can indeed be adapted to handle arrays.
Please find the entirety of the implementation of mergesort on arrays plus some testing code in mergesort_array.dats.
This article is written by Hongwei Xi.