atsdoc


The command atsdoc is a utility in ATS for turning a text file laden with texting function calls into one in which these calls are replaced with the strings represented by their return values. This utility is designed for people who have some basic knowledge of programming in ATS.

Texting Function Calls

The syntax for texting function calls (TFC's) is given as follows:

funarg    ::= DSTRING | SSTRING | INTEGER | ID | funcall
funarglst ::= /*empty*/ | funarg | funarg "," funarglst
funcall   ::= "#" ID {funres} "(" funarglst ")" | "#" ID {funres} "{" funarglst "}"
funres    ::= "[" ID "]"
where DSTRING and SSTRING stand for double-quoted and single-quoted strings, respectively, INTEGER stands for integers of base 10, and ID stands for valid identifiers in ATS. For instance, the following are some syntactically valid TFC's:
#fact(10)
#timestamp[NOW]()
#foo("#Hello("world")")
#foo("#Hello('world')")
#bar(#version(), 100)
#foolst{itm1, itm2, itm3}
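
The grammar above can be read as a small recursive-descent parser. The following is a minimal sketch in Python, not the parser used by atsdoc: the names parse_tfc, Parser and tokenize are hypothetical, {funres} is assumed to mean an optional result name, identifiers are simplified to letters, digits and underscores, and a string literal is assumed to end at the first unescaped matching quote.

```python
import re

# Token patterns. These are simplifying assumptions for illustration only;
# the lexer in atsdoc accepts the full range of ATS identifiers.
TOKEN = re.compile(
    r'\s*(?:(?P<DSTRING>"(?:[^"\\]|\\.)*")'
    r"|(?P<SSTRING>'(?:[^'\\]|\\.)*')"
    r"|(?P<INTEGER>[0-9]+)"
    r"|(?P<ID>[A-Za-z_][A-Za-z0-9_]*)"
    r"|(?P<PUNCT>[#(){}\[\],]))"
)

def tokenize(src):
    """Split src into (kind, text) pairs, skipping whitespace."""
    tokens, pos = [], 0
    src = src.rstrip()
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens

class Parser:
    def __init__(self, tokens):
        self.toks, self.i = tokens, 0

    def peek(self):
        return self.toks[self.i] if self.i < len(self.toks) else (None, None)

    def next(self):
        tok = self.peek()
        self.i += 1
        return tok

    def expect(self, text):
        _, val = self.next()
        if val != text:
            raise SyntaxError(f"expected {text!r}, got {val!r}")

    def ident(self):
        kind, val = self.next()
        if kind != "ID":
            raise SyntaxError(f"expected an identifier, got {val!r}")
        return val

    def funcall(self):
        # funcall ::= "#" ID {funres} "(" funarglst ")" | ... "{" funarglst "}"
        self.expect("#")
        name, res = self.ident(), None
        if self.peek() == ("PUNCT", "["):      # funres ::= "[" ID "]"
            self.next()
            res = self.ident()
            self.expect("]")
        closers = {"(": ")", "{": "}"}
        kind, opener = self.next()
        if kind != "PUNCT" or opener not in closers:
            raise SyntaxError(f"expected '(' or '{{', got {opener!r}")
        args = self.funarglst(closers[opener])
        self.expect(closers[opener])
        return ("call", name, res, args)

    def funarglst(self, closer):
        # funarglst ::= /*empty*/ | funarg | funarg "," funarglst
        if self.peek() == ("PUNCT", closer):
            return []
        first = self.funarg()
        if self.peek() == ("PUNCT", ","):
            self.next()
            return [first] + self.funarglst(closer)
        return [first]

    def funarg(self):
        kind, val = self.peek()
        if (kind, val) == ("PUNCT", "#"):
            return self.funcall()
        if kind in ("DSTRING", "SSTRING", "INTEGER", "ID"):
            self.next()
            return (kind, val)
        raise SyntaxError(f"unexpected token {val!r}")

def parse_tfc(src):
    """Parse one TFC and return a nested tuple describing it."""
    p = Parser(tokenize(src))
    tree = p.funcall()
    if p.i != len(p.toks):
        raise SyntaxError("trailing input after the call")
    return tree

print(parse_tfc("#bar(#version(), 100)"))
```

Because funarglst is transcribed literally from the grammar, this sketch also accepts a trailing comma before the closing bracket; whether atsdoc itself does so is not established here.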

Text Laden with TFC's

Let us coin the word atext to refer to text laden with TFC's. Suppose that the following is the content of a file named foo.atxt:

Author: #author()
Time of the last modification: #timestamp[NOW]()

If we execute the following command-line:

atsdoc --outcode foo_atxt.dats --outdata foo_atxt.txt -i foo.atxt
then two files named foo_atxt.dats and foo_atxt.txt are generated. It is also possible to do the same thing by executing the following command-line:
atsdoc --outcode foo_atxt.dats -i foo.atxt > foo_atxt.txt

The content of foo_atxt.dats is listed as follows:

(*
foo.atxt: 10(line=1, offs=10) -- 18(line=1, offs=18)
*)
val __tok1 = author()
val () = theAtextMap_insert_str ("__tok1", __tok1)

(*
foo.atxt: 51(line=2, offs=33) -- 67(line=2, offs=49)
*)
val NOW = timestamp()
val () = theAtextMap_insert_str ("NOW", NOW)
Note that the name of the identifier __tok1 is generated automatically while the name of the identifier NOW is taken from the input. The embedded location information in foo_atxt.dats is present primarily for the purpose of debugging.

The content of foo_atxt.txt is listed as follows:

Author: #__tok1$
Time of the last modification: #NOW$
Note that each marked token in foo_atxt.txt is formed by placing an identifier between the char '#' and the char '$'.
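
To make the role of marked tokens concrete, the following Python sketch models what replacing them amounts to. It is an illustration under stated assumptions rather than the libatsdoc implementation: the function substitute and the lookup table are hypothetical, the sample values are chosen only for illustration, identifiers are simplified to letters, digits and underscores, and tokens with no entry in the table are left unchanged.

```python
import re

# A marked token is '#', an identifier, then '$'. The identifier pattern
# is a simplification; ATS accepts a richer set of identifiers.
MARKED = re.compile(r"#([A-Za-z_][A-Za-z0-9_]*)\$")

def substitute(template, table):
    """Replace each marked token whose identifier appears in table."""
    def repl(m):
        # Unknown tokens are kept as-is (an assumption of this sketch).
        return table.get(m.group(1), m.group(0))
    return MARKED.sub(repl, template)

# The content of foo_atxt.txt as listed above.
template = "Author: #__tok1$\nTime of the last modification: #NOW$\n"

# Sample values standing in for the results of author() and timestamp().
table = {"__tok1": "John Doe", "NOW": "Wed Aug 24 20:31:59 2011"}

result = substitute(template, table)
print(result, end="")
```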

The plan is to compile foo_atxt.dats into an executable that can generate a text file by replacing each marked token in foo_atxt.txt with some text attached to it. However, the main function is not present in foo_atxt.dats. Also, the functions author and timestamp are not available. By embedding proper ATS source code into foo.atxt, we can readily resolve these issues and fulfill the plan.

Let foo2.atxt be a file of the following content:

%{
//
dynload "libatsdoc/dynloadall.dats"
//
staload "libatsdoc/SATS/libatsdoc_atext.sats"
//
%}

%{
fn author () = atext_strcst"John Doe"
%}

%{
staload
UN = "prelude/SATS/unsafe.sats"
staload TIME = "libc/SATS/time.sats"

fn timestamp
  (): atext = let
  var time = $TIME.time_get ()
  val (fpf | x) = $TIME.ctime (time)
  val x1 = sprintf ("%s", @($UN.castvwtp1(x)))
  prval () = fpf (x)
  val x1 = string_of_strptr (x1)
in
  atext_strcst (x1)
end // end of [timestamp]
%}

Author: #author()
Time of the last modification: #timestamp[NOW]()

%{
implement main () = fprint_filsub (stdout_ref, "foo2_atxt.txt")
%}
Any text surrounded by the special symbols '%{' and '%}' is copied into foo2_atxt.dats after the following command-line is executed:
atsdoc -do foo2_atxt.dats -i foo2.atxt > foo2_atxt.txt
The function fprint_filsub is called to replace each marked token in foo2_atxt.txt with the string attached to it.

We can now compile foo2_atxt.dats into foo2 and then dump the output from executing foo2 into foo2.output:

atscc -o foo2 foo2_atxt.dats -latsdoc
./foo2 > foo2.output
As expected, the following is the content of foo2.output:
Author: John Doe
Time of the last modification: Wed Aug 24 20:31:59 2011

Representation for Texts

The functions author and timestamp presented above do not return strings. Instead, they return values of the type atext, which is declared in libatsdoc/SATS/libatsdoc_atext.sats as follows:

datatype atext =
//
  | ATEXTnil of () // empty text
//
  | ATEXTstrcst of string // string constants
  | ATEXTstrsub of string // strings containing marked tokens
//
  | ATEXTapptxt2 of (atext, atext) // text concatenation
  | ATEXTappstr2 of (string, string) // string concatenation
//
  | ATEXTapptxt3 of (atext, atext, atext) // text concatenation
  | ATEXTappstr3 of (string, string, string) // string concatenation
//
  | ATEXTconcatxt of atextlst // text concatenation
  | ATEXTconcatxtsep of (atextlst, atext(*sep*)) // text concatenation with separator
// end of [atext]

where
atextlst = List (atext)
and
stringlst = List (string)
The meaning of the data constructors associated with the datatype atext should be easy to understand, with the exception of ATEXTstrsub, which indicates that its (string) argument may contain marked tokens, that is, symbols formed by placing identifiers between the two characters '#' and '$'. When stringizing a value of the form ATEXTstrsub(str) for some string str, we must replace each marked token in str with the string it represents. For further details, please see the implementation of fprint_strsub in libatsdoc/DATS/libatsdoc_atext.dats.
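
The stringizing described above can be modelled as a recursive walk over an atext value. The following Python sketch mirrors the constructors of the datatype as tagged tuples. It is a model under assumptions, not a transcription of fprint_strsub: the function stringize is hypothetical, the table is assumed to map identifiers to atext values, and marked tokens with no entry in the table are assumed to be left unchanged.

```python
import re

# Simplified pattern for marked tokens of the form '#' ID '$'.
MARKED = re.compile(r"#([A-Za-z_][A-Za-z0-9_]*)\$")

def stringize(txt, table):
    """Turn a tagged-tuple model of an atext value into a string."""
    tag, *args = txt
    if tag == "ATEXTnil":
        return ""
    if tag == "ATEXTstrcst":
        return args[0]
    if tag == "ATEXTstrsub":
        def repl(m):
            name = m.group(1)
            # Replace a marked token with the text attached to it, if any.
            return stringize(table[name], table) if name in table else m.group(0)
        return MARKED.sub(repl, args[0])
    if tag in ("ATEXTapptxt2", "ATEXTapptxt3"):
        return "".join(stringize(t, table) for t in args)
    if tag in ("ATEXTappstr2", "ATEXTappstr3"):
        return "".join(args)
    if tag == "ATEXTconcatxt":
        return "".join(stringize(t, table) for t in args[0])
    if tag == "ATEXTconcatxtsep":
        items, sep = args
        return stringize(sep, table).join(stringize(t, table) for t in items)
    raise ValueError(f"unknown constructor: {tag}")

# Sample table: the identifier NOW is bound to a string constant.
table = {"NOW": ("ATEXTstrcst", "Wed Aug 24 20:31:59 2011")}

doc = ("ATEXTconcatxtsep",
       [("ATEXTappstr2", "Author: ", "John Doe"),
        ("ATEXTstrsub", "Time of the last modification: #NOW$")],
       ("ATEXTstrcst", "\n"))

print(stringize(doc, table))
```

Only the ATEXTstrsub case consults the table; a string wrapped in ATEXTstrcst is emitted verbatim even if it happens to contain '#' and '$'.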


Please find on-line all the files involved in the above presentation. The atext file for producing the current html file is also available on-line.