[
a copy of the SuccessorML overview from
    http://www.mpi-sws.org/~rossberg/hamlet/README-succ.txt
]

This version of HaMLet incorporates a number of simple extensions and
changes proposed for Successor ML [www.successor-ml.org].


Lexical Syntax
--------------

* Line Comments.

  The token "(*)" begins a comment that stretches until the end of line.

* Extended Literals.

  Numeric literals may contain underscores to group digits, as in
  1_000_000_000, 3.141_592_653, or 0wx_f300_4588. Furthermore, numeric
  literals in binary notation are supported, e.g. 0wb1101_0010.


Records
-------

* Record Punning (derived form).

  In record expressions, a field of the form id=id can be abbreviated
  to id (in Standard ML that is only allowed in patterns). For instance:

    fn {a, b, c} => {a, b=b+1, c, d=0}

* Record Extension.

  A record r can be extended using the notation {a=2, b=3, ...=r}. For
  instance,

    val r1 = {a=1, b=2}
    val r2 = {c=0, ...=r1}

  binds r2 to {a=1, b=2, c=0}. The same syntax is available in patterns
  to match the "rest" of a record:

    case r2 of {b, ...=r'} => r'

  evaluates to {a=1, c=0}.

  As a derived form, ellipses can appear at any position in the record
  (but only once):

    val r2 = {...=r1, c=0}

  Note that the context must still determine the fields denoted by the
  ellipses (i.e. there is no record polymorphism yet). Those fields may
  not overlap with the enumerated ones.

  Record types can also be formed by extension:

    type 'a t = {a:'a, b:bool}
    type 'a u = {c:char, d:'a list, ...:'a t}

  Now 'a u is equivalent to {a:'a, b:bool, c:char, d:'a list}.

* Record update (derived form).

  Besides extension, a record r can be updated, using the notation
  {r where a=2, b=3}. For example:

    val r1 = {a=1, b=2, c=6}
    val r2 = {r1 where b="hi"}

  binds r2 to {a=1, b="hi", c=6}.

  Again, the context must determine the fields in the base record.
  It must include all updated fields (i.e. it is orthogonal to extension).

  Note that update may change the types of modified fields.


Pattern Matching
----------------

* Disjunctive Patterns.

  So-called "or-patterns" of the form "pat1 | pat2" are supported. For
  example:

    fun fac (0|1) = 1
      | fac   n   = n * fac(n-1)

  The syntax subsumes multiple or-patterns and multiple matches:

    case exp of
      | A | B | C => 1
      | D | E     => 2

* Conjunctive Patterns.

  Layered patterns ("as" patterns) have been generalised to allow arbitrary
  subpatterns on both sides, i.e. "pat1 as pat2". They are useful in
  combination with nested matches (see below), but also have the advantage that
  the order of binding may now be inverted (pat as x), which sometimes is more
  readable.

* Nested matches.

  Patterns may contain nested matching constructs of the form

    pat1 with pat2 = exp

  Such a pattern is matched by first matching pat1, then evaluating exp, and
  matching its result against pat2. Variables bound in pat1 may occur in exp.
  The pattern fails when either pattern does not match. The pattern binds the
  combined set of variables occuring in pat1 and pat2. For instance, consider:

    case xs of [x,y] with SOME z = f(x,y) => x+y+z | _ => 0

  If xs is a two-element list [x,y] such that f(x,y) returns SOME z, then the
  whole expression evaluates to x+y+z, otherwise to 0.

  Nested matches are a very general construct that allows simple "views" and
  pattern guards to be defined uniformly as derived forms (see below). They
  can also be useful in combination with disjunctive patterns,

    case args of x::_ | (nil with x = 0) => ...

  or with guards (see below):

    fun escape #"\"" = "\\\""
      | escape #"\\" = "\\\\"
      | escape (c with n=ord c) if (n < 32) = "\\^" ^ str(chr(n+64))
      | escape c = str c

* Pattern guards (derived form).

  Patterns may contain a boolean guards of the form "if exp", like in

    fun nth(l, n) =
        case (l, n) of
           | (x::_,  0)        => x
           | (_::xs, n) if n>0 => nth(xs, n-1)
           |     _             => raise Subscript

  Guards are also allowed in function declarations:

    fun nth(x::_,  0)          = x
      | nth(_::xs, n) if (n>0) = nth(xs, n-1)
      | nth(  _,   _)          = raise Subscript

  A pattern guard "pat if exp" actually is syntactic sugar for the the nested
  pattern "pat with true = exp" and may hence appear inside other patterns.

* Transformation patterns (derived form).

  Transformation patterns provide "poor man's views" in the form

    ?exp
    ?exp pat

  The first form provides boolean "views":

    fun skipSpace(?isSpace :: cs) = skipSpace cs
      | skipSpace cs = cs

  The parameterised form allows actual matching. Consider a signature for
  queues:

    type 'a queue
    val empty : 'a queue
    val enqueue : 'a * 'a queue -> 'a queue
    val dequeue : 'a queue -> ('a * 'a queue) option

  With transformations, queues can be pattern matched as follows:

    fun process (?dequeue(x,q)) = (digest x; process q)
      | process _ = terminate()

  The transformation may be denoted by an arbitrary expression. Consider a
  simple set ADT:

    type set
    val empty : set
    val insert : int -> set -> set
    val isEmpty : set -> bool
    val has : int -> set -> bool

  The following is possible:

    fun f n ?isempty = f1 ()
      | f n ?(has n) = f2 ()
      | f n _ = f3 ()

  Or another example, with a parameterised transformation:

    (*) val split : char -> string -> (string * string) option

    fun manExp(?(split #"E")(b,e)) = (b,e)
      | manExp s = (s,"1")

  Transformation patterns are actually only sugar for nested matches: the
  form "?exp" expands to "x with true = exp(x)", while the parameterised form
  "?exp pat" expands to "x with SOME(pat) = exp(x)".

  As a minor subtlety, in patterns with multiple subpatterns, patterns to the
  right may refer to variables bound by patterns to the left. For example,

    (x, ?(equals x))
    x as ?(notOccurs x)(T(x1,x2))

  In particular, this allows the function f from above to be expressed without
  a separate case expression.

  Note that HaMLet also features proper views (see below). Simultaneous support
  for both these features should allow evaluating the merits of each approach.

* Optional Bars and Semicolons.

  An optional bar is allowed before the first rule of a match, yielding
  more regular and editing-friendly notation:

    case f n of
       | LESS => f(n-1)
       | EQUAL => n
       | GREATER => f(n+1)

  This is also supported for "fn" and "handle", for function declarations, and
  for datatype declarations. For instance,

    datatype 'a exp =
       | Const  of 'a
       | Var    of string
       | Lambda of string * 'a exp
       | App    of 'a exp * 'a exp

  In a similar vein, optional terminating semicolons are allowed for expression
  sequences. For example, in a let expression:

    fun myfunc1(x, y) =
        let val z = x + y in
          f x;
          g y;
          h z;
        end

  The same applies to parenthesised expressions and sequences.


Value Definitions
-----------------

* Simpler Recursive Bindings.

  Recursive bindings can no longer override constructor status.

  The syntax for recursive bindings has been made less baroque: the "rec"
  keyword must always follow directly after "val", and may no longer be
  repeated. This just rules out pathological cases most programmers might
  not even be aware of.

* Strengthened Value Restriction.

  The value restriction has been extended to demand that patterns in
  polymorphic bindings are exhaustive. This fixes a conceptual bug
  in the language and enables type passing implementations.

* Do declarations.

  The simple derived form

    do exp

  expands to "val () = exp" and allows expressions to be evaluated for their
  side effects within declarations.


Type Definitions
----------------

* Withtype Specifications (derived form).

  The "withtype" syntax for defining mutually recursive datatypes and
  type synonyms is available in signatures (in Standard ML it is only
  supported in structures).

* Proper Scoping for Transparent Type Specifications (derived form).

  Transparent type specifications in signatures have scoping rules
  consistent with the rest of the language, i.e.

    type t = bool
    signature S =
    sig
      type t = int
      and  u = t
    end

  no longer equates u with int.

* Degraded Abstype (derived form).

  The more or less obsolete "abstype" declaration form has been removed
  from the bare language and been redefined as a simple derived form.
  This change does not have much visible effect, but simplifies
  implementations and can be seen as a first step towards removing it
  altogether.

* Abolished "and" in Type Realisations (derived form).

  The syntax of type realisations using "where" does no longer allow
  multiple equations connected with "and". That arcane syntax produced
  an annoying singularity in the language and was never implemented
  correctly by most SML implementations.


Views
-----

* View Declarations.

  Views enable the definition of abstract constructors for arbitrary types that
  can be used in patterns as if they were ordinary datatype constructors. A
  view primarily defines a set of constructors and two functions for converting
  between these and the actual type the view is defined for. For example,
  consider a simple view allowing (positive) integers to be viewed as inductive
  numbers:

    viewtype peano = int as Zero | Succ of int
    with
      fun from Zero     = 0
        | from (Succ n) = n+1
      fun to 0          = Zero
        | to n if (n>0) = Succ(n-1)
        | to n          = raise Domain
    end

  This defines a view for type int. The type constructor peano provides a name
  for this view. Views may be defined for arbitrary types, and there may be
  arbitrarily many views for a given type.

  Given the viewtype definition above, we can construct integers using the
  constructors it introduces:

    val n = Succ(Succ(Succ Zero))   (*) binds n to 3
    val n = Succ 2                  (*) likewise

  The function from given with the view declaration defines how a view
  constructor is converted to the underlying type, and is applied implicitly
  for every occurrence of a view constructor in an expression.

  The inverse function to defines how a value of the underlying type is
  interpreted in terms of the view constructors. It is applied implicitly
  whenever a value of the underlying type is matched against a pattern using
  one of the view's constructors:

    fun fac Zero    = 1
      | fac(Succ n) = Succ n * fac n

  This defines a factorial function on integers. When fac is applied to an
  integer i, the function to is implicitly applied to i first and its result
  is matched against the constructors appearing in the definition of fac.

  The body of a view declaration may contain arbitrary (auxiliary)
  declarations, but must feature the two functions from and to with the
  appropriate types. None of the declarations is visible outside the view
  declaration.

  Views must be used consistently, that is, a match may not use different
  views, or a view and concrete constants of the underlying type, *for the
  same position* in a pattern. For instance, the following is illegal:

    fun fac (0 | 1)  = 1
      | fac (Succ n) = Succ n * fac n

  Thanks to this restriction, the compiler is still able to check
  exhaustiveness and irredundancy of patterns, even in the presence of views.

* View Specifications.

  Views are particularly interesting in conjunction with abstract types. For
  that purpose, it is possible to specify views in signatures:

    signature COMPLEX =
    sig
      type complex
      viewtype cart = complex as Cart of real * real
      viewtype pole = complex as Pole of real * real
    end

  A view specification can either be matched by a corresponding view
  declaration, or by an appropriate datatype definition:

    structure Complex :> COMPLEX =
    struct
      datatype cart = Cart of real * real
      type complex = cart
      viewtype pole = complex as Pole of real * real
      with
        open Math
        fun to(Cart(x,y)) = Pole(sqrt(x*x + y*y), atan2(x,y))
        fun from(Pole(r,t)) = Cart(r*cos(t), r*sin(t))
      end
   end

  The implementation of a viewtype is kept abstract, and both of the above
  views can be used uniformly where appropriate:

    open Complex
    fun add(Cart(x1,y1), Cart(x2,y2)) = Cart(x1+x2, y1+y2)
    fun mul(Pole(r1,t1), Pole(r2,t2)) = Pole(r1*r2, t1+t2)

* View Replication (derived form).

  Instead of opening the structure, a view can also be pulled into scope (and
  thus enable unqualified use of its constructors) by a viewtype replication
  declaration, analogous to SML's datatype replication:

    viewtype cart = viewtype Complex.cart

  Apart from viewtype replication, the name of a view acts as a synonym for
  the underlying representation type -- except inside the view definition
  itself, where it used to denote the (otherwise anonymous) datatype
  representing the view.


Modules
-------

* Higher-order Functors.

  To support higher-order modules, structure expressions are generalised
  to include functor expressions, analogous to function expressions in the
  core:

    fct strid : sigexp => strexp

  Likewise, signature expressions may denote dependent functor signatures:

    fct strid : sigexp1 -> sigexp2

  As a derived form, non-dependent functor signatures (where strid does
  not occur in sigexp2) may be abbreviated as follows:

    sigexp1 -> sigexp2

  SML's functor declarations are degraded to a mere derived forms,
  analogous to function declarations with "fun" in the core language. They
  support curried functors:

    functor strid (strid1 : sigexp1) ... (stridN : sigexpN) = strexp

  For uniformity, and to avoid subtle syntax, the identifier classes for
  structures and functors are merged. As another derived form, SML/NJ
  compatible syntax is provided for functor descriptions in signatures:

    functor strid (strid1 : sigexp1) ... (stridN : sigexpN) : sigexp

  Functor application syntax is generalised to

    strexp1 strexp2

  as in the core. Parentheses are allowed anywhere in structure and
  signature expressions. The derived form allowing a parenthesised
  declaration as a functor argument is maintained and generalised by
  enabling

    ( strdec )

  to abbreviate a structure in all contexts. For symmetry,

    ( spec )

  can be used to abbreviate a signature. Particularly, it can
  abbreviate a functor argument:

    fct (spec) => strexp
    fct (spec) -> sigexp

  which is also allowed in the functor declaration and specification
  derived forms, generalising the similar derived form known from SML.

  The semantics is kept simple. All functors are fully generative.

* Nested Signatures.

  Signatures are allowed as structure members. This implies the presence
  of qualified signature identifiers A.B.S, and the addition of (manifest)
  signature specifications in signatures:

    signature S =
    sig
      type t
      signature T = sig val x : t end
    end
    structure M : S =
    struct
      type t = int
      signature T = sig val x : int end
    end
    structure N : M.T = struct val x = 79 end

  A signature definition matches a signature specification if and only if
  they denote equivalent signatures.

* Local Modules.

  Structure, functor and signature declarations are allowed in local scope:

    fun sortWithoutDups compare =
        let
          structure Set = MkSet(type t = string; val compare = compare)
        in
          Set.toList o foldr Set.insert Set.empty
        end

  Naturally, types defined in local scope may not leak.

  Furthermore, as a derived form, open declarations may contain arbitrary
  module expressions:

    fun sortWithoutDups compare =
        let
          open MkSet(type t = string; val compare = compare)
        in
          toList o foldr insert empty
        end

* First-class modules.

  Modules can be wrapped up as first-class values, by giving a module
  expression and an appropriate signature:

    val p = pack Int : INTEGER

  The type of such a value is

    val p : pack INTEGER

  To unwrap a package, another signature constraint is necessary, e.g.:

    fun four x =
        let
          structure I = unpack x : INTEGER
        in
          I.toString(I.fromString "4")
        end

  Or, using generalised "open":

    fun four x =
        let
          open unpack x : INTEGER
        in
          toString(fromString "4")
        end

  The type of four is inferred as "pack INTEGER -> string".

----------

Please see INSTALL.txt for brief instructions and the doc directory for
the full documentation. See LICENSE.txt for licensing information.

All HaMLet sources (c) 1999-2007 Andreas Rossberg.

http://www.ps.uni-sb.de/hamlet/#successor-ml
