Previous: Macros in concert with modules, Up: Module system


3.4 Static type system

Scheme48 supports a rudimentary static type system. It is intended mainly to catch some classes of type and arity mismatch errors early, at compile-time. By default, there is only extremely basic analysis, which is typically only good enough to catch arity errors and the really egregious type errors. The full reconstructor, which is still not very sophisticated, is enabled by specifying an optimizer pass that invokes the code usage analyzer. The only optimizer pass built-in to Scheme48, the automatic procedure integrator, named auto-integrate, does so.

The type reconstructor attempts to assign the most specific type it can to program terms, signalling warnings for terms that are certain to be invalid by Scheme's dynamic semantics. Since the reconstructor is not very sophisticated, it frequently gives up and assigns very general types to many terms. Note, however, that it is very lenient in that it only assigns more general types: it will never signal a warning because it could not reconstruct a very specific type. For example, the following program will produce no warnings:

     (define (foo x y) (if x (+ y 1) (car y)))

Calls to foo that are clearly invalid, such as (foo #t 'a), could cause the type analyzer to signal warnings, but it is not sophisticated enough to determine that foo's second argument must be either a number or a pair; it simply assigns a general value type (see below).

There are some tricky cases that depend on the order by which arguments are evaluated in a combination, because that order is not specified in Scheme. In these cases, the relevant types are narrowed to the most specific ones that could not possibly cause errors at run-time for any order. For example,

     (lambda (x) (+ (begin (set! x '(3)) 5) (car x)))

will be assigned the type (proc (:pair) :number), because, if the arguments are evaluated right-to-left, and x is not a pair, there will be a run-time type error.

The type reconstructor presumes that all code is potentially reachable, so it may signal warnings for code that the most trivial control flow analyzer could decide unreachable. For example, it would signal a warning for (if #t 3 (car 7)). Furthermore, it does not account for continuation throws; for example, though it is a perfectly valid Scheme program, the type analyzer might signal a warning for this code:

     (call-with-current-continuation
       (lambda (k) (0 (k))))

The type system is based on a type lattice. There are several maximum or `top' elements, such as :values, :syntax, and :structure; and one minimum or `bottom' element, :error. This description of the type system makes use of the following notations: E : T means that the term E has the type, or some compatible subtype of, T; and Ta <= Tb means that Ta is a compatible subtype of Tb — that is, any term whose static type is Ta is valid in any context that expects the type Tb —.

Note that the previous text has used the word `term,' not `expression,' because static types are assigned to not only Scheme expressions. For example, cond macro has the type :syntax. Structures in the configuration language also have static types: their interfaces. (Actually, they really have the type :structure, but this is a deficiency in the current implementation's design.) Types, in fact, have their own type: :type. Here are some examples of values, first-class or otherwise, and their types:

     cond : :syntax
     
     (values 1 'foo '(x . y))
       : (some-values :exact-integer :symbol :pair)
     
     :syntax : :type
     
     3 : :exact-integer
     
     (define-structure foo (export a b) ...)
     foo : (export a b)

One notable deficiency of the type system is the absence of any sort of parametric polymorphism.

— type constructor: join type ...
— type constructor: meet type ...

Join and meet construct the supremum and infimum elements in the type lattice of the given types. That is, for any two disjoint types Ta and Tb, let Tj be (join Ta Tb) and Tm be (meet Ta Tb):

For example, (join :pair :null) allows either pairs or nil, i.e. lists, and (meet :integer :exact) accepts only integers that are also exact.

(More complete definitions of supremum, infimum, and other elements of lattice theory, may be found elsewhere.)

— type: :error

This is the minimal, or `bottom,' element in the type lattice. It is the type of, for example, calls to error.

— type: :values
— type: :arguments

All Scheme expressions have the type :values. They may have more specific types as well, but all expressions' types are compatible subtypes of :values. :Values is a maximal element of the type lattice. :Arguments is synonymous with :values.

— type: :value

Scheme expressions that have a single result have the type :value, or some compatible subtype thereof; it is itself a compatible subtype of :values.

— type constructor: some-values type ...

Some-values is used to denote the types of expressions that have multiple results: if E1 ... En have the types T1 ... Tn, then the Scheme expression (values E1 ... En) has the type (some-values T1 ... Tn).

Some-values-constructed types are compatible subtypes of :values.

Some-values also accepts `optional' and `rest' types, similarly to Common Lisp's `optional' and `rest' formal parameters. The sequence of types may contain a &opt token, followed by which is any number of further types, which are considered to be optional. For example, make-vector's domain is (some-values :exact-integer &opt :value). There may also be a &rest token, which must follow the &opt token if there is one. Following the &rest token is one more type, which the rest of the sequents in a sequence after the required or optional sequents must satisfy. For example, map's domain is (some-values :procedure (join :pair :null) &rest (join :pair :null)): it accepts one procedure and at least one list (pair or null) argument.

— type constructor: procedure domain codomain
— type constructor: proc (arg-type ...) result-type

Procedure type constructors. Procedure types are always compatible subtypes of :value. Procedure is a simple constructor from a specific domain and codomain; domain and codomain must be compatible subtypes of :values. Proc is a more convenient constructor. It is equivalent to (procedure (some-values arg-type ...) result-type).

— type: :boolean
— type: :char
— type: :null
— type: :unspecific
— type: :pair
— type: :string
— type: :symbol
— type: :vector
— type: :procedure
— type: :input-port
— type: :output-port

Types that represent standard Scheme data. These are all compatible subtypes of :value. :Procedure is the general type for all procedures; see proc and procedure for procedure types with specific domains and codomains.

— type: :number
— type: :complex
— type: :real
— type: :rational
— type: :integer

Types of the Scheme numeric tower. :integer <= :rational <= :real <= :complex <= :number

— type: :exact
— type: :inexact
— type: :exact-integer
— type: :inexact-real

:Exact and :inexact are the types of exact and inexact numbers, respectively. They are typically met with one of the types in the numeric tower above; :exact-integer and :inexact-real are two conveniences for the most common meets.

— type: :other

:Other is for types that do not fall into any of the previous value categories. (:other <= :value) All new types introduced, for example by loophole, are compatible subtypes of :other.

— type constructor: variable type

This is the type of all assignable variables, where type <= :value. Assignment to variables whose types are value types, not assignable variable types, is invalid.

— type: :syntax
— type: :structure

:Syntax and :structure are two other maximal elements of the type lattice, along with :values. :Syntax is the type of macros or syntax transformers. :Structure is the general type of all structures.

3.4.1 Types in the configuration language

Scheme48's configuration language has several places in which to write types. However, due to the definitions of certain elements of the configuration language, notably the export syntax, the allowable type syntax is far more limited than the above. Only the following are provided:

— type: :values
— type: :value
— type: :arguments
— type: :syntax
— type: :structure

All of the built-in maximal elements of the type lattice are provided, as well as the simple compatible subtype :values, :value.

— type: :boolean
— type: :char
— type: :null
— type: :unspecific
— type: :pair
— type: :string
— type: :symbol
— type: :vector
— type: :procedure
— type: :input-port
— type: :output-port
— type: :number
— type: :complex
— type: :real
— type: :rational
— type: :integer
— type: :exact-integer

These are the only value types provided in the configuration language. Note the conspicuous absence of :exact, :inexact, and :inexact-real.

— type constructor: procedure domain codomain
— type constructor: proc (arg-type ...) result-type

These two are the only type constructors available. Note here the conspicuous absence of some-values, so procedure types that are constructed by procedure can accept only one argument (or use the overly general :values type) & return only one result (or, again, use :values for the codomain), and procedure types that are constructed by proc are similar in the result type.