Typing in Scheme

I’ve long complained about Scheme’s typing system: it uses the bizarre mix of monomorphism (implying we, as programmers, know (or assume we know) the types of everything when we’re coding, and therefore at compile time too) and dynamic typing (meaning the implementation won’t actually help you check that your assumptions about types are correct). But I always thought of it as a natural consequence of the Scheme design philosophy: ‘Programming languages should be designed not by piling feature on top of feature, but by removing the weaknesses and restrictions that make additional features appear necessary.’ (The opening words of recent RnRS standards.)

Scheme’s type system is barely even a primitive part of the language — type introspection is done by using type predicates, one per type, and there’s nothing like Common Lisp’s type-of (Python type(), Ruby Object#class, etc) that returns anything that abstractly represents a type itself. Types, in other words, are not first-class objects, nor even second-class objects — they’re not objects at all, but properties of objects which can be enquired about. Scheme even has the notion (unique among programming languages, to my knowledge) of ‘disjoint types’ — which basically means a set of predicates of which only one will answer ‘true’ when asked if it corresponds to the type of a given object. (There are also non-disjoint types, as in the numeric tower.)

You can thus do ad-hoc polymorphism when you’re writing or designing a procedure (like the procedures in SRFI 13 which take a predicate matching a single character — but, as part of their design, you can also pass a character or a set of characters and it works as shorthand for a predicate which either only matches that character, or matches any character in that set of characters, respectively). That is, as long as you think about what types you want a procedure to expect when you’re planning it, you can make it accept multiple types using the type predicate procedures — but even this kind of polymorphism is historically rare in Scheme. (R7RS is introducing it for a small number of types with related operations, like hash tables and association lists.) Monomorphism is the rule: hence we don’t have one polymorphic ref but rather monomorphic list-ref, vector-ref, string-ref procedures, and not one polymorphic map, but a map that works on lists, plus vector-map, string-map, etc.

Parametric polymorphism, where I could later define, say, an inverse character set type and pass that to SRFI 13 procedures, doesn’t exist at all in standard Scheme, and the ways of doing it (like monkey patching procedures) are technically possible but generally frowned on.

Static type annotations and polymorphic procedure dispatch are both language features which can’t be expressed simply enough either in the S-expression notation or the idea of a procedure which Scheme has. Common Lisp generic functions, although a well-thought out and arguably successful design, are also complicated and fundamentally opaque.¹ Consider also that RnRS worked, until R5RS, by unanimity, and only features that could be unanimously agreed upon made it into the language. There are about ten different ways of doing both of these, and you’d never get a whole committee to agree on any one of them.

But if Scheme were willing to keep pervasive monomorphism — that is, to hold on to its separate list-ref, vector-ref etc. procedures — I think that could be enforced through a 100% inferring type checker, with no annotations at all. (Of course, the moment you do this, you don’t have RnRS Scheme any more, you have a new programming language in the Scheme tradition — so there’s no chance of this coming to R7RS, for example, unless maybe as an optional type checker like Python’s MyPy.)

I suspect part of the reason why e.g. Haskell and Python need type annotations to work sensibly is because they attempt to do both parametric polymorphism and static typing. If your procedures are monomorphic anyway, it presumably gets a lot easier to infer types correctly. The only thing that makes me concerned that this might not work very well are sequence types, where it might not be obvious if the type system has inferred that a list is e.g. a list of symbols vs a list of general datums. Association lists might also be rather complex.

It would probably be necessary to use type annotations in record type definitions, but you could also design a Scheme-y way of making that work. It would also be nice to make all values in the language immutable, so something like:

(define-value-type <ice-cream>
  make-ice-cream ice-cream?
  (ice-cream-flavour symbol?)
  (ice-cream-extras (lambda (x) (all symbol? x)))
  (ice-cream-price number?))

(make-ice-cream 'vanilla
                '(chocolate-sprinkles syrup)
                4.50) ;; works

(make-ice-cream "vanilla" 0 'free) ;; signals an error because the types aren’t right

make-ice-cream can be inferred to return an <ice-cream> value; ice-cream-flavour a symbol, ice-cream-extras a list of symbols, and ice-cream-price a number. A utility procedure (or more likely, syntax) for creating new values which inherit some, but not all, of the slot data of a previous value would also be handy if everything’s immutable:

(define old-ice-cream (make-ice-cream 'strawberry '() 4.00)

(make-from old-ice-cream
           (ice-cream-extras '(rainbow-sprinkles))
           (ice-cream-price 4.50)) ;; a strawberry ice cream with rainbow sprinkles, costing 50 cents extra

When types can be inferred, checked, and also defined by userland procedures, what you get is emergent typing, or dependent typing. This type system would by definition be undecidable, as any Turing-complete user code can be used in the definition of types. But there are so many undecidable type systems out there that this doesn’t really seem to matter in practice.

See also Ado (which is based on a more typical Hindley-Milner type algebra and is thus (hopefully) decidable and sound etc.).

Footnotes:

I should mention the one effort at polymorphic dispatch which I think managed to be more-or-less Scheme-y: T’s operations system, but that achieved simplicity at the cost of losing the ability to dispatch over the type of anything but the first argument; also, you had to decide in advance that your procedure was going to be a polymorphic operation, and polymorphism couldn’t be bolted on to procedures later. Not being able to dispatch over anything but the first argument means you couldn’t use it to implement a polymorphic map, for example, since map’s first argument is the procedure, not the sequence to be mapped over; probably for this reason, T’s standard library remained mostly monomorphic, with polymorphism available for user code.

John Cowan also points to Chicken’s fast-generic module as an example of a Scheme implementation of polymorphic procedures which is Scheme-y.