I spent this summer implementing a class system for Typed Lua, as part of the Google Summer of Code 2016. To try out the new class sytem, clone my fork of the TypedLua repository, cd to the root directory of the cloned repository, and run the my modified version of the Typed Lua compiler via “tlc source_filename”.

I haven’t merged any of my changes yet, and still plan to clean up some of the code within the next few weeks. Below is a description of the features that I have added.

The Basics

The following syntax is used to declare a class:

class Vector2
  x : number
  y : number

  constructor new(x : number, y : number)
    self.x = x
    self.y = y
  end

  method DotProduct(other : Vector2) : number
    return self.x*other.x + self.y*other.y
  end

  method Offset(offset : Vector2)
    self.x = self.x + offset.x
    self.y = self.y + offset.y
  end
end

The Vector2 class first declares two data members x and y, both having Lua’s 64-bit floating-point type number.

Next, it declares a constructor named new. The body of the constructor must assign values to all data members that the class defines. Unlike languages such as Java, where all variables can be assigned to a nil value, variables in Typed Lua are generally non-nil, and so leaving data members with a the default value nil would not be safe.

Finally, two methods called DotProduct and Offset are defined. Notice that the x and y fields of the invoked object are accessed via the self variable. This class system does not yet have encapsulation, so we can access the x and y fields of other externally. Omitting a return type annotation on a method declaration sets the expected return type to an infinite tuple of nils, which is appropriate for methods which, like Offset, do not return values.

Now let’s try instantiating the class and calling some of its methods.

local v = Vector2.new(1,0)
local u = Vector2.new(0,1)
print(u:DotProduct(v))

A class declaration adds a class value of the same name into scope. A class value is a table which contains all of the class’s constructors. In this case, our class table is Vector2, which contains one constructor new. Running tlc on the concatenation of the above code blocks should produce a lua file which prints 0 when it is run.

Inheritance

We can inherit using the extends keyword. Inheritance creates a child class which reuses code from the parent class and has an instance type that is a subtype of the parent class’s instance type.

local debugVectorId = 0
class DebugVector2 extends Vector2
  const id : string

  constructor new(x : number, y : number)
    self.x = x
    self.y = y
    self.id = tostring(debugVectorId)
    debugVectorId = debugVectorId + 1
    print("DebugVector #" .. self.id .. " created with x = " .. tostring(x) .. " and y = " .. tostring(y))
  end

  method Offset(offset : Vector2)
    local strInitial = "(" .. tostring(self.x) .. ", " .. tostring(self.y) .. ")"
    self.x = self.x + offset.x
    self.y = self.y + offset.y
    local strFinal = "(" .. tostring(self.x) .. ", " .. tostring(self.y) .. ")"
    print("DebugVector #" .. self.id .. " offset from " .. strInitial .. " to " .. strFinal)
  end
end

v = DebugVector2.new(1,0)
u = DebugVector2.new(0,1)
print(v:DotProduct(u))
u:Offset(v)
print(v:DotProduct(u))

Vector2’s fields x and y, as well as the methods Offset and DotProduct, are automatically included into the DebugVector2 class. However, DebugVector2 overrides the Offset with its own implementation.

Concatenating the above code block and then running the result should give the following output.

DebugVector #0 created with x = 1 and y = 0
DebugVector #1 created with x = 0 and y = 1
0
DebugVector #1 offset from (0, 1) to (1, 1)
1

The super keyword

We can improve the above code by reusing the functionality of the parent class via the super reference.

constructor new(x : number, y : number)
  super.new(x, y)
  self.id = tostring(debugVectorId)
  debugVectorId = debugVectorId + 1
  print("DebugVector #" .. self.id .. " created with x = " .. tostring(x) .. " and y = " .. tostring(y))
end

method Offset(offset : Vector2)
  local strInitial = "(" .. tostring(self.x) .. ", " .. tostring(self.y) .. ")"
  super:Offset(offset)
  local strFinal = "(" .. tostring(self.x) .. ", " .. tostring(self.y) .. ")"
  print("DebugVector #" .. self.id .. " offset from " .. strInitial .. " to " .. strFinal)
end

A constructor may call exactly one superclass constructor on its first line using the syntax “super.constructorname(arguments)”. If this happens, then the child class constructor does not need to initialize inherited members.

A superclass method may be called from anywhere inside a child class method, using the syntax “super:methodname(arguments)”.

Interfaces

We may want to declare a type without any associated implementation that describes a set of operations which several classes share in common. Such a type is called an interface, and is declared using the following syntax.

interface Showable
  method ToString() : () => (string)
end

We can associate a class to an interface by using an implements clause:

class Vector2 implements Showable
...
  method ToString() : string
    return "(" .. self.x .. ", " .. self.y .. ")"
  end
...
end

Instances of class Vector2 can then be used in contexts where instances of the Showable interface are expected.

Typedefs

Typedefs are a lightweight alternative mechanism for defining types. Consider the following type, which represents a linked list of numbers.

typedef NumList = { "val" : number, "next" : NumList }?

Here we have defined a typedef called NumList, which expands to { val : number, next : NumList }?. Like classes and interfaces, typedefs may contain recursive references to themselves. This can be useful for defining data-description schemas and data structures of unbounded size. NumList, for example, describes finite number lists of any size; nil, { val = 1, next = nil }, and { val = 1, next = { val = 1, next = nil } } are each NumLists. They represent the empty list, [1], and [1,1] respectively.

There’s an important difference between classes and interfaces on the one hand, and typedefs on the other. Classes and interfaces are nominal types, whereas typedefs are structural ones. Subtyping relations between nominal types are declared explicitly by the programmer, whereas subtyping relations between structural types are deduced by the subtyping algorithm. For example, this implies that instances of the following to classes are not interchangeable:

class Foo1
  x : boolean
  constructor new(x : boolean)
    self.x = x
  end
end

class Foo2
  x : boolean
  constructor new(x : boolean)
    self.x = x
  end
end

In fact, passing an instance of Foo1 into a function that expects an instance of Foo2 will generate a type error:

local fooFunction(foo : Foo2) : boolean
  return foo.x
end

local f1 = Foo1.new(true)
fooFunction(f1)

The following code block, however, is considered well-typed by the Typed Lua compiler.

typedef NumList1 = { "val" : number, "next" : NumList1 }?
typedef NumList2 = { "val" : number, "next" : NumList2 }?

local function listFunction(l1 : NumList1)
  local l2 : NumList2 = l1
  if l2 then
    print(tostring(l2.val) .. "\n")
    listFunction(l2.next)
  else
    print("done.")
  end
end

Mutually recursive types

By default a type definition can only refer to previously defined types. But what if we want to have two types A and B such that A is defined in terms of B and B is defined in terms of A? One of these types must be defined first; how can it reference the other one?

To handle this, several mutually recursive type definitions can be chained together with the and keyword, as in the following code block.


typedef WeirdList1 = { "val" : boolean, "next" : WeirdList2 }?
and typedef WeirdList2 = { "val" : number, "next" : WeirdList1 }?

and can be used to join an arbitrary collection of typedefs, classes, and interfaces into a mutually recursive bundle.

Global typenames and aliases

When a type is declared, it is given both a global name and an alias. The global name is the name specified by the programmer, with the name of the module containing the definition prepended to it. The global name of a class Foo defined in a module Bar is Bar.Foo. Once a global name is generated, it lasts throughout the duration of the typechecking process. Referring to all types by their global names would be tedious, and so an alias mechanism is also provided. An alias is a short name that is translated into a global name. All type definitions generate an alias associating the name that the programmer typed with the generated global name. Our Foo class would generate an alias Foo that maps to the global name Bar.Foo.

Unlike global names, aliases are locally scoped. In particular, they are not preserved across multiple files. Types defined in external files must therefore be referred to by their global names.

Class value lookup

The expression class(typename) will evaluate to the class value of the class named typename. This is useful for instantiating classes defined in external modules.

A module Foo could define a class Test:

class Test
  constructor new()
  end
end

and then Test could be instantiated in a module Bar as follows:

require("Foo")
local t = class(Foo.Test).new()

Generics classes

Classes, interfaces, functions, and methods can all be parameterized by types, a feature often called generics. Among other things, this allows us to create collections that are parameterized by their element types.

class Stack<ElementType>
  contents : {ElementType}

  constructor new()
    self.contents = {}
  end

  method push(x : ElementType)
    self.contents[#self.contents + 1] = x
  end

  method pop() : ElementType
    local ret = self.contents[#self.contents]
    self.contents[#self.contents] = nil
    assert(ret)
    return ret
  end
end

NOTE: The above code snippet will not compile in my fork, because assertions have not been integrated with occurrence typing. This functionality has been implemented in a separate fork.

A class’s type parameter’s appear in a comma-separated list between angle brackets after the class name. The above Stack class contains a single type parameter called ElementType. We can instantiate the above class as follows:

local stack = Stack.new<number>()
stack.push(3)
print(stack.pop())

The type parameters of a class definition are wrapped around each of its constructors. We can instantiate these parameters by providing type arguments between angle brackets before the constructor’s normal arguments. In the above code, we instantiate ElementType with number. This produces an instance whose type matches the instance type described by Stack, but with each occurrence of ElementType replaced with number.

Generic functions and methods

Programmers may define their own type-parameterized functions. This can be useful for standard library functions.

local function init_array<ElementType>(x : ElementType, size : number) : {ElementType}
  local ret : {ElementType} = {}
  for i = 1, size do
    ret[i] = x
  end
  return ret
end

local r = init_array<string>("thumbs up", 2)
for i=1,#r do
  local s = r[i]
  if s then
    print(s)
  end
end

We can also provide type parameters for methods.

class Foo
  method id<T>(x : T) : T
    return x
  end
end

interface Bar
  method id<T> : (T) => (T)
end

Type parameter bounds

We can restrict type parameters so that they can only be instantiated with types which are a subtype of a specified type. This specified type is referred to as the bound of the type parameter that it is attached to. Suppose that we have the following interface:

interface Showable
  method Show : () => (string)
end

We can use type parameter bounds to implement a function which prints out a string representation of all elements of an array, assuming that the element type of the array implements the Showable interface.

local function show_array<ElementType <: Showable>(x : {ElementType})
  for i=1,#x do
    local e = x[i]
    if e then
      print(e:Show())
    end
  end
end

As the above example demonstrates, we attach a bound to a type parameter by writing “<: BoundType” after the type parameter’s occurrence in the parameter list.

Generic subtyping

We can inherit from and implement generic classes.

interface Container<ElementType>
  method GetElements : () => ({ElementType})
end

class Array<ElementType> implements Container<ElementType>
  contents : {ElementType}

  constructor new(contents : {ElementType})
    self.contents = contents
  end

  method GetElements() : {ElementType}
    return self.contents
  end
end

An implements clause generally consists of a nominal type applied to arbitrary type arguments which may include occurrences of the parameters of the class being defined. When class parameters occur in the implements clause, a family of subtyping relations is generated; the above implements clause implies that for all types ElementType, Array is a subtype of Container.

As the next example demonstrates, the parameters of a class do not have to correspond to the parameters of the nominal types that it implements. The below class definition implies that for all pairs of types KeyType and ValueType, Map<KeyType, ValueType> is a subtype of Container.

class Map<KeyType, ValueType> implements Container<ValueType>
  contents : { any : ValueType }

  constructor new()
    self.contents = {}
  end

  method set(key : KeyType, val : ValueType)
    self.contents[key] = val
  end

  method get(key : KeyType) : ValueType?
    return self.contents[key]
  end

  method GetElements() : {ValueType}
    local ret : {ValueType} = {}
    for k,v in pairs(self.contents) do
      ret[#ret + 1] = v
    end
    return ret
  end
end

The arguments to a nominal type constructor occurring in an implements clause can be arbitrary types, both structural and nominal. The next example demonstrates the use of a structural type, namely Point2, as an argument in an implements clause.

typedef Point2 = { x : number, y : number }

class Polygon implements Container<Point2>
  points : {Point2}

  constructor new(points : {Point2})
    self.points = points
  end

  method GetElements() : {Point2}
    return self.points
  end
end

Recursive inheritance and shapes

Suppose we want to bound a type parameter in a way that requires values of its argument to be comparable to each other. We might use the following interface, which describes types that are comparable to some type T.

interface Ordered<T>
  method lessThanEq : (T) => (boolean)
end

Then we would implement this interface with the folowing class.

class NumSet implements Ordered<NumSet>
  contents : { number : boolean? }

  method lessThanEq(other : NumSet) : boolean
    for k,v in pairs(self.contents) do
      if not other.contents[k] then
        return false
      end
    end
    return true
  end

  constructor new(contents : { number : boolean? })
    self.contents = contents
  end
end

However, the compiler generates an error from the above implements clause. The reason for this is that it creates a cycle in the type graph. The type graph is a graph in which nominal typenames are nodes. Every implements and extends clause
adds an edge from the class being defined (NumSet in this case), to every nominal typename occurring in the implements or extends clause (here this includes both Ordered and NumSet) labeled with the name of the outer nominal type of the implements or extends clause (Ordered in this case). To ensure that our subtyping algorithm terminates, we forbid any implements or extends clause which introduces cycles into this graph.

Still, there’s a need for recursive inheritance as described above. To deal with this, we include another kind of nominal type definition in addition to classes and interfaces: shapes. A shape definition is exactly like an interface definition, but the word interface is replaced with shape:

shape Ordered<T>
  method lessThanEq : (T) => (boolean)
end

When searching for cycles in our type graph, we ignore those edges that are labeled with shapes. After changing Ordered to a shape, the NumSet class should compile without producing any errors. Occurrences of shape types are restricted to the outer level of type bounds and implements clauses; for a somewhat technical reason, this ensures that our subtyping algorithm terminates even when our type graph has cycles which include shape edges.

To utilize recursively inheriting types, we use type bounds which refer to the variable being bounded.

local comparable<T <: Ordered<T>>(x : T, y : T) : boolean
  return (x:lessThanEq(y) or y:lessThanEq(x))
end

Variance annotations

To give nominal types greater subtyping flexibility, we allow the user to provide variance annotations for the type parameters of classes, interfaces, and shapes. Prefixing a type parameter with + designates that the parameter is covariant, whereas prefixing a parameter with - designates that it is contravariant.

interface Describable<+DescriptionType>
  method Describe : () => (DescriptionType)
end

The above interface indicates that its type parameter DescriptionType is covariant by prefixing it with a +. What this implies for subtyping is that if A and B are types and B is a subtype of A, the Ord<B> is a subtype of Ord<A>. Covariant means with change. A covariant parameter is one whose subtyping precision changes in the same direction as the type that contains it; Ord<B> is more precise than Ord<A> only when B is more precise than A.

interface Accumulator<-ValueType>
  method Accumulate : (ValueType) => ()
end

Here is an interface for classes which describe objects which can “absorb” values of some type ValueType. ValueType has been marked as contravariant by prefixing it with a -. Contravariance means against change. A contravariant type parameter is one whose subtyping precision changes in the opposite direction of the subtyping precision of the type that contains it; if B is a subtype of A then Accumulator<A> is a subtype of Accumulator<B>.

The positions of the occurrences of a type parameter in a type definition dictate which variance annotations it can have. Roughly, types variables occurring only in input positions are allowed to be contravariant, whereas type variables occurring only in output positions are allowed to be covariant.

DescriptionType is allowed to be covariant because it occurs as a method return type, and classifies values that instances of the class are outputting into an external context. Suppose DetailedDescription is a subtype of Description; then retrieving a DetailedDescription from the Describe method of Describable<Description> is perfectly acceptable, because we were expecting a Description, and values of type DetailedDescription can be used in any context where values of type Description are expected.

On the other hand, ValueType is allowed to be contravariant because it occurs as a method input type. Suppose OddInteger is a subtype of Integer. Then passing an OddInteger into the Accumulate method of an object of type Accumulator<Integer> is perfectly acceptable, and so we can use an object of type Accumulator<Integer> where an object of type Accumulator<OddInteger> is expected.

Additional Features

  • A class can implement multiple interfaces by using a comma-separated list for an implements clause.
class Foo extends Interface1, Interface2
  • A nominal subtyping edge may be added without introducing any new nominal type definitions using an implements statement.
...
a = 5
Foo implements Bar
a = a + 1
  • A class may be used as an interface.
class Foo ... end
class Bar implements Foo ... end

Failed subtyping queries now generate explanations of why the specified subtyping judgment does not hold. These can currently be a bit messy, which is something I hope to improve soon.