SIP-35 - Opaque types

Language

This proposal has been implemented with some changes in Scala 3.0.0.

Authors: Erik Osheim and Jorge Vicente Cantero

Supervisor and advisor: Sébastien Doeraene

History

Date          | Version
------------- | -------------
Sep 20th 2017 | Initial Draft

Introduction

This is a proposal to introduce syntax for type aliases that only exist at compile time and emulate wrapper types.

The goal is that operations on these wrapper types must not incur any extra overhead at runtime, while still providing type-safe usage at compile time.

Some use cases for opaque types are:

  • Implementing type members while retaining parametricity. Currently, concrete type definitions are treated as type aliases, i.e. they are expanded in-place.

  • New numeric classes, such as unsigned integers. There would no longer need to be a boxing overhead for such classes. This is similar to value types in .NET and newtype in Haskell. Many APIs currently use signed integers (to avoid overhead) but document that the values will be treated as unsigned.

  • Classes representing units of measure. Again, no boxing overhead would be incurred for these classes.

  • Classes representing different entities with the same underlying type, such as Id and Password being defined in terms of String.

We expand on all these points in our motivation section.
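To make the unsigned-integer use case concrete: the JDK already provides helpers for reinterpreting a signed Int as unsigned, which is exactly the convention such APIs document informally today (a minimal sketch; UnsignedDemo is an illustrative name, not part of this proposal):

```scala
object UnsignedDemo {
  // An API stores values in a signed Int but documents them as unsigned.
  // Reinterpreted as unsigned, the bit pattern of -1 is 2^32 - 1.
  def toUnsignedString(i: Int): String =
    java.lang.Integer.toUnsignedString(i)

  def toUnsignedLong(i: Int): Long =
    java.lang.Integer.toUnsignedLong(i)
}
```

An opaque type over Int would let a library expose only the unsigned view of such values, without boxing.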

For a definition of boxing and previous state-of-the-art, we recommend reading SIP-15.

Implementation note

Opaque types have been implemented in Scala 3.0.0 with some changes compared to this proposal.

Opaque types

Motivation

Authors often introduce type aliases to differentiate many values that share a very common type (e.g. String, Int, Double, Boolean, etc.). In some cases, these authors may believe that using type aliases such as Id and Password means that if they later mix these values up, the compiler will catch their error. However, since type aliases are replaced by their underlying type (e.g. String), these values are considered interchangeable (i.e. type aliases are not appropriate for differentiating various String values).
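A minimal sketch of the problem (the names are illustrative): because both aliases below expand to String, swapping the arguments compiles without complaint.

```scala
object AliasDemo {
  type Id = String
  type Password = String

  // Both parameters erase to String, so the compiler cannot tell them apart.
  def login(id: Id, password: Password): String = s"user=$id"

  val id: Id = "alice"
  val pw: Password = "hunter2"

  // Arguments swapped by mistake -- this still compiles.
  val oops: String = login(pw, id)
}
```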

One appropriate solution to the above problem is to create case classes which wrap String. This works, but incurs a runtime overhead (for every String in the previous system we also allocate a wrapper, or a “box”). In many cases this is fine but in some it is not.

Value classes, a Scala feature proposed in SIP-15, were introduced to the language to offer classes that could be inlined in some scenarios, thereby removing runtime overhead. These scenarios, while certainly common, do not cover the majority of scenarios that library authors have to deal with. In reality, experimentation shows that they are insufficient, and hence performance-sensitive code suffers (see Appendix A).

In the following example, we show the current limitations of value classes and motivate the need for compile-time wrapper types.

An example

Imagine we are working with floating-point data that are logarithmically-scaled (e.g. sensor data). We might have gigabytes of data which would be convenient to represent in terms of arrays of primitive doubles (rather than boxing each value).

We might choose to use value classes as follows:

package object valueclass {
  class Logarithm(val exponent: Double) extends AnyVal {
    def toDouble: Double = math.exp(exponent)
    def +(that: Logarithm): Logarithm = Logarithm(toDouble + that.toDouble)
    def *(that: Logarithm): Logarithm = new Logarithm(exponent + that.exponent)
  }

  object Logarithm {
    def apply(x: Double): Logarithm = new Logarithm(math.log(x))
  }
}

package object usesites {
  import valueclass._
  // 1e7 is approximately (e ** 16.11809565095832),
  // so x is encoded as 16.11809565095832.
  val x: Logarithm = Logarithm(1e7)
}

Users of this library can write Logarithm(...) to wrap Double values (and access the alternate implementations of + and *) without allocating Logarithm instances. This can be confirmed by inspecting the generated bytecode for Logarithm: the signature of apply is compiled to def apply(x: Double): Double, and val x: Logarithm = Logarithm(1e7) becomes val x: Double = Logarithm.apply(1e7).

Value classes can rewrite many instances of Logarithm in terms of Double, including local variables, method parameters, return types, and most fields. SIP-15 lays out the exact circumstances in which these rewrites occur, and extension methods ensure that boxing and unboxing happen transparently.
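As an aside, the arithmetic behind the log-domain encoding itself can be checked with plain Doubles: multiplying two encoded values corresponds to adding their exponents (a sketch, not part of the proposed API):

```scala
object LogEncodingDemo {
  // lift: a value x is encoded as log(x)
  def encode(x: Double): Double = math.log(x)
  // multiplication in the log domain is addition of exponents
  def times(l1: Double, l2: Double): Double = l1 + l2
  // unlift: decode back to the plain value
  def decode(l: Double): Double = math.exp(l)
}
```

For example, decode(times(encode(100.0), encode(10.0))) recovers 1000.0 up to floating-point error.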

Unfortunately, this transparent boxing is relatively easy to trigger. Some very common situations where Logarithm instances will be allocated at runtime include:

  • use in arrays (e.g. Array[Logarithm])
  • use in generic collections (e.g. List[Logarithm])
  • parameters or return values of anonymous functions
  • calling equals or hashCode on a Logarithm

Concretely, consider:

val x = Logarithm(1e5)
val xs = List(Logarithm(12345.0), Logarithm(67890.0)).map(_ + x)

When looking at the bytecode, the author will notice that the emitted code boxes Double values into allocated Logarithm values, stores those in the list, then produces a Function1[Logarithm, Logarithm] which unboxes each logarithm and allocates a new result. Users who expect value classes to be allocation-free (and equivalent to working with the underlying values) will be sorely disappointed in these cases.

Appendix A illustrates more cases where this unintended boxing/unboxing happens, hurting the performance of the code.

Boxing/unboxing of value classes happens anywhere in the program where the type signatures are generic and require the runtime to pass a java.lang.Object.

Given the frequency of these scenarios in real-world code, we would like to introduce the notion of compile-time wrapper types. Compile-time wrapper types do not depend on the runtime and its capabilities, and are always erased at runtime. The previous code snippet can be implemented using opaque types and will produce code that never boxes/unboxes.

In exchange, opaque types disallow the redefinition of equals and hashCode, which in our experience is often unnecessary.

Definition

Let’s continue with the example in our motivation section, and define Logarithm with an opaque type:

package object opaquetypes {
  opaque type Logarithm = Double
}

The opaque type definition is akin to that of a type alias, except that it is prefixed by the new opaque keyword.

Opaque types are usually accompanied by type companions. Type companions define the public API of an opaque type, and they are defined in the same way that class companions are. It is possible to define an opaque type without a type companion, but then the opaque type is useless.

Let’s define a type companion for our previous opaque type Logarithm:

package object opaquetypes {
  // The previous opaque type definition
  opaque type Logarithm = Double

  object Logarithm {
    // These are the ways to lift to the logarithm type
    def apply(d: Double): Logarithm = math.log(d)

    def safe(d: Double): Option[Logarithm] =
      if (d > 0.0) Some(math.log(d)) else None

    // This is the first way to unlift the logarithm type
    def exponent(l: Logarithm): Double = l

    // Extension methods define opaque types' public APIs
    implicit class LogarithmOps(val `this`: Logarithm) extends AnyVal {
      // This is the second way to unlift the logarithm type
      def toDouble: Double = math.exp(`this`)
      def +(that: Logarithm): Logarithm = Logarithm(math.exp(`this`) + math.exp(that))
      def *(that: Logarithm): Logarithm = Logarithm(`this` + that)
    }
  }
}

The above Logarithm type companion contains the following definitions:

  • Methods to lift the type from Double to Logarithm (i.e. apply and safe)
  • Methods to lower the type from Logarithm to Double (i.e. exponent)
  • Extension methods to unlift the type from Logarithm to Double (i.e. toDouble)
  • Extension methods to define more operations on the type (i.e. like + and *)

The key peculiarity of opaque types is that they behave like normal type aliases inside their type companion; that is, users can convert between the opaque type and its underlying type interchangeably, without explicit lift and unlift methods. We can say that opaque types are “transparent” inside their type companion.

However, the story changes for users of this API. Outside of their type companions, opaque types are not transparent and, therefore, the Scala compiler fails to compile code that pretends they are. A common example is:

val l: Logarithm = 1.0

which fails to compile with a type mismatch error:

<console>:11: error: type mismatch;
 found   : Double
 required: Logarithm
       val l: Logarithm = 1.0
                          ^

The same happens for val d: Double = l where l is an instance of Logarithm.

The idea, then, is to let library authors create wrapper types and their API in a concrete, isolated part of their code and force users to use this API to lift to and unlift from the opaque type.
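For comparison, a similar kind of opacity can be achieved in current Scala with an abstract type member hidden behind a module trait, at the cost of more boilerplate and, in some positions, boxing (a sketch; the names are illustrative):

```scala
object OpacityToday {
  sealed trait LogarithmModule {
    type Logarithm // abstract: opaque to clients of the module
    def apply(d: Double): Logarithm
    def toDouble(l: Logarithm): Double
  }

  // Inside the implementation, Logarithm is known to be Double.
  val Logarithm: LogarithmModule = new LogarithmModule {
    type Logarithm = Double
    def apply(d: Double): Logarithm = math.log(d)
    def toDouble(l: Logarithm): Double = math.exp(l)
  }
}
```

Clients can only go through apply and toDouble, much like with an opaque type companion; opaque types provide the same guarantee without the indirection.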

By design, downstream users can define more operations on these opaque types via their own extension methods, but they cannot create a new API to lift/unlift them, e.g. users need to use the library-provided toDouble, Logarithm.safe and Logarithm.apply.

The following code showcases legitimate uses of the Logarithm opaque type:

package object usesites {
  import opaquetypes._
  val l = Logarithm(1.0)
  val l2 = Logarithm(2.0)
  val l3 = l + l2
  val d = l3.toDouble
  val l4: Logarithm = (1.0).asInstanceOf[Logarithm]
}

While the following fails to typecheck:

package object usesites {
  import opaquetypes._
  val l: Logarithm = Logarithm(1.0)
  val d: Double = l // fails to typecheck
  val l2: Logarithm = 1.0 // fails to typecheck
}

For the sake of completeness, this is how you extend these opaque types with more operations:

package object usesites {
  // ...
  implicit class UserOps(val `this`: Logarithm) extends AnyVal {
    def **(that: Logarithm): Logarithm = Logarithm(`this` * that)
  }
}

Note that the rationale for this encoding is to allow users to convert between the opaque type and the underlying type in constant time. Users have to be able to tag complex type structures without having to reallocate, iterate over, or inspect them.

Formal definition

The Scala Language doesn’t have a concept of either opaque types or type companions. In the following section, we formalize our previous definitions and specify the required changes to the Scala Language Specification.

Opaque type definition

An opaque type follows the same structure as an alias type but it requires the use of a new opaque modifier.

LocalModifier     ::=  ‘abstract’
                    |  ‘final’
                    |  ‘sealed’
                    |  ‘implicit’
                    |  ‘lazy’
                    |  ‘opaque’

This new modifier is then used to qualify the type alias definition:

Def        ::= [‘opaque’] ‘type’ {nl} TypeDef
TypeDef    ::=  id [TypeParamClause] ‘=’ Type

Opaque modifiers are only valid for type definitions. Note that contrary to type alias definitions, opaque type definitions cannot be overridden.

Here’s a formal definition of opaque types:

An opaque type t = T defines t to be an alias name for the type T only in the scope of the opaque type companion t. The left hand side of an opaque type may have a type parameter clause, e.g. type t[tps] = T. The scope of a type parameter extends over the right hand side T and the type parameter clause tps itself.

As per this definition, opaque types modify the type equivalence relationship explained in the 3.5.1. Equivalence section of the Scala Language Specification. In particular, the next item qualifies the type equivalence for opaque types:

If t is defined by an opaque type definition opaque type t = T, then t is not equivalent to T unless t occurs in the template of the opaque type companion.

In the Implementation notes, we explain how this can be achieved in the implementation.

Opaque type companion

We define a type companion in a similar way to how companion classes are described in the Scala Language Specification in 5.5. Object Definitions:

Generally, the companion module of an opaque type is an object that has the same name as the opaque type and is defined in the same scope and compilation unit. Unlike companion classes, the companion object of an opaque type does not itself have a notion of a companion class.

These opaque type companions are also part of the implicit search scope of the opaque type t. Therefore, uses of extension methods defined in the opaque type companion do not require users to import the opaque type companion explicitly.

Technical discussions

In our current proposal, we make several trade-offs. Next, we defend these trade-offs and propose alternative ways to achieve the same goals (where any exist).

opaque as a modifier

opaque has been added as a modifier to a type def for simplicity. We believe that a new keyword for this feature is worthwhile.

Note that adding opaque as a modifier prevents its use as an identifier anywhere in users’ programs, which could break existing code. To ease migration, we could provide a Scalafix rule that rewrites any place where opaque is used as an identifier.

For those SIP Committee members wary of adding a new keyword to the language, we propose an alternative approach. Instead of defining opaque types with the opaque modifier, opaque types may also be defined by combining the existing new and type keywords, e.g. new type Logarithm = Double. This option would be akin to the syntax used in Haskell for wrapper types, e.g. newtype.

Type companions

This proposal only adds the notion of type companions for opaque types. After discussing with members of the SIP Committee and Scala compiler hackers, we believe that a notion of type companions extended to type aliases would not work because the typechecker aggressively dealiases**, and it is not possible to link to the type companion symbol once type aliases have been dealiased.

** Note that dealiasing means beta reducing a type alias.

Static extension methods

The extension methods synthesized for implicit value classes are not static. We have not measured whether the current encoding has an impact on performance, but if it does, we can consider changing the encoding of extension methods either in this proposal or in a future one.

More examples

Type tagging

This example focuses on applying type parameters to opaque types (one of which is a phantom type). The goal here is to be able to mark concrete types (S) with arbitrary tags (T) which can be used to distinguish them at compile-time.

package object tagging {

  import scala.reflect.ClassTag

  // Tagged[S, T] means that S is tagged with T
  opaque type Tagged[S, T] = S

  object Tagged {
    def tag[S, T](s: S): Tagged[S, T] = s
    def untag[S, T](st: Tagged[S, T]): S = st

    def tags[F[_], S, T](fs: F[S]): F[Tagged[S, T]] = fs
    def untags[F[_], S, T](fst: F[Tagged[S, T]]): F[S] = fst

    implicit def taggedClassTag[S, T](implicit ct: ClassTag[S]): ClassTag[Tagged[S, T]] =
      ct
  }

  type @@[S, T] = Tagged[S, T]

  implicit class UntagOps[S, T](val st: S @@ T) extends AnyVal {
    def untag: S = Tagged.untag(st)
  }

  implicit class UntagsOps[F[_], S, T](val fs: F[S @@ T]) extends AnyVal {
    def untags: F[S] = Tagged.untags(fs)
  }

  implicit class TagOps[S](val s: S) extends AnyVal {
    def tag[T]: S @@ T = Tagged.tag(s)
  }

  implicit class TagsOps[F[_], S](val fs: F[S]) extends AnyVal {
    def tags[T]: F[S @@ T] = Tagged.tags(fs)
  }

  trait Meter
  trait Foot
  trait Fathom

  val x: Double @@ Meter = (1e7).tag[Meter]
  val y: Double @@ Foot = (123.0).tag[Foot]
  val xs: Array[Double @@ Meter] = Array(1.0, 2.0, 3.0).tags[Meter]

  val o: Ordering[Double] = implicitly
  val om: Ordering[Double @@ Meter] = o.tags[Meter]
  om.compare(x, x) // 0
  om.compare(x, y) // does not compile
  xs.min(om) // 1.0
  xs.min(o) // does not compile

  // uses ClassTag[Double] via 'Tagged.taggedClassTag'.
  val ys = new Array[Double @@ Foot](20)
}

There are a few interesting things to note here.

First, as above we expect that tagging and untagging will not cause any boxing of these values at runtime, even though Tagged is generic. We also expect that the Array[Double @@ Meter] will be represented by Array[Double] at runtime.

Second, notice that Ordering[Double] and ClassTag[Double] are not automatically in scope for Tagged[Double, Meter]. Opaque types currently need to “re-export” (or otherwise provide) their own implicit values.

It would be possible to automatically provide ClassTag instances, using an implicit val in the case of opaque types wrapping concrete types (e.g. opaque type X = Double) and an implicit def in cases such as the one above.

Fix-point type

Here’s an interesting little example which defines the recursive opaque type Fix:

package object fixed {
  opaque type Fix[F[_]] = F[Fix[F]]

  object Fix {
    def fix[F[_]](unfixed: F[Fix[F]]): Fix[F] = unfixed
    def unfix[F[_]](fixed: Fix[F]): F[Fix[F]] = fixed
  }

  sealed abstract class TreeU[R]

  type Tree = Fix[TreeU]

  object TreeU {
    def cata[A](t: Tree)(f: TreeU[A] => A): A =
      f(Fix.unfix(t) match {
        case Branch(l, r) => Branch(cata(l)(f), cata(r)(f))
        case Leaf(s) => Leaf(s)
      })

    case class Branch[R](left: R, right: R) extends TreeU[R]
    case class Leaf[R](label: String) extends TreeU[R]
  }

  import TreeU.{Branch, Leaf}

  def leaf(s: String): Tree = Fix.fix(Leaf(s))
  def branch(lhs: Tree, rhs: Tree): Tree = Fix.fix(Branch(lhs, rhs))

  val tree: Tree = branch(branch(leaf("a"), leaf("b")), leaf("c"))

  println(tree)
  // Branch(Branch(Leaf(a), Leaf(b)), Leaf(c))
}

This example is intended to show that opaque types (unlike type aliases) have an independent existence, even within the companion. This means that, unlike a type alias, Fix[F] should not result in an infinite expansion in the above code.

The Fix type is useful for implementing recursion schemes, or just for creating recursive structures which are parameterized on their recursive type.

Explicit nullable types

There have previously been proposals to provide a “zero-cost Option” using value classes. Opaque types make this very straightforward by bounding the underlying type (A) with AnyRef.

package object nullable {
  opaque type Nullable[A <: AnyRef] = A

  object Nullable {
    def apply[A <: AnyRef](a: A): Nullable[A] = a

    implicit class NullableOps[A <: AnyRef](na: Nullable[A]) {
      def exists(p: A => Boolean): Boolean =
        na != null && p(na)
      def filter(p: A => Boolean): Nullable[A] =
        if (na != null && p(na)) na else null
      def flatMap[B <: AnyRef](f: A => Nullable[B]): Nullable[B] =
        if (na == null) null else f(na)
      def forall(p: A => Boolean): Boolean =
        na == null || p(na)
      def getOrElse(a: => A): A =
        if (na == null) a else na
      def map[B <: AnyRef](f: A => B): Nullable[B] =
        if (na == null) null else f(na)
      def orElse(na2: => Nullable[A]): Nullable[A] =
        if (na == null) na2 else na
      def toOption: Option[A] =
        Option(na)
    }
  }
}

This example provides most of the benefits of using Option at API boundaries with libraries that use null (such as many Java libraries). Unlike a value class, we can guarantee that there will never be a wrapper around the raw values (or raw nulls).

Notice that Nullable[Nullable[B]] is not a valid type, because Nullable[B] is not known to be <: AnyRef. This is a key difference between a type like Option (which is parametric and can easily wrap itself) and a type like Nullable (which only has one null value to use).
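Since opaque types do not exist in current Scala, the combinators above can be exercised with a plain (transparent) alias standing in for Nullable; the runtime behavior is identical because the opaque type erases to A anyway (NullableDemo is an illustrative sketch, not the proposed API):

```scala
object NullableDemo {
  // Transparent stand-in for `opaque type Nullable[A <: AnyRef] = A`.
  type Nullable[A <: AnyRef] = A

  def getOrElse[A <: AnyRef](na: Nullable[A], a: => A): A =
    if (na == null) a else na

  def exists[A <: AnyRef](na: Nullable[A], p: A => Boolean): Boolean =
    na != null && p(na)
}
```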

Custom instances

Currently, many libraries (including Scalding and Algebird) define wrapper types which change the kind of aggregation used for that type. This is useful in frameworks like MapReduce, Spark, Flink, Storm, etc. where users describe computation in terms of mapping and aggregation.

The following example shows how opaque types can make using these wrappers a bit more elegant:

package object groups {
  trait Semigroup[A] {
    def combine(x: A, y: A): A
  }

  object Semigroup {
    def instance[A](f: (A, A) => A): Semigroup[A] =
      new Semigroup[A] {
        def combine(x: A, y: A): A = f(x, y)
      }
  }

  type Id[A] = A

  trait Wrapping[F[_]] {
    def wraps[G[_], A](ga: G[A]): G[F[A]]
    def unwraps[G[_], A](ga: G[F[A]]): G[A]
  }

  abstract class Wrapper[F[_]] { self =>
    def wraps[G[_], A](ga: G[A]): G[F[A]]
    def unwraps[G[_], A](gfa: G[F[A]]): G[A]

    final def apply[A](a: A): F[A] = wraps[Id, A](a)

    implicit object WrapperWrapping extends Wrapping[F] {
      def wraps[G[_], A](ga: G[A]): G[F[A]] = self.wraps(ga)
      def unwraps[G[_], A](ga: G[F[A]]): G[A] = self.unwraps(ga)
    }
  }

  opaque type First[A] = A
  object First extends Wrapper[First] {
    def wraps[G[_], A](ga: G[A]): G[First[A]] = ga
    def unwraps[G[_], A](gfa: G[First[A]]): G[A] = gfa
    implicit def firstSemigroup[A]: Semigroup[First[A]] =
      Semigroup.instance((x, y) => x)
  }

  opaque type Last[A] = A
  object Last extends Wrapper[Last] {
    def wraps[G[_], A](ga: G[A]): G[Last[A]] = ga
    def unwraps[G[_], A](gfa: G[Last[A]]): G[A] = gfa
    implicit def lastSemigroup[A]: Semigroup[Last[A]] =
      Semigroup.instance((x, y) => y)
  }

  opaque type Min[A] = A
  object Min extends Wrapper[Min] {
    def wraps[G[_], A](ga: G[A]): G[Min[A]] = ga
    def unwraps[G[_], A](gfa: G[Min[A]]): G[A] = gfa
    implicit def minSemigroup[A](implicit o: Ordering[A]): Semigroup[Min[A]] =
      Semigroup.instance(o.min)
  }

  opaque type Max[A] = A
  object Max extends Wrapper[Max] {
    def wraps[G[_], A](ga: G[A]): G[Max[A]] = ga
    def unwraps[G[_], A](gfa: G[Max[A]]): G[A] = gfa
    implicit def maxSemigroup[A](implicit o: Ordering[A]): Semigroup[Max[A]] =
      Semigroup.instance(o.max)
  }

  opaque type Plus[A] = A
  object Plus extends Wrapper[Plus] {
    def wraps[G[_], A](ga: G[A]): G[Plus[A]] = ga
    def unwraps[G[_], A](gfa: G[Plus[A]]): G[A] = gfa
    implicit def plusSemigroup[A](implicit n: Numeric[A]): Semigroup[Plus[A]] =
      Semigroup.instance(n.plus)
  }

  opaque type Times[A] = A
  object Times extends Wrapper[Times] {
    def wraps[G[_], A](ga: G[A]): G[Times[A]] = ga
    def unwraps[G[_], A](gfa: G[Times[A]]): G[A] = gfa
    implicit def timesSemigroup[A](implicit n: Numeric[A]): Semigroup[Times[A]] =
      Semigroup.instance(n.times)
  }

  opaque type Reversed[A] = A
  object Reversed extends Wrapper[Reversed] {
    def wraps[G[_], A](ga: G[A]): G[Reversed[A]] = ga
    def unwraps[G[_], A](gfa: G[Reversed[A]]): G[A] = gfa
    implicit def reversedOrdering[A](implicit o: Ordering[A]): Ordering[Reversed[A]] =
      o.reverse
  }

  opaque type Unordered[A] = A
  object Unordered extends Wrapper[Unordered] {
    def wraps[G[_], A](ga: G[A]): G[Unordered[A]] = ga
    def unwraps[G[_], A](gfa: G[Unordered[A]]): G[A] = gfa
    implicit def unorderedOrdering[A]: Ordering[Unordered[A]] =
      Ordering.by(_ => ())
  }
}

The example demonstrates using an abstract class (Wrapper) to share code between opaque type companion objects. Like the tagging example, we can use two methods (wraps and unwraps) to wrap and unwrap A types, even if nested in an arbitrary context (G[_]). These methods cannot be implemented in Wrapper itself, because each opaque type companion is the only scope in which its opaque type is transparent.

Similarly to the tagging example, these types are zero-cost wrappers which can be used to tag how to aggregate the underlying type (for example Int).

Probability interval

Here’s an example that demonstrates how opaque types can limit the underlying type’s range of values in a way that minimizes the required error-checking:

package object prob {
  opaque type Probability = Double

  object Probability {
    def apply(n: Double): Option[Probability] =
      if (0.0 <= n && n <= 1.0) Some(n) else None

    def unsafe(p: Double): Probability = {
      require(0.0 <= p && p <= 1.0, s"probabilities lie in [0, 1] (got $p)")
      p
    }

    def asDouble(p: Probability): Double = p

    val Never: Probability = 0.0
    val CoinToss: Probability = 0.5
    val Certain: Probability = 1.0

    implicit val ordering: Ordering[Probability] =
      implicitly[Ordering[Double]]

    implicit class ProbabilityOps(val p1: Probability) extends AnyVal {
      def unary_~ : Probability = Certain - p1
      def &(p2: Probability): Probability = p1 * p2
      def |(p2: Probability): Probability = p1 + p2 - (p1 * p2)

      def isImpossible: Boolean = p1 == Never
      def isCertain: Boolean = p1 == Certain

      import scala.util.Random

      def sample(r: Random = Random): Boolean = r.nextDouble <= p1
      def toDouble: Double = p1
    }

    val caughtTrain = Probability.unsafe(0.3)
    val missedTrain = ~caughtTrain
    val caughtCab = Probability.CoinToss
    val arrived = caughtTrain | (missedTrain & caughtCab)

    println((1 to 5).map(_ => arrived.sample()).toList)
    // List(true, true, false, true, false)
  }
}

Outside of the Probability companion, we can be sure that the underlying Double values fall in the interval [0, 1], which means we don’t need to include error-checking code when working with Probability values. We can be sure that adding this kind of compile-time safety to a program doesn’t add any additional cost (except for the error-checking that we explicitly want).

This example is somewhat similar to Logarithm above. Other properties we might want to verify at compile-time: NonNegative, Positive, Finite, Unsigned and so on.
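Because Probability erases to Double, the combinators above can be sanity-checked directly on plain Doubles; for instance, the arrived computation from the snippet works out to 0.3 + 0.35 - 0.105 = 0.545 (a sketch, not the proposed API):

```scala
object ProbabilityDemo {
  // plain-Double versions of unary_~, & and | from ProbabilityOps
  def not(p: Double): Double = 1.0 - p
  def and(p1: Double, p2: Double): Double = p1 * p2
  def or(p1: Double, p2: Double): Double = p1 + p2 - p1 * p2

  val caughtTrain = 0.3
  val missedTrain = not(caughtTrain)
  val caughtCab = 0.5
  val arrived = or(caughtTrain, and(missedTrain, caughtCab))
}
```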

Immutable (i.e. write-once) arrays

Often, performance-sensitive code will use arrays via the following pattern:

  1. Allocate an array of a fixed, known size.
  2. Initialize the array via code which mutates it.
  3. Return the array, which is now treated as immutable.

In this pattern, the vast majority of time is spent in the third step, where the array’s compactness and speed of iteration/access provide major wins over other data types.

This pattern is currently only enforced by convention. However, opaque types create an opportunity to provide more safety without incurring any overhead:

package object ia {

  import java.util.Arrays

  opaque type IArray[A] = Array[A]

  object IArray {
    @inline final def initialize[A](body: => Array[A]): IArray[A] = body

    @inline final def size[A](ia: IArray[A]): Int = ia.length
    @inline final def get[A](ia: IArray[A], i: Int): A = ia(i)

    // return a sorted copy of the array
    def sorted[A: Ordering](ia: IArray[A]): IArray[A] = {
      val arr = ia.clone
      scala.util.Sorting.quickSort(arr)
      arr
    }

    // use a standard java method to search a sorted IArray.
    // (note that this doesn't mutate the array).
    def binarySearch(ia: IArray[Long], elem: Long): Int =
      Arrays.binarySearch(ia, elem)
  }

  // same as IArray.binarySearch but implemented by-hand.
  //
  // given a sorted IArray, returns index of `elem`,
  // or a negative value if not found.
  def binaryIndexOf(ia: IArray[Long], elem: Long): Int = {
    var lower: Int = 0
    var upper: Int = IArray.size(ia) - 1
    while (lower <= upper) {
      val middle = (lower + upper) >>> 1
      val n = IArray.get(ia, middle)

      if (n == elem) return middle
      else if (n < elem) lower = middle + 1
      else upper = middle - 1
    }
    -lower - 1
  }
}

This example is a bit different from others: there’s no attempt to enrich the IArray type with syntactic conveniences. Rather, the goal is to show that traditional, “low-level” code with Array, Int, etc. can be written with opaque types without sacrificing any performance.

Our other examples enrich existing data types with new functionality. This example serves to constrain the operations used with a type (but without introducing any overhead/indirection, which a traditional wrapper would).
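The search logic can be verified independently of the opaque encoding by running a standalone version of binaryIndexOf over a plain sorted Array[Long]:

```scala
object BinarySearchDemo {
  // Returns the index of `elem`, or -(insertionPoint) - 1 if absent,
  // matching java.util.Arrays.binarySearch semantics.
  def binaryIndexOf(ia: Array[Long], elem: Long): Int = {
    var lower = 0
    var upper = ia.length - 1
    while (lower <= upper) {
      val middle = (lower + upper) >>> 1
      val n = ia(middle)
      if (n == elem) return middle
      else if (n < elem) lower = middle + 1
      else upper = middle - 1
    }
    -lower - 1
  }
}
```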

Differences with value classes

Most of the above examples can also be implemented using value classes. This section will highlight some differences between these hypothetical encodings.

Used as a type parameter

In many cases an author would introduce opaque types or value classes to add extra type safety to a “stringly-typed” API, by replacing instances of the String type with a more specific type.

For example:

package object pkg {

  import Character.{isAlphabetic, isDigit}

  class Alphabetic private[pkg] (val value: String) extends AnyVal

  object Alphabetic {
    def fromString(s: String): Option[Alphabetic] =
      if (s.forall(isAlphabetic(_))) Some(new Alphabetic(s))
      else None
  }

  opaque type Digits = String

  object Digits {
    def fromString(s: String): Option[Digits] =
      if (s.forall(isDigit(_))) Some(s)
      else None

    def asString(d: Digits): String = d
  }
}

The two definitions are broadly comparable. However, when changing String to Alphabetic in code, the following types would involve boxing:

  • Array[Alphabetic]
  • Option[Alphabetic] (e.g. the result of Alphabetic.fromString)
  • Vector[Alphabetic]
  • Alphabetic => Boolean
  • Map[Alphabetic, Int]
  • Ordering[Alphabetic]
  • (Alphabetic, String)
  • Monoid[Alphabetic]

In many cases users won’t mind the fact that this code will box, but there will certainly be an impact on the bytecode produced (and possibly the runtime performance).

By contrast, replacing String with Digits is guaranteed to have no impact (all occurrences of Digits are guaranteed to be erased to String). Aside from the ergonomics of calling the fromString and asString methods, there’s no runtime impact versus using the underlying type.

One wrinkle to the above is that built-in primitive types will naturally box in some situations (but not others). For example List[Double] will be represented as a List[java.lang.Double], (Double, Double) will be represented as a scala.Tuple2$mcDD$sp, and so on. In these cases, an opaque type will exhibit the same behavior.
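This built-in boxing is observable at runtime: although List[Double] is written with the primitive type, its elements are stored as java.lang.Double (a small sketch):

```scala
object PrimitiveBoxingDemo {
  val xs: List[Double] = List(1.0, 2.0)

  // Upcasting to Any reveals the runtime class of the boxed element.
  def elementClassName: String = (xs.head: Any).getClass.getName
}
```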

Default visibility

By default, a value class’ constructor and accessor are public. It is possible to restrict access, using a private constructor and private accessor, but this makes the comparison between opaque types and value classes less attractive:

package object pkg {
  opaque type XCoord = Int

  case class YCoord private[pkg] (private[pkg] val n: Int) extends AnyVal

  // in both cases, we'd need public factory constructors
  // to allow users to produce values of these types.
}

Opaque types’ default behavior is more appropriate for information-hiding when defining wrapper types.

LUB differences

Value classes extend AnyVal by virtue of the syntax used to define them. One reason this is necessary is that value classes cannot be null (otherwise this could create ambiguities between their boxed and unboxed representations when wrapping AnyRef values).

By contrast, when seen from the “outside” opaque types extend Any. Their bounds are the same as those of a type parameter or type member without explicit bounds, i.e. A <: Any >: Nothing.

This is not a major difference (for example, under -Xlint inferring either type will generate a warning), but it does illustrate that an opaque type stands in for an unknown type (i.e. anything), whereas a value class introduces its own semantics, which remain in the type system even if we hope never to see the instances:

class Letters(val toString: String) extends AnyVal
class Digits(val toInt: Int) extends AnyVal

// inferred to List[AnyVal]
val ys = List(new Letters("abc"), new Digits(123))

// inferred to List[String]
val xs = List("abc", "123")

Through covariance, List[String] <: List[AnyRef] (and List[Any]), but it is not a List[AnyVal].

Size of artifacts produced

Since value classes do have a runtime representation, they increase the size of the runtime artifacts produced (whether a JAR file, a JavaScript file, or something else). Their methods are also compiled to multiple representations (i.e. they support both the boxed and unboxed forms via extension methods). Again, this comes at a cost.

By contrast, opaque types have no inherent runtime footprint. The opaque type’s companion is present at runtime, but it usually contains validation and data transformation code which would have been required even if the author had just stuck to the underlying type, and doesn’t add any extra extension methods.

Implementation notes

To implement opaque types, we need to modify three compiler phases: parser, namer and typer. At the time of this writing, it is unclear if later phases like erasure must be changed as well, but we think this should not be necessary.

There are several key ideas in the current, work-in-progress implementation:

  • To meet the type equivalence relationship for opaque types, we synthesize two implicit conversions inside the opaque type companion, if they do not already exist. Given opaque type t = T, one conversion from t to T and another from T to t are synthesized. The bodies of these methods use casts such as t.asInstanceOf[T] **.

  • Phases after typer always dealias opaque types. This way, erasure and codegen can unwrap opaque types out of the box and, at the bytecode level, their underlying representation is used instead.

All these ideas are open to refinements by the SIP Committee.

On removing asInstanceOf at compile-time

** Note that these asInstanceOf casts can be removed at compile-time, but there is no precedent for doing so in the Scalac compiler. However, it is not clear whether leaving these casts in will have an impact on performance – the underlying virtual machine may remove the operation via class hierarchy analysis, given that the cast is from Double to Double. Despite the lack of precedent, the authors of this proposal are inclined to remove these checks at compile-time. This will also require a modification to the spec.

We’re also assuming that implicit enrichment via value classes is a sufficient way to provide zero-cost extension methods. We’ll need to benchmark opaque types combined with value class enrichment to ensure that there aren’t hidden performance problems.

Cross-platform

We believe that by implementing opaque types early in the pipeline, Scala.js and Scala Native can compile them out-of-the-box. Thus, we do not expect opaque types to have problems for different backends, since erasure will always see the dealiased types.

Conclusion

We believe that opaque types fit in the language nicely. The combination of type aliases and value classes (for the zero runtime overhead of extension methods) results in compile-time wrapper types that address the performance issues of value classes.

As a summary, opaque types are:

  • A subset of type aliases that are transparent only within the type companion.
  • Extensible (users can define their own extension methods).
  • Fit for performance-sensitive code (no boxing/unboxing by design).
  • A lightweight way to add extra type-safety to strings, numbers, etc.

References

  1. (Appendix A) The High Cost of AnyVal classes