Type-safety of the Python 3.9+ Annotated type
Python has made a lot of progress with gradual typing, a flexible type system, and type-checkers that scale well to very large codebases.
One nice feature that was introduced in Python 3.9 allows developers to
add annotations as additional metadata about a type. The syntax is
covered in the documentation of the typing
module, briefly:
name: Annotated[Type, ...]
Specifies an annotated variable name of type Type, with some additional information ... that is available to both a type-checker and to libraries at runtime.
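The runtime half of that is easy to see with typing.get_type_hints and the __metadata__ attribute. A minimal sketch, where the Label class and the Config example are purely illustrative:

from dataclasses import dataclass
from typing import Annotated, get_type_hints

@dataclass
class Label:
    text: str

class Config:
    retries: Annotated[int, Label("how many times to retry")]

# include_extras=True keeps the Annotated wrapper instead of stripping it.
hints = get_type_hints(Config, include_extras=True)
print(hints["retries"])               # typing.Annotated[int, Label(text='how many times to retry')]
print(hints["retries"].__metadata__)  # (Label(text='how many times to retry'),)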
There are a lot of Python frameworks that use descriptors and metaclasses to provide domain-specific languages for defining things like object-relational models and complex serialization. That is a lot of complexity, and Annotated could really help to simplify things and push adoption of Python’s type annotations.
But I don’t think we’ll see broad adoption into type-checked Python codebases until we can make it type-safe.
The problem with Annotated
One of the example use cases for Annotated
from the Python
documentation is:
Annotated[int, ValueRange(3, 10), ctype("char")]
I’ll focus on ValueRange, which in this case is added for some notional “range checking” support; that checking could happen at type-checking time, at runtime, or both. But since we’re talking about a type system, there’s a problem here.
The type of ValueRange
is dependent on the type of the annotated
variable, but we have no way to express this in Python’s type
annotation system.
It’s possible to write:
Annotated[str, ValueRange(3, 10)]
And the type-checker has no way to assert that ValueRange(3, 10) should not be applied to str. You’ll get green signals from your type-checker, and it will blow up at runtime when a str <= int or str >= int comparison is attempted.
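To make the failure mode concrete, here is a minimal sketch of what a notional runtime range checker might do; this ValueRange and its check method are illustrative, not from any real library:

from dataclasses import dataclass

@dataclass
class ValueRange:
    low: int
    high: int

    def check(self, value) -> bool:
        # Fine for ints, blows up for strings.
        return self.low <= value <= self.high

ValueRange(3, 10).check(7)        # True
ValueRange(3, 10).check("seven")  # TypeError: '<=' not supported between instances of 'int' and 'str'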
It would be much nicer if we could write something like:
from typing import Generic, Sequence, TypeVar

T = TypeVar("T")

class ValueRange(Generic[T]):
    def __init__(self, value_range: Sequence[T]) -> None:
        self.value_range = value_range

    def is_in_range(self, value: T) -> bool:
        return value in self.value_range
Then use it like:
Annotated[int, ValueRange(range(3, 10))]
Annotated[str, ValueRange(["first", "second", "third"])]
With full type-safety, so our type-checker will complain early if we try to write:
Annotated[bool, ValueRange(["True", "False"])]
As an added bonus, our IDE will give us nice squiggly lines so we can immediately spot the problem.
Solving the problem
We could solve this problem by introducing a type-checking-only concept, like Generic:
Annotates[Q]
When used as part of a class’ bases, this would mean that:
- when the type appears as part of the tail of an Annotated[...],
- the type-checker ensures the head of the Annotated satisfies Q.
For example, an annotation declaring that a numeric (i.e. int or float) annotated value will always be positive could be defined as:
class Positive(Annotates[float]):
    pass
And used like:
Annotated[int, Positive()]
Annotated[float, Positive()]
But trying to check Annotated[str, Positive()] would fail as str is neither int nor float.
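Annotates itself would be a static-only construct, but to pin down the intended semantics, here is a rough runtime sketch of the check a type-checker would perform. The annotates attribute is a hypothetical stand-in for the Annotates[Q] parameter, and int is listed alongside float only to mirror the numeric-tower convention type-checkers already apply:

from typing import Annotated, get_args, get_origin

class Annotates:
    # Hypothetical marker: subclasses declare which head types they may annotate.
    annotates: tuple = (object,)

class Positive(Annotates):
    annotates = (int, float)

def check_annotation(annotation: object) -> bool:
    """Return True if every Annotates metadata accepts the Annotated head."""
    if get_origin(annotation) is not Annotated:
        return True
    head, *metadata = get_args(annotation)
    return all(
        issubclass(head, meta.annotates)
        for meta in metadata
        if isinstance(meta, Annotates)
    )

assert check_annotation(Annotated[int, Positive()])
assert check_annotation(Annotated[float, Positive()])
assert not check_annotation(Annotated[str, Positive()])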
Returning to our ValueRange
example, we should be able to express
generic annotations too:
class ValueRange(Generic[T], Annotates[T]):
    def __init__(self, value_range: Sequence[T]) -> None:
        self.value_range = value_range

    def is_in_range(self, value: T) -> bool:
        return value in self.value_range
In this case, the type-checker would infer the type of a ValueRange[T] from its constructor arguments, and check that inferred type against the head of the Annotated[...].
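Even before any static support exists, the runtime half of this already works. As a sketch, a hypothetical validate helper (all names below are illustrative) could walk a dataclass’s Annotated metadata and apply the generic ValueRange defined earlier (the plain Generic[T] version):

from dataclasses import dataclass
from typing import Annotated, get_type_hints

def validate(obj: object) -> None:
    # Inspect each field's Annotated metadata and apply any ValueRange found.
    hints = get_type_hints(type(obj), include_extras=True)
    for field, hint in hints.items():
        for meta in getattr(hint, "__metadata__", ()):
            if isinstance(meta, ValueRange) and not meta.is_in_range(getattr(obj, field)):
                raise ValueError(f"{field}={getattr(obj, field)!r} is out of range")

@dataclass
class Order:
    quantity: Annotated[int, ValueRange(range(3, 10))]
    priority: Annotated[str, ValueRange(["first", "second", "third"])]

validate(Order(quantity=5, priority="second"))   # passes
validate(Order(quantity=99, priority="second"))  # raises ValueError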
Where next
I’m not sure! This has just been annoying me long enough that I wanted to write about it.
If nobody already has a solution, I suppose this could be a PEP, but I’m not sure where to even start with that. Any pointers would be appreciated.