Exploring TypedDict in Python 3.8
This post explores the new TypedDict feature in Python and shows how combining TypedDict with the static analysis tool mypy can improve the robustness of your Python code.
PEP-589
TypedDict was proposed in PEP-589 and accepted for Python 3.8.
A few key quotes from PEP-589 can provide context and motivation for the problem that TypedDict is attempting to address.
This PEP proposes a type constructor typing.TypedDict to support the use case where a dictionary object has a specific set of string keys, each with a value of a specific type.
More generally, representing pure data objects using only Python primitive types such as dictionaries, strings and lists has had certain appeal. They are easy to serialize and deserialize even when not using JSON. They trivially support various useful operations with no extra effort, including pretty-printing (through str() and the pprint module), iteration, and equality comparisons.
The following section of the PEP is interesting: it suggests that TypedDict can be particularly useful for retrofitting type annotations onto legacy code.
Dataclasses are a more recent alternative to solve this use case, but there is still a lot of existing code that was written before dataclasses became available, especially in large existing codebases where type hinting and checking has proven to be helpful. Unlike dictionary objects, dataclasses don’t directly support JSON serialization, though there is a third-party package that implements it.
The reference implementation was defined in mypy_extensions, which can be installed on Python 3.7 (e.g., pip install mypy_extensions). On Python 3.8, you can use typing.TypedDict directly.
The examples below were run with mypy 0.711 and can be obtained from this gist.
Motivation: Dictionary-Mania
Here’s a common example where a type-checking tool (e.g., mypy) won’t be able to help you catch type errors in your code.
def example_0() -> int:
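As a minimal sketch of this kind of problem (not the post's exact listing), consider a plain dict passed between functions. The names load_movie and release_decade and the sample data are hypothetical, chosen only for illustration.

```python
from typing import Any, Dict


def load_movie() -> Dict[str, Any]:
    # Hypothetical loader: the signature says nothing about which keys exist
    # or what types their values have.
    return {"title": "Blade Runner", "year": 1982}


def release_decade(movie: Dict[str, Any]) -> int:
    # The key is misspelled ("yaer"). A type checker only sees Dict[str, Any],
    # so it has no basis for objecting.
    return (movie["yaer"] // 10) * 10


try:
    release_decade(load_movie())
except KeyError as exc:
    print(f"Only caught at runtime: KeyError {exc}")
```

Because every value is typed as Any, both the misspelled key and any misuse of the value go unnoticed until the code actually runs.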
However, with TypedDict, you can define a structural-typing-ish interface to dict for a specific data model.
Python < 3.8 requires from mypy_extensions import TypedDict, whereas Python >= 3.8 requires from typing import TypedDict.
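One way to support both cases in a single module is a version check, which is a form of conditional import that mypy understands. This is a sketch under that assumption; the Point model is hypothetical and exists only to show that the import works.

```python
import sys

if sys.version_info >= (3, 8):
    from typing import TypedDict
else:
    # Python 3.7 and earlier: requires `pip install mypy_extensions`.
    from mypy_extensions import TypedDict


class Point(TypedDict):  # hypothetical model, for illustration only
    x: int
    y: int


p: Point = {"x": 1, "y": 2}
print(type(p))  # at runtime the value is an ordinary dict
```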
Let’s create a simple Movie data model example and explore how mypy can be used to help catch type errors.
Example 1: Basic Usage of TypedDict
class Movie(TypedDict):
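A minimal sketch of basic usage, assuming Python 3.8+ and a Movie model with the title and year keys used in the summary of this post. The describe function and the sample values are hypothetical.

```python
from typing import TypedDict


class Movie(TypedDict):
    title: str
    year: int


def describe(movie: Movie) -> str:
    # mypy knows movie["title"] is a str and movie["year"] is an int.
    return f"{movie['title']} ({movie['year']})"


good: Movie = {"title": "Blade Runner", "year": 1982}
print(describe(good))

# This runs without complaint, because Python does not enforce annotations.
# A checker such as mypy is expected to reject it: "year" is declared as int.
bad: Movie = {"title": "Blade Runner", "year": "1982"}
print(describe(bad))
```

The point of the second assignment is that the mistake only becomes visible when you run the type checker, not when you run the program.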
To keep the examples runnable even though they purposely contain errors that mypy can catch, let’s define a helper function that requires a specific Exception type to be raised.
import logging
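One possible shape for such a helper is sketched below. The name require_raises and its signature are assumptions made for this illustration, not the post's original code.

```python
import logging
from typing import Callable, Type

log = logging.getLogger(__name__)


def require_raises(func: Callable[[], object], expected: Type[BaseException]) -> bool:
    """Call func() and return True if it raises `expected`.

    Raises AssertionError if func() completes without raising. Any other
    exception type propagates unchanged.
    """
    try:
        func()
    except expected as exc:
        log.info("Got expected %s: %s", type(exc).__name__, exc)
        return True
    raise AssertionError(f"{expected.__name__} was not raised")


# Usage: looking up a missing key must raise KeyError.
require_raises(lambda: {}["missing"], KeyError)
```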
Example 2: Exploring Mutating and Assignment of TypedDicts
Let’s mutate the Movie TypedDict instance and explore how mypy can catch type errors during assignment.
def example_02() -> int:
There are a few interesting items to note.
- mypy will catch assignment errors
- The current version of mypy will get a bit confused with dict methods, such as .clear(). Moreover, .clear() will also yield KeyErrors (related, see the total=False keyword of TypedDict)
- mypy will only allow merging dicts that are the same type. You can’t mix a TypedDict and a raw dict without mypy raising an issue
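The notes above can be illustrated with a short sketch, assuming Python 3.8+ and the same Movie model. The sample values are hypothetical, and the comments describe what a checker is expected to do rather than quoting exact mypy output.

```python
from typing import TypedDict


class Movie(TypedDict):
    title: str
    year: int


movie: Movie = {"title": "Blade Runner", "year": 1982}

movie["year"] = 1983    # fine: an int assigned to an int key
movie["year"] = "1983"  # mypy is expected to flag this; Python accepts it

# At runtime a TypedDict value is a plain dict, so clear() is available.
# The `type: ignore` is there in case your checker rejects the call.
movie.clear()  # type: ignore
try:
    movie["title"]
except KeyError:
    print("KeyError: the declared keys no longer exist")

# Merging two values of the same TypedDict type is accepted.
other: Movie = {"title": "Alien", "year": 1979}
movie.update(other)
print(movie)
```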
Example 3: TypedDict’s total Keyword Argument
TypedDict accepts a total keyword that communicates that the dict does not need to be completely well formed: with total=False, any of the declared keys may be missing. This is particularly interesting in how mypy interprets the types.
For example, an X with alpha, beta and gamma as ints, all of them optional, is declared with total=False:
class X(TypedDict, total=False):
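A minimal sketch of such a declaration, assuming Python 3.8+. The variable names and values are hypothetical.

```python
from typing import TypedDict


class X(TypedDict, total=False):
    alpha: int
    beta: int
    gamma: int


# With total=False every key is optional, so all three of these are
# expected to type-check.
empty: X = {}
partial: X = {"alpha": 1}
full: X = {"alpha": 1, "beta": 2, "gamma": 3}

print(empty, partial, full)
```

The value types are still checked: assigning a str to alpha should be reported regardless of total.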
Let’s dive deeper using a variation of the previously defined Movie example, with total=False, to explore how mypy interprets the ‘incomplete’ data model.
class Movie2(TypedDict, total=False):
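The sketch below shows how an incomplete model can lead to a runtime KeyError, assuming Python 3.8+ and that Movie2 has the same title and year keys as Movie. The functions release_year and release_year_safe are hypothetical.

```python
from typing import TypedDict


class Movie2(TypedDict, total=False):
    title: str
    year: int


def release_year(movie: Movie2) -> int:
    # "year" is a declared key, so this access looks fine to the checker,
    # but total=False means the key may be absent at runtime.
    return movie["year"]


def release_year_safe(movie: Movie2, default: int = 0) -> int:
    # Using .get() with a default avoids the KeyError.
    return movie.get("year", default)


incomplete: Movie2 = {"title": "Blade Runner"}  # valid: "year" is optional

print(release_year_safe(incomplete))
try:
    release_year(incomplete)
except KeyError as exc:
    print(f"KeyError at runtime: {exc}")
```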
Finally, let’s explore how isinstance works with TypedDict.
Example 4: TypedDict and isinstance
def example_04() -> int:
The important item to note here is that you can NOT use isinstance with TypedDict. Python will raise a TypeError at runtime. Specifically, the error you’ll see is shown below.
TypeError: TypedDict does not support instance and class checks
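A short sketch that reproduces the error, assuming Python 3.8+. The looks_like_movie function is a hypothetical manual check, shown only as one possible workaround.

```python
from typing import TypedDict


class Movie(TypedDict):
    title: str
    year: int


movie: Movie = {"title": "Blade Runner", "year": 1982}

try:
    isinstance(movie, Movie)
except TypeError as exc:
    print(f"TypeError: {exc}")

# Checking against dict works, since the value is an ordinary dict.
print(isinstance(movie, dict))


def looks_like_movie(obj: object) -> bool:
    # Hypothetical runtime check: verify the keys and value types by hand.
    return (
        isinstance(obj, dict)
        and isinstance(obj.get("title"), str)
        and isinstance(obj.get("year"), int)
    )


print(looks_like_movie(movie))
```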
Summary
- TypedDict + mypy can be valuable for catching type errors in Python and can help with heavy dictionary-centric interfaces in your application or library
- TypedDict can be used in Python 3.7 using the mypy_extensions package
- TypedDict can be used in Python 2.7 using mypy_extensions and the 2.7 ‘namedtuple-esque’ syntax style (e.g., Movie = TypedDict('Movie', {'title': str, 'year': int}))
- Using the total=False keyword to TypedDict can introduce large holes in the static type-checking process, yielding KeyErrors. The keyword total=False should be used judiciously (if at all)
- isinstance should not be used with TypedDict as it will raise a runtime TypeError exception
- Be mindful when using TypedDict methods such as clear()
- TypedDict introduces a new (somewhat) competing data modeling alternative to dataclasses, typing.NamedTuple, “classic” classes and third-party libraries, such as pydantic and attrs. It’s not completely clear to me how all these competing data model abstractions are going to age gracefully
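The ‘namedtuple-esque’ syntax mentioned in the summary also works with typing.TypedDict. The sketch below assumes Python 3.8+ rather than 2.7 with mypy_extensions, and the sample values are hypothetical.

```python
from typing import TypedDict

# Functional syntax: the name, followed by a mapping of keys to types.
Movie = TypedDict("Movie", {"title": str, "year": int})

movie: Movie = {"title": "Blade Runner", "year": 1982}

# Calling the type with keyword arguments builds a plain dict.
same_movie = Movie(title="Blade Runner", year=1982)

print(movie == same_movie)
```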
I believe TypedDict can be a valuable tool to help improve the clarity of interfaces, specifically in legacy code that is a bit dictionary-mania heavy. However, for new code, I would suggest avoiding TypedDict in favor of thin data model libraries, such as pydantic and attrs.
Best to you and your Python-ing.
P.S. A runnable form of the code used in the post can be found in this gist.