Logical Types ============= Fastavro supports the following official logical types: * decimal * uuid * date * time-millis * time-micros * timestamp-millis * timestamp-micros * local-timestamp-millis * local-timestamp-micros Fastavro is missing support for the following official logical types: * duration How to specify logical types in your schemas -------------------------------------------- The docs say that when you want to make something a logical type, you just need to add a `logicalType` key. Unfortunately, this means that a common confusion is that people will take a pre-existing schema like this:: schema = { "type": "record", "name": "root", "fields": [ { "name": "id", "type": "string", }, ] } And then add the uuid logical type like this:: schema = { "type": "record", "name": "root", "fields": [ { "name": "id", "type": "string", "logicalType": "uuid", # This is the wrong place to add this key }, ] } However, that adds the `logicalType` key to the `field` schema which is not correct. Instead, we want to group it with the `string` like so:: schema = { "type": "record", "name": "root", "fields": [ { "name": "id", "type": { "type": "string", "logicalType": "uuid", # This is the correct place to add this key }, }, ] } Custom Logical Types -------------------- The Avro specification defines a handful of logical types that most implementations support. For example, one of the defined logical types is a microsecond precision timestamp. The specification states that this value will get encoded as an avro `long` type. For the sake of an example, let's say you want to create a new logical type for a microsecond precision timestamp that uses a string as the underlying avro type. To do this, there are a few functions that need to be defined. First, we need an encoder function that will encode a datetime object as a string. The encoder function is called with two arguments: the data and the schema. So we could define this like so:: def encode_datetime_as_string(data, schema): return datetime.isoformat(data) # or def encode_datetime_as_string(data, *args): return datetime.isoformat(data) Then, we need a decoder function that will transform the string back into a datetime object. The decoder function is called with three arguments: the data, the writer schema, and the reader schema. So we could define this like so:: def decode_string_as_datetime(data, writer_schema, reader_schema): return datetime.fromisoformat(data) # or def decode_string_as_datetime(data, *args): return datetime.fromisoformat(data) Finally, we need to tell `fastavro` to use these functions. The schema for this custom logical type will use the type `string` and can use whatever name you would like as the `logicalType`. In this example, let's suppose we call the logicalType `datetime2`. To have the library actually use the custom logical type, we use the name of `-`, so in this example that name would be `string-datetime2` and then we add those functions like so:: fastavro.write.LOGICAL_WRITERS["string-datetime2"] = encode_datetime_as_string fastavro.read.LOGICAL_READERS["string-datetime2"] = decode_string_as_datetime And you are done. Now if the library comes across a schema with a logical type of `datetime2` and an avro type of `string`, it will use the custom functions. For a complete example, see here:: import io from datetime import datetime import fastavro from fastavro import writer, reader def encode_datetime_as_string(data, *args): return datetime.isoformat(data) def decode_string_as_datetime(data, *args): return datetime.fromisoformat(data) fastavro.write.LOGICAL_WRITERS["string-datetime2"] = encode_datetime_as_string fastavro.read.LOGICAL_READERS["string-datetime2"] = decode_string_as_datetime writer_schema = fastavro.parse_schema({ "type": "record", "name": "root", "fields": [ { "name": "some_date", "type": [ "null", { "type": "string", "logicalType": "datetime2", }, ], }, ] }) records = [ {"some_date": datetime.now()} ] bio = io.BytesIO() writer(bio, writer_schema, records) bio.seek(0) for record in reader(bio): print(record)