fastavro.read
class reader(fo: Union[IO, fastavro.io.json_decoder.AvroJSONDecoder], reader_schema: Union[str, List[T], Dict[KT, VT], None] = None, return_record_name: bool = False)
Iterator over records in an avro file.
Parameters:
- fo – File-like object to read from
- reader_schema – Reader schema
- return_record_name – If true, when reading a union of records, the result will be a tuple where the first value is the name of the record and the second value is the record itself
Example:
from fastavro import reader

with open('some-file.avro', 'rb') as fo:
    avro_reader = reader(fo)
    for record in avro_reader:
        process_record(record)
The fo argument is a file-like object so another common example usage would use an io.BytesIO object like so:
from io import BytesIO
from fastavro import writer, reader

# schema and records are assumed to be defined elsewhere
fo = BytesIO()
writer(fo, schema, records)
fo.seek(0)
for record in reader(fo):
    process_record(record)
metadata
    Key-value pairs in the header metadata
codec
    The codec used when writing
writer_schema
    The schema used when writing
reader_schema
    The schema used when reading (if provided)
class block_reader(fo: IO, reader_schema: Union[str, List[T], Dict[KT, VT], None] = None, return_record_name: bool = False)
Iterator over Block objects in an avro file.
Parameters:
- fo – Input stream
- reader_schema – Reader schema
- return_record_name – If true, when reading a union of records, the result will be a tuple where the first value is the name of the record and the second value is the record itself
Example:
from fastavro import block_reader

with open('some-file.avro', 'rb') as fo:
    avro_reader = block_reader(fo)
    for block in avro_reader:
        process_block(block)
metadata
    Key-value pairs in the header metadata
codec
    The codec used when writing
writer_schema
    The schema used when writing
reader_schema
    The schema used when reading (if provided)
class Block(bytes_, num_records, codec, reader_schema, writer_schema, named_schemas, offset, size, return_record_name=False)
An avro block. Will yield records when iterated over.
num_records
    Number of records in the block
writer_schema
    The schema used when writing
reader_schema
    The schema used when reading (if provided)
offset
    Offset of the block from the beginning of the avro file
size
    Size of the block in bytes
schemaless_reader(fo: IO, writer_schema: Union[str, List[T], Dict[KT, VT]], reader_schema: Union[str, List[T], Dict[KT, VT], None] = None, return_record_name: bool = False) → Union[None, str, float, int, decimal.Decimal, bool, bytes, List[T], Dict[KT, VT]]
Reads a single record written using schemaless_writer().
Parameters:
- fo – Input stream
- writer_schema – Schema used when calling schemaless_writer
- reader_schema – If the schema has changed since being written then the new schema can be given to allow for schema migration
- return_record_name – If true, when reading a union of records, the result will be a tuple where the first value is the name of the record and the second value is the record itself
Example:
import fastavro

# schema is assumed to be the schema that was passed to schemaless_writer
parsed_schema = fastavro.parse_schema(schema)
with open('file', 'rb') as fp:
    record = fastavro.schemaless_reader(fp, parsed_schema)
Note: The schemaless_reader can only read a single record.
is_avro(path_or_buffer: Union[str, IO]) → bool
Return True if path (or buffer) points to an Avro file. This only works for Avro files that contain the normal Avro schema header, like those created by writer(). This function is not intended to be used with binary data created by schemaless_writer(), since that does not include the Avro header.
Parameters: path_or_buffer – Path to the file, or an open buffer